Deep Vision - Machine Learning Computer Vision for Processing (Library)

Hello everyone

I’m happy to present a project I’ve been working on for over a year now. With OpenCV for Processing we have a good base library for traditional image processing. However, these algorithms are increasingly being displaced by machine learning approaches.

A year ago I therefore started developing a library that (mainly) uses OpenCV’s Deep Neural Network (DNN) module to bring machine learning, especially convolutional neural networks (CNNs), into the Processing world.

The library is not about training these networks, but about running inference (executing a prediction) with them.

:zap: The API of the library can and will still change, since this is only a pre-release.


Example: SSD MobileNet & Lightweight OpenPose

Installation

The library can be downloaded as usual from the Processing contribution manager.

Networks

At the moment, more than 25 different networks are implemented, from YOLO to Lightweight OpenPose to MiDaS. All of them come with pre-trained weights. The networks are divided into the following categories:

  • Object Detection :sparkles:
  • Object Segmentation
  • Object Recognition :blue_car:
  • Keypoint Detection :woman_playing_handball:t2:
  • Classification :cat2:
  • Depth Estimation :dark_sunglasses:
  • Image Processing

To keep the library small, I developed a repository system that automatically downloads the files required for the chosen network.
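For reference, using the library from a sketch follows a two-step pattern in the bundled examples: create a network through the DeepVision helper, then call setup(), which fetches the model and weights into the repository on first run. A minimal sketch along those lines (method and class names such as createYOLOv4() are taken from the current examples and may change, since the API is pre-release):

```java
import ch.bildspur.vision.*;
import ch.bildspur.vision.result.*;

DeepVision vision;
YOLONetwork network;
PImage img;

void setup() {
  size(640, 480);
  vision = new DeepVision(this);
  network = vision.createYOLOv4(); // factory method per the bundled examples
  network.setup();                 // downloads model + weights on first run
  img = loadImage("test.jpg");
}

void draw() {
  image(img, 0, 0);
  ResultList<ObjectDetectionResult> detections = network.run(img);
  for (ObjectDetectionResult d : detections) {
    noFill();
    rect(d.getX(), d.getY(), d.getWidth(), d.getHeight());
  }
}
```

Running inference every frame on a still image is only for illustration; in practice you would run the network once or feed it a camera frame.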

Performance

The implementation is optimized for the CPU, and many networks run at 20–30 FPS on a decent CPU. It is also possible to use OpenCV’s CUDA backend; for that you have to download a special version of the library, which only works on Linux x86/x64 and Windows and is quite large (2–5 GB).

Next Steps

At the moment I’m working on improving CPU speed by optimizing the pre- and post-processing. I am also experimenting with using ONNX directly, to have a second inference engine.

Additionally, I’m still working on the documentation and more examples, and I would be very happy if someone wants to contribute.

Thank you very much for testing & feedback.

22 Likes

This is amazing, really. I’ve been looking for something like this! :tada:

1 Like

This is amazing! Thank you so much. I will be using this to teach my students. Fantastic work

1 Like

Hi, can I ask why I’m getting a red line at network.setup()? It says:

OpenCV(4.5.1) modules\dnn\src\torch\THDiskFile.cpp:286: error: (-2:Unspecified error) read error: read 659639 blocks instead of 780300 in function ‘TH::THDiskFile_readFloat’

I did go through your GitHub, but I’m not sure why I’m getting this error. Am I missing something?

I need more information about how you are using the library (code) and what you are trying to do; otherwise I cannot help you.

But usually, if there is a problem reading the weights, they were not downloaded correctly. I have already implemented a safeguard against this, but it is not released yet.

To delete broken packages, use the following line of code before setting up the network:

deepVision.clearRepository();

Hello, I want to use the library to play a video when a face is detected in the webcam, and switch to another video when there is no face in front of the webcam.

I tried to work it out, but I am unable to express the case where no face is detected in the webcam.

For instance, I want to say: if (no face is detected) { play video1; } else { play video2; }

How should I go about it?

Check out the face-detection example. If you want to know whether a face is currently present, just check how many faces have been detected:

if(detections.size() > 0) {
    // face is present
} else {
    // no face
}
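Put together with the Processing video library, the sketch the questioner describes could look roughly like this. This is an untested sketch: the class and factory names (ULFGFaceDetectionNetwork, createULFGFaceDetectorRGB640()) are assumptions based on the library’s face-detection example and may differ between versions.

```java
import processing.video.*;
import ch.bildspur.vision.*;
import ch.bildspur.vision.result.*;

Capture cam;
Movie video1, video2;
DeepVision vision;
ULFGFaceDetectionNetwork faceNetwork; // name per the face-detection example (assumption)

void setup() {
  size(640, 480);
  vision = new DeepVision(this);
  faceNetwork = vision.createULFGFaceDetectorRGB640(); // assumption: may differ per version
  faceNetwork.setup();

  cam = new Capture(this, 640, 480);
  cam.start();
  video1 = new Movie(this, "video1.mp4");
  video2 = new Movie(this, "video2.mp4");
}

void movieEvent(Movie m) {
  m.read();
}

void draw() {
  if (cam.available()) cam.read();
  ResultList<ObjectDetectionResult> detections = faceNetwork.run(cam);

  if (detections.size() > 0) {
    video2.pause(); video1.loop(); // face present: play video1
    image(video1, 0, 0);
  } else {
    video1.pause(); video2.loop(); // no face: play video2
    image(video2, 0, 0);
  }
}
```

Running detection on every frame may be slow; detecting only every few frames is a common way to keep the sketch responsive.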
1 Like

Hi! I’m exploring the Deep Vision library. I made a model using Teachable Machine and it kind of worked with Deep Vision using the SSDMobileNetwork network. The thing is, it only picks up one of the labels. Any ideas? What do you reckon? Thank you in advance! This is AMAZING work!

Hi Florian, my name is Carlos Vaz. I am a professor at a federal university in Brazil. We are doing research aimed at using aerial drone images to count people and cars from above in open spaces. I found your library on GitHub and would like to know whether it is possible to train it for this task.

The library is an inference library and does not contain any support for training. But as described in the readme, it is of course possible to train a network yourself and then use it in the library. For aerial object detection, however, I would use a specialized network anyway (for example), because the detection anchors and grids in the default networks are usually too large for small objects.

I would recommend you first try out the default networks (YOLO & MaskRCNN) and see if they work. It really depends on your images (how far away, which angle, and so on).

1 Like

Would it be possible to share your trained model with me so I could test it?

1 Like

Yes, it would be nice to support custom-trained datasets, or at least to read the .pt file generated from training.

1 Like

It is possible to load your own models: each network just expects a model file and a weights file. The DeepVision class is only a helper that downloads pre-trained weights, but you can initialize every network yourself and hand it your own weights.

import java.nio.file.Paths;

YOLONetwork network = new YOLONetwork(
        null, // for YOLOv5 the ONNX weights file already contains the model
        Paths.get("path-to-your-onnx"), // trained weights of the model
        640, 640, // inference size
        true
);

network.loadLabels(Paths.get("path to a text file with labels"));
network.setTopK(100);
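Once constructed, the custom network should behave like the bundled ones. Following the pattern from the readme (hedged: the exact result types and getter names depend on the network and library version):

```java
network.setup(); // loads your own ONNX weights instead of downloading
ResultList<ObjectDetectionResult> detections = network.run(img);
for (ObjectDetectionResult d : detections) {
  println(d.getClassName() + " " + nf(d.getConfidence(), 0, 2));
}
```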
1 Like

OK, so where do I place the custom .pt file, and how do I convert it to ONNX? I might need help writing a .pde script for this custom dataset.

I managed to get it working standalone in Python; I just have no idea how to implement it in Processing.

1 Like

Hello, nice to meet you, and thank you for the reply and your kindness.
As far as I know, the library currently supports up to YOLOv4; YOLOv5 does not seem to be supported.

@cansik
Dear cansik,
Thank you for your continued work on this.
Isn’t it correct that YOLOv5 doesn’t work in the current library, as described above? Thank you for your answer.

Exporting a trained file to the ONNX format is described in the yolov5 repository: TFLite, ONNX, CoreML, TensorRT Export · Issue #251 · ultralytics/yolov5 · GitHub

Implementing YOLOv5 was a bit of a hack to re-use the already existing YOLONetwork class, which is why at the moment we have to pass a null value as the model. I am working on a cleaned-up version of that.
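As a concrete starting point, the export usually boils down to running yolov5’s export script from a checkout of the repository. The flags vary between releases, so treat this as a sketch and consult the linked issue and your version’s README:

```shell
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt onnx

# recent releases; older ones shipped models/export.py instead
python export.py --weights path/to/custom.pt --include onnx --imgsz 640
```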

1 Like

YOLOv5 is implemented in version 0.9.0: deep-vision-processing/YOLOv5.pde at master · cansik/deep-vision-processing · GitHub

1 Like

Ah, I see, so I will use YOLOv4 in the meantime. I also have a problem exporting PyTorch to ONNX, as I’m getting an error somewhat similar to this: ONNX cannot be exported · Issue #2692 · ultralytics/yolov5 · GitHub

@cansik

Thank you very much.

I’d like to ask which version of YOLOv5 you generated the ONNX file from? I’ve been getting errors: my 2021 version of YOLOv5 is the only one where I managed to convert .pt to ONNX, and I’m getting an error on YOLOv5 v7.