How to do shape recognition in video feed?

samuset · March 10, 2021, 7:10pm

Hi, I’m looking to have a number of cubes with different irregular polygons on top of them on a table, with a camera looking down onto them. I’d like to get the position and ID of each cube, based on the shapes on them. Is this possible with some sort of shape tracking, computer vision library? Thank you for the help in advance.

micuat · March 10, 2021, 8:50pm

Hi! Do you have anything already in mind e.g. a photo/image of those shapes?

RGB-camera-based tracking can be hard. I’m an enthusiastic supporter of open source software, but if you are new to computer vision, looking first into tools like Vuforia or ARKit may help you learn what’s possible and what isn’t. (True, they are not for Processing, but sometimes using other tools and connecting them via e.g. OSC communication can be a faster way to get there.)

You can of course develop something with OpenCV, such as feature matching, but if you are not familiar with computer vision (or even if you are), you can easily spend hours and hours just tuning parameters and still not get good results (which used to happen to me).

samuset · March 11, 2021, 9:52am

Hi, thanks for the advice. I have something like this feature in BoofCV in mind, as described here:

If only it could not only detect the black polygons, but also differentiate between them (i.e. return their type) and report their 2D screen position…

micuat · March 11, 2021, 11:02am

Cool, I never heard of this. Is there a reason why you don’t want to use this library?
http://boofcv.org/index.php?title=Tutorial_Processing

samuset · March 11, 2021, 11:14am

I would, happily, if it could also detect the position and type of the black polygons. I’ve used BoofCV in other projects for different use cases, and it’s pretty solid.

micuat · March 11, 2021, 11:18am

I haven’t looked at the code, but I’d be surprised if it didn’t give you the positions, given that they claim sub-pixel precision.

micuat · March 11, 2021, 7:41pm

Looking at this example, doesn’t it give you complete polygon info in screen space?

https://github.com/lessthanoptimal/BoofProcessing/blob/master/examples/PolygonFitting/PolygonFitting.pde#L61


  noFill();
  strokeWeight(3);
  stroke(255, 0, 0);

  // Draw each polygon
  for ( List<Point2D_I32> poly : polygons ) {
    if( poly.size() == 0 )
      continue;
    beginShape();
    for ( Point2D_I32 p : poly) {
      vertex( p.x, p.y );
    }
    // close the loop
    Point2D_I32 p = poly.get(0);
    vertex( p.x, p.y );
    endShape();
  }
}

samuset · March 15, 2021, 9:32am

Oh, I was looking at another piece of code, where I didn’t see this. Thank you. I will look into it this week and report back.

samuset · March 17, 2021, 4:49pm

Right, so yes, thank you, this does indeed work. It gives you the vertices of the recognized polygon and from those, one can calculate the centroid of the polygon in one way or another.
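One way to compute that centroid is the standard shoelace (signed-area) formula. The sketch below is only an illustration, not code from BoofCV: the class name `PolygonCentroid` and the plain `double[]` inputs are assumptions, and it expects the vertices of a simple (non-self-intersecting) polygon in order, e.g. the `p.x` and `p.y` values of each point copied into two arrays.

```java
/** Hypothetical helper, not part of BoofCV: area-weighted centroid of a simple polygon. */
final class PolygonCentroid {

    /** Returns {cx, cy} for the polygon whose ordered vertices are (xs[i], ys[i]). */
    static double[] centroid(double[] xs, double[] ys) {
        int n = xs.length;
        if (n == 0 || ys.length != n) {
            throw new IllegalArgumentException("need matching, non-empty coordinate arrays");
        }
        double twiceArea = 0, cx = 0, cy = 0;
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n;                        // next vertex, wrapping to close the loop
            double cross = xs[i] * ys[j] - xs[j] * ys[i];
            twiceArea += cross;
            cx += (xs[i] + xs[j]) * cross;
            cy += (ys[i] + ys[j]) * cross;
        }
        if (Math.abs(twiceArea) < 1e-9) {
            // Degenerate polygon (all points on a line): fall back to the vertex mean.
            double mx = 0, my = 0;
            for (int i = 0; i < n; i++) {
                mx += xs[i];
                my += ys[i];
            }
            return new double[] { mx / n, my / n };
        }
        // Cx = sum / (6 * A), and 6 * A equals 3 * twiceArea.
        return new double[] { cx / (3 * twiceArea), cy / (3 * twiceArea) };
    }
}
```

For roughly convex shapes the plain average of the vertices is often close enough, but the area-weighted version is less sensitive to several vertices clustering on one side of the outline.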

One way to deduce the type of each polygon, treating the shapes as unique markers, is to count its vertices. This depends on the precision of the readings, of course.
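A minimal sketch of that idea, under some assumptions: the class name `ShapeType`, its method names, and the label strings are made up for illustration. Because the detected vertex count can flicker from frame to frame, the sketch also takes the most frequent count over a few recent frames before mapping it to a type.

```java
/** Hypothetical helper: maps a polygon's vertex count to a marker type. */
final class ShapeType {

    /** Returns a label for the given vertex count; the labels are illustrative only. */
    static String typeFor(int vertexCount) {
        switch (vertexCount) {
            case 3:  return "triangle";
            case 4:  return "quad";
            case 6:  return "hexagon";
            default: return "unknown";
        }
    }

    /**
     * Returns the most frequent value in counts (the first one seen wins ties),
     * or -1 if counts is empty. Quadratic, which is fine for a short window.
     */
    static int mostCommon(int[] counts) {
        int best = -1, bestVotes = 0;
        for (int candidate : counts) {
            int votes = 0;
            for (int c : counts) {
                if (c == candidate) votes++;
            }
            if (votes > bestVotes) {
                bestVotes = votes;
                best = candidate;
            }
        }
        return best;
    }
}
```

In a sketch you would keep the last handful of vertex counts for each shape and call `ShapeType.typeFor(ShapeType.mostCommon(recentCounts))`. Note that this only separates shapes with different vertex counts; two different irregular quads would still get the same label.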

micuat · March 17, 2021, 8:27pm

Another tip is to give IDs to the recognized shapes: check the centroids from the previous frame and inherit the ID from the closest shape. But it all depends on the situation (e.g. whether there’s occlusion). An example is ofxCv for openFrameworks:

https://github.com/kylemcdonald/ofxCv/blob/master/libs/ofxCv/include/ofxCv/Tracker.h

/*
 the tracker is used for tracking the identities of a collection of objects that
 change slightly over time. example applications are in contour tracking and
 face tracking. when using a tracker, the two most important things to know are
 the persistence and maximumDistance. persistence determines how many frames an
 object can last without being seen until the tracker forgets about it.
 maximumDistance determines how far an object can move until the tracker
 considers it a new object.
 
 the default trackers are for cv::Rect and cv::Point2f (RectTracker and
 PointTracker). to create a new kind of tracker, you need to add a
 trackingDistance() function that returns the distance between two tracked
 objects.
 
 the tracking algorithm calls the distance function approximately n^2 times.
 it then filters the distances using maximumDistance, which can significantly
 reduce the possible matches. then it sorts the distance using std::sort, which
 runs in nlogn time. the primary bottleneck for most data is the distance
 function. in practical terms, the tracker can become non-realtime when
 tracking more than a few hundred objects. to optimize the tracker, consider

This file has been truncated.
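The "inherit the ID from the closest shape" idea from this post can be sketched in Java as follows. This is not a port of ofxCv's tracker: the class name `CentroidTracker` and its API are assumptions, matching is a simple greedy nearest-neighbour pass rather than a sort over all pairs, and there is no persistence, so a shape that disappears for a single frame comes back with a new ID.

```java
/** Hypothetical greedy tracker: each shape inherits the ID of the nearest previous centroid. */
final class CentroidTracker {
    private double[][] prev = new double[0][];   // centroids {x, y} from the previous frame
    private int[] prevIds = new int[0];          // IDs assigned to those centroids
    private int nextId = 0;
    private final double maxDistance;            // beyond this, a shape counts as new

    CentroidTracker(double maxDistance) {
        this.maxDistance = maxDistance;
    }

    /** Takes the current frame's centroids and returns one ID per centroid, in the same order. */
    int[] update(double[][] current) {
        int[] ids = new int[current.length];
        boolean[] taken = new boolean[prev.length];
        for (int i = 0; i < current.length; i++) {
            int best = -1;
            double bestDist = maxDistance;
            for (int j = 0; j < prev.length; j++) {
                if (taken[j]) continue;          // each previous shape is matched at most once
                double d = Math.hypot(current[i][0] - prev[j][0], current[i][1] - prev[j][1]);
                if (d <= bestDist) {
                    bestDist = d;
                    best = j;
                }
            }
            if (best >= 0) {
                taken[best] = true;
                ids[i] = prevIds[best];          // inherit the ID of the closest previous shape
            } else {
                ids[i] = nextId++;               // nothing close enough: treat as a new shape
            }
        }
        // Keep copies so later changes by the caller do not affect the stored state.
        prev = new double[current.length][];
        for (int i = 0; i < current.length; i++) {
            prev[i] = current[i].clone();
        }
        prevIds = ids.clone();
        return ids;
    }
}
```

Call `update()` once per frame with the centroids of the detected polygons. The `maxDistance` value plays the same role as `maximumDistance` in the ofxCv comment above and needs tuning to how far the cubes can move between frames.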

samuset · March 18, 2021, 1:02pm

Yes, that would work for tracking them by ID. What I meant, though, was assigning one ID to all quads, another to all triangles, and a third to all hexagons each time they appear, i.e. a kind of marker recognition.
