How to do shape recognition in a video feed?

Hi, I’m looking to have a number of cubes on a table, each with a different irregular polygon on top, and a camera looking down onto them. I’d like to get the position and ID of each cube based on the shape on it. Is this possible with some sort of shape-tracking or computer vision library? Thank you in advance for the help.

Hi! Do you already have something in mind, e.g. a photo or image of those shapes?

RGB-camera-based tracking can be hard. I’m an enthusiastic supporter of open source software, but if you are new to computer vision, looking into tools like Vuforia or ARKit first may help you learn what’s possible and what isn’t. (True, they are not for Processing, but sometimes using another tool and connecting it to your sketch via e.g. OSC communication is a faster way to get there.)

You can of course develop something with OpenCV, such as feature matching, but if you are not familiar with computer vision (or even if you are), you can easily spend hours and hours just tuning parameters and still not get good results (which used to happen to me).

Hi, thanks for the advice. I have something like this feature in BoofCV in mind, as described here:

If only it could not just detect the black polygons, but also differentiate between them (i.e. return their type) and report their 2D screen position…

Cool, I had never heard of this. Is there a reason why you don’t want to use this library?
http://boofcv.org/index.php?title=Tutorial_Processing

I would happily use it if it could also detect the position and type of the black polygons. I’ve used BoofCV in other projects for different use cases and it’s pretty solid.

I haven’t looked at the code, but I’d be surprised if it doesn’t give you the positions, given that they claim sub-pixel precision.

Looking at this example, doesn’t it give you the complete polygon info in screen space?

Oh I was looking at another piece of code, where I didn’t see this. Thank you. Will look into it this week and report back.

Right, so yes, thank you, this does indeed work. It gives you the vertices of the recognized polygon and from those, one can calculate the centroid of the polygon in one way or another.
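
For the centroid, one option is the standard area-weighted (shoelace) formula rather than a plain average of the vertices, which gets pulled toward wherever the vertices are denser. Below is a minimal sketch in plain Java; the class name `PolygonCentroid` and the parallel `xs`/`ys` arrays are my own assumptions for illustration, not BoofCV types, so you would copy the detected vertices into arrays first.

```java
// Hypothetical helper, not part of BoofCV: area-weighted centroid of a simple polygon.
final class PolygonCentroid {

    /** Returns {cx, cy} for a non-self-intersecting polygon given as parallel vertex arrays. */
    static double[] centroid(double[] xs, double[] ys) {
        int n = xs.length;
        double twiceArea = 0.0; // signed, equals 2 * polygon area
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n; // next vertex, wrapping around to the first
            double cross = xs[i] * ys[j] - xs[j] * ys[i];
            twiceArea += cross;
            sumX += (xs[i] + xs[j]) * cross;
            sumY += (ys[i] + ys[j]) * cross;
        }
        if (Math.abs(twiceArea) < 1e-12) {
            // Degenerate polygon (zero area): fall back to the vertex average.
            double ax = 0.0;
            double ay = 0.0;
            for (int i = 0; i < n; i++) {
                ax += xs[i];
                ay += ys[i];
            }
            return new double[] { ax / n, ay / n };
        }
        // Cx = sumX / (6 * A) and 6 * A = 3 * twiceArea; the sign cancels,
        // so clockwise and counter-clockwise vertex orders give the same result.
        return new double[] { sumX / (3.0 * twiceArea), sumY / (3.0 * twiceArea) };
    }
}
```

Calling `PolygonCentroid.centroid(xs, ys)` once per detected polygon gives the screen position to use for that cube.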

One way to deduce the type of each polygon, treating the polygons as if they were unique markers, is to count the vertices of each one. This depends on the precision of the detection, of course.
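
To make the vertex count a bit more robust against noisy detections, one idea is to count only vertices where the outline really turns, so that a spurious extra point in the middle of an edge is ignored. This is a sketch under my own assumptions: the class `ShapeType`, the `minTurnDegrees` threshold, and the labels are hypothetical, and the threshold would need tuning for a real camera setup.

```java
// Hypothetical helper: classify a polygon by its number of "real" corners.
final class ShapeType {

    /** Counts vertices whose turning angle is at least minTurnDegrees. */
    static int countCorners(double[] xs, double[] ys, double minTurnDegrees) {
        int n = xs.length;
        int corners = 0;
        for (int i = 0; i < n; i++) {
            int prev = (i + n - 1) % n;
            int next = (i + 1) % n;
            // Incoming and outgoing edge vectors at vertex i.
            double ax = xs[i] - xs[prev];
            double ay = ys[i] - ys[prev];
            double bx = xs[next] - xs[i];
            double by = ys[next] - ys[i];
            // Angle between the two edges: 0 means the outline goes straight on.
            double turn = Math.toDegrees(Math.abs(Math.atan2(ax * by - ay * bx, ax * bx + ay * by)));
            if (turn >= minTurnDegrees) {
                corners++;
            }
        }
        return corners;
    }

    /** Maps a corner count to a marker label; the set of labels is an assumption. */
    static String label(int corners) {
        switch (corners) {
            case 3: return "triangle";
            case 4: return "quad";
            case 6: return "hexagon";
            default: return "unknown";
        }
    }
}
```

For example, `ShapeType.label(ShapeType.countCorners(xs, ys, 10.0))` returns the same label for every quad regardless of where it sits on the table. Note that this only distinguishes shapes with different corner counts; two different irregular quads would still get the same label.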

Another tip is to give ids to the recognized shapes: in each frame, compare the centroids with those from the previous frame and let each shape inherit the id of the closest one. But it all depends on the situation (whether there’s occlusion, etc.). An example of this approach is ofxCv, an addon for openFrameworks.
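
A rough sketch of that id inheritance in plain Java, to show the idea. This is not how ofxCv implements it; `CentroidTracker` and `maxDistance` are names I made up, and the matching is a simple greedy nearest-neighbour pass that can mix up ids when shapes are close together or occluded.

```java
// Hypothetical greedy tracker: each centroid inherits the id of the nearest
// unclaimed centroid from the previous frame, or gets a new id if none is close.
final class CentroidTracker {
    private double[][] prevCentroids = new double[0][];
    private int[] prevIds = new int[0];
    private int nextId = 0;
    private final double maxDistance; // in pixels; larger jumps count as a new shape

    CentroidTracker(double maxDistance) {
        this.maxDistance = maxDistance;
    }

    /** Takes this frame's centroids as {x, y} pairs and returns one id per centroid. */
    int[] update(double[][] centroids) {
        int[] ids = new int[centroids.length];
        boolean[] taken = new boolean[prevIds.length];
        for (int i = 0; i < centroids.length; i++) {
            int best = -1;
            double bestDist = maxDistance;
            for (int j = 0; j < prevIds.length; j++) {
                if (taken[j]) {
                    continue; // already inherited by another shape in this frame
                }
                double d = Math.hypot(centroids[i][0] - prevCentroids[j][0],
                                      centroids[i][1] - prevCentroids[j][1]);
                if (d <= bestDist) {
                    bestDist = d;
                    best = j;
                }
            }
            if (best >= 0) {
                ids[i] = prevIds[best];
                taken[best] = true;
            } else {
                ids[i] = nextId++;
            }
        }
        // Remember copies of this frame's data for the next call.
        prevCentroids = new double[centroids.length][];
        for (int i = 0; i < centroids.length; i++) {
            prevCentroids[i] = centroids[i].clone();
        }
        prevIds = ids.clone();
        return ids;
    }
}
```

You would create one tracker, e.g. `new CentroidTracker(20.0)`, and call `update` once per frame with the centroids of all detected polygons.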

Yes, that would work for tracking them by id. What I meant, though, was assigning one id to all quads, another id to all triangles, and a third id to all hexagons each time they appear, i.e. a kind of marker recognition.