Hey there! My name is Sam and I am writing my thesis on deformable displays and 3D interactions. My setup (at the moment) has a Kinect behind a textile screen; it tracks one point on the screen and sends this information to Processing. So right now I am able to track the x, y and z location and work with that. (I am still struggling to track two points, though.)
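For context, the tracking part currently looks roughly like this. It is based on the KinectTracker helper class from the openkinect-processing examples (that class has to be present as a tab in the sketch), so treat it as a sketch of the idea rather than my exact code:

```
import org.openkinect.processing.*;

// KinectTracker is the helper class from the openkinect-processing
// examples; it tracks one point in the depth image.
KinectTracker tracker;

void setup() {
  size(640, 520);
  tracker = new KinectTracker(this);
}

void draw() {
  background(0);
  tracker.track();                     // update the tracked point
  tracker.display();                   // draw the thresholded depth image
  PVector v = tracker.getLerpedPos();  // smoothed x/y of the tracked point
  fill(255, 0, 0);
  noStroke();
  ellipse(v.x, v.y, 20, 20);           // the z value comes from the depth at that position
}
```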
However, I would like to track more kinds of interactions, such as scale, grab and rotate. Hence my question: has anyone ever been able to do this before?
That would be of great help, because the coronavirus is having an impact on getting lessons on programming.
[DISCLAIMER, A 10YR OLD IS NOT AN EXPERT IN ANY WAY.]
Short answer: this is too specific for me personally to recall anything of this sort.
Long answer:
Tracking gestures that involve arm-only movement (translate, scale, rotate, etc.) would be relatively okay;
Tracking gestures that involve one-handed movement (one-handed scale, one-handed rotation, etc.) would be quite hard;
No one has been able to generalize hand gesture recognition.
Yes, it is possible, but you need to extend the setup a bit. I have recently created a library to run ML algorithms for various kinds of image-detection tasks. One of its networks can be used for hand detection, which can then be used to extract an image of the hand. You can of course still use your Kinect, but for this approach you will need the extracted image of the hand.
This image can then be fed into an image classifier, which tries to recognise the gesture (thumbs up and so on). Of course, it is also possible to detect patterns in movement (check my second answer).
I have created a simple example which uses my library together with Wekinator to classify hand images. The region of interest around the hand is converted to a binary image, where all skin-tone parts (only Western & Asian skin tones at the moment) become white and everything else black. This is done by thresholding on the hue of the skin tone you would like to track.
The image is then resized to 16x16 pixels and sent to Wekinator, which is trained to separate those 256 pixel values into two classes: thumbs up & normal.
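Roughly, the preprocessing and the OSC part could look like the sketch below. This is not the exact code of my example: the hue range is a placeholder you have to tune, and the ports are just Wekinator's defaults (it listens for /wek/inputs on 6448).

```
import oscP5.*;
import netP5.*;

OscP5 osc;
NetAddress wekinator;

void setup() {
  size(640, 480);
  osc = new OscP5(this, 9000);                    // any free local port for oscP5
  wekinator = new NetAddress("127.0.0.1", 6448);  // Wekinator's default input port
}

// roi is the cropped hand image coming from the hand detector
void sendHandToWekinator(PImage roi) {
  PImage mask = roi.get();  // work on a copy
  mask.loadPixels();
  for (int i = 0; i < mask.pixels.length; i++) {
    float h = hue(mask.pixels[i]);
    // placeholder hue range for the skin tone to track - tune this for your lighting
    mask.pixels[i] = (h > 5 && h < 35) ? color(255) : color(0);
  }
  mask.updatePixels();

  // shrink to 16x16 = 256 values and send them in one OSC message
  mask.resize(16, 16);
  mask.loadPixels();
  OscMessage msg = new OscMessage("/wek/inputs");
  for (int i = 0; i < mask.pixels.length; i++) {
    msg.add(brightness(mask.pixels[i]) / 255.0f);  // 0.0 = black, 1.0 = white
  }
  osc.send(msg, wekinator);
}
```

In Wekinator you would then create a project with 256 inputs and one classifier output with two classes, and train it by recording examples of "thumbs up" and "normal".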
I just realized that I misunderstood your question a bit, but still, Wekinator can help you recognize moving patterns. Have a look at the following video, which explains dynamic time warping:
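To give you an idea of the input side for that: instead of pixels you would send the x/y of the tracked point every frame and let Wekinator's dynamic time warping mode record and match the movements. A rough sketch (the mouse position stands in for your Kinect point, the ports are Wekinator's defaults, and how the recognised gesture comes back depends on how you configure Wekinator):

```
import oscP5.*;
import netP5.*;

OscP5 osc;
NetAddress wekinator;

void setup() {
  size(640, 480);
  osc = new OscP5(this, 12000);                   // Wekinator sends its outputs here by default
  wekinator = new NetAddress("127.0.0.1", 6448);  // Wekinator listens for inputs on 6448
}

void draw() {
  // stand-in for the tracked point - replace with the PVector from your tracker
  PVector pos = new PVector(mouseX, mouseY);

  // two inputs per frame: normalised x and y of the tracked point
  OscMessage msg = new OscMessage("/wek/inputs");
  msg.add(pos.x / width);
  msg.add(pos.y / height);
  osc.send(msg, wekinator);
}

void oscEvent(OscMessage msg) {
  // Wekinator reports matches back over OSC; the exact address and
  // arguments depend on the model you set up, so just log everything here
  println(msg.addrPattern() + " " + msg.typetag());
}
```

In Wekinator you would choose the dynamic time warping model type with two inputs and record a few examples of each movement you want to recognise.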
Alright, thank you very much! Do you think Wekinator can help me recognize the moving patterns of the PVector.x and PVector.y location of my KinectTracker? That would be awesome!