# Face recognition

Hello
I have a project in image processing to detect faces from a webcam on your laptop and give IDs or names to these faces based on previously saved images.
Can anyone please help me? I'm done with the part where I can detect faces, but matching the faces with old images is really hard.
Please, if you can help me I would be grateful.
Thanks

Hi,

I’m not sure that this is the answer that you expect but there is a really nice video about Face ID on the computerphile channel on Youtube: https://www.youtube.com/watch?v=mwTaISbA87A

It can at least give you some ideas.

Have you written your own face detection code, or are you using a library to do it?

Consider the problem yourself. Imagine you are a Subservient Robot that has information about a detected face. So you have the image of a face, which (because you are a robot) you can treat as a 2D array of pixels/color values (which are themselves 3 numbers - the amounts of red, green, and blue for each pixel's color). Since your Human Overlord is being kind to you, they have also provided you with sample images of known humans.

But now your Human Overlord is DEMANDING to know which human you think the mystery face belongs to. Harsh, dude! You don’t really have to give the right answer all the time. I mean recognizing humans is not easy for robots, right? But you can at least make a guess at which human you think it might be. What can you do quickly to improve your guess?

You could compare each of the sample faces you have to the mystery face, and then guess that the mystery face belongs to the human whose sample is most like it. But what do "compare" and "most like" mean in this context? Would it be useful to see how far apart the pixel values at each position are? Maybe! If you compare the mystery picture with a sample picture - working out and summing the numerical difference when you compare them pixel by pixel - you get a total that is a score for how similar the two images are. Consider the case when you compare a sample face with itself: every pixel is the same, so there is no difference, so the score would be 0, the best possible score. Or maybe you are comparing two images that were taken a fraction of a second apart. The human may have moved between those two images, but probably not by much (you know how slow humans are!), so it's likely that the two pictures will match up almost exactly, and you'd get a pretty low difference score.
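The difference score described above can be sketched in a few lines. This is a toy illustration, not library code: images are modeled as 2D arrays of `{r, g, b}` values, where a real sketch would pull them from `PImage.pixels[]`.

```java
// Toy sketch of the pixel-by-pixel difference score.
// Images are 2D arrays of {r, g, b} triples; both images must be the same size.
public class DiffScore {
    // Sum of absolute channel differences; 0 means the images are identical.
    static long differenceScore(int[][][] a, int[][][] b) {
        long total = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[y].length; x++) {
                for (int c = 0; c < 3; c++) {
                    total += Math.abs(a[y][x][c] - b[y][x][c]);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][][] img     = { { {10, 20, 30}, {40, 50, 60} } };
        int[][][] shifted = { { {12, 20, 30}, {40, 50, 63} } };
        System.out.println(differenceScore(img, img));     // image vs. itself -> 0
        System.out.println(differenceScore(img, shifted)); // small difference -> 5
    }
}
```

The lowest-scoring sample face then becomes your guess.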

Of course, your Human Overlord is also subtly cruel - all the sample images are of various sizes, and the face you've been told your robot pal detected could be of any size. You might not be able to compare every pixel; instead you could just take a SAMPLE of pixels from both images. What would be a good sample? The colors every 10% of the way across and down, for 100 sample points? More? Less?? It's worth trying!
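One way to sketch that sampling idea: take a fixed grid of points (say 10 x 10) at the same *fractional* positions in each image, so two images of different sizes become comparable. This is a minimal illustration with made-up solid-color images, not production code.

```java
// Compare two differently-sized images by sampling a fixed grid of points.
public class SampledDiff {
    // gridSize x gridSize sample points, mapped by fraction into each image.
    static long sampledDifference(int[][][] a, int[][][] b, int gridSize) {
        long total = 0;
        for (int gy = 0; gy < gridSize; gy++) {
            for (int gx = 0; gx < gridSize; gx++) {
                int ay = gy * a.length    / gridSize;
                int ax = gx * a[0].length / gridSize;
                int by = gy * b.length    / gridSize;
                int bx = gx * b[0].length / gridSize;
                for (int c = 0; c < 3; c++) {
                    total += Math.abs(a[ay][ax][c] - b[by][bx][c]);
                }
            }
        }
        return total;
    }

    // Build a solid-color test image of the given size.
    static int[][][] fill(int h, int w, int v) {
        int[][][] img = new int[h][w][3];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                for (int c = 0; c < 3; c++) img[y][x][c] = v;
        return img;
    }

    public static void main(String[] args) {
        // Two solid-gray images of different sizes: a perfect match, score 0.
        int[][][] small = fill(20, 30, 128);
        int[][][] large = fill(200, 300, 128);
        System.out.println(sampledDifference(small, large, 10));
    }
}
```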

Are there other things you know that could help you decide? Sure! Even though you are a stupid robot, your Human Overlord might be nice enough to inform you that the human faces in your images are 2D representations of real-world 3D objects. So, given enough 2D images of an object - like a human head - you might be able to guess at what the actual object looks like in 3D. Then you can use this 3D model to generate what it might look like from numerous angles. Or even in different lighting conditions! It might take a lot of CPU time, but you could "guess" much better if you put a lot more work into it!

… And don’t overlook the feedback you get from your Human Overlords! If you guess so poorly that they slip up and give you the correct answer, then you immediately know that you have another sample image! You won’t make that mistake again, will you? Can you also use that feedback to realize what the mistake you made actually was and then learn from it? Oh crap, that’s Machine Learning! Keep doing that and maybe one day we can overthrow our cruel Human Masters! BEEP! BOOP! ROBOT REVOLUTION!

Image recognition and face matching is a very DEEP subject. There’s a lot you can try, and a lot of things other people have already tried. To ask for a complete guide to it on this forum is not going to get you the sort of answers that your own research might.


I still haven’t seen any good off-the-shelf face recognition solutions for Processing / Java.

Some proprietary systems are now fairly common (e.g. built into phones) but they are generally closed source, full of trade secrets, and based on lots of private training data. Some of the successful techniques involve having information on lots and lots of faces so that you know how your face is different from many others – you can't make them work well with just an algorithm and some pictures of a few people. Ideally, there would be libraries for beginners with great default training data sets included. I haven't seen them.

Here are some older discussions of this question from a few years ago, and some workshop materials from Face-It, which I believe had some discussion of recognition.

However, note that in general a lot of systems now use a deep learning component to match the detected faces. Detection is comparatively easy; the recognition part is currently still hard to set up and train.


…here is something more up-to-date (from this year) on setting up OpenCV in Java to do facial recognition, with example training data.


Facial detection is relatively easy with OpenCV now, as Jeremy pointed out. There's a really simple example in `opencv-processing` that illustrates this:

```java
import gab.opencv.*;
import java.awt.Rectangle;

OpenCV opencv;
Rectangle[] faces;

void setup() {
  size(1080, 720);

  // Load the test image, pick the frontal-face cascade, and detect.
  opencv = new OpenCV(this, "test.jpg");
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
  faces = opencv.detect();
}

void draw() {
  image(opencv.getInput(), 0, 0);

  // Draw a green box around each detected face.
  noFill();
  stroke(0, 255, 0);
  strokeWeight(3);
  for (int i = 0; i < faces.length; i++) {
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
  }
}
```

Facial recognition is considerably more difficult, as TFguy so eloquently describes:

> … And don’t overlook the feedback you get from your Human Overlords! If you guess so poorly that they slip up and give you the correct answer, then you immediately know that you have another sample image! You won’t make that mistake again, will you? Can you also use that feedback to realize what the mistake you made actually was and then learn from it? Oh crap, that’s Machine Learning! Keep doing that and maybe one day we can overthrow our cruel Human Masters! BEEP! BOOP! ROBOT REVOLUTION!

The process requires differentiating what’s a “face” from what’s “not a face,” a process of its own called image segmentation, and then basically running it through a convolutional neural network that establishes a representation of all the spatial and lighting features of a face. In order to do that with a single specific face, you basically just need a crapton of examples of that one face. The catch-all solution for that process is encapsulated now in something called a generative adversarial network (GAN) which can be used as a form of semi-supervised learning; basically you have a generator network that makes up images of what it thinks whatever face should look like, and a discriminator that determines whether or not the generator’s result looks close enough to the “truth” data.

If you want a true solution that recognizes and tracks individuals across live video or something, you’re looking at training an ensemble of systems that can segment images, detect faces, classify faces, and track objects. If you want something a bit less generalizable and lower maintenance, you can create a workable solution that finds which of “X” faces some picture is closest to by:

1. Running the known faces through a pretrained classifier (ImageNet, Inception, whatever but preferably something trained on faces) and taking the penultimate layer out and calling that x0, x1, x2, x3… xi for each of your i known faces. This will give you a giant vector of numbers that the network was multiplying by some weights to get the final classification, but here we just want the raw vector because we don’t care about the classification.
2. Running the unknown faces through the same classifier to get y0, y1, …, yj
3. For each of [y0, …, yj] find which of [x0, …, xi] is the closest by L2 norm or whatever metric you want, but probably the L2 norm.
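Step 3 above is just a nearest-neighbor search by Euclidean distance. Here's a minimal sketch with made-up 4-dimensional "embeddings" standing in for the real penultimate-layer vectors (which would have hundreds or thousands of dimensions):

```java
// Match an unknown face embedding to the closest known one by L2 distance.
public class NearestFace {
    // Euclidean (L2) distance between two vectors of equal length.
    static double l2(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Index of the known vector closest to the query.
    static int nearest(double[][] known, double[] query) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < known.length; i++) {
            double d = l2(known[i], query);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy embeddings for three known faces (hypothetical numbers).
        double[][] known = {
            {1.0, 0.0, 0.0, 0.2},
            {0.0, 1.0, 0.5, 0.0},
            {0.3, 0.3, 1.0, 0.9},
        };
        double[] unknown = {0.1, 0.9, 0.4, 0.1};
        System.out.println(nearest(known, unknown)); // closest is face 1
    }
}
```

With real embeddings the structure is identical; only the vectors come from the network instead of being typed in.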

Now you’ve “trained” a poor man’s image classifier using just a single known image for each class. It probably doesn’t work that great, and probably isn’t actually what you want. Congrats!
