ml5.js PoseNet single mode still returns face key points from multiple people

I have set detectionType: 'single' and feed in my video like this: poseNet.singlePose(video);
It's true that only one skeleton gets detected, but when two people are present, the body-part key points get scattered between them.
Also, if one person wears a face mask, only their eyes get detected, and the second person gets the mouth and ear key points.

I know I can work around it by filtering the key points and measuring the distances between them. But maybe there is already a built-in function that keeps key points within a reasonable distance of each other?
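
For reference, the distance-filter workaround could look roughly like the sketch below. It is plain JavaScript and not an ml5 feature: filterKeypoints, median, maxDistance, and minScore are hypothetical names, and it assumes key points shaped like PoseNet's { part, score, position: { x, y } } objects. It drops low-confidence points, takes the median position of the rest as the likely location of the main person, and keeps only points within maxDistance pixels of it.

```javascript
// Hypothetical helper, not part of ml5: median of a list of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper, not part of ml5: keeps only confident key points
// that lie within maxDistance pixels of the median key-point position.
// Assumes each key point looks like { part, score, position: { x, y } }.
function filterKeypoints(keypoints, maxDistance, minScore = 0.2) {
  // Discard points the model itself is unsure about.
  const confident = keypoints.filter((k) => k.score >= minScore);
  if (confident.length === 0) return [];

  // The median is less sensitive to a few stray points than the mean.
  const cx = median(confident.map((k) => k.position.x));
  const cy = median(confident.map((k) => k.position.y));

  return confident.filter(
    (k) => Math.hypot(k.position.x - cx, k.position.y - cy) <= maxDistance
  );
}
```

It would be called on pose.keypoints inside the pose callback, with maxDistance chosen relative to the video size. This only helps when most key points land on the same person; if they are split evenly between two people, the median falls between them and the filter is unreliable.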

https://ml5js.org/reference/api-PoseNet/