Request for guidance: Making 3D objects appear static on a webcam feed using Arduino & an IMU. Is this possible?


I’m new to Processing (but have some coding experience in Python and Java / Android). I am trying to design a program using a webcam plus an Arduino with an absolute orientation IMU (Inertial Measurement Unit).

What I would like to build is an augmented reality application that, as the title says, has a series of 3D objects superimposed on a live video feed. I want the objects to appear static as the webcam is rotated, and I will be using an Arduino plus a 6-axis absolute orientation Inertial Measurement Unit to track the rotation of the webcam. I’m running this on Windows.

I recognize that the IMU will have some drift, etc., which is (currently) a secondary issue. I am hoping I can do this using just the absolute orientation of the device, and placing the 3D objects at specific positions on a transparent sphere, which rotates to compensate for the rotation of the device so that the 3D objects appear fixed in space.

What I’d like to know, as the title says, is:

  • Is this possible, or is there a fundamental flaw in this that I’m missing?
  • Can you guide me to any similar examples, using Processing, or any other solution? (I do not want to use Android ARCore / Apple ARKit since they depend on surface mapping, which is a challenge here: smooth surfaces, with nothing to get a tracking point / surface on.)

I can get into details of how I’m envisioning the solution if anyone is curious. So far, I’ve used the Sparkfun IMU guide to make the “Serial cube visualizer”, so I can get the IMU / Arduino to talk with Processing & pull in absolute orientation data (with almost no drift, and it remembers its orientation after power off, etc.). It took some doing to get it calibrated, but once that was done, it’s pretty impressive!

Please let me know if I need to offer any additional information. I’ve tried to be as clear as I can!
Thanks in advance :slight_smile:

I’ve cross-posted in the all about circuits forum as well, but no replies so far. If there are any, I’ll update it here. Thank you!


Do you mean an augmented reality application with a virtual object appearing at a fixed location in the “real world”, as with rendering on a fiducial?

I’m not sure if an IMU can give you enough resolution to provide any kind of stability. Maybe it could be stable enough if you were dealing with drifting forms, like virtual balloons? But for, say, a statue, my guess (?) is that you will be really disappointed. There are ways to try to improve performance with a mix of the compass and the camera – just throw more sensors at the problem – and you can also use image features (e.g. SIFT keypoints) to try to pin down an attachment point within a good-enough frame of reference… but it really depends on, very specifically, what exactly you are trying to do. Indoor, fixed viewer waving a camera in a known environment? Outdoor Pokemon Go style city walker? Etc.

Just looked at your IMU product / demo and reread your post – clearly, it is giving you good-enough orientation, and you are happy with it, so I misread you.


Here is a simple example based on the camera() reference example:

void setup() {
  size(640, 360, P3D);
}

void draw() {
  lights();
  background(0);
  // Change height of the camera with mouseY
  camera(30.0, mouseY, 220.0, // eyeX, eyeY, eyeZ
         0.0, 0.0, 0.0,       // centerX, centerY, centerZ
         0.0, 1.0, 0.0);      // upX, upY, upZ
  noStroke();
  object(90 + 25/2.0, 0, 0, 25);
  object(0, 90 + 50/2.0, 0, 50);
  stroke(255);
  line(-100, 0, 0, 100, 0, 0);
  line(0, -100, 0, 0, 100, 0);
  line(0, 0, -100, 0, 0, 100);
}

// draw a box of size w at (x, y, z)
void object(float x, float y, float z, float w) {
  pushMatrix();
  translate(x, y, z);
  box(w);
  popMatrix();
}
The key here is that you can update camera() coordinates directly with the output of your IMU. Then just render things in Processing – set your lighting, transparency, etc. – all based around 0,0,0.

Now, will your IMU location be accurate – that’s the second key…


Oh, no. This was very useful. Thank you so much!!

Let me try & offer some more context:
a) There is no X/Y/Z translation, just limited rotation: only yaw & pitch (~60 degrees on either side of center; and 30 degrees up or down) will be allowed (for now).
b) I worry about marker based approaches because they are very sensitive to ambient light, and the UX for new users / unfamiliar users can be very frustrating. (based on my experience with both ARCore & ARKit).

What I am trying to build / planning to do is as follows:
a) Create a 3D transparent sphere centered on the camera and embed my 2D / 3D objects on the sphere
b) The position of the objects on the sphere will be assigned relative to real world objects (viz. trees, etc.)
c) As the camera rotates within the restricted field, I will use processing to “rotate” the sphere in the opposite direction, so that the sphere (and therefore the objects embedded in the sphere) appear fixed.

I have not considered compensating for skew, perspective etc., that’s the next step, if & when I can get this basic thing to work! :grin:
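To make step (c) concrete for myself, I wrote a tiny plain-Java check (rotY is my own helper, not a Processing function): counter-rotating by the camera's yaw and then rotating back is an identity, which is why the objects embedded on the sphere should appear pinned in place.

```java
// Toy check of the "counter-rotated sphere" idea: a world-fixed point,
// expressed in the frame of a camera that has yawed by angle a, is the
// point counter-rotated by -a; rotating back by +a recovers the original.
public class CounterRotate {

    // Rotate point p about the vertical (Y) axis by angle a, in radians.
    static double[] rotY(double[] p, double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[] { c * p[0] + s * p[2], p[1], -s * p[0] + c * p[2] };
    }

    public static void main(String[] args) {
        double[] tree = { 0, 0, -100 };      // world-fixed object straight ahead
        double camYaw = Math.toRadians(30);  // camera yawed 30 degrees

        // Counter-rotate the scene by -camYaw, then rotate back by +camYaw:
        // the composition is the identity, so the object stays world-fixed.
        double[] inCam = rotY(tree, -camYaw);
        double[] back = rotY(inCam, camYaw);
        System.out.printf("inCam: (%.2f, %.2f, %.2f)%n", inCam[0], inCam[1], inCam[2]);
        System.out.printf("back:  (%.2f, %.2f, %.2f)%n", back[0], back[1], back[2]);
    }
}
```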



I would start with a simple cube that you position where you want in processing 3D space.

Then you can put the camera in its resting position in order to calibrate it.

Now you can simply use the rotateX, rotateY and rotateZ (reference) functions of Processing using the direct output of your IMU.

I think it’s as straightforward as this, but I might be missing something.

What I don’t know is: since you won’t be rotating exactly around the focal point of the camera, and your IMU will also have an offset, you might end up with the 3D shapes slightly moving. Depending on the setup, maybe it will be too much, I don’t know… :thinking:

1 Like

To expand on suggestions from @jb4x

One way to center the scene on the camera is to set the camera at 0,0,0, with a look direction based on the default from the camera() reference. Then apply your rotation based on IMU measurements, then draw your fixed objects in absolute coordinate space, and they will appear in the view correctly.

float rad = 80;

void setup() {
  size(640, 640, P3D);
}

void draw() {
  background(0);
  lights();

  // set camera in center of coordinate space
  camera(0, 0, 0, 0, 0, -(height/2.0) / tan(PI*30.0 / 180.0), 0, 1, 0);

  // align based on IMU -- here faked with mouse
  rotateY(map(mouseX, 0, width, -HALF_PI, HALF_PI));
  rotateX(map(mouseY, 0, height, -HALF_PI, HALF_PI));

  // draw objects at their fixed coordinates -- a set of six cubes, like die faces.
  object(rad, 0, 0, 25);
  object(-rad, 0, 0, 25);
  object(0, rad, 0, 25);
  object(0, -rad, 0, 25);
  object(0, 0, rad, 25);
  object(0, 0, -rad, 25);
}

// draw a box of size w at (x, y, z)
void object(float x, float y, float z, float w) {
  pushMatrix();
  translate(x, y, z);
  box(w);
  popMatrix();
}

Don’t use the peasyCam library for this – it is mouse-focused, and it is designed to center the look point, not the camera.

1 Like

Thank you both! This was immensely useful. I think I broadly understand the code that @jeremydouglass shared (thanks again! :smiley:)

I’m going to first understand what each of these functions are, and then put together code with the IMU data.

This will take me a few days, but I’ll share the resulting code (and most likely seek inputs again because I am certain I’ll need help there!).

But thanks once again :).

Appreciate it!


Yep. I am going to try to place the IMU as close to the camera as possible; and because the IMU is offset from the center / camera by a known, fixed amount, perhaps figure out a way to compensate for that by applying a correction factor. It sounds simple, but I suspect it’s going to be pretty tricky (3D translations & rotations rarely are), so we’ll see how far along I get.
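In case it helps anyone following along: for a pure rotation measured at the IMU, a fixed camera-to-IMU offset d makes the camera center translate by R*d - d (the "lever arm" effect). A minimal plain-Java sketch (yaw-only for simplicity; rotY and induced are my own names, nothing from Processing):

```java
// Lever-arm effect: rotating about the IMU while the camera sits at a fixed
// offset d from it translates the camera center by t = R*d - d.
// (Simplified to yaw-only rotation; helper names are mine.)
public class LeverArm {

    // Rotate point p about the vertical (Y) axis by angle a, in radians.
    static double[] rotY(double[] p, double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[] { c * p[0] + s * p[2], p[1], -s * p[0] + c * p[2] };
    }

    // Translation of the camera center caused by yawing about the IMU.
    static double[] induced(double[] d, double yaw) {
        double[] rd = rotY(d, yaw);
        return new double[] { rd[0] - d[0], rd[1] - d[1], rd[2] - d[2] };
    }

    public static void main(String[] args) {
        double[] d = { 0.02, 0, 0 };  // IMU mounted 2 cm to the side of the lens
        double[] t = induced(d, Math.toRadians(60));
        // The error stays on the order of the offset itself, which is why
        // mounting the IMU close to the camera matters.
        System.out.printf("induced shift: (%.4f, %.4f, %.4f) m%n", t[0], t[1], t[2]);
    }
}
```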

FWIW, I’ll keep updating this thread on a regular basis, on the off chance this’ll be of use for someone else.

Thanks again!



I’ve made ‘some’ progress, mostly in developing my understanding of what’s happening in the code.
So far, these are the changes:
I’ve modified @jeremydouglass’s code to
a) Add the webcam feed
b) I’ve changed the shape & colour of the objects so the relative changes are easy to visualize

I’ve tried to add detailed comments to the code explaining my understanding of what is happening in each section. Given my lack of coding experience, I often find most code baffling; I am hoping someone new like me reading this will walk away with a slightly better understanding of the logic & flow (assuming I am correct).

Needless to say, I am nowhere near completion, and I have a few questions I hope you guys can help me understand / resolve! :slight_smile:

// Modified from code by Jeremy Douglass @ 
// This sketch attempts to
// a) Create a virtual 3D space (I am thinking of it as an imaginary 3D sphere)
// b) Place 6 objects in the 3D sphere at fixed coordinates
// c) Overlay the webcam feed on this
// d) Use mouse movement as a proxy for IMU data

import*; // imports the Processing video library

// rad is a distance, not an angle: it is the radius (in world units) of the
// imaginary sphere on which the 3D objects are "fixed" in the objectA..objectF calls below
float rad = 80;

Capture cam;

void setup() {
  size(1280, 960, P3D); // size of the canvas; P3D invokes the OpenGL 3D renderer,
  // as opposed to the default Processing renderer, which is designed for 2D

  // poll & list all webcams / cameras detected by Processing as attached to the system
  String[] cameras = Capture.list();

  // check whether any cameras are present & print the list.
  // The index passed to cameras[] picks the camera / resolution / FPS used;
  // change this number for your specific system based on the "println" output.
  // Adapted from Processing's Capture reference
  if (cameras.length == 0) {
    println("There are no cameras available for capture.");
  } else {
    println("Available cameras:");
    for (int i = 0; i < cameras.length; i++) {
      println(i + ": " + cameras[i]);
    }
    cam = new Capture(this, cameras[135]); // 135 happens to be the right entry on my system
    cam.start();
  }
}

void draw() { // draw loop
  // if a new webcam frame is available, read it
  if (cam.available() == true) {;
  }
  set(0, 0, cam);
  // displays the webcam stream starting at Processing's (0, 0), i.e. the top-left corner;
  // if the video resolution is less than the canvas size defined in setup(),
  // the video will be aligned to the top-left corner of the display

  // set the virtual camera in the center of coordinate space (for the virtual 3D sphere)
  camera(0, 0, 0, 0, 0, -(height/2.0) / tan(PI*30.0 / 180.0), 0, 1, 0);
  // (first 3 values define the position of the eye; next 3 the point it looks at;
  // last 3 the "up" direction of the camera);
  // -(height/2.0) / tan(PI*30.0 / 180.0) reproduces the default eye distance,
  // which corresponds to a 60 degree vertical field of view

  // align based on IMU -- here faked with mouse
  rotateY(map(mouseX, 0, width, -HALF_PI, HALF_PI));
  // the Y/X rotation is mapped to mouse X/Y movement, with mouse values scaled
  // from 0 to the width / height of the screen;
  // -HALF_PI and HALF_PI are the lower & upper bounds of the rotation
  rotateX(map(mouseY, 0, height, -HALF_PI, HALF_PI));

  // draw objects at their fixed coordinates on the virtual sphere --
  // six 3D objects with different colours and defined positions
  objectA(rad, 0, 0, 25);
  objectB(-rad, 0, 0, 25);
  objectC(0, rad, 0, 25);
  objectD(0, -rad, 0, 25);
  objectE(0, 0, rad, 25);
  objectF(0, 0, -rad, 25);
}

// These functions define the objects & their properties
// (drawn as boxes here -- swap in your own shapes)
void objectA(float x, float y, float z, float h) {
  pushMatrix();
  translate(x, y, z);
  fill(245, 108, 40);
  box(h);
  popMatrix();
}

void objectB(float x, float y, float z, float h) {
  pushMatrix();
  translate(x, y, z);
  fill(241, 245, 40);
  box(h);
  popMatrix();
}

void objectC(float x, float y, float z, float w) {
  pushMatrix();
  translate(x, y, z);
  fill(40, 245, 51);
  box(w);
  popMatrix();
}

void objectD(float x, float y, float z, float d) {
  pushMatrix();
  translate(x, y, z);
  fill(40, 202, 245);
  box(d);
  popMatrix();
}

void objectE(float x, float y, float z, float r) {
  pushMatrix();
  translate(x, y, z);
  fill(76, 40, 245);
  box(r);
  popMatrix();
}

void objectF(float x, float y, float z, float r) {
  pushMatrix();
  translate(x, y, z);
  fill(189, 40, 245);
  box(r);
  popMatrix();
}

As ever, I have a series of questions (that I’ve added in the comments as well, just so someone reading this understands I don’t know what I’m talking about).
a) Is my understanding of the “float rad = (80)” at the beginning of the sketch accurate, in that it is defining the placement of the 3D objects in space?

This code works in that it is placing 6 objects super-imposed on the webcam feed. Now, I am trying to figure out how to pull in the IMU data & map it to these objects. (i.e., replace the section that says “align based on IMU - here faked with mouse”).

A conceptual question: as of now, the 3D sphere is rotating based on the fake IMU input (i.e. mouse movement). Am I correct that, when I plug in the IMU quaternion data, I will have to apply the inverse of the rotation (quaternion.inverse()) to the 3D objects so that they appear static / fixed in space as the webcam is rotated?
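To make the question concrete, here is a tiny plain-Java check of what I mean (mul / conj / rotate are my own helpers, not a library API): rotating a vector by a unit quaternion and then by its conjugate, which is the inverse for unit quaternions, returns the original vector.

```java
// Checking the inverse idea: rotate a point by a unit quaternion q, then by
// its conjugate (== inverse for unit quaternions), and you get the point back.
public class QuatInverse {

    // Hamilton product of quaternions stored as (w, x, y, z).
    static double[] mul(double[] a, double[] b) {
        return new double[] {
            a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3],
            a[0]*b[1] + a[1]*b[0] + a[2]*b[3] - a[3]*b[2],
            a[0]*b[2] - a[1]*b[3] + a[2]*b[0] + a[3]*b[1],
            a[0]*b[3] + a[1]*b[2] - a[2]*b[1] + a[3]*b[0]
        };
    }

    // Conjugate: for a unit quaternion this is the inverse rotation.
    static double[] conj(double[] q) {
        return new double[] { q[0], -q[1], -q[2], -q[3] };
    }

    // Rotate vector v by unit quaternion q: v' = q * (0, v) * q^-1.
    static double[] rotate(double[] q, double[] v) {
        double[] r = mul(mul(q, new double[] { 0, v[0], v[1], v[2] }), conj(q));
        return new double[] { r[1], r[2], r[3] };
    }

    public static void main(String[] args) {
        double h = Math.toRadians(45);                 // half-angle of a 90 deg yaw
        double[] q = { Math.cos(h), 0, Math.sin(h), 0 };
        double[] v = { 0, 0, -1 };                     // "straight ahead"
        double[] turned = rotate(q, v);                // where the rotation sends it
        double[] restored = rotate(conj(q), turned);   // counter-rotated back
        System.out.printf("restored: (%.2f, %.2f, %.2f)%n",
                          restored[0], restored[1], restored[2]);
    }
}
```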

Thanks again for your help :slight_smile: ! I have a series of questions on processing the quaternions themselves, but I want to make sure I’m on the right track before pasting / sharing the modified / commented code.


That’s right. The variable rad is set to 80. The variable is then used as an argument to object – each of the six objects is +/-80 away along the x, y, or z axis, making a cube of boxes. If you change rad to 200, they get farther away. If you change it to 40, they get closer.

You will need to express your camera orientation as three rotation operations around the axes – rotateX(), rotateY(), rotateZ() – or else, alternately, as a look-at with an up-orientation, using camera():

If you want to convert quaternions to rotations, you could look at the example code in Shapes3D

The Shapes3D library uses quaternions to perform 3D rotations. The main class is called Rot, which is effectively a quaternion by another name and is a port from the Apache Commons Math project. This class provides many constructors for creating Rot objects, which can be used for rotating PVector objects.

…or an old one-off class, which might need to be updated:

Hey, I remember having a lot of issues with quaternions when I first started out, but then I found a great C++ class after googling for a bit. I’ve decided to port it to Java so that it can be used with Processing. You can keep track of the quaternions explicitly, or if you’re lazy like me, you can just use a flavor of the rotate() function that I made at the bottom there. The original class was written by Laurent Schmalen and I just ported it over to fit in with Java.

If you are a beginning programmer, notice that converting quaternions to rotations is advanced maths / code structures.
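For a flavor of what those classes do internally, the core unit-quaternion-to-rotation-matrix formula is compact. A hedged plain-Java sketch with my own names (the Rot / toxiclibs classes wrap this kind of conversion plus many edge cases):

```java
// Direct conversion of a unit quaternion (w, x, y, z) to a 3x3 rotation
// matrix -- the essence of what library quaternion-to-matrix methods compute.
public class QuatToMatrix {

    static double[][] toMatrix(double w, double x, double y, double z) {
        return new double[][] {
            { 1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)     },
            { 2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)     },
            { 2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y) }
        };
    }

    public static void main(String[] args) {
        // Identity quaternion -> identity matrix.
        double[][] m = toMatrix(1, 0, 0, 0);
        for (double[] row : m) {
            System.out.printf("%.2f %.2f %.2f%n", row[0], row[1], row[2]);
        }
    }
}
```

The resulting 3x3 can be padded out to a 4x4 and passed to applyMatrix() as sixteen floats.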

1 Like

Thanks @jeremydouglass!

Yep; I think this will take me some time! I understand trigonometry (well, the basics), and am now trying to learn / understand Euler angles. Some useful links I’ve found are:

I will review the example code in the Shapes3D library & see if I can figure it out.

I am using the Sparkfun BNO080 IMU, and they have a demo using Processing (which is how I discovered Processing!), available here:

They use the ToxiclibsSupport class, which also appears to be using Rot. I’m pasting the code for reference:

/**
 * <p>The ToxiclibsSupport class of the toxi.processing package provides various
 * shortcuts to directly use toxiclibs geometry datatypes with Processing style
 * drawing operations. Most of these are demonstrated in this example.</p>
 * <p>UPDATES:
 * <ul>
 * <li>2010-12-30: added sphere/cylinder resolution modulation</li>
 * </ul></p>
 * Copyright (c) 2010 Karsten Schmidt
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 */

import toxi.geom.*;
import toxi.geom.mesh.*;
import toxi.math.waves.*;
import toxi.processing.*;
import processing.serial.*;

String myString = null;
Serial myPort;  // The serial port

ToxiclibsSupport gfx;

boolean waitFlag = true;

void setup() {
  size(600, 600, P3D);
  gfx = new ToxiclibsSupport(this);
  // Print a list of connected serial devices in the console.
  // Depending on where your device falls on this list, you
  // may need to change Serial.list()[0] to a different number
  printArray(Serial.list());
  myPort = new Serial(this, Serial.list()[0], 9600);
  // Throw out the first chunk in case we caught it in the
  // middle of a frame
  myString = myPort.readStringUntil(13);
  myString = null;
}

void draw() {
  Quaternion RotQ = new Quaternion(1, 0, 0, 0);
  float qMatrix[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
  PMatrix M1 = getMatrix();
  // When there is a sizeable amount of data on the serial port,
  // read everything up to the first linefeed
  if (myPort.available() > 30) {
    myString = myPort.readStringUntil(13);
    // generate an array of strings that contains each of the comma
    // separated values
    String inQuat[] = splitTokens(myString, ",");
    // build a Quaternion from the inQuat[] array
    RotQ = new Quaternion(float(inQuat[0]), float(inQuat[1]), float(inQuat[2]), float(inQuat[3]));
    // convert the Quaternion to a 4x4 matrix Processing can apply
    RotQ.toMatrix4x4().toFloatArray(qMatrix);
    M1.set(qMatrix);
    AABB cube;

    background(255);
    // Set some mood lighting
    ambientLight(128, 128, 128);
    directionalLight(128, 128, 128, 0, 0, 1);
    lightFalloff(1, 0, 0);
    lightSpecular(0, 0, 0);
    // Get to the middle of the screen
    translate(width/2, height/2, 0);
    // Do some rotates to get oriented "behind" the device
    // (the exact rotation depends on how the sensor is mounted)
    rotateX(PI/2);
    // Apply the Matrix that we generated from our IMU Quaternion
    applyMatrix(M1);
    // Draw the Cube from a 3D Bounding Box
    cube = new AABB(new Vec3D(0, 0, 0), new Vec3D(100, 100, 100));;
  } else if (waitFlag) {
    text("Waiting for quaternions to chew on...", 10, 30);
    waitFlag = false;
  }
}


I am assuming this is equivalent to the Rot function offered by the Shapes3D library. However, I am confused about the following:
a) If I understand the code correctly, the AABB cube used in the Sparkfun’s sketch is being rotated using matrix operations, as opposed to using the approach shared in your post 3 days ago. Is this similar to what’s being discussed in this thread:

where the poster wants to use matrix operations to rotate / control the box.

So I need to figure out how to adapt such an approach to rotating the “virtual sphere” that I’ll be embedding my objects in?

Thank you! :slight_smile:

1 Like

If you use the matrix operations, yet another option is applyMatrix(), as you are updating the built-in perspective matrix rather than a matrix representing an object – although it may be slow.

1 Like

From what I can understand, Euler angles are simpler to work with compared to quaternions, and are less computationally expensive (though, I am running this on a powerful Windows PC, so I don’t know that it is such a big concern).

Since I am not going to be doing 90-degree movements in any direction, I don’t think I need to worry about gimbal lock. So, based on the little I understand, would getting Euler angles from the IMU, and then using those to rotate the objects, be simpler?
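For reference, the conversion I keep running into is the Z-Y-X ("aerospace") yaw / pitch / roll formula. A plain-Java sketch (hedged: your IMU's axis conventions may differ, so the signs and order may need adjusting against the real hardware):

```java
// Common quaternion (w, x, y, z) -> roll/pitch/yaw conversion, Z-Y-X
// ("aerospace") convention.
public class QuatToEuler {

    // Returns { roll, pitch, yaw } in radians.
    static double[] toEuler(double w, double x, double y, double z) {
        double roll = Math.atan2(2*(w*x + y*z), 1 - 2*(x*x + y*y));
        double sinp = 2*(w*y - z*x);
        // clamp to avoid NaN from asin near the +/-90 deg singularity (gimbal lock)
        double pitch = Math.asin(Math.max(-1, Math.min(1, sinp)));
        double yaw = Math.atan2(2*(w*z + x*y), 1 - 2*(y*y + z*z));
        return new double[] { roll, pitch, yaw };
    }

    public static void main(String[] args) {
        double h = Math.toRadians(45);  // half-angle of a 90 degree yaw about Z
        double[] e = toEuler(Math.cos(h), 0, 0, Math.sin(h));
        System.out.printf("roll=%.1f pitch=%.1f yaw=%.1f (deg)%n",
                Math.toDegrees(e[0]), Math.toDegrees(e[1]), Math.toDegrees(e[2]));
    }
}
```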



Not an expert, but I believe (?) a key thing for that approach would be that the Euler angle elements be intrinsic, rather than extrinsic – then they could be fed to rotateX / rotateY / rotateZ in order.

1 Like

My math is rusty and I have no idea what that means; more words or a link, please?

1 Like


The three elemental rotations may be extrinsic (rotations about the axes xyz of the original coordinate system, which is assumed to remain motionless), or intrinsic (rotations about the axes of the rotating coordinate system XYZ, solidary with the moving body, which changes its orientation after each elemental rotation).

…Processing performs its three rotations intrinsically – each rotation operation is a modification of the previous matrix state from the point of view of that state.

…I think (?).


I have a lot to learn over the next few days, but will update this post. I think it’s possible to convert quaternions into Euler angles in Arduino itself, and use those.

What I’m trying to figure out now is the code necessary to rotate the objects based on the quaternion / Euler angles in processing. Basically, once I get the quaternion data into Processing, what’s the code I would need to rotate the 3D objects in the sketch. I am referring to the following threads for some insight, but would appreciate any pointers the folks here may have :slight_smile:

(In this one, they start with Euler angles direct from the sensor, then switch to converting the quaternions into Eulers.)


Well, both Euler angles and quaternions have matrix representations. In the case of Euler angles this is straightforward: use three applyMatrix() calls for either 1. each of the 1-2-3 Euler angles or 2. the 3-1-3 Euler angles (whichever you prefer; they produce equivalent results).

In the case of quaternions, you’ll need a math module that lets you do matrix multiplications (quaternions of course have the advantage over Euler angles that you avoid gimbal lock).
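To illustrate (plain Java, my own helper names): each elemental rotation is a small matrix, successive applyMatrix / rotate calls amount to multiplying them in order, and the order matters.

```java
// Elemental rotation matrices and their composition. Successive rotate calls
// in Processing amount to right-multiplying the current matrix by these,
// so rotateZ(a) followed by rotateY(b) composes as Rz(a) * Ry(b).
public class EulerCompose {

    static double[][] rotZ(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] { { c, -s, 0 }, { s, c, 0 }, { 0, 0, 1 } };
    }

    static double[][] rotY(double a) {
        double c = Math.cos(a), s = Math.sin(a);
        return new double[][] { { c, 0, s }, { 0, 1, 0 }, { -s, 0, c } };
    }

    // 3x3 matrix product A * B.
    static double[][] mul(double[][] A, double[][] B) {
        double[][] R = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    R[i][j] += A[i][k] * B[k][j];
        return R;
    }

    // Apply matrix M to column vector v.
    static double[] apply(double[][] M, double[] v) {
        double[] r = new double[3];
        for (int i = 0; i < 3; i++)
            r[i] = M[i][0]*v[0] + M[i][1]*v[1] + M[i][2]*v[2];
        return r;
    }

    public static void main(String[] args) {
        double a = Math.toRadians(90);
        // Order matters: Rz*Ry and Ry*Rz send the x axis to different places.
        double[] p1 = apply(mul(rotZ(a), rotY(a)), new double[] { 1, 0, 0 });
        double[] p2 = apply(mul(rotY(a), rotZ(a)), new double[] { 1, 0, 0 });
        System.out.printf("Rz*Ry: (%.1f, %.1f, %.1f)%n", p1[0], p1[1], p1[2]);
        System.out.printf("Ry*Rz: (%.1f, %.1f, %.1f)%n", p2[0], p2[1], p2[2]);
    }
}
```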



Good timing! I was just looking at this project the other day. I have an Arduino + BNO055 sensor working as a handheld unit communicating over Bluetooth serial to PC / Processing. I am working on “cleaning” up the serial code and adding handshaking.


UPDATE: The yellow shape in the center of the video does not match the shape on the axes below because I had rotated the device PI/2 on a new board; I need to correct this in software.

More to follow…

1 Like

Looking forward to it! Actually, I was following your discussion in the other thread.

Is this the project you are referring to?

So cool :slight_smile:

1 Like