P3D performance with 62500 boxes

I have a program that I wrote a long time ago using the GLGraphics library, and I’m trying to port it to Processing 3.0. It has 62500 boxes moving independently all over the place. In Processing 1.5.1 I kept coordinate and color lists and updated them with updateVertices(coords) and updateColors(colors). Even with 62500 boxes it ran perfectly at 60 fps. Now in Processing 3 I have created a PShape(GROUP) called thisModel that contains 62500 child PShapes (boxes). When I translate the boxes in draw(), the frame rate drops to 12 fps on my 2018 MacBook Pro. It is the pp.translate(m.x, m.y, m.z) line that slows everything down. (po.getCurrentLoc() returns a PVector and works fine.)

for (int i = 0; i < 62500; i++) {
  PShape pp = thisModel.getChild(i);
  pp.resetMatrix();
  PVector m = po.getCurrentLoc();
  pp.translate(m.x, m.y, m.z);
}

pa.shape(thisModel);

Is there something I’m missing in the translate function? Any help would be great.
Thanks

Hi,

You can look at the source of PShape. I see that translate calls transform. I wonder if the cost comes from the new PMatrix3D() it creates inside transform.

Have you tried calling translate() when drawing, instead of PShape.translate()? Maybe it’s faster to only alter the rendering of the vertices instead of the vertices themselves.

Thanks for your response!
I’ve also tried this inside the for loop, instead of “recording” the boxes within the grouped shape, but it is just as slow:

pa.resetMatrix();
pa.translate(m.x, m.y, m.z);
pa.box(5);

This is a self contained program showing the same issue, right?

PShape group;

void setup() {  
  size(600, 600, P3D);
  noStroke();
  group = createShape(GROUP);
  for (int i = 0; i < 15000; i++) {
    PShape s = createShape(BOX, 5);
    s.translate(random(width), random(height));
    group.addChild(s);
  }
}
void draw() {
  background(0);
  PShape[] shapes = group.getChildren(); 
  for (int i = 0; i < shapes.length; i++) {
    //shapes[i].resetMatrix();
    shapes[i].translate(random(-1, 1), random(-1, 1));
  }
  shape(group);
  if(frameCount % 60 == 0) {
    println(frameRate);
  }
}

Yes, exactly, but when you bump it up to 62500 shapes the frame rate drops to 12 fps, whereas in Processing 1.5.1 with GLGraphics I was at 60 fps for 62500 shapes and more. I can’t figure out how to optimize this in Processing 3.0.

Does anyone have an idea why it can handle so few cubes, whereas in Processing 1.5.1 I had many more at 60 fps? Thanks!

Can you share the code of the 1.5 version?

You can see the same issue with the performance demos of dynamic particles, immediate vs retained. There’s some really screwy stuff in PShape somewhere if you do transforms. Keep meaning to fiddle with this. Interesting to know that it perhaps performed better in the past.

I just tried examples/demos/performance/*particles* and oddly the immediate version seems to perform smoother with my integrated Intel graphics.

For instance with the Dynamic examples, Immediate gives me stable 60 fps, while Retained drops frames. Isn’t Retained supposed to be faster?


Yes, exactly! Same here, with any GPU. It’s something to do with invalidation in the PShape. I looked at it a while ago as I was going to try and work around it, but got waylaid with other things.

It is slow with immediate-mode boxes as well as with retained and grouped PShapes. It is almost as if OpenGL and the GPU are not handling the work. Any help would be much appreciated.

If it’s urgent and the current version doesn’t work for you, could you run it on Processing 1.5.1? Or port it to OpenRNDR, openFrameworks or three.js?

AFAIK this is exactly the case - the transforms cause the tessellation to be recalculated on the CPU. You might be better looking at beginPGL and doing what you were doing before, or possibly vertex attributes and shaders?
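To make the "doing what you were doing before" idea a bit more concrete, here is a minimal sketch in plain Java of the CPU side only: all cube centers live in one flat array that is updated in place and copied into a direct FloatBuffer once per frame. The class name CubeCenters and its methods are hypothetical, not part of Processing or GLGraphics, and the actual upload (for example through the PGL object returned by beginPGL()) is left out on purpose.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical helper: keeps every cube center in one flat buffer. */
class CubeCenters {
    private final float[] xyz;        // x0, y0, z0, x1, y1, z1, ...
    private final FloatBuffer buffer; // direct buffer, the form GL upload calls usually expect

    CubeCenters(int count) {
        xyz = new float[count * 3];
        buffer = ByteBuffer.allocateDirect(count * 3 * Float.BYTES)
                           .order(ByteOrder.nativeOrder())
                           .asFloatBuffer();
    }

    /** Overwrites the center of cube i in place, without allocating. */
    void set(int i, float x, float y, float z) {
        int o = i * 3;
        xyz[o] = x;
        xyz[o + 1] = y;
        xyz[o + 2] = z;
    }

    /** Copies the whole array into the buffer and prepares it for reading. */
    FloatBuffer sync() {
        buffer.clear();
        buffer.put(xyz);
        buffer.flip();
        return buffer;
    }
}
```

With 62500 cubes this is 187500 floats per frame. A shader or an instanced draw could then use those values to offset one shared cube mesh, so the CPU never touches the individual cube vertices.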

Thanks for your comment.
It is a big, complex program and it already works really well in Processing 1.5.1 with the GLGraphics library at 60 fps.
I just want to port it to the most recent version of Processing and Java 8 because, as time goes on, the old setup becomes more problematic on newer operating systems.

Oh! I’m not actually sure where to begin with beginPGL… I just need a lot of single-colored cubes.

I opened this issue


I’m not sure that is the issue completely - this seems to call through to applyMatrixImpl() which in turn is calling through to methods like applyMatrixOnPolyGeometry which is re-calculating every vertex position on the CPU and re-uploading the “retained” geometry?!
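A rough back-of-the-envelope calculation suggests why this would hurt. The sketch below assumes each box has 24 vertices (4 per face, so that every face can carry its own normal) and counts only 3 position floats per vertex. Both numbers are assumptions, not taken from the Processing source, and the real per-vertex data is probably larger.

```java
/** Rough estimate of the CPU work if every box is re-transformed each frame. */
class VertexCost {
    static final int FACES_PER_BOX = 6;
    static final int VERTICES_PER_FACE = 4;   // assumption: no vertex sharing between faces
    static final int FLOATS_PER_POSITION = 3; // assumption: x, y, z only

    /** Number of vertex positions recomputed per frame. */
    static long verticesPerFrame(int boxes) {
        return (long) boxes * FACES_PER_BOX * VERTICES_PER_FACE;
    }

    /** Bytes of position data that would be re-uploaded per frame. */
    static long positionBytesPerFrame(int boxes) {
        return verticesPerFrame(boxes) * FLOATS_PER_POSITION * Float.BYTES;
    }

    public static void main(String[] args) {
        int boxes = 62500;
        System.out.println(verticesPerFrame(boxes) + " vertices per frame");
        System.out.println(positionBytesPerFrame(boxes) + " position bytes per frame");
    }
}
```

Under those assumptions, 62500 boxes mean 1,500,000 vertex transforms and about 18 MB of position data per frame, or roughly 1 GB per second at 60 fps.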

But do you want to “retain” the whole geometry in a particle system? Or have the particle as a retained shape and draw it multiple times at different transformations using the normal matrix stack?

Yeah there may be other issues, I think you’ve looked more into it. Maybe the use cases should be stated together with what works best on each.

I thought it’s not intuitive that each transformation on a PShape creates a new object and grabs RAM. But recalculating all the vertex positions also doesn’t sound like something to be expected. The positive side is that it might be easy to improve the performance of Processing a lot in some areas.
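If a fresh matrix object really is allocated on every call, the usual remedy is to reuse one scratch matrix. The plain-Java sketch below only illustrates that idea. ScratchTranslate is a hypothetical class and is not how PShape is implemented.

```java
import java.util.Arrays;

/** Hypothetical illustration of reusing one matrix instead of allocating per call. */
class ScratchTranslate {
    private final float[] m = new float[16]; // reused row-major 4x4 matrix

    /** Overwrites the scratch matrix with a pure translation and returns it. */
    float[] translation(float tx, float ty, float tz) {
        Arrays.fill(m, 0f);
        m[0] = m[5] = m[10] = m[15] = 1f; // identity diagonal
        m[3] = tx;
        m[7] = ty;
        m[11] = tz;
        return m;
    }

    /** Transforms the point (p[0], p[1], p[2], 1) in place. */
    static void apply(float[] m, float[] p) {
        float x = p[0], y = p[1], z = p[2];
        p[0] = m[0] * x + m[1] * y + m[2] * z + m[3];
        p[1] = m[4] * x + m[5] * y + m[6] * z + m[7];
        p[2] = m[8] * x + m[9] * y + m[10] * z + m[11];
    }
}
```

Every call to translation() hands back the same array, so a loop over 62500 shapes creates no garbage. The trade-off is that the returned matrix is only valid until the next call.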

Thanks for opening the issue!

In response to your question, I simply want to have 62500 cubes, each with its own (single) color and individual movement. Using Processing 1.5.1 and GLGraphics, which passed the work to the GPU, this was extremely fast (60 fps). But as OSX continues its advance, Java 6 and Processing 1.5.1 become less and less supported. Hence my desire to port my thousands of lines of code to Processing 3 and Java 8.