Pixel arrays and performance issues in p5.js and Processing

This mainly applies to p5.js, but also to Processing (Java).

Running this simple script in the p5 web editor gets me 3 frames per second at only 400x400 resolution. What is it about this script that is so slow, and is there a much faster way to do the same (or essentially the same) thing without the performance issues? There are no distance checks or one-to-many particle interactions, which typically crush the CPU when pixel editing; there is only a two-level nested loop, a pseudorandom number generator, and writes to the pixel array.

Up until now, a lot of my graphics programs have relied heavily on this kind of pixel array loop, and I very quickly run into speed limitations which prevent me from realizing my ideas. How should I be approaching this issue? What do I need to learn to solve it? I’ve heard some float the idea of writing the draw functions for graphics cards instead of inside the main draw function, but I’ve never found a good resource to begin translating code for multi-threaded graphics processing, or even understood what that would mean in terms of p5.js. Thanks!

function setup() {
  createCanvas(400, 400);
}

function draw() {
  background(220);

  loadPixels();
  for (let x2 = 0; x2 < width; x2++) {
    for (let y2 = 0; y2 < height; y2++) {
      let pix = 4 * (width * y2 + x2);
      const randNum = noise(x2 / 100, y2 / 100) * 255;
      pixels[pix + 0] = randNum;
      pixels[pix + 1] = randNum;
      pixels[pix + 2] = randNum;
    }
  }
  updatePixels();
  console.log(frameRate());
}

There’s nothing wrong with your code, and p5 is capable of handling many more pixels per second. So it’s the noise function. There has been earlier discussion that it isn’t as fast as people wished or hoped for.
Search the forum and have a look at how this problem has been solved before.
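
If noise() really is the bottleneck, one thing worth trying is to stop calling it every frame. In the sketch above the noise coordinates never change between frames, so the values could be computed once and reused. A minimal sketch of that idea, assuming two hypothetical helpers, buildNoiseField and blitGrey (neither is part of p5.js), and assuming pixelDensity(1) so that pixels[] holds exactly width * height * 4 entries:

```javascript
// Hypothetical helper: compute one grey value per pixel, once.
// noiseFn is expected to behave like p5's noise(x, y) and return 0..1.
function buildNoiseField(w, h, scale, noiseFn) {
  const field = new Uint8ClampedArray(w * h);
  let i = 0;
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      field[i++] = noiseFn(x / scale, y / scale) * 255;
    }
  }
  return field;
}

// Hypothetical helper: copy cached grey values into an RGBA array.
// The alpha channel is left untouched.
function blitGrey(field, pixels) {
  for (let i = 0, p = 0; i < field.length; i++, p += 4) {
    const g = field[i];
    pixels[p] = g;
    pixels[p + 1] = g;
    pixels[p + 2] = g;
  }
}
```

In a p5 sketch this would look something like field = buildNoiseField(width, height, 100, (x, y) => noise(x, y)) in setup(), then blitGrey(field, pixels) between loadPixels() and updatePixels() in draw(). It only helps when the noise does not animate.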

For best performance for any type of container we need to access it sequentially.

So you need to swap the width & height order of your double loop.

That is, the outer loop goes w/ the y coordinate while the inner goes w/ the x.
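
A minimal sketch of that loop order, written as a plain function so it does not depend on p5. The name fillRowMajor and the greyAt callback are made up for illustration; greyAt(x, y) stands in for whatever computes the pixel value.

```javascript
// Fill an RGBA pixel array row by row, so memory is walked front to back.
// greyAt(x, y) is a stand-in that returns a value in 0..255.
function fillRowMajor(pixels, w, h, greyAt) {
  let p = 0; // running index into pixels[], advances by 4 per pixel
  for (let y = 0; y < h; y++) {     // outer loop: rows
    for (let x = 0; x < w; x++) {   // inner loop: columns within the row
      const g = greyAt(x, y);
      pixels[p] = g;
      pixels[p + 1] = g;
      pixels[p + 2] = g;
      pixels[p + 3] = 255;
      p += 4;
    }
  }
  return pixels;
}
```

Because the index only ever moves forward, there is also no need to recompute 4 * (width * y + x) for every pixel.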

Take a look at the method Blurry::drawNoise() from the online example sketch below for an example of the double loop correct order:

Wow, that is a fascinating sketch and its performance is astonishingly good. I think I understand about 10% of what is going on there. Would it be possible to make a version of that sketch with comments explaining what is going on and, even more important, why you’re doing it that way? It looks so different from the sketches with basic syntax and functions that I’ve been doing. I think if I could understand what is going on in your sketch there are a lot of breakthroughs I could make in my own coding. I feel like to understand it properly I would have to ask you function by function and line by line “why?”

“For best performance for any type of container we need to access it sequentially.”

That hadn’t occurred to me but makes complete sense. Can’t believe I’ve been doing it backwards all this time.

Also, what does this syntax mean?

 { pixels: imgPixels } = img;?

I get that it’s creating some kind of variable imgPixels, somehow… but it’s under const and you’re writing to it, and why is there a colon there and what are the brackets for? Shouldn’t it be something like imgPixels = img.pixels;? Is it rewriting img.imgPixels to imgPixels? Is this some JavaScript syntax or specific to the Pixel method? What’s the purpose of this?

I’m really struggling to see why this sketch runs so much faster than just a loop calling noise(). Does creating a new p5 instance have something to do with it?

It’s cheating a little though, b/c the p5.Image & p5.Graphics objects are just 100 x 100 and then the latter is scaled up to 650 x 600 when it’s actually displayed.

Also it’s not touching the RGB channels of the p5.Image::pixels[], only the alpha part.

{ pixels: imgPixels } = img; is indeed equivalent to imgPixels = img.pixels;.

That is called object destructuring assignment, btw.
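
A small illustration of that equivalence, using a plain object as a stand-in for a p5.Image (the object here is made up; only its pixels property matters):

```javascript
// A plain object standing in for a p5.Image.
const img = { width: 2, height: 1, pixels: new Uint8ClampedArray(8) };

// Destructuring with a rename: read img.pixels and bind it to a new
// const named imgPixels. The colon means "take pixels, call it imgPixels".
const { pixels: imgPixels } = img;

// The long form does the same thing.
const samePixels = img.pixels;

// const only prevents rebinding the name imgPixels;
// the array it refers to is still writable.
imgPixels[3] = 255;
```

Nothing is copied or renamed on img itself. imgPixels and img.pixels refer to the same array, which is why writing through a const binding still changes the image.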

My online sketch is relying on the p5js noise() too.
I’d really love to know how to make it faster.

I tried to modify an image instead of the canvas at 100x100 and scaled it up, as well as changed the order of the loops. Still dropping to 5fps every second, 30-40 fps between. Like a weird but very consistent stutter. Not sure if there is something else I’m missing here.

let img;
let step = 0;

function setup() {
  createCanvas(400, 400);
  img = createImage(100, 100);
}

function draw() {
  background(100, 122, 90);
  img.loadPixels();

  for (let yy = 0; yy < img.height; yy++) {   // rows are bounded by height
    for (let xx = 0; xx < img.width; xx++) {  // columns are bounded by width
      let pix = 4 * (img.width * yy + xx);
      const randNum = noise(xx / 10 + step, yy / 10, step) * 255;
      img.pixels[pix + 0] = randNum;
      img.pixels[pix + 1] = randNum;
      img.pixels[pix + 2] = randNum;
      img.pixels[pix + 3] = randNum;
    }
  }
  step += .005;
  img.updatePixels();

  image(img, 0, 0, width, height);
  console.log(frameRate());
}

Also, general coding question. Is there some way to place the declaration of “step” in or next to the loop?

Also, in the sketch you posted, what is the meaning and purpose of:

static isLittleEndian() {
  return new Uint8Array(Uint32Array.of(0x12345678).buffer)[0] === 0x78;
}

and


const pix32 = new Uint32Array(img.pixels.buffer),
      c = pixColor.levels,
      rgba = LITTLE_ENDIAN ?
             c[3] << 0o30 | c[2] << 0o20 | c[1] << 0o10 | c[0] : // aBGR
             c[0] << 0o30 | c[1] << 0o20 | c[2] << 0o10 | c[3];  // RGBa

I have zero idea what that is for, what it does, or why it is needed.

Iterating over a whole image pixel by pixel is very slow no matter what we do.

My sketch is a bit more performant b/c it only changes each pixels[]’ alpha channel rather than the whole RGBa.

The most you can do is split the work across draw() frames.

For example, in one frame do only the even indices, and in the next one do only the odd indices.
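
A rough sketch of that splitting idea. Interleaving by rows rather than by individual indices is an assumption made here to keep each pass sequential, and fillAlternateRows and greyAt are made-up names:

```javascript
// Update only every other row per call.
// The parity of `frame` picks even rows (0, 2, 4...) or odd rows (1, 3, 5...).
function fillAlternateRows(pixels, w, h, frame, greyAt) {
  for (let y = frame % 2; y < h; y += 2) {
    let p = 4 * w * y; // index of the first pixel in row y
    for (let x = 0; x < w; x++, p += 4) {
      const g = greyAt(x, y);
      pixels[p] = g;
      pixels[p + 1] = g;
      pixels[p + 2] = g;
      pixels[p + 3] = 255;
    }
  }
}
```

In p5 this could be called with frameCount as the frame argument. Each frame then does half the noise() calls, at the cost of every row being one frame stale half of the time.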

More performance is only achievable via shader programming AFAIK.

Blurry.isLittleEndian() should always return true b/c AFAIK all mobile devices are configured as little-endian just like desktop PCs.

It means we can shorten the rgba declaration like this:
rgba = c[3] << 0o30 | c[2] << 0o20 | c[1] << 0o10 | c[0];
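
To show what those two snippets are for, here is a standalone sketch under some assumptions: packRGBA and fillColor are made-up names, and a plain Uint8ClampedArray stands in for img.pixels. The octal literals 0o30, 0o20 and 0o10 are just 24, 16 and 8, so each channel is shifted into its own byte of a 32-bit number.

```javascript
// True when the lowest-order byte of a 32-bit word is stored first in memory.
function isLittleEndian() {
  return new Uint8Array(Uint32Array.of(0x12345678).buffer)[0] === 0x78;
}

// Pack four 0..255 channels into one 32-bit word whose bytes
// end up in memory in R, G, B, A order.
function packRGBA(r, g, b, a, littleEndian = isLittleEndian()) {
  const word = littleEndian
    ? (a << 24) | (b << 16) | (g << 8) | r   // aBGR as a number
    : (r << 24) | (g << 16) | (b << 8) | a;  // RGBa as a number
  return word >>> 0; // force an unsigned 32-bit result
}

// Fill an RGBA byte array with one colour,
// using one 32-bit write per pixel instead of four 8-bit writes.
function fillColor(pixels, r, g, b, a) {
  const pix32 = new Uint32Array(pixels.buffer, pixels.byteOffset, pixels.length >> 2);
  pix32.fill(packRGBA(r, g, b, a));
  return pixels;
}
```

The Uint32Array is a second view onto the same memory as the byte array, so nothing is copied. The endianness check only decides which byte order makes the four channels land in the positions that pixels[] expects.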