Update:
It turns out that jafama provides __Quick versions of its methods, which trade some accuracy for speed. The speed-up is roughly the same as with the optimised approximate methods I used before, but the quality is much better.
Better quality: not identical, but less banding than prior quick methods!
Another optimisation was refactoring the 3 cases of integer powers into plain multiplication, from:
XYZ[0] = (Xn2 * FastMath.pow(fx, 3));
to:
XYZ[0] = (Xn2 * fx * fx * fx);
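To illustrate the power-to-multiplication change in isolation, here is a minimal sketch. The class and method names are made up for this example, and the standard library's Math.pow stands in for FastMath.pow so that the snippet needs no extra dependency.

```java
// Sketch only: CubeSketch and its methods are hypothetical names.
class CubeSketch {
    // General-purpose power call (Math.pow stands in for FastMath.pow here).
    static double cubeWithPow(double fx) {
        return Math.pow(fx, 3);
    }

    // Same value using two multiplications and no library call.
    static double cubeWithMultiply(double fx) {
        return fx * fx * fx;
    }

    public static void main(String[] args) {
        double fx = 0.75;
        System.out.println(cubeWithPow(fx));
        System.out.println(cubeWithMultiply(fx));
    }
}
```

The two results can differ in the last bits of the double, which should not matter once the value is rounded to an 8-bit colour channel.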
And the big one: calculating the LAB value of a colorstop once, during its construction, instead of recalculating it for every pixel. Now each pixel needs only a call to lab2rgb(), rather than a call to lab2rgb() and two calls to rgb2lab(). The new LAB part in eval() looks like this:
double[] rsltclr1 = new double[4];
LAB.interpolate(currStop.labclr, prevStop.labclr, smoothStep, rsltclr1);
return LAB.lab2rgb(rsltclr1);
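For context, the caching idea could be sketched roughly as below. This is not the actual class from my code: LabStopSketch, ColorStop's constructor shape, the alpha handling and the conversion counter are all assumptions for illustration, and rgb2lab here is a textbook sRGB to CIELAB (D65) conversion standing in for the real one.

```java
// Sketch only: all names here are hypothetical stand-ins.
class LabStopSketch {
    static int conversions = 0; // counts rgb2lab calls, for illustration only

    // Packed 0xAARRGGBB -> {L, a, b, alpha}, D65 white point.
    static double[] rgb2lab(int argb) {
        conversions++;
        double r = linear(((argb >> 16) & 0xFF) / 255.0);
        double g = linear(((argb >> 8) & 0xFF) / 255.0);
        double b = linear((argb & 0xFF) / 255.0);
        double fx = f((0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047);
        double fy = f(0.2126729 * r + 0.7151522 * g + 0.0721750 * b);
        double fz = f((0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883);
        double alpha = ((argb >>> 24) & 0xFF) / 255.0;
        return new double[] {116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz), alpha};
    }

    // Undo the sRGB transfer curve.
    static double linear(double c) {
        return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
    }

    // CIELAB helper function.
    static double f(double t) {
        return t > 216.0 / 24389.0 ? Math.cbrt(t) : (24389.0 / 27.0 * t + 16) / 116;
    }

    static class ColorStop {
        final int clr;
        final double[] labclr; // converted once, at construction

        ColorStop(int clr) {
            this.clr = clr;
            this.labclr = rgb2lab(clr);
        }
    }

    // Per-pixel work: a linear blend of the cached LAB values, no rgb2lab call.
    static void interpolate(double[] a, double[] b, double t, double[] out) {
        for (int k = 0; k < out.length; k++) {
            out[k] = a[k] + t * (b[k] - a[k]);
        }
    }
}
```

With two stops built up front, any number of interpolate() calls leaves the conversion counter at 2; only the lab2rgb() step (not shown) remains per pixel.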
With all these optimisations, on a single thread, I’ve gone from ~1fps to ~23fps at 800x800 pixels (in contrast, RGB interpolation runs at ~37fps).
Another trick I’ve just thought of is evaluating the gradient at only every nth pixel and reusing that colour for the pixels in between, which seems to work very well for gradients. The main loop changes to this:
int recentPixel = 0;
for (int i = 0, y = 0, x; y < height; ++y) {
    for (x = 0; x < width; ++x, ++i) {
        if (i % n == 0) {
            float step = Functions.project(ox, oy, dx, dy, x, y);
            recentPixel = gradient1.eval(step);
        }
        gradient.pixels[i] = recentPixel;
    }
}
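One possible variation, sketched below, keys the test on x instead of the running index i, so every row starts with a fresh sample even when the width is not a multiple of n. This is an assumption about what might be preferable, not what the loop above does; SparseFillSketch and the Shader interface are hypothetical stand-ins for the gradient evaluation.

```java
// Sketch only: SparseFillSketch and Shader are hypothetical names.
class SparseFillSketch {
    // Stand-in for "project the pixel onto the gradient axis and eval()".
    interface Shader {
        int eval(int x, int y);
    }

    static int[] render(int width, int height, int n, Shader shader) {
        int[] pixels = new int[width * height];
        for (int y = 0, i = 0; y < height; ++y) {
            int recentPixel = 0;
            for (int x = 0; x < width; ++x, ++i) {
                // x == 0 always passes, so each row begins with its own sample.
                if (x % n == 0) {
                    recentPixel = shader.eval(x, y);
                }
                pixels[i] = recentPixel; // reuse the last evaluated colour
            }
        }
        return pixels;
    }
}
```

Either way, the number of eval() calls drops by roughly a factor of n, at the cost of horizontal runs of n identical pixels.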
Results:
n=1 (render every pixel, as before)
n=2
n=4
n=8 (banding is finally noticeable)