For a linear gradient, I’m iterating pixels like this:
for (int i = 0, y = 0, x; y < height; ++y) {
for (x = 0; x < width; ++x, ++i) {
float step = Functions.project(ox, oy, dx, dy, x, y);
gradientPG.pixels[i] = gradient.eval(step);
}
}
This iteration can be multi-threaded on the CPU quite easily (where each thread writes to a portion of the PGraphics pixel array):
Code...
final int threadCount = 4;
final int pixels = gradientPG.pixels.length / threadCount;
ExecutorService exec = Executors.newFixedThreadPool(threadCount);
gradientPG = p.createGraphics((int) dimensions.x, (int) dimensions.y);
gradientPG.beginDraw();
gradientPG.loadPixels();
HashSet<LinearThread> threads = new HashSet<>(threadCount);
for (int i = 0; i < threadCount; i++) {
threads.add(new LinearThread(i * pixels, pixels, ox, oy, dx, dy, gradient));
}
try {
exec.invokeAll(threads); // run threads, block until all finished
} catch (InterruptedException e) {
e.printStackTrace();
}
gradientPG.updatePixels();
gradientPG.endDraw();
Where a LinearThread
is this:
Code...
public class LinearThread implements Callable<Boolean> {
final int offset, pixels;
final float ox, oy, dx, dy;
final Gradient gradient;
public LinearThread(int offset, int pixels, float ox, float oy, float dx, float dy,
Gradient gradient) {
this.offset = offset;
this.gradient = new Gradient();
this.dx = dx;
this.dy = dy;
this.ox = ox;
this.oy = oy;
this.pixels = pixels;
}
@Override
public Boolean call() {
for (int i = offset; i < offset + pixels; i++) {
float step = Functions.project(ox, oy, dx, dy, i % p.width, (i - i % p.width) / p.height);
gradientPG.pixels[i] = gradient.eval(step);
}
return true;
}
}
The Gradient
class (a wrapper for a gradient’s colour and colour stops) eval()
method looks like this. Given a position (step) in the gradient, it calculates the color at that point using the current colour space. However, the time this method takes is dominated by the maths methods during LAB conversion.
Code...
int eval(final float step) {
final int size = colorStops.size();
// Exit from the function early whenever possible.
if (size == 0) {
return 0x00000000;
} else if (size == 1 || step < 0.0) {
return colorStops.get(0).clr;
} else if (step >= 1.0) {
return colorStops.get(size - 1).clr;
}
ColorStop currStop;
ColorStop prevStop;
float currPercent, scaledst;
for (int i = 0; i < size; i++) {
currStop = colorStops.get(i);
currPercent = currStop.percent;
if (step < currPercent) {
// These can be declared within the for-loop because
// if step < currPercent, the function will return
// and no more iterations will be executed.
float[] originclr = new float[4];
float[] destclr = new float[4];
float[] rsltclr = new float[4];
// If not at the first stop in the gradient (i == 0),
// then get the previous.
prevStop = colorStops.get(i - 1 < 0 ? 0 : i - 1);
scaledst = step - currPercent;
float denom = prevStop.percent - currPercent;
if (denom != 0) {
scaledst /= denom;
}
// Assumes that color stops' colors are ints. They could
// also be float[] arrays, in which case they wouldn't
// need to be decomposed.
switch (colorSpace) {
// TODO CALC STEP HERE ?
case HSB :
Functions.rgbToHsb(currStop.clr, originclr);
Functions.rgbToHsb(prevStop.clr, destclr);
Functions.smootherStepHsb(originclr, destclr, scaledst, rsltclr);
return Functions.composeclr(Functions.hsbToRgb(rsltclr));
case RGB :
Functions.decomposeclr(currStop.clr, originclr); // decompose int color to [R,G,B,A] (0-1)
Functions.decomposeclr(prevStop.clr, destclr); // decompose int color to [R,G,B,A] (0-1)
Functions.smootherStepRgb(originclr, destclr, scaledst, rsltclr); // lerp between colors?
return Functions.composeclr(rsltclr);
case LAB :
double[] rsltclr1 = new double[4];
double[] labA = LAB.rgb2lab(Functions.decomposeclr(currStop.clr)); // RGB->LAB
double[] labB = LAB.rgb2lab(Functions.decomposeclr(prevStop.clr));
LAB.interpolate(labA, labB, scaledst, rsltclr1); // stil in LAB
float[] inter = LAB.lab2rgb(rsltclr1);
return Functions.composeclr(inter[0], inter[1], inter[2]);
case FAST_LAB :
double[] rsltclr2 = new double[4];
double[] labC = FastLAB.rgb2lab(Functions.decomposeclr(currStop.clr)); // RGB->LAB
double[] labD = FastLAB.rgb2lab(Functions.decomposeclr(prevStop.clr));
FastLAB.interpolate(labC, labD, scaledst, rsltclr2); // stil in LAB
float[] inter1 = FastLAB.lab2rgb2(rsltclr2);
return Functions.composeclr(inter1[0], inter1[1], inter1[2]);
default :
break;
}
}
}
return colorStops.get(size - 1).clr;
}
Regarding the profiling, I’ve since replaced the Processing methods with jafama.FastMath, and I’m now working in doubles only until the very end (so only a single cast to float, rather than casting in every method) – these changes have doubled the speed.
I’ve developed a version that uses the optimised approximate methods pow()
, ln()
and exp()
methods from here; these are indeed much faster (x5), but sadly too imprecise, often making a noticeable impact in the gradient results.
Example: Left gradient uses FastMath methods; right gradient uses approximate methods – this is a more severe example.
The LAB<–>RGB conversion uses fraction exponents, logs, etc (that’s why it’s so heavy) Here’s the conversion class:
Code...
final class LAB {
// D65 standard referent
private static final double Xn = 0.950470;
private static final double Yn = 1.0;
private static final double Zn = 1.088830;
private static final double t0 = 0.137931034; // 4 / 29
private static final double t1 = 0.206896552; // 6 / 29
private static final double t2 = 0.12841855; // 3 * t1 * t1
private static final double t3 = 0.008856452; // t1 * t1 * t1
private static final double third = 1 / 3.0f;
/**
* sRGB (0-255) to LAB (CIE-L*ab)
*
* @param rgb
* @return
*/
public static double[] rgb2lab(float[] rgb) {
double[] xyz = rgb2xyz(rgb[0], rgb[1], rgb[2]);
double l = 116 * xyz[1] - 16;
if (l < 0) {
return new double[0]; // todo?
}
return new double[] { l, 500 * (xyz[0] - xyz[1]), 200 * (xyz[1] - xyz[2]) };
}
private static double rgb_xyz(double r) {
if ((r /= 255) <= 0.04045)
return r / 12.92;
return FastMath.pow((r + 0.055) / 1.055, 2.4);
}
private static double xyz_lab(double t) {
if (t > t3) {
return FastMath.pow(t, third);
} else {
return t / t2 + t0;
}
}
private static double[] rgb2xyz(double r, double g, double b) {
r = rgb_xyz(r);
g = rgb_xyz(g);
b = rgb_xyz(b);
final double x = xyz_lab((0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / Xn);
final double y = xyz_lab((0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / Yn);
final double z = xyz_lab((0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / Zn);
return new double[] { x, y, z };
}
// --------------------------------------------------------------------------------------------
static final float Xn2 = 95.04f;
static final float Yn2 = 100;
static final float Zn2 = 108.89f;
public static float[] lab2rgb(double[] lab) {
return xyz2rgb(lab2xyz(lab));
}
/**
* https://github.com/Heanzy/bupt-wechatapp/tree/master/src/main/java/com/buptcc/wechatapp/utils
* @param Lab
* @return
*/
private static double[] lab2xyz(double[] Lab) {
double[] XYZ = new double[3];
final double L, a, b;
final double fx, fy, fz;
L = Lab[0];
a = Lab[1];
b = Lab[2];
fy = (L + 16) / 116;
fx = a / 500 + fy;
fz = fy - b / 200;
if (fx > 0.2069) {
XYZ[0] = (Xn2 * FastMath.pow(fx, 3));
} else {
XYZ[0] = Xn2 * (fx - 0.1379f) * 0.1284f;
}
if ((fy > 0.2069) || (L > 8)) {
XYZ[1] = (Yn2 * FastMath.pow(fy, 3));
} else {
XYZ[1] = Yn2 * (fy - 0.1379f) * 0.1284f;
}
if (fz > 0.2069) {
XYZ[2] = (Zn2 * FastMath.pow(fz, 3));
} else {
XYZ[2] = Zn2 * (fz - 0.1379f) * 0.1284f;
}
return XYZ;
}
private static float[] xyz2rgb(double[] XYZ) {
final float[] sRGB = new float[3];
final double X, Y, Z;
double dr, dg, db;
X = XYZ[0];
Y = XYZ[1];
Z = XYZ[2];
dr = 0.032406 * X - 0.015371 * Y - 0.0049895 * Z;
dg = -0.0096891 * X + 0.018757 * Y + 0.00041914 * Z;
db = 0.00055708 * X - 0.0020401 * Y + 0.01057 * Z;
if (dr <= 0.00313) {
dr = dr * 12.92;
} else {
dr = (FastMath.exp(FastMath.log(dr) / 2.4) * 1.055 - 0.055);
}
if (dg <= 0.00313) {
dg = dg * 12.92;
} else {
dg = (FastMath.exp(FastMath.log(dg) / 2.4) * 1.055 - 0.055);
}
if (db <= 0.00313) {
db = db * 12.92;
} else {
db = (FastMath.exp(FastMath.log(db) / 2.4) * 1.055 - 0.055);
}
dr = dr * 255; // 255 here TODO
dg = dg * 255;
db = db * 255;
dr = FastMath.min(255, dr);
dg = FastMath.min(255, dg);
db = FastMath.min(255, db);
sRGB[0] = (int) (dr + 0.5);
sRGB[1] = (int) (dg + 0.5);
sRGB[2] = (int) (db + 0.5);
return sRGB;
}
/**
* originclr, destclr, scaledst, rsltclr
*
* @param a LAB col 1
* @param b LAB col 2
* @param st step
* @param out new col
* @return
*/
public static double[] interpolate(double[] a, double[] b, float step, double[] out) {
step = Functions.calcStep(step);
out[0] = a[0] + step * (b[0] - a[0]);
out[1] = a[1] + step * (b[1] - a[1]);
out[2] = a[2] + step * (b[2] - a[2]);
return out;
}
}
Looking at it just now, I think these two lines:
double[] labA = LAB.rgb2lab(Functions.decomposeclr(currStop.clr));
double[] labB = LAB.rgb2lab(Functions.decomposeclr(prevStop.clr));
… can be computed just once for the color stop (rather than for every pixel!!!)