I want to convert RGB colour images to greyscale. I know a number of ways to do this in Processing, but the algorithms are under the bonnet. I want to do the conversion AND know what I am doing!
Here is what the author of the Color-to-Grayscale paper says:
Perhaps the simplest color to grayscale algorithm is Intensity. It is the mean of the RGB channels:
Although Intensity is calculated using linear channels, in practice gamma correction is often left intact when using datasets containing gamma corrected images. We call this method Gleam:
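In symbols, as far as I can reconstruct the paper's definitions (the primes mark gamma-corrected channels, and the 1/2.2 exponent is my assumption of a typical display gamma, not something stated in this thread), the two methods are:

$$
G_{\text{Intensity}} = \tfrac{1}{3}\,(R + G + B), \qquad
G_{\text{Gleam}} = \tfrac{1}{3}\,(R' + G' + B'), \qquad t' = t^{1/2.2}
$$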
In terms of pixel values, Intensity and Gleam produce very different results.
When gamma corrected Intensity and Gleam are both applied to natural images, we found that Gleam produces pixel values around 20–25% smaller on average.
What is gamma correction and how would I implement the Gleam color to grayscale algorithm in Processing?
RGB values in Processing are already gamma-corrected, so simply taking the average of the three RGB components gives the Gleam value.
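As a minimal sketch of that idea, here is the averaging written in plain Python on packed ARGB integers, which is how Processing stores each entry of `pixels`. The names `gleam_pixel` and `gleam_pixels` are made up for illustration, and the code works on a plain list of ints rather than a PImage, so treat it as a starting point rather than a drop-in replacement.

```python
def gleam_pixel(argb):
    """Return a grey ARGB pixel whose level is the mean of the stored R, G, B values."""
    a = (argb >> 24) & 0xFF   # alpha, kept unchanged
    r = (argb >> 16) & 0xFF   # stored (gamma-corrected) red
    g = (argb >> 8) & 0xFF    # stored (gamma-corrected) green
    b = argb & 0xFF           # stored (gamma-corrected) blue
    grey = (r + g + b) // 3   # plain mean, no weighting, no gamma decoding
    return (a << 24) | (grey << 16) | (grey << 8) | grey


def gleam_pixels(pixels):
    """Apply gleam_pixel to every packed pixel in a list."""
    return [gleam_pixel(p) for p in pixels]
```

Inside a sketch you would call `loadPixels()` on the image, run each entry of `img.pixels` through something like `gleam_pixel`, and finish with `updatePixels()`.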
If we linearly interpolate over a range of values in Processing (say black-to-white, 0-255), we get an even change in the perceived brightness of each colour: a black-to-white gradient with grey at the midpoint, even though the physical light intensity is non-linear. This is the expected behavior under gamma correction.
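To put numbers on this, here is a small sketch of the standard sRGB transfer functions. It assumes the stored values are sRGB-encoded, which is typical for screens and image files but is my assumption here rather than something Processing guarantees. The function names are only illustrative.

```python
def srgb_to_linear(c):
    """Decode a gamma-corrected sRGB value in [0, 1] to linear light in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l):
    """Encode a linear-light value in [0, 1] back to gamma-corrected sRGB."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1 / 2.4) - 0.055


# Stored mid-grey (128 out of 255) expressed as a fraction of full linear light
mid_grey_linear = srgb_to_linear(128 / 255.0)
```

Under that assumption, the grey that looks halfway between black and white (128) emits only about a fifth of the light of full white, which is the non-linearity described above.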
The technical term you’re looking for is Relative Luminance. It is a specific standard for calculating the brightness of colors as perceived by humans, with scientifically derived weights.
A rather well-illustrated Twitter/X thread on the subject.
Although the Processing algorithm for grayscale conversion (using filter(GRAY)) is not an exact implementation of relative luminance, it’s conceptually similar and a close approximation for general use (only the weights differ slightly).
For comparison, here’s what the grayscale conversion function would look like if it used the relative luminance weightings.
def relative_luminance(img):
    """
    Convert the image to grayscale using true relative luminance weights and bit shifting.
    Reference -> https://en.wikipedia.org/wiki/Relative_luminance
    """
    lum_img = createImage(img.width, img.height, RGB)
    img.loadPixels()
    lum_img.loadPixels()
    for i in range(len(img.pixels)):
        col = img.pixels[i]
        # Extract RGB components using bit shifts
        r = (col >> 16) & 0xff  # Red component
        g = (col >> 8) & 0xff   # Green component
        b = col & 0xff          # Blue component
        # Calculate the relative luminance using scaled integer weights:
        # Luminance = 0.2126 * Red + 0.7152 * Green + 0.0722 * Blue
        # Weights scaled by 256 and rounded: 0.2126 * 256 -> 54, 0.7152 * 256 -> 183, 0.0722 * 256 -> 18
        lum = (54 * r + 183 * g + 18 * b) >> 8  # Bit-shift by 8 (dividing by 256)
        # Set the grayscale pixel by combining the luminance value into RGB format
        lum_img.pixels[i] = (col & 0xff000000) | (lum << 16) | (lum << 8) | lum
    lum_img.updatePixels()
    return lum_img
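One detail of the integer weights is worth checking by hand: 54 + 183 + 18 adds up to 255, not 256, so pure white comes out one level short. The sketch below reproduces just the arithmetic in plain Python. `luminance_shift` is a hypothetical helper, and the (54, 183, 19) set is only one possible way of rebalancing the weights so that they sum to 256.

```python
def luminance_shift(r, g, b, weights=(54, 183, 18)):
    """Weighted sum of 0-255 channels using integer weights and a shift by 8."""
    wr, wg, wb = weights
    return (wr * r + wg * g + wb * b) >> 8


# Weights summing to 255: white (255, 255, 255) is truncated to 254
white_truncated = luminance_shift(255, 255, 255)

# Weights summing to 256: white stays at 255
white_rebalanced = luminance_shift(255, 255, 255, (54, 183, 19))
```

The difference is a single grey level, so it is unlikely to be visible, but it matters if you later test for pure white.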
Thank you everyone! That was some good reading.
I made a sketch (based on the sketch by @solub) to compare different weightings, and in my humble opinion the Relative Luminance is best (on a screen). You can see for yourselves because I have uploaded the results at the end of the message.
I have also uploaded a conversion made in Adobe Illustrator > Edit Colors > Convert to Grayscale. Seems like Adobe also uses Relative Luminance…?
from __future__ import print_function


def setup():
    size(1070, 892)
    global img
    img = loadImage("2018_431_IN3_RET.png")


def draw():
    # The horizontal mouse position selects which weighting is shown
    if mouseX <= img.width/4:
        image(img, 0, 0)  # Original colour image
    elif mouseX > img.width/4 and mouseX <= img.width/2:
        weight_R = 54
        weight_G = 183
        weight_B = 18
        image(relative_luminance(img, weight_R, weight_G, weight_B), 0, 0)
    elif mouseX > img.width/2 and mouseX <= (img.width/4)*3:
        weight_R = 77
        weight_G = 151
        weight_B = 28
        image(relative_luminance(img, weight_R, weight_G, weight_B), 0, 0)
    else:
        weight_R = 85
        weight_G = 85
        weight_B = 85
        image(relative_luminance(img, weight_R, weight_G, weight_B), 0, 0)
    # Tick marks at the boundaries between the four zones
    line(img.width/4, img.height - 50, img.width/4, img.height)
    line(img.width/2, img.height - 50, img.width/2, img.height)
    line((img.width/4)*3, img.height - 50, (img.width/4)*3, img.height)
    if ((keyPressed) and (key == 'p')):
        save("something.jpg")


def relative_luminance(img, weight_R, weight_G, weight_B):
    """
    Convert the image to grayscale using the given integer weights (scaled by 256) and bit shifting.
    Reference -> https://en.wikipedia.org/wiki/Relative_luminance
    """
    lum_img = createImage(img.width, img.height, RGB)
    img.loadPixels()
    lum_img.loadPixels()
    for i in range(len(img.pixels)):
        col = img.pixels[i]
        # Extract RGB components using bit shifts
        r = (col >> 16) & 0xff  # Red component
        g = (col >> 8) & 0xff   # Green component
        b = col & 0xff          # Blue component
        # Calculate the weighted sum using the scaled weights passed in:
        lum = (weight_R * r + weight_G * g + weight_B * b) >> 8  # Bit-shift by 8 (dividing by 256)
        # Set the grayscale pixel by combining the luminance value into RGB format
        lum_img.pixels[i] = (col & 0xff000000) | (lum << 16) | (lum << 8) | lum
    lum_img.updatePixels()
    return lum_img
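To see how much the three weight sets in the sketch actually disagree, it helps to run them on single colours instead of a whole image. The plain-Python sketch below does that. `WEIGHT_SETS`, `grey_level` and `compare` are names I made up, and the labels are my reading of the numbers: 77/151/28 looks close to the 0.299/0.587/0.114 luma weights scaled by 256, but that is an assumption on my part.

```python
# The three integer weight sets used in the sketch (each scaled by 256)
WEIGHT_SETS = {
    "relative_luminance": (54, 183, 18),
    "luma_like": (77, 151, 28),
    "average": (85, 85, 85),
}


def grey_level(r, g, b, weights):
    """Grey level (0-255) for one colour using integer weights and a shift by 8."""
    wr, wg, wb = weights
    return (wr * r + wg * g + wb * b) >> 8


def compare(r, g, b):
    """Return the grey level each weight set gives for the colour (r, g, b)."""
    return dict((name, grey_level(r, g, b, w)) for name, w in WEIGHT_SETS.items())


pure_green = compare(0, 255, 0)
pure_blue = compare(0, 0, 255)
```

The sets differ most on saturated colours: relative luminance renders pure green much lighter and pure blue much darker than the plain average does, which is probably why it looks most natural on photographs.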
@glv Thanks ever so. I am trying to understand.
What is (int)?
I would love to know what is going on in the Python line: lum_img.pixels[i] = (col & 0xff000000) | (lum << 16) | (lum << 8) | lum
I understand bit shifting is faster, but I like to develop in a plonky clear way and then make the code efficient at the end. I plan to add in some other features such as turning the image into a photographic negative, white point, etc.
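For the pixel-packing line asked about above, a slower but more readable version may help. Each pixel is a single 32-bit integer laid out as four bytes: alpha, red, green, blue. `col & 0xff000000` keeps only the alpha byte in its place, `lum << 16` and `lum << 8` move the grey level into the red and green slots, the bare `lum` sits in the blue slot, and `|` merges the four pieces. The helpers below are hypothetical names in plain Python, and they assume non-negative integers, whereas Processing itself stores pixels as signed 32-bit values.

```python
def unpack_argb(col):
    """Split a packed pixel into its alpha, red, green and blue bytes."""
    a = (col >> 24) & 0xFF
    r = (col >> 16) & 0xFF
    g = (col >> 8) & 0xFF
    b = col & 0xFF
    return a, r, g, b


def pack_argb(a, r, g, b):
    """Combine four 0-255 values back into one packed pixel."""
    return (a << 24) | (r << 16) | (g << 8) | b


def grey_pixel(col, lum):
    """Same result as (col & 0xff000000) | (lum << 16) | (lum << 8) | lum."""
    a, _, _, _ = unpack_argb(col)
    return pack_argb(a, lum, lum, lum)


def negative_pixel(col):
    """Photographic negative: flip each colour channel, leave alpha alone."""
    a, r, g, b = unpack_argb(col)
    return pack_argb(a, 255 - r, 255 - g, 255 - b)
```

Written this way, extra steps such as a negative slot in between unpacking and packing, and the shifts can be inlined again later if speed becomes an issue. Inside a sketch, Processing's own `red()`, `green()`, `blue()` and `color()` do a similar job.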