Bit masking (or not)?

I often see it said that bit masking is faster [1] than other simpler Processing methods. I also read that bit masking is not faster [2], and it comes with disadvantages such as extra lines of code and less clarity.

Is it still true that bit masking is faster in Python mode of Processing? Is there a way to measure and compare the speeds of say float b1 = blue(c) and float b2 = c & 0xFF?

[1]
The blue() function is easy to use and understand, but it is slower than a technique called bit masking. When working in colorMode(RGB, 255), you can achieve the same results as blue() but with greater speed by using a bit mask to remove the other color components. For example, the following two lines of code are equivalent means of getting the blue value of the color value c:

float b1 = blue(c);   // Simpler, but slower to calculate
float b2 = c & 0xFF;  // Very fast to calculate

[2]
Note: Don’t use the bit shift operators as a means of premature optimization in Python. You won’t see a difference in execution speed, but you’ll most definitely make your code less readable.

Hello @paulstgeorge,

To optimize or not to optimize?
This very much depends on what you are doing.

If you are processing pixels in video frames in real-time (quickly enough to keep up with the incoming video) you will certainly want to optimize.

Your [2] states:

You will not see a difference with a single execution of the statements you provided.

You will see a difference if you are running it in a loop and measure the time for multiple executions.

Example:

t0 = millis()
for i in range(0xFFFFFF):
    b1 = blue(i)
t1 = millis()
println(t1 - t0)

t0 = millis()
for j in range(0xFFFFFF):
    b2 = j & 0xFF
t1 = millis()
println(t1 - t0)

Output:
2007
1406
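Outside the Processing editor, the same kind of comparison can be sketched in plain Python with the standard timeit module. This is only an illustration of the measuring technique: blue_wrapped is a hypothetical stand-in for blue() (the same mask behind a function call and a float conversion), not the real Processing function, and the timings will differ from those in Python mode.

```python
import timeit


def blue_wrapped(rgb):
    # Hypothetical stand-in for blue(): same mask, plus call and float overhead.
    return float(rgb & 0xFF)


def loop_wrapped(n):
    # Extract the blue channel through the wrapper function.
    for i in range(n):
        b1 = blue_wrapped(i)


def loop_masked(n):
    # Extract the blue channel with an inline bit mask.
    for i in range(n):
        b2 = i & 0xFF


def compare(n=200000, repeats=3):
    # Take the best of several runs to reduce noise from other processes.
    t_wrapped = min(timeit.repeat(lambda: loop_wrapped(n), number=1, repeat=repeats))
    t_masked = min(timeit.repeat(lambda: loop_masked(n), number=1, repeat=repeats))
    return t_wrapped, t_masked


t_wrapped, t_masked = compare()
print("wrapped: %.4f s, inline mask: %.4f s" % (t_wrapped, t_masked))
```

Taking the minimum over a few repeats is a common way to get a steadier number than a single run.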

:)


That’s great, thank you. Your explanation and demonstration should be in the documentation.

PS I made a third, clumsy comparison because, to use bit masking, the numbers often have to be converted from floats to integers. With those conversions, the bit masking method is slower than b1 = blue(i) [as that simpler method can handle floats].

But your use of a timer enables people to test for themselves and then make an informed choice!

def setup():
    noLoop()

def draw():
    t0 = millis()
    for i in range(10000000):
        j = float(i)
        k = int(j)
        b3 = k & 0xFF
        
    t1 = millis()
    print(t1-t0)

My spontaneous guess was that blue() is just a beginner-friendly wrapper around a bitwise operation. After all, you don’t want to scare off coding newcomers with talk of bit shifting and masking.

I checked in the PApplet.java source to see if my hunch was confirmed. But… this is strange, blue() is set up as follows, not as a bit mask.

public final float blue(int rgb) {
   return g.blue(rgb);
}

In this case, though, the call is forwarded to g.blue(), where g references the current PGraphics renderer. Let’s check out the PGraphics.java source:

public final float blue(int rgb) {
    float c = (rgb) & 0xff;
    if (colorModeDefault) return c;
    return (c / 255.0f) * colorModeZ;
}

And here we have it.
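For readers in Python mode, the logic of that Java method can be paraphrased in plain Python. This is a rough sketch, not Processing's own code: the function name blue_like_pgraphics and its parameters are made up for illustration, with color_mode_default and color_mode_z standing in for the renderer's colorModeDefault and colorModeZ fields.

```python
def blue_like_pgraphics(rgb, color_mode_default=True, color_mode_z=255.0):
    # Mask off everything except the lowest 8 bits (the blue channel).
    c = float(rgb & 0xFF)
    if color_mode_default:
        # Default colorMode(RGB, 255): the masked value is returned as is.
        return c
    # Any other range: rescale from 0..255 to 0..color_mode_z.
    return (c / 255.0) * color_mode_z


print(blue_like_pgraphics(0x123456))           # blue byte is 0x56
print(blue_like_pgraphics(0x0000FF, False, 1.0))
```

The mask is the same in both paths; what blue() adds is the float conversion, the mode check and the call itself.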

Bottom line: wrapping that bitwise operation in one or more function calls adds overhead, so it will clearly take longer than applying the mask directly. Though, as @glv states nicely, whether the time gain is significant depends entirely on what you’re building.

It’s the typical Processing compromise: trading power for accessibility.


Well, duh… so much for not seeing the forest for the trees. I just noticed that this whole topic is about Processing.py. :man_facepalming:

Though I assume that my general sentiment is still applicable in this context.

Absolutely.

And yes, Processing is great for non-programmers (like me)!

In this example I am comparing 3 methods:

def setup():
    size(600, 600)
    background(0)

def draw():
    
    n = 3000000  
    
# Method 1  (custom)  
        
    t0 = millis()
    for i in range(n):
        j = float(i) 
        k = int(j)
        b3 = k & 0xFF   # 1: bit masking after 2 casts (to float and then int)
        
    t1 = millis()
    d1 = t1-t0
    print(1, d1)
    
# Method 2   
    
    t0 = millis()
    for i in range(n):
        b3 = blue(i)     # 2: Using blue()
        
    t1 = millis()
    d2 = t1-t0
    print(2, d2)    
    
# Method 3
       
    t0 = millis()
    for i in range(n):
        b3 = i & 0xFF    # 3: Bit masking
        
    t1 = millis()
    d3 = t1-t0
    print(3, d3)    
    
    print(" ")    

    strokeWeight(3)
    x = 3 * frameCount
    stroke(255, 0, 0)
    point(x, d1)
    stroke(255, 255, 0)
    point(x, d2)
    stroke(0, 255, 0)
    point(x, d3)

Method 3 (green) is the fastest (bit masking directly on an integer):

It is important to do multiple tests to clearly see the differences.

:)


@glv That’s great! An excellent way to demonstrate the different speeds.

It is very clear that method 1 is far slower than either method 2 or method 3. So is this the fuller story? Bit-masking could be faster than blue() when you don’t need to cast the numbers to integers before using?

Hello @paulstgeorge,

Your picture did not show the code for the method used, so I can’t comment on that.

Bit masking is faster than blue(), and this is clearly stated here:

Colors are stored as 32 bit integers.
If you are working in the default mode colorMode(RGB, 255) and you want to optimize code for image editing then stick to using integers, bit manipulation and replace the helper functions for extracting and setting colors.

*** Edited the above for clarity on intended context. ***

You can see what is done under the hood in the source code and can remove the bloat and customize for your project.

Extraction:
red()
green()
blue()

Setting:
color(v1, v2, v3)
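As a rough sketch of what replacing those helpers can look like, here is a plain-Python version that assumes the usual 0xAARRGGBB layout of a Processing color. The names unpack_argb and pack_argb are made up for this example. The conversion to a signed value is there only so the result matches a Java 32-bit int (as Processing reports it), since Python integers are not limited to 32 bits.

```python
def unpack_argb(c):
    # Shift each channel down to the lowest byte, then mask off the rest.
    a = (c >> 24) & 0xFF
    r = (c >> 16) & 0xFF
    g = (c >> 8) & 0xFF
    b = c & 0xFF
    return a, r, g, b


def pack_argb(a, r, g, b):
    # Inputs are assumed to be integers in the range 0..255.
    value = (a << 24) | (r << 16) | (g << 8) | b
    # Assumption: mimic Java's signed 32-bit int, so opaque colors are negative.
    if value >= 0x80000000:
        value -= 0x100000000
    return value


opaque_red = pack_argb(255, 255, 0, 0)
print(opaque_red)
print(unpack_argb(opaque_red))
```

Masking after the shift also works for the negative values, so the same unpack code handles both signed and unsigned representations.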

If you add code to cast or wrap it in a function it will add to the execution time.

The blue() function returns a float for any colorMode and is NOT optimized for colorMode(RGB, 255):
print(type(blue(0x00FF00)))  # Console: <type 'float'>

Color is an int:
print(type(color(255, 0, 0)))  # Console: <type 'int'>

There may be exceptions where the compiler optimizes code, but that is another discussion.

My roots are in the embedded world and bit manipulation is second nature to me.
I find the helper functions in Processing and Arduino abstract away from what is really going on under the hood. They are great for a beginner programmer but can be cumbersome for an experienced programmer.

Reference:
Core Embedded Systems Skill: Bitwise Operation | by Alwin Arrasyid | Medium

A good tutorial that also has bit manipulation with comments on why it is used:

:)


Hi! It’s your code. Sorry, I should have said.

This conversation came out of the discussion: Color to grayscale algorithm

The widely used algorithms mentioned have floats for RGB values. For example,
Luminance = 0.2126 * Red + 0.7152 * Green + 0.0722 * Blue

I’ll read the Medium articles with great interest. Thank you!

The R, G and B values do not need to be floats and can be efficiently extracted from the color integer with bit manipulation.
The luminance will be a float.
You may need to use floating point math in such cases and can cast results as required.
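A minimal plain-Python sketch of that idea, using the luminance weights quoted earlier in the thread: the channels stay integers until the weighted sum, and only the result is a float. The function names luminance and to_gray_level are made up for this example.

```python
def luminance(c):
    # Integer channel extraction with shifts and masks (0xAARRGGBB layout).
    r = (c >> 16) & 0xFF
    g = (c >> 8) & 0xFF
    b = c & 0xFF
    # Floating point only enters here, in the weighted sum.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def to_gray_level(c):
    # Cast the float result back to an integer gray level in 0..255.
    return int(round(luminance(c)))


print(luminance(0xFF00FF00))      # pure green
print(to_gray_level(0xFFFFFFFF))  # white
```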

I edited my previous post for clarity on context.

@solub made some approximations here to optimize performance:

There are also examples of extracting and setting colors without the helper functions:
Color to grayscale algorithm - #5 by solub

:)