Trying to understand the "Substrate" algorithm

Dear all,

I would really appreciate if someone could help me decipher some parts of Jared Tarbell’s “substrate” algorithm.

For those who are not familiar with this famous sketch, here is a recap in Tarbell’s own words:

A single line (known internally as a “crack”) begins drawing itself from some random point in some random direction. The line continues to draw itself until it either (a) hits the edge of the screen or (b) hits another line, at which point it stops and two more lines begin. The one simple rule used in the creation of new lines is that they begin at tangents to existing lines. This process is repeated until there are too many lines to keep track of or the program is stopped.

For clarity I have boiled down the original script to its essence:

int dimx = 250;
int dimy = 250;
int num = 0;
int maxnum = 100;

// grid of cracks
int[] cgrid;
Crack[] cracks;



// MAIN METHODS ---------------------------------------------

void setup() {
  size(250,250,P3D);
//  size(dimx,dimy,P3D);
  background(255);
  
  cgrid = new int[dimx*dimy];
  cracks = new Crack[maxnum];
  
  begin();  
}

void draw() {
  // crack all cracks
  for (int n=0;n<num;n++) {
    cracks[n].move();
  }
}

void mousePressed() {
  begin();
}
    

// METHODS --------------------------------------------------

void makeCrack() {
  if (num<maxnum) {
    // make a new crack instance
    cracks[num] = new Crack();
    num++;
  }
}


void begin() {
  // erase crack grid
  for (int y=0;y<dimy;y++) {
    for (int x=0;x<dimx;x++) {
      cgrid[y*dimx+x] = 10001;
    }
  }
  // make random crack seeds
  for (int k=0;k<16;k++) {
    int i = int(random(dimx*dimy-1));
    cgrid[i]=int(random(360));
  }

  // make just three cracks
  num=0;
  for (int k=0;k<3;k++) {
    makeCrack();
  }
  background(255);
}



// OBJECTS -------------------------------------------------------

class Crack {
  float x, y;
  float t;    // direction of travel in degrees

  
  Crack() {
    // find placement along existing crack
    findStart();

  }
  
  void findStart() {
    // pick random point
    int px=0;
    int py=0;
    
    // shift until crack is found
    boolean found=false;
    int timeout = 0;
    while ((!found) || (timeout++>1000)) {
      px = int(random(dimx));
      py = int(random(dimy));
      if (cgrid[py*dimx+px]<10000) {
        found=true;
      }
    }
    
    if (found) {
      // start crack
      int a = cgrid[py*dimx+px];
      if (random(100)<50) {
        a-=90+int(random(-2,2.1));
      } else {
        a+=90+int(random(-2,2.1));
      }
      startCrack(px,py,a);
    } else {
      //println("timeout: "+timeout);
    }
  }
   
  void startCrack(int X, int Y, int T) {
    x=X;
    y=Y;
    t=T;//%360;
    x+=0.61*cos(t*PI/180);
    y+=0.61*sin(t*PI/180);  
  }
             
  void move() {
    // continue cracking
    x+=0.42*cos(t*PI/180);
    y+=0.42*sin(t*PI/180); 
    
    // bound check
    float z = 0.33;
    int cx = int(x+random(-z,z));  // add fuzz
    int cy = int(y+random(-z,z));
    
    
    // draw black crack
    stroke(0,85);
    point(x+random(-z,z),y+random(-z,z));
    
    
    if ((cx>=0) && (cx<dimx) && (cy>=0) && (cy<dimy)) {
      // safe to check
      if ((cgrid[cy*dimx+cx]>10000) || (abs(cgrid[cy*dimx+cx]-t)<5)) {
        // continue cracking
        cgrid[cy*dimx+cx]=int(t);
      } else if (abs(cgrid[cy*dimx+cx]-t)>2) {
        // crack encountered (not self), stop cracking
        findStart();
        makeCrack();
      }
    } else {
      // out of bounds, stop cracking
      findStart();
      makeCrack();
    }
  }
  
}

What I do not understand is how the line detection mechanism work.

All seem to happen in the move() function within the Crack() class after the first if statement:

if ((cx>=0) && (cx<dimx) && (cy>=0) && (cy<dimy)) {
      // safe to check
      if ((cgrid[cy*dimx+cx]>10000) || (abs(cgrid[cy*dimx+cx]-t)<5)) {
        // continue cracking
        cgrid[cy*dimx+cx]=int(t);
      } else if (abs(cgrid[cy*dimx+cx]-t)>2) {
        // crack encountered (not self), stop cracking
        findStart();
        makeCrack();
      }
    } else {
      // out of bounds, stop cracking
      findStart();
      makeCrack();
    }
  }

What I understand:

  • here cgrid is an array list of size width * height containing integers
  • at the start the begin() function set all these integers to 10001 and then randomly replaces 16 of them by a random angle.
  • cx and cy are the x and y coordinates of a moving point (drawing the crack)
  • cy*dimx+cx is an index (similar to the x + y * width we usually use for pixel indexing)
  • t is the angle previously chosen at random

consequently cgrid[cy*dimx+cx] corresponds to a specific element in the array list. That element should be an integer either equal to 10001 or between 0 and 360 (the angle)

What I do NOT understand:

  • pretty much the whole statement and the different parts it is related to
  • Why 10001 specifically ?
  • If that element in the array list is equal to 10001 (or >10000), why should it be then equal to t (the angle) ?
  • What (abs(cgrid[cy*dimx+cx]-t)<5)) corresponds to ? why <5 ?
  • What abs(cgrid[cy*dimx+cx]-t) corresponds to ? What does “angle in the list minus angle” mean here ?

And the thing is I know these very statements make all the lines stop when hitting an edge or an other line but I just can’t grasp the logic behind it… it drives me crazy.

Hope someone clever than me can help me understand all this.

8 Likes

:star: You get a gold star for the best-asked question I have ever seen.

8 Likes

I would like to give two stars to Thomas @TfGuy44 … for taking time to recognize @solub efforts.


Numbers like 5, 10000 and 10001 are an example of magic numbers. This is why it is better to assign them constants so they can clarify the functionality in the code. Maybe at the end of this exercise you could suggest a name to those values.

There are two key points for this code:

  1. Check this statement

Unfortunately this is not clear as tangents from lines is really not defined in geometry based on my limited experience. Looking at the code during crack creations, angles are changed by +90 or -90 degrees (plus some angle noise):

      if (random(100)<50) {
        a-=90+int(random(-2,2.1));
      } else {
        a+=90+int(random(-2,2.1));
      }

So I believe the line creating happens perpendicular to existing lines. This makes more sense.

  1. The other point to consider is in this line of the code that you presented: cgrid[cy*dimx+cx]=int(t);
    So this is what happens. All the points in the grid are marked with an undefined value of 10001. Initially 3 cracks are created. Also, as you mentioned, 16 random coordinates in the grid are assigned an angle. When the cracks start moving, one point is anexed to the line follwing the same angle (plus/minus some noise). That point is also assinged an angle, which derives from the parent (pre-decesor) point. When a point hits the wall (I think) or a moving point finds a position with an angle, then something happens. The algorithm enters into crack creation mode (a while loop) and randomly searches in the grid for a position that is not marked as undefined, reads the angle at that position and then randomly assigns set a value of + or - 90 (plus some noise). And then the crack starts moving from there.

Notice that the while loop has a mechanism to dictate when to stop drawing cracks. When searching for a new undefined stop, if it cannot find one, then it will not generate the crack. The algorithm is permitted to sample the grid 1000 times before giving up creating a crack. You can imagine that when the grid becomes so pupulated, 1000 times will be enough to stop generating cracks.

Anyways, this is the story of this algorithm. If you want to explore another concept, instead of the algorithm advancing a single point, make the algorithm draw 10 points per step. Or make the algorithm generate cracks only at certain defined “regularly spaced” grid positions. What do you think the outcome would be?

Two more things to talk about:

Why 10001? Well, the algorithm adds and subtrracts 90 during every crack creation, right? A single point in the grid could experience this operation multiple times. Statistically speaking, any non undefined position could vary by 90 “jumps” but because you can have a positive or negative angle jumps, your angle will sort of stay closer to zero. However, if a point is lucky enough and based on the random nature, a point could be incremented by positive 90 multiple times before it is drven back by negative 90s. In other words, an angle stored in a grid is not restricted to values between 0 and 360 but it could have values between MIN_INT and MAX_INT. The likelihood this angle reaches thes boundaries of integer values is low. The changes that an angle reaches a value of 10001 or larger? They are low as well but still probable. In the case an angle reaches or surpases this value, then it is not big deal. The algorithm will consider this point as undefined again and it will participate in a crack creation, if it is randomly selected.

Finally, the other thing we need to talk about is about the value of 5:

However, I believe I already answer this question above.

Kf

6 Likes

I would like to give three stars (and a bouquet of flowers :sunflower::sunflower:) to @kfrajer for taking the time to explain the script in a comprehensive manner.

I’ve tried to implement the core ideas of the algorithm in Python in the shortest and most understandable way possible (at least for me) based both on your clarifications and the original Java code.

Even though I now have a better understanding of what is happening I must admit that some parts are still obscure to me.

  • (in findStart()) I understand the point inherits the angle of that position and add/substract 90 degrees for perpendicularity reasons, but how can that point be starting from the line it was drawing when its coordinates (px and py) are previously selected randomly in the canvas ?

Anyway, I’ll probably find an answer by myself soon or later. Just need more time to fully grasp the logic I guess.

Python version (without stopping mechanism)

num, n_agents = 0, 100

def setup():
    global collection, grid
    size(1200, 900, P2D)
    background(255)
    smooth(8)
    
    grid = [10001 if random(1) > .00003 else int(random(360)) for e in range(width*height)]
    collection = [Agent() for e in range(n_agents)]
    for e in range(3): addAgent()

def draw():
    for a in range(num):
        collection[a].move()
        collection[a].findEdge()
        
def addAgent():
    global num
    if num < n_agents:
        collection[num] = Agent()
        num += 1
    
class Agent(object):
    def __init__(self):
        self.location = PVector(int(random(width)), int(random(height)))
        self.angle = 0.0
        self.findStart()
        
    def findStart(self):
        found  = False
        while not found:
            px, py = int(random(width)), int(random(height))
            if grid[px + py * width] < 10000: found = True
            
        if found:
            a = grid[px + py * width]
            self.angle = a + 90 if random(1) > .5 else a - 90
            self.location = PVector(px, py)
       
    def move(self):
        self.location += PVector(cos(radians(self.angle)), sin(radians(self.angle)))
        point(self.location.x, self.location.y)
        
    def findEdge(self):
        x, y = int(self.location.x), int(self.location.y)
        index = x + y * width

        if x >= 0 and x < width and y >= 0 and y < height:
            
            if grid[index] > 10000 or abs(grid[index]-self.angle) < 10: 
                grid[index] = int(self.angle)  

            else: self.findStart(); addAgent()
            
        else: self.findStart(); addAgent()
2 Likes

Are you referring to the initial 16 points? I think these are in the init function… not sure right now how findStart() is related to it. I will need to check the code again. Related to those 16 points, they are placed randomly only at the beginning. However, notice those points do not start a line. They just happen to be placed in the gird and they are invisible (it is a sub-pixel measurement). They come into play only if a line runs into any of those grid locations. This is unlikely at the beginning (not impossible), and the likelihood for this happening just increases as more lines are drawn in the grid. Notice that at the beginning of the code, only three cracks are generated, which are called right after 16 random points are assigned an angle.

Maybe you are referring to the while loop? A point is drawn in the canvas if it is not undefined. This happens when the value is less than 10001. In other words, it happens when an angle has been assigned to that position. Now, imagine you are the algorithm. So what you do is this: You randomly probe your grid (you can probe up to a max of 1000 times) until you find a coordinate that has some angle define on it. When you find this coordinate, it will belongs to a line (Exception: those 16 random points assigned through the init function call). It is very likely that the points adjacent to this point have the same angle value (not mandatory in case a previous crack was generated there before). Then you take this angle, you introduce a shift of 90 degrees, and there you go, you start a line from there.

Kf

3 Likes

Thank you @kfrajer, I’m slowly getting used to the algorithm. A very clever script in my opinion…


3 Likes

Really interesting post and nice outcomes :+1:

I prefer the second pic though!

1 Like

Very interesting! In the current examples cracks are implemented as animated brush agents, and they leave pixels as they move. This fits with the textured / material aesthetic of the output, but I can imagine a different approach.

If I was going to write a response inspired by this code, I think I would get rid of the grid-based algorithm and the pixel-checking entirely, and instead separate the act of growing the geometric data structure from the act of rendering it, so that they could be done separately.

data grower:

  1. define an array of lines defined by two PVectors, an origin and a vector, along with timing information (a begin frame and an end frame)
  2. all these things are now geometric operations:
    • grow a line – change the magnitude of the second vector.
    • create a new line – take a known line and define a mid point on it as the new origin, then make the second vector perpendicular (or whatever).
    • stop growing – use line-line collision detection.
  3. random processes now output a given line structure – that is, a set of lines that appear, grow, and stop over time – as data

renderer:

Plays back a list of lines. Any specific growth pattern can be animated and reanimated at different replay speeds, or with different color / stroke / fill / et cetera. Essentially, the crawlers job is now to follow the line paths based on the timing data, and stop when the lines end.

4 Likes

Hi Jeremy (glad to see you back),

Your suggestion sounds interesting. Could you please elaborate on the last point of your second step (stop growing line) ? What that line-line collision detection would consist of ?

I need to find some time to implement your algorithm.

1 Like

Hi @solub.

Here is an example of line-line collision detection – it returns true-false, but you can also return a PVector with the intersection point, which is very useful for truncating a growing line at exactly the right place after it crosses.

/**
 * Line-Line collision detection
 * http://www.jeffreythompson.org/collision-detection/line-line.php
 */
boolean lineLine(float x1, float y1, float x2, float y2, float x3, float y3, float x4, float y4) {
  // calculate the distance to intersection point
  float uA = ((x4-x3)*(y1-y3) - (y4-y3)*(x1-x3)) / ((y4-y3)*(x2-x1) - (x4-x3)*(y2-y1));
  float uB = ((x2-x1)*(y1-y3) - (y2-y1)*(x1-x3)) / ((y4-y3)*(x2-x1) - (x4-x3)*(y2-y1));
  // if uA and uB are between 0-1, lines are colliding
  if (uA >= 0 && uA <= 1 && uB >= 0 && uB <= 1) {
    // optionally, draw a circle where the lines meet
    float intersectionX = x1 + (uA * (x2-x1));
    float intersectionY = y1 + (uA * (y2-y1));
    fill(255, 0, 0);
    noStroke();
    ellipse(intersectionX, intersectionY, 20, 20);
    return true;
  }
  return false;
}
4 Likes

Well, I went ahead and tried implementing a vector-based line grower in Processing(Java) along the lines I outlined. I’m calling the approach “Stratigraph” – as in stratigraphy, i.e. an abstract line graph animator inspired by substrate.

A few subtleties around the use of line-line collision for cropping came up that I really wasn’t expecting:

  1. if a branch is defined as an intersection with a base line, then auto-cropping will crop anywhere that you branch, because they intersect!
  2. similarly, if you crop at an intersection, then any given intersection will be cropped twice in two different passes (both lines will be cropped, not just the first line, leaving an L).
  3. … a simple solution to both problems is to float new branches slightly away from their base, and crop slightly back from a crossing, leaving a simple field of line segments that aren’t quite touching. Now there are always 0 crossings before and after each grow operation – on any given line grow, you can then check only that line for crossings, and if you find one or more, crop only that line each time. Now there are 0 crossings again.

Another interesting issue is bounds:

  1. Once bounds checking occurs through growing, boundaries are really just lines that nothing grows from – there doesn’t need to be a special case for the picture frame.
  2. One approach is to have lines “die” when they stop growing, and become bounds. The simplest bounded stratigraph then is seeded with four dead lines around it – nothing can grow out of these, but lines die when they grow into them.
  3. Another approach is to have multiple line sets, some of which have grow rules, some of which don’t, but all of which are subject to collision detection.

This last approach makes it really easy to combine layouts and seed lines for stained glass effects or fill pouring through complex shapes.

I ended up with a nice faster-than-realtime data renderer with a realtime mode, a looping playback mode, and a scrubbing mode so you can browse back and forth through long render timelines with the mouse.

Running the grow in realtime animation mode (one line grow operation per frame) can take several minutes for a dense sketch, but pre-rendering in non-realtime takes a few seconds, and then you can scrub through it. Because any given frame is interpolated on-the-fly and can be re-rendered from a random seed, experimenting with stroke / color palete, etc. it much easier.

It isn’t as polished as @solub’s work, and it needs data import / export and in general to be cleaned up before sharing – and then it still needs to be translated into python. But I am pretty happy with it as a draft. May stick it on github later.

8 Likes

This is beautiful Jeremy !

@jeremydouglass

Hello,
we are a student group from Germany, Stuttgart and we are working on a processing project. We were working with substrate and were looking for different alternatives.

https://global.discourse-cdn.com/flex036/uploads/processingfoundation1/original/2X/4/4c5ecb5f7d141b001044a9689a98efc52e6377e0.jpeg

We really liked your example above. Would it be possible for sharing the code that you used above for generating the graphic inside the frame?

Thank you from Germany

2 Likes

Hi @Vincent – thanks for contacting me about your project. It sounds very interesting.

My code is based on @solub’s code, which is in turn based on Jared Tarbell’s project. I am happy to share it with the request that you credit these people in your project along with a link to this thread.

If that credit is not acceptable for your assignment then I’d suggest that you reverse-engineer what was done from descriptions here, and I would be happy to answer questions.

Also – are you working in Python mode or the default (Java) mode? My solution was in Processing’s Java dialect.

5 Likes

Please don’t credit me, I did nothing but to shorten Tarbell’s code and to port it to Python mode. In contrast, Jeremy came up with a real alternative (different algorithm with a few enhancements) and in that respect his work would truly deserve to be credited.

2 Likes

@jeremydouglass

Thank you for your response. We are working in Java.

We would give credit to the people who worked on it and to this thread :slight_smile:

Hello, where can you share us the code ?