# "Wave Collapse Function" algorithm in Processing

Dear all,

I would like to share with you a customized redesign of the Wave Function Collapse algorithm that works in Processing. My wish is that someone, someday, can port this sketch from Python to Java and make this algorithm available to the whole Processing community (more on that below).

What is the Wave Function Collapse algorithm?

It is an algorithm written in 2016 by Maxim Gumin that can generate procedural patterns from a sample image or from a collection of tiles. You can see it in action here (2D “overlapping model”) and here (3D “tiled model”).

If you’re a game programmer / designer, you’ve probably already heard of it, as it has been used to generate terrain / buildings / cities procedurally in a couple of recent games (see the Bad North example).

Goal of this implementation:

To boil down the algorithm to its essence and avoid the redundancies and clumsiness of the original C# script (surprisingly long and difficult to read). This is an attempt to make a shorter, clearer and Processing-compatible version of this algorithm that can be understood and used by as many people as possible.

Characteristics of this implementation:

This is a rewriting of the 2D overlapping model in Processing Python mode. Unlike the original version, this implementation:

• is not object oriented, making it easier to understand / closer to pseudo-code
• uses 1D arrays instead of 2D arrays
• uses array slicing for matrix manipulation
• populates the Wave with Python sets of indices whose size decreases as cells are “collapsed” (replacing large fixed-size lists of booleans)
• stores data in dictionaries whose keys are progressively deleted when possible
• stores data in tuples instead of lists
• skips the entropy computations (the first entropy calculation is unnecessary since all patterns are equiprobable at start)
• displays each cell once (avoiding storing cells in an array and redrawing them at each frame)
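As a minimal illustration of the set-based Wave described above (plain Python, with made-up dimensions and indices chosen only for the example):

```python
# Minimal sketch of the Wave as a dict of shrinking sets (illustrative values)
npat = 5                                       # number of unique patterns
w, h = 3, 2                                    # output dimensions
W = {i: set(range(npat)) for i in range(w*h)}  # every pattern allowed in every cell
W[0] &= {1, 3}                                 # propagation removes candidates from a cell
assert W[0] == {1, 3}                          # cell 0 now holds fewer candidate patterns
assert len(W[1]) == npat                       # untouched cells still allow all patterns
```

A cell is "collapsed" once its set is reduced to a single index, which is why no fixed-size boolean lists are needed.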

For all these reasons, and also because it is Python (one-liner list comprehensions, minimalist syntax, …), this implementation is far shorter than the original script (and its various ports): about 70 lines of code against nearly 800 for the latter.
It is also faster in theory, but for obvious reasons (Jython under the hood) the following version won’t run as fast.

It should also be pointed out that this implementation stores all the rotated and reflected matrices of each pattern and, like most versions, doesn’t use a backtracking system.

Full script

You’ll need to download a tiny bitmap picture (usually 8x8 or 16x16 pixels) beforehand to load it as an input. You can find a whole bunch on this dedicated GitHub page.

from collections import Counter
from itertools import chain
from random import choice, sample

w, h, s = 96, 50, 9
N = 3

def setup():
    size(w*s, h*s, P2D)
    background('#FFFFFF')
    frameRate(1000)
    noStroke()

    global W, A, H, directions, patterns, freqs

    img = loadImage('Flowers.png')
    iw, ih = img.width, img.height
    kernel = tuple(tuple(i + n*iw for i in xrange(N)) for n in xrange(N))
    directions = ((-1, 0), (1, 0), (0, -1), (0, 1))
    all = []

    for y in xrange(ih):
        for x in xrange(iw):
            cmat = tuple(tuple(img.pixels[((x+n)%iw)+(((a[0]+iw*y)/iw)%ih)*iw] for n in a) for a in kernel)
            for r in xrange(4):
                cmat = zip(*cmat[::-1])
                all.append(cmat)
                all.append(cmat[::-1])
                all.append([a[::-1] for a in cmat])

    all = [tuple(chain.from_iterable(p)) for p in all]
    c = Counter(all)
    freqs = c.values()
    patterns = c.keys()
    npat = len(freqs)

    W = dict(enumerate(tuple(set(range(npat)) for i in xrange(w*h))))
    A = dict(enumerate(tuple(set() for dir in xrange(len(directions))) for i in xrange(npat)))
    H = dict(enumerate(sample(tuple(npat if i > 0 else npat-1 for i in xrange(w*h)), w*h)))

    for i1 in xrange(npat):
        for i2 in xrange(npat):
            if [n for i, n in enumerate(patterns[i1]) if i%N!=(N-1)] == [n for i, n in enumerate(patterns[i2]) if i%N!=0]:
                A[i1][0].add(i2)
                A[i2][1].add(i1)
            if patterns[i1][:(N*N)-N] == patterns[i2][N:]:
                A[i1][2].add(i2)
                A[i2][3].add(i1)

def draw():
    global H, W

    if not H:
        print 'finished'
        noLoop()
        return

    emin = min(H, key = H.get)
    id = choice([idP for idP in W[emin] for i in xrange(freqs[idP])])
    W[emin] = {id}
    del H[emin]

    stack = {emin}
    while stack:
        idC = stack.pop()
        for dir, t in enumerate(directions):
            x = (idC%w + t[0])%w
            y = (idC/w + t[1])%h
            idN = x + y * w
            if idN in H:
                possible = {n for idP in W[idC] for n in A[idP][dir]}
                if not W[idN].issubset(possible):
                    intersection = possible & W[idN]

                    if not intersection:
                        noLoop()
                        return

                    W[idN] = intersection
                    H[idN] = len(W[idN]) - random(.1)
                    stack.add(idN)

    fill(patterns[id][0])
    rect((emin%w) * s, (emin/w) * s, s, s)

Fully annotated version
from collections import Counter
from itertools import chain
from random import choice, sample

w, h = 96, 50  # dimensions of output (array of w*h cells)
f = 9  # size factor
N = 3  # dimensions of a pattern (NxN matrix)

def setup():
    size(w*f, h*f, P2D)
    background('#FFFFFF')
    frameRate(1000)
    noStroke()

    global W, A, H, directions, patterns, freqs, xs, ys

    img = loadImage('Flowers.png')  # path to the input image
    iw, ih = img.width, img.height  # dimensions of input image
    xs, ys = width//w, height//h  # dimensions of cells (rect) in output
    kernel = tuple(tuple(i + n*iw for i in xrange(N)) for n in xrange(N))  # NxN matrix used to read every pattern contained in the input image
    directions = ((-1, 0), (1, 0), (0, -1), (0, 1))  # (x, y) tuples to access the 4 neighboring cells of a specific cell
    all = []  # array list to store all the patterns found in input

    #### Stores the different patterns found in input

    for y in xrange(ih):
        for x in xrange(iw):

            ''' The one-liner below (cmat) creates a NxN matrix with (x, y) being its top left corner.
                This matrix will wrap around the edges of the input image.
                The whole snippet reads every NxN part of the input image and stores the associated colors.
                Each NxN part is called a 'pattern' (of colors). Each pattern can be rotated or flipped (not mandatory). '''

            cmat = tuple(tuple(img.pixels[((x+n)%iw)+(((a[0]+iw*y)/iw)%ih)*iw] for n in a) for a in kernel)

            # Storing rotated patterns (90°, 180°, 270°, 360°)
            for r in xrange(4):
                cmat = zip(*cmat[::-1])  # +90° rotation
                all.append(cmat)
                all.append(cmat[::-1])  # vertical flip
                all.append([a[::-1] for a in cmat])  # horizontal flip

    #### Flatten pattern matrices + count occurrences

    ''' Once every pattern has been stored,
        - we flatten them (convert to 1D) for convenience
        - count the number of occurrences of each one (one pattern can be found multiple times in input)
        - select and store unique patterns only '''

    all = [tuple(chain.from_iterable(p)) for p in all]  # flattening arrays
    c = Counter(all)  # Python counter
    freqs = c.values()  # number of occurrences of each unique pattern
    patterns = c.keys()  # list of unique patterns
    npat = len(freqs)  # number of unique patterns

    #### Initializes the 'wave' (W), entropy (H) and adjacency (A) data structures

    ''' Array W (the Wave) keeps track of all the available patterns, for each cell.
        At start, all patterns are valid anywhere in the Wave so each subarray
        is a set of the indices of all the patterns '''

    W = dict(enumerate(tuple(set(range(npat)) for i in xrange(w*h))))

    ''' Array H should normally be populated with entropy values.
        Entropy is just a fancy way to represent the number of patterns
        still available in a cell. We can skip this computation and populate
        the array with the number of available patterns instead.

        At start all patterns are valid anywhere in the Wave, so all cells
        share the same value (npat). We must however pick one cell at random and
        assign a lower value to it. Why? Because the algorithm in draw() needs
        to find a cell with the minimum non-zero entropy value. '''

    H = dict(enumerate(sample(tuple(npat if i > 0 else npat-1 for i in xrange(w*h)), w*h)))

    ''' Array A (for Adjacencies) is an index data structure that describes the ways
        that the patterns can be placed near one another. More explanations below '''

    A = dict(enumerate(tuple(set() for dir in xrange(len(directions))) for i in xrange(npat)))

    #### Computation of pattern compatibilities (check if some patterns are adjacent; if so -> store them based on their location)

    ''' EXAMPLE:
        If pattern index 42 can be placed to the right of pattern index 120,
        we will store this adjacency rule as follows: A[120][1].add(42)

        Here '1' stands for 'right' or 'East'/'E'

        0 = left or West/W
        1 = right or East/E
        2 = up or North/N
        3 = down or South/S '''

    # Comparing patterns to each other
    for i1 in xrange(npat):
        for i2 in xrange(npat):

            ''' (in case when N = 3) If the first two columns of pattern 1 == the last two columns of pattern 2
                --> pattern 2 can be placed to the left (0) of pattern 1 '''

            if [n for i, n in enumerate(patterns[i1]) if i%N!=(N-1)] == [n for i, n in enumerate(patterns[i2]) if i%N!=0]:
                A[i1][0].add(i2)
                A[i2][1].add(i1)

            ''' (in case when N = 3) If the first two rows of pattern 1 == the last two rows of pattern 2
                --> pattern 2 can be placed on top (2) of pattern 1 '''

            if patterns[i1][:(N*N)-N] == patterns[i2][N:]:
                A[i1][2].add(i2)
                A[i2][3].add(i1)

def draw():
    global H, W

    # Simple stopping mechanism

    ''' If the dict (or arraylist) of entropies is empty -> stop iterating.
        We'll see later that each time a cell is collapsed, its corresponding key
        in H is deleted '''

    if not H:
        print 'finished'
        noLoop()
        return

    #### OBSERVATION

    ''' Find the cell with minimum non-zero entropy (not collapsed yet). '''

    emin = min(H, key = H.get)

    #### COLLAPSE

    ''' Among the patterns available in the selected cell (the one with min entropy),
        select one pattern randomly, weighted by the frequency with which that pattern
        appears in the input image. '''

    id = choice([idP for idP in W[emin] for i in xrange(freqs[idP])])  # index of selected pattern

    ''' The Wave's subarray corresponding to the cell with min entropy should now only contain
        the id of the selected pattern '''

    W[emin] = {id}

    ''' Its key can be deleted from the dict of entropies '''

    del H[emin]

    #### PROPAGATION

    ''' Once a cell is collapsed, its index is put in a stack.
        That stack is meant to temporarily store the indices of neighboring cells '''

    stack = {emin}

    ''' The propagation will last as long as that stack is filled with indices '''

    while stack:

        ''' First thing we do is pop() the last index contained in the stack (the only one for now)
            and get the indices of its 4 neighboring cells (E, W, N, S).
            We have to keep them within bounds and make sure they wrap around. '''

        idC = stack.pop()  # index of current cell
        for dir, t in enumerate(directions):
            x = (idC%w + t[0])%w
            y = (idC/w + t[1])%h
            idN = x + y * w  # index of neighboring cell

            ''' We make sure the neighboring cell is not collapsed yet
                (we don't want to update a cell that has only 1 pattern available) '''

            if idN in H:

                ''' Then we check all the patterns that COULD be placed at that location.
                    EX: if the neighboring cell is on the left of the current cell (east side),
                    we look at all the patterns that can be placed on the left of each pattern
                    contained in the current cell. '''

                possible = {n for idP in W[idC] for n in A[idP][dir]}

                ''' We also look at the patterns that ARE available in the neighboring cell '''

                available = W[idN]

                ''' Now we make sure that the neighboring cell really needs to be updated.
                    If all its available patterns are already in the list of all the possible patterns:
                    -> there's no need to update it (the algorithm skips this neighbor and goes on to the next) '''

                if not available.issubset(possible):

                    ''' If it is not a subset of the possible list:
                        -> we look at the intersection of the two sets (all the patterns that can be placed
                        at that location and that, "luckily", are available at that same location) '''

                    intersection = possible & available

                    ''' If they don't intersect (patterns that could have been placed there but are not available)
                        it means we ran into a "contradiction". We have to stop the whole WFC algorithm. '''

                    if not intersection:
                        noLoop()
                        return

                    ''' If, on the contrary, they do intersect -> we update the neighboring cell with that refined
                        set of pattern indices '''

                    W[idN] = intersection

                    ''' Because that neighboring cell has been updated, its number of valid patterns has decreased
                        and its entropy must be updated accordingly.
                        Note that we subtract a small random value to mix things up: sometimes cells will end up
                        with the same minimum entropy value and this prevents always selecting the first of them.
                        It's a cosmetic trick to break the monotony of the animation. '''

                    H[idN] = len(W[idN]) - random(.1)

                    ''' Finally, and most importantly, we add the index of that neighboring cell to the stack
                        so it becomes the next current cell in turn (the one whose neighbors will be updated
                        during the next while loop) '''

                    stack.add(idN)

    #### RENDERING

    ''' The collapsed cell will always be filled with the first color (top left corner) of the
        selected pattern '''

    fill(patterns[id][0])
    rect((emin%w) * xs, (emin/w) * ys, xs, ys)


The need for a Processing Java version

The reasons are threefold:

A lot of Processing users like to create games. As mentioned above, the WFC algorithm can be very useful for game design: generating terrains, buildings or even entire cities, be it in 2D or 3D. This could be a welcome contribution for any designer who wants to speed up and expand their creative process.

Long-awaited Coding Challenge. This algorithm has been submitted a couple of times on the Coding Train’s GitHub (and mentioned multiple times on the YouTube channel) as an idea for a future Coding Challenge. It seems Daniel Shiffman really liked the suggestion but never had the chance to tackle it. Providing a clear and simple example sketch in Java could be a great opportunity to revive the submission and maybe finally see the algorithm explained to a larger audience.

The generative nature of the algorithm is closely related to the philosophy of Processing.
In my opinion, Processing is mostly about generating things: design, art, drawings, colors, shapes… It is a tool to explore the vast field of possibilities that a simple idea can lead to.
So far, the WFC algorithm has primarily been used in the field of game design, but I cannot help but think there’s so much more to do with it. Some have started generating music with it, others poetry or even wine descriptions… and I believe Processing is the perfect environment to explore these possibilities even further. It is in its DNA.

Original Algorithm

The following explanations are here to help in understanding the original algorithm, as it was designed by Maxim Gumin. This is not an exact representation of the Python implementation above, but it can be a good introductory step to the code.

1/ Read the input bitmap, store every NxN pattern and count their occurrences. (Optional: augment pattern data with rotations and reflections.)

For example, when N = 3:

2/ Precompute and store every possible adjacency relation between patterns. In the example below, patterns 207, 242, 182 and 125 can overlap the right side of pattern 246.

3/ Create an array with the dimensions of the output (called W, for wave). Each element of this array is an array holding the state (True or False) of each pattern.

For example, let’s say we count 326 unique patterns in input and we want our output to be of dimensions 20 by 20 (400 cells). Then the “Wave” array will contain 400 (20x20) arrays, each of them containing 326 boolean values.

At start, all booleans are set to True because every pattern is allowed at any position of the Wave.

W = [[True for pattern in xrange(len(patterns))] for cell in xrange(20*20)]


4/ Create another array with the dimensions of the output (called H). Each element of this array is a float holding the “entropy” value of its corresponding cell in the output.

Entropy here refers to Shannon entropy and is computed from the number of valid patterns at a specific location in the Wave. The more valid patterns a cell has (set to True in the Wave), the higher its entropy.

For example, to compute the entropy of cell 22 we look at its corresponding index in the Wave (W[22]) and count the number of booleans set to True. With that count we can compute the entropy with the Shannon formula. The result of this calculation is then stored in H at the same index, H[22].

At start, all cells have the same entropy value (the same float at every position in H) since all patterns are set to True for each cell.

H = [entropyValue for cell in xrange(20*20)]
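For reference, the Shannon formula can be sketched in a few lines of plain Python. The function name and arguments here are illustrative (they are not part of the sketches above); the weights come from the pattern occurrence counts of step 1:

```python
from math import log

def shannon_entropy(valid, freqs):
    # valid: indices of the patterns still set to True in a cell
    # freqs: occurrence count of each pattern in the input (from step 1)
    total = float(sum(freqs[i] for i in valid))
    return -sum(freqs[i]/total * log(freqs[i]/total) for i in valid)

# A cell with more valid patterns has a higher entropy:
four = shannon_entropy([0, 1, 2, 3], [1, 1, 1, 1])
two = shannon_entropy([0, 1], [1, 1])
assert four > two
```

This also shows why the implementation above can substitute the raw pattern count for the entropy: both shrink as candidates are eliminated.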


These 4 steps are introductory; they are necessary to initialize the algorithm. Now starts the core of the algorithm:

5/ Observation:

Find the index of the cell with the minimum non-zero entropy. (Note that at the very first iteration all entropies are equal, so we need to pick a cell index randomly.)

Then, look at the still-valid patterns at the corresponding index in the Wave and select one of them randomly, weighted by the frequency with which that pattern appears in the input image (weighted choice).

For example, if the lowest value in H is at index 22 (H[22]), we look at all the patterns set to True at W[22] and pick one randomly based on the number of times it appears in the input. (Remember that at step 1 we counted the number of occurrences of each pattern.) This ensures that patterns appear with a similar distribution in the output as they are found in the input.
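One simple way to do this weighted pick, and the trick the Python sketches above use with `choice`, is to repeat each pattern index as many times as its frequency. The values below are illustrative:

```python
from random import choice

freqs = [3, 1]   # pattern 0 appears 3 times in the input, pattern 1 once
valid = [0, 1]   # patterns still set to True in the chosen cell

# Repeating each index 'frequency' times turns choice() into a weighted pick:
bag = [idx for idx in valid for _ in range(freqs[idx])]
assert bag == [0, 0, 0, 1]   # pattern 0 is three times more likely to be drawn
assert choice(bag) in valid
```

Building the bag is wasteful for very large frequency counts, but it keeps the sketch to a single line.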

6/ Collapse:

We now assign the index of the selected pattern to the cell with the minimum entropy, meaning that every pattern at the corresponding location in the Wave is set to False except for the one that has been chosen.

For example, if pattern 246 in W[22] was set to True and has been selected, then all other patterns are set to False. Cell 22 is assigned pattern 246. In the output, cell 22 will be filled with the first color (top left corner) of pattern 246 (blue in this example).

7/ Propagation:

Because of adjacency constraints, that pattern selection has consequences for the neighboring cells in the Wave. The arrays of booleans corresponding to the cells to the left and right of, above and below the recently collapsed cell need to be updated accordingly.

For example, if cell 22 has been collapsed and assigned pattern 246, then W[21] (left), W[23] (right), W[2] (up) and W[42] (down) have to be modified so that they keep set to True only the patterns that are adjacent to pattern 246.

For example, looking back at the picture of step 2, we can see that only patterns 207, 242, 182 and 125 can be placed to the right of pattern 246. That means W[23] (right of cell 22) needs to keep patterns 207, 242, 182 and 125 set to True and set all other patterns in the array to False. If none of these patterns is still valid (already set to False because of a previous constraint), then the algorithm faces a contradiction and must be stopped.
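Using the pattern numbers from this example, the update of W[23] boils down to a set intersection. This is a plain-Python sketch; the `available` set (including the invented pattern 300) is made up for illustration:

```python
possible = {207, 242, 182, 125}   # patterns allowed to the right of pattern 246
available = {242, 125, 300}       # patterns still True in W[23] (300 is invented)
intersection = possible & available
assert intersection == {242, 125} # W[23] keeps only these two patterns
# An empty intersection would signal a contradiction and stop the algorithm:
assert possible & {999} == set()
```

This is exactly why the Python implementation above stores each cell as a set rather than a boolean list.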

This process doesn’t stop at the 4 direct neighbors of the collapsed cell: it has to be extended recursively to the neighbors of the neighbors, and so on, until all constraints are propagated.

8/ Updating entropies

Because a cell has been collapsed (one pattern selected, set to True) and all its surrounding cells have been updated accordingly (setting non-adjacent patterns to False), the entropies of these cells have changed and need to be computed again. (Remember that the entropy of a cell is correlated to the number of valid patterns it holds in the Wave.)

In the example, the entropy of cell 22 is now 0 (H[22] = 0, because only pattern 246 is set to True at W[22]) and the entropies of its neighboring cells have decreased (patterns that were not adjacent to pattern 246 have been set to False).

By now the algorithm has reached the end of the first iteration and will loop over steps 5 (find the cell with minimum non-zero entropy) to 8 (update entropies) until all cells are collapsed.

For further reading: this detailed article by Stephen Sherratt, this explanatory paper by Karth & Smith, and this comprehensive blog post by Trasevol_Dog.


Please find below an implementation of the “tiled model”.

The difference is that, unlike the “overlapping model” above, adjacency constraints are not inferred from a source image. As a result, constraints and frequencies (if needed) have to be entered manually (hard-coded).

A couple of improvements

Writing out all the adjacencies of each tile can be cumbersome (especially with large tilesets), so I decided to add a connector-based system (inspired by this article by Marian42) to facilitate the task. Instead of storing all the possible adjacent tiles in each direction for each tile, it is now only required to associate each face of a tile with its corresponding symmetry index. All faces sharing the same symmetry index are considered adjacent.

Also, because most tilesets don’t come with the rotated versions of each tile, I added a mechanism that rotates a tile and stores the transformed output based on its file name.

Here, for example, tile “a” will be rotated 3 times by 90 degrees and each transformation will be added to the tiles array; “b” is not taken into account by the rotation mechanism and “c” will be rotated only once. As a result, the final array will contain 7 tiles.

The mechanism also automatically creates and rotates the connectors (arrays of symmetry indices) of each new rotated tile.
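Rotating a connector can be sketched with `collections.deque`, which is the same `rotate(1)` call the full script below uses. The tile and its face order are hypothetical here (one symmetry index per face, listed clockwise):

```python
from collections import deque

connector = (0, 1, 1, 0)   # one symmetry index per face of a hypothetical tile

# A 90° rotation of the tile shifts every face index one slot forward:
c = deque(connector)
c.rotate(1)
assert tuple(c) == (0, 0, 1, 1)
```

Because rotation is just a cyclic shift of the face indices, each rotated tile gets a valid connector for free.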

Limits

In some cases you may want (for cosmetic reasons) to prevent a specific tile from being placed near another tile even though the adjacency rules allow that placement. So far I’m writing these avoidance rules (as I call them) by hand, but it would be really convenient (provided it is possible) to automate that part as much as possible.

Full script

from random import choice, sample
from collections import deque
from java.io import File

w, h, s = 80, 45, 10
directions = ((-1, 0), (1, 0), (0, -1), (0, 1))

def setup():
    size(w*s, h*s, P2D)
    background('#FFFFFF')
    frameRate(1000)

    global W, A, H, tiles

    path = File("\Knots").listFiles()  # path to the folder of tiles
    connectors = [(0, 1, 1, 0), (0, 0, 0, 0), (1, 0, 1, 0)]

    tiles = []
    count = 0

    # Iterating through a folder of unique tiles
    for i, png in enumerate(path):

        # appending tile to the 'tiles' array list
        tiles.append(loadImage(png.getPath()))

        # if rotation information is included in the tile name --> allow rotation
        name = png.getName().split('.')[0]
        if '-' in name:

            # rotate by 90° the number of times specified in the tile name
            for rot in xrange(int(name[-1])):
                cp = tiles[-1]
                pg = createGraphics(cp.width, cp.height)
                pg.beginDraw()
                pg.pushMatrix()
                pg.translate(cp.width, 0)
                pg.rotate(HALF_PI)
                pg.image(cp, 0, 0)
                pg.popMatrix()
                pg.endDraw()

                # add rotated tile to the 'tiles' array list
                tiles.append(pg)

                # create a corresponding "connector" array by rotating the indices of the previous tile
                c = deque(connectors[i + count])
                c.rotate(1)
                count += 1
                connectors.insert(i + count, tuple(c))

    ntiles = len(tiles)

    W = dict(enumerate(tuple(set(range(ntiles)) for i in xrange(w*h))))
    H = dict(enumerate(sample(tuple(ntiles if i > 0 else ntiles-1 for i in xrange(w*h)), w*h)))
    A = dict(enumerate([[set() for dir in xrange(4)] for i in xrange(ntiles)]))

    for i1 in xrange(ntiles):
        for i2 in xrange(ntiles):
            # top face of tile 1 matches bottom face of tile 2 --> tile 2 fits above (2) tile 1
            if connectors[i1][0] == connectors[i2][2]:
                A[i1][2].add(i2)
                A[i2][3].add(i1)

            # right face of tile 1 matches left face of tile 2 --> tile 2 fits to the right (1) of tile 1
            if connectors[i1][1] == connectors[i2][3]:
                A[i1][1].add(i2)
                A[i2][0].add(i1)

def draw():
    global H, W

    if not H:
        print 'finished'
        noLoop()
        return

    emin = min(H, key = H.get)
    id = choice(list(W[emin]))
    W[emin] = {id}
    del H[emin]

    stack = {emin}
    while stack:
        idC = stack.pop()
        for dir, t in enumerate(directions):
            x = (idC%w + t[0])%w
            y = (idC/w + t[1])%h
            idN = x + y * w
            if idN in H:
                possible = {n for idP in W[idC] for n in A[idP][dir]}
                if not W[idN].issubset(possible):
                    intersection = possible & W[idN]

                    if not intersection:
                        noLoop()
                        break

                    W[idN] = intersection
                    H[idN] = len(W[idN]) - random(.1)
                    stack.add(idN)

    image(tiles[id], (emin%w) * s, (emin/w) * s, s, s)


Great approach, good explanation, interesting topic. One of the best posts I’ve ever read here. I can’t believe no one has answered yet. It’s a perfect match for Processing. If you understand the algorithm, it should be easy for you to translate it into Java code, even if you don’t know the language yet. It derives from the C syntax family, which you can hardly avoid when dealing with programming in any way.


Well, somebody could translate it, but some of it is just gibberish…

@needfulthing: Thanks for your feedback and the kind words!

If I had more time I could possibly, and laboriously, come up with a Java version, but it would certainly be clumsy and inefficient. I would rather see someone with a good understanding of Java data types implement it in the best possible way.

An example: I chose Python sets because they are efficient and easy to manipulate. I could simply look for the Java equivalent, but:

• Would they be as efficient? (compared to other Java data types)
• Would they be as easy to manipulate? (it seems they’re not)
• Is there a better alternative?
• Could I compute intersections with them?

Only someone with solid experience of the Java language could answer these questions.

@Chrisir: Thank you for your interest!

Have you looked at the annotations I provided in the second script (just above the first gif)?

''' Then we check all the patterns that COULD be placed at that location.
EX: if the neighboring cell is on the left of the current cell (east side),
we look at all the patterns that can be placed on the left of each pattern
contained in the current cell. '''

possible = {n for idP in W[idC] for n in A[idP][dir]}


Breaking that line down:

• brackets {} indicate we’re dealing with a Python set. I chose this data type not only because it is efficient, but also because we later need to compute the intersection of 2 sets (intersection works on sets, not on lists)
• idC is the index of the Current cell (the one whose neighbors are updated)
• idP is the index of a Pattern available in that cell
• n is the index of a pattern that can be adjacent to pattern idP in direction dir

Basically, we look at all the indices of the patterns available in the current cell; then, for each of them, we look at all the indices of the patterns that can be placed next to it in a specific direction.
Without the one-liner, it would look something like this:

possible = set()
for availablePattern in currentCell:          # currentCell stands for W[idC]
    for adjacentPattern in A[availablePattern][dir]:
        possible.add(adjacentPattern)


Hi solub – this is a wonderful contribution, and I’m a huge fan of this work – and your effort in implementing it and documenting it so clearly. Fantastic.

A quick question – is the entropy array named “H” by mathematical convention?

I’m tempted to post a slight revision of your annotated code, and give all the variables semantic names – e.g. W --> wave, A --> adjacencies, H --> entropy, etc. – to make the code more readable for students and potential adaptors. The extreme concision is very elegant, but it might be a barrier to comprehension for coders without a background in this kind of math.


Given that these sketches come with some image assets, it’d be convenient to host them (including their root project folder) on GitHub.


Unfortunately that annotated code won’t run in the PDE due to the comment-indent style – it causes parse errors.

I’ve cleaned up solub’s code comments and then renamed most of the variables to make them semantic. So for example:

possible = {n for idP in W[idC] for n in A[idP][dir]}


is now written:

possible = {n for pattern_idx in wave[cell_idx] for n in adjacencies[pattern_idx][dir]}


…which is still nested loops written as a compact generator expression, but it is a bit clearer what is happening.

If someone is interested in doing a Java port, this would be one place to start. Code below:

from collections import Counter
from itertools import chain
from random import choice, sample

out_w, out_h = 96, 50  # dimensions of output (array of wxh cells)
f = 9 # size factor
N = 3 # dimensions of a pattern (NxN matrix)

def setup():
size(out_w*f, out_h*f, P2D)
background('#FFFFFF')
frameRate(1000)
noStroke()

global wave, adjacencies, entropy, directions, patterns, freqs, cell_w, cell_h

img = loadImage('Flowers.png') # path to the input image
img_w, img_h = img.width, img.height # dimensions of input image
cell_w, cell_h = width//out_w, height//out_h # dimensions of cells (rect) in output
kernel = tuple(tuple(i + n*img_w for i in xrange(N)) for n in xrange(N)) # NxN matrix to read every patterns contained in input image
directions = ((-1, 0), (1, 0), (0, -1), (0, 1)) # (x, y) tuples to access the 4 neighboring cells of a specific cell
all = [] # array list to store all the patterns found in input

#### Stores the different patterns found in input

for y in xrange(img_h):
for x in xrange(img_w):

''' The one-liner below (cmat) creates a NxN matrix with (x, y)
being its top left corner. This matrix will wrap around the edges
of the input image. The whole snippet reads every NxN part of the
input image and store the associated colors. Each NxN part is
called a 'pattern' (of colors). Each pattern can be rotated or
flipped (not mandatory).
'''

cmat = tuple(tuple(img.pixels[((x+n)%img_w)+(((a[0]+img_w*y)/img_w)%img_h)*img_w] for n in a) for a in kernel)

# Storing rotated patterns (90°, 180°, 270°, 360°)
for r in xrange(4):
cmat = zip(*cmat[::-1]) # +90° rotation
all.append(cmat)
all.append(cmat[::-1]) # vertical flip
all.append([a[::-1] for a in cmat]) # horizontal flip

#### Flatten pattern matrices + count occurences

''' Once every pattern has been stored,
- we flatten them (convert to 1D) for convenience
- count the number of occurences for each one of them (one pattern can be
found multiple times in input)
- select and store unique patterns only
'''

all = [tuple(chain.from_iterable(p)) for p in all]  # flattening arrays
c = Counter(all) # Python counter
freqs = c.values() # number of occurences for each unique pattern
patterns = c.keys() # list of unique patterns
npat = len(freqs) # number of unique patterns

#### Initializes the 'wave' (wave), entropy (entropy) and adjacencies (adjacencies) array lists

''' Array wave (the Wave) keeps track of all the available patterns, for
each cell. At start start, all patterns are valid anywhere in the Wave so
each subarray is a list of indices of all the patterns
'''

wave = dict(enumerate(tuple(set(range(npat)) for i in xrange(out_w*out_h))))

''' Array entropy should normally be populated with entropy values.
Entropy is just a fancy way to represent the number of patterns
still available in a cell. We can skip this computation and populate
the array with the number of available patterns instead.

At start all patterns are valid anywhere in the Wave, so all cells
share the same value (npat). We must however pick one cell at random and
assign a lower value to it. Why ? Because the algorithm in draw() needs
to find a cell with the minimum non-zero entropy value.
'''

entropy = dict(enumerate(sample(tuple(npat if i > 0 else npat-1 for i in xrange(out_w*out_h)), out_w*out_h)))

describes the ways
that the patterns can be placed near one another. More explanations below
'''

adjacencies = dict(enumerate(tuple(set() for dir in xrange(len(directions))) for i in xrange(npat))) # explanations below

#### Computation of patterns compatibilities (check if some patterns are adjacent, if so -> store them based on their location)

''' EXAMPLE:
If pattern index 42 can placed to the right of pattern index 120,
we will store this adjacency rule as follow:

Here '1' stands for 'right' or 'East'/'E'

0 = left or West/W
1 = right or East/E
2 = up or North/N
3 = down or South/S
'''

# Comparing patterns to each other
for i1 in xrange(npat):
for i2 in xrange(npat):

'''(in case when N = 3) If the first two columns of
pattern 1 == the last two columns of pattern 2
--> pattern 2 can be placed to the left (0) of pattern 1
'''

if [n for i, n in enumerate(patterns[i1]) if i%N!=(N-1)] == [n for i, n in enumerate(patterns[i2]) if i%N!=0]:

'''(in case when N = 3) If the first two rows of
pattern 1 == the last two rows of pattern 2
--> pattern 2 can be placed on top (2) of pattern 1
'''

if patterns[i1][:(N*N)-N] == patterns[i2][N:]:

def draw():
    global entropy, wave

    # Simple stopping mechanism

    '''If the dict (or arraylist) of entropies is empty -> stop iterating.
    We'll see later that each time a cell is collapsed, its corresponding key
    in entropy is deleted
    '''

    if not entropy:
        print 'finished'
        noLoop()
        return

    #### OBSERVATION

    ''' Find cell with minimum non-zero entropy (not collapsed yet).'''

    entropy_min = min(entropy, key = entropy.get)

    #### COLLAPSE

    ''' Among the patterns available in the selected cell (the one with min entropy),
    select one pattern randomly, weighted by the frequency that pattern appears
    in the input image.'''

    pattern_id = choice([pattern_idx for pattern_idx in wave[entropy_min] for i in xrange(freqs[pattern_idx])]) # index of selected pattern

    ''' The Wave's subarray corresponding to the cell with min entropy should now only contain
    the id of the selected pattern '''

    wave[entropy_min] = {pattern_id}

    ''' Its key can be deleted in the dict of entropies '''

    del entropy[entropy_min]

    #### PROPAGATION

    ''' Once a cell is collapsed, its index is put in a stack.
    That stack is meant later to temporarily store indices of neighboring cells '''

    stack = {entropy_min}

    ''' The propagation will last as long as that stack is filled with indices '''

    while stack:

        ''' First thing we do is pop() an index contained in the stack (the only one for now)
        and get the indices of its 4 neighboring cells (E, W, N, S).
        We have to keep them within bounds and make sure they wrap around. '''

        cell_idx = stack.pop() # index of current cell
        for dir, t in enumerate(directions):
            x = (cell_idx%out_w + t[0])%out_w
            y = (cell_idx/out_w + t[1])%out_h
            neighbor_idx = x + y * out_w # index of neighboring cell

            ''' We make sure the neighboring cell is not collapsed yet
            (we don't want to update a cell that has only 1 pattern available)
            '''

            if neighbor_idx in entropy:

                ''' Check all patterns that COULD be placed at that location.
                EX: if the neighboring cell is on the left of the current cell
                (east side), we look at all the patterns that can be placed on
                the left of each pattern contained in the current cell.
                '''

                possible = {n for pattern_idx in wave[cell_idx] for n in adjacencies[pattern_idx][dir]}

                ''' We also look at the patterns that ARE available in the
                neighboring cell
                '''

                available = wave[neighbor_idx]

                ''' Now we make sure that the neighboring cell really needs to
                be updated. If all its available patterns are already in the
                list of all the possible patterns -> there's no need to update
                it (the algorithm skips this neighbor and goes on to the next)
                '''

                if not available.issubset(possible):

                    ''' If it is not a subset of the possible list
                    -> we look at the intersection of the two sets
                    (all the patterns that can be placed at that location
                    and that, "luckily", are available at that same location)
                    '''

                    intersection = possible & available

                    ''' If they don't intersect (patterns that could have been
                    placed there but are not available) it means we ran into a
                    "contradiction". We have to stop the whole WFC algorithm.
                    '''

                    if not intersection:
                        noLoop()
                        return

                    ''' If, on the contrary, they do intersect -> we update the
                    neighboring cell with that refined set of pattern indices
                    '''

                    wave[neighbor_idx] = intersection

                    ''' Because that neighboring cell has been updated, its number of valid patterns has decreased
                    and its entropy must be updated accordingly.
                    Note that we're subtracting a small random value to mix
                    things up: sometimes several cells end up with the same
                    minimum entropy value and this prevents always selecting the
                    first of them. It's a cosmetic trick to break the
                    monotony of the animation '''

                    entropy[neighbor_idx] = len(wave[neighbor_idx]) - random(.1)

                    ''' Finally, and most importantly, we add the index of that
                    neighboring cell to the stack so it becomes the next current
                    cell in turn (the one whose neighbors will be updated
                    during the next while loop)
                    '''

                    stack.add(neighbor_idx)

    #### RENDERING

    ''' The collapsed cell will always be filled with the first color (top left
    corner) of the selected pattern '''

    fill(patterns[pattern_id][0])
    rect((entropy_min%out_w) * cell_w, (entropy_min/out_w) * cell_h, cell_w, cell_h)
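The frequency-weighted pick in the COLLAPSE step works by repeating each pattern index as many times as it appeared in the input, so a uniform `choice()` becomes a weighted draw. A plain Python 3 check with hypothetical frequencies:

```python
from random import choice

# Hypothetical frequencies: pattern 0 appeared 3 times in the input, pattern 1 once
freqs = {0: 3, 1: 1}
cell = {0, 1}   # patterns still available in the cell being collapsed

# Each index is repeated freqs[idx] times, biasing the uniform choice()
weighted = [idx for idx in cell for i in range(freqs[idx])]

assert sorted(weighted) == [0, 0, 0, 1]
assert choice(weighted) in cell
```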


Yes, I named the entropy array “H” by convention even though it is the entropy itself that should be named “H”, not the array. Thinking about it now, “E” would be fine as well; at least it would ensure greater consistency between the arrays: W --> wave, A --> adjacency, E --> entropy.

Totally agree. Thank you so much for your revised version. It’s beyond me why the annotated script doesn’t run on your side, it works perfectly fine on my computer…

@GoToLoop

In this regard I would also add that the overlapping model is more suitable for students, mainly for 3 reasons:

• tilesets are not easy to get a hold of / create
• it is easier to manage a single bitmap image than a whole set of tiles (no rotations to compute and store, no adjacencies or frequencies to enter by hand)

Actually, the output that looks like an archipelago (first gif) was based on an input image made in Processing. Alternatively, this could be a great incentive for students / hobbyists to create their own bitmaps while learning the basics of Processing.

Example

N = 16 # 16 x 16 pixels

def setup():
    size(N, N)

    # colors
    w = color(242, 242, 240) # white
    b = color(124, 215, 235) # blue
    g = color(66, 217, 134)  # green
    B = color(0)             # black
    c = color(198, 149, 99)  # chestnut brown

    plant = [w, w, w, w, w, w, B, B, B, w, w, w, w, w, w, w,
             w, w, w, w, w, B, g, g, g, B, w, w, w, w, w, w,
             w, w, B, B, w, B, g, B, g, B, w, w, B, B, w, w,
             w, B, g, g, B, B, g, g, g, B, w, B, g, g, B, w,
             w, B, g, B, g, B, g, B, g, B, B, g, B, g, B, w,
             w, B, g, g, g, B, g, g, g, B, g, g, g, g, B, w,
             w, w, B, g, g, g, B, g, B, g, B, g, g, B, w, w,
             w, w, w, B, g, g, B, g, B, g, g, g, B, w, w, w,
             w, B, B, B, B, B, B, B, B, B, B, B, B, B, B, w,
             B, b, b, b, b, b, b, b, b, b, b, b, b, b, b, B,
             B, b, b, b, b, b, b, b, b, b, b, b, b, b, b, B,
             B, b, b, b, b, b, b, b, b, b, b, b, b, b, b, B,
             w, B, B, b, b, b, b, b, b, b, b, b, b, B, B, w,
             w, w, w, B, B, B, B, B, B, B, B, B, B, w, w, w,
             w, w, B, b, b, b, b, b, b, b, b, b, b, B, w, w,
             w, w, B, B, B, B, B, B, B, B, B, B, B, B, w, w]

    island = [b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b,
              b, b, b, b, b, b, b, b, b, b, g, g, b, b, b, b,
              b, b, b, b, b, b, b, b, g, g, g, g, g, b, b, b,
              b, b, b, b, b, b, b, b, g, g, g, g, g, g, b, b,
              b, b, b, b, b, g, g, g, g, g, g, g, g, g, b, b,
              b, b, b, b, g, g, g, g, g, g, g, g, g, g, b, b,
              b, b, g, g, g, g, g, g, g, g, g, g, g, g, g, b,
              b, b, g, g, g, g, g, g, g, g, g, g, g, g, g, b,
              b, b, c, g, g, g, g, g, g, g, g, g, g, g, g, b,
              b, b, b, c, g, g, g, g, g, g, g, g, g, g, c, b,
              b, b, b, b, g, g, g, g, g, g, g, g, c, c, b, b,
              b, b, b, g, g, g, g, g, g, g, g, g, b, b, b, b,
              b, b, b, c, g, g, c, c, g, g, g, g, b, b, b, b,
              b, b, b, b, c, c, b, b, c, c, c, c, b, b, b, b,
              b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b,
              b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b]
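Bitmaps like these are then sliced by the overlapping model into N×N patterns, one per pixel, typically wrapping around at the edges. A minimal plain-Python sketch of that extraction step (the function and variable names here are mine, not the sketch's):

```python
# Minimal pattern extraction from a flat W x H bitmap, with wrap-around.
# Helper names are hypothetical, not from the sketch above.
def extract_patterns(bitmap, W, H, N):
    patterns = []
    for y in range(H):
        for x in range(W):
            # flat N x N block whose top-left corner is (x, y), wrapping at edges
            patterns.append(tuple(bitmap[((x + dx) % W) + ((y + dy) % H) * W]
                                  for dy in range(N) for dx in range(N)))
    return patterns

# 2 x 2 checkerboard, N = 2: each corner yields one of the two alternating blocks
board = ['B', 'W',
         'W', 'B']
pats = extract_patterns(board, 2, 2, 2)
assert len(pats) == 4                       # one pattern per pixel
assert pats[0] == ('B', 'W', 'W', 'B')      # block anchored at (0, 0)
```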