Hello, I am working on a Processing-based editor for Conway’s Game of Life, basically a Golly-lite with a few basic features like pattern loading, grid editing, speed up/down, etc.
Full disclosure, I have used ChatGPT for my code.
(For those interested in this question who don’t know what CGOL / cellular automata are: you essentially take a grid of cells, each of which can be either “alive” or “dead”. A pair of rules determines how many living neighbors a dead cell needs to come alive on the next frame, and how many living neighbors a living cell needs to stay alive on the next frame. In my code, these cells are represented in a 2D boolean array.)
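To make the rule description concrete, here is a minimal standalone sketch in plain Java. It is not the project's actual code: the class name `LifeStepSketch`, the `step` method, and the bodies of `inArray` and `countLiveNeighbors` are assumptions written for illustration. In particular, this version treats cells outside the grid as dead, which may differ from how the real editor handles edges. The rule layout (index 0 for birth counts, index 1 for survival counts) mirrors how `currentRule` is used in the snippets further down.

```java
class LifeStepSketch {
    // rule[0]: neighbor counts that bring a dead cell to life (birth).
    // rule[1]: neighbor counts that keep a living cell alive (survival).
    // Standard Conway rules are B3/S23.
    static final int[][] CONWAY = { {3}, {2, 3} };

    // Assumed helper: true if value occurs in arr.
    static boolean inArray(int value, int[] arr) {
        for (int a : arr) {
            if (a == value) return true;
        }
        return false;
    }

    // Assumed helper: counts the 8 surrounding cells.
    // Cells outside the grid count as dead (an assumption, no wrap-around).
    static int countLiveNeighbors(boolean[][] grid, int x, int y) {
        int cols = grid.length, rows = grid[0].length;
        int n = 0;
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                if (dx == 0 && dy == 0) continue; // skip the cell itself
                int nx = x + dx, ny = y + dy;
                if (nx >= 0 && nx < cols && ny >= 0 && ny < rows && grid[nx][ny]) n++;
            }
        }
        return n;
    }

    // Computes the next generation into a fresh array.
    static boolean[][] step(boolean[][] grid, int[][] rule) {
        int cols = grid.length, rows = grid[0].length;
        boolean[][] next = new boolean[cols][rows];
        for (int x = 0; x < cols; x++) {
            for (int y = 0; y < rows; y++) {
                int n = countLiveNeighbors(grid, x, y);
                next[x][y] = inArray(n, grid[x][y] ? rule[1] : rule[0]);
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // A "blinker": three cells in a line, which flips orientation each step.
        boolean[][] g = new boolean[5][5];
        g[2][1] = true;
        g[2][2] = true;
        g[2][3] = true;
        boolean[][] n = step(g, CONWAY);
        System.out.println(n[1][2] && n[2][2] && n[3][2]);
    }
}
```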
The issue at hand is that it is not like a Golly-lite; it’s more of a Golly-magnum in terms of how incredibly unperformant it is. That’s because I am not very good at programming, but the point is: I’ve now dabbled in multithreading in an attempt to optimise my code. I figured that the greatest performance bottleneck is that every single cell of the grid is checked every frame in one for loop, so what if I could distribute the workload across multiple loops, depending on the number of cores in the user’s CPU? Except this has had no positive effect on performance. Average frametimes are the same as with the old code, if not worse in some cases. So I do wonder why.
This is my old code. It counts the neighbors of every single cell and then writes its new status into a new boolean array, which is then used for the next frame.
for (int x = 0; x < cols; x++) {
  for (int y = 0; y < rows; y++) {
    int neighbors = countLiveNeighbors(x, y);
    if (grid[x][y]) {
      nextGrid[x][y] = inArray(neighbors, currentRule[1]);
    } else {
      nextGrid[x][y] = inArray(neighbors, currentRule[0]);
    }
  }
}
boolean[][] temp = grid;
cache.addLast(cloneGrid(grid)); //This is for a rewind feature
if (cache.size() > maxCache) {
  cache.removeFirst();
}
grid = nextGrid;
nextGrid = temp;
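The rewind cache in the last few lines can be sketched on its own as follows. This is a guess at the surrounding pieces, not the real implementation: the class name `RewindCacheSketch`, the `push` and `rewind` methods, and the use of `ArrayDeque` are assumptions (the snippet only shows that `cache` supports `addLast`, `removeFirst` and `size`), and `cloneGrid` is assumed to be a deep copy. Note that a deep copy touches every cell, so it is a second full pass over the grid on every generation.

```java
import java.util.ArrayDeque;

class RewindCacheSketch {
    // Oldest snapshot at the front, newest at the back.
    final ArrayDeque<boolean[][]> cache = new ArrayDeque<>();
    final int maxCache;

    RewindCacheSketch(int maxCache) {
        this.maxCache = maxCache;
    }

    // Assumed behaviour of cloneGrid: copy each inner array so that later
    // writes to the live grid cannot change a stored snapshot.
    static boolean[][] cloneGrid(boolean[][] src) {
        boolean[][] copy = new boolean[src.length][];
        for (int x = 0; x < src.length; x++) {
            copy[x] = src[x].clone();
        }
        return copy;
    }

    // Stores a snapshot and drops the oldest one once the limit is exceeded.
    void push(boolean[][] grid) {
        cache.addLast(cloneGrid(grid));
        if (cache.size() > maxCache) {
            cache.removeFirst();
        }
    }

    // Returns the most recent snapshot, or null if nothing is stored.
    boolean[][] rewind() {
        return cache.pollLast();
    }
}
```

The copy is needed because `grid` and `nextGrid` are swapped and reused, so the old generation would otherwise be overwritten one frame later.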
There are several reasons why my old code is slow, and I still need to figure out ways to deal with those, but one idea I had was multithreading, something I haven’t actually implemented in a program before. This is my new code:
futures.clear();
for (int i = 0; i < cores; i++) { //The grid is sliced into horizontal chunks, and each CPU core is supposed to handle one of these chunks.
  final int startY = i * sliceHeight;
  final int endY = (i == cores - 1) ? rows : startY + sliceHeight;
  print("Core " + i + " - StartY: " + startY + " endY: " + endY + "\n"); //Just a debug line to see if the multithreading was even working. Didn't impact performance
  Future<?> f = executor.submit(() -> { //Same deal as above, but now it's run by each core individually for their chunk
    for (int y = startY; y < endY; y++) {
      for (int x = 0; x < cols; x++) {
        int neighbors = countLiveNeighbors(x, y);
        if (grid[x][y]) {
          nextGrid[x][y] = inArray(neighbors, currentRule[1]);
        } else {
          nextGrid[x][y] = inArray(neighbors, currentRule[0]);
        }
      }
    }
  });
  futures.add(f);
}
for (Future<?> f : futures) {
  try {
    f.get(); //maybe this is the pain point? but waiting for all of the threads should still be faster than loading it all on one, right?
  } catch (Exception e) {
    e.printStackTrace();
  }
}
boolean[][] temp = grid;
cache.addLast(cloneGrid(grid)); //This is for a rewind feature
if (cache.size() > maxCache) {
  cache.removeFirst();
}
grid = nextGrid;
nextGrid = temp;
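For anyone who wants to reproduce the comparison outside the editor, here is a minimal standalone sketch of the same row-slicing idea with a simple timer around it. Several parts are assumptions, since the snippet above does not show them: the pool is created with `Executors.newFixedThreadPool`, the slice count comes from `Runtime.getRuntime().availableProcessors()`, `sliceHeight` is taken to be `rows / slices`, the neighbor count treats off-grid cells as dead, and the rule is hard-coded to B3/S23 to keep it short. The class name `SlicedStepSketch`, the grid size and the generation count are arbitrary, and the rewind cache is left out so only the step itself is timed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SlicedStepSketch {
    // Assumed neighbor count: off-grid cells are dead, no wrap-around.
    static int countLiveNeighbors(boolean[][] grid, int x, int y) {
        int cols = grid.length, rows = grid[0].length;
        int n = 0;
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                if (dx == 0 && dy == 0) continue;
                int nx = x + dx, ny = y + dy;
                if (nx >= 0 && nx < cols && ny >= 0 && ny < rows && grid[nx][ny]) n++;
            }
        }
        return n;
    }

    // Writes one generation of grid into next, one task per horizontal slice.
    static void stepParallel(boolean[][] grid, boolean[][] next,
                             ExecutorService executor, int slices) {
        int cols = grid.length, rows = grid[0].length;
        int sliceHeight = rows / slices; // assumed definition of sliceHeight
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < slices; i++) {
            final int startY = i * sliceHeight;
            // The last slice also takes any leftover rows.
            final int endY = (i == slices - 1) ? rows : startY + sliceHeight;
            futures.add(executor.submit(() -> {
                for (int y = startY; y < endY; y++) {
                    for (int x = 0; x < cols; x++) {
                        int n = countLiveNeighbors(grid, x, y);
                        next[x][y] = grid[x][y] ? (n == 2 || n == 3) : (n == 3);
                    }
                }
            }));
        }
        // Wait for every slice before the caller swaps the grids.
        for (Future<?> f : futures) {
            try {
                f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(cores);
        int size = 512, generations = 100;
        boolean[][] grid = new boolean[size][size];
        boolean[][] next = new boolean[size][size];
        Random rng = new Random(1); // fixed seed so runs are comparable
        for (boolean[] column : grid) {
            for (int y = 0; y < size; y++) column[y] = rng.nextBoolean();
        }
        long t0 = System.nanoTime();
        for (int gen = 0; gen < generations; gen++) {
            stepParallel(grid, next, executor, cores);
            boolean[][] temp = grid; // swap buffers, as in the snippet above
            grid = next;
            next = temp;
        }
        long t1 = System.nanoTime();
        System.out.println("avg ms per generation: " + (t1 - t0) / 1e6 / generations);
        executor.shutdown();
    }
}
```

Passing `1` as the slice count runs the same code as a single task, which gives a like-for-like baseline to compare against the multi-slice timing.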