Hi @QTPay,
There are two questions inside your original question:
- Is the implementation correct?
- How to make an animation out of it?
Before trying to answer those, a quick clarification.
MCMC (Markov Chain Monte Carlo) isn't an algorithm per se but rather refers to a class of algorithms including Metropolis-Hastings, Gibbs sampling, Slice sampling, etc. These are methods used to approximate a desired probability distribution. From what I understand, the simulation approach you are trying to implement (simulation of the hard-core model) is sometimes employed as an example to illustrate this notion.
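To make that concrete, here is a minimal sketch of one member of that family, random-walk Metropolis-Hastings, written in plain Python with the standard `random` module so it runs outside Processing. The function name and parameters are mine, chosen for illustration, not something from your post:

```python
import math
import random

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=42):
    """Random-walk Metropolis-Hastings: draws dependent samples whose
    long-run distribution approximates the density exp(log_target)."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        # accept with probability min(1, target(proposal) / target(x))
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

# target: a standard normal, known only up to a normalizing constant
samples = metropolis_hastings(lambda x: -0.5 * x * x, 20000)
```

The hard-core simulation you are after follows the same Markov-chain principle, only the state is a whole grid instead of a single number.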
As the topic of statistics is slightly out of the scope of this forum, I would strongly suggest asking your first question on a dedicated platform like Stack Overflow or Cross Validated. That said, a quick look at the original formulation of your algorithm (from page 46 of this book) shows that:

- at step 3, the reassignment of the vertex value to 0 happens outside the preceding if statement
- step 4 doesn't require any explicit instruction (leave the matrix/2D array as it is)
A reinterpretation of this routine in Python would probably (better ask qualified persons) look like this:
N = 8    # number of rows/cols
T = 1000 # number of iterations
G = [0] * (N*N) # matrix as a 1D array

for iter in xrange(T):
    i = int(random(N*N)) # index of vertex picked randomly (uniform)
    c = int(random(2))   # coin value picked randomly (uniform)

    # if heads and all neighbors = 0 -> assign value 1
    if c and not any(map(lambda x: G[x], neighbors(i))):
        G[i] = 1
        continue

    G[i] = 0 # else -> assign value 0
Here `neighbors` can be a function or a dictionary returning the Von Neumann neighborhood of a vertex at a given index. The simple sketch below tries to render the resulting array in the style of the original author:
Populate Grid
N = 8   # number of rows/cols
T = 100 # number of iterations
G = [0] * (N*N) # matrix as a 1D array
off = 40

def setup():
    size(off*10, off*10)
    background('#FFFFFF')
    translate(off*1.5, off*1.5)

    #POPULATE GRID
    for iter in xrange(T):
        i = int(random(N*N)) # index of vertex picked randomly (uniform)
        c = int(random(2))   # coin value picked randomly (uniform)

        # if heads and all neighbors = 0 -> assign value 1
        if c and not any(map(lambda x: G[x], neighbors(i))):
            G[i] = 1
            continue

        G[i] = 0 # else -> assign value 0

    #RENDER
    for id, v in enumerate(G):
        x = id%N
        y = id//N
        fill((not v)<<8) # black disc when the vertex holds a 1
        ellipse(x*off, y*off, off>>1, off>>1)
        if x: line(x*off-(off*.75), y*off, x*off-(off*.25), y*off)
        if y: line(x*off, y*off-(off*.75), x*off, y*off-(off*.25))

def neighbors(id):
    nbhood = set()
    for i, s in enumerate((-1, 1, -N, N)):
        nid = id+s
        if 0 <= nid < N*N:
            # discard horizontal neighbors that wrap to another row
            if i < 2 and nid//N != id//N:
                continue
            nbhood.add(nid)
    return nbhood
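Since the sketch relies on Processing builtins, it may be reassuring to check the logic outside the IDE. The plain-Python adaptation below (the standard `random` module replaces Processing's `random()`, and the seed is my addition) verifies the defining hard-core property: no two adjacent vertices both hold a 1.

```python
import random

N = 8     # number of rows/cols
T = 1000  # number of iterations

def neighbors(idx):
    """Von Neumann neighborhood of a vertex in an N*N grid stored as a flat list."""
    nbhood = set()
    for i, s in enumerate((-1, 1, -N, N)):
        nid = idx + s
        if 0 <= nid < N * N:
            if i < 2 and nid // N != idx // N:
                continue  # horizontal step wrapped to another row
            nbhood.add(nid)
    return nbhood

def populate(seed=0):
    """Run T steps of the chain on an initially empty grid and return it."""
    rng = random.Random(seed)
    G = [0] * (N * N)
    for _ in range(T):
        i = rng.randrange(N * N)  # vertex picked uniformly at random
        c = rng.randrange(2)      # fair coin flip
        if c and not any(G[n] for n in neighbors(i)):
            G[i] = 1
            continue
        G[i] = 0
    return G

G = populate()
# hard-core property: no two adjacent vertices are both occupied
assert all(not (G[i] and G[n]) for i in range(N * N) for n in neighbors(i))
```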
There is however a key element that you seem to be missing. The point of the algorithm is not only to populate a grid through a Markov process but also to do it many times so as to converge to an approximation of a distribution. Roughly, this is the Monte Carlo part of the process (MCMC).
This brings us to your second issue: the animation. In order for us to provide an answer we first need to know what part of this algorithm you would like to animate:
- is it the grid population process (as your original post seems to imply)?
- is it the succession of grids being populated one after the other?
- is it the converging movement of their distribution?
- is it all of the above?
If all you need is the grid animation then the sketch above may help you get on the right track.
If, on the other hand, you want to plot the converging distribution, then I think (better ask qualified persons) you would have to:
- store the total number of positive values (`1`) for each grid that you populate
- divide that number by the number of iterations (mean)
- save this mean value at each generation in a list
- quantize + normalize + count their occurrences to build a histogram
from collections import Counter

N = 8
sums = ()
for gen in xrange(1000):
    iter = gen+1
    arr = [0] * (N*N)
    sums += populate(arr, iter),

quantized_sums = map(lambda x: int(x)+round((x%1)*10/2.)*0.2, sums) # discretize to the nearest multiple of 0.2
bins = sorted(Counter(quantized_sums).items())
omax = max(map(lambda x: x[1], bins)) # maximum occurrence (needed for normalization)

for i, (k, v) in enumerate(bins):
    h = norm(v, 0, omax) * 120 # mapping the normalized occurrence to a chosen height
    ### draw something here ###
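The quantization expression can be sanity-checked on its own in plain Python; the mean values below are made-up examples, not output from the sketch:

```python
from collections import Counter

def quantize(x):
    # discretize x to the nearest multiple of 0.2 (same expression as in the sketch)
    return int(x) + round((x % 1) * 10 / 2.0) * 0.2

means = [0.07, 0.18, 0.41, 0.43, 0.78]  # made-up example values
bins = sorted(Counter(quantize(m) for m in means).items())
# 0.41 and 0.43 fall into the same 0.4 bin
```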
Full demo
from collections import Counter

N = 8    # number of rows/cols
off = 40

def setup():
    size(off*10, off*10)
    background('#FFFFFF')
    translate(off*1.5, off*1.5)

    sums = ()
    for gen in xrange(1000):
        iter = gen+1
        arr = [0] * (N*N)
        sums += populate(arr, iter),

    quantized_sums = map(lambda x: int(x)+round((x%1)*10/2.)*0.2, sums) # discretize to the nearest multiple of 0.2
    bins = sorted(Counter(quantized_sums).items())
    omax = max(map(lambda x: x[1], bins)) # maximum occurrence (needed for normalization)

    for i, (k, v) in enumerate(bins):
        h = norm(v, 0, omax) * 120 # mapping the normalized occurrence to a chosen height
        ### draw something here ###

def populate(G, T):
    count = 0
    for iter in xrange(T):
        i = int(random(N*N))
        c = int(random(2))
        if c and not any(map(lambda x: G[x], neighbors(i))):
            G[i] = 1
            continue
        G[i] = 0
    count += sum(G)
    return count / float(T)

def neighbors(id):
    nbhood = set()
    for i, s in enumerate((-1, 1, -N, N)):
        nid = id+s
        if 0 <= nid < N*N:
            if i < 2 and nid//N != id//N:
                continue
            nbhood.add(nid)
    return nbhood
Convergence to a Normal distribution after 1000 generations.
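For reference, the whole pipeline (populate each generation, quantize the means, count occurrences) can be reproduced outside Processing. In this plain-Python adaptation the `rng` parameter, the seed and the reduced generation count are my additions:

```python
import random
from collections import Counter

N = 8  # number of rows/cols

def neighbors(idx):
    # Von Neumann neighborhood in an N*N grid stored as a flat list
    nbhood = set()
    for i, s in enumerate((-1, 1, -N, N)):
        nid = idx + s
        if 0 <= nid < N * N:
            if i < 2 and nid // N != idx // N:
                continue  # horizontal step wrapped to another row
            nbhood.add(nid)
    return nbhood

def populate(G, T, rng):
    # run T Markov steps, then return final occupation count / T
    for _ in range(T):
        i = rng.randrange(N * N)
        if rng.randrange(2) and not any(G[n] for n in neighbors(i)):
            G[i] = 1
            continue
        G[i] = 0
    return sum(G) / float(T)

rng = random.Random(7)
sums = [populate([0] * (N * N), gen + 1, rng) for gen in range(200)]
quantized = [int(x) + round((x % 1) * 10 / 2.0) * 0.2 for x in sums]
bins = sorted(Counter(quantized).items())
```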
In both cases, I would suggest first having a look at how the draw() function works in Processing and then coming back later with your attempts.