I taught an AI to play Snake! I wanted to start learning how to implement Deep Q-Learning, just to get my hands dirty. Unlike my last post on neural networks, this time the math wasn't too hard (I still struggled, obviously, because it's math lol), but the ideas were brutal. Understanding them all and turning them into a working implementation was painful. On top of that, there are so many bugs, variables, and processes to account for before this algorithm works at all, so it took a bit longer than last time.
Now, for the hyperparameters:
- 12 input neurons, 1 hidden layer with 64 neurons, 3 output neurons
The way this works is that I feed in the distance from the agent's head to the apple, to its own body, and to the wall, in each of 4 directions, for 3 × 4 = 12 inputs. 64 neurons was the most my computer could handle for the hidden layer while staying above 20 fps (I wrote my own library for the neural-net math, so it's pretty slow). The output neurons determine whether the player turns left, turns right, or keeps going straight.
- Step size (learning rate) of 0.001, batch size of 1000, gamma (discount factor) of 0.9
- Reward of +10 for eating an apple, −10 for dying, and 0 for anything else
- ReLU activation function for the hidden layer, none (linear) for the output layer
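To make the input setup above concrete, here's a rough sketch of how those 12 distance features could be computed. This assumes a grid where `snake` is a list of (x, y) segments with the head first and `apple` is an (x, y) cell; the function name and representation are my own guesses, not the post's actual code.

```python
import numpy as np

def ray_distances(snake, apple, grid_w, grid_h):
    """Distance from the head to the apple, the snake's own body, and the
    wall, looking up/down/left/right -> 3 * 4 = 12 features."""
    hx, hy = snake[0]
    body = set(snake[1:])
    feats = []
    for dx, dy in [(0, -1), (0, 1), (-1, 0), (1, 0)]:  # up, down, left, right
        d_apple = d_body = 0.0
        x, y, step = hx, hy, 0
        while 0 <= x < grid_w and 0 <= y < grid_h:
            if (x, y) == apple and d_apple == 0 and step > 0:
                d_apple = step          # first time we see the apple on this ray
            if (x, y) in body and d_body == 0:
                d_body = step           # first body segment on this ray
            x, y, step = x + dx, y + dy, step + 1
        d_wall = step                   # steps taken before leaving the grid
        feats += [d_apple, d_body, d_wall]
    return np.array(feats)              # shape (12,)
```

A distance of 0 here doubles as "nothing on this ray," which is a common shortcut for this kind of ray-cast state; normalizing by the grid size is another option.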
For the next project, I might try learning PPO (proximal policy optimization), or NEAT (NeuroEvolution of Augmenting Topologies, a genetic algorithm).
If anyone sees this, have a great day!