So, long story short, I am working on a neural network. Here it is, plus all the assorted files:
https://drive.google.com/open?id=1GLXP1Ta0bqXB6fZLjZUJ4ysk23e7cDwt
Read through my comments to better understand it, especially the keyPressed() function, to see what sorts of actions you can take. Its main goal is to train itself to recognize the digits found in the imported files. To do that, launch it, press e, then a, then Enter. Wait until it finishes, then press a and Enter again and wait. Something should be printed to the console at that point.
Repeat this last step (press a, Enter, wait) for as long as you want. On every second iteration, the epoch counter (the number right below the 28x28 grid) goes up. See how far you get before all the numbers turn into tiny squares and the console starts printing NaN.
This is driving me nuts. It's just simple math happening behind the scenes, no different from a physics simulator. The parameters (weights and biases) are adjusted via velocity terms, which are themselves updated with something akin to a derivative. All of this happens inside the calcBatchDerivative() function. It's a complicated, self-referential function, but I know it works, since sometimes the network DOESN'T break, haha.
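Conceptually, the update scheme is something like this momentum-style sketch (heavily simplified, with made-up names and values; the real logic lives in calcBatchDerivative()):

```java
float learningRate = 0.01f;  // placeholder value
float momentum = 0.9f;       // placeholder value
float w = 0.5f;              // one weight
float v = 0.0f;              // its velocity

void updateWeight(float gradient) {
  v = momentum * v - learningRate * gradient;  // derivative-like velocity update
  w += v;                                      // the weight follows its velocity
}
```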
So, my question is: what's causing the NaNs? I suspect they all flip simultaneously due to a ripple effect, but I don't know where it begins, nor what's causing it. I know that NaNs can show up after a number gets too large: overflow first produces Infinity, and operations like Infinity - Infinity or 0 * Infinity then yield NaN. But there isn't anything that should make those weights and biases jump like this (unless I've screwed up the math, but that is unlikely).
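For what it's worth, here's a minimal demonstration of how such a cascade can start in float math:

```java
float big = 1e30f;
float overflow = big * big;    // too large for a float: becomes Infinity, not NaN
println(overflow);             // Infinity
println(overflow - overflow);  // NaN: Infinity minus Infinity is indeterminate
println(0.0f * overflow);      // NaN as well
```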
If you set batchSize to 1 before launch and then press e but not a, you can keep pressing Enter to run the calcBatchDerivative() function one step at a time. That could be helpful for debugging.
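On top of that, the parameters could be scanned after each step with a helper along these lines (a sketch only: the weights array name and layout here are assumptions, so it would need adapting to the actual data structures):

```java
// Returns true at the first NaN/Infinity it finds, so the ripple source is visible.
boolean reportFirstBadValue(float[][] weights, String label) {
  for (int i = 0; i < weights.length; i++) {
    for (int j = 0; j < weights[i].length; j++) {
      float w = weights[i][j];
      if (Float.isNaN(w) || Float.isInfinite(w)) {
        println(label + "[" + i + "][" + j + "] went bad: " + w);
        return true;
      }
    }
  }
  return false;
}
```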
If you have any questions about the code, I can fill you in.