 Flaw in current neural networks

I wanted to post this on a Coding Train episode on YouTube about neural networks, but I’ve been banned from commenting (again!):

There is a basic problem with conventional neural networks: too many weighted sums operate off a small set of nonlinearized values, so the outputs of the weighted sums are correlated/entangled with each other. Aside from anything else, this is an inefficient use of weight parameters:
https://discourse.numenta.org/t/non-linearity-sharing-in-deep-neural-networks-a-flaw/6033

There is a solution using random projections, or other projections that are faster to calculate, as long as they have some specific properties.

Also, the linear associative memory (AM) aspect of the weighted sum is poorly known; people should remember it. Linear AM does come with a lot of provisos; however, in conjunction with nonlinear functions it becomes a more general type of associative memory. I sort of half explained it here:
https://discourse.numenta.org/t/towards-demystifying-over-parameterization-in-deep-learning/5985
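
A minimal sketch of the weighted sum acting as a linear associative memory, in plain Python. The training method (a normalized delta rule), the random ±1 keys, and all the names here are assumptions chosen for illustration, not something taken from the linked post. It stores 8 key/target pairs in a single 256-element weight vector, which is well under capacity, and checks how closely each target is recalled.

```python
import random

def dot(a, b):
    """Plain weighted sum (dot product) of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def train(keys, targets, sweeps=50):
    """Fit one weight vector so that dot(w, key) is close to target for every pair."""
    n = len(keys[0])
    w = [0.0] * n
    for _ in range(sweeps):
        for key, target in zip(keys, targets):
            error = target - dot(w, key)
            # Normalized delta rule: after this step the current pair is recalled exactly.
            step = error / dot(key, key)
            for i in range(n):
                w[i] += step * key[i]
    return w

rng = random.Random(1)           # fixed seed so the run is repeatable
n = 256                          # number of weights (hypothetical size)
keys = [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(8)]
targets = [rng.uniform(-1.0, 1.0) for _ in keys]

w = train(keys, targets)
worst = max(abs(dot(w, k) - t) for k, t in zip(keys, targets))
print("worst recall error:", worst)
```

With far fewer stored pairs than weights, the recall error should shrink toward zero as the sweeps repeat. Storing more pairs than there are weights is where the provisos start to matter.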


This sounds very interesting. Could you show this as a Processing example project, or even as a tutorial?

I’m being a LA at the moment and not writing much code. Just ruminating a bit.
I did show this related associative memory thing before:
www.gamespace.eu5.org/associativememory/index.html
The code is on a free web server, and I think that company sometimes plays games with where things link to, but the link usually works.

If I understand, you are saying:

1. “conventional” neural networks are inefficient.
2. you can solve this problem (inefficiency) with “other projections that are faster to calculate as long as they have some specific properties.”

Two thoughts:

Coding Train is for (primarily) introductory learners to learn coding. That is the audience. Their goal is not to learn the most efficient techniques, but to learn basic / introductory techniques – if efficiency was the primary goal then it wouldn’t make sense to use Java / JavaScript over e.g. C++.

It is also a tutorial how-to series – if you want to suggest an improvement to the material, that suggestion should be in the form of alternate instructions that someone could actually follow, not a suggestion that beginners do their own research in techniques that are “poorly known” with “a lot of provisos.”

An associative memory tutorial for Processing / P5 sounds interesting – if one doesn’t exist, perhaps you should develop one, or partner with someone to do that! I recall you posting the Auto-Associative demo earlier, and I found it interesting (although a bit confusing).


I use Processing, so I would guess I can post here regardless. I may use Processing Java again as well as Processing JS. I had some kind of issue with multithreaded image loading with the Java version; maybe I forgot to use some preload function.
On point 2: I said in the link that you can use random projections, which take O(n ln(n)) time. That is a bit slow. After thinking about it, there may be faster projections you could use.
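
For reference, one well-known way to get a random projection in O(n ln(n)) time is to flip the sign of each input element at random and then apply a fast Walsh–Hadamard transform. The sketch below assumes that is the kind of projection meant; the post itself does not spell it out, and the function names are made up for illustration. The input length must be a power of 2.

```python
import math
import random

def fwht(values):
    """Fast Walsh-Hadamard transform, O(n log n), scaled to preserve vector length."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of 2")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b   # butterfly: sum and difference
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]

def random_projection(values, signs):
    """Random sign flips followed by the transform (hypothetical helper)."""
    return fwht([v * s for v, s in zip(values, signs)])

rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
signs = [rng.choice((-1.0, 1.0)) for _ in x]
y = random_projection(x, signs)
print(y)
```

Because the scaled transform is orthonormal, `y` has the same length (sum of squares) as `x`; the information is spread across all outputs rather than lost.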

In terms of a tutorial it looks like presenting one on the weighted sum would be worthwhile. There are quite a few aspects.

1. The dot product and angular distance
2. Linear associative memory (AM)
3. Under capacity linear AM giving error correction by repetition
4. Over capacity giving recall+Gaussian noise, but still close in angular distance
5. Conversion to more general associative memory using nonlinear functions
6. The linear classifier behavior
7. Nonlinear classifier with prior application of nonlinear functions
8. The central limit theorem and the weighted sum
9. Lots of related math, like simultaneous equations, up to the Moore–Penrose inverse
That would actually be the basics of artificial neural networks, which ought to be extremely well known at this stage.
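
As a starting point for item 1, here is a small sketch of how the dot product relates to the angle between a weight vector and an input. The vectors and names are made up for illustration. The weighted sum equals |w||x|cos(θ), so it is largest when the input points the same way as the weights and zero when the two are at right angles.

```python
import math

def dot(a, b):
    """Weighted sum of a and b."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in radians between two nonzero vectors."""
    cosine = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding just outside [-1, 1]
    return math.acos(cosine)

weights = [1.0, 0.0]
examples = {
    "aligned": [3.0, 0.0],
    "diagonal": [2.0, 2.0],
    "orthogonal": [0.0, 5.0],
}
for name, x in examples.items():
    degrees = math.degrees(angle_between(weights, x))
    print(name, "weighted sum:", dot(weights, x), "angle in degrees:", degrees)
```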