Why Isn't AI Being Trained to Compress and Decompress Input Like the Human Brain?

CoderSupreme@programming.dev · 5 days ago

Why Isn't AI Being Trained to Compress and Decompress Input Like the Human Brain?

iii@mander.xyz · edit-2 5 days ago

AI does work like that.

With (variational) auto-encoders, it’s very explicit.

With shallow convolutional neural networks, it’s fun to visualize the trained kernel weights, as they often show abstract, to me dreamlike, representations of the thing being trained for. Although they’re derived through a different method, search for “eigenfaces” for an example of what I mean.
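To give a rough feel for the eigenfaces idea, here is a toy sketch. Everything in it is made up for illustration: the "images" are just 4 pixels each, generated from a hypothetical hidden `pattern`, and plain power iteration stands in for the SVD/PCA routine you would use on real face photos. It only shows that the top principal component recovers the shared pattern.

```python
import math
import random

rng = random.Random(0)

# Hypothetical hidden pattern that every toy "image" contains to some degree.
pattern = [1.0, -1.0, 1.0, -1.0]
images = []
for _ in range(30):
    strength = rng.uniform(-1.0, 1.0)
    images.append([0.5 + 0.2 * strength * p + rng.gauss(0.0, 0.01) for p in pattern])

n_pixels = len(pattern)

# Subtract the mean image, as PCA-style methods do before finding components.
mean_image = [sum(img[j] for img in images) / len(images) for j in range(n_pixels)]
centered = [[img[j] - mean_image[j] for j in range(n_pixels)] for img in images]

# Pixel-by-pixel covariance matrix of the centered images.
cov = [
    [sum(c[a] * c[b] for c in centered) / len(centered) for b in range(n_pixels)]
    for a in range(n_pixels)
]

# Power iteration: repeatedly multiply by the covariance and renormalize
# to approximate the top eigenvector (the first "eigen-image").
component = [1.0, 0.0, 0.0, 0.0]
for _ in range(50):
    component = [sum(cov[a][b] * component[b] for b in range(n_pixels)) for a in range(n_pixels)]
    norm = math.sqrt(sum(v * v for v in component))
    component = [v / norm for v in component]

# Cosine similarity with the hidden pattern (whose length is 2).
alignment = abs(sum(c * p for c, p in zip(component, pattern))) / 2.0
print("first component:", [round(v, 3) for v in component])
print("alignment with hidden pattern:", round(alignment, 4))
```

With real faces the components are full-size images, which is where the dreamlike look comes from.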

In the currently hyped model architecture, transformers with attention, the encoded state can be thought of as a compressed version of its input. But human interpretation of those values is challenging.

remotelove@lemmy.ca · 5 days ago

That’s kinda how neural networks actually function. They don’t store massive amounts of data but, similar to us, tweak and adjust complex pathways of neurons that kinda just convert an input into a response.

When you ask an LLM a question, you are actually getting a sequence of words picked one at a time based on probabilities, not anything the LLM had to “think about” before responding. During its training, different patterns fed to the AI tweak and balance how and when specific neurons should fire. One way to think about it is that “memories” or data are stored in how the paths are formed (the connection weights), not in the core of the neuron itself.
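As a very simplified sketch of the "words based on probabilities" part: a model produces a score for every word in its vocabulary, the scores are turned into probabilities with a softmax, and one word is sampled. The four-word `vocab` and the `logits` below are invented for illustration; a real LLM computes the scores from the whole preceding text and has a vocabulary of many thousands of tokens.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(v - highest) for v in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical tiny vocabulary and made-up scores for "The cat sat on the ..."
vocab = ["cat", "dog", "mat", "moon"]
logits = [2.0, 1.0, 3.0, 0.1]

probs = softmax(logits)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
next_word = rng.choices(vocab, weights=probs, k=1)[0]

for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
print("sampled next word:", next_word)
```

Repeating this step, feeding each chosen word back in as input, is what produces a full response.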

There are several hundred configurations of artificial neural networks that can mimic different functions of our brains, including memory.

CoderSupreme@programming.dev · edit-2 5 days ago

Oh, so it’s mostly a side effect, but they are still primarily being trained to predict the next word.

iii@mander.xyz · 5 days ago

Not necessarily. Sometimes dimensionality reduction (the more common term for what is basically compression) is the explicit goal.

It can be used for outlier detection, similarity search, etc.
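A minimal sketch of the outlier-detection use, under loud assumptions: the 1-D `direction` below is simply assumed to have been learned already from 2-D points lying near the line y = 2x, and the `threshold` is a hypothetical value picked by hand. The idea is that a point the compressed representation cannot reproduce well is probably unlike the training data.

```python
import math

# Assumed, already-learned projection direction (unit vector along y = 2x).
direction = (1 / math.sqrt(5), 2 / math.sqrt(5))

def encode(point):
    """Compress a 2-D point to a single number."""
    return point[0] * direction[0] + point[1] * direction[1]

def decode(code):
    """Expand the single number back into a 2-D point."""
    return (code * direction[0], code * direction[1])

def reconstruction_error(point):
    """Distance between a point and its compressed-then-restored version."""
    return math.dist(point, decode(encode(point)))

typical = (1.0, 2.05)   # close to the line the projection was assumed to fit
outlier = (1.0, -2.0)   # far from that line
threshold = 0.5         # hypothetical cut-off, chosen by hand here

flags = {p: reconstruction_error(p) > threshold for p in (typical, outlier)}
for point, is_outlier in flags.items():
    print(point, "error:", round(reconstruction_error(point), 3), "outlier:", is_outlier)
```

Similarity search works on the same codes: instead of comparing full inputs, you compare the much smaller encoded values.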

During training, you find a projection of the input, for example an image, to a smaller space, and then back to the original image. This is referred to as encoding and decoding. The error function would be a measure of how similar the input and output images are.
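Here is a toy version of that training loop, as a sketch rather than how any real library does it. Assumptions: the "images" are just 2-D points near the line y = 2x, the encoder and decoder are single linear layers (`w_enc`, `w_dec`) with hand-picked starting weights, and the error function is the mean squared difference between input and reconstruction, minimized with plain gradient descent.

```python
import random

def make_data(n=50, seed=0):
    """Toy inputs: 2-D points near the line y = 2x (stand-ins for images)."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + rng.gauss(0.0, 0.05))
            for x in (-1.0 + 2.0 * i / (n - 1) for i in range(n))]

def encode(w_enc, point):
    """Project a 2-D input down to one number (the compressed code)."""
    return w_enc[0] * point[0] + w_enc[1] * point[1]

def decode(w_dec, code):
    """Map the one-number code back to a 2-D output."""
    return (w_dec[0] * code, w_dec[1] * code)

def reconstruction_error(w_enc, w_dec, data):
    """Mean squared difference between inputs and their reconstructions."""
    total = 0.0
    for p in data:
        r = decode(w_dec, encode(w_enc, p))
        total += (r[0] - p[0]) ** 2 + (r[1] - p[1]) ** 2
    return total / len(data)

def train(data, steps=500, lr=0.1):
    """Full-batch gradient descent on the reconstruction error."""
    w_enc, w_dec = [0.5, 0.5], [0.5, 0.5]  # arbitrary starting weights
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for p in data:
            z = encode(w_enc, p)
            r = decode(w_dec, z)
            err = (r[0] - p[0], r[1] - p[1])
            dz = 2.0 * (err[0] * w_dec[0] + err[1] * w_dec[1])  # gradient w.r.t. the code
            for j in range(2):
                g_dec[j] += 2.0 * err[j] * z
                g_enc[j] += dz * p[j]
        for j in range(2):
            w_enc[j] -= lr * g_enc[j] / len(data)
            w_dec[j] -= lr * g_dec[j] / len(data)
    return w_enc, w_dec

data = make_data()
error_before = reconstruction_error([0.5, 0.5], [0.5, 0.5], data)
w_enc, w_dec = train(data)
error_after = reconstruction_error(w_enc, w_dec, data)
print(f"reconstruction error: {error_before:.4f} -> {error_after:.4f}")
```

Real autoencoders use many more dimensions and nonlinear layers, but the loop is the same shape: encode, decode, measure the difference, adjust the weights.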