What is a good eli5 analogy for GenAI not "knowing" what they say?

Hucklebee@lemmy.world · edit-2 5 months ago

What is a good eli5 analogy for GenAI not "knowing" what they say?

Tar_Alcaran@sh.itjust.works · 5 months ago

It’s a really well-trained parrot. It responds to what you say, and then it responds to what it hears itself say.

But despite knowing which sounds go together based on which sounds it heard, it doesn’t actually speak English.

CodeInvasion@sh.itjust.works · edit-2 5 months ago

I am an LLM researcher at MIT, and hopefully this will help.

As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model is strictly predicting the probability of the next word⁺ (technically a token), with some randomness injected so the output isn’t exactly the same every time for a given prompt.
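To make "predicting a probability, then adding randomness" concrete, here is a minimal sketch. The scores are made up for illustration, and `softmax` and `sample_next_token` are hypothetical helper names, not part of any real model's API. Temperature is one common way to control the randomness; it is an assumption that this is the kind of randomness meant here.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(scores_by_token, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    tokens = list(scores_by_token)
    probs = softmax([scores_by_token[t] for t in tokens], temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after the prompt "The cat sat on the"
scores = {"mat": 3.0, "sofa": 2.0, "moon": 0.5}
print(softmax(list(scores.values())))
print([sample_next_token(scores, temperature=0.8) for _ in range(5)])
```

Run it a few times: "mat" shows up most often, but not always, which is why the same prompt can give different answers.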

The probability of the next word comes from what was in the model’s training data, combined with a very complex mathematical method, called self-attention, that computes the impact of every previous word on every other previous word and on the newly predicted word. You can think of this as a computed relatedness factor.

This relatedness factor is very computationally expensive, and its cost grows quadratically with the number of words, so models are limited by how many previous words can be used to compute relatedness. This limitation is called the Context Window. The recent breakthroughs in LLMs come from the use of very large context windows to learn the relationships of as many words as possible.
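As a rough sketch of where the cost comes from: in standard self-attention every token is scored against every token, so n tokens need n × n scores. The toy below is a simplification under stated assumptions: the vectors are made up, `attention_weights` is a hypothetical name, and real models also use learned query/key/value projections and multiple attention heads.

```python
import math

def attention_weights(vectors):
    """Score every token vector against every token (toy self-attention).

    Returns an n x n table; row i says how much token i attends to each token.
    """
    dim = len(vectors[0])
    table = []
    for query in vectors:
        # Scaled dot product of this token with every token, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        table.append([e / total for e in exps])
    return table

# Made-up 2-number vectors standing in for three tokens.
toy_tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = attention_weights(toy_tokens)
for row in weights:
    print([round(w, 2) for w in row])

# The table has n * n entries, so doubling the context quadruples the work.
for n in (1_000, 2_000, 4_000):
    print(n, "tokens ->", n * n, "pairwise scores")
```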

This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So literally, the model builds entire responses one word at a time from left to right.
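The loop described above can be sketched with a toy stand-in for the model. Everything here is hypothetical: the `NEXT_WORD_ODDS` table, the `<stop>` token name, and the `generate` function. The biggest simplification is that this toy only looks at the last word, whereas a real LLM conditions on everything in its context window.

```python
import random

STOP = "<stop>"  # hypothetical name for the special stop token

# Toy "model": for each previous word, the possible next words and their odds.
NEXT_WORD_ODDS = {
    "the": {"parrot": 0.6, "cat": 0.4},
    "parrot": {"talks": 0.7, STOP: 0.3},
    "cat": {"sleeps": 0.8, STOP: 0.2},
    "talks": {STOP: 1.0},
    "sleeps": {STOP: 1.0},
}

def generate(prompt, max_words=20, rng=random):
    """Append one sampled word at a time until the stop token comes up."""
    words = prompt.split()
    for _ in range(max_words):
        odds = NEXT_WORD_ODDS[words[-1]]
        next_word = rng.choices(list(odds), weights=list(odds.values()), k=1)[0]
        if next_word == STOP:
            break
        words.append(next_word)
    return " ".join(words)

print(generate("the"))
```

Note that nothing in the loop plans ahead: each word is chosen only from what has already been written.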

Because all future words are predicated on the previously stated words in either the prompt or subsequent generated words, it becomes impossible to apply even the most basic logical concepts, unless all the components required are present in the prompt or have somehow serendipitously been stated by the model in its generated response.

This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.

From this fundamental understanding, hopefully you can now reason about the LLM’s limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make it up, inferring the next most likely word to create a plausible-sounding statement. Essentially, the model has been faking language understanding so much that even when it has no factual basis for an answer, it can easily trick an unwitting human into believing the answer to be correct.

---

⁺More specifically, these words are tokens, which usually contain some smaller part of a word. For instance, “understand” and “able” would be represented as two tokens that, when put together, become the word “understandable”.
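A minimal sketch of the token idea, assuming a tiny hand-written vocabulary. Real tokenizers learn their pieces from data with methods such as byte-pair encoding; the greedy longest-match rule below is swapped in only because it is easy to follow, and `VOCAB` and `tokenize` are hypothetical names.

```python
# Hypothetical mini-vocabulary; real tokenizers learn tens of thousands of pieces.
VOCAB = {"understand", "able", "un", "der", "stand"}

def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split of a word into known pieces."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens

print(tokenize("understandable"))  # ['understand', 'able']
print(tokenize("stable"))          # ['s', 't', 'able']
```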

AbouBenAdhem@lemmy.world · edit-2 4 months ago

Compression algorithms can reduce most written text to about 20–25% of its original size—implying that that’s the amount of actual unique information it contains, while the rest is predictable filler.
Empirical studies have found that chimps and human infants, when looking at test patterns, will ignore patterns that are too predictable or too unpredictable—with the sweet spot for maximizing attention being patterns that are about 80% predictable.
AI researchers have found that generating new text by predicting the most likely continuation of the given input results in text that sounds monotonous and obviously robotic. Through trial and error, they found that, instead of choosing the most likely result, choosing one with around an 80% likelihood threshold produces results judged most interesting and human-like.

The point being: AI has stumbled on a method of mimicking the presence of meaning by imitating the ratio of novelty to predictability that characterizes real human thought. But we know that the actual content of that novelty is randomly chosen, rather than being a deliberate message.
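For the "likelihood threshold" idea, one real technique that looks similar is nucleus (top-p) sampling: keep only the likeliest candidates until their probabilities add up to the threshold, then pick randomly among those. It is an assumption that this is what the 80% figure refers to (it could also mean something like a temperature setting), and the probabilities and the `top_p_sample` name below are made up for illustration.

```python
import random

def top_p_sample(probs_by_token, top_p=0.8, rng=random):
    """Sample from the smallest set of likeliest tokens whose total reaches top_p."""
    ranked = sorted(probs_by_token.items(), key=lambda item: item[1], reverse=True)
    kept, running = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        running += prob
        if running >= top_p:
            break  # everything less likely than this is discarded
    tokens = [t for t, _ in kept]
    weights = [p for _, p in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-word probabilities after "The cat sat on the"
probs = {"mat": 0.5, "sofa": 0.25, "floor": 0.15, "moon": 0.1}
print([top_p_sample(probs) for _ in range(8)])
```

With these numbers, "moon" is never picked, but the choice among the other three is still random.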

HorseRabbit@lemmy.sdf.org · 5 months ago

Not an ELI5, sorry. I’m an AI PhD, and I want to push back against the premises a lil bit.

Why do you assume they don’t know? Like what do you mean by “know”? Are you talking about conscious subjective experience? Or consistency of output? Or an internal world model?

There’s lots of evidence to indicate they are not conscious, although they can exhibit theory of mind. Eg: https://arxiv.org/pdf/2308.08708.pdf

For consistency of output and internal world models, however, there is mounting evidence to suggest convergence on a shared representation of reality. Eg this paper published 2 days ago: https://arxiv.org/abs/2405.07987

The idea that these models are just stochastic parrots that only probabilistically repeat their training data isn’t correct, although it is often repeated online for some reason.

A little evidence that comes to my mind is this paper showing models can understand rare English grammatical structures even if those structures are deliberately withheld during training: https://arxiv.org/abs/2403.19827

Rhynoplaz@lemmy.world · 5 months ago

You sound like a chatbot who’s offended by its intelligence being insulted.

trashgirlfriend@lemmy.world · 5 months ago

Bro is lost in the sauce

HorseRabbit@lemmy.sdf.org · 5 months ago

Maybe I misunderstood the OP? Idk

GamingChairModel@lemmy.world · 5 months ago

The idea that these models are just stochastic parrots that only probabilistically repeat their training data isn’t correct

I would argue that it is quite obviously correct, but that the interesting question is whether humans are in the same category (I would argue yes).

GamingChairModel@lemmy.world · 5 months ago

Harry Frankfurt’s influential 2005 book (based on his influential 1986 essay), On Bullshit, offered a description of what bullshit is.

When we say a speaker tells the truth, that speaker says something true that they know is true.

When we say a speaker tells a lie, that speaker says something false that they know is false.

But bullshit is when the speaker says something to persuade, not caring whether the underlying statement is true or false. The goal is to persuade the listener of that underlying fact.

The current generation of AI chatbots is basically optimized for bullshit. The underlying algorithms reward the models for sounding convincing, not necessarily for being right.

rubin@lemmy.sdf.org · 5 months ago

Imagine that you have a random group of people waiting in line at your desk. You have each one read the prompt, and the response so far, and then add a word themself. Then they leave and the next person in line comes and does it.

This is why “why did you say ?” questions are nonsensical to AI. The code answering it is not the code that wrote it, and there is no communication, coordination, or anything else between the different word answerers.

Deconceptualist@lemm.ee · 5 months ago

You could maybe just share a meme like this one.

Some folks in the comments there share actual LLM results, a few of which are sensible but plenty that aren’t far off from the joke.

otp@sh.itjust.works · 5 months ago

LMAO! I tried it, and it said:

The man takes the goat across the river first, then he returns alone and takes the boat across again. Finally, he brings the goat’s friend, Mr. Cabbage, across the river.

LMAO

I asked what if the man can’t swim…

In that case, the man can still take the goat across the river first. Then, he ties the goat to the boat and pulls the boat across the river using a rope, ensuring he stays on the side of the river where he can safely manage the boat without swimming.

I asked who Mr. Cabbage is…

My apologies for the confusion! Mr. Cabbage is a playful addition. In the classic riddle, there’s usually no mention of Mr. Cabbage. It’s just the man, the goat, and perhaps another item or animal depending on the version of the riddle. I just added Mr. Cabbage for some extra fun!

Then I asked what some other additions could be…

Certainly! You can add various characters or items to make the riddle more entertaining. Here are a few ideas:

Mr. Wolf: The man, the goat, and Mr. Wolf need to cross the river. But if the man leaves the wolf alone with the goat, the wolf will eat the goat.

Ms. Hen: The man, the goat, and Ms. Hen need to cross the river. But if the man leaves the hen alone with the goat, the goat will eat the hen.

Professor Fox: The man, the goat, and Professor Fox need to cross the river. But if the man leaves the fox alone with the goat, the fox will eat the goat.

Princess Diamond: The man, the goat, and Princess Diamond’s precious gem need to cross the river. But Princess Diamond has warned that if anyone other than herself touches the gem, they’ll be cursed.

Feel free to mix and match or come up with your own additions!

And the “solution”…

Ah, that’s a fun twist! Let’s see:

The man takes Mr. Cabbage across first, then returns alone.

Next, he takes Ms. Hen across but brings Mr. Cabbage back.

He then takes Princess Diamond’s gem across, leaves it on the other side, and takes Mr. Wolf across.

The man leaves Mr. Wolf with Ms. Hen on the other side and returns alone.

He takes Princess Diamond across, but before he leaves, he swaps Mr. Cabbage with Princess Diamond’s gem.

The man brings Mr. Cabbage across, leaving Princess Diamond’s gem with Mr. Wolf.

Finally, he takes the goat across.

Everyone safely crosses the river without any mishaps!

JackbyDev@programming.dev · 5 months ago

I love Mr. Cabbage! Thank you ChatGPT, very cool!

IzzyScissor@lemmy.world · 5 months ago

It’s your phone’s ‘predictive text’, but if it were trained on the internet.

It can guess what the next word should be a lot of the time, but it’s also easy for it to go off the rails.

BlameThePeacock@lemmy.ca · 5 months ago

It’s just fancy predictive text, like the kind you get while texting on your phone. It guesses what the next word should be, but for a lot more complex topics.

NeoNachtwaechter@lemmy.world · 5 months ago

idea that “it makes convincing sentences, but it doesn’t know what it’s talking about”

Like a teenager who has come into a new group and is now trying so hard to fit in :-)