Everyone Is Judging AI by These Tests. But Experts Say They’re Close to Meaningless.

ModerateImprovement@sh.itjust.works · 5 months ago

sunbeam60@lemmy.one · 5 months ago

The article makes the valid argument that LLMs simply predict the next token based on their training and the query.

But is that actually true of the latest models from OpenAI, Anthropic (Claude), etc.?

And even if it is true, what solid proof do we have that humans aren’t doing the same? I’ve met endless people who could waffle for hours without seeming to do any reasoning.

technocrit@lemmy.dbzer0.com · 5 months ago

what solid proof do we have that humans aren’t doing the same?

Humans are not computers. Brains are not LLMs…

Given a totally reasonable hypothesis (humans =/= computers) and a completely outlandish hypothesis (humans = computers), I would need much more ‘proof’ for the latter.

sunbeam60@lemmy.one · edit-2 5 months ago

Well, brains are a network of neurons (we can verify this empirically) trained on … eyes, ears, sense of touch, taste, smell and balance (rewarded by endorphins released by the old brain on certain hardcoded stimuli). LLMs are a network of artificial neurons trained on text and images (rewarded for producing text that mimics the input text and for passing some reasoning tests).

It’s not given that this results in the same way of dealing with language, given the wider set of input data for a human, but it’s not given that it doesn’t either.

zbyte64@awful.systems · 5 months ago

Humans predict things by assigning meaning to events and things, because in nature we’re constantly trying to guess what other creatures are planning. An LLM does not hypothesize what your plans are when you communicate with it; it’s just trying to predict the next set of tokens with the greatest reward value. Even if you were to use literal human neurons to build your LLM, you would still have a stochastic parrot.

rottingleaf@lemmy.world · 5 months ago

Information theory, entropy in Markovian processes. Read up on these buzzwords to see why.
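For anyone following along, here is a minimal sketch of what "entropy in a Markovian process" can look like in practice. It is a toy illustration only: the bigram model, the sample sentence, and the names `bigram_model` and `entropy_bits` are assumptions made up for this example, and it says nothing about how any real LLM or brain works.

```python
import math
from collections import Counter, defaultdict

def bigram_model(tokens):
    """Estimate P(next | current) from adjacent token pairs (first-order Markov)."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return {
        cur: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for cur, ctr in counts.items()
    }

def entropy_bits(dist):
    """Shannon entropy, in bits, of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical toy corpus, chosen only to keep the counts easy to check by hand.
tokens = "the cat sat on the mat the cat ran".split()
model = bigram_model(tokens)

for state, dist in model.items():
    print(f"{state!r}: next-token entropy = {entropy_bits(dist):.3f} bits")
```

A state whose next token is fully determined (here "sat", always followed by "on") has zero entropy, while a state with two equally likely continuations (here "cat") has exactly one bit.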

sunbeam60@lemmy.one · 5 months ago

I think I know enough about these concepts to know that there isn’t any conclusive proof, observed in output or system state, to establish consensus that human speech output is generated differently to how LLMs generate output. If you have links to any papers that claim otherwise, I’ll be happy to read them.

rottingleaf@lemmy.world · 5 months ago

What? Humans, ahem, collect entropy every moment of their existence.

Everyone Is Judging AI by These Tests. But Experts Say They’re Close to Meaningless – The Markup