I worry that "stochastic parrot" was premature: an idea sown early in the technology's development that will now be carried along through any advances made.<p>Basically, there is this innate idea that if the basic building blocks are simple systems with deterministic behavior, then the greater system can never be more than that. I've seen this in spades within the AI community: "It's just matrix multiplication! It's not capable of thinking or feeling!"<p>Which to me always felt more like a hopeful statement than a factual one. These guys have no idea what consciousness is (nobody does), nor any reference point for what exactly "thinking" or "feeling" is. They can't prove I'm not a stochastic parrot any more than they can prove whatever cutting-edge LLM isn't.<p>So while yes, present LLMs likely are just stochastic parrots, the same technology scaled up might bring us a model for which there actually is "something it is like to be" it, and we'll have everyone treating it with reckless carelessness because "it's just a stochastic parrot".
Topical tweet from 2018:<p>> Optimist: AI has achieved human-level performance!<p>> Realist: “AI” is a collection of brittle hacks that, under very specific circumstances, mimic the surface appearance of intelligence.<p>> Pessimist: AI has achieved human-level performance.<p><a href="https://twitter.com/dmimno/status/949302857651671040" rel="nofollow noreferrer">https://twitter.com/dmimno/status/949302857651671040</a>
>"stochastic parrot" is a term coined by Emily M. Bender in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?"<p>This might be the first time the term was seen in an ’official’ context, but is it really the origin? It feels like the term has been hovering around for longer, and even Google Trends shows significant search trends way before 2021
Fun fact: philosopher Regina Rini referred to GPT-3 as a "statistical parrot" six months before the Bender et al paper came out: <a href="https://dailynous.com/2020/07/30/philosophers-gpt-3/#rini" rel="nofollow noreferrer">https://dailynous.com/2020/07/30/philosophers-gpt-3/#rini</a>
> They go on to note that because of these limitations, a learning machine might produce results which are "dangerously wrong"<p>I was initially thinking "well, yes, Nobel Prize for Stating the Obvious there", but it looks like the paper was written in the far-distant past of 2021, when LLMs were largely still in their babbling-obvious-nonsense stage, rather than the current state of the art, where they babble dangerously convincing nonsense. So, well, fair enough I suppose.<p>Amazing how fast progress has been there, though it's progress in an arguably rather worrying direction, of course.
LLMs are not stochastic though; they are deterministic and don't even require random numbers, right?<p>The term seems unfortunate in general because the models appear to do more than parroting. LLMs are more like the central pattern generators of the nervous system, able to flexibly create well-coordinated patterns when guided appropriately.
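To make the stochastic-versus-deterministic point concrete, here is a minimal sketch (plain numpy, a stand-in function rather than any real model's API): the forward pass is a deterministic map from context to a distribution over the next token, and randomness only enters if you sample from that distribution instead of taking the argmax.

```python
# Toy sketch: deterministic forward pass, optional stochastic sampling.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "mat"]

def next_token_logits(context):
    # Stand-in for a trained model: any fixed function of the context will do here.
    return np.array([0.3 * len(context), 1.0, 2.0, 0.5])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

probs = softmax(next_token_logits(["the", "cat"]))  # same context -> same distribution, every time

greedy = vocab[int(np.argmax(probs))]               # deterministic decoding: no random numbers needed
sampled = vocab[rng.choice(len(vocab), p=probs)]    # stochastic decoding: depends on the RNG draw
print(greedy, sampled)
```

So greedy decoding really is deterministic; the "stochastic" part comes from the temperature/top-p sampling most deployments layer on top.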
The real question to me is this: in the next decade, as ML researchers roll out progressively more sophisticated systems, we can expect that generative systems, which may actually be "only stochastic parrots", are going to create works that would fool any reasonable human being.<p>At what point does a stochastic parrot fake it till it makes it? Does it even matter? We can imagine that, within 10 years, we'll have a fully synthetic virtual human simulator: a generative AI combined with a knowledge base, language parsing, audio and video recognition, basically a talking head that could join your next technical meeting and look like a full contributor. If that happens, will the Timnits and the Benders of the world admit that, perhaps, systems which are indistinguishable from a human may not just be parrots, or perhaps, that we are just sufficiently advanced parrots?<p>Seen from that perspective, the promoters of "stochastic parrots" would seem to be luddites and close-minded, as well as discouraging legitimate, important, and valuable scientific research.
In the end, it turned out the actual innovation was doing the opposite of what this paper recommended: scaling up the LLM, improving quality by throwing lots of data at it rather than curating, and limiting bias by RLHF rather than by picking the right datasets.<p>The organizations that listened to these people for even some amount of time got hosed. Google managed to oust this flock, but not before its AIs were so lobotomized that they are now widely renowned as the village idiot.<p>Ultimately, this paper is a triumph of branding over science. Read it if you'd like. But if you let these kinds of people into your organization, they'll cripple it. It costs a lot to get them out. Instead, simply never let them in.
I've got another word for it: recipe-fication.<p>Everything we revile about online recipe websites that spend 1000 words about the history of cooking before getting to the point, will be part and parcel of AI-written <i>anything</i>. It won't be properly proofread or edited by a human, because that would defeat the purpose.
Yoshua Bengio, Andrew Ng, Andrej Karpathy, and many other top researchers in the field do not believe these models are stochastic parrots; they believe the models have internal world models and that prompts are methods of probing those world models. "Stochastic parrots" is one of the dumbest takes in AI/ML.
I’d argue that all these models are stochastic parrots because they’re not embodied in any way. There is no way they can actually understand what they are talking about in any sense that is tied back to the physical world.<p>What these LLMs and diffusion models and such actually are is a lossy compression method that permits structured queries. The fact that they can learn structure as well as content allows them to reason as well, but only to the extent that the rules they’re following existed somewhere in the training data and its structure.<p>If one were given access to senses and memory and feedback mechanisms and learned language that way, it might be considered actually intelligent, or even sentient if it exhibited autonomy and value judgments.
A nice paper: "Meaning without reference in large language models"<p>"we argue that LLMs likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role"<p><a href="https://arxiv.org/pdf/2208.02957.pdf" rel="nofollow noreferrer">https://arxiv.org/pdf/2208.02957.pdf</a><p>It reminds me of Quine's meaning holism, which seems related:<p><a href="https://en.wikipedia.org/wiki/Semantic_holism" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Semantic_holism</a>
TL;DR: the focus on the <i>implementation</i> details, and descriptions like this, is detrimental, even perilous, because such accounts are both accurate and deeply misleading.<p>This is description, but it is neither predictive nor explanatory. It <i>implies</i> a false model rather than providing one.<p>Evergreen:<p><i>Ximm's Law</i>: every critique of AI assumes to some degree that contemporary implementations will not, or cannot, be improved upon.
Lemma: any statement about AI which uses the word "never" to preclude some feature from future realization is false.
From the article: A "stochastic parrot", according to Bender, is an entity "for haphazardly stitching together sequences of linguistic forms … according to probabilistic information about how they combine, but without any reference to meaning."<p>It seems to me that the great success transformers are now enjoying is precisely due to the fact that 'probabilistic information about how they combine' _is_ meaning.
This also relates to vision models. The existence of adversarial attacks (e.g. imperceptible changes in the image drastically changing the output) essentially demonstrates that the model has not reached the point at which the network "understands" the generalized concept it is supposed to distinguish.
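A rough FGSM-style sketch of that idea, on a toy linear "model" rather than a real vision network (and ignoring clipping pixel values back to [0, 1]): a perturbation that is tiny per pixel but aligned with the weight/gradient direction accumulates across many pixels and flips the prediction.

```python
# Toy adversarial example against a linear classifier (FGSM-style step).
import numpy as np

rng = np.random.default_rng(0)
d = 10_000                                   # number of "pixels"
w = rng.choice([-0.01, 0.01], size=d)        # toy linear model: predict class 1 if w @ x > 0
x = rng.uniform(0.0, 1.0, size=d)            # a random "image" with pixels in [0, 1]

clean_score = w @ x                          # typically small in magnitude
eps = 0.05                                   # tiny change per pixel

# Step each pixel by eps against the current prediction, in the direction of w.
x_adv = x - eps * np.sign(w) * np.sign(clean_score)

# The score moves by eps * ||w||_1 = 5, which swamps the clean score and flips the sign,
# even though no single pixel changed by more than 0.05.
print(clean_score, w @ x_adv)
```

The perturbation exploits the geometry of the decision boundary rather than anything about the underlying concept, which is the sense in which the model hasn't "understood" it.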