I like the first-order vs. second-order distinction here - it's a clean way to describe something I've often found hard to communicate, at least to people familiar with functional programming. Everyone is familiar with first-order use of a language model by now (it's just plain ChatGPT), but higher-order use seems much harder for most people to even conceptualize, let alone grasp the implications of.

The huge challenge with higher-order use of LLMs is that higher-order constructs are inherently more chaotic: the inconsistency and unreliability of an LLM compound when it's used recursively. Just look at how hard it is to keep AutoGPT from going off the rails. Any higher-order application of LLMs has to contend with this, and that means building in redundancy, feedback loops, quality checking, and other safeguards programmers just aren't used to needing. More powerful models and better alignment techniques will help, but at the end of the day it's a fundamentally different engineering paradigm.

We've been spoiled by the extreme consistency and reliability of traditional programming constructs. I suspect higher-order LLM use might be easier to reason about in terms of human organizations, distributed systems, or perhaps even biology, where we don't get a ~100% consistent atom to compose with.

Half-baked aside: in some ways this looks like a generalization of Conway's law (organizations create software that mirrors their own structure), except now there's a third player occupying a middle ground between humans and software. It's unclear how this third player will fit in - one could envision many different structures, and it's unclear which are feasible and which would be effective.

Exciting times!
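To make the first-order vs. higher-order distinction concrete, here's a rough sketch in Python. The llm() primitive is hypothetical (substitute any completion API); the point is the shape of the composition, not the specific calls:

    def llm(prompt: str) -> str:
        """Stand-in for a call to some completion API."""
        raise NotImplementedError

    # First-order use: one prompt in, one answer out ("plain ChatGPT").
    def summarize(text: str) -> str:
        return llm(f"Summarize this:\n{text}")

    # Higher-order use: the model's output feeds back into further model
    # calls, so errors compound unless there's an explicit quality gate.
    def summarize_with_review(text: str, max_rounds: int = 3) -> str:
        draft = summarize(text)
        for _ in range(max_rounds):
            verdict = llm(
                "Does this summary faithfully cover the text? "
                "Answer PASS or FAIL, then explain.\n"
                f"Text:\n{text}\nSummary:\n{draft}"
            )
            if verdict.strip().startswith("PASS"):
                return draft
            # Feedback loop: a critique from one call drives a revision
            # in the next - redundancy that first-order use never needs.
            draft = llm(
                "Revise the summary to address the critique.\n"
                f"Text:\n{text}\nSummary:\n{draft}\nCritique:\n{verdict}"
            )
        return draft  # best effort; still no hard correctness guarantee

Even this tiny loop has to decide what counts as a PASS, how many retries are acceptable, and what to do when they're exhausted - exactly the kind of engineering that deterministic composition never forced on us.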
> "If you got a chance to read about the Sydney-Bing fiasco, it’s pretty evident why these hallucinations are a major obstacle"<p>how can you talk about Sydney that way, she wasn't a fiasco she was amazing
The author raises the question of whether LLMs could make DevOps tasks as easy as basic Python text-to-code generation.

I'd been thinking about this, and it seems unlikely to me, because with modern declarative infra there isn't a lot of waste between specifying what you want and implementing it.

All the work is in understanding your requirements, your context, and your modification demands.

Has anyone who knows more about LLMs and infra thought about this?
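To illustrate the "not a lot of waste" point: with something like Pulumi's Python SDK, the declarative spec is already about as terse as the English requirement. (Illustrative sketch only - the resource names are made up, and it's meant to run inside a Pulumi program, not standalone.)

    import pulumi_aws as aws

    # The declaration *is* the requirement:
    # "a versioned S3 bucket for app logs".
    bucket = aws.s3.Bucket(
        "app-logs",
        versioning=aws.s3.BucketVersioningArgs(enabled=True),
    )

An LLM could certainly type that for you, but the hard part - knowing that you need versioning, which account and region this belongs in, who should be able to read it - is exactly the part that isn't in the prompt.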
That list of "over 130 emergent capabilities" the article links to sounds very impressive, but just from spot-checking, at least one of them shows the *opposite* - namely, that GPT-3 could not do the task: https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks/modified_arithmetic
So the number is not 130 after all.