
A demo of GPT-3's ability to understand long instructions

161 points · by monort · over 2 years ago

13 comments

goodside · over 2 years ago
Be sure to read the thread, in particular: https://twitter.com/goodside/status/1557926101615366144?s=21&t=6tyUdwtpHbH6TjRHYXgsIA

> A caveat to all of these: I use GPT-3 a lot, so I know the "golden path" of tasks it can do reliably. Had I asked it to write a sentence backwards or sum a list of numbers, it would fail every time. These are all softball questions in isolation.

I haven't shown that GPT-3 can handle all coherent directions of this length, or even most directions that an untrained person would think to create. It's just a demo that, if GPT-3 happens to be capable of your tasks separately, length per se is not a major issue.
shdon · over 2 years ago
Seems like there is one instruction it didn't follow: the first task says the usernames should be *exactly* like in the list, yet the AI responds with "firebob" (as in the comment) rather than "FireBob1990" (as in the list).

Funnily enough, that is exactly the kind of thing a human might do, as we too are terrible at following instructions precisely.
v4dok · over 2 years ago
I don't think modern big language models are conscious, mainly because they fail in absurd ways. But TBH, they don't need to be. This "golden path", deployed properly, could easily automate a lot of jobs tomorrow.
benreesman · over 2 years ago
My initial instinct was that this has to be getting some nudges from whatever human-in-the-loop is going on at OpenAI.

But then I realized that somewhere on the Internet there inevitably is a message board where people play the "find me some shit on the internet" game, and there's some rabid subculture around it with zillions upon zillions of examples, and it's in the Bing index, and all the nudging it would need is to emphasize that sort of thing in the corpus.

Very impressive stuff.
russellbeattie · over 2 years ago
Wow, that was really impressive. I thought I had a clear idea of what GPT-3 could do, but I had underestimated it by a lot. Even if the results weren't accurate (which they mostly seem to be), it's still doing an amazing job of following complex instructions. Better than most people, I would guess.

Makes me double down on my prediction from a week or so ago [1] of a Mid-Level AI Knowledge Work Apocalypse. In the next decade, AIs like this are going to do to office work what robotic mechanization did to the manufacturing sector.

[1] https://news.ycombinator.com/item?id=32395193
seaucre · over 2 years ago
It didn't correctly identify that FireBob1990's name was misspelled as "firebob" in the original comment.
OJFord · over 2 years ago
DALL·E I can see obvious uses for, and GPT tends to be similarly impressive, but I don't understand whether it's 'just' interesting research (seeing what we can do, that sort of thing) or whether people actually see real-world use cases for it.

The closest to a use case was perhaps that code-generating demo here a day or two ago. But who wants to be a 'GPT programmer', writing code as 'write a Python program that computes fizzbuzz replacing the arguments $fizz$ and $buzz$, ...' instead of just the 'actual' code? It seems like a more clever AppleScript to me, pseudocode, and I don't think anybody has ever seriously pursued a flexible keyword-pseudocode language as a goal; it has only appeared as a demo of more general models.

Generating template/outline text, I suppose? (Like that essay-writing helper here a few days ago.)
Titan2189 · over 2 years ago
Uhh. How is that even possible? I thought I had a basic understanding of neural networks, with input, hidden, and output layers and those things. So how can it possibly backreference its own previous answers and then follow another prompt based on them? Mind = blown.
mach1ne · over 2 years ago
While impressive, it doesn't imply that GPT has any significant 'task memory'. Remember that it always predicts the next token or word; as such, it essentially recognizes whether the next 'task' in the list has already been written, and if so, it writes the next task.

It might be interesting to see how well it is able to modify the first output given some aspect of the final tasks.
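The "recognizes which task comes next" idea can be made concrete with a deliberately dumb sketch (entirely my own assumption about the mechanism, not anything GPT-3 literally does): if task state lives only in the transcript, then "which task is next" is recoverable just by counting the answers already written.

```python
# Toy illustration: "task memory" as pure pattern-matching over the
# transcript. The next task to answer is simply the first one without
# an "Answer:" marker already present in the text.

def next_task(tasks, transcript):
    """Return the first task not yet answered in the transcript."""
    done = transcript.count("Answer:")
    return tasks[done] if done < len(tasks) else None

tasks = ["List the usernames", "Summarize the thread", "Write a reply"]
transcript = "Task 1: List the usernames\nAnswer: alice, bob\n"
# next_task(tasks, transcript) -> "Summarize the thread"
```

A model that worked this way would keep producing plausible next tasks without any internal state persisting between tokens, which is mach1ne's point.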
chucky · over 2 years ago
Now I'm curious if it can handle the classic reading comprehension assignment I've been given multiple times in my life. You know, the one that goes something like this:

1. Read through all steps carefully.

2. Do X.

3. Do Y.

(...)

99. As you have now read through the instructions, simply put your name in the top right corner of the first page.
anigbrowl · over 2 years ago
Interesting results, goodside.

Is it able to extract any kind of structural information? For example, you pass it the text of a movie script or a children's story (where the descriptive language is simple), and it returns a structured summary of the content?
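That kind of structured extraction is usually attempted by asking for a machine-readable format in the prompt and parsing the reply. A hypothetical sketch of the pattern (the prompt wording, JSON schema, and the hard-coded stand-in reply are all my own assumptions; a real run would call a model instead):

```python
import json

# Sketch of prompt-based structured extraction: ask for JSON, parse JSON.

def build_prompt(story):
    return (
        "Summarize the following story as JSON with keys "
        '"characters", "settings", and "plot_points":\n\n' + story
    )

# Hard-coded stand-in for a model reply, just to show the parsing step.
fake_reply = (
    '{"characters": ["Goldilocks"], "settings": ["forest cottage"], '
    '"plot_points": ["tries three bowls of porridge"]}'
)
summary = json.loads(fake_reply)
```

The fragile part is the parse: nothing forces the model to emit valid JSON, so real uses of this pattern need a fallback for malformed replies.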
etaioinshrdlu · over 2 years ago
Is GPT-3 being regularly updated?
masswerk · over 2 years ago
This should really be "react to" or "answer to", instead of "understand". These are not the same.

Edit: Anthropomorphizing algorithms and pattern stores doesn't really help understanding; instead, it's apt to spread misunderstanding. Remember how long it took to purge the popular idea of "electronic brains" actually thinking, and to establish that these were restricted to executing what's actually in code? We don't need to start another round of this with "AI". (Understanding is closely related to self-awareness and consciousness, and this is dangerous ground for misunderstanding when it comes to AI. As we've seen, even staff of pioneering companies, like Google, are prone to fall for this.)