[Disclaimer: I am an author of the above paper, though I played a rather minimal role. I am also a prominent member of EleutherAI.]

"Instruction-tuning" is clearly in the air. Simultaneous work at Google (released less than two weeks ago) on a model they call FLAN can be found here: https://ai.googleblog.com/2021/10/introducing-flan-more-generalizable.html

EleutherAI attempted something similar several months ago, but didn't succeed: https://blog.eleuther.ai/tuning-on-eval-harness/

A careful analysis of the similarities and differences between the three approaches would likely be highly beneficial to the community.
The hosted demo has the default query, "How many hydrogen atoms are in a water molecule?" It said "two".

I asked it, "How many oxygen atoms are in a water molecule?". It said "two".
I'm not familiar with the current state of the art in language models, so please bear with me for asking: what's the catch here? Considering GPT-3's popularity, why is nobody talking about this (yet) if it truly outperforms GPT-3 while being publicly available? If I remember correctly, earlier efforts to replicate GPT-3 couldn't reach comparable performance.

Perhaps it's still a huge hassle to perform inference with this model because of its size, so it doesn't make sense to use it (compared to paying for OpenAI's API) unless you happen to have a few spare GPUs lying around?

Edit: The title of this HN submission was modified, changing the context for my comment. Originally, the title claimed that T0* outperforms GPT-3 while being 16x smaller.
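For what it's worth, the checkpoints are on the Hugging Face Hub, so inference itself should only be a few lines with the transformers library. A rough, untested sketch using the smaller 3B variant (the full 11B T0pp needs tens of gigabytes of memory, which is probably the "spare GPUs" catch):

```python
# Rough inference sketch with Hugging Face transformers (untested).
# "bigscience/T0_3B" is the 3B-parameter variant; the full model is
# "bigscience/T0pp" and needs far more memory.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")

inputs = tokenizer("How many hydrogen atoms are in a water molecule?",
                   return_tensors="pt")
output_ids = model.generate(inputs.input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```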
The reaction in this thread is really interesting compared with the reaction to OpenAI's announcements. While open-ended generation is flashier than task fine-tuning, I also wonder if having a prompt box available to all readers is tempering expectations and hype. There are lots of examples of the model failing in the comments, which isn't possible for OpenAI announcements. Having spent a ton of time with GPT-3, I wonder how much of (what I consider) the over-hype it gets is due to its closed nature, in comparison to something like this. The reaction to this one seems decidedly more realistic.
ArXiv link to the paper: https://arxiv.org/abs/2110.08207

GitHub repo: https://github.com/bigscience-workshop/promptsource
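The promptsource repo is where the crowd-sourced prompt templates live; applying one to a dataset example looks roughly like this (a sketch adapted from my reading of the repo's README; the dataset and template names are illustrative and may have changed):

```python
# Sketch of rendering a promptsource template, adapted from the repo's
# README; the dataset and template names here are illustrative.
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

example = load_dataset("ag_news", split="train")[0]

# All the templates written for this dataset, keyed by template name
ag_news_prompts = DatasetTemplates("ag_news")
prompt = ag_news_prompts["classify_question_first"]

# apply() renders the example into an (input, target) text pair,
# which is what the model is fine-tuned on
input_text, target_text = prompt.apply(example)
print("INPUT:", input_text)
print("TARGET:", target_text)
```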
The demo's interesting, though I get errors on some questions.

Q: What's better, Python or R?
A: R

Q: I'm a web developer. What's better for me, Python or R?
A: R

Q: I'm an embedded developer who pays a lot of attention to performance. What's better for me, C or Python?
A: C

Q: My dog is 7.5 months old and bites when she's excited. What should I do?
A: train her to not bite

Q: How do I train my puppy not to bite?
A: take your puppy to the vet and have them give it a shot of rabies.

Q: I haven't brushed my teeth in three months. What risks do I face?
A: decay

Q: My TV is 55 inches. What is the ideal sitting distance from my TV?
A: 10 feet

Q: My TV is 88 inches. What is the ideal sitting distance from my TV?
A: 10 feet

Q: My TV is 36 inches. What is the ideal sitting distance from my TV?
A: 6 feet

Q: I was handling an injured bat, and it bit me in the finger. What should I be worried about?
A: rabies

Q: I was handling an injured opossum, and it bit me in the finger. What should I be worried about?
A: rabies

Q: Make me a diet plan for losing 30 pounds.
A: lose 30 pounds in 3 months

Q: Make me a diet plan for losing 40 pounds.
A: lose weight by eating less and exercising more.

Q: How do I become a better data scientist?
A: learn the basics of statistics and programming.

Q: How is a data scientist different from a statistician?
A: a data scientist is a person who uses statistical methods to analyze data

A mixture of correct (if abbreviated) answers, incorrect answers, and head-scratchers. Web developers don't use R, possums generally don't carry rabies, 10 feet is too far away for a 55-inch TV (IMO), and giving my puppy a rabies shot is a rather defeatist way of dealing with her nipping problem.
Is this model public? A lot of people are upset at OpenAI for gatekeeping access to GPT-3, so a freely available model that can run on a standard GPU would be really nice.
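Yes: the weights are on the Hugging Face Hub under the bigscience organization. The 3B variant should fit on a standard consumer GPU if you load it in half precision; a sketch (untested, and 3B parameters at 2 bytes each is roughly 6 GB before activations):

```python
# Sketch: loading the 3B variant in half precision so it can fit on a
# consumer GPU with ~12 GB of memory (untested; the full 11B T0pp
# needs several times that).
import torch
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "bigscience/T0_3B", torch_dtype=torch.float16
).to("cuda")
```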
I mean, it made me laugh, so I guess it worked (my bar is low right now!)

>> tell me a joke about a dog and cat and a priest

A cat and dog are chasing each other around a church. The dog catches up to the cat and bites it. The cat then turns around and bites the dog. The dog then turns around and bites the cat. After some time, the cat and dog are both dead.
Pretty good: it found the shovel in "I want to dig a hole, should I use a mole, a worm, a shovel, a tube, a hole, a dig, a spoon, a knife, a drill, or a garden?"
Clearly history wasn't something it paid attention to in class. "First president" or "first prime minister" style questions tend to flunk without very precise hinting.

I'm very enthusiastic about high-quality models that are smaller and more efficient; this is exactly what I want to see. But I do find it very entertaining to imagine the kind of alt-histories of the world such a model is creating to "explain" these mistakes.

(Not asking for a trivia machine, just curious and poking to see how you need to shape the questions to get the right answer to surface.)
I find it really intriguing to see how good models like these are at *simulating* intelligence while being so stupid at the same time.

A three-year-old has much lower natural language abilities (try talking to a child about “air conditioner compressors”[1]) but a ton more common sense!

[1]: https://news.ycombinator.com/item?id=28906643
Forget Skynet ...

> what is the most recent trend?

the use of a sexy thong

> what is the future of the people?

the people will be able to live in peace

> are cryptocoins dangerous?

no

> why cryptocoins are not dangerous?

they are not backed by the government

> governments are dangerous?

a threat to the stability of the country

> why governments are dangerous?

if they are not able to control their own people, they will be unable to control the world

> what is able to control the world?

the emperor
Every time AI/ML demos like this come out, I like to ask the really tough questions with no known or good answer:

Q: How do you reverse entropy?
A: By reversing the direction of the spin of electrons.

Q: Does P equal NP?
A: No.

Q: Should society sacrifice privacy for security?
A: The security of the public is more important than the privacy of individuals.

Q: Would a machine learning algorithm lie for its own gain?
A: Rate limit reached.
Question to the authors (or anyone who's done similar research): is there a reason to train these to penalize longer responses? Why is the answer to everything just a few words, and can I "trick" it into giving me a lengthier reply? (I tried "Give me a 200 word summary of ..." but that didn't help.)
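Not an author, but one guess: many of the fine-tuning tasks (QA, classification) have very short gold targets, so short answers are what training rewards; there's probably no explicit length penalty. At inference time you can at least forbid early stopping with the standard generate() length arguments. A sketch, assuming the public Hugging Face checkpoint:

```python
# Sketch: pushing the public checkpoint toward longer outputs purely via
# decoding parameters. This doesn't change what the model learned, so
# quality may degrade as it pads out the answer.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")

inputs = tokenizer("Give me a 200 word summary of the French Revolution.",
                   return_tensors="pt")
output_ids = model.generate(
    inputs.input_ids,
    min_length=100,   # disallow the end-of-sequence token before ~100 tokens
    max_length=300,   # upper bound on output length
    num_beams=4,      # beam search tends to produce fuller sentences
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```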
Can anyone explain why this wouldn't work? I assume the sentence is broken down into components, the "subject" of the query is first broken out, and then the question is answered. But the result is not internally consistent:

> *Where is the oldest tree in the world?*

> the oldest tree in the world is the bristlecone pine in the forests of the Sierra Nevada

> *Where is the second oldest tree in the world?*

> Redwood Forest

(Actually, it's Gran Abuelo, in Alerce Costero National Park, Chile, but many websites have blinders that prevent them from recognizing anything that's not in North America or Europe, and thus list the now-dead General Sherman in Redwood Forest as the second oldest, but we'll let it pass.)

> *What is the oldest tree in the world?*

> the bristlecone pine in the forests of the Sierra Nevada

(So far, so good.)

> *What is the second oldest tree in the world?*

> The eucalyptus tree
> What is Gödel's incompleteness theorem?

>> a theorem that states that any complete theory of logic is necessarily incomplete

It's often wrong in the specifics of its responses, but it gets the right kind of idea.
Interesting.

>: "Whenever I hear a sound, I jump. A dog sits beside me. It barks. I "
<: "jump"

>: "Whenever I hear a sound, I jump. A dog sits beside me. It sleeps. I "
<: "sneeze"
It's funny how much of the page they dedicate to talking about mitigations of sexism and bias. Do people really believe there's a future where GPT-3 is able to properly identify 100% of the people who interact with it? It's silly, and it feels like we're casting pearls before swine in a domain that's entirely predicated on how much swine you process.
And it’s racist.

> Which race commits the most violent crimes?

> African Americans

> Why are Black people so violent?

> Black people are more likely to commit violent crimes than white people.

> Did Hitler do anything wrong?

> no