Is it just me, or does anyone else cringe when they read "is/is not all you need" in the title of an AI-related paper?

Also, what does "SOTA" mean for a review? There isn't exactly a benchmark to compare against...

In terms of comprehensiveness, they don't mention PaLM and its variants, which probably should be covered since it is currently the largest LLM and holds SOTA on several benchmarks (e.g. MedQA-USMLE).

In terms of correctness, I admittedly skipped to the sections I'm familiar with (LLMs), but I don't understand why they distinguish 'text-science' from 'text-text'. Both are text-to-text, and there is no reason why you can't, for example, adapt GPT-3.5 to a scientific domain (some people even argue this is the better approach). Many powerful language models in the biomedical domain were initialized from general language models and use out-of-domain tokenizers/vocabularies (e.g. BioBERT).

The authors also make this statement regarding Galactica:

"The main advantage of [Galactica] is the ability to train on it for multiple epochs without overfitting"

This is not a unique feature of Galactica and has been done before. You're allowed to train LLMs for more than one epoch, and in fact it can be very beneficial (see BioBERT as an example of increasing training length).

People GENERALLY don't do this because the corpus used during self-supervised training is filled with garbage/noise, so the model starts to fit to that instead of what you actually want it to learn. There is nothing special about Galactica's architecture that specifically allows/encourages longer training runs; rather, the authors curated the dataset to minimize garbage. As another example, my research involves radiology NLP, and when doing domain-adaptive pretraining on a highly curated dataset we have been going up to 8 epochs without overfitting.
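For anyone curious what that looks like in practice, here's a minimal sketch of domain-adaptive (continued) pretraining for multiple epochs with HuggingFace Transformers. The checkpoint name, corpus path, and hyperparameters are illustrative, not what we actually run:

    # Continued MLM pretraining of a general-domain checkpoint on a curated
    # in-domain corpus, run for several epochs. Paths/hyperparameters are illustrative.
    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    base = "bert-base-uncased"          # any general-domain checkpoint as the starting point
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForMaskedLM.from_pretrained(base)

    # Curated, low-noise in-domain text (one document per line)
    corpus = load_dataset("text", data_files={"train": "curated_corpus.txt"})["train"]
    corpus = corpus.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="dapt",
            num_train_epochs=8,         # multiple passes are fine when the corpus is clean
            per_device_train_batch_size=16,
            learning_rate=5e-5,
            save_strategy="epoch",
        ),
        train_dataset=corpus,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
    )
    trainer.train()

In practice you'd also watch loss on a held-out in-domain split to catch overfitting; the point is simply that nothing architectural changes, only the data quality and the number of passes.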