Open Challenges in LLM Research

163 points by muggermuch, almost 2 years ago

15 comments

errantspark, almost 2 years ago
Fun fact: I took the photo she used as a cover for one of her books, she asked me if she could use it and I said I'd like to be compensated and her response was something akin to "oh I was just asking assuming you'd say yes, I'm going to do it anyway". Nobody's perfect, maybe she regrets it, and it hasn't really crossed my mind in years, but I guess it still sort of irks me to be reminded of it. Anyway if anyone needs a portrait for a book cover feel free to hit me up XD.
thrwayaistartup, almost 2 years ago
Looking back in 25 years, the "Hallucination Problem" will sound a lot like the "Frame Problem" of the 1970s.

Looking back, it's a bit absurd to say that GOFAI would've got to AGI if only the Frame Problem could be solved. But the important point is why that sounds so absurd.

It doesn't sound absurd because we found out that the frame problem can't be solved; that's beside the point.

It also doesn't sound absurd because we found out that solving the frame problem isn't the key to GOFAI-based AGI. That's also beside the point.

It sounds absurd because the conjecture itself is... just funny. It's almost goofy, looking back, how people thought about AGI.

Hallucination is the Frame Problem of the 2023 AI Summer. Looking back from the other side of the next Winter, the whole thing will seem a bit goofy.
blackkettle, almost 2 years ago
Seems like it is basically a blog post review of "Challenges and Applications of Large Language Models", which was published to arXiv last month:

- https://arxiv.org/abs/2307.10169
ford, almost 2 years ago
So far it's been ~8 months since ChatGPT started the (popular) LLM craze. I've found raw GPT to be useful for a lot of things, but have yet to see my most frequently used apps integrate it in a useful way. Maybe I'm using the wrong apps...

It'll be interesting to see what improvements (in a lab or at a company) need to happen before most people use purpose-built LLMs (or behind-the-scenes LLM prompts) in the apps they use every day. The answer might be "no improvements" and we're just in the lag time before useful features can be built.
illusionist123, almost 2 years ago
I think it's not possible to get rid of hallucinations given the structure of LLMs. Getting rid of hallucinations requires knowing how to differentiate fact from fiction. An analogy from programming languages that people might understand is type systems. Well-typed programs are facts and ill-typed programs are fictions (relative to the given typing of the program). To eliminate hallucinations from LLMs would require something similar, i.e. a type system or grammar for what should be considered a fact. Another analogy is Prolog and logical resolution to determine consequences from a given database of facts. LLMs do not use logical resolution and they don't have a database of facts to determine whether whatever is generated is actually factual (or logically follows from some set of facts) or not. LLMs are essentially Markov chains, and I am certain it is impossible to have Markov chains without hallucinations.

So whoever is working on this problem, good luck, because you have a lot of work to do to get Markov chains to only output facts and not just correlations of the training data.
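For illustration, here is a minimal word-level Markov chain sketch (not from the thread or the article) that makes the commenter's point concrete: the sampler only tracks which word follows which, so nothing in the generation loop can distinguish a true statement from a fluent recombination of the training text.

    import random
    from collections import defaultdict

    # Toy word-level Markov chain: the model only stores which word follows
    # which, so anything it generates is a plausible-looking recombination of
    # the training text, with no notion of whether the resulting claim is true.
    corpus = (
        "the eiffel tower is in paris . "
        "the colosseum is in rome . "
        "the eiffel tower is made of iron ."
    ).split()

    transitions = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        transitions[prev].append(nxt)

    def generate(start, length=8):
        word, out = start, [start]
        for _ in range(length):
            if word not in transitions:
                break
            word = random.choice(transitions[word])
            out.append(word)
        return " ".join(out)

    # Can emit e.g. "the colosseum is made of iron ." -- fluent, but false.
    print(generate("the"))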
mattlutze, almost 2 years ago
"Never before in my life had I seen so many smart people working on the same goal"

I'm not sure why, but the assumptions and naivety in this opening line bother me. There are plenty of goals and problems that orders of magnitude more people are working on today.
visarga, almost 2 years ago
Let me add a few:

- organic data exhaustion - we need to step up synthetic data and its validation
- imbalanced datasets - catalog, assess and fill in missing data
- backtracking - make LLMs better at combinatorial or search problems
- deduction - we need to augment the training set for revealing implicit knowledge, in other words to study the text before learning it
- defragmentation - information comes in small chunks, sits in separate siloes, and context size is short; we need to use retrieval to bring it together for analysis

tl;dr We need quantity, diversity and depth in our training sets.
inciampati, almost 2 years ago
One thing I'd like to see is more effort on developing citation systems for these models.

What I mean is that every part of the output of an LLM should be annotated with references to the content that is most important or relevant to it.

Who is leading this effort now?
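As a purely hypothetical sketch of what such a citation system's output could look like, each span of the model's answer could carry pointers back to the passages that most support it; the names and structure below are illustrative, not any existing library's API.

    from dataclasses import dataclass

    # Hypothetical shape for a cited answer: every output span keeps a list of
    # supporting passages, and spans with no support are easy to flag.
    @dataclass
    class Citation:
        source_id: str   # e.g. a URL or document id
        passage: str     # the supporting excerpt

    @dataclass
    class CitedSpan:
        text: str
        citations: list

    answer = [
        CitedSpan(
            text="GPT-4 was announced in March 2023",
            citations=[Citation("openai.com/research/gpt-4",
                                "We report the development of GPT-4 ...")],
        ),
        CitedSpan(text="and it accepts image inputs.", citations=[]),
    ]

    for span in answer:
        marker = f"[{span.citations[0].source_id}]" if span.citations else "[needs citation]"
        print(span.text, marker)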
yeck, almost 2 years ago
I have a hard time understanding why mechanistic interpretability has so few eyes on it. It's like trying to build a complex software system without logging or monitoring. Any other improvements you want to make to the system are going to be just trial and error, with luck. The hallucination problem is one where interpretability of a model might be able to identify the failure modes we need to address. Really, any AI problem could likely be aided by a scalable approach to interpretability that feels just as mundane as classical software observability.
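A hedged sketch of what such "mundane observability" for a network might look like, assuming PyTorch is available: forward hooks that log cheap per-layer activation statistics, analogous to the metrics a service would emit.

    import torch
    import torch.nn as nn

    # Toy stand-in for a stack of transformer blocks.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    activation_log = {}

    def make_hook(name):
        def hook(module, inputs, output):
            # Log cheap summary statistics per layer, like service metrics.
            activation_log[name] = {
                "mean": output.mean().item(),
                "std": output.std().item(),
                "max_abs": output.abs().max().item(),
            }
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            module.register_forward_hook(make_hook(name))

    model(torch.randn(8, 16))
    print(activation_log)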
karxxm, almost 2 years ago
I have not seen much work on explainable AI for large language models. I remember many very nice visualizations and visual analysis tools that try to comprehend what the network "is seeing" (e.g. in the realm of image classification) or doing.
techwizrd, almost 2 years ago
I really like seeing articles or papers that describe the current advances and open challenges in a sub-field (such as [0]). They're underappreciated, but good practice or reading for folks wanting to get into the field. They're also worthwhile and humbling to look back at every few years: did we get the challenges right? How well did we understand the problem at the time?

[0]: https://arxiv.org/abs/1912.04977
crosen99, almost 2 years ago
The biggest challenge I'm trying to track isn't on the list: online learning. The difficulty of getting LLMs to absorb new knowledge without catastrophic forgetting is a key factor making us so reliant on techniques like retrieval-augmented generation. While RAG is very powerful, it's only as good as the information retrieval step and the context size, which quite often aren't good enough.
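To make the dependency on the retrieval step concrete, here is a minimal, dependency-free sketch of the retrieval stage of a RAG pipeline (word-overlap scoring stands in for a real embedding retriever): if the relevant passage doesn't rank into the top-k that fit the context budget, the model never sees it, regardless of how capable the LLM is.

    # Illustrative retrieval step: rank passages by word overlap with the
    # query and keep only the top-k that fit the context budget.
    def retrieve(query, passages, k=2):
        q = set(query.lower().split())
        scored = sorted(passages,
                        key=lambda p: len(q & set(p.lower().split())),
                        reverse=True)
        return scored[:k]

    passages = [
        "Our refund policy changed in July 2023.",
        "Shipping is free above 50 dollars.",
        "Returns must be initiated within 30 days.",
    ]

    context = retrieve("what is the current refund policy", passages)
    prompt = "Answer using only this context:\n" + "\n".join(context) + "\n\nQ: ..."
    print(prompt)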
Buttons840, almost 2 years ago
Our APIs up to this point have been designed for computers. JSON input, JSON output, and those are the nice ones.

I wonder if a deterministic but natural-language API would be any better for LLMs to integrate with. Or do LLMs already speak JSON well enough?
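A small sketch of the contrast the comment raises, with illustrative strings rather than any real API: a machine-oriented JSON call versus a natural-language request, plus the defensive parsing apps often need because model replies are not guaranteed to be pure JSON.

    import json
    import re

    # Two styles of expressing the same request.
    machine_style = '{"tool": "get_weather", "args": {"city": "Berlin", "unit": "celsius"}}'
    natural_style = "Please look up the current weather for Berlin, in degrees Celsius."

    def extract_json(llm_reply):
        # Pull the first JSON object out of free-form model text.
        match = re.search(r"\{.*\}", llm_reply, re.DOTALL)
        return json.loads(match.group(0)) if match else None

    reply = 'Sure! Here you go: ' + machine_style
    print("structured request:", json.loads(machine_style))
    print("natural-language request:", natural_style)
    print("parsed from model reply:", extract_json(reply))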
_pdp_, almost 2 years ago
Another challenge is the gap between how we think LLMs should be used and understanding how they can actually be used. It will take some time to figure this out.
matanyal, almost 2 years ago
Interesting, no mention of Groq for number 6.