
Ask HN: Anyone working on something better than LLMs?

42 points by xucian · about 1 year ago
if you think about it, next-token prediction is just stupid. it's so resource intensive.

yet, it's mimicking emergent thought quite beautifully. it's shockingly unintuitive how a simple process scaled enormously can lead to this much practical intelligence (practical in the sense that it's useful, but it's not the way we think). I'm aware there are multiple layers, filters, processes etc.; I'm just talking about the foundation, which is next-token prediction.

when I first heard that it's not predicting words, but parts of words, I immediately saw a red flag. yes, there are compounded words like strawberry (straw + berry) and you can capture meaning at a higher resolution, but most words are not compounded, and in general we're trying to simulate meaning instead of 'understanding' it. 'understanding' simply means knowing a man is to a woman what a king is to a queen, but without the need to learn about words and letters (that should be just an interface).

I feel we're yet to discover the "machine code" for ASI. it's like we have no compiler, but we directly interpret code. imagine the speed-ups if we could spare the processor from understanding our stupid, inefficient language.

I'd really like to see a completely new approach working in the Meaning Space, which transcends the imperfect Language Space. This will require lots of data pre-processing, but it's a fun journey -- basically a human-to-machine and machine-to-human parser. I'm sure I'm not the first one thinking about it.

so, what have we got so far?
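For readers who want the "foundation" made concrete: generation really is a single step, scoring a distribution over the next token and appending the winner, repeated in a loop. A minimal greedy-decoding sketch, assuming the Hugging Face transformers library and GPT-2 purely as a stand-in model:

```python
# Toy greedy next-token loop: the model only ever predicts "which token comes next";
# longer text emerges from repeating that single step.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model, purely illustrative
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("A man is to a woman what a king is to", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):                                 # extend by ten tokens
        logits = model(ids).logits[:, -1, :]            # scores for the next token only
        next_id = logits.argmax(dim=-1, keepdim=True)   # greedy: take the most likely token
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```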

12 comments

plaidfuji · about 1 year ago
As others have noted, Yann LeCun is looking beyond autoregressive (next-token-prediction) models. Here's one of his slide decks that raises some interesting new concepts:

https://drive.google.com/file/d/1BU5bV3X5w65DwSMapKcsr0ZvrMRU_Nbi/view?pli=1
BugsJustFindMe · about 1 year ago
> it's shockingly unintuitive how a simple process scaled enormously can lead to this much practical intelligence

A biological neuron doesn't do much. On its own, a simple process. Yet when you put 100 billion of them together in the right 1000-connected configuration, you get a human brain.
stevenAthompson · about 1 year ago
> in general we're trying to simulate meaning instead of 'understanding' it. 'understanding' simply means knowing a man is to a woman what a king is to a queen, but without the need to learn about words and letters (that should be just an interface).

I have no idea what I'm talking about, but what you describe is exactly what LLMs do.

Words are tokens that represent concepts. We've found a way to express the relationships between many tokens in a giant web. The tokens are defined by their relationships to each other. Changing the tokens we use probably won't make much more difference than changing the language the LLM is built from.

We could improve the method we use to store and process those relationships, but it will still be fundamentally the same idea: large webs of inter-related tokens representing concepts.
jtietema · about 1 year ago
I think you might find Lex Fridman's interview with Yann LeCun interesting [1]. It discusses exactly this: how LLMs merely mimic intelligent behaviour but have no understanding of the world at all. It also discusses other approaches we should look at instead of current LLMs.

[1] https://youtu.be/5t1vTLU7s40
exe34 · about 1 year ago
> 'understanding' simply means knowing a man is to a woman what a king is to a queen,

Turns out this is beautifully represented by embeddings alone!
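A minimal sketch of that embedding arithmetic (king − man + woman ≈ queen), assuming gensim and its downloadable pretrained GloVe vectors:

```python
# Word-analogy arithmetic on pretrained embeddings: vector offsets capture
# relations like gender, so king - man + woman lands near queen.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")   # downloads pretrained GloVe vectors on first use

print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# "queen" is typically the top hit
```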
waldrews · about 1 year ago
Meaning Space transcending the imperfect Language Space? Yes, there's been some recent thinking in this direction, e.g. Zhuangzi: "Words exist because of meaning. Once you've gotten the meaning, you can forget the words. Where can I find a man who has forgotten words so I can talk with him?"
jncfhnb · about 1 year ago
> 'understanding' simply means knowing a man is to a woman what a king is to a queen, but without the need to learn about words and letters (that should be just an interface).

Citation needed
quadrature · about 1 year ago
> Yes, there are compounded words like strawberry (straw + berry) and you can capture meaning at a higher-resolution, but most words are not compounded

What's really cool about tokenization is that it breaks down words based on how often parts of the word are used. This helps a lot with understanding different forms of words, like when you add "-ing" to a verb, make words plural, or change tenses. It's like seeing language as a bunch of building blocks.
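A toy sketch of the byte-pair-encoding idea behind this, on a made-up six-word corpus rather than any real tokenizer's training data: repeatedly fuse the most frequent adjacent pair of symbols, and frequent chunks like "walk", "talk" and "ing" fall out as reusable building blocks.

```python
# Minimal BPE-style merging: start from characters and repeatedly merge the
# most frequent adjacent pair, so frequent chunks become single units.
from collections import Counter

corpus = ["walking", "talking", "walked", "talked", "walks", "talks"]
words = [list(w) for w in corpus]                 # begin at the character level

def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    a, b = pair
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and w[i] == a and w[i + 1] == b:
                out.append(a + b)                 # fuse the pair into one symbol
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

for _ in range(7):                                # a handful of merges suffices here
    pair = most_frequent_pair(words)
    if pair is None:
        break
    words = merge_pair(words, pair)

for original, pieces in zip(corpus, words):
    print(original, "->", pieces)                 # e.g. walking -> ['walk', 'ing']
```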
xnx · about 1 year ago
Does anyone have a guess what angle John Carmack is working on with Keen Technologies?

https://dallasinnovates.com/john-carmacks-keen-technologies-partners-to-accelerate-development-of-artificial-general-intelligence/
samus · about 1 year ago
Look no further than here for decoding more than one following token:

https://hao-ai-lab.github.io/blogs/cllm/
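The linked post is about consistency LLMs, which are built around Jacobi-style decoding. Purely as a loose illustration of what "decoding more than one following token" can mean (a toy stand-in model, not the linked method's actual training or implementation): guess a whole block of future tokens, re-predict every position in parallel from its current prefix, and repeat until the block stops changing.

```python
# Toy Jacobi-style block decoding with a stand-in "model".
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def toy_next_token(prefix):
    # Stand-in for a language model: a deterministic toy that maps a prefix
    # to one "next token" (hash-based, NOT a real model).
    return VOCAB[hash(tuple(prefix)) % len(VOCAB)]

def jacobi_decode(prompt, n_tokens, max_iters=50):
    # Guess n future tokens at once, then refine every position in parallel
    # until the block reaches a fixed point (matching one-at-a-time decoding).
    block = [random.choice(VOCAB) for _ in range(n_tokens)]   # initial guess
    for _ in range(max_iters):
        new_block = [toy_next_token(prompt + block[:i]) for i in range(n_tokens)]
        if new_block == block:                                 # converged
            break
        block = new_block
    return block

print(jacobi_decode(["the", "cat"], n_tokens=4))
```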
gaganyaan · about 1 year ago
I don't think you quite understand how tokenization works. Try typing "strawberry" in here:

https://platform.openai.com/tokenizer

Tokens aren't just individual parts of compound words; they're sliced up in a way that's convenient statistically. The tokenizer has each individual character as a token, so it could be purely character-based if desired; it's just easier to compute when some common sequences like "berry" are represented by a single token. Try typing "strawberry" into the tokenizer and see it tokenized as "str", "aw", and "berry".

Also, next-token prediction is not stupid. A "sufficiently advanced" next-token predictor must be at least as intelligent as a human if it could predict any human's next token in any scenario. Obviously, we're not there yet, but there's no reason right now to think that next-token prediction will face any sort of limitation, especially with new models coming out that are seeing better performance purely from being trained much longer on the same datasets.
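A quick way to reproduce this outside the web page is OpenAI's open-source tiktoken library; a minimal sketch, assuming the cl100k_base encoding (exact splits differ between encodings):

```python
# Show how a BPE tokenizer slices a word into statistically convenient chunks.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")      # assumed encoding; others split differently
ids = enc.encode("strawberry")
pieces = [enc.decode_single_token_bytes(i).decode("utf-8") for i in ids]
print(pieces)                                   # e.g. ['str', 'aw', 'berry'] -- chunks, not morphemes
```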
byyoung3 · about 1 year ago
it's not stupid.