37 points by bmc7505, about 2 years ago

1 comment

IsaacL, about 2 years ago

This looks like a very interesting paper that takes the rare approach of actually trying to understand what all the cool new language models are doing at a fundamental level.

Does anyone with more knowledge of the relevant mathematics (group theory and so on) care to chime in?

Transformers Learn Shortcuts to Automata
