
Theoretical limitations of multi-layer Transformer

107 points by fovc 4 months ago

5 comments

thesz 4 months ago

> ...our results give: ... (3) a provable advantage of chain-of-thought, exhibiting a task that becomes exponentially easier with chain-of-thought.

It would be good to also prove that there is no task that becomes exponentially harder with chain-of-thought.
cubefox 4 months ago

Loosely related thought: A year ago, there was a lot of talk about the Mamba SSM architecture replacing transformers. Apparently that didn't happen so far.

[Comment #42896893 not loaded]
hochstenbach 4 months ago

Quanta magazine has an article that explains in plain words what the researchers were trying to do: https://www.quantamagazine.org/chatbot-software-begins-to-face-fundamental-limitations-20250131/
byyoung3 4 months ago
those lemmas are wild
cs702 4 months ago

Huh. I just skimmed this and quickly concluded that it's definitely *not* light reading.

It sure looks and smells like good work, so I've added it to my reading list.

Nowadays I feel like my reading list is growing faster than I can go through it.

[Comment #42891757 not loaded]