
Byte latent transformer: Patches scale better than tokens (2024)

107 points by dlojudice 5 days ago

3 comments

armcat 5 days ago
This was previously reported 5 months ago: https://news.ycombinator.com/item?id=42415122 (84 comments).

As an aside - I am a big fan of Luke Zettlemoyer and his team at the University of Washington. They've been doing cool NLP research for years!
entilzha 5 days ago
Great to see our paper here again! Since the paper release, we've also released model weights here for anyone interested in building on top of it: https://huggingface.co/facebook/blt. We also added HF Hub code to easily load the model: https://github.com/facebookresearch/blt?tab=readme-ov-file#load-weights-via-hf-hub.
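For anyone who wants to pull those weights down, here is a minimal sketch (my own, not from the repo) using the huggingface_hub client. The actual model-loading entrypoint lives in the README linked above; this only fetches the checkpoint files locally:

    # Download the released BLT checkpoint files from the Hugging Face Hub.
    # Requires: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(repo_id="facebook/blt")
    print(f"BLT checkpoint files downloaded to: {local_dir}")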
dlojudice 5 days ago
This BLT approach is why "AI research is stalling" takes are wrong. Dynamic byte-level patches instead of tokens seems genuinely innovative, not just scaling up the same architecture. Better efficiency AND handling edge cases better? Actual progress. The field is still finding clever ways to rethink fundamentals.
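To make the patching idea concrete: in the paper, patch boundaries come from a small byte-level LM, with a new patch starting wherever next-byte entropy crosses a threshold, so unpredictable regions get short patches and predictable stretches merge into long ones. Below is a toy sketch of that boundary rule only; the function and the entropy values are made-up stand-ins, not the paper's code (a real system would take entropies from the byte LM):

    # Toy illustration of entropy-based dynamic patching (not the paper's code).
    # A new patch starts at every byte whose predicted next-byte entropy
    # exceeds `threshold`; easy-to-predict runs of bytes share one patch.
    def entropy_patches(data: bytes, entropies: list[float], threshold: float) -> list[bytes]:
        patches, start = [], 0
        for i in range(1, len(data)):
            if entropies[i] > threshold:  # surprising byte: open a new patch here
                patches.append(data[start:i])
                start = i
        patches.append(data[start:])
        return patches

    # Made-up entropies: high at word onsets, low inside words.
    text = b"the cat sat"
    fake_entropies = [2.5, 0.3, 0.2, 0.4, 2.1, 0.4, 0.3, 0.5, 2.2, 0.5, 0.2]
    print(entropy_patches(text, fake_entropies, threshold=1.5))
    # -> [b'the ', b'cat ', b'sat']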