
Byte latent transformer: Patches scale better than tokens (2024)

107 points by dlojudice 5 days ago

3 comments

armcat 5 days ago
This was previously reported 5 months ago: https://news.ycombinator.com/item?id=42415122 (84 comments).

As an aside - I am a big fan of Luke Zettlemoyer and his team at the University of Washington. They've been doing cool NLP research for years!
entilzha 5 days ago
Great to see our paper here again! Since the paper release, we've also released model weights here for anyone interested in building on top of it: https://huggingface.co/facebook/blt. We also added HF Hub code to easily load the model: https://github.com/facebookresearch/blt?tab=readme-ov-file#load-weights-via-hf-hub.
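For anyone who wants to pull those weights down, here is a minimal sketch (my own, not from the repo) using the huggingface_hub client. The actual model-loading entrypoint lives in the README linked above; this only fetches the checkpoint files locally:

    # Download the released BLT checkpoint files from the Hugging Face Hub.
    # Requires: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(repo_id="facebook/blt")
    print(f"BLT checkpoint files downloaded to: {local_dir}")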
dlojudice 5 days ago
This BLT approach is why "AI research is stalling" takes are wrong. Dynamic byte-level patches instead of tokens seems genuinely innovative, not just scaling up the same architecture. Better efficiency AND handling edge cases better? Actual progress. The field is still finding clever ways to rethink fundamentals.
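To make the patching idea concrete: in the paper, patch boundaries come from a small byte-level LM, with a new patch starting wherever next-byte entropy crosses a threshold, so unpredictable regions get short patches and predictable stretches merge into long ones. Below is a toy sketch of that boundary rule only; the function and the entropy values are made-up stand-ins, not the paper's code (a real system would take entropies from the byte LM):

    # Toy illustration of entropy-based dynamic patching (not the paper's code).
    # A new patch starts at every byte whose predicted next-byte entropy
    # exceeds `threshold`; easy-to-predict runs of bytes share one patch.
    def entropy_patches(data: bytes, entropies: list[float], threshold: float) -> list[bytes]:
        patches, start = [], 0
        for i in range(1, len(data)):
            if entropies[i] > threshold:  # surprising byte: open a new patch here
                patches.append(data[start:i])
                start = i
        patches.append(data[start:])
        return patches

    # Made-up entropies: high at word onsets, low inside words.
    text = b"the cat sat"
    fake_entropies = [2.5, 0.3, 0.2, 0.4, 2.1, 0.4, 0.3, 0.5, 2.2, 0.5, 0.2]
    print(entropy_patches(text, fake_entropies, threshold=1.5))
    # -> [b'the ', b'cat ', b'sat']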