TechEcho

YaFSDP: a sharded data parallelism framework, faster for pre-training LLMs

135 points by wiradikusuma, 11 months ago

2 comments

codetrotter, 11 months ago
I was surprised to see that the “Ya” part stands for “Yet another”. I mean, I’ve seen it before in many acronyms. But it’s pretty tongue-in-cheek of them to use it here, since one would expect it to simply refer to Yandex, who made it.
dayeye2006, 11 months ago
Any idea what the main tricks are for achieving gains over FSDP?