TBH I kinda agree with the argument that distributed training is too hard. It's so dependent on your architecture, compute resources, and network topology that once people open that can of worms, they quickly realize the cost/benefit tradeoff is limited unless you're doing large-scale pre-training. It's just so much easier to train as much as possible on a single node.
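
To illustrate what I mean by "easier": here's a minimal single-node sketch (toy model/data, purely hypothetical, not anyone's real setup). One process, a plain loop, and if the box happens to have multiple GPUs you can wrap the model in `nn.DataParallel`. No launcher, no rendezvous config, no thinking about interconnects.

```python
# Single-node training sketch -- toy stand-ins, swap in your real model/data.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical toy dataset: 1024 samples, 32 features, binary labels.
data = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
loader = DataLoader(data, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2)).to(device)
if torch.cuda.device_count() > 1:
    # Still one process, one node: batches just get split across local GPUs.
    model = nn.DataParallel(model)

opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

Compare that to multi-node: suddenly you're picking a backend, setting up rendezvous, tuning bucket sizes around your network, and debugging hangs that only show up on your cluster's topology. That's the can of worms.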