TechEcho

1 comment

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, making them inaccessible to most researchers. In this work, we investigate methods for cost-efficient inference and fine-tuning of LLMs, comparing local and distributed strategies. We observe that a large enough model (50B+) can run efficiently even on geodistributed devices in a consumer-grade network.

Distributed Inference and Fine-Tuning of Large Language Models over the Internet

