I believe this wouldn't be meaningful, since an LLM of any size can be trained on any amount of data.

You could measure how well it memorizes via prediction accuracy on the training set, but that wouldn't indicate whether it generalizes well.
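For what it's worth, here's a minimal sketch of that train-vs-held-out comparison, assuming a Hugging Face causal LM; the model name and the two sample texts are placeholders, not anything specific:

    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def perplexity(model, tokenizer, text):
        # exp of the mean next-token cross-entropy the model assigns to `text`
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            loss = model(**enc, labels=enc["input_ids"]).loss
        return math.exp(loss.item())

    name = "meta-llama/Llama-3.1-8B"  # placeholder; any causal LM works
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    train_text = "..."    # a passage known to be in the training set
    heldout_text = "..."  # a comparable passage the model has never seen

    # Much lower perplexity on the training passage than on comparable
    # held-out text points to memorization rather than generalization.
    print(perplexity(model, tok, train_text), perplexity(model, tok, heldout_text))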
Llama 3.1 was pre-trained on 15 trillion tokens, plus a few million more for fine-tuning. That's roughly 60 terabytes of text.

https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md

The most heavily quantised Llama 3.1 8B is about 3.4 GB.

So a compression ratio of roughly 0.006%, if you don't mind the intelligence of a heavily quantised 8B model.
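Back-of-the-envelope version of that arithmetic (the 4 bytes per token is just a rough assumption for English text; the model and corpus sizes are the ones above):

    tokens = 15e12            # Llama 3.1 pre-training tokens
    bytes_per_token = 4       # assumed average, ~4 characters of UTF-8 text
    corpus_bytes = tokens * bytes_per_token   # ~6e13 bytes, i.e. ~60 TB
    model_bytes = 3.4e9       # heavily quantised 8B checkpoint

    ratio = model_bytes / corpus_bytes
    print(f"corpus ~ {corpus_bytes / 1e12:.0f} TB, ratio ~ {ratio:.4%}")  # ~0.0057%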
OpenAI’s GPT-3 (175B parameters) has a checkpoint size of about 350 GB at fp16, and was trained on roughly 300 billion tokens filtered from about 45 TB of raw Common Crawl text, which relative to the raw corpus is still a high compression ratio (under 1%).