The author of the article appears to have misunderstood one important detail about Code Llama. They state:

> The Code Llama models were trained on 500B tokens, whereas Llama 2 models were trained on 2T tokens. Since the Code Llama model was trained on 4x fewer tokens, maybe a CodeLlama 70B version did not perform well enough due to LLM scaling laws—there was not enough training data.

But if you read the paper, it says on page 1:

> Our approach is based on gradually specializing and increasing the capabilities of Llama 2 models by applying a cascade of training and fine-tuning steps [...]

In fact, the diagram at the top of page 3 details the process, starting with the Llama 2 foundation models: Llama 2 foundation models (7B, 13B, 34B) -> code training (500B tokens) -> Python / long-context fine-tuning. In other words, Code Llama is initialized from Llama 2 checkpoints that had already seen 2T tokens; the 500B code tokens are additional training on top of that, not the total training budget.

See the paper here: https://arxiv.org/abs/2308.12950
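To make the correction concrete, here is a minimal sketch (not the authors' code; the Checkpoint class and continue_training helper are hypothetical, for illustration only) that simply tracks the cumulative token budget through the cascade. The point is that the 500B code tokens are added on top of the ~2T tokens Llama 2 had already seen, rather than replacing them.

    from dataclasses import dataclass

    @dataclass
    class Checkpoint:
        name: str
        tokens_seen: float  # cumulative training tokens across all stages

    def continue_training(ckpt: Checkpoint, stage: str, extra_tokens: float) -> Checkpoint:
        # Hypothetical helper: each stage starts from the previous checkpoint's
        # weights, so its tokens are added to what the model has already seen.
        return Checkpoint(name=f"{ckpt.name} -> {stage}",
                          tokens_seen=ckpt.tokens_seen + extra_tokens)

    # Llama 2 foundation model: ~2T tokens of general pretraining.
    llama2 = Checkpoint("Llama 2", 2e12)

    # Code Llama: +500B tokens of code training on top of Llama 2.
    # (The Python and long-context stages in the paper continue this cascade
    # with further tokens on top of this checkpoint.)
    code_llama = continue_training(llama2, "Code Llama (code, 500B)", 5e11)

    for ckpt in (llama2, code_llama):
        print(f"{ckpt.name}: {ckpt.tokens_seen / 1e12:.2f}T tokens seen")

Running this prints ~2.00T for Llama 2 and ~2.50T for Code Llama, which is why the "trained on 4x fewer tokens" framing in the article doesn't hold.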