科技回声 (Tech Echo)

A tech-news platform built with Next.js, serving global tech news and discussion.


Ask HN: Why the LLaMA code base is so short

3 points | by kureikain | over 1 year ago
I was getting into LLMs and picked up some projects. I tried to dive into the code to see what the secret sauce is.

But the code is so short that there is almost nothing to read.

https://github.com/facebookresearch/llama

I then proceeded to check https://github.com/mistralai/mistral-src and, surprisingly, it's the same.

What exactly are those codebases? It feels like all you do is download the models.

2 comments

mikewarot | over 1 year ago
Most neural networks are just directed graphs: a ton of matrix multiplies, with a nonlinear function at the end of each layer. The libraries for gradient descent, training, etc. are all there to use. It is amazing how small the actual code is compared to the amount of compute needed for training.
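The "matmuls plus a nonlinearity" structure described above can be sketched in a few lines. This is a toy illustration, not the actual LLaMA code: the layer shapes, the two-layer stack, and the choice of SiLU as the activation are all assumptions made for the example.

```python
import numpy as np

def silu(x):
    # SiLU nonlinearity: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def forward(x, layers):
    """Run input x through a stack of (weight, bias) layers:
    each layer is one matrix multiply followed by a nonlinearity."""
    for w, b in layers:
        x = silu(x @ w + b)
    return x

rng = np.random.default_rng(0)
# Two illustrative layers: 8 -> 16 -> 4
layers = [
    (rng.standard_normal((8, 16)) * 0.1, np.zeros(16)),
    (rng.standard_normal((16, 4)) * 0.1, np.zeros(4)),
]
out = forward(rng.standard_normal((2, 8)), layers)
print(out.shape)  # (2, 4)
```

The point of the sketch is that the forward pass itself is tiny; in a real model, almost all of the complexity lives in the learned weights and the compute spent producing them, not in the code.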
arthurcolle | over 1 year ago
These are just the repos that provide the inference code to run the model; they require the weights, which are available via HuggingFace or, in Llama 2's case, from here: https://ai.meta.com/resources/models-and-libraries/llama-downloads/