Llama2.rs: One-file Rust implementation of Llama2

60 points by sshroot almost 2 years ago

5 comments

jokethrowaway almost 2 years ago
Very nice! I wanted to do something like this, but then I would miss out on proper CUDA acceleration and lose performance compared to using libtorch.

I wrote a forgettable llama implementation for https://github.com/LaurentMazare/tch-rs (the Rust binding for PyTorch's libtorch). Still not ideal, but at least you get the same GPU performance you would get on PyTorch.

...And then I spotted Candle, a new ML framework by the same author: https://github.com/huggingface/candle

It's all in Rust, self-contained, a huge undertaking, but it looks very promising. They already have a llama2 example!
ReactiveJelly almost 2 years ago
(since you asked for a code review)

For timing benchmarks, use Instant or a similar monotonic clock instead of SystemTime.

The original C code makes the same mistake, using CLOCK_REALTIME instead of CLOCK_MONOTONIC.

This means the benchmarks will be wrong if the program runs while NTP is fixing up the clock. This can happen right after the system gets internet, or periodically when it checks for skew. Some systems might slowly blend in NTP fixes too, which means 1 second of calendar time is not 1 second of monotonic time over a long period of time.

At least it won't be affected by daylight saving. But it's not airtight.
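A minimal sketch of the monotonic-clock advice above, using only the standard library. The helpers `time_it` and `tokens_per_second` are hypothetical names made up for illustration; they are not taken from llama2.rs or llama2.c.

```rust
use std::time::{Duration, Instant};

/// Runs `work` once and returns its result together with the elapsed time.
/// `Instant` is monotonic, so wall-clock adjustments do not affect the reading.
/// (Hypothetical helper, not part of llama2.rs.)
fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

/// Converts a token count and an elapsed duration into tokens per second.
/// (Hypothetical helper, not part of llama2.rs.)
fn tokens_per_second(tokens: usize, elapsed: Duration) -> f64 {
    tokens as f64 / elapsed.as_secs_f64()
}

fn main() {
    // Stand-in workload; a real benchmark would time the generation loop.
    let (sum, elapsed) = time_it(|| (0..1_000_000u64).sum::<u64>());
    println!("sum = {sum}, took {elapsed:?}");

    // 256 tokens in 2 seconds is 128 tokens per second.
    println!("{:.1} tok/s", tokens_per_second(256, Duration::from_secs(2)));
}
```

Swapping `SystemTime::now()` for `Instant::now()` is usually the whole change; `Instant` cannot be converted to a calendar date, which is fine for a benchmark.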
tayo42 almost 2 years ago
anyone have suggestions about where to learn about the stuff going on in this (and llama2.c repo)?

like the file formats, all the extra files like the tokenizer.bin file, the terminology in the sources comments, logits, transformers etc
CameronNemo almost 2 years ago
"This is my first Rust project, so if you are an expert I would love a code review!"

Seeing a few uses of `unsafe`, a few of `expect`. Wonder if you can mmap the binary model in without unsafe?
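On the `unsafe` question: memory mapping itself is generally exposed as `unsafe` (for example, `Mmap::map` in the memmap2 crate is an `unsafe fn`, because another process can change the file underneath the mapping). One safe alternative, sketched below, is to read the file into memory and decode it. This assumes the weights are a raw blob of little-endian f32 values, which is an assumption for illustration and not a description of the actual llama2.rs file layout; `decode_f32_le` and `load_weights` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Decodes a byte buffer as little-endian f32 values without `unsafe`.
/// Buffers whose length is not a multiple of 4 are rejected.
/// (Hypothetical helper; assumes a raw f32 blob with no header.)
fn decode_f32_le(bytes: &[u8]) -> io::Result<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "length is not a multiple of 4",
        ));
    }
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

/// Reads the whole file into memory and decodes it.
/// Returns an error instead of panicking, so no `expect` is needed here.
fn load_weights(path: &Path) -> io::Result<Vec<f32>> {
    decode_f32_le(&fs::read(path)?)
}

fn main() -> io::Result<()> {
    // Write a tiny demo file, then load it back.
    let path = std::env::temp_dir().join("weights_demo.bin");
    let values = [1.0f32, -2.5, 3.25];
    let bytes: Vec<u8> = values.iter().flat_map(|v| v.to_le_bytes()).collect();
    fs::write(&path, &bytes)?;

    let loaded = load_weights(&path)?;
    println!("{loaded:?}");

    fs::remove_file(&path)?;
    Ok(())
}
```

The trade-off is that this copies the entire file into a `Vec<f32>` up front, so it uses more memory and starts slower than a lazily paged mapping would for a multi-gigabyte model.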
WiSaGaN almost 2 years ago
This has 2 dependencies. Notably it depends on rayon.