It's also worth mentioning that the original implementation by Meta is only 300 lines of very readable code [1].<p>[1]: <a href="https://github.com/meta-llama/llama3/blob/main/llama/model.py">https://github.com/meta-llama/llama3/blob/main/llama/model.p...</a>
If you `import jax.numpy as np`, you also get a JAX implementation after a few modifications: e.g. removing in-place index assignment, replacing unsupported functions, etc.
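For anyone curious, here is a minimal sketch of the most common change (the cache update is a hypothetical example, not code from the repo): NumPy-style in-place index assignment has to become JAX's functional `.at[...].set(...)` update, since JAX arrays are immutable.

```python
import jax.numpy as np  # drop-in replacement for "import numpy as np"

# NumPy version mutates the buffer in place:
#   cache[:, pos] = new_values
# JAX arrays are immutable, so the update returns a new array instead:
def update_cache(cache, pos, new_values):
    return cache.at[:, pos].set(new_values)

cache = np.zeros((2, 8, 4))  # hypothetical (batch, seq_len, dim) shapes
cache = update_cache(cache, 3, np.ones((2, 4)))
```

Once the in-place updates are rewritten this way, the rest of the array code usually runs unchanged under jax.numpy.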
According to the TinyStories dataset card [1], the dataset was generated by GPT-3.5 and GPT-4. Judging by the discussions in the community tab [2], there are a lot of incomplete or misspelled words, incorrect grammar, and even Chinese characters in the dataset.<p>As such, I'd be wary of using that dataset to train or evaluate models.<p>[1] <a href="https://huggingface.co/datasets/roneneldan/TinyStories" rel="nofollow">https://huggingface.co/datasets/roneneldan/TinyStories</a><p>[2] <a href="https://huggingface.co/datasets/roneneldan/TinyStories/discussions" rel="nofollow">https://huggingface.co/datasets/roneneldan/TinyStories/discu...</a>
We changed the URL from <a href="https://github.com/likejazz/llama3.np">https://github.com/likejazz/llama3.np</a> to the article it points to, which gives more background.
How does this differ from the llama.np repository credited in the README? <a href="https://github.com/hscspring/llama.np">https://github.com/hscspring/llama.np</a>
The rotary embeddings bit is neat. I wonder if a complex representation would simplify vs complexify things (readability, performance, expressive power).
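For what it's worth, I believe Meta's reference implementation (linked above) already uses the complex form, via torch.polar and torch.view_as_complex. A standalone NumPy sketch of that formulation (my own illustration, not code from either repo) would look something like:

```python
import numpy as np

def rope_complex(x, base=10000.0):
    # x: (seq_len, dim) with dim even; each pair (x[2i], x[2i+1]) is
    # treated as one complex number and rotated by a position-dependent
    # unit phasor exp(i * pos * theta_i), theta_i = base^(-2i/dim).
    seq_len, dim = x.shape
    freqs = 1.0 / base ** (np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = np.outer(np.arange(seq_len), freqs)         # (seq_len, dim/2)
    phasors = np.exp(1j * angles)                        # unit rotations
    xc = x.reshape(seq_len, dim // 2, 2)
    xc = xc[..., 0] + 1j * xc[..., 1]                    # view pairs as complex
    rotated = xc * phasors                               # the whole rotation
    return np.stack([rotated.real, rotated.imag], axis=-1).reshape(seq_len, dim)

out = rope_complex(np.random.randn(16, 64))
```

The complex form makes the rotation a single elementwise multiply, which arguably reads better; the equivalent real-valued version avoids complex dtypes, which some backends handle less efficiently.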