That is an excellent explanation, full of great intuition building!

If anyone is interested in a tensor-network-style diagrammatic notation for array programs (of which transformers and other deep neural nets are examples), I recently wrote a post that introduces a "colorful" tensor network notation (where the colors correspond to axis names) and then uses it to describe self-attention and transformers. The actual circuitry to compute one round of self-attention is remarkably compact in this notation:

https://math.tali.link/raster/052n01bav6yvz_1smxhkus2qrik_0736_0884_02kdqvrzq963t.jpg

Here's the full section on transformers: https://math.tali.link/rainbow-array-algebra/#transformers

For more context on this kind of notation, and how it conceptualizes "arrays as functions" and "array programs as higher-order functional programming", you can check out https://math.tali.link/classical-array-algebra, or skip straight to the named-axis follow-up at https://math.tali.link/rainbow-array-algebra
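If you want to connect the diagram back to ordinary code, here is a minimal sketch of one round of single-head, unmasked self-attention in plain numpy, with einsum subscripts playing the role of the colored axis names. The dimension sizes and variable names are my own illustration, not taken from the post:

  import numpy as np

  # Axis "colors": s = sequence position, d = model dim,
  #                k = key/query dim,     v = value dim
  s, d, k, v = 6, 8, 4, 4
  rng = np.random.default_rng(0)

  x  = rng.normal(size=(s, d))   # one embedding per position
  Wq = rng.normal(size=(d, k))   # query projection
  Wk = rng.normal(size=(d, k))   # key projection
  Wv = rng.normal(size=(d, v))   # value projection

  q = np.einsum("sd,dk->sk", x, Wq)   # queries
  K = np.einsum("sd,dk->sk", x, Wk)   # keys
  V = np.einsum("sd,dv->sv", x, Wv)   # values

  # query-position x key-position similarity scores
  scores = np.einsum("qk,sk->qs", q, K) / np.sqrt(k)

  # softmax over key positions
  attn = np.exp(scores - scores.max(-1, keepdims=True))
  attn /= attn.sum(-1, keepdims=True)

  # attention-weighted sum of values
  out = np.einsum("qs,sv->qv", attn, V)
  print(out.shape)   # (6, 4)

Each einsum string is essentially a one-line version of the wiring diagram: a letter repeated across inputs is a contracted (summed) axis, which is exactly an edge that gets glued together in the tensor-network picture.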