科技回声 (Tech Echo)

A tech news platform built with Next.js, providing global tech news and discussion content.

© 2025 科技回声 (Tech Echo). All rights reserved.

The Illustrated GPT-2: Visualizing Transformer Language Models (2019)

213 points | by epberry | over 1 year ago

3 comments

xianshou · over 1 year ago

More excellent resources from Jay and others:

The Illustrated Transformer - http://jalammar.github.io/illustrated-transformer/

Beyond the Illustrated Transformer - https://news.ycombinator.com/item?id=35712334

LLM Visualization - https://bbycroft.net/llm
taliesinb · over 1 year ago

That is an excellent explanation, full of great intuition building!

If anyone is interested in a kind of tensor-network-y diagrammatic notation for array programs (of which transformers and other deep neural nets are examples), I wrote a post recently that introduces a kind of "colorful" tensor network notation (where the colors correspond to axis names) and then uses it to describe self-attention and transformers. The actual circuitry to compute one round of self-attention is remarkably compact in this notation:

https://math.tali.link/raster/052n01bav6yvz_1smxhkus2qrik_0736_0884_02kdqvrzq963t.jpg

Here's the full section on transformers: https://math.tali.link/rainbow-array-algebra/#transformers -- for more context on this kind of notation and how it conceptualizes "arrays as functions" and "array programs as higher-order functional programming" you can check out https://math.tali.link/classical-array-algebra or skip to the named-axis followup at https://math.tali.link/rainbow-array-algebra
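For readers who want to see just how compact "one round of self-attention" is in ordinary code rather than diagrammatic notation, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The function and variable names are my own illustrative choices, not the notation from the linked post:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """One round of single-head self-attention over a sequence X
    of shape (tokens, d_model). Weight names are illustrative."""
    Q = X @ Wq                              # queries
    K = X @ Wk                              # keys
    V = X @ Wv                              # values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)         # scaled dot-product scores
    # numerically stable row-wise softmax over the key axis
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                            # weighted sum of values

# toy example: 4 tokens, model dim 8, head dim 8
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)        # shape (4, 8)
```

The entire computation is three projections, one matrix product for the scores, a softmax, and one more matrix product, which is the same compactness the diagrammatic notation above is capturing.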
(Reply #38696570 not loaded.)
Der_Einzige · over 1 year ago
Jay Alammar is one of the greats in our field and we are lucky to have him.