Transformers from Scratch

265 points, by stablemap, almost 6 years ago

10 comments

cgearhart, almost 6 years ago
This is a _great_ article. One of the things I enjoy most is finding new ways to understand or think about things I already feel like I know. This article helped me do both with transformer networks. I especially liked how explicitly and simply things were explained, like queries, keys, and values; permutation equivariance; and even the distinction between learned model parameters and parameters derived from the data (like the attention weights).

The author quotes Feynman, and I think this is a great example of his concept of explaining complex subjects in simple terms.
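To make that last point concrete, here is a minimal self-attention sketch in PyTorch (not taken from the article; the projection matrices `wq`, `wk`, `wv` and the sizes are illustrative). The projections are the learned parameters, the attention weights are computed from the data, and permuting the input rows permutes the output rows identically, which is the permutation equivariance the comment mentions.

```python
import torch

def self_attention(x, wq, wk, wv):
    # x: (seq_len, dim); wq/wk/wv: (dim, dim) learned projections
    q, k, v = x @ wq, x @ wk, x @ wv
    # the attention weights are derived from the data, not learned directly
    att = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)
    return att @ v

torch.manual_seed(0)
dim, seq_len = 8, 5
x = torch.randn(seq_len, dim)
wq, wk, wv = (torch.randn(dim, dim) for _ in range(3))

perm = torch.randperm(seq_len)
out = self_attention(x, wq, wk, wv)
out_perm = self_attention(x[perm], wq, wk, wv)

# permuting the inputs permutes the outputs the same way (no positional info)
print(torch.allclose(out[perm], out_perm, atol=1e-6))  # True
```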
dusted, almost 6 years ago
And here I was, excited to learn something about actual transformers, something involving wire and metal...
yamrzou, almost 6 years ago
This is the best article I have read so far explaining the transformer architecture. The clear and intuitive explanation can't be praised enough.

Note that the teacher has a Machine Learning course with video lectures on YouTube that he references throughout the article: http://www.peterbloem.nl/teaching/machine-learning
Gallactide, almost 6 years ago
This man was my professor at the VU.

Honestly, his lectures were fun and easy to look forward to; I'm really glad his post is getting traction.

If you find his video lectures, they are a really graceful introduction to most ML concepts.
isoprophlex, almost 6 years ago
Stellar article. I never understood self-attention; this makes it so very clear in a few concise lines, with little fluff.

The author has a gift for explaining these concepts.
NHQ, almost 6 years ago
This is sweet. I've written conv, dense, and recurrent networks from scratch. Transformers next!

Plug: I just published this demo using GD to find control points for Bezier Curves: http://nhq.github.io/beezy/public/
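For readers wondering what "GD to find control points" might look like in practice, here is a rough sketch of the general idea (my assumptions, not code from the linked demo): treat the four control points of a cubic Bezier curve as parameters and minimize the distance between sampled curve points and target points by gradient descent.

```python
import torch

# target points to approximate (illustrative data, not from the demo)
t = torch.linspace(0, 1, 50).unsqueeze(1)
target = torch.cat([t, torch.sin(3.14 * t)], dim=1)

# four control points of a cubic Bezier curve, optimized by gradient descent
ctrl = torch.randn(4, 2, requires_grad=True)
opt = torch.optim.Adam([ctrl], lr=0.05)

for step in range(500):
    # cubic Bezier: B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3
    curve = ((1 - t) ** 3 * ctrl[0] + 3 * (1 - t) ** 2 * t * ctrl[1]
             + 3 * (1 - t) * t ** 2 * ctrl[2] + t ** 3 * ctrl[3])
    loss = ((curve - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(ctrl.detach())  # fitted control points
```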
ropiwqefjnpoa, almost 6 years ago
Ah yes, machine learning architecture transformers, I knew that.
siekmanj, almost 6 years ago
Wow. I have been looking for a good resource on implementing self-attention/transformers on my own for the last week - can't wait to read this through.
ccccppppp, almost 6 years ago
Noob question: I have a 1D conv net for financial time series prediction. Could a transformer architecture be better for this task? Is it worth a try?
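One low-effort way to find out is to swap the conv layers for a small transformer encoder over the same input windows and compare validation error. A minimal, hedged sketch (the window length, model width, and feature count below are placeholders, not recommendations):

```python
import torch
import torch.nn as nn

class TSTransformer(nn.Module):
    """Toy encoder for fixed-length windows of a univariate series."""
    def __init__(self, n_features=1, d_model=32, n_heads=4, n_layers=2, window=64):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        # learned positional embeddings, since attention alone ignores order
        self.pos = nn.Parameter(torch.randn(window, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=64, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # predict the next value

    def forward(self, x):             # x: (batch, window, n_features)
        h = self.embed(x) + self.pos  # add positional information
        h = self.encoder(h)
        return self.head(h[:, -1])    # read out the last position

model = TSTransformer()
x = torch.randn(8, 64, 1)             # batch of 8 windows, 64 steps each
print(model(x).shape)                  # torch.Size([8, 1])
```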
gwbas1c, almost 6 years ago
The title is misleading. I thought this was an article about building your own electrical transformer, or building your own version of the 1980s toy.