Formal Algorithms for Transformers

106 points by hexhowells, almost 3 years ago

7 comments

lynguist, almost 3 years ago
I find the distinction introduced in this paper into encoder-decoder Transformers, encoder-only Transformers and decoder-only Transformers very useful for my informal understanding of the different architectures. Thank you for this clear clarification.
sva_, almost 3 years ago
I like how this seems to actually be self-contained. They even have a list of notations in the end.
geysersam, almost 3 years ago
This is a fantastic resource. It's the missing piece of many machine learning articles.
tartakovsky, almost 3 years ago
Zero diagrams, but maybe they wouldn’t be helpful to clarify the concept? Guess it depends on the types of learners, I’m not sure.
godelski, almost 3 years ago
I can't tell who this paper is aimed at. It isn't formal. It isn't mathematical. It isn't a good description and doesn't have good coverage. I can only assume it is for citations.
ThrowawayTestr, almost 3 years ago
I was assuming electrical transformers.
mrhether, almost 3 years ago
"Familiar with basic ML terminology" might be an understatement.