This is a _great_ article. One of the things I enjoy most is finding new ways to understand or think about things I already feel like I know, and this article helped me do both with transformer networks. I especially liked how explicitly and simply it explained things like queries, keys, and values; permutation equivariance; and even the distinction between learned model parameters and parameters derived from the data (like the attention weights).

The author quotes Feynman, and I think this article is a great example of his concept of explaining complex subjects in simple terms.
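That last distinction clicked for me in code form: the projection matrices are learned and fixed after training, while the attention weights are recomputed from each input. A minimal numpy sketch (names, shapes, and random values are mine, not from the article):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 4

# Learned model parameters: fixed once training is done.
W_q = rng.normal(size=(d, d))
W_k = rng.normal(size=(d, d))
W_v = rng.normal(size=(d, d))

# An input sequence of 3 token embeddings.
x = rng.normal(size=(3, d))

# Queries, keys, and values are projections of the input.
q, k, v = x @ W_q, x @ W_k, x @ W_v

# The attention weights are derived from the data at run time --
# they are not stored parameters of the model.
attn = softmax(q @ k.T / np.sqrt(d))  # shape (3, 3), rows sum to 1
out = attn @ v
```

Feed in a different `x` and `attn` changes, while `W_q`, `W_k`, `W_v` stay the same; that's the whole distinction in two lines.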