
X-Transformers: A fully-featured transformer with experimental features

96 points · by blackcat201 · about 4 years ago

9 comments

throwawaybbq1 · about 4 years ago
FYI: I work in deep learning, and lucidrains is becoming a legend in my line of work. He seems to be someone who is obsessed with transformers (the deep learning ones, and rightly so; they are amazing). To the author (if you are reading this on HN): I want to thank you for the amazing work you have done!

For the non-DL crowd: Transformers have been a tsunami in deep learning for the past few years, topping benchmarks in many subfields. I do research professionally, and this work is amazingly useful for people like me.
fao_ · about 4 years ago
As others have mentioned, anything obscure like this should literally come with a Wikipedia (or other such) link explaining what it is and what it does. This is the primary problem with small project READMEs, in my opinion: they assume you're already familiar with the project and know what the hell it is. Take Ironhide:

    https://github.com/MrMEEE/ironhide
    Optimus Support for Linux Through VirtualGL - PPA version also available

That's... great. So it's doing something with GL, and it's running on Linux, but uhhh.

    my branch of the original bumblebee project..

What is Optimus? What is Bumblebee? The trick of it is that it links to a blog where neither of these terms is ever explained. Maybe it's just there to look impressive on someone's CV? How could I even tell the difference?

Likewise for this project, all you need in the README is one line like:

    X-Transformers is a re-implementation of Machine Learning Transformers that has been built based on experimental Arxiv papers

It's a one-line fix, but it would stop people like me being confused as to whether or not you're implementing a new HTTP header.
bratao · about 4 years ago
lucidrains and Ice Cream are my references in terms of research, knowledge, and productivity. Phil has always been available to guide me and hear me out. One time I told him about a piece of under-the-radar research in another language, and he was kind enough to check whether it had any merit.

As for X-Transformers, it is a great piece of engineering that implements almost all of the proposed improvements to transformers. But in my experience, and according to Phil himself, only the feedforward GLU and RoPE (Rotary Positional Embeddings) work (or, to be fair, those are the ones that show improvements in more general use cases).
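(A minimal sketch, not from the thread: per the x-transformers README, these two additions are exposed as the ff_glu and rotary_pos_emb flags on the attention-layer classes; the dimensions and vocabulary size below are illustrative.)

    # Minimal sketch: a decoder with the GLU feedforward variant and rotary
    # positional embeddings enabled, the two additions the comment above
    # singles out as showing general improvements.
    import torch
    from x_transformers import TransformerWrapper, Decoder

    model = TransformerWrapper(
        num_tokens=20000,            # vocabulary size (illustrative)
        max_seq_len=1024,
        attn_layers=Decoder(
            dim=512,
            depth=6,
            heads=8,
            ff_glu=True,             # feedforward GLU
            rotary_pos_emb=True,     # RoPE (rotary positional embeddings)
        ),
    )

    tokens = torch.randint(0, 20000, (1, 1024))
    logits = model(tokens)           # shape: (1, 1024, 20000)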
argvargc · about 4 years ago
Unfortunately for me, I genuinely thought this was going to be a DIY robot build that could disguise itself as something else.
adontz · about 4 years ago
I expected to see a 3D model of Optimus Prime.
bravura · about 4 years ago
What do you use for images that don’t have identical height and width? It seems the image transformer here expects square images.
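(For context, a minimal sketch, not from the thread: the image wrapper in x-transformers takes a single image_size value, which is why inputs are assumed to be square; non-square images would need to be resized or padded beforehand. Sizes below are illustrative.)

    # Minimal sketch: the image transformer wrapper takes one image_size,
    # so height and width are assumed equal.
    import torch
    from x_transformers import ViTransformerWrapper, Encoder

    vit = ViTransformerWrapper(
        image_size=256,              # one side length: a 256 x 256 input is assumed
        patch_size=32,
        num_classes=1000,
        attn_layers=Encoder(dim=512, depth=6, heads=8),
    )

    images = torch.randn(1, 3, 256, 256)   # square input
    logits = vit(images)                    # shape: (1, 1000)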
krick · about 4 years ago
That's really cool. Now I need a bunch of pre-trained models for this...
mrfusion · about 4 years ago
Explain like I’m a first year CS major?
shayankh · about 4 years ago
absolutely fucking amazing