
The Illustrated Retrieval Transformer

75 points | by jayalammar | over 3 years ago

4 comments

m_ke · over 3 years ago
Using external memory instead of encoding all of the knowledge in the model will take over all branches of applied ML.

A recognition model should use a similar mechanism to store short-term context in a memory buffer from previous frames, plus a large external database of long-term key-value pairs that retain relevant semantic information for given embeddings.

Doing so will make it possible to update and expand the models without having to retrain, and enable much better zero/few-shot learning.

We already have a hacky version of this in our production app for food recognition. For new users we use a standard CNN to predict the items present in the image; once a user logs a few meals we use nearest neighbor search to match new images against previously submitted entries, which works extremely well.
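The fallback m_ke describes can be sketched as a nearest-neighbor lookup over stored embeddings. This is a minimal illustration, not the app's actual pipeline: the embedding dimensions, similarity threshold, and labels are all made up for the example.

```python
import math

class MealMemory:
    """Toy external memory: maps image embeddings to previously logged labels."""

    def __init__(self):
        self.entries = []  # list of (embedding, label) pairs

    def add(self, embedding, label):
        self.entries.append((embedding, label))

    def lookup(self, embedding, threshold=0.9):
        """Return the label of the most similar stored embedding, if close enough."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)

        best_label, best_sim = None, -1.0
        for key, label in self.entries:
            sim = cosine(key, embedding)
            if sim > best_sim:
                best_label, best_sim = label, sim
        return best_label if best_sim >= threshold else None

memory = MealMemory()
memory.add([1.0, 0.0, 0.0, 0.0], "oatmeal")
memory.add([0.0, 1.0, 0.0, 0.0], "salad")

# A new photo whose embedding is close to a logged meal hits the memory;
# anything unfamiliar returns None and the caller falls back to the generic CNN.
print(memory.lookup([0.95, 0.05, 0.0, 0.0]))  # oatmeal
print(memory.lookup([0.5, 0.5, 0.5, 0.5]))    # None
```

Updating the "model" here is just appending to `entries`, which is the point of the comment: new knowledge arrives without any retraining.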
jayalammar · over 3 years ago
Hi HN,

Summary: The latest batch of language models can be much smaller yet achieve GPT-3-like performance by being able to query a database or search the web for information. A key indication is that building larger and larger models is not the only way to improve performance.

Hope you find it useful. All feedback is welcome!
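The idea in the summary — a smaller model consulting an external text database at inference time — reduces to a retrieve-then-condition loop. A minimal sketch under stated assumptions: the word-overlap scoring and prompt template below are illustrative stand-ins, not RETRO's actual chunked nearest-neighbor retrieval or cross-attention.

```python
def retrieve(query, database, k=2):
    """Rank database passages by word overlap with the query (a crude
    stand-in for nearest-neighbor search over chunk embeddings)."""
    q = set(query.lower().split())
    return sorted(
        database,
        key=lambda p: len(q & set(p.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query, database):
    """Prepend retrieved passages so the model conditions on external
    facts instead of having to memorize them in its weights."""
    context = "\n".join(f"- {p}" for p in retrieve(query, database))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

database = [
    "RETRO retrieves from a database of trillions of tokens.",
    "GPT-3 has 175 billion parameters.",
    "The weather is nice today.",
]
print(build_prompt("How many parameters does GPT-3 have?", database))
```

Because the knowledge lives in `database` rather than in weights, updating what the system "knows" is an index update, not a training run — which is why the summary argues scale is not the only lever.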
fabbari · over 3 years ago
/s AI finally reached the pinnacle of human intelligence: when asked a question it will now roll its eyes and loudly declare: "Let me Google that for you".
savant_penguin · over 3 years ago
Nice

What is the size of the database and how does it compare to the size of the GPT-3 model? "(2 trillion multi-lingual tokens)" But how much memory is that?

How much compute does each method need?

Can you run it on a laptop?

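A rough back-of-envelope for the memory question. The assumptions here are illustrative, not from the thread: token text stored at ~4 bytes per token, one retrieval key per 64-token chunk, keys as 768-dimensional float16 embeddings.

```python
# Back-of-envelope storage estimate for a 2-trillion-token retrieval database.
# All figures are assumptions chosen for illustration, not published numbers.
tokens = 2_000_000_000_000
text_bytes = tokens * 4              # raw token text at ~4 bytes/token
chunks = tokens // 64                # one retrieval key per 64-token chunk
key_bytes = chunks * 768 * 2         # 768-dim float16 embedding per key
total_tb = (text_bytes + key_bytes) / 1e12

print(f"text: {text_bytes / 1e12:.0f} TB, "
      f"keys: {key_bytes / 1e12:.0f} TB, total: ~{total_tb:.0f} TB")
# text: 8 TB, keys: 48 TB, total: ~56 TB
```

Under these assumptions the database is tens of terabytes — far larger than the model's weights, and well beyond a laptop's disk, though the lookup itself is cheap compared to a forward pass through a 175B-parameter model.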