Using external memory instead of encoding all of the knowledge in the model will take over all branches of applied ML.

A recognition model should use a similar mechanism: a memory buffer that stores short-term context from previous frames, plus a large external database of long-term key-value pairs that retains relevant semantic information for given embeddings.

Doing so would make it possible to update and expand models without retraining, and would enable much better zero-/few-shot learning.

We already have a hacky version of this in our production app for food recognition. For new users we use a standard CNN to predict the items present in the image; once a user has logged a few meals, we use nearest-neighbor search to match new images against previously submitted entries, which works extremely well.
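To make that last part concrete, here's a minimal sketch of the fallback logic, not our actual code: the names (`cnn_model`, `embed_model`) and the thresholds are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical sketch: embed the image, then either fall back to the generic
# CNN classifier or do a nearest-neighbor lookup against the user's
# previously logged meals, depending on how much history exists.

MIN_LOGGED_MEALS = 5          # assumed threshold, not from the post above
SIMILARITY_THRESHOLD = 0.8    # assumed cosine-similarity cutoff

def predict_items(image, user_history, cnn_model, embed_model):
    """user_history: list of (embedding, items) pairs from past meals."""
    if len(user_history) < MIN_LOGGED_MEALS:
        # New user: no personal memory yet, use the generic classifier.
        return cnn_model.predict(image)

    query = embed_model.embed(image)               # image -> vector
    query = query / np.linalg.norm(query)

    keys = np.stack([e / np.linalg.norm(e) for e, _ in user_history])
    sims = keys @ query                            # cosine similarities
    best = int(np.argmax(sims))

    if sims[best] >= SIMILARITY_THRESHOLD:
        # Close match to a previously logged meal: reuse its labels.
        return user_history[best][1]
    return cnn_model.predict(image)                # otherwise, fall back
```

The nice property is that the "memory" (the per-user key-value store) grows and updates on every logged meal without touching the model weights.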