Is anyone doing embeddings<>embeddings mapping?<p>For example, mapping embeddings of Llama to GPT-3?<p>That way you could see how similarly the models “understand the world”.
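A minimal sketch of what such a mapping could look like, assuming you already have the same batch of sentences embedded by both models (the data below is random placeholder arrays, and the shapes are just illustrative): fit a linear map from one space to the other with least squares and see how well it transfers.

    import numpy as np

    # Assumption: the same N sentences embedded by two models (placeholder random data here).
    rng = np.random.default_rng(0)
    llama_emb = rng.normal(size=(1000, 4096))   # e.g. Llama hidden-state embeddings
    gpt_emb = rng.normal(size=(1000, 1536))     # e.g. text-embedding-ada-002 vectors

    # Fit W minimizing ||llama_emb @ W - gpt_emb||_F (a plain linear alignment).
    W, *_ = np.linalg.lstsq(llama_emb, gpt_emb, rcond=None)

    # Cosine similarity between mapped Llama vectors and the actual GPT vectors gives a
    # rough measure of how much structure the spaces share (use held-out pairs in practice).
    mapped = llama_emb @ W
    cos = (mapped * gpt_emb).sum(axis=1) / (
        np.linalg.norm(mapped, axis=1) * np.linalg.norm(gpt_emb, axis=1))
    print("mean cosine after mapping:", cos.mean())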
Could anyone point me towards a relatively beginner-friendly guide to doing something like:<p>> download all my tweets (about 20k) and build a semantic searcher on top?<p>Also, how can I use third-party embeddings with OpenAI's LLM API? Am I correct in understanding from this article that this is possible?
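For the tweet-search part, a minimal sketch of the whole pipeline, assuming the archive has already been parsed into a tweets.json list of {"id", "text"} objects (the file name, field names, and model choice are all placeholders):

    import json
    import numpy as np
    from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

    tweets = json.load(open("tweets.json"))          # assumed: [{"id": ..., "text": ...}, ...]
    texts = [t["text"] for t in tweets]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # small open model; any embedder works
    emb = model.encode(texts, normalize_embeddings=True)   # ~20k x 384, fits in memory

    def search(query, k=10):
        q = model.encode([query], normalize_embeddings=True)[0]
        scores = emb @ q                             # cosine similarity (vectors are unit-norm)
        return [(float(scores[i]), texts[i]) for i in np.argsort(-scores)[:k]]

    for score, text in search("hot takes about embeddings"):
        print(f"{score:.3f}  {text[:80]}")

On the second question: the embedding model and the LLM are independent. Embeddings are only used to retrieve the relevant tweets or chunks, and whatever you retrieve gets pasted into the prompt as plain text, so third-party embeddings combine fine with OpenAI's LLM API.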
I've done some quick-and-dirty testing with OpenAI's embedding API + Zilliz Cloud. The 1st-gen embeddings leave something to be desired (<a href="https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9" rel="nofollow">https://medium.com/@nils_reimers/openai-gpt-3-text-embedding...</a>), but the 2nd-gen embeddings are actually fairly performant relative to many open-source models trained with an MLM loss.<p>I'll have to dig out the notebook I created for this, but I'll try to post it here once I find it.
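Not that notebook, but the rough shape of this kind of quick test looks something like the sketch below, assuming the pre-1.0 openai Python client; the Zilliz/Milvus insertion step is left out and the sentence pairs are placeholders.

    import numpy as np
    import openai  # pre-1.0 client; expects OPENAI_API_KEY in the environment

    def embed(texts, model="text-embedding-ada-002"):
        # ada-002 is the 2nd gen; the 1st gen was the text-similarity-*/text-search-* family.
        resp = openai.Embedding.create(input=texts, model=model)
        return np.array([d["embedding"] for d in resp["data"]])

    pairs = [
        ("The cat sat on the mat.", "A cat is sitting on a rug."),
        ("The cat sat on the mat.", "Quarterly revenue grew 12%."),
    ]
    for a, b in pairs:
        ea, eb = embed([a, b])
        cos = ea @ eb / (np.linalg.norm(ea) * np.linalg.norm(eb))
        print(f"{cos:.3f}  {a!r} vs. {b!r}")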
Very interested in this - I've been using embeddings / semantic search for information retrieval from PDFs with ada-002, and have been impressed by the results in testing.<p>The reasons the article listed, namely a) lock-in and b) cost, have given me pause about embedding our whole corpus of data. I'd much rather use an open model but don't have much experience in evaluating these embedding models and search performance - it's still very new to me.<p>Like what you did with ada-002 vs Instruct XL, have there been any papers or prior work evaluating the different embedding models?
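Beyond published comparisons, a small eval on your own corpus tends to be the most telling; a sketch of what that could look like is below. The chunks, queries, labels, and model names are all placeholders, and an API embedder like ada-002 could be swapped in for the encode calls.

    import numpy as np
    from sentence_transformers import SentenceTransformer

    chunks = ["...PDF chunk 0...", "...PDF chunk 1...", "...PDF chunk 2...", "...PDF chunk 3..."]
    labeled = [("what does the contract say about fees?", 1),
               ("termination notice period", 3)]     # (query, index of the relevant chunk)

    def recall_at_k(model_name, k=1):
        model = SentenceTransformer(model_name)
        d = model.encode(chunks, normalize_embeddings=True)
        hits = 0
        for query, gold in labeled:
            q = model.encode([query], normalize_embeddings=True)[0]
            hits += int(gold in np.argsort(-(d @ q))[:k])
        return hits / len(labeled)

    for name in ["all-MiniLM-L6-v2", "all-mpnet-base-v2"]:  # swap in the models you care about
        print(name, recall_at_k(name))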
It’s fine to use their embeddings for a proof of concept, but since you don’t own it, you probably shouldn’t rely on it because it could go away at any time.