Interesting, but this claim made me do a double-take: "We demonstrate that Mistral-7B, when fine-tuned solely on synthetic data, attains competitive performance on the BEIR [40] and MTEB [27] benchmarks".

E5-large and BGE-large are an order of magnitude smaller than Mistral-7B. So is this just "bigger model wins" in disguise?

I need to read the whole paper carefully, but this jumped out at me.