> Entropy-weighted similarity: We adjust the similarity scores between query tokens and related documents based on the entropy of each token.

Sounds a lot like BM25-weighted word embeddings (e.g. fastText).
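For comparison, here's a minimal sketch of that kind of weighting: per-token cosine similarities combined with an IDF-style (BM25-like) term weight so that low-information tokens contribute less. This is not the post's implementation; all names and numbers are illustrative, and an entropy-based weight could be dropped in wherever the IDF weight is used.

```python
import numpy as np

def idf_weights(doc_freq: np.ndarray, n_docs: int) -> np.ndarray:
    """BM25-style IDF: rarer terms get larger weights."""
    return np.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)

def weighted_score(query_vecs: np.ndarray,   # (n_terms, dim) per-token embeddings
                   term_weights: np.ndarray, # (n_terms,) IDF or entropy-derived weights
                   doc_vec: np.ndarray) -> float:
    # Cosine similarity of each query token against the document vector,
    # then a weighted average so low-information tokens contribute less.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vec / np.linalg.norm(doc_vec)
    sims = q @ d
    return float(np.dot(term_weights, sims) / term_weights.sum())

# Toy usage: two query tokens, 4-dim embeddings, made-up document frequencies.
rng = np.random.default_rng(0)
q_vecs = rng.normal(size=(2, 4))
weights = idf_weights(np.array([5.0, 900.0]), n_docs=1000)  # rare vs. common term
print(weighted_score(q_vecs, weights, rng.normal(size=4)))
```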
Neat. I wonder how GPT-4’s query expansion might compare with SPLADE or similar masked BERT methods. Also if you really want to go nuts you can apply term expansion to the document corpus.
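If you did expand the corpus, a toy version might look like the sketch below. It assumes some expander supplies the extra terms (an LLM prompt, a SPLADE-style MLM head, or just a synonym table); `expand_terms` and the synonym map are placeholders, not anything from the article.

```python
from collections import Counter

def expand_terms(tokens: list[str]) -> list[str]:
    # Placeholder: a static synonym map stands in for a learned expander.
    synonyms = {"car": ["automobile", "vehicle"], "fast": ["quick"]}
    return [s for t in tokens for s in synonyms.get(t, [])]

def index_with_expansion(docs: list[list[str]]) -> list[Counter]:
    # Each document's term counts include its expansion terms, so queries
    # phrased with related vocabulary can still match at lookup time.
    return [Counter(doc + expand_terms(doc)) for doc in docs]

docs = [["the", "car", "is", "fast"], ["slow", "bicycle"]]
print(index_with_expansion(docs)[0]["automobile"])  # 1: expanded term is indexed
```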
Very cool! Glad to see continued research in this direction. I’ve really enjoyed reading the Mixedbread blog. If you’re interested in retrieval topics, they’re doing some cool stuff.
why do i have this vibe?

https://blogs.perficient.com/2012/09/25/a-mathematical-model-for-assessing-page-quality/