TechEcho

BMX: A Freshly Baked Take on BM25

91 points by breadislove 9 months ago

9 comments

leobg 9 months ago
> Entropy-weighted similarity: We adjust the similarity scores between query tokens and related documents based on the entropy of each token.

Sounds a lot like BM25 weighted word embeddings (e.g. fastText).
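
For readers unfamiliar with the idea mentioned in the comment above, here is a minimal sketch of BM25-weighted word embeddings: each word vector in a document is weighted by that word's BM25 term weight before averaging. The toy two-dimensional vectors, the helper names (`idf`, `bm25_weight`, `weighted_doc_vector`), and the parameters `k1=1.2`, `b=0.75` are assumptions chosen for illustration. This is not BMX's entropy weighting and not fastText's API.

```python
import math
from collections import Counter

def idf(term, docs):
    """BM25 inverse document frequency (variant with +1 inside the log, so it stays positive)."""
    n = sum(1 for d in docs if term in d)
    return math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))

def bm25_weight(term, doc, docs, k1=1.2, b=0.75):
    """BM25 weight of one term within one tokenized document."""
    tf = Counter(doc)[term]
    avgdl = sum(len(d) for d in docs) / len(docs)
    norm = k1 * (1 - b + b * len(doc) / avgdl)
    return idf(term, docs) * tf * (k1 + 1) / (tf + norm)

def weighted_doc_vector(doc, docs, vectors):
    """Weighted average of word vectors, using BM25 term weights."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term in set(doc):
        if term not in vectors:
            continue  # skip out-of-vocabulary terms
        w = bm25_weight(term, doc, docs)
        weight_sum += w
        for i, x in enumerate(vectors[term]):
            total[i] += w * x
    return [x / weight_sum for x in total] if weight_sum else total

# Hypothetical toy corpus and toy 2-D "embeddings" (assumptions, not real data).
docs = [
    ["the", "bread", "is", "fresh"],
    ["the", "search", "is", "fast"],
    ["the", "bread", "search"],
]
vectors = {
    "the": [1.0, 1.0], "is": [1.0, 1.0],
    "bread": [1.0, 0.0], "search": [0.0, 1.0],
    "fresh": [0.9, 0.1], "fast": [0.1, 0.9],
}
vec = weighted_doc_vector(docs[0], docs, vectors)
```

Common words such as "the" receive a small weight, so the resulting vector is pulled toward the rarer, more informative terms ("bread", "fresh") rather than toward the plain average.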
antman 9 months ago
How about computational complexity? There seems to be a small improvement in metrics, but I'm not sure it is enough to justify switching to BMX.
intalentive 9 months ago
Neat. I wonder how GPT-4’s query expansion might compare with SPLADE or similar masked BERT methods. Also, if you really want to go nuts, you can apply term expansion to the document corpus.
deepsquirrelnet 9 months ago
Very cool! Glad to see continued research in this direction. I’ve really enjoyed reading the Mixedbread blog. If you’re interested in retrieval topics, they’re doing some cool stuff.
bernihackernews 9 months ago
baguetter library for the win!
yokee 9 months ago
Super cool! It is definitely a good choice for RAG systems.
timsuchanek 9 months ago
Amazing! When will we have this in the major databases?
herrmannfield 9 months ago
Why do I have this vibe?

https://blogs.perficient.com/2012/09/25/a-mathematical-model-for-assessing-page-quality/
flawn 9 months ago
Gemischtes Brot! ("Mixed bread!")