Word Mover's Embedding: Cheap WMD For Documents

12 points, by vackosar, almost 5 years ago

2 comments

SomewhatLikely, almost 5 years ago
Thanks for introducing a new (to me) idea. I didn't watch the video but I felt the write-up could have been more cohesive. Perhaps just a conclusion to tie all the ideas together. I'm also left wondering why we would use this WME approach over other document embedding techniques (averaging word vectors, paragraph vectors, smooth inverse frequency weighting, etc). Is it faster, gives better similarity estimates, etc.?
gojomo, almost 5 years ago
Interesting idea!

Perhaps the 'random' docs could instead be generated (or even trained) for even greater significance of the new embeddings.

For example: after doing LDA, generate a 'paragon' doc for each topic. Or coalesce all docs sharing a known label, then reduce them to D summary pseudo-words: the D 'words' with minimum total WMD to all docs of that label. Or add further R docs into regions of maximum confusion.
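For context, the construction both comments refer to can be sketched in a few lines. WME represents a document by its Word Mover's Distances to R random (or, per the suggestion above, learned) pseudo-documents, giving a cheap random-features approximation of the WMD kernel. The sketch below is illustrative only: it substitutes the relaxed WMD lower bound (nearest-word flow in both directions) for the exact optimal-transport WMD, and the `gamma` scaling and exponential feature map are assumptions in the spirit of the paper, not its exact recipe.

```python
import numpy as np

def rwmd(X, w_x, Y, w_y):
    """Relaxed Word Mover's Distance lower bound: each word's mass
    flows entirely to its nearest word in the other document; the
    tighter of the two directional costs is returned."""
    # Pairwise Euclidean distances between the two docs' word vectors.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return max(w_x @ D.min(axis=1), w_y @ D.min(axis=0))

def wme_embed(doc, random_docs, gamma=1.0):
    """Word Mover's Embedding sketch: one feature per random
    pseudo-document, exp(-gamma * distance), scaled by 1/sqrt(R)."""
    X, w = doc
    dists = np.array([rwmd(X, w, Z, u) for Z, u in random_docs])
    return np.exp(-gamma * dists) / np.sqrt(len(random_docs))
```

With this in hand, gojomo's proposal amounts to replacing `random_docs` with topic paragons or per-label summary pseudo-words, so each feature measures transport cost to a meaningful anchor rather than to noise.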