Word Mover's Embedding: Cheap WMD For Documents

12 points by vackosar, almost 5 years ago

2 comments

SomewhatLikely, almost 5 years ago
Thanks for introducing a new (to me) idea. I didn't watch the video, but I felt the write-up could have been more cohesive. Perhaps just a conclusion to tie all the ideas together. I'm also left wondering why we would use this WME approach over other document embedding techniques (averaging word vectors, paragraph vectors, smooth inverse frequency weighting, etc.). Is it faster, does it give better similarity estimates, etc.?
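The simplest of the baselines the commenter mentions, averaging word vectors, can be sketched in a few lines. The three toy 3-d vectors below are invented for illustration; a real setup would load pretrained embeddings such as word2vec or GloVe:

```python
import numpy as np

# Toy 3-d word vectors (hypothetical; a real setup would load pretrained
# embeddings such as word2vec or GloVe).
word_vecs = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.0, 0.9, 0.8]),
}

def average_embedding(doc):
    """Baseline document embedding: the mean of the document's word vectors."""
    vecs = [word_vecs[w] for w in doc.split() if w in word_vecs]
    return np.mean(vecs, axis=0)

emb = average_embedding("cat dog")
```

WME's pitch over this baseline is that it keeps word-level transport structure (via WMD) rather than collapsing a document to a single centroid.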
gojomo, almost 5 years ago
Interesting idea!

Perhaps the 'random' docs could instead be generated (or even trained) for even greater significance of the new embeddings.

For example: after doing LDA, generate a 'paragon' doc for each topic. Or coalesce all docs of a known label together, then reduce them to D summary pseudo-words – the D 'words' with minimum total WMD to all docs of the same label. Or add further R docs into regions of maximum confusion.
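The WME construction under discussion – embedding a document by its WMD to a set of R anchor ('random') documents – can be sketched as below. The toy word vectors are invented for illustration, and the exact-WMD shortcut relies on a special case: for two equal-length documents with uniform word weights, WMD reduces to a minimum-cost assignment over word pairs. Real WME uses a full optimal-transport solver and pretrained embeddings:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy 3-d word vectors (hypothetical; real WME would use pretrained embeddings).
word_vecs = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.0, 0.9, 0.8]),
    "bus": np.array([0.1, 0.8, 0.9]),
}

def wmd(doc_a, doc_b):
    """WMD between two equal-length docs with uniform word weights.
    In this special case the transport problem reduces to a
    minimum-cost assignment over word pairs."""
    A = np.array([word_vecs[w] for w in doc_a])
    B = np.array([word_vecs[w] for w in doc_b])
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

def wme_embedding(doc, anchor_docs, gamma=1.0):
    """One coordinate per anchor doc: a soft similarity exp(-gamma * WMD),
    scaled by 1/sqrt(R) as in random-feature kernel approximations."""
    dists = np.array([wmd(doc, a) for a in anchor_docs])
    return np.exp(-gamma * dists) / np.sqrt(len(anchor_docs))

anchors = [["cat", "dog"], ["car", "bus"]]
emb = wme_embedding(["cat", "dog"], anchors)  # doc is identical to first anchor
```

Swapping the toy anchors for LDA 'paragon' docs or per-label pseudo-word summaries, as the comment suggests, only changes `anchors`; the embedding machinery stays the same.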