TechEcho
The lsm command for Latent Semantic Mapping

74 points by lars512 almost 14 years ago

7 comments

lars512 almost 14 years ago
Latent semantic mapping is a technique which takes a large number of text documents, maps them to term frequency vectors (vector-space semantics), and performs dimensionality reduction into a smaller semantic space. This then lets you determine how similar in meaning different documents are. You can use this for a variety of tasks.

Wikipedia: Latent Semantic Mapping http://en.wikipedia.org/wiki/Latent_semantic_mapping

WWDC 2011 talk, now available: "Latent semantic mapping: exposing the meaning behind words and documents" https://developer.apple.com/videos/wwdc/2011/
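To make the three steps above concrete (term-frequency vectors, dimensionality reduction, similarity), here is a minimal pure-Python sketch. It is not how the `lsm` tool or Apple's framework is implemented: the reduction is approximated with power iteration on the term co-occurrence matrix, and every function name and the toy documents are made up for illustration.

```python
import math
from collections import Counter

def term_frequency_matrix(docs):
    """Return (vocab, rows), where rows[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([float(counts[term]) for term in vocab])
    return vocab, rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0

def top_directions(rows, k, iterations=200):
    """Approximate the top-k right singular vectors of the term-frequency
    matrix using power iteration with deflation (an illustrative stand-in
    for a proper SVD routine)."""
    n_terms = len(rows[0])
    directions = []
    for d in range(k):
        # Arbitrary deterministic start vector, different for each direction.
        v = [1.0 + 0.01 * ((7 * i + 3 * d) % 11) for i in range(n_terms)]
        for _ in range(iterations):
            # Remove any component along directions already found.
            for u in directions:
                overlap = dot(v, u)
                v = [x - overlap * y for x, y in zip(v, u)]
            # Multiply by A^T A: first A v, then A^T (A v).
            av = [dot(row, v) for row in rows]
            v = [sum(rows[i][j] * av[i] for i in range(len(rows)))
                 for j in range(n_terms)]
            norm = math.sqrt(dot(v, v))
            if norm == 0.0:
                break
            v = [x / norm for x in v]
        directions.append(v)
    return directions

def project(rows, directions):
    """Map each document's term vector into the reduced semantic space."""
    return [[dot(row, u) for u in directions] for row in rows]

# Toy corpus (hypothetical): two documents about pets, two about finance.
docs = [
    "cat dog pet animal",
    "dog pet puppy animal",
    "stock market trading shares",
    "market shares investor stock",
]
vocab, rows = term_frequency_matrix(docs)
coords = project(rows, top_directions(rows, k=2))

print("raw similarity, docs 0 and 1:    ", cosine(rows[0], rows[1]))
print("reduced similarity, docs 0 and 1:", cosine(coords[0], coords[1]))
print("reduced similarity, docs 0 and 2:", cosine(coords[0], coords[2]))
```

In the reduced space the two pet documents end up closer together than their raw term vectors are, while a pet document and a finance document stay essentially unrelated. A real system would use a proper tokenizer, term weighting, and a library SVD rather than this hand-rolled loop.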
wooster almost 14 years ago
We used this, when I was at Apple, to make the Parental Controls web content filter (which I worked on), among other things. It works surprisingly well.
spitfire almost 14 years ago
I just can't ever see Microsoft shipping something like this and making it available to every user. This sort of quiet progress is why I like Apple. Sure, they highlight the glossy stuff, but below the surface there's so much blood-and-guts progress.
pepijndevos almost 14 years ago
So is anything like this available on other platforms? Because it's way faster than http://classifier.rubyforge.org/, even with rb-gsl installed. I'd love it for generating related posts on my Jekyll blog.
yters almost 14 years ago
How have you used this? Looks pretty interesting.
samg_ almost 14 years ago
I've been playing with some clustering stuff in my free time for the past few months.

What I've found is that the problem seems to get a lot more reasonable if you know how many clusters there are.

K-Means requires this information, but afaict agglomerative techniques don't. I wonder why this tool's agglomerative clustering method requires the number of clusters as an argument.
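One possible way to look at the question: agglomerative clustering only produces a sequence of merges, so something still has to decide where to stop, and a target cluster count is one common stopping rule (a distance threshold is another). The sketch below shows both rules with average linkage in plain Python. It is only an illustration, not the `lsm` tool's actual method; the function name, parameters, and sample points are all hypothetical.

```python
import math

def agglomerate(points, n_clusters=None, max_distance=None):
    """Average-linkage agglomerative clustering over a list of coordinate tuples.

    Merging stops when only n_clusters clusters remain, or when the closest
    pair of clusters is farther apart than max_distance. At least one of the
    two stopping rules must be given. Returns clusters as lists of indices.
    """
    if n_clusters is None and max_distance is None:
        raise ValueError("give n_clusters, max_distance, or both")

    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between members of the two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        # Find the closest pair of clusters.
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if max_distance is not None and dist > max_distance:
            break
        # Merge cluster j into cluster i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical sample: two tight pairs and one far-away outlier.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]

print(agglomerate(points, n_clusters=2))      # stop at a fixed cluster count
print(agglomerate(points, max_distance=5.0))  # stop at a distance threshold
```

With a fixed count the outlier gets its own cluster and everything else is forced together; with a distance threshold the number of clusters falls out of the data instead. Either way some cut-off has to be supplied, which may be why a tool would ask for the count up front.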
codeape almost 14 years ago
Is it available on Linux?