Latent semantic mapping is a technique that takes a large collection of text documents, maps them to term-frequency vectors (vector-space semantics), and performs dimensionality reduction into a smaller semantic space. This lets you measure how similar in meaning different documents are, which is useful for tasks like classification, clustering, and search.<p>Wikipedia: Latent Semantic Mapping
<a href="http://en.wikipedia.org/wiki/Latent_semantic_mapping" rel="nofollow">http://en.wikipedia.org/wiki/Latent_semantic_mapping</a><p>WWDC 2011 talk, now available: "Latent semantic mapping: exposing the meaning behind words and documents"
<a href="https://developer.apple.com/videos/wwdc/2011/" rel="nofollow">https://developer.apple.com/videos/wwdc/2011/</a>
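The pipeline described above (term counts, then SVD-based dimensionality reduction, then similarity in the latent space) can be sketched in a few lines. This is a generic latent-semantic-analysis toy in Python/NumPy, not Apple's LSM implementation; the data is made up for illustration.

```python
import numpy as np

# Toy term-document matrix: rows = documents, columns = term counts.
# Illustrative data only; a real pipeline builds this from a corpus.
docs = np.array([
    [2, 1, 0, 0],   # document about cats
    [1, 2, 0, 0],   # another cat document
    [0, 0, 1, 2],   # document about cars
], dtype=float)

# SVD, then keep only the top-k singular directions -- this truncation
# is the "smaller semantic space".
U, s, Vt = np.linalg.svd(docs, full_matrices=False)
k = 2
reduced = U[:, :k] * s[:k]          # documents embedded in k dimensions

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The two cat documents come out far more similar to each other
# than either is to the car document.
print(cosine(reduced[0], reduced[1]))  # ~1.0
print(cosine(reduced[0], reduced[2]))  # ~0.0
```

In a real system you would also weight the counts (e.g. tf-idf) before the SVD, but the shape of the computation is the same.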
When I was at Apple, we used this to build the Parental Controls web content filter (which I worked on), among other things. It works surprisingly well.
I just can't see Microsoft shipping something like this to every user. This sort of quiet progress is why I like Apple: sure, they highlight the glossy stuff, but beneath the surface there's so much blood-and-guts progress.
So is anything like this available on other platforms? Because it's way faster than <a href="http://classifier.rubyforge.org/" rel="nofollow">http://classifier.rubyforge.org/</a> , even with rb-gsl installed. I'd love it for generating related posts on my Jekyll blog.
I've been playing with some clustering stuff in my free time for the past few months.<p>What I've found is that the problem gets a lot more tractable if you know how many clusters there are.<p>K-means requires this information up front, but as far as I can tell agglomerative techniques don't need it. I wonder why this tool's agglomerative clustering method requires the number of clusters as an argument.
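For what it's worth, in agglomerative clustering the cluster count is usually just a convenient stopping rule: merge the two closest clusters until that many remain. The same loop could instead stop when the closest pair exceeds a distance threshold, which is why the count isn't strictly required. A minimal single-linkage sketch in Python/NumPy (not the tool's actual implementation, toy data made up for illustration):

```python
import numpy as np

def agglomerate(points, n_clusters):
    """Naive single-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters; n_clusters is only
    the stopping criterion. Swapping the while-condition for a
    distance threshold would remove the need to know the count.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])   # merge b into a
        del clusters[b]
    return clusters

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(agglomerate(pts, 2))  # two tight pairs: [[0, 1], [2, 3]]
```

This is O(n^3) and only for illustration; real implementations maintain a priority queue of pairwise distances.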