It seems like your cluster quality will be sensitive to the words used to seed each cluster.

Why not use a standard word clustering algorithm like Brown clustering? http://acl.ldc.upenn.edu/J/J92/J92-4003.pdf

Percy Liang wrote a great implementation in C++ that you could plug into your visualization: http://cs.stanford.edu/~pliang/software/

Also of interest is that Brown clustering is hierarchical, so you can get coarse or fine-grained clusters.

[Aside: here are some 2-D visualizations I made of word embeddings from a neural language model: http://metaoptimize.com/projects/wordreprs/ ]
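If it helps, here is a minimal sketch of how that hierarchical output could feed a coarse-or-fine view. It assumes the "paths" file format that Liang's brown-cluster tool writes (one tab-separated line per word: bit-string, word, count); the file name and prefix length below are just placeholders, so adjust to whatever your run actually produces:

    # Group words from a Brown clustering "paths" file at a chosen granularity.
    # Assumption: each line looks like "<bit-string>\t<word>\t<count>";
    # tweak the parsing if your output differs.
    from collections import defaultdict

    def load_clusters(paths_file, prefix_len=None):
        """Group words by their Brown cluster bit-string.

        Truncating the bit-string to prefix_len merges sibling clusters into
        coarser ones; prefix_len=None keeps the full fine-grained clusters.
        """
        clusters = defaultdict(list)
        with open(paths_file, encoding="utf-8") as f:
            for line in f:
                bitstring, word, _count = line.rstrip("\n").split("\t")
                key = bitstring[:prefix_len] if prefix_len else bitstring
                clusters[key].append(word)
        return clusters

    # Hypothetical usage:
    # fine = load_clusters("paths")                  # full-depth clusters
    # coarse = load_clusters("paths", prefix_len=4)  # merge down to 4-bit prefixes

Truncating the bit-string prefix is the usual way to trade off cluster granularity without re-running the clustering.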
First of all, great work and thanks for sharing!

I guess I know less about NLP and clustering than I thought, but what exactly does the visualization indicate? On Iteration 1/3, when I click "husband" in the sidebar and "first" shows up, what does that mean? That it's the closest cluster by distance?

The visualization looks nice, but the accompanying text doesn't shed much light...