TechEcho

7 comments

I implemented the new t-SNE in sklearn, so I've got some experience reading these diagrams. Unfortunately, as wonderful as the algorithm is, it's extremely hard to interpret rigorously what a plot means. I've seen many diagrams that look like this one, and they were generated from actual noise. So take the plots with a big grain of salt :)

I'd be interested in seeing more direct evidence, like factorizing the PMI matrix with SVD (which is similar to what word2vec is doing) and seeing how much of the variance is explained by the first components. If you want to do this, check out: https://minhlab.wordpress.com/2015/06/08/a-new-proof-for-the-equivalence-of-word2vec-skip-gram-and-shifted-ppmi/
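A rough sketch of the PMI/SVD check suggested above, in plain Python so it runs without extra libraries. Everything here is an assumption made for illustration: the function names `ppmi_matrix` and `top_component_share`, the window size, the toy token list, and the use of power iteration in place of a full SVD. "Variance explained" is approximated as the top singular value squared over the squared Frobenius norm, with no centering.

```python
import math
from collections import Counter


def ppmi_matrix(tokens, window=2):
    """Dense positive-PMI matrix from symmetric window co-occurrence counts."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    pair_counts = Counter()
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(index[w], index[tokens[j]])] += 1
    total = sum(pair_counts.values())
    # Counts are symmetric, so row sums double as column sums.
    row = Counter()
    for (a, _), c in pair_counts.items():
        row[a] += c
    n = len(vocab)
    m = [[0.0] * n for _ in range(n)]
    for (a, b), c in pair_counts.items():
        pmi = math.log(c * total / (row[a] * row[b]))
        m[a][b] = max(0.0, pmi)  # clip negative PMI to zero
    return vocab, m


def top_component_share(m, iters=200):
    """Share of the squared Frobenius norm captured by the top singular value.

    Uses power iteration on M^T M instead of a full SVD.
    """
    n = len(m)
    frob = sum(x * x for r in m for x in r)
    if frob == 0.0:
        return 0.0
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        u = [sum(r[j] * v[j] for j in range(n)) for r in m]            # u = M v
        w = [sum(m[i][j] * u[i] for i in range(n)) for j in range(n)]  # w = M^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    u = [sum(r[j] * v[j] for j in range(n)) for r in m]
    return sum(x * x for x in u) / frob  # sigma_1^2 / ||M||_F^2


# Toy placeholder text; a real run would use the transcribed manuscript tokens.
toy_tokens = "the cat sat on the mat the dog sat on the rug".split()
toy_vocab, toy_m = ppmi_matrix(toy_tokens)
print(len(toy_vocab), top_component_share(toy_m))
```

A share close to 1 would mean a single direction dominates the co-occurrence structure. Running the same check on a shuffled copy of the tokens gives a noise baseline to compare against.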

vonnik over 9 years ago

I think this approach has a lot of potential, and I wonder what a statistical comparison of character co-occurrences between the Voynich manuscript and other writing systems would reveal. For anyone curious, here is Stephen Bax's video on his 2014 findings: https://m.youtube.com/watch?index=1&v=fpZD_3D8_WQ&list=LLATcCtXq6Eg7iFjmWQ1CNkA

He believes he has translated about 10 words in the manuscript, which is huge, and he thinks the script may have been invented to express a language once spoken between the Near East and the Himalayas, maybe Turkic or Caucasian...
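One way such a comparison could be set up is sketched below. Raw bigram frequencies are not directly comparable across different alphabets, so this sketch compares two alphabet-independent summaries instead: the entropy of single characters, and the conditional entropy of a character given the one before it. That choice, the helper names, and the decision to count bigrams only within words are assumptions for illustration, not anything proposed in the thread.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a Counter of event counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def char_entropies(text):
    """Return (H(char), H(next char | previous char)) for a text.

    Bigrams are counted within whitespace-separated words only.
    """
    unigrams, bigrams, firsts = Counter(), Counter(), Counter()
    for word in text.lower().split():
        unigrams.update(word)
        for a, b in zip(word, word[1:]):
            bigrams[(a, b)] += 1
            firsts[a] += 1
    h1 = entropy(unigrams)
    # H(Y | X) = H(X, Y) - H(X), with X the first character of each bigram.
    h_cond = entropy(bigrams) - entropy(firsts)
    return h1, h_cond


# Placeholder strings; real inputs would be a Voynich transliteration and
# comparison corpora in other scripts.
for label, sample in [("sample A", "abab abab"), ("sample B", "the quick brown fox")]:
    print(label, char_entropies(sample))
```

A low conditional entropy relative to the single-character entropy means each character is highly predictable from its predecessor. Comparing that gap across corpora is meaningful even when the alphabets share no symbols.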

danharaj over 9 years ago

You know, I really like this, because it's an example of the kind of structure machine learning finds without my own understanding of the training set clouding my understanding of the machine's understanding.

haddr over 9 years ago

Very interesting approach, but I would say this only scratches the surface. There are several factors that might really limit statistical analysis of this manuscript [1].

[1] http://www.ciphermysteries.com/2013/03/09/this-week-a-talk-at-stanford-on-the-voynich-manuscript

lawpoop over 9 years ago

>>> model.most_similar("queen")
[(u'princess', 0.519856333732605), (u'latifah', 0.47644317150115967),
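For readers unfamiliar with this kind of query: the call above looks like a gensim-style word2vec model (an assumption, since the comment does not name the library), where each score is the cosine similarity between the query word's vector and a neighbour's vector. The sketch below shows that ranking on made-up toy vectors; `nearest_by_cosine` and `toy_vectors` are hypothetical names, not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_by_cosine(vectors, word, topn=10):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Made-up 2-dimensional vectors, for illustration only.
toy_vectors = {
    "queen": [0.9, 0.4],
    "princess": [0.8, 0.5],
    "carrot": [-0.2, 0.9],
}
print(nearest_by_cosine(toy_vectors, "queen", topn=2))
```

Because the ranking depends only on vector direction, a neighbour such as "latifah" can score highly purely from co-occurrence in the training text, whatever a human reader would consider semantically close.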

splitbrain over 9 years ago

This is the first time I've heard of the cultural extinction theory. If that were the case, shouldn't there be more documents using the same script? But assuming the theory is right, is there any way to decipher it without finding a Rosetta Stone?

acqq over 9 years ago

Do we get any new insight with this?

juxtaposicion over 9 years ago

Voynich Manuscript: word vectors and t-SNE visualization of some patterns
