I did my master's thesis on authorship verification, which is exactly this problem (deciding whether two texts were written by the same author).<p>I experimented with clustering, SVMs, neural nets, etc. for a long time, and got mostly disappointing results.<p>Even when "modern methods" give very high confidence scores, the problem is messy and complicated, and in supervised learning scenarios the training data usually differs enough from the actual data to call the results into question.<p>I don't have access to the paper, so I can't say much more, but I've seen a lot of very good-looking results that are in fact questionable.<p>Still a fascinating problem!
Is this the same discipline that thinks Columbus was from Northeastern Spain?<p><a href="https://en.wikipedia.org/wiki/Origin_theories_of_Christopher_Columbus#Catalan_hypothesis" rel="nofollow">https://en.wikipedia.org/wiki/Origin_theories_of_Christopher...</a><p>Toilet paper.