I was partway through reading this when I thought to check who the author is. Of course, by that time the site had gone down, so I haven't read the whole thing yet.<p>But it seems to me that the author is falling into a trap that catches many an unwary data "scientist": not understanding the discipline of Statistics.<p>When one has data on the entire population (i.e. a census) rather than a sample, there is no point in carrying out statistical tests.<p>If I know <i>ALL</i> the words spoken by someone, then I know which words they say the most simply by counting, without resorting to any tests.<p>No concept of "statistical significance" applies, because there is no sample and hence no sampling error. We can calculate the population value of any parameter we can think of because we have the entire population (in this specific instance, <i>ALL</i> the words spoken by all the characters).<p>FYI, all budding data "scientists" ...