Freely available preprint: <a href="https://arxiv.org/abs/2410.16107" rel="nofollow">https://arxiv.org/abs/2410.16107</a><p>I'm biased (as an author), but I think Figure 3 is the key result, along with Tables 4 and 5 in the appendix (or Table 1 in the published version). The grammatical and rhetorical differences in the models, even when asked to match an example of human text, are remarkably large. They quantify that uncanny feeling you get when encountering LLM-written text in the wild.