To me, this response by the authors to the reviews looks like they didn't really understand the objection:

> First, it's worth noting that different reviewers sometimes gave opposite critiques of the paper, e.g.
>
> Reviewer erh8: "The conclusion in this paper is questionable... It contradicts to [1], which shows that Transformers are Turing-complete"
>
> Reviewer bz3o: "The main claim that transformers with finite attention span are not computationally universal is both somewhat obvious and previously already stated"

If I'm reading the reviews correctly, both reviewers claimed that transformers *are* actually Turing complete, but one of them added that they're "obviously" not Turing complete if you restrict their memory a priori (which I would agree is obvious: a machine whose memory is bounded by a fixed constant is just a finite-state machine). So there isn't really a contradiction between the reviews.

From briefly skimming the paper, it does indeed look to me like researchers who aren't really familiar with theoretical CS trying to hand-wave their way to something that looks ground-breaking. You can get away with vague-ish descriptions in the more experimental parts of CS, but you absolutely can't get away with them in computability theory: that field is rigorous, and basically maths.
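
To spell out the "obvious" direction with the rigor that field expects (a minimal counting sketch; the precision bound $p$, hidden width $d$, and window length $n$ here are my own assumptions for illustration, not parameters taken from the paper):

\[
\#\{\text{reachable configurations}\} \;\le\; \left(2^{p}\right)^{d \cdot n} \;=\; 2^{p d n}
\]

If every activation is stored in $p$ bits and the model only ever conditions on a window of $n$ tokens through a width-$d$ state, the number of distinguishable configurations is bounded by the constant $2^{pdn}$, independent of the input length. A machine ranging over constantly many configurations is a finite automaton, and finite automata can't even recognize $\{a^k b^k \mid k \ge 0\}$, let alone simulate an arbitrary Turing machine.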