Key questions:<p>1. The key data point seems to be Figure 6a, which compares performance on BABILong and claims Titans reaches ~62% accuracy versus ~42% for GPT-4o-mini at a 100k sequence length.<p>However, GPT-4o and Claude are missing from this comparison, perhaps because they perform better?<p>2. No example of the Neural Memory Module in action is provided. That would be my first question for this paper.
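For context, here is a toy sketch of what "in action" might look like, going off the paper's high-level description: the memory is updated at test time by gradient steps on an associative loss, with momentum acting as accumulated "surprise" and weight decay as forgetting. I'm using a linear memory matrix instead of the paper's MLP, and all hyperparameter values are my own assumptions, not numbers from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # key/value dimension (toy size)

# Linear associative memory: retrieval is v_hat = M @ k.
# (The paper uses a deep MLP; a single matrix is the simplest case.)
M = np.zeros((d, d))
S = np.zeros((d, d))  # momentum term ("past surprise")
theta, eta, alpha = 0.3, 0.5, 0.01  # lr, momentum, forgetting (assumed values)

def write(k, v):
    """One test-time update: a gradient step on the surprise loss ||M k - v||^2."""
    global M, S
    grad = 2.0 * np.outer(M @ k - v, k)  # d/dM of ||M k - v||^2
    S = eta * S - theta * grad           # momentum accumulates surprise
    M = (1.0 - alpha) * M + S            # weight decay acts as forgetting

def read(k):
    return M @ k

# Memorize one random (key, value) pair by streaming it repeatedly.
k = rng.standard_normal(d)
k /= np.linalg.norm(k)
v = rng.standard_normal(d)
for _ in range(50):
    write(k, v)

print(np.linalg.norm(read(k) - v))  # small residual: the pair was memorized
```

The interesting part is that "memory" here is just ordinary gradient descent run during inference rather than training, which is why an end-to-end worked example in the paper would have helped.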