> Our extensive evaluations with diagnostic and realistic multi-modal long-context benchmarks show that 1.5 Pro is able to maintain near-perfect recall on multi-modal versions of needle-in-a-haystack (see Section 4.2.1.2) and is able to effectively use its context to retrieve and reason over large amounts of data. This enables the model to perform realistic long-context tasks such as long-document QA from 700k-word material and long-video QA from videos between 40 and 105 minutes long. Finally, 1.5 Pro has the ability to use in-context learning to translate from English to Kalamang, an extremely low-resource language with fewer than 200 speakers (Visser, 2020b). This capability is achieved solely by
> providing a grammar manual in its context at inference time, which demonstrates Gemini 1.5 Pro's remarkable ability to in-context learn from information it has never seen before at training time.

Very impressive. Almost total recall within a 1 million token context.
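
If you want to try the grammar-manual trick yourself, here's a minimal sketch using the google-generativeai Python SDK. The file path, model name string, and prompt wording are my own assumptions, not anything taken from the report:

```python
# Minimal sketch: in-context learning from a reference document.
# The grammar file path and its contents are hypothetical; assumes
# a GOOGLE_API_KEY environment variable is set.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro-latest")

# Load the entire manual into the prompt; a 1M-token context window
# is what makes stuffing a whole book in here feasible.
with open("kalamang_grammar.txt", encoding="utf-8") as f:
    grammar_manual = f.read()

prompt = (
    "Below is a grammar manual for Kalamang.\n\n"
    f"{grammar_manual}\n\n"
    "Using only the material above, translate the following English "
    "sentence into Kalamang: 'The children are swimming in the river.'"
)

response = model.generate_content(prompt)
print(response.text)
```

The point is that there's no fine-tuning step anywhere: the reference material simply rides along in the prompt at inference time.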