I was curious how good a transcription I could get from what may currently be the best multimodal LLM, Gemini-1.5-Pro-Experiment-0801, so I had it transcribe five minutes of an interview between Ezra Klein and Nancy Pelosi from earlier today. The results are here:<p><a href="https://www.gally.net/temp/20240809geminitranscription/index.html" rel="nofollow">https://www.gally.net/temp/20240809geminitranscription/index...</a><p>Aside from some minor punctuation and capitalization issues, Gemini’s transcription looks nearly perfect to me. There were only one or two words that I think it misheard. If I had transcribed the audio myself, I would have made more mistakes than that.<p>One passage struck me in particular:<p><pre><code> And then he comes up with "weird," which becomes viral and the rest, and here he is.
</code></pre>
How did Gemini know to put “weird” in quotation marks, to indicate—correctly—that the speaker was referring to Walz’s use of the word as a word? According to Politico, Walz first used the word in that context in the media on July 23.<p><a href="https://www.politico.com/news/2024/07/26/trump-vance-weird-00171470" rel="nofollow">https://www.politico.com/news/2024/07/26/trump-vance-weird-0...</a>