This is the source article for the claims about Whisper, but it isn't very clear about how exactly Whisper was hallucinating in medical transcripts, beyond a couple of examples. Without a description of the actual audio sample, you can't tell what's a transcription error (to be expected) and what's a hallucination.

It's also unclear what counts as a 'hallucination' here. Is it just mis-transcribing unclear audio, or, as with ChatGPT, is it inserting random or targeted nonsense into otherwise legible bits of conversation?

Misheard sentences are definitely not something you want in medical transcription, but they're also an understandable problem that humans aren't immune to either (I'm sure QC in human-led medical transcription works hard to avoid this).

But for everyone who uses Whisper, yeah, there's some due diligence you're reasonably expected to do:
If you have bad or unclear audio in places, you should be cross-checking the recording to make sure the transcription is accurate.

However, that may not be possible with Nabla's tool, according to the article:

> It’s impossible to compare Nabla’s AI-generated transcript to the original recording because Nabla’s tool erases the original audio for “data safety reasons,” Raison said.

Granted, that's bad design, and it's fixable. My real concern is whether Whisper is outright hallucinating in random places or just mis-transcribing tough audio. The former makes the tool completely unusable, since you'd have to cross-check every word. The latter means you just keep up your regular diligence.

> Researchers aren’t certain why Whisper and similar tools hallucinate, but software developers said the fabrications tend to occur amid pauses, background sounds or music playing.
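To make that diligence a bit more systematic: here's a minimal sketch of the kind of cross-checking I mean, assuming the open-source openai-whisper Python package (not whatever Nabla actually runs) and thresholds I've picked arbitrarily. It flags segments worth re-listening to, i.e. ones Whisper itself marks as likely non-speech or transcribes with low average log-probability, which lines up with the report that fabrications cluster around pauses and background noise.

    import whisper

    # Load a model and transcribe; "base" and the file name are just placeholders.
    model = whisper.load_model("base")
    result = model.transcribe("visit_recording.wav")

    # Arbitrary thresholds -- tune them against your own audio.
    NO_SPEECH_THRESHOLD = 0.5   # segment is probably a pause / background noise
    LOGPROB_THRESHOLD = -1.0    # model was unsure about its own output

    for seg in result["segments"]:
        suspicious = (
            seg["no_speech_prob"] > NO_SPEECH_THRESHOLD
            or seg["avg_logprob"] < LOGPROB_THRESHOLD
        )
        if suspicious:
            # Send these spans back to a human to re-listen.
            print(f"REVIEW {seg['start']:.1f}-{seg['end']:.1f}s: {seg['text']}")

None of this catches a confident-sounding hallucination dropped into clean audio, of course, which is exactly why the random-insertion case would be so much worse than the misheard-audio case.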