TechEcho

Does it do speaker recognition/diarization? I can't see it in the repo README.

GH repo: <a href="https://github.com/aiola-lab/whisper-medusa">https://github.com/aiola-lab/whisper-medusa</a>

I'm curious which of the Whisper derivatives is actually the fastest. faster-whisper claims a 4x speedup over base Whisper, and I've found WhisperX to be faster still (for longer audio, where it can do batch inference), at least on consumer GPUs. So with AiOla saying "50% speedup", is that actually noteworthy?

IIRC Whisper works on WAV files. Can this do real-time, low-latency continuous ASR?

Nothing of interest here, it's an ad. If you're interested, you might as well check out Gladia; at least they have a pricing section and let you use it as a developer, rather than just asking you to "Request a Demo". And while a sibling comment links to the GitHub repository, their entire website does not contain such a link.
Edit: My bad, for some reason I first checked the website instead of the blog post. Looks much more interesting now.

AiOla open-sources ultra-fast ‘multi-head’ speech recognition model

5 comments
