Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller

277 points by omarfarooq over 1 year ago

16 comments

FL33TW00D over 1 year ago
Super exciting! I'll be shipping Distil-Whisper to whisper-turbo tomorrow! https://github.com/FL33TW00D/whisper-turbo

Should make running in the browser feasible even for underpowered devices: https://whisper-turbo.com/
asplake over 1 year ago
It’s a shame that the README doesn’t link to the original Whisper, or at least not prominently. There’s the etiquette, but also I still don’t really know what this does.
jankovicsandras over 1 year ago
I'm using this: https://github.com/guillaumekln/faster-whisper Smaller, faster, works well with CPU, multiple languages, etc.
cjdell over 1 year ago
I wonder if it's fast enough for wake-word detection in WASM. Picovoice worked extremely well for this, but it's proprietary.
GaggiX over 1 year ago
It seems they have only distilled on English data, so the distil-large-v2 model will probably perform badly on any other language. We'll see tomorrow, when they release their models.
mkl over 1 year ago
> performs within 1% WER

From the paper, for short-form audio:

> the distil-large-v2 model achieves the lowest overall average WER of 10.1%. It is one percentage point higher than the large-v2 baseline, with 5.8 times faster inference speed and fewer than half the parameters.

Long-form is similar, except Distil-Whisper does slightly better than Whisper (fewer hallucinations, apparently).

10% WER seems awfully high, and doesn't match my experience with Whisper. Maybe my audio is nice and clean relative to their test set?
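
For anyone unsure what the WER figures quoted above measure: word error rate is commonly defined as the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model output, divided by the number of reference words. Below is a minimal sketch of that definition, not the paper's evaluation code. The function name `wer` and the plain whitespace tokenization are assumptions for illustration; real evaluations typically normalize casing and punctuation first, which can change the number noticeably.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Illustrative helper (hypothetical name). Assumes whitespace tokenization
    and no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of ten reference words.
    print(wer("a b c d e f g h i j", "a b c d e f g h i x"))
```

Under this definition, a 10% WER means roughly one word-level error per ten reference words, so the figure depends heavily on how noisy the test audio is and how the text is normalized before scoring.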
regularfry over 1 year ago
Funnily enough, `-small`, `-base` and `-tiny` versions of this would be more exciting to me. `small.en` is the largest of the original Whisper models that will run at anywhere near usable speed on a Raspberry Pi Zero 2 with whisper.cpp, and it's still too slow to really bother with for streaming. Anything smaller is too inaccurate for day-to-day use. If there was a distilled version which had a similar 6x speedup, that would be transformative.
yjftsjthsd-h over 1 year ago
On a partially-related note, has anyone packaged any version of whisper as an Android keyboard? It seems like a reasonably good fit, and I would be interested to see if it worked better than the deteriorating quality of Google's offering. I think it would work even with the existing versions, but a faster+smaller version would obviously be a better fit for running on phone hardware.
zaptrem over 1 year ago
How much faster in real wall-clock time is this on batched data than https://github.com/m-bain/whisperX ?
api over 1 year ago
Is there a good project out there that pairs whisper with something like llama.cpp to create a private local voice assistant?

Llama2 isn't as good as GPT-4 but it's a hell of a lot smarter at Q&A than Siri or Alexa or any of those things.

PSA: I will pay for such a thing if it's really good, privacy respecting, local-first, and preferably at least source available.
iAkashPaul over 1 year ago
I've tried the large-v2 on the translate task, but the results aren't great. I guess there needs to be another round of distillation with the translate task thrown in too.
VadimPR over 1 year ago
Does anyone know if it is possible to fine-tune the whisper models to add new words? Say, brand names it doesn't yet know about?
pkoird over 1 year ago
I have not read the paper yet, but why do they only cut the decoder and not the encoder?
asadm over 1 year ago
English only it seems :(
siva7 over 1 year ago
Hm, isn't this problematic from a trademark POV?
spandextwins over 1 year ago
Nice! But next time do the press release when the product is released. Really tired of sites like HN pushing these stories out without any code or files. Feels like vaporware.