Insanely Fast Whisper

166 points by pr337h4m, over 1 year ago

12 comments

siraben, over 1 year ago
One feature of Whisper I think people underuse is the ability to prompt the model to influence the output tokens. This can be used to correct spelling or context-dependent words. Some examples from my terminal history:

    ./main -m models/ggml-small.bin -f alice.wav --prompt "Audiobook reading by a British woman:"
    ./main -m models/ggml-small.bin -f output.wav --prompt "Research talk by Junyao, Harbin Institute of Technology, Shenzhen, research engineer at MaiMemo"

This also works multilingually. You can use it to influence transcription to produce traditional or simplified Chinese characters, for instance.

I do seem to have trouble getting the context to persist across hundreds of tokens, though. Tokens that are corrected may revert to the model's underlying tokens if they weren't repeated enough.
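For anyone scripting the prompting trick above, here is a minimal sketch that builds the same command line from Python. It assumes a whisper.cpp build whose example binary lives at `./main` and accepts the `-m`, `-f`, and `--prompt` flags exactly as in the commands quoted in the comment. The helper names `build_whisper_cpp_command` and `run_whisper_cpp` are hypothetical, made up for illustration, and the sketch does nothing about the prompt fading over long audio.

```python
import shlex
import subprocess
from typing import List, Optional


def build_whisper_cpp_command(
    model_path: str,
    audio_path: str,
    prompt: Optional[str] = None,
    binary: str = "./main",  # assumption: path to the whisper.cpp example binary
) -> List[str]:
    """Build an argv list using only the flags shown in the comment above."""
    cmd = [binary, "-m", model_path, "-f", audio_path]
    if prompt:
        # The prompt is passed as a single argument, so no manual quoting is needed.
        cmd += ["--prompt", prompt]
    return cmd


def run_whisper_cpp(cmd: List[str]) -> str:
    """Run the command and return whatever the binary wrote to stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Build (but do not run) the first command from the comment.
example = build_whisper_cpp_command(
    "models/ggml-small.bin",
    "alice.wav",
    prompt="Audiobook reading by a British woman:",
)
# Print a shell-quoted version for copy and paste.
print(shlex.join(example))
```

Passing the prompt as one list element avoids shell-quoting mistakes with names or punctuation. Call `run_whisper_cpp(example)` only once the binary, model file, and audio file actually exist at those paths.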
coder543, over 1 year ago
The submission link is weird. It has far fewer stars than the repo it is forked from, and could just be an ad for replicate.com?

It is missing the most recent commits from what appears to be the real source: https://github.com/Vaibhavs10/insanely-fast-whisper

The only added commit is adding a replicate.com example, whatever that means.
kamranjon, over 1 year ago
I'm sort of confused: is this just a CLI wrapper around faster-whisper, transformers, and distil-whisper? Will this be any faster than running those by themselves? There doesn't seem to be much code here, so I'm wondering whether this is actually something to get excited about if I'm already aware of those projects.

Edit: This also seems a bit suspicious. It looks like someone just forked another person's active repo (https://github.com/Vaibhavs10/insanely-fast-whisper) and posted it as their own?
pen2l, over 1 year ago
Transcription speed and accuracy keep going up, and it's delightful to see the progress. I wish, though, that more effort were dedicated to creating integrated solutions that could accurately transcribe with speaker diarization.
danso, over 1 year ago
So what's in the secret sauce? For example, faster-whisper "is a reimplementation of OpenAI's Whisper model using CTranslate2" and claims 4x the speed of Whisper; what does insanely-fast-whisper do to achieve its gains?
sp332, over 1 year ago
Why do none of the benchmarks in the table match the headline?
lartin_muther, over 1 year ago
Coming soon: Careless Whisper (transcribes audio, but every few minutes it goes "idk here they said something or other")
refulgentis, over 1 year ago
Flagged: this is a fork of a project that launched last week, did the same thing, and had its own HN story.
emadda, over 1 year ago
I recently released https://bigwav.app

It's Whisper in the browser using WASM, with a transcription annotation UI.
msoad, over 1 year ago
Can this do real-time (streaming) transcription?
asadm, over 1 year ago
Is there an API that lets me run Whisper in a streaming manner?
deegles, over 1 year ago
Can this do speaker diarization?