Show HN: Ermine.ai – Record and transcribe speech, 100% client-side (WASM)

236 points by vishnumenon, about 2 years ago

21 comments

viraptor, about 2 years ago
I'm after something that can transcribe medical notes, and unfortunately it does not work well for that case (almost nothing does, though). There are quite a few people interested in something that doesn't turn "laparoscopic" into "leper as cop it".
Maybe the current progress will help, though. Models adjusted by your own dictionary or by postprocessing fixes would be amazing.
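
The "postprocessing fixes" idea above can be sketched as a plain phrase-replacement pass over the finished transcript. This is only a minimal illustration, not something Ermine.ai or Whisper is known to provide: the `CORRECTIONS` table and the `apply_corrections` helper are hypothetical names, and a real medical vocabulary would need a far larger, curated table.

```python
import re

# Hypothetical correction table: misheard phrase -> intended term.
CORRECTIONS = {
    "leper as cop it": "laparoscopic",
}


def apply_corrections(transcript: str, corrections: dict[str, str]) -> str:
    """Replace known mis-transcriptions, longest phrase first, ignoring case."""
    for wrong in sorted(corrections, key=len, reverse=True):
        right = corrections[wrong]
        # Word boundaries avoid rewriting the middle of a longer word.
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement keeps backslashes in `right` literal.
        transcript = re.sub(pattern, lambda _match: right, transcript, flags=re.IGNORECASE)
    return transcript


print(apply_corrections("The leper as cop it procedure went well.", CORRECTIONS))
```

A lookup like this only repairs errors that have already been seen once; it does not make the model itself any better at unfamiliar terms.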
NiekvdMaas, about 2 years ago
Nice!
Are you aware that whisper.cpp has a WASM version as well? See https://github.com/ggerganov/whisper.cpp/tree/master/examples/whisper.wasm - demo at https://whisper.ggerganov.com/
senko, about 2 years ago
When I try this on Firefox on Linux (not incognito) I get the following error:
    Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported. index-0dae94e71b526640.js:1:2992
    Uncaught (in promise) DOMException: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported. index-0dae94e71b526640.js:1
    Media resource blob:https://www.ermine.ai/e762a6f1-f292-4b23-96e0-8059a7f9d635 could not be decoded. www.ermine.ai
    Media resource blob:https://www.ermine.ai/e762a6f1-f292-4b23-96e0-8059a7f9d635 could not be decoded, error: Error Code: NS_ERROR_DOM_MEDIA_METADATA_ERR (0x806e0006)
(Also, the weights JSON doesn't download at all in Firefox incognito.)
Would be good if you could pop some kind of alert (literally alert() might do the trick) on an exception, just so people don't wait for a couple of minutes before realizing something's gone wrong :)
emadda, about 2 years ago
I released a similar web WASM transcription tool recently:
https://bigwav.app
RecycledEle, about 2 years ago
I need a Windows executable that takes a directory of audio files and transcribes them to similarly named text files, so they can be searched.
Example:
20230406115923.mp3 ==> 20230406115923.txt
20230406083110.m4a ==> 20230406083110.txt
I wish someone would build one and sell it for $10 a copy.
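
The naming scheme in that request (same file name, `.txt` extension) is simple to sketch. The snippet below is a hedged illustration only: `transcribe_directory`, `transcript_path`, and the `AUDIO_EXTENSIONS` set are hypothetical names, and the actual speech-to-text step is left as a caller-supplied `transcribe` function rather than tied to any particular engine.

```python
from pathlib import Path
from typing import Callable

# Assumed set of audio extensions; extend as needed.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav"}


def transcript_path(audio_path: Path) -> Path:
    """Return the sibling .txt path, e.g. 20230406115923.mp3 -> 20230406115923.txt."""
    return audio_path.with_suffix(".txt")


def transcribe_directory(directory: Path, transcribe: Callable[[Path], str]) -> list[Path]:
    """Write one transcript per audio file in `directory`; return the files written."""
    written = []
    # sorted() materializes the listing first, so new .txt files are not revisited.
    for audio_path in sorted(directory.iterdir()):
        if not audio_path.is_file() or audio_path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        out_path = transcript_path(audio_path)
        out_path.write_text(transcribe(audio_path), encoding="utf-8")
        written.append(out_path)
    return written
```

Note that two audio files sharing a stem (say `a.mp3` and `a.m4a`) would write to the same `a.txt`; a real tool would have to decide how to handle that collision.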
quickthrower2, about 2 years ago
Related: https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API
ibnbutlAn, about 2 years ago
So this is all client-side and my speech is not sent anywhere, or is it a "client-side UI" for some API?
I will have a look at the repository to find out, but maybe someone has already looked into it.
infruset, about 2 years ago
Very nice. Would it be easy to add languages other than English? Also, as others have noted, I had to open it in Chrome to make it work; Firefox didn't work.
technocratius, about 2 years ago
Love this idea! Tried a 10-second clip on Firefox for Android. The app seems to have been stuck on "Transcribing..." for a few minutes now...
MuffinFlavored, about 2 years ago
Wow, it worked on an iPhone despite it saying it wouldn't work on Safari (needed Chrome).
The little "replay your audio capture" <audio> HTML element says "error", but the transcription actually worked.
bloominggarden, about 2 years ago
This is pretty cool! Just yesterday I finished a similar demo, also using transformers.js. I am currently in the process of adding real-time transcription, do you plan on adding that?
bethecloud, about 2 years ago
How are you currently distributing the audio download? Any interest in using a distributed CDN layer? Happy to help get it funded.
dmix, about 2 years ago
Can it identify speakers? For example:
SPEAKER A: blah blah
SPEAKER B: blah blah
So it can be used for transcribing phone calls?
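
For reference, the layout asked about here is easy to produce once speaker labels exist. The sketch below covers only that formatting step and performs no speaker identification itself: it assumes some upstream diarization stage (not known to be part of Ermine.ai) has already tagged each segment, and `Segment` and `format_transcript` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # label from an assumed upstream diarization step
    text: str


def format_transcript(segments: list[Segment]) -> str:
    """Merge consecutive segments from the same speaker and label each turn."""
    turns: list[Segment] = []
    for seg in segments:
        if turns and turns[-1].speaker == seg.speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1] = Segment(seg.speaker, turns[-1].text + " " + seg.text)
        else:
            turns.append(Segment(seg.speaker, seg.text))
    return "\n\n".join(f"SPEAKER {turn.speaker}: {turn.text}" for turn in turns)
```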
apineda, about 2 years ago
Is there a GitHub link? I tried clicking the logo but it didn't work.
1attice, about 2 years ago
Does not work in Firefox, but works jaw-droppingly well in Chrome.
nunobrito, about 2 years ago
So nice, it worked.
Is there a version of this we can run on ESP32 (Arduino) devices?
voicedYoda, about 2 years ago
Nice! Also enjoy the ffmpeg usage. Gonna give it a try!
wdb, about 2 years ago
Doesn't seem to work in Safari.
bartislartfast, about 2 years ago
Can we add a "new" button? I have to refresh to do it a second time.
rado, about 2 years ago
Works great, thanks
APock, about 2 years ago
I want this but let me upload a file first.