TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

LLM_transcribe_recording: Bash Helper Using Mlx_whisper

39 points by Olshansky 8 months ago

1 comment

ndr_ 8 months ago
This may call out to ffmpeg for pre-processing. If you're reluctant to run that on your Mac directly, you can use this wrapper script to have ffmpeg run in a Docker instance: https://gist.github.com/ndurner/636d37fd83aed4b875cdb66653017ae7

However, I found that Whisper is thrown off by background music in a podcast, and will not recover. (That was with the mlx-community/whisper-large-v3-mlx checkpoint; OP uses distil-whisper-large-v3.) I concluded for myself that Whisper might be used in larger processing pipelines that handle such cases. Can someone provide insights about that? The podcast I used it on was https://www.heise.de/news/KI-Update-Deep-Dive-Was-taugen-KI-Suchmaschinen-9850904.html

I ended up using Google Gemini, which handled it well. (Blog post: https://ndurner.github.io/mlx-whisper-gemini)
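The Docker-wrapped ffmpeg approach the comment describes can be sketched as a small helper that builds a `docker run` invocation for the conversion, so ffmpeg never runs on the host. This is a minimal sketch, not the linked gist: the image name (`linuxserver/ffmpeg`, whose entrypoint is ffmpeg) and the mount layout are assumptions.

```python
import subprocess
from pathlib import Path

def docker_ffmpeg_cmd(src: Path, dst: Path, image: str = "linuxserver/ffmpeg") -> list[str]:
    """Build a `docker run` command converting `src` to 16 kHz mono WAV,
    the format Whisper models expect. The image name is an assumption;
    any image whose entrypoint is ffmpeg would work the same way."""
    workdir = src.resolve().parent
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/work",       # mount the directory holding the audio
        image,
        "-i", f"/work/{src.name}",      # input file inside the container
        "-ar", "16000", "-ac", "1",     # resample to 16 kHz, downmix to mono
        f"/work/{dst.name}",            # output lands next to the input
    ]

def transcode(src: Path, dst: Path) -> None:
    """Run the containerized conversion (requires Docker on the host)."""
    subprocess.run(docker_ffmpeg_cmd(src, dst), check=True)
```

The resulting WAV would then be handed to mlx_whisper for transcription (e.g. via the `mlx_whisper` CLI with `--model` pointing at the chosen checkpoint); the exact invocation depends on the installed mlx-whisper version.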
Comment #41489152 not loaded.