
Whisper AI – train on old TV/movies/podcasts for better speech recognition?

1 point by etchasketch, almost 2 years ago

I'm sure there are copyright concerns and other barriers to doing this, but I've been playing around with the talk-to-text feature in the ChatGPT app, and from my understanding it uses the whisper-large-v2 model.

It is absolutely outstanding.

I have a lot of friends who work in law and medicine, and from what I have heard, Dragon Speech Recognition is the king of the hill. From a quick Wikipedia search, it seems to be based on Hidden Markov Models. Is this noticeably different from what Whisper is doing? And is there anything stopping someone from training on a large audio dataset and releasing a text to speech app that immediately dethrones Dragon/Siri/Google dictate/Alexa/Windows text to speech?

1 comment

brucethemoose2, almost 2 years ago

There have actually been some papers on "better than SOTA" TTS models with shockingly good inflection, emotion, voice imitation and such.

But the orgs behind them say they are hesitant to release them due to obvious misuse concerns. And I think the *unspoken* concern is that the datasets are not clean.