科技回声

12 comments

nshm over 1 year ago

Good improvements for many languages, numbers here: https://github.com/openai/whisper/blob/main/language-breakdown.svg

dang over 1 year ago

Related ongoing threads:

New models and developer products - https://news.ycombinator.com/item?id=38166420

OpenAI DevDay, Opening Keynote Livestream [video] - https://news.ycombinator.com/item?id=38165090

Nitrolo over 1 year ago

Does anyone know of a nice UI wrapper for something like whisper.cpp?

I need to write a lot of long texts for work, and some good dictation software would be great. I know there's Dragon, but somehow I have not been able to find something that fits my needs and is free.

jsight over 1 year ago

This seems like the best free voice recognition in general.

Is there a model that is the best at wake word detection? The last time I looked, this seemed to be fairly lacking.

alex_young over 1 year ago

Still doesn't look like it can do real-time, unfortunately.

Edit: I understand that you can use small samples and approximate something like streaming, but the limitation here is that you wind up without context for the samples, which increases WER. It would be nice if there were some streaming option.
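For illustration, the "small samples" approach described above can be sketched as a plain windowing function. This is a minimal sketch and not part of Whisper or any of its wrappers: the name `overlapping_chunks` and its parameters are hypothetical, and it only slices a sequence of samples. Each chunk would still be transcribed on its own, so the overlap gives a chunk a little shared audio with its predecessor but does not restore the full context whose loss is the complaint here.

```python
from typing import Iterator, Sequence, Tuple


def overlapping_chunks(
    samples: Sequence[float],
    chunk_size: int,
    overlap: int,
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) pairs covering `samples`.

    Consecutive chunks share `overlap` samples, so each chunk repeats a
    short stretch of audio from the end of the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    start = 0
    while start < len(samples):
        yield start, samples[start:start + chunk_size]
        if start + chunk_size >= len(samples):
            break  # this chunk already reaches the end of the audio
        start += step


# Toy input: ten "samples" stand in for real audio data.
audio = list(range(10))
chunks = list(overlapping_chunks(audio, chunk_size=4, overlap=1))
for start, chunk in chunks:
    print(start, chunk)
```

A larger overlap trades extra compute for more shared audio between neighbouring chunks; how much that helps accuracy in practice is not something this sketch measures.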

GaggiX over 1 year ago

This is great, but I hope that in the future there will be a speech-to-text model focused on low-resource languages, probably by balancing the dataset as Meta did for No Language Left Behind (NLLB). NLLB is a translation model that works really well even with low-resource languages; something similar for speech transcription would be really cool.

ComputerGuru over 1 year ago

They say whisper-3 will be available via the API soon. Does anyone know why only whisper-1 was ever made available via the API (no whisper-2)?

csjh over 1 year ago

Only 3GB, interesting to see how small SOTA models in other domains are compared to LLMs like Falcon-180B.

singularity2001 over 1 year ago

did they break the api?

    from openai import OpenAI
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: cannot import name 'OpenAI' from 'openai'

If so where is the current documentation?
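One possible cause of this error, offered as an assumption rather than a diagnosis, is an older installed `openai` package: to the best of my knowledge the `OpenAI` client class ships with version 1.0 and later of the Python package. The sketch below only inspects the installed version using the standard library. The helper names `installed_openai_version` and `has_client_class` are hypothetical, and the 1.0 threshold is the assumption being checked.

```python
from importlib import metadata
from typing import Optional


def installed_openai_version() -> Optional[str]:
    """Return the installed `openai` package version, or None if absent."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


def has_client_class(version: str) -> bool:
    """Assumption: the `OpenAI` class exists from major version 1 onward."""
    major = int(version.split(".")[0])
    return major >= 1


version = installed_openai_version()
if version is None:
    print("openai is not installed")
elif has_client_class(version):
    print(f"openai {version}: 'from openai import OpenAI' should work")
else:
    print(f"openai {version}: predates 1.0, consider upgrading the package")
```

If the reported version is below 1.0, upgrading with `pip install --upgrade openai` would be the natural thing to try.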

joshspankit over 1 year ago

Does anyone know if it’s able to do diarization with 3?

spandextwins over 1 year ago

With comments GitHub looks like HN except one less click to click.

tomrod over 1 year ago

Word from my GenAI contact is that this (or similar announcement) replaces the need for RAG.

OpenAI releases Whisper v3, new generation open source ASR model