TechEcho

12 comments

nshm over 1 year ago

Good improvements for many languages; the numbers are here: https://github.com/openai/whisper/blob/main/language-breakdown.svg

dang over 1 year ago

Related ongoing threads:

New models and developer products - https://news.ycombinator.com/item?id=38166420

OpenAI DevDay, Opening Keynote Livestream [video] - https://news.ycombinator.com/item?id=38165090

Nitrolo over 1 year ago

Does anyone know of a nice UI wrapper for something like whisper.cpp?

I need to write a lot of long texts for work, and some good dictation software would be great. I know there's Dragon, but so far I have not been able to find something that fits my needs and is free.

jsight over 1 year ago

This seems like the best free voice recognition in general.

Is there a model that is the best at wake word detection? The last time I looked, that area seemed fairly lacking.

alex_young over 1 year ago

Still doesn't look like it can do real-time, unfortunately.

Edit: I understand that you can use small samples and approximate something like streaming, but the limitation is that each sample loses its surrounding context, which increases WER (word error rate). It would be nice if there were a streaming option.
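
The "small samples" workaround is often paired with overlapping windows, so that each chunk carries a little of the audio that came before it. The following is only a rough sketch of that idea, not anything taken from Whisper itself: the function name `overlapping_chunks` and the 30 s / 5 s defaults are assumptions chosen for illustration, and the transcription call is left out entirely.

```python
from typing import Iterator, Sequence, Tuple


def overlapping_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_s: float = 30.0,    # assumed window length, in seconds
    overlap_s: float = 5.0,   # assumed overlap between windows, in seconds
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) pairs covering `samples` with overlap.

    Hypothetical helper: each window after the first begins `overlap_s`
    seconds before the previous window ended, so a recognizer sees some
    shared context at every boundary.
    """
    chunk_len = int(chunk_s * sample_rate)
    step = chunk_len - int(overlap_s * sample_rate)
    if chunk_len <= 0 or step <= 0:
        raise ValueError("chunk_s must be positive and larger than overlap_s")

    start = 0
    while start < len(samples):
        yield start, samples[start:start + chunk_len]
        # Stop once the current window has reached the end of the audio.
        if start + chunk_len >= len(samples):
            break
        start += step


if __name__ == "__main__":
    # Ten seconds of dummy audio at 10 samples per second.
    audio = [0.0] * 100
    for offset, chunk in overlapping_chunks(audio, 10, chunk_s=3.0, overlap_s=1.0):
        print(offset, len(chunk))
```

Overlap reduces the context loss at chunk boundaries but does not remove it, and the text produced for the overlapping regions still has to be merged somehow, which is the part a real streaming option would handle.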

GaggiX over 1 year ago

This is great, but I hope that in the future there will be a speech-to-text model focused on low-resource languages, probably by balancing the dataset as in No Language Left Behind (NLLB), released by Meta. NLLB is a translation model that works really well even with low-resource languages; something similar for speech transcription would be really cool.

ComputerGuru over 1 year ago

They say whisper-3 will be available via the API soon. Does anyone know why only whisper-1 was ever made available via the API (no whisper-2)?

csjh over 1 year ago

Only 3 GB. Interesting to see how small SOTA models in other domains are compared to LLMs like Falcon-180B.
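
As a rough back-of-the-envelope check on that size gap, weight storage is approximately parameter count times bytes per parameter. The sketch below assumes 16-bit weights and uses approximate parameter counts (about 1.55 billion for the large Whisper model, 180 billion for Falcon-180B); the helper name `weights_gb` is hypothetical, and real checkpoint files will differ somewhat.

```python
def weights_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GB (10**9 bytes).

    Assumes every parameter is stored at the same width; the default of
    2 bytes corresponds to 16-bit weights.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Approximate parameter counts, used here only for illustration.
    whisper_large_params = 1_550_000_000
    falcon_180b_params = 180_000_000_000

    print(f"Whisper large: ~{weights_gb(whisper_large_params):.1f} GB")
    print(f"Falcon-180B:   ~{weights_gb(falcon_180b_params):.0f} GB")
```

Under these assumptions the gap is roughly two orders of magnitude.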

singularity2001 over 1 year ago

Did they break the API?

  from openai import OpenAI
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: cannot import name 'OpenAI' from 'openai'

If so, where is the current documentation?
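
For context, the `OpenAI` client class ships with version 1.0.0 and later of the `openai` Python package, so this ImportError usually points at an older installed release rather than a broken API. A minimal sketch for checking that, using only the standard library; the helper names `installed_openai_version` and `supports_openai_client` are hypothetical:

```python
from importlib import metadata
from typing import Optional


def installed_openai_version() -> Optional[str]:
    """Return the installed `openai` package version, or None if absent."""
    try:
        return metadata.version("openai")
    except metadata.PackageNotFoundError:
        return None


def supports_openai_client(version: str) -> bool:
    """True when the major version is 1 or higher.

    Assumes a plain dotted version string such as "0.28.1" or "1.3.0".
    """
    major = int(version.split(".")[0])
    return major >= 1


if __name__ == "__main__":
    version = installed_openai_version()
    if version is None:
        print("openai is not installed")
    elif supports_openai_client(version):
        print(f"openai {version}: `from openai import OpenAI` should work")
    else:
        print(f"openai {version}: upgrade, e.g. pip install --upgrade openai")
```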

joshspankit over 1 year ago

Does anyone know if it’s able to do diarization with 3?

spandextwins over 1 year ago

With comments, GitHub looks like HN, except with one less click.

tomrod over 1 year ago

Word from my GenAI contact is that this (or similar announcement) replaces the need for RAG.

OpenAI releases Whisper v3, new generation open source ASR model