I gave it a spin a little while ago. As usual, the install docs didn't quite work OOTB; here's how I got it working: <a href="https://llm-tracker.info/books/howto-guides/page/speech-to-text#bkmrk-seamlessm4t" rel="nofollow noreferrer">https://llm-tracker.info/books/howto-guides/page/speech-to-t...</a><p>One limitation that seems undocumented: the current code only supports relatively short clips, so it isn't suitable for long transcriptions:<p>> ValueError: The input sequence length must be less than or equal to the maximum sequence length (4096), but is 99945 instead.
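A possible workaround for the length limit is to split long audio into pieces and transcribe them one at a time. Below is a minimal sketch of just the splitting step, not anything from the SeamlessM4T code itself. The 4096 figure comes from the error message above, but its unit is an assumption here (it may count feature frames rather than raw samples), and `split_into_chunks` and its parameters are hypothetical names made up for illustration.

```python
from typing import List, Sequence

# Limit quoted in the error message. Its unit (samples, frames, tokens)
# is an assumption; tune this value against the real model.
MAX_SEQ_LEN = 4096


def split_into_chunks(
    samples: Sequence[float],
    max_len: int = MAX_SEQ_LEN,
    overlap: int = 0,
) -> List[Sequence[float]]:
    """Split a long sequence into pieces of at most max_len items.

    Consecutive pieces share `overlap` items so that a word cut at a
    boundary still appears whole in one of the two neighbouring pieces.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks: List[Sequence[float]] = []
    for start in range(0, len(samples), step):
        chunks.append(samples[start:start + max_len])
        # Stop once this piece has reached the end of the input.
        if start + max_len >= len(samples):
            break
    return chunks


if __name__ == "__main__":
    # Same length as in the error message, with placeholder data.
    fake_audio = [0.0] * 99945
    pieces = split_into_chunks(fake_audio)
    print(len(pieces), len(pieces[-1]))
```

Each piece would then go through the model separately and the resulting text would be joined afterwards. Cutting on silence rather than at fixed offsets would likely give cleaner boundaries, but that needs a voice activity detector and is out of scope for this sketch.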
Will there be a whispercpp equivalent? Half the reason I love whisper is how dead simple it is to get running. I will take somewhat lower accuracy for easier operation.<p>Edit: unless there is native speaker diarization. That would be a huge value add.
Yet somehow, many here underestimated Meta’s position in AI and proclaimed that Meta was dying, unimportant, and far behind in the AI race.<p>How dramatically things change in one year, after such exaggerated claims of Meta’s collapse in 2022.<p>Not only are they in the lead in free ($0) AI models, they are also at the finish line in the AI race to zero.
Lol, they botched the first example, the one that translates “Our goal is to create a more connected world” to Vietnamese: it has a glaring typo at the end of the sentence, “hơ” where it should be “hơn.” It also really messed up the pronunciation: it read “Chúng tôi” as “Chúng ta,” which are totally different words phonetically. The pronunciation also sounds like it’s made by someone who is mentally sick. So they botched both the translation and the pronunciation.<p>That’s so embarrassing, especially for something meant to show how good their stuff is (although I think it’s probably not the AI’s fault). It just shows how sloppy their people are.<p>I know they have plenty of Vietnamese engineers there. Did the PR dept just throw this final version of the video out without reviewing it with them?
The speech recognition in their demo is very bad (~60% accuracy in my own informal test, vs. 95% with WhisperCPP). The translation is also very inaccurate.<p>That said, I fully support open releases and look forward to future versions and improvements.
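For anyone who wants to put a number on comparisons like this, one common way is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. This is a minimal sketch, and it is an assumption that the percentages above were measured this way. The function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level Levenshtein distance / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "our goal is to create a more connected world"
    hypothesis = "our goal is to create more connected word"
    wer = word_error_rate(reference, hypothesis)
    print(f"WER: {wer:.3f}, word accuracy: {1 - wer:.3f}")
```

Note that WER can exceed 1.0 when the output contains many inserted words, so "accuracy" computed as 1 minus WER is only a rough figure. Punctuation is not stripped here, so transcripts should be normalized the same way before comparing two systems.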