I've been looking for, and testing, various automated transcription APIs over the years and have <i>never</i> found one that was high-quality for videos of interviews (background noise, people don't talk in full sentences, people use filler sounds). I'd love to find something usable, and I plan to try this one as well. Human transcription is laborious, slow, and expensive. I've toyed with the idea of a human clean-up pass after the automated transcription, but that's still labor-intensive.
Wow! Circa $1.50 per hour, whereas human transcription services are more like $45 per hour.<p>This kind of price will open up entirely new applications. E.g., for ~$10/day, you could transcribe every conversation you have at work. Combine that with good search and your phone becomes a supplemental, artificial memory. "I know I talked about that with somebody but I can't remember who" becomes a thing of the past.
I've been looking into podcast transcription services, but I haven't really found one built on APIs like this. Can anyone shed some light on the transcription business? Is the transcription quality not yet good enough, or is there another reason we don't see many of these services?
Same price as Google ($0.006 per 15-second increment) but more expensive than Microsoft ($0.004 per 15-second increment). However, this one does charge by the second after the first 15 seconds, which is nice.<p>I just wish they would lower the 15-second minimum.
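To make the billing comparison above concrete, here is a rough sketch of the arithmetic. It uses only the rates quoted in this thread, which may be out of date, and it assumes two simplified billing models: per-second billing with a 15-second minimum, and billing in whole 15-second increments rounded up. The function names are made up for illustration and are not part of any provider's API.

```python
import math

def per_second_cost(seconds, rate_per_15s=0.006, minimum_seconds=15):
    """Assumed model: billed by the second, with a 15-second minimum."""
    billable = max(math.ceil(seconds), minimum_seconds)
    return billable * rate_per_15s / 15

def per_increment_cost(seconds, rate_per_15s, increment=15):
    """Assumed model: billed in whole 15-second increments, rounded up."""
    increments = max(1, math.ceil(seconds / increment))
    return increments * rate_per_15s

if __name__ == "__main__":
    # One hour of audio at the rates quoted in the thread.
    print(f"per-second, $0.006/15s:    ${per_second_cost(3600):.2f}/hour")
    print(f"per-increment, $0.006/15s: ${per_increment_cost(3600, 0.006):.2f}/hour")
    print(f"per-increment, $0.004/15s: ${per_increment_cost(3600, 0.004):.2f}/hour")
    # A 20-second clip is where the two billing models diverge.
    print(f"20s clip, per-second:      ${per_second_cost(20):.4f}")
    print(f"20s clip, per-increment:   ${per_increment_cost(20, 0.006):.4f}")
```

For long recordings the two models converge on the same hourly figure, so per-second billing mostly helps with lots of short clips. The 15-second minimum still dominates anything shorter than that.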