Boxfish: Realtime Index of Every Word Spoken on TV

77 points by speek about 12 years ago

16 comments

amitparikh about 12 years ago
Closed captioning has always seemed to me to be a notoriously bad data set due to misspellings and misphrasing. Has anyone tried to do (a better) speech to text of a cable news channel, for instance?

Doing frequency and sentiment analysis on this dataset would be pretty interesting.
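The frequency and sentiment analysis suggested here could be prototyped along these lines. This is only a minimal sketch: the caption lines, the `POSITIVE`/`NEGATIVE` word lists, and the function names are all made up for illustration, and a word-list score is a crude stand-in for a real sentiment model.

```python
import re
from collections import Counter

# Hypothetical caption lines (closed captions are typically upper case).
captions = [
    "THE PRESIDENT SAID THE ECONOMY IS GREAT",
    "MARKETS HAD A TERRIBLE DAY, THE WORST THIS YEAR",
]

# Toy sentiment lexicons, assumed for this sketch only.
POSITIVE = {"great", "good", "strong"}
NEGATIVE = {"terrible", "worst", "bad"}


def tokenize(line):
    """Lower-case a caption line and split it into words."""
    return re.findall(r"[a-z']+", line.lower())


def word_counts(lines):
    """Count how often each word occurs across all caption lines."""
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


def sentiment(line):
    """Score a line as (# positive words) - (# negative words)."""
    words = tokenize(line)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
```

Misspelled caption words would simply show up as separate low-frequency tokens, which is one way the data-quality problem would surface in the counts.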
philsnow about 12 years ago
One of the original incarnations of Google Video was somewhat similar to this: an index of closed-captioning data from a lot of different TV streams. What they chose to do with it was different, though: they let you search closed-captioned content, and it would show you a few thumbnails and the time of day when those words were said on air.

This memory is kind of hazy; ISTR it's from 2005 or so.
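Searching captions and getting back the time the words were said on air is essentially an inverted index. A minimal sketch of that idea follows; the `(channel, timestamp, text)` record layout and the function names are assumptions for illustration, not a description of how Google Video or Boxfish actually stored their data.

```python
import re
from collections import defaultdict


def build_index(captions):
    """Map each word to the (channel, timestamp) pairs where it was spoken.

    `captions` is an iterable of (channel, timestamp, text) tuples;
    this record layout is hypothetical.
    """
    index = defaultdict(list)
    for channel, timestamp, text in captions:
        # Use a set so a word repeated within one line is indexed once.
        for word in set(re.findall(r"[a-z']+", text.lower())):
            index[word].append((channel, timestamp))
    return index


def search(index, query):
    """Return hits for a single word, ordered by time of day."""
    return sorted(index.get(query.lower(), []), key=lambda hit: hit[1])


# Made-up sample data.
sample = [
    ("CNN", "09:00:05", "Budget talks resume today"),
    ("CSPAN", "08:59:10", "The budget vote is delayed"),
    ("CNN", "09:01:00", "Weather is up next"),
]
index = build_index(sample)
hits = search(index, "budget")
```

Sorting on the timestamp string works here only because the sample uses zero-padded `HH:MM:SS` values; a real index would store proper datetimes.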
tibbon about 12 years ago
I personally think there's a great deal that can be done with this data.

A few years ago, someone documented how to use an Arduino + Video Experimenter Shield to easily log closed captioning data (http://blog.makezine.com/2011/08/16/enough-already-the-arduino-solution-to-overexposed-celebs/). Never got around to messing with it, but I can imagine 100 interesting things to do with that data.

Very cool company. I'm glad someone's doing this.
nutmeg about 12 years ago
Seems like scraping all closed captioning would be very valuable data indeed. Is there anyone else doing something like this that provides an API or data feed?
quan about 12 years ago
I'd love to see this used on Fox News to fact check everything they say in real time.
mrilhan about 12 years ago
I think the potential here is immense.

Boxfish, Twitter, YouTube, Siri, and now with Ray Kurzweil @ Google... thinkers are converging on doing to every other form of content what Google did for structured documents.

The NLP trend is going to be amusing to watch at least (Siri, Summly), and I'm not certain whether its time will come in the next 5 years. But I know Ray Kurzweil knows this technology is inevitable.

--

As for Boxfish, I think this is a good example of a neatly executed, well-funded startup with experienced founders and a solid space. No drama, no demo day, no immediate fires to put out, a cool $3m in the bank, a Deutsche Telekom AG subsidiary negotiating their deals for them, and "Yahoo just bought a kids startup for 17m": the topic is hotter than others. This is the type of startup I for one daydream of having stock in or working at. It has high potential to be worth $mmms or $bn in the future. You know, that all depends and whatnot. But the makings are clearly there. Excellent work guys! Congratulations.
thereallurch about 12 years ago
Won't this just amplify existing trends instead of exposing new ones?
skram about 12 years ago
Here's one endpoint that seems to work and not require an API key: http://api.boxfish.com/v2/v3/trending/topics/?fields=count
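For anyone who wants to poke at that endpoint from a script, a minimal sketch using only the Python standard library might look like this. The URL and the `fields=count` parameter are copied from the comment above; that the response is JSON is an assumption, the helper names are made up, and the service may well no longer be reachable.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in the comment above; it may no longer respond.
BASE_URL = "http://api.boxfish.com/v2/v3/trending/topics/"


def build_url(fields="count"):
    """Return the endpoint URL with the `fields` query parameter set."""
    return BASE_URL + "?" + urlencode({"fields": fields})


def fetch_trending(timeout=10):
    """Request the endpoint and decode the body as JSON (assumed format)."""
    with urlopen(build_url(), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_trending()` raises `urllib.error.URLError` if the host cannot be reached, so wrap it in a `try` block when experimenting.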
RK about 12 years ago
Sounds similar to SnapStream. "Monitor everything said on TV"

http://snapstream.com
uptown about 12 years ago
Reminds me a little of Bluefin Labs (acquired by Twitter). Just hook this data up to a sentiment engine for Twitter and you can come up with some interesting correlations in how people react to television.

https://bluefinlabs.com/
krazykringle about 12 years ago
Also: http://archive.org/details/tv
slifty about 12 years ago
For those interested in a real-time API of caption streams, be sure to check out Opened Captions: http://openedcaptions.com:3000/

Currently only for C-SPAN but that may change!
Finster about 12 years ago
With the new Federal regulations stipulating that anything that originates on TV must be captioned when streamed over the internet, Boxfish will be able to get a fairly comprehensive picture of what's going on.
bravura about 12 years ago
Is this only for US television? Or is it global?

What is the reach? I know several people who would be interested in this for smaller countries.

I couldn't find this information on the homepage.
e3pi about 12 years ago
"HN DDoS" again? Still spinning after five minutes on:

http://boxfish.com/#!search/Klinger
deepinsand about 12 years ago
Do they have a massive number of cable/satellite subscriptions? I've always wondered how they and IntoNow get their signals.