I am trying to find a Jay Leno Headlines segment that showed my cousin's name and picture from the Regina Leader Post, where he appeared as the Paperboy of the week/month. I have watched many videos, but there are hundreds of hours and many duplicates.<p>I was hoping that the closed captions for videos were indexed by Google, but I am not sure whether they are or whether there is a certain syntax to use for searching captions.<p>I tried googling this but have only found ways to search within a specific video rather than across the library.<p>To be more specific, his name is Rick Lee and the Headline appeared on the show sometime around Y2K.<p>Any help or guidance would be greatly appreciated!<p>Thanks
This is not a complete solution, but you seem to know which videos you want to search rather than wanting to search through everything. If that's the case, I recommend downloading the subtitles for every video via yt-dlp [1], as shown in this Super User answer [2] (with these options it doesn't download the video itself, which saves bandwidth):<p><pre><code> yt-dlp --all-subs --skip-download https://www.youtube.com/watch?v=Ye8mB6VsUHw
</code></pre>
(The answer uses youtube-dl, but I prefer yt-dlp; the same options worked when I tried it.)<p>Once you've downloaded all the subtitles, you can simply grep through them. I hope this helps!<p>[1] <a href="https://github.com/yt-dlp/yt-dlp">https://github.com/yt-dlp/yt-dlp</a><p>[2] <a href="https://superuser.com/a/927532/1071647" rel="nofollow noreferrer">https://superuser.com/a/927532/1071647</a>
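For anyone without grep handy, here is a minimal Python sketch of the same "search the downloaded subtitles" step. It assumes the subtitles were saved as .vtt files in a single folder; the function name search_subtitles and the folder layout are made up for illustration and are not part of yt-dlp.

```python
import re
from pathlib import Path


def search_subtitles(folder, phrase):
    """Return (file name, line number, text) for every subtitle line containing phrase."""
    # Escape the phrase so it is matched literally, and ignore case.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = []
    # Assumption: subtitles are .vtt files sitting directly in `folder`.
    for path in sorted(Path(folder).glob("*.vtt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.name, number, line.strip()))
    return hits
```

Calling search_subtitles("subs", "paperboy"), where "subs" is a hypothetical folder name, would list each matching file and line, which narrows down which videos are worth watching.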
Not a solution, but somewhat related: <a href="https://youglish.com/" rel="nofollow noreferrer">https://youglish.com/</a> lets you search YouTube videos for keywords, but the purpose is to find examples of how to pronounce words from real usage. It also works for a few other languages aside from English.
Thank you to everyone who took the time to read my question, and especially to those who replied. I have read over the replies and there are many good ideas/solutions to my query.<p>I am going to start at the top and work my way through them over the next little while. I will definitely post an update on what (hopefully) worked, but it may be a few weeks.<p>I just wanted to post this comment to let you all know I appreciate the great responses and the HN community. As <B>benboozled</B> commented about his faith in humanity being restored, I am thankful for HN and all of you generous and helpful contributors.<p>M
There are literally thousands of episodes of The Tonight Show[0], many of which are probably not on YouTube. I would reach out to the Jay Leno fan community and see if anyone has it in a private archive. Never underestimate the organizational skills of a dedicated fan.<p>0: <a href="https://en.m.wikipedia.org/wiki/List_of_The_Tonight_Show_with_Jay_Leno_episodes" rel="nofollow noreferrer">https://en.m.wikipedia.org/wiki/List_of_The_Tonight_Show_wit...</a>
Someone made a website to search through channels or playlists. Not sure if it still works but it might help: <a href="https://ytks.app/" rel="nofollow noreferrer">https://ytks.app/</a>
You can try to use <a href="https://filmot.com/" rel="nofollow noreferrer">https://filmot.com/</a> to search videos by closed captions.<p>I tried to search but could not find the video.
There are browser extensions to search captions on a single video, but I don't know of any better options, which is kind of ridiculous considering Google owns YouTube.<p>Here is one: <a href="https://addons.mozilla.org/en-US/firefox/addon/youtube-captions-search/" rel="nofollow noreferrer">https://addons.mozilla.org/en-US/firefox/addon/youtube-capti...</a>
A related interesting problem would be: how to search by text keywords in the <i>audio</i> stream of the video.<p>I guess part of one approach could be to convert the speech to text, which could then be grepped. But how would we correlate a match back to the time positions in the video where those keywords occurred?
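One possible answer to the timing question, assuming the transcript is available in a timestamped caption format such as WebVTT: each cue carries its own start time, so a keyword hit can be mapped straight back to a position in the video. The sketch below is only an illustration; parse_cues, find_keyword, and watch_url are hypothetical helper names, and the parser handles only the basic cue layout, not the full WebVTT spec.

```python
import re

# Matches the start of a WebVTT cue timing line such as
# "00:01:23.450 --> 00:01:25.000". The hours field is optional.
CUE_TIMING = re.compile(r"^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+")
# Inline markup such as <c> or <00:01:24.000> inside cue text.
INLINE_TAG = re.compile(r"<[^>]+>")


def parse_cues(vtt_text):
    """Return a list of (start_seconds, cue_text) pairs from WebVTT text."""
    cues = []
    in_cue = False
    for raw in vtt_text.splitlines():
        line = raw.strip()
        match = CUE_TIMING.match(line)
        if match:
            hours, minutes, seconds, millis = match.groups()
            start = (int(hours or 0) * 3600 + int(minutes) * 60
                     + int(seconds) + int(millis) / 1000)
            cues.append([start, ""])
            in_cue = True
        elif not line:
            # A blank line ends the current cue.
            in_cue = False
        elif in_cue:
            text = INLINE_TAG.sub("", line)
            cues[-1][1] = (cues[-1][1] + " " + text).strip()
    return [(start, text) for start, text in cues]


def find_keyword(vtt_text, keyword):
    """Return the cues whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [(start, text) for start, text in parse_cues(vtt_text)
            if needle in text.lower()]


def watch_url(video_id, start_seconds):
    """Build a YouTube link that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start_seconds)}s"
```

The same idea should carry over to speech-to-text output, as long as the tool emits per-segment timestamps alongside the recognized text.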
You can try using yt-fts[0]. I've had great luck with it.<p>[0] <a href="https://github.com/NotJoeMartinez/yt-fts">https://github.com/NotJoeMartinez/yt-fts</a>