科技回声 (Tech Echo)

A tech news platform built with Next.js, providing global tech news and discussion.


Mr. Beast Saying Increasingly Large Amounts of Money

1 point by felipemesquita 5 months ago

1 comment

felipemesquita, 5 months ago
From the Methodology section:

> First, for the main source of data, I chose all Mr. Beast videos with uploaded (i.e. non-auto-generated) transcripts—a total of 229 of the 837 published videos on his flagship channel. This gave me a source of processable ground truth about where money was mentioned, and it also limited the videos to those published in the last 6 years, which make up the majority of his meteoric rise. Then I downloaded the videos in 360p and scraped their transcripts for every occurrence of a dollar amount, logging each mention with its sum, video, and context in a database that I would build on top of as I nailed down the exact timing. I used those contextual timestamps to make rough clips that I fed into the open-source AI tool Whisper to (a) get a more precise measurement of where "X dollars" was actually said and (b) standardize and double-check that my first scrape had gotten the amount correct. Finally, as many of the clips were still off by a few annoying and noticeable fractions of a second in either direction, I made a script that allowed me to go through each entry individually, trim or extend the clip on either end, and modify the amount one last time if my first two methods had failed. After all 2,800+ were processed—a task that took weeks—I made a final set of clips out of higher-quality versions of the videos and used Premiere to make the film's final dizzying supercut you see before you.

> 90% of data science is data cleaning, and I have kept this overview pretty high-level in the interest of making it accessible to a wide audience. A much longer and more technical dive into the steps needed to go from a raw YouTube archive to this video—including everything from token suppression, the comparative benefits of transcription libraries, counterintuitive ways to standardize and parse numbers in natural language, and debugging audio desyncs in clip concatenations—may appear in the future on my website.
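The scrape-and-log step in the first quoted paragraph could be sketched roughly like this in Python. The caption format, function name, and record fields here are illustrative assumptions, not the author's actual code:

```python
import re

# Rough sketch of the dollar-amount scrape described above (hypothetical,
# not the author's pipeline). Each caption is a (start_seconds, text) pair.
DOLLAR_RE = re.compile(
    r"\$\s?(\d[\d,]*(?:\.\d+)?)"          # "$10,000", "$ 5"
    r"|(\d[\d,]*(?:\.\d+)?)\s+dollars?",  # "500,000 dollars"
    re.IGNORECASE,
)

def scrape_mentions(captions, video_id):
    """Return one record per dollar mention: video, timestamp, amount, context."""
    mentions = []
    for start, text in captions:
        for m in DOLLAR_RE.finditer(text):
            raw = m.group(1) or m.group(2)  # whichever alternative matched
            mentions.append({
                "video": video_id,
                "timestamp": start,  # rough; refined later against Whisper output
                "amount": float(raw.replace(",", "")),
                "context": text,
            })
    return mentions

mentions = scrape_mentions(
    [(12.4, "I'm giving away $10,000 right now"),
     (30.1, "last to leave wins 500,000 dollars")],
    "abc123",
)
```

Note that a digit-based regex like this misses spelled-out amounts ("one million dollars"), which is one reason the commenter mentions "counterintuitive ways to standardize and parse numbers in natural language" as a follow-up topic.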