Good overview of all the parts involved! I was hoping they’d talk a little more about the timing aspects, and keeping audio and video in sync during playback.
<p>What I’ve learned from working on a video editor is that “keeping a/v in sync” is… sort of a misnomer? Or at least it <i>sounds</i> very “active”, as if you’d have to line up all the frames and carefully set timers to play them.
<p>But in practice, the audio and video frames are interleaved in the file, and they naturally come out in order (ish - see replies). The audio plays at a known rate (e.g. 44.1 kHz), and every frame of audio and video carries a “presentation timestamp”; these timestamps (are supposed to) line up between the streams.
<p>So you’ve got the audio and video both coming out of the file way faster than realtime (ideally), and then the syncing ends up being more like: let the audio play, and hold back the next video frame until it’s time to show it. The audio updates a “clock” as it plays (with each audio frame’s timestamp), and a separate loop watches that clock until the next video frame’s time is up (sketched in code at the end of this comment).
<p>There seems to be surprisingly little material out there on this stuff, but the most helpful I found was the “How to Write a Video Player in Less Than 1000 Lines” tutorial [0] along with this spinoff [1], in conjunction with a few hours spent poring over the ffplay.c code trying to figure out how it works.
<p>0: <a href="http://dranger.com/ffmpeg/" rel="nofollow">http://dranger.com/ffmpeg/</a>
<p>1: <a href="https://github.com/leandromoreira/ffmpeg-libav-tutorial" rel="nofollow">https://github.com/leandromoreira/ffmpeg-libav-tutorial</a>
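<p>For the curious, here’s a minimal sketch of that pacing loop in plain C. To be clear, the helper names and the wall-clock-based “audio clock” are stand-ins I made up for illustration, not FFmpeg API; the real ffplay.c layers packet/frame queues, drift correction, and frame dropping on top of this same core idea:
<pre><code>#include <stdio.h>
#include <time.h>

static double start;

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Stand-in for the audio clock: in a real player, the audio callback
   records the pts of the sample currently hitting the speaker. Here
   we fake it with wall time since "playback" started. */
static double audio_clock_seconds(void)
{
    return now_seconds() - start;
}

static void sleep_seconds(double s)
{
    if (s <= 0) return;  /* frame is already due (or late) */
    struct timespec ts = { (time_t)s, (long)((s - (time_t)s) * 1e9) };
    nanosleep(&ts, NULL);
}

int main(void)
{
    start = now_seconds();

    /* Pretend these are the pts values of already-decoded video
       frames, 40 ms apart (25 fps). Decoding runs faster than
       realtime, so the frames are "ready" early; pacing them against
       the audio clock is all that's left to do. */
    const double video_pts[] = { 0.00, 0.04, 0.08, 0.12, 0.16 };
    const int n = sizeof video_pts / sizeof video_pts[0];

    for (int i = 0; i < n; i++) {
        /* Hold this frame back until the audio clock reaches its
           presentation timestamp. (A real player would drop the
           frame instead if it were already far behind.) */
        sleep_seconds(video_pts[i] - audio_clock_seconds());
        printf("show frame %d at t=%.3fs (pts %.2fs)\n",
               i, audio_clock_seconds(), video_pts[i]);
    }
    return 0;
}
</code></pre>
<p>The reason audio gets to be the master clock, as I understand it, is that the sound card consumes samples at a fixed rate whether you like it or not, and listeners notice audio glitches far more than a video frame shown a few milliseconds late - so it’s much easier to nudge the video to match the audio than the other way around.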