> And finally, we encounter a large issue without a good solution. In encoded videos, a key frame is a frame in the video that contains all the visual information needed to render itself without any additional metadata. These are much larger than normal frames, and contribute greatly to the bitrate. Ideally, there would be as few keyframes as possible. However, when a new user starts consuming a stream, they need at least one keyframe to view the video. WebRTC solves this problem using the RTP Control Protocol (RTCP). When a new user consumes a stream, they send a Full Intra Request (FIR) to the producer. When a producer receives this request, they insert a keyframe into the stream. This keeps the bitrate low while ensuring all the users can view the stream. FFmpeg does not support RTCP. This means that the default FFmpeg settings will produce output that won’t be viewable if consumed mid-stream, at least until a key frame is received. Therefore, the parameter -force_key_frames expr:gte(t,n_forced*4) is needed, which produces a key frame every 4 seconds.

In case someone was wondering why it was a bad idea.
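For context, a rough sketch of where that parameter sits in a full command. The input file, H.264 encoder settings, and RTP destination here are illustrative assumptions, not taken from the article:

    # Sketch: encode to H.264, force a keyframe every 4 seconds, and send
    # plain RTP. Input, encoder settings, and address are placeholders.
    ffmpeg -re -i input.mp4 \
      -c:v libx264 -preset veryfast -tune zerolatency \
      -force_key_frames 'expr:gte(t,n_forced*4)' \
      -an -f rtp rtp://127.0.0.1:5004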
To get it into the browser, check out rtp-to-webrtc [0].

Another big piece missing here is congestion control. It isn’t just about keeping the bitrate low, but figuring out how much bandwidth you can actually use. Measuring RTT/loss to figure out what is available is a really interesting topic. You don’t get that in FFmpeg or GStreamer yet. The best intro to this is the BBR IETF doc IMO [1].

[0] https://github.com/pion/webrtc/tree/master/examples/rtp-to-webrtc

[1] https://tools.ietf.org/html/draft-cardwell-iccrg-bbr-congestion-control-00
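For anyone curious, a hedged sketch of pointing FFmpeg at that rtp-to-webrtc example. The VP8 codec, port 5004, and pkt_size are assumptions about how that example is typically run; check its README for the exact values it expects:

    # Sketch: generate a test pattern, encode as VP8, and send RTP to the
    # port the pion example is assumed to listen on (5004 here).
    ffmpeg -re -f lavfi -i testsrc=size=640x480:rate=30 \
      -c:v libvpx -deadline realtime -cpu-used 5 \
      -f rtp 'rtp://127.0.0.1:5004?pkt_size=1200'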
Note that this post doesn't really cover how to stream media using WebRTC. First and foremost, because WebRTC mandates the use of DTLS to encrypt the RTP flow, a plain RTP stream won't work. A more apt title would be "How to use FFmpeg to generate an encoded stream that happens to match the requirements for WebRTC".

Still, thanks for the article; it is always interesting to see specific applications of the FFmpeg command line, because in my opinion, after having read them top to bottom, the FFmpeg docs are *very* lacking in the department of explaining the *whys*.

Random example: you read the docs of *genpts* and they say something along the lines of "Enables generation of PTS timestamps". Well, thank you (/s). But really, when should I use it? What does it actually change between using it or not? *What scenarios would benefit from using it?* Etc. etc.
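As an illustration, one scenario where +genpts is commonly useful (this is my own example, not from the docs): remuxing a stream whose packets carry no PTS, such as a raw H.264 elementary stream, into a container that needs timestamps.

    # Hedged example: a raw .h264 elementary stream has no timestamps, so
    # ask FFmpeg to generate PTS while stream-copying into MP4.
    ffmpeg -fflags +genpts -r 30 -i input.h264 -c copy output.mp4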
What about the WebRTC part?

The post ends at RTP out of FFmpeg. Maybe I’m supposed to know how to consume that with WebRTC, but in my investigation it’s not at all straightforward... the WebRTC consumer needs to become aware of the stream through a whole complicated signaling and negotiation process. How is that handled after the FFmpeg RTP stream is produced?
For one-to-many live streaming you would probably want to use HLS.

Twitch uses its own transcoding system. Here is an interesting read from their engineering blog [0].

[0] https://blog.twitch.tv/en/2017/10/10/live-video-transmuxing-transcoding-f-fmpeg-vs-twitch-transcoder-part-i-489c1c125f28/
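For reference, a rough sketch of what an HLS output from FFmpeg can look like; the segment length and encoder settings here are arbitrary assumptions:

    # Sketch: encode to H.264/AAC and segment into an HLS playlist.
    ffmpeg -i input.mp4 \
      -c:v libx264 -c:a aac \
      -f hls -hls_time 4 -hls_list_size 0 \
      stream.m3u8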
> -bsv:v h264_metadata=level=3.1

This should be `-bsf:v`, and it's not required here anyway, since this command re-encodes and the encoder has already been told the level via `-level`.
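To spell out the difference, a small sketch assuming libx264 (file names are placeholders):

    # When re-encoding, the encoder can set the level itself:
    ffmpeg -i input.mp4 -c:v libx264 -level:v 3.1 out.mp4

    # The bitstream filter is for rewriting the level without re-encoding,
    # i.e. when stream-copying:
    ffmpeg -i input.mp4 -c:v copy -bsf:v h264_metadata=level=3.1 out.mp4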