
Beyond the webrtc.org monoculture


Formal Metadata

Title
Beyond the webrtc.org monoculture
Subtitle
Alternative WebRTC implementations in C and Python
Title of Series
Number of Parts
561
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
WebRTC’s most prominent implementations rely on the webrtc.org codebase. In this talk, Jeremy Lainé and Lennart Grahl will present two alternative WebRTC / ORTC implementations (aiortc in Python and RAWRTC in C), their use cases, the challenges in writing these implementations and the benefits for the WebRTC ecosystem.
Both Chrome’s and Firefox’s (to some extent) WebRTC implementations are built on top of a common codebase hosted at webrtc.org, which serves as the de facto reference implementation of WebRTC. While it is popular, hacking on the codebase or embedding it into a custom project is a complex endeavour. Alternative implementations of WebRTC (and ORTC) exist in the form of libraries which are simpler to understand and embed: RAWRTC (in C, currently data channels only), written by Lennart Grahl, and aiortc (in Python), written by Jeremy Lainé.
The talk will focus on the following aspects: use cases and demos for these alternative WebRTC implementations, such as Internet of Things or server-side real-time audio and video processing; the genesis and challenges involved in implementing these two libraries; and how the WebRTC ecosystem benefits from having such alternative WebRTC implementations.
Transcript: English (auto-generated)
All right, hello. Hello everyone. Today, Lennart and I will be speaking about how we can get a bit more diversity in the WebRTC ecosystem, and how alternative WebRTC implementations will help us achieve this goal.
So I'm Jeremy, I'm the CTO of Spacinov. I've been involved in free software since 2000, and actively using Python since 2007. And today, my perspective on this is as the author of aiortc, a Python implementation of WebRTC.
So, hi everyone, from me as well. I'm Lennart Grahl. I do enjoy network programming. I am the author of RAWRTC, which is a, well, partial WebRTC implementation, and of SaltyRTC, which is an end-to-end encrypted signaling solution. I am apparently a W3C WebRTC invited expert, and I do work for Threema when I don't work on my personal projects.
So, WebRTC, in short, is about secure peer-to-peer communications, whether you're exchanging audio, video, or data.
And if you've made use of WebRTC, you have most likely made use of the webrtc.org codebase, a project which is driven by Google and which serves as the de facto reference implementation for WebRTC. It's a codebase which is widely used by browsers. Chrome uses it extensively.
Firefox also uses it, at least for the media part, less so for data channels. And with Edge converging towards Chromium, well, you know it. Edge will be using the same stack, too.
The webrtc.org codebase deserves a lot of credit because it has put WebRTC into the hands of millions of people. So, we probably wouldn't be talking about WebRTC today if there wasn't the webrtc.org codebase. Nevertheless, if you've tried to integrate this codebase into your own custom project, so we're probably speaking outside the browser space,
you will probably have noticed that this is a massive project, and integrating it is hard. Tracking releases is also quite a challenge. And to give you a ballpark figure about what I mean by a large project, here is a recent tweet by someone who should know.
So, Justin said it's weighing in at 1.2 million lines of code. So, I don't know how many of you have written that kind of code, but I have not. And so, what can we do, starting from here? Do you have any other choices if you want a library which supports WebRTC for your own project?
The answer is, luckily, yes. So, we're going to talk about two different libraries. For my part, I'm going to talk about aiortc. It is a WebRTC implementation which is written in Python.
It leverages modern Python's support for asyncio. It supports audio, video, and data channels, so it's got you pretty well covered. And weighing in at around 6,000 lines of code, it is much more pleasurable to hack on,
and I'm lucky enough to have reached full test coverage on it. aiortc started its life as a testing tool to test the availability of my company's WebRTC endpoint, and it has grown significantly since then. As a Python project, one of the key selling points is that you can tap into the broad Python ecosystem.
For its audio and video frames, aiortc relies on the PyAV project. PyAV is a binding to FFmpeg, so this gives you a lot of power,
whether it's in terms of reading media from various sources, whether these are MP4 files or an RTSP stream, it's got you covered, and it also gives you a lot of possibilities in how you output these media streams.
The Python ecosystem also gives you lots of options when it comes to building the signaling solutions. You have modules such as aiohttp and websockets, which are very handy. You also have lots of options if you want to do things like image processing or even machine learning with projects such as OpenCV and TensorFlow.
It's easy to feed these media streams or the frames of these media streams into these projects. aiortc comes with a growing collection of examples built right into it.
On the left here, what we have is streaming Big Buck Bunny from an MP4 file into the AppRTC demo website. This is something you can do with zero lines of code. There's an example for that built in. On the right-hand side is an example where the browser is talking to a Python-based server,
which handles both signaling and media, which applies some real-time processing on the video frames and sends them back to you, in this case with a cartoon kind of effect. What are some of the use cases for aiortc? On the data channel side, you can use data channels, for example, to communicate with embedded devices,
or you can have some more esoteric use cases such as running a VPN over data channels and so benefiting from the firewall punching features of WebRTC. There's an example for that on GitHub.
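The data channel use cases above all start from a peer connection that carries a data channel. A minimal sketch of that first step, assuming the aiortc package is installed; the helper name `create_data_channel_offer` and the `"chat"` label are made up for illustration, and signaling (getting the offer to the remote peer and its answer back) is left out entirely:

```python
import asyncio


async def create_data_channel_offer(label: str = "chat") -> str:
    """Return the SDP of an offer for a connection carrying one data channel.

    Hypothetical helper for illustration; it is not part of aiortc itself.
    """
    # Imported here so the module can still be loaded where aiortc is missing.
    from aiortc import RTCPeerConnection

    pc = RTCPeerConnection()
    try:
        channel = pc.createDataChannel(label)

        @channel.on("open")
        def on_open() -> None:
            # Only fires once a remote peer has answered, which this
            # sketch never arranges.
            channel.send("hello")

        # Same call sequence as in a browser, written with async/await.
        offer = await pc.createOffer()
        await pc.setLocalDescription(offer)
        return pc.localDescription.sdp
    finally:
        await pc.close()
```

Calling `asyncio.run(create_data_channel_offer())` should return the SDP text that a signaling channel would then carry to the other peer.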
There's also quite a wide range of applications which involve media processing or maybe machine learning. You can do things like real-time feature extraction and recognition on video streams. I've had some users report they wanted to use aiortc for a central server,
which would record video streams coming from mobile devices. Or if you want, you can also build your own solution to securely access your home, your home surveillance cameras on the go from your mobile device. Obviously, this is Python, so one of the strengths of Python is how expressive the language is
and how quickly you can prototype solutions using Python. The syntax will be very familiar for anyone who has used WebRTC. You're going to find your usual RTCPeerConnection, and thanks to Python's support for async/await,
well, you just call createOffer and setLocalDescription as usual. What's unusual is that you have some higher-level objects such as a MediaPlayer and a MediaRecorder, which allow you to either read or write your media streams. Now, if you're operating in a really constrained environment and Python's not even an option for you,
do you have any good solutions for that? Perhaps I do. So, RAWRTC is another alternative implementation, but it only supports data channels, so it's a little bit specialized. It is intended to be resource-friendly, so you can use it in embedded devices as well. It uses two libraries underneath, re and usrsctp,
and I originally created it in 2016 for testing purposes as well. So, my former professor wanted to have a tool to showcase, well, data channels
and test and improve the data channel implementations without having to work with or untangle the existing browser implementations such as the ones used by Chrome or Firefox. Okay, sure. C was a requirement, so I wrote it in C.
And, well, as I said, we used it to test a couple of things in the data channel implementations, and it was being used to patch the EOR problem, which is also known as message integrity violation, and then we backported that to Firefox.
And there were a couple of other improvements where we did the same thing. So, we also tested throughput of the data channels and then backported the necessary changes to Firefox. So, of course, you can, since it's now an existing implementation,
you can also use it for your own use cases such as applications and services. So, one of the things that I have seen that seem to be of interest is integrating it into an existing torrent library to implement WebTorrent.
Another example I've seen is some people seem to be interested in doing peer-assisted CDNs, which are just CDNs where you reduce the peak load by sending data peer-to-peer, and you could use RAWRTC for that.
But then there are, of course, also the embedded use cases such as using it for, well, we've used it for RC toys, so we've made an example where you can control the Lego Mindstorms robot with it, and it worked fine. But also, IoT use cases are in it as well, if there is a little bit of power on the device.
For example, exterior or interior illumination, and yeah. Furthermore, you can integrate it into an existing WebRTC implementation. If you don't have a data channel stack yet, this might be interesting. So, if you have ICE and DTLS, for example, in a selective forwarding unit,
then you can integrate RAWRTC into it so you can use data channels as well. This is one of the demos we wrote. So, on the right side, we see a browser that just opens multiple terminals, and it just accesses the local terminal on that device by using RAWRTC,
which is kind of fun since you can punch through the NAT and just access your device without having to forward any ports. So, in the process of producing these two alternative WebRTC implementations,
we encountered a number of common problems and had some thoughts about this which we wanted to share with you. First of all, what are the challenges if you decide that you want to spin your own WebRTC implementation?
Personally, the first problem was finding the relevant documentation because documentation is spread out across, let's say, two different worlds, the IETF world and the W3C world, and so you have to hunt down all the RFCs or possibly even draft RFCs
and then refer back to the W3C specs for WebRTC, and trying to wrap your mind around all this is quite challenging. So, I think we're maybe missing a sort of single entry point which would refer to these different documents
and give us a better overview of all the relevant specifications. The second challenge when implementing a WebRTC stack is that this is a deep stack, which involves a number of layers, and it's only getting bigger. You already need to deal with things like network connectivity at the ICE level,
then through encryption key derivation using DTLS, RTP and SRTP and RTCP for the media streams, and on the data channel side, well, you need an SCTP stack.
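The layers just listed, and the wish for a single entry point into the documents, can be sketched as a small index. This is only an illustration: the `Layer` type, the `layers_for` helper and the one-line role descriptions are made up here, and each RFC number is just the core document for that protocol, not a complete reading list.

```python
from dataclasses import dataclass

MEDIA = "media"
DATA = "data"


@dataclass(frozen=True)
class Layer:
    name: str
    role: str
    spec: str
    used_by: frozenset  # which kind of traffic needs this layer


# Ordered roughly bottom-up, following the order given in the talk.
STACK = (
    Layer("ICE", "network connectivity", "RFC 8445", frozenset({MEDIA, DATA})),
    Layer("DTLS", "encryption and key derivation", "RFC 6347", frozenset({MEDIA, DATA})),
    Layer("SRTP", "encrypted media packets", "RFC 3711", frozenset({MEDIA})),
    Layer("RTP/RTCP", "media packets and feedback", "RFC 3550", frozenset({MEDIA})),
    Layer("SCTP", "transport for data channels", "RFC 4960", frozenset({DATA})),
)


def layers_for(kind: str) -> list[str]:
    """Return the names of the layers needed for media or for data channels."""
    return [layer.name for layer in STACK if kind in layer.used_by]


print(layers_for(DATA))   # ['ICE', 'DTLS', 'SCTP']
print(layers_for(MEDIA))  # ['ICE', 'DTLS', 'SRTP', 'RTP/RTCP']
```

The split also shows why a data-channel-only implementation can stay small: it needs three of the five layers.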
Now, unfortunately for me, pretty much none of these building blocks existed in Python. They do now. We agree that it's a good idea to spin these out into reusable modules so that if someone else wants to take a different approach on implementing this WebRTC stack,
well, at least they have some of the basic building blocks to go on. A similar point is how to structure your code. If you've manipulated WebRTC, let's say with a browser perspective, you're used to this very central RTCPeerConnection object,
and from there it's not very clear how you should structure your code. Luckily, ORTC (Object RTC), which is a sort of different approach on the same project, does break down this kind of monolithic stack into some discrete objects,
and it provides some interesting guidance in how to structure your code. This was one of Lennart's tips early on when I started implementing aiortc, and I'm very grateful for it. Another issue is that you may run into parts of the specs
which are ambiguous or maybe downright wrong, and so it's challenging to contribute back to the W3C as substantial contributions are only allowed for members. Now, there are a number of benefits to having alternative WebRTC implementations.
The most obvious for me is that a standard only lives up to its name if it has multiple implementers. So in this sense, diversity is really a good thing. Also, these WebRTC implementations are a lot smaller than the webrtc.org one, and personally I find that a lot more fun to hack on and easier to integrate into custom projects.
In the process of developing these WebRTC stacks, we shook out a number of bugs in the browsers, and for the friendlier browsers such as Firefox, it's been a pleasure to contribute to the development effort. And having alternative WebRTC implementations also helps give valuable feedback to the standardization process
as you're able to prototype new features or possibly explore areas which were not originally envisioned in the WebRTC scope. Okay, so now that we've heard about the benefits of alternative implementations,
so maybe there are areas where we can improve, and I think there are. So one of the things is that browsers are still surprisingly far away from spec compliance, so if you want to get this changed, maybe this is the time to get involved.
There's another problem that is the data channel lobby I think is underrepresented in the specification process, which also brings me to the next point, which is how can we make the specification process maybe more transparent, maybe more visible, so we can involve developers and users
to provide direct feedback instead of posting it on Stack Overflow and then be done with it. Last but not least, from the feedback that we get, is that WebRTC is still misunderstood by developers, so some of them think it's mainly used for client to server,
while it is actually mostly peer-to-peer, there are of course use cases where you can use it for client to server, but it's intended to be peer-to-peer. And the purpose of the whole signaling process and how it works, how the mechanics work in the WebRTC stack, they are widely misunderstood.
So we think the documentation could maybe be improved, for example, in the Mozilla developer network, but maybe we should also post new blog posts that update the existing ones, because there are quite a lot that are outdated.
So with that, we thank you for listening. We have listed some further alternative WebRTC implementations there, if you want to look at them. Do you have any questions, if we have some time left?
In the meantime, we're going to let this figure sink in, which is kind of the ratio between the number of lines of code in webrtc.org and in aiortc. Yeah, so for aiortc, what are you using for congestion control for the media side?
Congestion control on the media side: at the moment, I mean, congestion control is implemented for data channels, that's part of the SCTP spec. There is no congestion control per se on the media side.
There is receiver-estimated maximum bandwidth, and so the video codec, the VP8 codec does respond to this and will adjust its bandwidth. I mean, for audio, for instance, there's absolutely no provision for this.
Either you have the bandwidth or you don't, and communication breaks down. I hope I answered the question. Last question, a quick one. Otherwise, you can meet the guys around here. So I completely agree on documentation.
It sucks, and the Mozilla developer network is just wrong 99% of the time when it comes to WebRTC, which is a real shame. So yeah, we should all, it's not really a question at all. I'm completely agreeing with you on that front.
I just want to talk about that 0.5 number. You guys, there's an awful lot of AV stuff in that SDK code base, and it's difficult to talk about a total number of lines when we're talking about transport and encryption and AV.
And I don't want people to think that, oh, why is the Google one really, really big and yours is really, really small? Definitely, this is kind of being a bit cheeky. But still, within the 6K lines of code that I mentioned, you do have a full ICE implementation, support for DTLS
and all the way up to receiver-estimated maximum bandwidth. But still, I think that somewhere, for sure, aiortc does more than 0.5% of the features. I'm not claiming full feature parity, but still, there's a massive Pareto effect going on here.
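The 0.5% figure follows from the two rough line counts quoted earlier in the talk (about 6,000 lines against 1.2 million). A quick check, with both counts treated as ballpark values; the variable names are made up for illustration:

```python
AIORTC_LINES = 6_000            # "around 6,000 lines of code"
WEBRTC_ORG_LINES = 1_200_000    # "1.2 million lines of code"

ratio_percent = 100 * AIORTC_LINES / WEBRTC_ORG_LINES
print(f"{ratio_percent:.1f}%")  # prints "0.5%"
```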
And concerning documentation, on MDN, you can contribute things such as what browsers support which features. We've both done some contributions to this effect, and I encourage you to do the same.