Building Realtime Web Applications with WebRTC and Python
Formal Metadata
Part Number | 63
Number of Parts | 119
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/20037 (DOI)
Production Place | Berlin
EuroPython 2014, 63 / 119
Transcript: English (auto-generated)
00:15
Hello guys, I'm Tarashish, I'm an undergrad student from India, and I'm sort of new to
00:24
web development, but I was a GSoC intern for the OpenHatch project last year. So I'm gonna talk about WebRTC and how you can easily build real-time web apps with
00:41
WebRTC and Python. So first, let me tell you what this talk is about and what it's not about. It's kind of a high-level overview of what WebRTC is, what it can do, and what APIs it has, and
01:00
then there's an overview of the signaling system: we'll get to what a signaling system is, how it works, and how exactly you implement it. Next, what it's not about: it's not really a deep guide to what WebRTC can do or how
01:27
the things work inside WebRTC, because let me tell you, I'm not experienced enough to actually tell you that, and it's not really an in-depth talk about signaling, because
01:40
that actually depends on your particular use case. So let's start with WebRTC 101. First question: what exactly is WebRTC? WebRTC is this new technology; it was developed at Google first, and then they
02:02
open-sourced it. It lets your browser, or any client for that matter, talk to another client, its peers, in real time. Before WebRTC, browsers didn't
02:20
really have the capability to get at media sources, like accessing your webcam or your microphone directly. We had to use some kind of native plugin, and download and install it, to let our browsers access the media sources. But now WebRTC actually provides
02:46
you some JavaScript APIs, so you can access media directly from your browser and communicate that media, or even arbitrary data, like textual data or binary
03:00
data to peers directly. It's sort of a peer-to-peer network. So, as I told you, WebRTC mainly has three functions. First thing, sorry, oops, oops,
03:32
sorry, okay, yes, three functions. First is to access and acquire the video and audio
03:41
streams from your webcam or your microphone. Second is to establish connections with your peers and actually transmit the video and audio data that you acquired to other peers. Third, you can communicate arbitrary data: suppose you
04:04
are building a game, you can just communicate JSON or something, or files, or whatever you want. For these three functions, WebRTC provides three JavaScript APIs. The first one is getUserMedia; it lets you pull out all those video and audio streams. Second
04:24
is RTCPeerConnection; this is where the magic really happens: you communicate with your peers and all that. Third is RTCDataChannel, which lets you communicate arbitrary data, not only audio and video. So first, getUserMedia: acquiring
04:42
audio and video. It really has a simple API: navigator is the global object, and navigator.getUserMedia gets you all the audio and video tracks. You pass a constraints object into it and specify the resolution and whether you want audio, video, or both,
05:07
it provides you tracks, a video track and an audio track, and you get channels, like the left channel and the right channel for your audio. And this is what the code looks like: this
05:20
is the constraints object, and you can pass video: true, audio: true, or both false, I don't know, then the resolution of the video you want and all that. Then you pass in the success callback, so if you get the stream, whatever you want to do with it you can do inside the success callback, and then the error callback for whatever went wrong,
05:47
then there's the RTCPeerConnection object. It really does all the heavy lifting: noise cancellation, echo cancellation, codec handling, peer-to-peer communication, security
06:00
and all that, and bandwidth management: it figures out what the bandwidth is for the clients and then what resolution you should transmit and all that. It's built in, so you don't have to worry about that; it does all this automatically.
06:23
Is there some way to influence the bandwidth management from the JavaScript side? No, I don't think so; it does that automatically. So this is what the code looks like: you get a peer connection object, then there
06:41
are these event handlers, for what you want to do when you get a remote video stream attached. First you create an offer and send that offer to your peer, then they send you back an answer, and then the communication actually begins. And then you can have a bunch
07:03
of these event handlers: what you want to do after you get the offer, what you want to do after you get the remote video stream, or something. Then the RTCDataChannel: the API is mostly like WebSockets, but with WebSockets you
07:24
actually send the data over to the server and then back to another client, but in the case of RTCDataChannel it's client to client, so the latency is pretty low. And you can choose between UDP or TCP, so it depends on what your data is, like whether you want a reliable
07:49
connection, a reliable transport, or a fast transport; you can choose. And it's secure: it's secured by DTLS. So in short, in summary, before WebRTC, a client used to
08:12
send data to the server and then the server pushed it on to another client, but now it's client to client, so everything's fast and secure, and no one is
08:23
snooping in between. The question is how exactly the peers find each other, because if you want to communicate with someone, you can't just go onto the internet and say, hey, I want to talk to
08:41
you, talk to me, connect to me. So you first need to figure out a way to find your peer on the internet, but the WebRTC standards don't actually specify how to find your peer; it's up to us to implement that
09:07
in whichever way we want. This thing is called signalling: signalling is the process where you find your peer and then communicate the essential data
09:28
that is needed to establish the initial connection and start transferring the actual data. So I told you that WebRTC is actually peer to peer, but we still need servers to actually
09:48
build the connection. One example of what signalling does: peers connect to
10:01
a certain server, then the server tells them, hey, this is A, this is B, and then you can talk all you want without saying anything to the server. So before the actual call begins, before the actual data begins to transfer between the clients, you need to sort of
10:23
pass this metadata to the other peers: you need to pass what codecs you support, how your bandwidth is and stuff, and what exactly your public-facing IP is, so that
10:41
the connection can be made and all that, and it's all taken care of by the signalling mechanism. Okay, so this is what it looks like: the apps communicate by signalling at first to exchange the session description. Once you get the session description from your other peer,
11:05
you pass it into the browser and then the media actually flows from browser to browser; you don't need the server then. So let's talk about how you can actually implement signalling.
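As a rough illustration of the exchange described above, the sketch below models the offer and the answer as JSON messages passed over some signalling channel. The envelope fields (`type`, `from`, `sdp`), the helper names `make_signal` and `handle_signal`, and the placeholder session descriptions are assumptions made for illustration; they are not taken from the talk's code, and real session descriptions are generated by the browser.

```python
import json


def make_signal(kind, sender, payload):
    """Wrap a session description in a JSON envelope (hypothetical format)."""
    if kind not in ("offer", "answer"):
        raise ValueError(f"unknown signal kind: {kind}")
    return json.dumps({"type": kind, "from": sender, "sdp": payload})


def handle_signal(raw, my_name):
    """Return the reply a peer would send, or None once the exchange is done."""
    msg = json.loads(raw)
    if msg["type"] == "offer":
        # The callee responds to an offer with its own session description.
        return make_signal("answer", my_name, f"<session description of {my_name}>")
    # Receiving an answer completes the exchange; media can then flow directly.
    return None


# Peer A creates an offer, peer B answers, and A has nothing further to send.
offer = make_signal("offer", "A", "<session description of A>")
answer = handle_signal(offer, "B")
done = handle_signal(answer, "A")
```

Only these small messages travel through the signalling channel; the audio and video themselves never do.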
11:25
So signalling is just another messaging service: you just need some way to have a bidirectional flow of data from one client to the other;
11:40
it can be via a server, or you can even send your session description via email or some kind of messaging service, or even a messenger if you want. It doesn't matter; you just need some kind of messaging service, and it has to be bidirectional. So
12:03
for that we have WebSocket, which is kind of the newest technology for signalling, and you don't need to worry about browser support for WebSocket because all the browsers
12:21
that support WebRTC support WebSockets too. Then there is XMPP Jingle; I don't know what that is, but it's used for all those VoIP things. Then there are commercial platforms like Pusher and PubNub that do the message processing for you, and Google App Engine has this
12:45
messaging channel API that lets you pass messages from one client to the other. Then there is the RTCDataChannel API of WebRTC itself; it's not actually self-sufficient
13:02
because you still need to get the connection going, but it has lower latency than WebSocket, so you can just do the first connection via WebSocket and then switch to RTC
13:21
data channel, and the signalling will be much faster and much more secure. So this is what it looks like: the client sends the signalling data to the server, then the server sends it on to the other client, and then stuff just happens. So let's see the code.
13:46
I actually implemented it in Tornado. I have these rooms; rooms are like channels, they have clients connected to them, and they track
14:02
which clients are connected to each room. Then there's this room handler just serving a static page, and this main handler just redirecting people to a certain room. This is where the signalling actually happens: it's just a WebSocket handler that
14:22
when it receives a message, just broadcasts it to the other clients in the room, and that's it; that's all you need to implement the signalling system. The full code is actually on GitHub, I have the link to it,
14:44
and then I just start the server and it's all done. Then on the client side I have this init function that gets the video and audio data and all that, and it has all the peer connection stuff that we talked about, and it's really just that simple. But there's a catch,
15:12
the thing is, this was really small in scale. When you actually
15:23
build bigger stuff, in something like a clustered environment, it doesn't work if you just have a global dict of rooms with all the clients connected and all that. You need
15:41
something better, a better messaging system, and I think it can be implemented easily by using something like ZeroMQ. Actually, the people at TokBox, TokBox is this platform that
16:01
provides you WebRTC infrastructure to build your apps on, they actually built this messaging system and their entire WebRTC signalling system with ZeroMQ. It's called Rumor, I think; I don't know if it's open source or not. So you can always do that
16:25
inside the WebSocket infrastructure. So the code is on GitHub; sorry, I should have actually put the link there, sorry.
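The single-process design described above, a global dict of rooms whose handler rebroadcasts each message to the other clients in the room, can be sketched in plain Python roughly as follows. This is not the speaker's code: `FakeClient` is a hypothetical stand-in for a WebSocket connection (its `write_message` method mirrors the name Tornado's `WebSocketHandler` uses), and `join`, `leave` and `broadcast` are illustrative helper names.

```python
from collections import defaultdict


class FakeClient:
    """Hypothetical stand-in for a WebSocket connection; records what it receives."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def write_message(self, message):
        self.inbox.append(message)


# Global mapping of room name -> set of connected clients.
# This only works while every client is connected to the same process.
rooms = defaultdict(set)


def join(room, client):
    rooms[room].add(client)


def leave(room, client):
    rooms[room].discard(client)
    if not rooms[room]:
        del rooms[room]  # drop empty rooms so the dict does not grow forever


def broadcast(room, sender, message):
    """Relay a signalling message to everyone in the room except the sender."""
    for client in rooms.get(room, ()):
        if client is not sender:
            client.write_message(message)


alice, bob, carol = FakeClient("alice"), FakeClient("bob"), FakeClient("carol")
join("demo", alice)
join("demo", bob)
join("other", carol)
broadcast("demo", alice, '{"type": "offer"}')
```

Because `rooms` lives in one process's memory, a second server process would not see these clients, which is the limitation that motivates a shared messaging layer.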
16:42
It's the link but I don't know, the text should have, oh shit, the text should have said the link actually, okay. Maybe tell your GitHub user name. It's right there, Sunu on GitHub and Adzi on Twitter, thank you, any questions?
17:16
I think you have time for a few questions, yes.
17:20
No questions, thank you again. Actually I have a demo. One question: I played a bit with WebRTC, and there are quite a few services on the internet
17:44
where you can test video and audio conferencing with a small group of people, but quite often the problem was that with two people it works quite okay, while if you try with four people it's somehow stuttering and having issues, and people get lost somehow or don't get in,
18:04
is there some trick to tune it a bit somehow? It depends on the bandwidth, because with four or five people communicating with each other at once, you send your stream separately to each client. So
18:29
the trick would be to set up something on the server, use WebRTC infrastructure on the server, send the stream to the server only, and then have the server send that
18:47
to each client, so you have to upload your video stream only once, to the server, instead of to all five clients separately.
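The arithmetic behind this answer can be made concrete with a small sketch. The function names and the 500 kbit/s per-stream figure are illustrative assumptions, not numbers from the talk: in a full mesh each participant uploads one copy of its stream per other participant, while with a relaying server it uploads a single copy.

```python
def upload_streams(participants, topology):
    """Number of copies of its own stream that each participant must upload."""
    if participants < 1:
        raise ValueError("need at least one participant")
    if topology == "mesh":
        # Full mesh: one separate copy for every other participant.
        return participants - 1
    if topology == "relay":
        # Server relay: one copy to the server, which forwards it to the rest.
        return 1 if participants > 1 else 0
    raise ValueError(f"unknown topology: {topology}")


def upload_kbps(participants, topology, stream_kbps):
    """Total upstream bandwidth per participant for a given per-stream bitrate."""
    return upload_streams(participants, topology) * stream_kbps


# Assumed bitrate of 500 kbit/s per video stream, for illustration only.
mesh_cost = upload_kbps(5, "mesh", 500)
relay_cost = upload_kbps(5, "relay", 500)
```

With five participants the mesh needs four uploads per person against one for the relay, so upstream bandwidth tends to run out first in the mesh.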
19:01
Any more questions? Dougal, I think you can set up now. Well, thank you for that, I know, oh, one more. Is NAT traversal in any way addressed in WebRTC?
19:21
Yeah, there are some protocols called STUN and TURN. STUN actually lets a client know its own public IP, so of course you try STUN first, but if it still
19:42
doesn't work, then you can try TURN, which actually relays the data through a TURN server to the client you are addressing. Okay, thank you. Thanks. Thank you.
20:01
I know Tarashish was worried; this is his first time speaking at EuroPython. Very well done and very interesting.