Asynchronous I/O and the real-time web
Formal Metadata
Title: Asynchronous I/O and the real-time web
Number of Parts: 160
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/33672 (DOI)
Transcript: English (auto-generated)
00:05
Okay, so before I introduce myself, I want to tell you a quick story. So recently I interviewed an engineer for a position who had a Node.js background.
00:23
And Node.js is really great. This is Node.js conference room. Okay, so he had a lot of Node.js background and I asked him a question. I said, so what do you like about Node.js? And he said, well, it's single threaded. It's awesome.
00:43
And I said, okay, what does that mean? And he couldn't explain it. So he didn't get the job, but the important part of the story is that it made me realize that maybe there are developers out there who are either using concurrency and async technologies
01:06
or want to use it and they really don't know what it means and they're kind of scared of this whole topic. So it's interesting that I think this room alone hosted three talks about AsyncIO just today, which really is pretty significant
01:25
and tells me that a lot of people are interested in this. So in this talk, I'm really going to try to demystify what this is and why it's relevant within the context of web development.
01:40
So hi, everybody. My name is Amit, Amit Nabarro. I come from Israel. This is a photo of Tel Aviv. Kind of looks like Rimini, I suppose. It's really hot right now, but it's a lot of fun. I work for 475 Cumulus, which is a consulting agency,
02:07
and I rant on my Twitter feed over there. Okay, so how many people here are doing web development with Python?
02:20
Okay, that's a good number. So you guys probably know at least one of the frameworks which are now on the screen. Probably Django is the most prominent one, or at least the most widely used one, but there are all other kinds like Pyramid and Flask and I think actually it's a really long list.
02:43
And all those frameworks, first of all, those are frameworks. Those are not web servers, if anyone has ever made the mistake of confusing the two. Those are libraries. But the important thing is that they all have one thing in common,
03:03
and the common thing that they all share is that they all work or implement WSGI. WSGI is a standard which was first established in 2003, and basically what it is, it's a glorified CGI, if anyone's old enough to know what that is.
03:24
And basically the whole point of WSGI was to try to create a specification where Python web frameworks can easily work under production web servers.
03:41
And the way WSGI works is that you write your app in a framework which supports WSGI, and then pretty much any production web server like Apache or NGINX can hook up to it and serve your application. And you don't really need to know how that works and you don't really need to care,
04:01
you can really just focus on the application logic. So what's wrong with WSGI? Well, actually nothing is wrong, so thank you all and have a great conference. I'm kidding.
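For reference, this is about the smallest WSGI application you can write; the module name in the comment is just illustrative, and the Gunicorn command assumes it is saved as hello_wsgi.py:

    # hello_wsgi.py -- a minimal WSGI application
    def application(environ, start_response):
        # environ carries the request data; start_response sends the status and headers
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello from WSGI\n"]

    # Any WSGI server can now serve it, e.g.:  gunicorn hello_wsgi:application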
04:20
There are two things that are wrong with WSGI. First of all, it's synchronous. It's synchronous in the sense that it can't handle multiple requests at the same time, and I'm going to put that in quotes because it's not entirely true, but we'll talk about this in a minute.
04:42
And then the second thing, which is probably more important, is that it only supports the HTTP protocol. It doesn't support any other protocol. So let's dive into that a little bit. Okay, so if you are writing a WSGI-based application, this is kind of what your flow looks like.
05:03
You have a web server, or a WSGI web server. Every request that comes in, the WSGI server creates a thread, an OS thread, an operating system thread. And then this thread processes your request,
05:20
your code runs in this thread, and you handle it, and you do stuff, and you go to the database, and you do all kinds of whatever your application logic does, and then when the response is ready, it's sent back to the client. And during that time, that thread is blocked.
05:41
So if you were to run a WSGI server with a single thread, and you are doing that when you're using Django's runserver, it's a single-threaded web server, you can basically only process one request at a time. If you're running a production web server like Gunicorn or uWSGI,
06:03
then potentially you can create as many threads as you want, but if you only have a few cores on your machine, it doesn't really matter how many threads you have, you are limited by the number of CPU cores. So I said earlier it's not exactly synchronous.
06:25
Well, all your code is running synchronously one after another, and then you really don't get a lot of options there. So how do we solve this? There are some huge systems out there,
06:41
Instagram running on Django, Washington Post running on Django. How are they doing that? How are they processing so many requests? Well, what they do is they scale. That's what they do. They just run more servers and more servers, and they optimize their code, and they run more and more servers,
07:02
and that's a pretty good approach. It solves your problem. The only thing that is disappointing in this approach is that it's very linear. So eventually, depending on your code, depending on what you're doing, let's say your single thread on a single core
07:22
can handle 100 requests per second. So if you need to handle 1,000, then you need 10 of those. If you need to run more than that, then you need to duplicate your servers. So the scaling is very linear, and you cannot actually improve where your pain points are
07:43
and where your bottlenecks are. So that's one of the biggest problems of WSGI. The second problem is that it only supports HTTP.
08:02
So HTTP, if anyone doesn't remember, is a stateless protocol. It means that the client sends a request, the request is processed, and then the response goes back to the client, and that's it. There's no more connection anymore.
08:21
And if the protocol is stateless, then it's very difficult to create stateful communication between clients and servers. And stateful communication or bidirectional communication or whatever you want to call it, it's a pretty hot item. Everybody wants to do it today.
08:42
And HTTP just doesn't support it. So there are all kinds of workarounds. You can do long-polling. You can do server-sent events. I don't even know what that is.
09:01
Really nasty stuff. So we reached kind of like a glass ceiling. WSGI is wonderful. It allows such a huge community of engineers all over the world to quickly build web applications with Python.
09:22
But for some apps, it's just not really relevant anymore. So there is a solution. And that solution is concurrency.
09:41
That's a hell of a word. Let's see what concurrency means. Let's look at Wikipedia. So according to Wikipedia, concurrent programming is a form of computing in which several computations are executed during overlapping time periods,
10:01
concurrently instead of sequentially. That's what Wikipedia says, and we all know that Wikipedia is never wrong. So we're going to take that for granted. But what it actually means, let's see if we can use this diagram here.
10:22
But before I talk about the diagram, I want to make sort of a statement. And my statement is that most web applications, most of what they do is they perform I/O operations. I'm talking about web applications.
10:40
Mostly, that's what they do. They go to the database to fetch data. They go to the cache to fetch data. They go to a file system. They send HTTP requests to other servers or microservices or whatnot. These are all I/O operations.
11:01
And those I/O operations are I/O intensive, but not necessarily CPU intensive. And then what ends up happening is that those web servers, they mostly just wait for things to happen. We have those really powerful CPUs. And most of the time, they just wait for I/O
11:22
to leave or come back from some other place. So if we were to look at this example, the diagram that I'm showing here, if we had a single-threaded WSGI server, and it got four requests from four different clients, it will process them sequentially.
11:43
Okay? You'll first do the blue one and then the orange one, then the green one, you guessed it, the purple one. Okay? And concurrency is all about doing it differently. And instead of doing this, we want to do this.
12:05
What does it mean? It means that, okay, I am sending an I/O request to my Postgres database, and while I am waiting on a reply from my Postgres database, instead of waiting, I can do something else. I can handle another request, which maybe wants another Postgres query
12:23
or maybe wants me to get a file from the file system. Okay? So instead of doing all those requests sequentially, we sort of mix them together, and we hand over control from one request to another, in order to optimize what the computer is doing.
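Here is a minimal sketch of that idea using asyncio from the standard library; fake_query is just a stand-in for a real database call, and the timings are illustrative:

    import asyncio
    import time

    async def fake_query(request_id):
        # Stands in for waiting on Postgres, a cache, the file system, etc.
        await asyncio.sleep(0.5)
        return "result for request %d" % request_id

    async def handle_all_concurrently():
        # All four "requests" overlap, so the total is ~0.5s instead of ~2s sequentially
        return await asyncio.gather(*(fake_query(i) for i in range(4)))

    loop = asyncio.get_event_loop()
    start = time.time()
    print(loop.run_until_complete(handle_all_concurrently()))
    print("took about", round(time.time() - start, 2), "seconds")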
12:45
So here's another way of thinking about it. If you look at the bottom video, you see an eight-year-old request and a six-year-old request happening sequentially.
13:01
First, what is it? Peach one, and then the white one. And then on the top video, you see them happening concurrently, right? They're both processed at the same time. By the way, they're very proud to be here today, just to let you know. So, okay, okay, good, good.
13:22
You're saying we can process one request while waiting on the other request? Okay, sounds good, but how do you do that? Well, there's only one secret ingredient to this. Don't let anyone tell you differently.
13:43
You have to explicitly give up control. I know people don't like that term, give up control, but I think that within our context, that's a very good thing to have. And really, the emphasis here is on the word explicit.
14:04
While you run a query on your database, in your code, you will explicitly hand over control to some other request that comes in to do its thing, knowing full well that someone else will eventually relinquish control
14:23
and give it back to you once your query returns from the database, okay? I know that's kind of scary for first-timers, but let's see if we can clear that up. Okay, so if you've been doing some JavaScript in your professional life,
14:42
then you probably know this pattern here on the left where I use jQuery to get some data from a web server. And this is pseudocode, by the way. And then as soon as that data comes in, I have a callback, an anonymous function in this case,
15:03
and I do something with this callback once it comes back. And everyone who's done JavaScript before knows that if we look at the code here on the left, we know that do something will execute after
15:21
do something else, okay? Because do something else is called immediately when get is done, and do something is called as a callback once the request finishes. In Python, we can do something very similar. It depends on which version you're running,
15:42
but in Python 3.5, and I'm sure you've heard it before, we have those new cool words, await and async. And essentially, this is kind of similar. It's not exactly the same, but it's kind of similar where I call a function called fetch data,
16:00
and fetch data isn't a regular function. It's what's called a coroutine, or an async routine. And if you remember earlier, I said we relinquish control. We say, okay, I sent that request. Now, I'm going to wait for it. Somebody else, here, go ahead, use the CPU, and once my data is fetched,
16:22
I can do something, and then I can do something else. And in this case, do something is executed before do something else, because in Python, unlike JavaScript 5, we have an easy way to write asynchronous code in a structured way,
16:41
while in JavaScript 5, we cannot. In JavaScript 6, they kind of copied every cool thing from Python, so now they have await and yield and whatnot, but those concepts have been around for a while in Python. Before 3.5, we used yield from rather than await,
17:01
and we used a decorator rather than the word async. Okay, so how does this work? Earlier, I said that a WSGI server operates under the assumption that every request creates a new thread and that the request is processed by the thread,
17:22
meaning that if you get a lot of requests, you get one thread per request. There are two problems here. Problem number one is that thread management is done by the OS, not by you, which I guess most of the time is a good idea.
17:40
And the second problem is that thread creation is expensive and limited. Creating a thread, destroying a thread is an expensive OS operation. It takes time. A single instance or server is limited in the number of threads it can create, okay?
18:04
Therefore, if your server has to handle 50,000 requests a second, it's going to run out of threads at some point. The way concurrency works is that everything is handled on a single thread.
18:26
And unlike that interviewee, I'm going to actually try to explain it to you. So what concurrency uses is a concept which is called an event loop. An event loop is something which is triggered by
18:43
a mechanism in the operating system kernel. On Linux, it's called epoll. I forget what it's called in Windows. And essentially what it means is that we can create functions, but we don't actually call them directly. We just declare them, stick them into the event loop,
19:04
and then at some point, they're going to be pulled out, executed, and their results shoved back into the event loop for our code to receive, okay? So here on the right, we see all those different kinds of operations which we can do asynchronously.
19:24
We have file system access. We have data store access. And then a request comes in. It creates a coroutine. It shoves it into the event loop. And the event loop, it has a queue, right? It just has a queue of a lot of
19:43
handles to coroutines which it processes, and it'll process them one at a time. And as soon as one coroutine explicitly gives up control, okay? Let's go back to that giving up control thing. As soon as you call await or yield from or whatever mechanism in different languages,
20:03
you basically tell the event loop, okay, I'm going to wait now. Go ahead, run those other coroutines in your queue, and then come back to me when I'm done. And to me, that's kind of like good citizenship, you know?
20:22
You say, okay, I'm not going to waste those shared CPU resources. I'm going to wait on my thing, knowing full well that I'm going to get back the CPU once my data comes back. So essentially, this is how event loops work in a nutshell, okay?
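A small sketch of that, again with asyncio; asyncio.sleep stands in for real I/O, and the names and delays are made up:

    import asyncio

    async def handle_request(name, delay):
        print(name, "waiting on I/O")
        await asyncio.sleep(delay)   # explicitly gives control back to the event loop
        print(name, "got its data back")

    loop = asyncio.get_event_loop()
    tasks = [loop.create_task(handle_request(name, delay))
             for name, delay in [("blue", 0.3), ("orange", 0.1), ("green", 0.2)]]
    loop.run_until_complete(asyncio.wait(tasks))
    # All three start immediately and finish in order of their delays,
    # interleaved on a single thread.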
20:43
It's kind of a little bit more complicated than that, but we don't have time for that today. And then that really solves the problem of processing a lot of requests at the same time. The second thing which is really nifty and cool
21:03
is this thing called WebSockets. We're going to go back to Wikipedia here. Okay, so WebSocket is a computer communication protocol providing full-duplex communication channels over a single TCP connection. Okay, first of all, do not mistake WebSockets for regular sockets.
21:23
It's not the same thing, okay? WebSockets is a protocol on top of HTTP which was created for a single purpose, and that is having bidirectional communication with a browser. That's it, okay? And how does that work actually?
21:41
You have a client. The client makes an HTTP request to the server with a request to upgrade: I want to upgrade my relationship with you from a single direction to bidirectional communication. And then at that point, both server and client can send messages back and forth to one another.
22:02
And then each side can also terminate the connection if needed. That's really the shortest explanation of WebSockets ever. Okay, so let's talk a little bit about what's available for us.
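Before getting to the libraries, here is roughly what such a handler looks like with aiohttp, one of the frameworks mentioned below; the /ws route and the echo behaviour are made up for the example:

    from aiohttp import web, WSMsgType

    async def websocket_handler(request):
        ws = web.WebSocketResponse()
        await ws.prepare(request)        # performs the HTTP Upgrade handshake
        async for msg in ws:             # messages now flow in both directions
            if msg.type == WSMsgType.TEXT:
                await ws.send_str("echo: " + msg.data)
        return ws

    app = web.Application()
    app.router.add_get("/ws", websocket_handler)
    # web.run_app(app)                   # uncomment to actually serve it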
22:26
When it comes to libraries that do concurrency, we have Twisted and Tornado which are pretty old. By the way, Tornado is just absolutely fantastic. And if you are forced to use Python 2,
22:42
then Tornado is probably your option. Otherwise, I would suggest you use asyncio. It's part of the standard library and it has a growing ecosystem around it and it's very promising. If you need a web framework,
23:00
let's not confuse frameworks with libraries, then on top of asyncio, we have Sanic, which I was exposed to for the first time here at this conference, and I was really impressed. If you come from Flask, Sanic would be really nice. aiohttp is also a fantastic option.
23:20
I've been using it for a while now and I'm very happy with the way it performs and the way it moves forward. And if you are on Django and really going non-WSGI is not an option for you, then Django Channels seems to be a good compromise and I suggest you look into it.
23:44
Okay, the advantages are efficiency, kind of self-evident by what I just explained, not having to wait for something to happen and in the meantime being able to do something else. It solves the C10K problem.
24:01
C10K is, you can look it up on Wikipedia, essentially a description of what happens when you have 10,000 concurrent connections on your server. You can spawn tasks and you can do it easily without a system like Celery
24:21
and don't get me wrong, Celery is amazing, but if you just need simple stuff, then you can do that without Celery. And obviously it gives you bidirectional communication, and that's really, you know, if you want to improve the user experience of your apps, then bidirectional communication is kind of a must-have.
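As a sketch of spawning a task without Celery, the handler and the email coroutine below are hypothetical, but the fire-and-forget pattern is plain asyncio:

    import asyncio

    async def send_welcome_email(user_id):
        await asyncio.sleep(1)        # stands in for talking to an SMTP server
        print("email sent to user", user_id)

    async def signup_handler(user_id):
        # Fire-and-forget background task: the response isn't delayed by the email
        asyncio.ensure_future(send_welcome_email(user_id))
        return {"status": "created", "user": user_id}

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(signup_handler(42)))
    loop.run_until_complete(asyncio.sleep(1.1))   # give the background task time to finish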
24:43
Pitfalls, very hard to debug, even worse when you have to test. If I had more time, I'd show you. You really have to watch for locks and race conditions and once you get your hands dirty, you're going to run into those
25:01
and the most important thing is, I tell that to everyone, if you don't write concurrent code all the way, then you're wasting your time. So if you have a concurrent web server but your database access isn't concurrent, then you've done nothing. Nothing is concurrent. So your code has to be concurrent all the way.
25:24
Thank you. Thank you very much for listening. Do we have time for questions? Awesome. Any questions?
25:48
So you showed us the process with the keyword await, which handles giving up the control. What is the other keyword doing?
26:04
The async, I mean. Async essentially acts kind of like a decorator. It takes a normal Python function and says, instead of just executing it, return something which is called a future
26:21
and unfortunately we didn't have time to delve into that, and then you put that future into the event loop and it gets executed on the event loop's schedule rather than immediately. Once you dive into this and you get your hands dirty, you will figure this out really quickly.
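A tiny sketch of what that answer means in practice; fetch_data is a made-up coroutine:

    import asyncio

    async def fetch_data():
        await asyncio.sleep(0.1)      # stand-in for real I/O
        return {"status": "ok"}

    coro = fetch_data()               # does NOT run anything yet: it returns a coroutine object
    # It only executes once the event loop schedules it (wrapping it in a Task, a kind of Future):
    result = asyncio.get_event_loop().run_until_complete(coro)
    print(result)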
26:47
So you said that the main thing is that you should give up the control explicitly. Can you compare that, or can you just say a few words about a solution like gevent or the implicit context switching?
27:04
So gevent, thanks for bringing that up. gevent is an implicit way of handing over control. By the way, Django Channels is using gevent, or libevent, I'm not really sure what the difference is.
27:21
And I think the main difference is that you really have absolute control over when you are handing over control, rather than letting the library figure it out based on what's going on. I think that's the main difference. Over here.
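For contrast, a rough sketch of the implicit style, assuming gevent is installed; the hostnames are placeholders:

    # Implicit switching with gevent: after monkey-patching, blocking calls such as
    # DNS lookups or socket reads quietly yield to other greenlets -- no await in sight.
    from gevent import monkey; monkey.patch_all()
    import socket
    import gevent

    def lookup(host):
        # Looks like normal blocking code, but gevent switches under the hood while it waits
        return socket.gethostbyname(host)

    jobs = [gevent.spawn(lookup, host) for host in ("example.com", "python.org")]
    gevent.joinall(jobs)
    print([job.value for job in jobs])

    # With asyncio, every switch point is marked by an explicit await instead.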
27:41
Great talk. So you said async returns a future, so coroutines return futures and the await keyword kind of gives control to the event loop and after this is ready, we continue with the rest of the code. Is there an API if we want to do something more complex
28:00
like work with the future object, make a promise chain when everything's resolved, call this callback. Something like in JavaScript world right now. Absolutely. The example I showed is like the simplest example. You can launch hundreds of coroutines,
28:24
finish calling all of them, collecting all your future objects and then wait for them. You can say, okay, they're going to happen one after another but if one of them fails, I want out. You have a lot of flexibility. If you're coming from the JavaScript world
28:41
and you're doing jQuery with promises and then and when and all that, it's kind of similar, it's not the same but you have a lot of flexibility. And there's still time for one final question so if not, then there will be a coffee break.
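To make that answer concrete, two common patterns with the asyncio API; fetch_data is again a made-up coroutine:

    import asyncio

    async def fetch_data(i):
        await asyncio.sleep(0.1)      # hypothetical I/O-bound coroutine
        return i

    async def gather_all():
        # Launch many coroutines and wait for all of them, keeping exceptions as results
        return await asyncio.gather(*(fetch_data(i) for i in range(100)),
                                    return_exceptions=True)

    async def fail_fast():
        # "If one of them fails, I want out": stop waiting at the first exception
        tasks = [asyncio.ensure_future(fetch_data(i)) for i in range(100)]
        done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
        for task in pending:
            task.cancel()
        return done

    loop = asyncio.get_event_loop()
    print(len(loop.run_until_complete(gather_all())))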
29:02
Please give Amit a hand. Thank you. Thank you very much.