Understanding Non-blocking IO

Formal Metadata

Title
Understanding Non-blocking IO
Title of Series
Part Number
75
Number of Parts
173
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language
Production Place
Bilbao, Euskadi, Spain

Content Metadata

Subject Area
Genre
Abstract
Vaidik Kapoor - Understanding Non-blocking IO. As an engineer working on any web stack, you may have heard about Blocking and Non-Blocking IO. You may also have used a framework or library that supports Non-Blocking IO. After all, they are very useful, as you don't want to block execution of other tasks while one task is waiting to complete a network call to another service (like an HTTP call to an API or maybe a TCP call to your database). Non-Blocking IO lets us carry on with other tasks instead of waiting for IO. This also helps us handle many more connections than we possibly could with Blocking IO. Python supports Non-Blocking IO, but we always use some existing third-party library that hides all the gory details and makes it all look like black magic to the uninitiated. But there is no black magic. This presentation will be an introductory talk focused on explaining how Non-Blocking IO works, which is the basis of libraries like Gevent, Tornado and Twisted. We will learn how Non-Blocking IO can be implemented using the most basic modules that form the base for the above-mentioned libraries. Hopefully after this talk, Non-Blocking IO will not be an unsolved mystery for you anymore.
Transcript: English (auto-generated)
Good morning, everyone. Good evening, everyone. First of all, I'm really excited to be here at EuroPython. It's my first time. And it's also amazing to be the speaker here. So thanks to everybody who was involved in selecting talks and giving me this opportunity. Today
we are going to be talking about non-blocking IO, specifically how that works with Python. So a very high-level overview. We are going to look at what non-blocking IO is and try to understand it through examples. And essentially, why do we need to do this really? So these are not going to be very practical examples, but an understanding of the concept as to what
really goes on when you talk about non-blocking IO and whenever you try to use that practically. And it is a rather beginner-level talk, so expect the content to be like that.
But just a little introduction about myself. I'm Vaidik. I've been working with Python for about four years. As of now, I work with a startup based out of New Delhi, India. And I'm an infrastructure engineer there. And just in case you want to connect with me here,
I'll give you some of my social network links. But some background as to why I'm doing this talk. Long back when I was in college, I started out as a web developer. And just out of curiosity, you start moving down the stack, exploring more things. And somewhere there was this project, a web application, which required
a little bit of scaling to handle more connections on just one box that was very limited resources-wise. And along the way, when I was checking out some material, some content around how to scale web applications, I encountered gevent.
And well, I used it, but then it was that time when I didn't really understand how it worked. So I always wondered how does this thing really work. But other than that, what I'm going to talk about today is something that I have not really seen anybody talk about, especially at any Python
conferences. You would see just about every other Python conference have a talk on Twisted or Tornado or asyncio these days. In fact, I think there's a workshop going on right now, parallel to this talk. But nobody really talks about the underlying infrastructure that
these libraries or frameworks make use of. And I plan to shed some light on that. So non-blocking IO, or just before we start understanding what it is, let's look at what is blocking. So blocking, just a very simple definition of it is,
a function or a code block is blocking if it has to wait for something to complete. Right. That's probably the simplest definition I could come up with. And what that means is that, let's say, if you have a function that makes an HTTP request to another API,
and does something with whatever response it gets, the script cannot really progress until it gets a response. Extend the same example to any kind of network interaction, network call. For example, talking to your MySQL or Postgres database,
and your application cannot progress or do the next thing until you get a response back from that. Other examples could be, you know, you have some function that does some statistics or some mathematical computation, and that just takes time. For example,
some complex integration, and it may take a while before the next thing can be done in the program or in the application. Another example could be waiting for the user to input something, for example, on the console. All these things are blocking. So the problem with blocking code
is that it is capable of delaying execution. And as long as tasks are related to each other, that's fine, because you cannot do one thing, because it depends on the other one. But what if there are independent tasks in your application which can actually progress
alongside each other, if not at the same time? So for example, you have a single-threaded web server, which is usually the case in the Python world. We don't really write many multi-threaded web applications, at least with CPython. And you get a request. Your request handler is running,
and it makes a call to your database. And at the same time, you get another request. So the first handler is running. Your web server is single-threaded. It cannot really serve the other request, because the first request is blocking the other request. Another simple example could be that you have workers consuming tasks from some queue, and
usually you offload, you know, heavy processing to workers, which asynchronously process your tasks. So if your worker is doing something, and it is blocking, well, it could be any reason why it's blocking, but if it is blocking because of IO, then you're basically
not doing anything at that time, because when your code gets blocked because of IO, you're not really making use of any CPU. But what it really comes down to is that the overall system is not able to progress. I mean, blocking is fine, but blocking other things that are independent, which could be done
while any other thing is blocking and not doing anything useful at that time, that is not good. And as engineers who want to write systems or applications that need to serve multiple users or multiple, you know, consumers,
they may be users or they may be any other application, we don't really want that. So now let's talk about IO, at least for modern applications, the applications that we write today. Things like dealing with the network, reading or writing from your file system,
doing operations on pipes, these are the kinds of things that would fall under IO. That may not be exhaustive, but in general, if you want to define any kind of IO operation, it would be dealing with file descriptors or doing any operations on file descriptors.
So today, at least for the scope of this talk, we are going to talk about dealing with the network and how to implement non-blocking IO while, you know, doing networking in Python. So non-blocking IO is essentially dealing with IO so that it does not block execution or
execution does not get delayed because of IO. And to understand that, let's look at some examples. I don't know if that is good enough for people to read. No? Yes? Okay. It's small. That doesn't seem to be helping. Okay. What I can try to do is...
Better? Yes. All right. So we will look at some code in Vim instead. So this is a
very hello world kind of a thing that we look at when we start out with network programming in Python. This is a very simple server. And all it does is it accepts a connection. It waits for some data from the client that is trying to connect, gets the data,
prints that data and then waits, tries to get more data and does that as long as the client is trying to send some data. So this is a very simple script. Probably most of us here in the room who have done any kind of network programming in Python would have
written something like that. So for the scope of this talk, we are not really going to be making use of this anymore. This is as simple as it gets. We won't really look at this again.
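The speaker's script is not reproduced in the transcript. As a rough sketch of the kind of blocking server described here, assuming a TCP socket on localhost (the helper names `make_listener` and `serve_one_client` are made up for illustration, and the line numbers mentioned in the talk refer to the speaker's own file, not to this sketch):

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def serve_one_client(listener, bufsize=4096):
    """Handle a single client and return the number of bytes it sent."""
    conn, addr = listener.accept()      # blocks until a client connects
    total = 0
    with conn:
        while True:
            data = conn.recv(bufsize)   # blocks until data arrives
            if not data:                # b"" means the client closed the connection
                break
            print("received", len(data), "bytes from", addr)
            total += len(data)
    return total
```

Both `accept()` and `recv()` put the process to sleep until something happens, which is exactly the blocking behaviour the talk goes on to examine.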
And here's a client almost as simple as our server was. There's, again, nothing much to it. It just tries to connect to the same server, this client, and tries to send some data. To be precise, it tries to send about 70 MB of data or so. So when we...
I should, in fact, have the server also open. I was counting on my slides, and I hoped that content was readable, but
probably we'll have to make do with this. So here there's a server again. And if we look at this code on line 11, the server blocks because it waits to receive a connection. It won't proceed further unless, you know, a client tries to connect to the server. And after that, on line 13, it blocks until the client sends some data
to the server so that the server can actually read the data and do whatever with the data it wants to do. Then it prints out the data and then blocks again when it tries to get more data from the client and keeps on doing this until the client stops sending any data.
On the other hand, our client script blocks at this call where the client is actually trying to send all of that data. That is 70 MB of data. So, I mean, it's understandable that it'll
take a while. And both the client and server as of now, at least, they block, right? So I'll just rush through the slides because this is what I was talking about in these slides. And the problem with this is when we run this, assuming that the server is running,
the client takes about 45 seconds to run on this machine that I'm using right now. And the client script was not really trying to do anything as such other than sending the data. But if it really wanted to do something, it couldn't have
because the script would just block on sock.send. So let's see how we can improve that. So for non-blocking network IO in Python, at the most basic level, what it comes down to is this. You make a socket non-blocking by calling the setblocking method on it
and tell it to not be blocking anymore. So you pass it a False or a zero and it makes the socket non-blocking. And how does that work and what does it really look like in the real world? So again, I'll switch back to Vim here.
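A small sketch of what `setblocking(False)` changes. It uses `socket.socketpair()` purely as an assumption to keep the example self-contained; the talk itself uses a client connected to a server:

```python
import socket

# socketpair() returns two already-connected sockets, so no server is needed.
left, right = socket.socketpair()

left.setblocking(False)          # equivalent to left.settimeout(0.0)

try:
    left.recv(1024)              # nothing has been sent yet
    would_block = False
except BlockingIOError:
    would_block = True           # the call returned immediately instead of waiting

print("recv would have blocked:", would_block)

left.close()
right.close()
```

On a blocking socket the same `recv` call would simply wait; on a non-blocking one Python raises `BlockingIOError` right away.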
And this is the same client that I was showing earlier, but just with one minor change, there's one additional line here. On line five, we have set the socket to be non-blocking. And everything else is exactly the same. And when we run the script or when we run the client,
it's not exactly the same thing. We see an assertion error as we put an assert statement on the last line of the script. And it says the bytes sent to the server is not equal to the bytes we wanted to send. Now this happened as soon as we changed our script, our client's script.
And what setblocking did was that not all of the data got sent. So the subtlety here is that, in the previous script, when the socket was blocking,
how sock.send really works is it makes whatever system call it needs to make to send the data to the other process on the other end of the connection. But what really happens is the process copies or sends the bytes that it wants to send to the other side and passes that to
the kernel. So the amount of data that can be accommodated in the write buffer of the kernel for that write call or for that send call gets passed to the kernel space. The kernel gets that and then basically puts the process to sleep because the write buffer is full.
And that's the reason why our call blocks. So when the call was blocking, the process passed the number of bytes it could to the write buffer. Then the kernel takes care of that, sends it across to the other side of the connection, and in the meantime it puts the process to sleep. And then it brings the process back up
or wakes the process up when the write buffer goes empty and then gets more bytes to send and keeps doing that until all the data has been transferred. And while the process is sleeping, it's not really making use of any CPU, so we could potentially do something else in that time.
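The talk does not show how big that kernel write buffer is. One way to peek at it, offered here as an aside rather than something from the talk, is the standard `SO_SNDBUF` socket option. The value is OS-dependent, and the amount the kernel actually accepts per call may differ from it:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Ask the kernel for the size of this socket's send (write) buffer.
send_buffer_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print("kernel send buffer:", send_buffer_size, "bytes")

sock.close()
```

Whatever the exact figure, it is far smaller than the roughly 70 MB the client tries to send, which is why a single blocking send has to sleep and wake many times.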
When we made that socket non-blocking, sock.send returned immediately. Basically what happened was it just transferred the number of bytes or the amount of data it could give to the kernel to send to the other side of the connection and returned immediately saying these are the number of bytes I could transfer so far. So it just sent the number of bytes it could
immediately and did not block at all, but what it gave us back in return was the number of bytes that were transferred. And that is useful. As of now, our client failed. It does not send all the data, but we can use that to send all the data in another way.
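A sketch of that partial send. The 8 MB payload and the use of `socket.socketpair()` are assumptions chosen so the example runs on its own; the talk's client sends about 70 MB to a real server:

```python
import socket

payload = b"x" * (8 * 1024 * 1024)   # 8 MB, far more than a typical send buffer holds

sender, receiver = socket.socketpair()
sender.setblocking(False)

# Returns at once with however many bytes the kernel could take right now.
sent = sender.send(payload)
print("asked to send", len(payload), "bytes; kernel accepted", sent)

sender.close()
receiver.close()
```

An assert such as `sent == len(payload)` would fail here, much like the assertion error the speaker sees, because nothing is reading on the other side and the buffer fills up.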
So let's look at another improvement to that script. Here's a slightly different client. All we're trying to do is essentially the same thing.
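The client on screen is not in the transcript. A minimal sketch of such a retry loop, with the hypothetical helper names `send_all_busy` and `drain` (the latter stands in for the server), could look like this:

```python
import socket


def drain(sock):
    """Stand-in for the server: keep reading until the peer closes."""
    while sock.recv(65536):
        pass


def send_all_busy(sock, payload):
    """Send payload on a non-blocking socket by retrying in a tight loop.

    Returns (bytes sent, number of attempts that would have blocked).
    """
    view = memoryview(payload)           # slicing a memoryview avoids copying
    sent = 0
    blocked_attempts = 0
    while sent < len(payload):
        try:
            sent += sock.send(view[sent:])   # returns immediately
        except BlockingIOError:
            blocked_attempts += 1            # buffer full: spin and try again
    return sent, blocked_attempts
```

To try it, create a pair with `socket.socketpair()`, call `setblocking(False)` on the sending end, run `drain` on the other end in a thread, and compare the returned counts. The second number tends to be large, which is the wasted CPU the talk turns to next.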
Our socket is non-blocking, but we put that in a while loop as long as we have not sent all the data. It just tries to send that data again and again and again. So sock.send returns immediately telling us the number of bytes that were transferred, and now that we know how many
bytes were transferred, we can try to send the remaining bytes in the next iteration. So this is essentially how we made our socket non-blocking, and we just keep trying to send more data. The problem with the script here is
the good thing is that we have achieved a non-blocking socket or non-blocking IO here, but we are again wasting a lot of CPU cycles in running that while loop, because most of the time we will not be able to send that data because the write buffer will not be empty. So if we run this, actually, we will probably end up spending most of our time in this
exception block instead of actually being able to send this. It will actually get here and succeed only when the write buffer empties, which is seldom. And this is, again, a good improvement or a good change. We can now make use of this to make more improvements and
probably do something more useful instead of just only trying to send this data. So I'll show you another improvement to this very client with some minor changes again. And this
is, again, the same, pretty much the same client as we saw just before this, again, but with one
extra line, which is just this. And this is something new. If you've never really tried to look into how not just non-blocking IO, but this kind of infrastructure works, this might look new. Select, basically, so we are doing exactly the
same thing, but our while loop will block at this line as long as we don't have the same socket available for writing again. So as long as the write buffer is full, this call will block here. So what select does is
it stops us from wasting those CPU cycles that we were, you know, wasting earlier trying to call sock.send again and again in every iteration of the while loop. But what select exactly does is rather interesting. So select is nothing but a system
call, and it's infrastructure provided by the operating system for monitoring file descriptors for events. So events like, is a file descriptor ready for writing, or is this file descriptor ready for reading, or is this file descriptor ready for handling some kind of exceptions?
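A sketch of the retry loop with that one extra `select` line added. The function name `send_all_select` is hypothetical; `select.select` is the standard library call:

```python
import select
import socket


def send_all_select(sock, payload):
    """Send payload on a non-blocking socket, sleeping in select between sends."""
    view = memoryview(payload)
    sent = 0
    while sent < len(payload):
        # Blocks (no timeout given) until the kernel reports the socket writable,
        # so the loop no longer spins while the write buffer is full.
        select.select([], [sock], [])
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            continue                     # rare race: not writable after all, wait again
    return sent
```

The socket itself is still non-blocking; the waiting has just moved into one place, `select`, where the process sleeps instead of burning CPU.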
And the select API that we just used, what we saw there, is just a direct wrapper around the syscall, but it also makes using select rather simple as compared to the C API. But if you understand how to use it with Python,
you would probably be able to make sense of it when you look at some C code. And the signature is like this. You pass select three sets of file descriptors, or three arrays of file descriptors, which you want to monitor for either a read event, or a write event, or
an exception on that file descriptor. And there's a fourth optional argument, which is timeout, which basically tells select how long to block until any of the file descriptors passed for monitoring becomes available.
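A short sketch of that signature in use, again on a `socket.socketpair()` so it needs no server (the pair and the variable names are illustrative assumptions):

```python
import select
import socket

a, b = socket.socketpair()

# Three lists: watch for read, watch for write, watch for exceptional conditions.
# Timeout 0 means "do not wait at all, just report the current state".
readable, writable, exceptional = select.select([b], [a], [], 0)
b_readable_before = b in readable        # nothing has been sent to b yet
a_writable_before = a in writable        # a's write buffer is empty, so it is writable

a.send(b"ping")

# Wait at most one second for b to have something to read.
readable, _, _ = select.select([b], [], [], 1.0)
b_readable_after = b in readable

print(b_readable_before, a_writable_before, b_readable_after)

a.close()
b.close()
```

Each returned list is a subset of the corresponding list that was passed in.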
So earlier in our previous example, we did not use that fourth argument, so select would block indefinitely, but you can change that and adjust that according to your needs. And it returns a subset of file descriptors as passed earlier, telling these are the
file descriptors which are available for performing whatever operation you were monitoring them for. And when we talk about file descriptors, at least in the Python world, what select accepts is any object that implements the fileno method. So usually sockets have a, I mean, not usually, in fact, all sockets have a fileno method
on them, because sockets are essentially file descriptors, and at least in Python, if you call fileno on a socket, you get the corresponding file descriptor for it. So a good improvement that we made here was that we are doing essentially
the same thing, that is trying to send data again and again in a while loop, but we make our process sleep as long as we don't have our socket available for, you know, performing that operation. And here we look at one last example, which is
where, so far, we have only been trying to, you know, send data in a non-blocking fashion and not doing anything along with it, but essentially the idea of doing non-blocking IO is that even if you are single-threaded and, well, while you are not doing anything
constructive, you should be able to do something else if you can. So here's another example. Continuing with our previous example, but just changed a little bit, we have created two tasks. One task is the same that we wanted to send some data to our server, and the other is just
a task which makes use of CPU, it just tries to increment a counter, and we have put a sleep there just so that if we, when we run this, the output is, you know, easy to read.
So what we are going to do is cooperatively let both of these tasks proceed with the help of generator functions. So this is one task which is essentially a generator function. It increments a counter in a while loop, and it yields after
every iteration. And the other task is the same thing that we were trying to achieve in our client, and it's, again, the same code, which is trying to send the same amount of data, but the only difference is that where we were calling select, instead of calling select there, we yield, and we yield the socket object that we were using for
sending this data, and we also yield the operation that we want to monitor this socket for. And finally, we have our main block where we implement another,
a slightly different version of our previous ugly while loop. Probably this is uglier, but here what we do is we try to execute both these tasks one by one and then monitor for file descriptors and execute the corresponding tasks whenever their file descriptors are ready
to take more data. So here we have, I mean, you don't have to read through the entire code, but the essential thing to understand here is that what we are doing is we have a huge while loop where we are running this while loop as long as we have some task to execute,
or we have some file descriptor to monitor. So whenever there is a file descriptor to monitor, the reason is that there is some task that wants to perform, do something after that when the file descriptor becomes available for doing that operation.
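The speaker's code is not in the transcript, so the following is only a sketch of the idea under stated assumptions: the names `counter_task`, `sender_task` and `run` are invented, tasks yield `None` when they just want to give way and yield a socket when they need it to become writable, and, unlike the talk's loop, `select` is allowed to block once no other task is runnable.

```python
import select
import socket


def counter_task(limit):
    """CPU-only task: bump a counter and give up control after every step."""
    count = 0
    while count < limit:
        count += 1
        yield                      # nothing to wait for, just let others run
    return count


def sender_task(sock, payload):
    """IO task: send payload, yielding the socket whenever it has to wait."""
    view = memoryview(payload)
    sent = 0
    while sent < len(payload):
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            yield sock             # resume this task once sock is writable
    return sent


def run(tasks):
    """Round-robin scheduler; returns {task: value returned by the task}."""
    ready = list(tasks)
    waiting = {}                   # socket -> task parked until it is writable
    results = {}
    while ready or waiting:
        next_ready = []
        for task in ready:
            try:
                wanted = next(task)
            except StopIteration as stop:
                results[task] = stop.value
                continue
            if wanted is None:
                next_ready.append(task)
            else:
                waiting[wanted] = task
        if waiting:
            # Poll with timeout 0 while other tasks can run; otherwise block.
            timeout = 0 if next_ready else None
            _, writable, _ = select.select([], list(waiting), [], timeout)
            for sock in writable:
                next_ready.append(waiting.pop(sock))
        ready = next_ready
    return results
```

Calling `run([counter_task(1000), sender_task(sock, data)])` with a non-blocking `sock` whose peer is being read lets the counter advance during the moments when the socket's write buffer is full.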
So all we do is we run every task one by one, and if the task or the generator function for the task yields a socket and asks us to monitor it, we, you know, we just keep a mapping of all those file descriptors and the corresponding task, and further down we call select
to monitor those file descriptors or socket objects and do whatever needs to be done accordingly. The difference in this select call is that we have used the timeout
argument here, and we have set the timeout to be zero because we don't really want to block. Why don't we want to block? Well, if the file descriptor is not ready for execution, we at least can let our other task, which was incrementing counters, proceed as long as these file descriptors are not ready for any operation on them. So we call select, and
on every iteration we check if the file descriptor is ready to be monitored, and we create a pending task list where we keep pushing the task that needs to be executed in the next iteration of the while loop, and this is pretty much it. So these changes, although not
pleasurable again, probably I'll share this code somewhere so that you can read it when you have more time at hand, but what we really achieved here was cooperative scheduling using generator functions and select, and we let two independent tasks proceed
along with each other in a single-threaded script, and you can actually say that this is probably your first network event loop implemented here. It's a poor man's scheduler that we also implemented. So I think I have just one minute left, and I'm going to rush through
the rest of the things as quickly as possible. So we just looked at select, but our operating system actually provides more infrastructure for monitoring file descriptors. There's something called poll. Poll came soon after, I don't know if soon after select, but the implementation or
the technical details are pretty much the same other than the API itself. It is probably as bad as select, so time complexity for select and poll is probably the same. epoll and kqueue are pretty
much the de facto standard today. Most web servers, like Nginx, make use of epoll and kqueue. If you're using Twisted, Tornado or gevent on Linux or BSD systems, chances are that they have an epoll implementation or a kqueue implementation and you're using it.
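The talk does not show epoll or kqueue code. One hedged way to try them from Python without writing platform-specific code is the standard library `selectors` module (not mentioned in the talk), whose `DefaultSelector` picks the most efficient mechanism available on the current platform:

```python
import selectors
import socket

a, b = socket.socketpair()
a.setblocking(False)

# Typically epoll on Linux and kqueue on BSD/macOS, with poll/select as fallbacks.
selector = selectors.DefaultSelector()
selector.register(a, selectors.EVENT_WRITE, data="sender")

events = selector.select(timeout=0)      # list of (key, event mask) pairs
ready = [(key.data, bool(mask & selectors.EVENT_WRITE)) for key, mask in events]
backend = type(selector).__name__

print("backend:", backend, "ready:", ready)

selector.unregister(a)
selector.close()
a.close()
b.close()
```

The pattern is the same as with `select`: register interest in a file descriptor, ask which ones are ready, then act on those.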
And I think I don't have more time, but yes, I'd be happy to have more questions. Other slides were essentially about what other libraries you have in the Python world, but probably we can catch up if you want to know more about that. So any questions?
Well, this is maybe more related to the system, but are you aware of the limit of asynchronous connections you might have? Any top limit?
A limit of asynchronous connections. I'm sorry, I'm not aware of that, but if there is something like that, I'd be happy to know about it. Well, in Linux, I was testing one tunneling server and it had like 4,000 because select
in the CRP... About select, yes, yes. Select is indeed limited by the number of file descriptors, but I think you can change that if you are compiling the operating system yourself. But we don't do that, so yes, most of the times you'll be limited by
the number of file descriptors you can monitor using select. But having said that, I mentioned something about select's time complexity, that select as compared to epoll or kqueue tends to be slower and nobody really uses it. But it really depends on what your use case for select is. So if you don't really have
many file descriptors to monitor, probably select would be a better choice to make as compared to epoll, up to the point where epoll actually starts to shine. Thank you so much.