Introduction to aiohttp
Formal Metadata
Part Number | 53
Number of Parts | 169
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/21100 (DOI)
Transcript: English (auto-generated)
00:00
I'll introduce Andrew Svetlov, and yeah, I think you know what he's talking about better than me because you are here. I'm just here for the next talk, but I'm very interested. Okay, thank you.
00:22
Okay guys, I'll try to make a simple enough introduction to aiohttp. I'm Andrew Svetlov. I'm a Python core developer. I've been using Python for 16 years and have been a core developer for the last four months.
00:46
I took part in asyncio development as a committer, and now I'm a maintainer of aiohttp. Besides that, there are a dozen other libraries under the aio-libs umbrella, like a Postgres driver, several aiohttp extensions,
01:14
a Redis driver (well, Redis is not 100% fine), Kafka, whatever.
01:20
So, trust me, I know something about how to work with asyncio code. Why do we need asynchronous code at all? That's very easy.
01:51
Now is the age of microservices, right? Let's imagine a classic situation. We have a client: a browser, or a client that goes through an API.
02:09
We have a front end, and the front end communicates with a lot of microservices: internal microservices and external services like Twitter, GitHub, Facebook, whatever.
02:24
In this architecture, we usually have to perform many parallel HTTP requests from our front end to the microservices,
02:40
collect the data back, transform it, and return it to the user. We can use threads, or we can use asynchronous code. With threads, a typical machine can support about 1,000 threads,
03:03
but with asyncio, with lightweight threads, the number increases to a million. A powerful server may support more than 1,000 threads, maybe 10,000 or 100,000 as the top limit.
03:23
But that's still a low limit. asyncio, like any asynchronous networking library, allows you to scale further and do more work in parallel, but only if your parallel tasks are I/O-bound.
03:46
It doesn't help with CPU-bound, CPU-heavy tasks at all, but it helps very well if you are waiting on I/O. So, aiohttp is a library dedicated to working with the web, handling HTTP from both sides, server and client.
04:15
It supports persistent connections, WebSockets out of the box, and many, many other things,
04:22
which I'll describe later. The library has a three-year-long history. At the very, very beginning, it was part of asyncio, which at that time was called Tulip.
04:41
But Guido asked us to rip it out. We extracted it into a new project, aiohttp, and it was a good decision, because aiohttp was very young in those days. It changed quickly, and we have made 22 releases so far.
05:08
That's much faster than asyncio's release cycle. About a hundred and fifty contributors, good coverage, so it's a mature enough library.
05:23
Client API. The client API looks somewhat like Requests. It's not a total API copy, but it copies part of it. Well, I believe everybody knows how Requests works.
05:44
It's maybe the most popular third-party library for Python. We make a requests.get call, get a response back, ask for the response status code, and get the response body as text, as JSON, as binary, streaming and so on.
06:08
How do we translate that to aiohttp? No way, sorry. That's because in aiohttp we don't want to encourage bad usage practices.
06:31
It still works, I can write it, but it's deprecated. Let's go further. Requests has a session concept.
06:44
A session is a container for cookie storage and for a connection pool. You can perform a GET request through the session, get a response back, and ask for the response text.
07:07
A session supports keep-alives. A session supports cookies. Keep-alives, for free, may make your system, I don't know, three, five, seven times faster.
07:20
It depends on your network and how far your server is from the client. But I highly recommend using a Requests session every time. Unfortunately, the first page of the Requests user documentation doesn't encourage session usage.
07:57
So most of the code out there that works with Requests
08:03
is not optimal, in my experience. Now, we have our aiohttp client code. We have a session, an aiohttp ClientSession. We get a response, but you see, we use async with.
08:26
It's an asynchronous context manager for handling resources, for aggressively closing all open resources: open connections, open responses, everything.
08:42
And it's async because it works in an asynchronous way. And we have a response here. For reading the response body, we should use the await syntax.
09:04
Again, that's because you usually have the response headers immediately, while reading the whole response body, or reading it chunk by chunk, takes time.
09:23
It requires input/output communication with the server. So we require await here. Now, I want to say a couple of words about coroutines.
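Before turning to coroutines, the client code just described can be sketched roughly as follows. This is a minimal sketch rather than the code from the slides: the helper name `fetch_text` and the example URL are assumptions made for illustration.

```python
import asyncio

import aiohttp


async def fetch_text(session: aiohttp.ClientSession, url: str) -> str:
    # `async with` releases the connection back to the session's pool,
    # even if reading the body raises an exception.
    async with session.get(url) as resp:
        resp.raise_for_status()
        # Status and headers are already available here; reading the body
        # needs more network I/O, which is why it has to be awaited.
        return await resp.text()


async def main() -> None:
    # One session is shared by all requests so that keep-alive
    # connections and cookies are reused.
    async with aiohttp.ClientSession() as session:
        body = await fetch_text(session, "https://www.python.org")  # example URL
        print(len(body))
```

Calling `asyncio.run(main())` would perform the request; it is left out of the block so that nothing touches the network on import.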
09:47
It's a somewhat complex concept, but for a casual user it can be reduced to very simple rules for how to use it.
10:02
A coroutine is a function which is defined not with def but with async def. If you see async def, it means you are dealing with a coroutine. If you have to call a coroutine, put the await keyword before the call,
10:27
like await sleep() or await function(). And if your function contains await inside, it should be a coroutine itself, so the function itself should also be async def.
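The three rules above fit into a few lines. The sketch below uses only the standard library; the function names are made up for the example, and `asyncio.run` is a later convenience than the loop calls available at the time of the talk.

```python
import asyncio


# Rule 1: a coroutine is declared with `async def`.
async def double_later(x: int) -> int:
    # Rule 2: calling another coroutine requires `await`.
    await asyncio.sleep(0.01)
    return x * 2


# Rule 3: a function that contains `await` must itself be `async def`.
async def main() -> int:
    return await double_later(21)


result = asyncio.run(main())
print(result)
```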
10:45
That's it. Next thing: the power of the asynchronous approach, the asynchronous way of writing programs.
11:02
If we need to fetch several resources in parallel, we can create several tasks. A task is a lightweight thread. We execute them in parallel and wait for all tasks together, for all results,
11:23
or use another asyncio API for getting results as they complete, in completion order. And our fetcher will try to fetch both the google.com and python.org home pages in parallel.
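Both ways of collecting results can be sketched with the standard library alone. To keep the example runnable without network access, the fetches are simulated with `asyncio.sleep`; `fake_fetch` and its delays are assumptions for illustration, not the fetcher from the slides.

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float) -> str:
    # Stand-in for an HTTP request: just waits without blocking the loop.
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all() -> list:
    # Create one task per resource; they start running concurrently.
    tasks = [
        asyncio.ensure_future(fake_fetch("google.com", 0.3)),
        asyncio.ensure_future(fake_fetch("python.org", 0.2)),
    ]
    # gather() waits for all tasks and keeps the original order.
    return await asyncio.gather(*tasks)


async def fetch_as_completed() -> list:
    coros = [fake_fetch("google.com", 0.3), fake_fetch("python.org", 0.2)]
    finished = []
    # as_completed() yields results in completion order instead.
    for next_done in asyncio.as_completed(coros):
        finished.append(await next_done)
    return finished


start = time.perf_counter()
results = asyncio.run(fetch_all())
elapsed = time.perf_counter() - start
completion_order = asyncio.run(fetch_as_completed())
print(results, completion_order, round(elapsed, 2))
```

The total time is close to the slowest fetch rather than the sum of both, which is the point of running the tasks in parallel.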
11:52
What does it mean from a timing point of view? Synchronous code executes three fetches one by one, in sequence.
12:06
And obviously that takes long. If you start three real threads in Python, then thanks to the GIL and other complications,
12:22
these threads execute in parallel, but the gray part of this graph is a waiting state, when the program doesn't work but waits for a switch, for the GIL to be released, and other things.
12:41
Async code executes everything in the same thread, usually the main thread, but switches between tasks. And it does so very quickly, without the need for an operating system context switch.
13:04
Thanks to that, it easily supports millions of tasks. The next thing you should know when you work with the client API is timeouts.
13:23
Why? Because every request, every attempt to get data from a server, may hang for a very long period: 10 minutes, 30 minutes, without a disconnection, without an exception, without any information or hint that the target server is down.
13:53
Very often you get an exception quickly, but sometimes you don't. Why?
14:01
Because the internet works this way. We cannot change the internet protocols, but we can wrap our calls in the timeout context manager, and then it's safe.
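The idea of bounding every call can be sketched with the standard library. Note the swap: the talk refers to a timeout context manager, while this sketch uses `asyncio.wait_for`, which works on older Python versions too; on Python 3.11 and later, `asyncio.timeout()` offers the context-manager form. The slow call is simulated and its name is an assumption.

```python
import asyncio


async def slow_server_call() -> str:
    # Stand-in for a request to a server that never answers in time.
    await asyncio.sleep(10)
    return "response"


async def call_with_timeout(seconds: float) -> str:
    try:
        # wait_for() cancels the inner call once the deadline passes.
        return await asyncio.wait_for(slow_server_call(), timeout=seconds)
    except asyncio.TimeoutError:
        return "timed out"


outcome = asyncio.run(call_with_timeout(0.05))
print(outcome)
```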
14:21
I recommend using timeouts everywhere, and a context manager is very convenient for this. WebSockets: this is a very simple example of how to use WebSockets. Instead of calling session.get or session.post, you call session.ws_connect, pointing it to a WebSocket endpoint, and get a WebSocket object.
14:53
We have to iterate over the messages in the WebSocket. async for is asynchronous iteration.
15:01
It may wait for the next message if it's still not available. While waiting, the asyncio loop will switch to other pending tasks. When a message is ready, we check the message content. If it asks for closing, we close the WebSocket.
15:24
Otherwise we do something like ping-pong communication. This is a very basic pattern, and all WebSocket client code looks like this. This is for the client side.
15:45
Now the server. I believe everybody knows how to write Hello World in Django. In Django, you have a view.
16:01
It's a function which accepts a request and returns a response. You have a URL map, which maps a URL to a view function.
16:24
And you run it via the Django manage command. In aiohttp, we have almost the same. When we developed the aiohttp web server, the high-level web API, we tried to stay very close to Django, Flask, or Bottle,
16:50
to the classic WSGI frameworks, with only one exception: our code is asynchronous. So we also have a view.
17:02
It's called a web handler in the aiohttp documentation; it accepts a request and returns a response. We have an application, register a route for the view, and run everything. But our index is not a bare function.
17:24
It's a coroutine, which means we can do asynchronous work inside it. The most obvious thing is that we can request other resources via the client API.
17:44
We can do WebSocket communication. And, in contrast to a classic WSGI framework, we don't have to return an answer as quickly as possible. We can wait.
18:02
The end user, the client, obviously will not see an answer quickly. But waiting inside our web handler doesn't stop other handlers if they have something to do.
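A hello-world server along these lines might look like the sketch below. It is written against the current aiohttp API, not copied from the slides: the handler awaits inside its body, and a small demo starts the application on a free local port and queries it with the client. The names `index`, `make_app` and `demo` are illustrative, and reading the port from `runner.addresses` assumes a reasonably recent aiohttp 3.x.

```python
import asyncio

from aiohttp import ClientSession, web


async def index(request: web.Request) -> web.Response:
    # The handler is a coroutine, so it may wait without blocking
    # other handlers running in the same loop.
    await asyncio.sleep(0.01)
    return web.Response(text="Hello, world")


def make_app() -> web.Application:
    app = web.Application()
    app.router.add_get("/", index)  # map the URL to the web handler
    return app


async def demo() -> str:
    runner = web.AppRunner(make_app())
    await runner.setup()
    site = web.TCPSite(runner, "127.0.0.1", 0)  # port 0: let the OS pick one
    await site.start()
    port = runner.addresses[0][1]
    try:
        async with ClientSession() as session:
            async with session.get(f"http://127.0.0.1:{port}/") as resp:
                return await resp.text()
    finally:
        await runner.cleanup()


body = asyncio.run(demo())
print(body)
```

For a real service, `web.run_app(make_app())` replaces the demo function and runs the server until interrupted.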
18:27
It's the main and biggest difference between aiohttp and all the WSGI frameworks. I don't know how many WSGI frameworks there are,
18:44
but many dozens, at least. Tornado. Tornado is also an asynchronous framework, as is Twisted.
19:02
But Tornado is built on another concept. In Tornado, you have a Tornado RequestHandler, and the user should derive from it and override the get method or the post method.
19:20
It works. I don't prefer this way, but I can live with it. But we found that the view, the web handler concept, is much easier for end users to understand.
19:42
That's why we built our asynchronous code this way, not the Tornado way. WebSockets on the server look almost the same as on the client.
20:04
The only difference is that we should create a WebSocketResponse, prepare it from the request, and iterate over the messages from the WebSocketResponse. Almost the same.
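Both sides of the WebSocket pattern can be sketched in one self-contained example: an echo handler on the server and a client that sends one message. This is an illustration against the current aiohttp API; the "close" convention, the `/ws` path and the function names are assumptions, not taken from the slides.

```python
import asyncio

from aiohttp import ClientSession, WSMsgType, web


async def ws_handler(request: web.Request) -> web.WebSocketResponse:
    ws = web.WebSocketResponse()
    await ws.prepare(request)  # perform the WebSocket handshake
    # `async for` waits for the next message without blocking the loop.
    async for msg in ws:
        if msg.type == WSMsgType.TEXT:
            if msg.data == "close":
                await ws.close()
            else:
                await ws.send_str("echo: " + msg.data)
        elif msg.type == WSMsgType.ERROR:
            break
    return ws


async def demo() -> str:
    app = web.Application()
    app.router.add_get("/ws", ws_handler)
    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, "127.0.0.1", 0)  # port 0: let the OS pick one
    await site.start()
    port = runner.addresses[0][1]
    try:
        async with ClientSession() as session:
            # Leaving the `async with` block closes the WebSocket.
            async with session.ws_connect(f"http://127.0.0.1:{port}/ws") as ws:
                await ws.send_str("ping")
                return await ws.receive_str()
    finally:
        await runner.cleanup()


reply = asyncio.run(demo())
print(reply)
```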
20:21
The only thing you should keep in mind is that, as with all other network communication, you have to have some mechanism for handling timeouts,
20:46
because WebSockets can stop working without any notification for a long time. Periodic pings, ping-pong communication, can prevent these hangs
21:03
and inform the server quickly that the client is disconnected. Now, what I suggest for developing asynchronous programs effectively.
21:22
First, run and test your development environment as a single process. After writing your code, after debugging and testing, you can deploy it in different containers, in different processes, on different nodes,
21:46
all behind, say, NGINX in reverse proxy mode. But for development, it's much, much easier to put everything into the same process.
22:01
Wow. Sorry, I didn't expect it. It's a big fail. No, it's not suspended.
22:36
Sorry for this.
22:46
We are here. So aiohttp code doesn't require tools like Celery.
23:09
You can just create a long-running task, start a new lightweight thread inside your program. It's much easier. Sometimes you need persistent pending tasks, which will be restarted if task execution fails.
23:29
But that's a rare situation. In most cases, you can just create new tasks in the server and that's it. And you can do very long communication inside a web handler.
23:45
It's very convenient. Next simple thing. If you forget to gracefully close your response or something similar, you will get an exception like...
24:04
It's not an exception, it's actually a warning: "Future exception was never retrieved". It's wrong. It's a programming error. You see the RuntimeError that was pushed into the future, but whose result was never retrieved.
24:21
But I have no idea where in the code it happened. So run the program with the PYTHONASYNCIODEBUG environment variable, and you will see a traceback to the point where the bad code is, where the object was created.
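Debug mode can be switched on from outside the program by setting `PYTHONASYNCIODEBUG=1` in the environment, or from code. A minimal sketch of the in-code variant, using `asyncio.run(..., debug=True)`, which is a newer API than the one available when the talk was given:

```python
import asyncio


async def main() -> bool:
    # In debug mode asyncio records where futures and tasks were created,
    # so warnings about never-retrieved exceptions point at the source line.
    return asyncio.get_running_loop().get_debug()


debug_enabled = asyncio.run(main(), debug=True)
print(debug_enabled)
```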
24:49
Also, I highly recommend passing the loop everywhere, or not passing it at all. Why is it important? Because it's easy to write tests with an explicit loop.
25:09
I talked about the client session. Use it for sharing state and for keeping connections open to support the keep-alive network pattern.
25:28
We have one session. We have several tasks for fetches. We wait for all these tasks to complete. That's it.
25:41
How to write tests? It's easy, but create a new loop for every test in setUp. Disable the default loop, and in tearDown close the loop. Also, a test function in a test framework should be a regular function, not a coroutine.
26:05
So this trick helps a lot. In aiohttp, we have a decorator which does this run_until_complete call. But for understanding: we should have a coroutine inside the regular function and run this coroutine.
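The setUp/tearDown pattern described above can be sketched with `unittest` and the standard library only. The coroutine under test (`add`) and the test class are made up for the example.

```python
import asyncio
import unittest


async def add(a: int, b: int) -> int:
    await asyncio.sleep(0)  # pretend to do some I/O
    return a + b


class AddTest(unittest.TestCase):
    def setUp(self):
        # A fresh loop for every test, with the default loop disabled so
        # that nothing can silently attach itself to a shared loop.
        self.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(None)

    def tearDown(self):
        self.loop.close()

    def test_add(self):
        # The test method is a regular function; the coroutine lives
        # inside it and is driven by run_until_complete().
        async def go():
            self.assertEqual(await add(1, 2), 3)

        self.loop.run_until_complete(go())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTest)
outcome = unittest.TextTestRunner().run(suite)
```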
26:30
Why is it important to disable the global loop? Because with the global loop enabled, we can start a task in one test, finish that test successfully,
26:45
execute ten more tests, and during the twelfth test your task from the first test may fail, because they share the same loop.
27:05
Disabling the default loop makes tests really separate. Why is it important to pass the loop everywhere? So that two different tests don't clash.
27:25
This is an example of how it works with the pytest framework. Honestly, I prefer pytest nowadays. I used unittest for a long time, for about a decade, but switched to pytest.
27:43
Finally, we have a pytest aiohttp plugin, and everything looks easy and short. We have an application, and we have a test client with get, post, and other methods.
28:02
You see, we create an application, pass in a handler, perform a test request to our test server, and then analyze the result.
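The same flow can be sketched without pytest by using aiohttp's own test utilities directly; with the pytest plugin, a fixture supplies the test client instead. This is an illustration against the current `aiohttp.test_utils` API, and the handler and function names are assumptions.

```python
import asyncio

from aiohttp import web
from aiohttp.test_utils import TestClient, TestServer


async def hello(request: web.Request) -> web.Response:
    return web.Response(text="Hello, world")


async def check_hello() -> tuple:
    app = web.Application()
    app.router.add_get("/", hello)
    # TestServer runs the app on a free local port in the same loop and
    # thread as the test, so a debugger can step through both sides.
    async with TestClient(TestServer(app)) as client:
        resp = await client.get("/")
        return resp.status, await resp.text()


status, text = asyncio.run(check_hello())
print(status, text)
```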
28:21
It's very, very easy and very, very convenient to have the tested code and the testing coroutines in the same thread. You don't have to start a separate thread or a separate process for the application. You have it all together, and you can easily insert import pdb; pdb.set_trace(),
28:45
or whatever you prefer for debugging your code. I find it very convenient. And the last thing I want to mention. It's from a Stack Overflow question asked yesterday.
29:03
A guy wrote code like this. This is a Mongo client, a MongoDB client. It works when run manually, but when he started to write unit tests for it, the tests hang.
29:23
Why? Just because AsyncIOMotorClient accepts a loop. It was coupled to the event loop, the default event loop.
29:40
The first test finished. The second test started and created another loop, but asking to find data using a Motor client coupled to the old loop, which is not used anymore, is a bad idea.
30:06
It hangs. But what to do? The request has app. It's the application. The application is a dict-like object. You can push everything into this namespace
30:20
and save it. When we create an application, we create the Mongo client, push it into our application namespace, and register a cleanup for gracefully shutting down the connection to Mongo. You can do the same with databases, with everything.
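The application-namespace pattern can be sketched as follows. To stay runnable without MongoDB, a hypothetical `FakeDatabase` class stands in for the Motor client; the key name `"db"` and all function names are assumptions. Plain string keys are used for brevity; newer aiohttp releases suggest `web.AppKey` instead and may emit a warning for string keys.

```python
import asyncio

from aiohttp import web
from aiohttp.test_utils import TestClient, TestServer


class FakeDatabase:
    """Hypothetical stand-in for a real async database client."""

    def __init__(self):
        self.closed = False

    async def find_user(self, name: str) -> dict:
        await asyncio.sleep(0)  # pretend to query the database
        return {"name": name}

    async def close(self) -> None:
        self.closed = True


async def show_user(request: web.Request) -> web.Response:
    # Handlers reach shared resources through the application namespace
    # instead of through module-level globals.
    db = request.app["db"]
    return web.json_response(await db.find_user("guido"))


async def close_db(app: web.Application) -> None:
    await app["db"].close()


def make_app(db: FakeDatabase) -> web.Application:
    app = web.Application()
    app["db"] = db                    # the application is dict-like
    app.on_cleanup.append(close_db)   # graceful shutdown hook
    app.router.add_get("/", show_user)
    return app


async def demo() -> tuple:
    db = FakeDatabase()  # created inside the loop that will use it
    async with TestClient(TestServer(make_app(db))) as client:
        resp = await client.get("/")
        payload = await resp.json()
    # Leaving the block shuts the test server down and fires on_cleanup.
    return payload, db.closed


payload, closed = asyncio.run(demo())
print(payload, closed)
```

Because each test builds its own application and resource, nothing stays bound to an event loop from a previous test.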
30:41
It's very good. Register a cleanup on the cleanup signal and that's it. So let's skip this section. It's not very interesting actually. Questions?
31:16
Are there any questions or comments? Am I correct that disabling the global event loop, the main event loop, makes execution not async anymore for testing?
31:29
So that tests will be executed one by one and not sharing the same thread?
31:41
Not sharing the main thread. Just to simplify debugging. There are ways to use another approach. You can create an event loop and register the new event loop as the default every time, for every test.
32:04
The default event loop will be a new instance. But I find it error-prone. Sorry. Maybe it's the heaviest way, but the safest way is to pass the loop everywhere.
32:22
Maybe I don't, sorry. But it's my opinion and my experience. Okay, we have another question here. So I was doing benchmarking of a service on an asynchronous framework, on aiohttp.
32:40
And what I've noticed is that I can have a 16 millisecond latency on each request for, say, 20,000 requests a second. And when I double the load, so 40,000 requests a second, it's still the same latency. So what happens? It appears that there was still free CPU earlier, but I've doubled the load and it's still the same latency.
33:09
Did you observe something like that or no? I don't know. I need to take a look at this benchmark more carefully.
33:22
Sorry, I cannot predict what. Okay, so I will talk to you later, I hope. Thanks. More questions? Here. I mean, just a question. I had a few other questions. Just a question for the previous questioner: was the latency actually in asyncio, or was it network latency, or in the network stack?
33:47
Just a question. Very interesting talk. I do quite a bit of web apps. Python has a great story in terms of writing web apps. Really easy to write a web app.
34:02
And an absolute pain in the neck to deploy them. So you can write an app very simply in Flask, in Django, in any other of the frameworks. And then once you deploy it you have to start configuring Nginx, you have to start Celery for your long running tasks. And you have Redis for your caching.
34:25
And you have lots of different components with fragile connections between them. So the whole thing becomes a bit of a nightmare, very fragile and hard to debug. So aiohttp with asyncio looks like it can simplify the whole thing a great deal.
34:42
You mentioned getting rid of Celery because you can do the long-running tasks using the async framework. That's great, but I'm just wondering why you suggested deploying behind Nginx. Can't we just use the aiohttp server as a production server?
35:08
So that's one question. Another question is HTTP/2 support. aiohttp can be used as just a web server, without a reverse proxy.
35:25
But for production I recommend putting it behind Nginx. Why? First, you usually have to serve static files.
35:41
If that's a requirement, Nginx does it much better than Python. We have static file support, but for real performance use Nginx. And another point, much more important: Nginx has a very long history of preventing malware, of preventing attacks.
36:13
It's done by limiting buffer size. It has very good experience in this area.
36:24
We tried our best, but I'm not 100% sure we have no holes for attackers.
36:42
About HTTP/2: we have plans but nothing ready yet. There is a pull request but it's still not ready. We have one more question here. Hi, thank you for the talk.
37:00
So my question is a bit more like: do you have any big features in mind to add to the library? Meaning, is there a roadmap or something of that kind? A roadmap? Well, we are not so formalized as to have roadmaps.
37:23
But aiohttp is hosted on GitHub. We have issues and that's it. In...
37:40
the near term, the targets are HTTP/2 support, performance, and nested subapplications. And fixing our bugs. Users find them, of course. Any other questions?
38:00
If this is not the case, please thank Andrew again.