The Report Of Twisted’s Death
Formal Metadata
Part Number | 113
Number of Parts | 169
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/21093 (DOI)
Transcript: English(auto-generated)
00:01
So, hello, I am Amber Brown, commonly known as hawkowl on the internet. Here is my Twitter and my website. So I have quality Twitter posts. Those of you that follow me know that I'm lying. So I live in Perth, Western Australia.
00:23
If you're wondering where that is in the world, it's right there. I've come from like 13,000 kilometers. So hopefully, you know, it's been good so far. I mainly work on the Twisted project in my open source stuff.
00:40
So I'm a core developer and a release manager, and I've single-handedly ported the most code to Python 3. So I personally had a hand in porting about 40,000 lines of code, which is about 20% to 25% of Twisted's codebase,
01:01
as well as some auxiliary things and some things that use Twisted. So this is pretty much how it is when I'm working on it. And I'm here today because of my work, which is Crossbar. We do WebSocket routers and WebSocket RPC and stuff for your browsers and all of that.
01:24
So I also do the same sort of release management there as well, binary release management and more porting to Python 3, as well as web API and REST integration into Crossbar. So the original idea for this talk comes from two people,
01:42
Russell Keith-Magee and Glyph Lefkowitz. So Russ asked me the question, why is Twisted relevant when there's asyncio? Now he knows the reason why it's relevant, but he likes to play devil's advocate. But I think it is kind of worth answering for those that aren't as ingrained into it as I may be
02:01
or as he may be from me ranting about it endlessly whenever I see him. Glyph published a blog post about it and it goes into a lot of the same things that I'm going to go into, but it's good and I recommend checking it out. And he talks about it from more of a long sort of perspective
02:22
as he's been part of the project for a very long time and I've been on it for three or four years, so it's a bit of different perspectives as far as the time scale goes. Now, one of the core problems you have when you're writing any software, pretty much ever, is that you want to do some form of IO.
02:43
Now, I mainly do web stuff. So you have web frameworks and they're all pretty good, except the problem with it, from this hilarious joke, is that the more conventional ones like Django, Pyramid, Flask, all of that,
03:02
they only really serve one request at any one time. Now, the way you get around that is you deploy using runners and these runners have multiple copies and they put these multiple copies in threads and processes. So you are effectively still processing one request at a time,
03:21
but you're handling the request in parallel. So it sort of gets around it in an okay sort of way. Now, in Python specifically, threads or processes won't really help you with what's called C10K, which is 10,000 concurrent connections.
03:43
Now, when you try, it sort of ends up looking a bit like this. Mainly because in Python and in programming languages in general, threads are very hard to use safely. You end up with race conditions and it's really hard to reason about your code
04:02
from a purely reading-the-code perspective because threads decide when they want to change between each other. Not you. You don't get control over that. You can try and get control over it using explicit yield points, but this doesn't always work.
04:23
They're also a bit hard to scale with in Python specifically because if you have one thread per connection, you're going to have thread memory overhead. Now, that's like the Python stack. So by default, it's like eight megabytes of virtual memory on Python. Now, even if it doesn't use all of that,
04:41
that can stack up pretty quickly. If you're just using 128 kilobytes per thread for the stack and some various other things, if you have 10,000 of those threads, you're going to have like 1.3 gigabytes of overhead without doing any processing at all. None of your business logic, none of your fancy application stuff, just threading.
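The 1.3 gigabyte figure follows directly from the two numbers quoted above. A quick check, assuming 128 KiB per thread and decimal gigabytes (the variable names are illustrative only):

```python
# Back-of-the-envelope stack overhead for one thread per connection.
STACK_PER_THREAD_KIB = 128   # per-thread figure quoted in the talk
CONNECTIONS = 10_000         # the C10K target

total_bytes = STACK_PER_THREAD_KIB * 1024 * CONNECTIONS
total_gb = total_bytes / 1e9  # decimal gigabytes

print(f"{total_gb:.2f} GB of thread overhead before any application work")
```

This counts only the per-thread cost; whatever the application itself allocates comes on top.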
05:03
And until the Gilectomy happens, or until software transactional memory in PyPy becomes a thing that you can use without downsides, you don't end up with parallelism either because the global interpreter lock means that only one of these threads may be running Python at any one time. You can get around it with C extensions
05:21
so that you can do heavy lifting in those, but if you're writing Python, you're probably going to have, at least in the early stages, everything in Python. You can't afford to put everything in C extensions. Cython makes this a bit easier, but it's still something that you have to special case
05:41
the easiest way of maintaining it and the fastest production code, which is not good because you want them to be kind of one and the same. You also won't do threads properly. Pretty much no one in this room can do threads properly. Even if you've written thread-using code, there's probably some subtle thing
06:01
where it's going wrong, and you won't know until it's really, really bad. A fun thing is when people say that you can do threading properly, especially when they like C threading. Yeah, that works. There's applications that use it. If you look at the CVE database and search for race condition and look at when there's thread race conditions,
06:23
there's been untold amounts of damage from not handling a thread properly, from a small race condition happening. Micro-threads like gevent and eventlet, they're not really better. They still have some of the similar problems, and Glyph talks about them much better than I ever could,
06:42
so I recommend checking that out. All these slides will be up online, so you don't have to worry about getting this even shortened URL. Now, something that Twisted uses, as well as Tornado, asyncio, all of those frameworks, is non-threaded asynchronous I/O. So we all use it. It's the common approach
07:00
compared to, for example, eventlet and gevent, which use green threads. So Twisted was one of the first. It's been around for known history, at least since 2001. There were bits in CVS before it, but all of that is lost to time, so let's just pretend it started in 2001.
07:22
We recently moved to Git. We're catching up on the 21st century. It's amazing. asyncio is a bit newer, so it's been around since 2012. That was when its first commits were made. In the very core of them, they both use the identical system calls.
07:41
They're called selector functions. Now, they're like select, poll, epoll, and what happens is you give them a list of file descriptors. For example, sockets, open files, Unix pipes, all sorts of things. Really anything that has a file descriptor. And it will tell you what is ready to have operations done on it. The most common ones of these are reading and writing,
08:03
because, for example, you won't be able to read anything if the client hasn't sent you anything. You won't be able to write anything if the send buffer is completely full. These selector functions tell you when you can do these things without blocking. You can do the operation then and there, and it won't take an indeterminate amount of time.
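The readiness idea described above can be sketched with the standard library's selectors module, which wraps select, poll, epoll and kqueue. This is a minimal illustration, not how Twisted or asyncio structure their loops; the connected socket pair is an assumption that stands in for a real client and server.

```python
import selectors
import socket

# A connected pair of sockets stands in for a client and a server.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
client_side.setblocking(False)

# DefaultSelector picks the best mechanism the OS offers (epoll, kqueue, poll or select).
sel = selectors.DefaultSelector()
sel.register(server_side, selectors.EVENT_READ)

# Nothing has been sent yet, so nothing is reported as readable.
ready_before = sel.select(timeout=0)

client_side.send(b"hello")

# Now the selector reports the socket as readable, so recv() will not block.
ready_after = sel.select(timeout=1)
received = [
    key.fileobj.recv(1024)
    for key, events in ready_after
    if events & selectors.EVENT_READ
]

sel.unregister(server_side)
sel.close()
server_side.close()
client_side.close()

print(ready_before, received)
```

A real I/O loop does the same thing in a `while True:` with thousands of registered file descriptors, dispatching each ready one to whatever code is waiting on it.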
08:23
Selector loops can quite easily handle thousands and thousands of open sockets and events per second. So for example here, this is just on my Mac. It can support C10K on my Mac just by raising the ulimit so it can accept 10,000 connections. And it works fine with not really that much CPU load.
08:42
So it is something that you can do on commodity hardware. You can do it on a standard laptop. You might want to have a bit more of a beefy machine if you're serving actually 10,000 concurrent real people all doing real work. But it's not a problem to handle that many connections. With these selector loops and these selector frameworks
09:03
like asyncio and Twisted, you can just do it. You don't need to worry about having things in C or anything like that to handle that many connections. Generally what happens is that data is channeled through a transport. So for example a TCP connection, a UDP datagram, a Unix socket, a file or anything
09:21
to a protocol implementation. So a protocol implementation is a thing that actually takes the bytes and transforms that into something useful. For example, HTTP. It goes from a series of random bytes on the wire with whatever data and content might be there. And then an HTTP protocol will actually parse that into something you can interact with.
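As a rough sketch of that split, the protocol below only turns bytes into lines and never touches a socket. `LineCollector` and `FakeTransport` are hypothetical names invented for this illustration; only `asyncio.Protocol` and its `connection_made` / `data_received` callbacks come from the standard library.

```python
import asyncio


class LineCollector(asyncio.Protocol):
    """Hypothetical protocol: turns a raw byte stream into complete lines."""

    def __init__(self):
        self.buffer = b""
        self.lines = []
        self.transport = None

    def connection_made(self, transport):
        # The transport is whatever carries the bytes; the protocol does not care which.
        self.transport = transport

    def data_received(self, data):
        # Bytes arrive in arbitrary chunks, so buffer until a full line is present.
        self.buffer += data
        while b"\r\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            self.lines.append(line.decode("ascii"))


class FakeTransport:
    """Stand-in for the demonstration; a real transport would wrap a socket or pipe."""


protocol = LineCollector()
protocol.connection_made(FakeTransport())

# Feed an HTTP-like request in awkward chunks, as a network might deliver it.
for chunk in (b"GET / HT", b"TP/1.1\r\nHost: exa", b"mple.com\r\n"):
    protocol.data_received(chunk)

print(protocol.lines)
```

Because the protocol never assumes TCP, the same class could be driven by a test harness, a Unix socket, or a tunnelled connection without changes.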
09:42
Now in these frameworks, sending data is queued until the network is ready because if you're trying to send a one megabyte file and you've only got a 512-kilobit uplink, it's going to be quite a few cycles until it can send all of that data down on the network. And nothing blocks because it waits until things can be done
10:02
and then it just says, you know, while I'm waiting for this, you can serve all these other connections. I don't care. So the things that use the selector loops, the things that use the selector functions, are called I/O loops or, in Twisted parlance, reactors, named after the reactor pattern because data coming in
10:22
is events and you react to it. The great thing about it is that you end up with much higher density per core. Now that C10K demo thing that I screenshotted there, that was only using one core. That wasn't using multiple threads or multiple cores. That was just a single CPU. You also don't need to have threads around.
10:41
So it works on things that don't support threads, and it also means that you don't have to have the thread overhead. You still end up with no parallelism because you're still only using one CPU and one thread, but you end up with concurrency. You can handle multiple requests at once because when you can't continue serving one request,
11:04
you yield and let the loop handle the next one. The best case for it is the sort of applications that a lot of us write today. So they do a lot of I/O. For example, sending stuff down the network. For example, on Twitter, you don't really do a lot
11:22
of CPU intensive stuff. You maybe send some pictures and send some text and mainly wait for the database to come back with that, or for the client to send you some information or all of that sort of thing. So because you're not using a lot of CPU per connection, you can hold 10,000, 20,000, as many as you want really,
11:41
as much as you have RAM, as much as your IO loop can handle in one second. It also works really well when you have high latency clients because clients might take an indeterminate amount of time to respond. Now if you have 10 threads and each one is serving a client, that is uploading a picture.
12:00
And the client suddenly decides to not send you any data for 500 milliseconds, that thread is being occupied for 500 milliseconds doing nothing. But it's still blocked and still waiting for data to come. And nowadays, you're probably waiting on either the clients to send you information or the database to give you information.
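For contrast with a thread that sits blocked on a slow client, here is a small sketch of several waiting clients sharing one thread. The client names and delays are made up for the illustration, and `asyncio.run` assumes Python 3.7 or newer, which is later than the versions discussed in the talk.

```python
import asyncio
import threading


async def handle_client(name, delay, log):
    # Waiting on a slow client: the coroutine yields to the loop instead of blocking a thread.
    await asyncio.sleep(delay)
    log.append((name, threading.get_ident()))


async def main():
    log = []
    # Three hypothetical clients with different latencies, all served by one thread.
    await asyncio.gather(
        handle_client("slow", 0.03, log),
        handle_client("medium", 0.02, log),
        handle_client("fast", 0.01, log),
    )
    return log


log = asyncio.run(main())
print([name for name, _ in log])
```

Each client is recorded as soon as its own wait ends, and every entry carries the same thread identifier, which is the concurrency-without-parallelism point made earlier.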
12:20
Generally, most web applications nowadays are thin layers on top of databases and on top of sort of task management systems like Celery that actually go and do the hard processing on specialized data farms or on other boxes, not your web servers. Some of the implementations come with some nice abstractions
12:40
so that you don't have to handle all of this directly. The most common one is that they provide an object and that object is a stand-in for some result in the future and a way of telling you when that result has happened. Now, Future in asyncio is one of these.
13:04
Twisted uses Deferreds; they are very much the same. They have slightly different ways of operating but they both have the core concept, a thing that you can pass around when you don't actually have a result yet. For example, you have a Deferred, which is an empty thing; it does not have a result yet.
13:22
You tell it that you want to print when it calls back with a result and then you call it back with a result. So now the Deferred has a value and it works through the callback chain. Futures work very much the same. You have a future, which is empty, then you add a callback to it and then you set a result. Now, there are two different ways of saying it.
13:41
You have callback here and you have set_result there but they're pretty much identical. Apart from one little thing: Deferreds run callbacks as soon as they're able to. So they'll run it synchronously and won't yield to the I/O loop. But futures will schedule a callback to happen on the next I/O loop iteration, which is a bit fairer scheduling.
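The scheduling behaviour on the asyncio side can be observed directly. This sketch uses only the standard library and assumes Python 3.7 or newer; per the description above, a Twisted Deferred would instead run the callback inside the call that delivers the result, which is not shown here.

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()      # empty: no result yet
    order = []

    future.add_done_callback(lambda f: order.append(f"callback got {f.result()}"))
    future.set_result("hello")         # the future now has a value...
    order.append("after set_result")   # ...but the callback has not run yet

    await asyncio.sleep(0)             # yield to the loop once so the callback can fire
    return order


order = asyncio.run(main())
print(order)
```

The callback is recorded second even though it was registered before the result was set, because asyncio queues it on the loop rather than calling it inline.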
14:03
That's pretty much the core difference. So, if we've got Twisted, why do we need a new solution? Why do we need asyncio?
14:21
Well, 2012 was kind of a bit of a mess, as far as Python dev was concerned. gevent and eventlet weren't ported yet; that only happened this year, I believe, so it's been quite a while coming. Not much of Twisted was ported, so you couldn't really build any real applications on Twisted on Python 3. Most of Tornado had been ported, though,
14:41
so you could write it if you were using that, but Tornado is slightly less used than gevent and Twisted and all of that. So one of the major ones was ported, but it doesn't cover everyone. Elsewhere, Node.js was completely exploding in popularity.
15:00
Everyone was using it. PayPal was like, yeah, let's port everything to it. Everything was sort of happening very fast over there. And async/await landed in .NET 4.5, so it was sort of a nicer way of doing this sort of asynchronous stuff. Now, Node.js is quite similar to asyncio and Twisted. It has the same event loop at its core.
15:22
It uses libuv, which is a layer on top of all of those selector functions. But it all works very much the same, and it did sort of give credence to the idea of this being a workable solution for pretty much everyone, that it was no longer something that was kind of niche, that you couldn't just put things in threads anymore,
15:41
that there was a real use case for this. Python 3 adoption was kind of getting there. It's always taken a while for Python 3 to get massive adoption. But there wasn't really anything that was sort of Python 3's cool thing. There wasn't anything that you could really look at,
16:01
because there wasn't typing yet. There wasn't any asyncio. It was nice cleanups, but that was about it. So why asyncio specifically? It was designed around coroutines. Now, coroutines in Python are a special kind of generator. So a generator is a function that sort of suspends.
16:25
So it might not have a value yet, so it suspends until it does. So kind of similar to a future, in effect. Python 3.5 especially contains syntax that makes futures act like coroutines, and coroutines act like futures and various other things
16:41
that sort of make them sort of work together. So for example here, we have this code example here. It's just a loop that prints time for five seconds, or every second for five seconds. Now you'll see the special thing there. Do I have the thingy? Yeah. So the special thing here is the async def.
17:02
Now that is what defines a coroutine. You also have this special keyword here called await. Now await is very much like yield is in Python 2 and Python 3, except it doesn't actually talk to the generator itself. It delegates to a sub-generator.
17:22
Now it's a little bit strange how this works, but it does mean that the implementation is a lot cleaner, and it does mean that it works a lot nicer, especially when you have async def and there's async for. All these sorts of things are introduced in 3.5, and it made working with coroutines and working with asynchronous stuff so much easier
17:41
because all you need to do to await is just type await, and then if you can await on this, you will. So asyncio.sleep(1) returns a future. Now in this coroutine, you can await on a future, and then it will wait for the result.
18:00
So if, for example, I'll explain this a bit better. So what happens is this line here suspends until the future returned by asyncio.sleep actually has a result. So it doesn't just keep looping. It waits for one second. Now because it uses the event loop here, it doesn't actually mean that this waits one second.
18:21
All it does is tell the reactor: in one second, stop suspending this future. Tell this future that it has a result in one second so that it will continue with the loop. So you don't have to worry about callbacks. You don't have to worry about all that sort of structuring of your code because it just acts kind of like your old Python code used to be.
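The slide being described is not part of the transcript, so the following is a reconstruction from the spoken description only: a coroutine defined with async def that prints the time and awaits asyncio.sleep between prints. The name `print_time` and its parameters are assumptions, and `asyncio.run` requires Python 3.7 or newer.

```python
import asyncio
import datetime


async def print_time(count=5, interval=1.0):
    """Print the current time `count` times, pausing `interval` seconds after each print."""
    printed = 0
    for _ in range(count):
        print(datetime.datetime.now())
        printed += 1
        # Suspends this coroutine only; the event loop is free to run other work.
        await asyncio.sleep(interval)
    return printed


# Shortened run for demonstration; the defaults give the five one-second steps from the talk.
asyncio.run(print_time(count=2, interval=0.1))
```

Calling `asyncio.run(print_time())` with the defaults reproduces the behaviour described: one timestamp per second for five seconds, with no callbacks in sight.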
18:41
It's very sort of Pythonic. Another thing that asyncio was really meant to do was repair the library API fragmentation because you've got Twisted, you've got Tornado, you've got even gevent, for example; they all have a different way of doing things, and there shouldn't really be so many different ways of doing the same task.
19:01
And if you look at the Zen of Python, there should be one, and preferably only one, way of doing things. So this sort of hopefully was like, here's the one that is how you do it, and all these frameworks can implement it, and you don't have to do it three or four different ways. You just do it like one.
19:21
Of course we all know that XKCD comic about how there are 30 standards, we should unify them under one, and now there are 31 competing standards. But, you know, this time it's kind of different. We hope. It also was meant to reduce duplication
19:41
because asyncio would implement the same thing that all of these selector frameworks had internally, which was the selector loop. Now if asyncio brought its own, then that means that all these other frameworks don't have to have what is essentially the same code. There can be one central one that is centrally maintained,
20:01
all the bug fixes happen there, all of the knowledge can be poured into the one implementation, and you don't end up with several ones that have several small bugs, or work slightly differently, or have downsides, you just have one, and then it can hopefully be the best thing. So, does asyncio replace twisted?
20:22
Well, no. They both do the same sort of thing. They have cooperative single-threaded multitasking, they have primitives for supporting asynchronous programming, like Futures, like Deferreds, and coroutines, like inlineCallbacks in Twisted, sort of. They use the same system APIs, you know, select, epoll, poll, kqueue, IOCP on Windows,
20:44
and asyncio kind of took the protocols and transports abstraction from Twisted, which separates the thing that is the wire and the thing that processes the actual bytes off the wire as two separate concepts, which is really handy if you've got things like TCP except it's actually TCP over some other protocol,
21:06
which happens, for example, I think it's like TCP over SOCKS or whatever, the old proxy sort of thing. It works a lot better if you separate them out so that the individual protocols don't have to care about what their transport is. So, it has the same sort of benefits as Twisted in that regard.
21:25
It's also very architecturally similar internally. If you read the Twisted reactor source code, and you read the asyncio event loop source code, you can see the same things being done, in slightly different ways, but they're essentially doing the same thing. And it's a newer and standard API,
21:40
and it's just there in Python 3.4. You don't have to pip install anything, you don't have to worry about any of that, you just import asyncio and off you go. Now, where this falls down is that Twisted is an async I/O thing. And asyncio itself is an async I/O thing. And it's the same kind of thing,
22:01
and surely you only need one of these things. So, just replace your Twisted usage with asyncio. Well, that's some work, because asyncio is like an apple, and Twisted is a fruit salad. Twisted is, for example, much bigger.
22:20
And, for example here, if you look at the number of lines of code we have, it's a lot more. We also have a lot more comments, which I like. So, if you remove the tests, and then you just do the pure implementation, it's like ten times bigger. Now, that's not because Twisted is ten times as bloated,
22:42
or it does the same thing in ten times the amount of code. It does a lot more than that. It's not only the core reactor, it is also protocols, for example, HTTP, IMAP, POP3, DNS, SSH, all of these different things. And it kind of all does this in one package,
23:02
because when Twisted was first made, nothing else really kind of did it in the same way. And there were a lot of things that Twisted does that Python didn't have yet. For example, ordered dictionaries. We had our own, I think we recently removed it when we dropped 2.6 support.
23:20
We had to have our own ordered dictionary lying around, because it wasn't in Python until 2.7. Or 2.6, I forget. And one big package was very much easier to distribute in the early days of Python. You didn't have pip, you didn't have a PyPI that worked as well as it does now. And even if you did have PyPI, it was down every 20 minutes.
23:41
And it wasn't a good time. So if you just have one package, it was a lot easier to use, and a lot easier to install, because it was just that one thing. And it also came with basically everything you needed, the batteries included, more or less. If we pull this down to what asyncio does, and the equivalent Twisted code,
24:02
they're very much the same. The cores are essentially equivalent. And this equivalent core is basically those primitives, the core asyncio utils, a couple of Python utilities that Twisted has that are in Python 3 now, and a couple of protocols that use all of the above.
24:20
So some quite basic protocols. Oh, sorry. Oh, sorry, I did the wrong slide. This slide is showing that other Python code, for example, Django 1.9, is very much the same sort of size as Twisted. So Twisted is big, but Django is also big. This is the size for comparison.
24:41
Lots of graphs in this one. So as you can see, they're roughly about the same size in lines of code. Twisted is a little bit bigger, but you know. But if you actually look at what Twisted does internally, and what you need AsyncIO to do, to do the equivalent sort of thing, it is very much sort of the same.
25:02
You're going to end up with a lot of code. So some people say, you know, AsyncIO isn't bloated. Twisted is bloated. Look how big it is. Look at all of the code. It's very big and bloated. Well, we just do stuff.
25:21
We also have some protocol implementations that aren't quite in asyncio, like HTTP/2. I don't think that's in aiohttp yet. There might be one or two, but Twisted has nearly first-class support for it, for example. But enough about Twisted. Let's talk about Tornado.
25:40
Who here has used Tornado? So Tornado is another asynchronous framework for Python. It's specifically an asynchronous web framework. It was made by FriendFeed, which was bought by Facebook and then was torn apart and, you know, dissolved because that's what happens if you get bought by Facebook.
26:04
It's sort of similar in some ways. The transport is very similar to their IOStream, but their protocol is a little bit mixed in. They don't have to worry about the generality (there we go, first flub of the word generality) that Twisted and asyncio have. It does implement its own selector loop.
26:22
It does actually have Twisted and asyncio integration. So you can yield Deferreds or you can yield Futures. And they might actually remove their event loop and just replace it with asyncio. So as you can see, they've got a bit further into using the standard sort of thing and they're a really great example of interoperation.
26:42
And is this kind of the future of Twisted? Now, interoperation is hard. As anyone that's ever had to work with a system that's similar but not quite the same, it kind of has its difficulties.
27:00
My focus has been the async and await keywords. These were introduced in PEP 492, and I believe it was mostly written by Yury Selivanov? I can't pronounce his last name. It was introduced in Python 3.5, so it's in a Python you can use right now. And it's pretty cool. I gave the code example earlier. It makes things a lot easier to read
27:21
and looks a lot like your regular Python code. You don't have to worry about callbacks, you don't have to worry about callback hell, you don't have to worry about lambdas for all sorts of things just to add a value together and then pass it down the callback chain, because you just await and then you just do it on the next line. So await, as I explained,
27:41
gets the result of a coroutine. It sort of works like this: you have a coroutine and you await on a coroutine, which is sort of like a Future now in asyncio. They sort of act like each other and they are a special kind of generator. Similar to yield from, it delegates to a sub-generator and it lets you have asynchronous code
28:01
executed in a synchronous style, which is the main draw to it. Twisted has had sort of the same sort of thing since around 2006 called inlineCallbacks, where you use the old yield keyword, which is very much similar to how gevent does it, that you use yield and then you write sort of your standard Python code
28:22
and you don't worry about callbacks as much. But I am working on the interop, and coming soon is a little thing called ensureDeferred. And what that does is it takes a coroutine and it turns it into a Deferred. And then that coroutine itself can await on Deferreds.
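The difference between the callback style and the await style can be sketched with asyncio alone. This is an illustrative sketch, not the code from the slides; `fetch_number`, `with_await`, and `with_callbacks` are hypothetical names.

```python
import asyncio


async def fetch_number():
    # Stand-in for real I/O, such as a network request.
    await asyncio.sleep(0)
    return 20


async def with_await():
    # Reads top to bottom: wait for the value, then use it on the next line.
    value = await fetch_number()
    return value + 1


def with_callbacks(loop):
    # Callback style: the "add one" step has to live inside a callback,
    # and the final value is handed on through a second future.
    result = loop.create_future()
    task = loop.create_task(fetch_number())
    task.add_done_callback(lambda t: result.set_result(t.result() + 1))
    return result
```

Both versions compute the same value; the await version simply keeps the follow-up step on the next line instead of in a lambda.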
28:41
So if you want to write Twisted code on Python 3.5, you can just await things. You don't have to worry about Deferreds or callbacks or anything like that. If it returns a Deferred, you just await on it. And then because it's async def, it is actually a coroutine.
29:03
Yes, I said it 12 seconds ago and I forgot. It's a coroutine. So you just go down here, and ensureDeferred is the function that takes the coroutine and returns a Deferred. So that means that you can write this code and if something is like, I accept a Deferred, you can be like, well, I'm just going to write this async def function.
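The same idea has an asyncio-side analog that can be shown without Twisted installed: `asyncio.ensure_future` wraps a coroutine in a Task, which is a Future, so callback-oriented code can consume it without knowing a coroutine is behind it. This is a sketch of the analog only, not Twisted's ensureDeferred; `compute` and `legacy_consumer` are hypothetical names.

```python
import asyncio


async def compute():
    await asyncio.sleep(0)
    return "done"


def legacy_consumer(future, log):
    # Hypothetical older-style code: it only knows how to attach callbacks.
    future.add_done_callback(lambda f: log.append(f.result()))


async def main():
    log = []
    # ensure_future wraps the coroutine in a Task, which is a Future.
    task = asyncio.ensure_future(compute())
    legacy_consumer(task, log)
    await task
    # Give any still-pending callbacks a chance to run before returning.
    await asyncio.sleep(0)
    return log
```

The consumer only ever sees a Future, which mirrors the point that code accepting a Deferred does not need to know an async def function produced it.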
29:21
And the other code doesn't know, because it returns a Deferred, and you can await Deferreds inside of it. Also coming is an asyncio reactor, which is a Twisted reactor on top of asyncio. So sort of replacing those Twisted internals with asyncio, in the sort of original idea of what
29:42
was supposed to happen. So it's on top of asyncio. So that means that you can share the different things. So you can have a Twisted protocol. Or in this example, for example, this is aiohttp. So using uvloop, which is the high-performance asyncio
30:02
reactor. So treq here is a Twisted thing and aiohttp is an asyncio thing. We just get the reactor and we just tell it, yeah, we're running. And this here is just a coroutine, which is an aiohttp coroutine. So for example, handling a web request.
30:20
And we're doing some Twisted, some treq stuff in there. And we just go deferred-to-future. And then the asyncio code just believes that's asyncio code. It'll await and wait for the Deferred to fire. And because they're running on the same reactor underneath, they won't block each other.
30:42
So you'll be able to have some asyncio stuff, some Twisted stuff, and it won't really matter. So yeah, here's the core part of it. Just future-to-deferred or deferred-to-future. So hopefully, the next version of Twisted will come with it. asyncio does need to patch one or two little things.
31:01
I've been discussing it with them at PyCon US. I've got to follow up on it a bit more. But it is there. It is very close to having that sort of thing, where you can have Tornado, Twisted, and asyncio all using the same event loop. And you can sort of bring your own and bring the different abstractions from the different frameworks and use whatever you're most comfortable with.
31:21
But why is Twisted itself, apart from asyncio, still worth using? It's released often. We have three-plus releases a year. 2016 is set to have five releases, which is quite often for a project of Twisted's size. That means that we are going to be able to get features
31:40
out a lot quicker than, for example, asyncio, because asyncio has to wait either for a new PyPI release, which I'm not sure when they do, or wait for a new Python release. Ours are time-based releases taken off our trunk branch. So we just say, yep, we'll release here. So you don't end up with big features that are sort
32:01
of half-merged, because the trunk still has to keep working. So you can get the cutting edge pretty safely. We do have a lot of protocols in the box. So here's just a small list of them, like just some of the random ones that I've seen people use. Not Finger that much, but it's in our tutorial. So a lot of them are ported to Python 3.
32:23
Some of them aren't. It just comes down to someone saying, oh, we use that protocol. We want to be on Python 3. And then I sit up at 3 AM porting it. And it's ported, basically. And it's super easy to actually make your own protocols. So if you need to talk to some custom system, or you feel like running your own protocol for whatever reason,
32:41
you can just do it in Twisted. Same with asyncio. It's quite easy. So that there is just an example of something that just echoes out whatever you send to it on the command line. We also have HTTP/2, which is really cool, because this is pure Python HTTP/2.
33:01
So this is without nginx, without Apache, all of that. Pure Python. So you can just pip install twisted[http2], and then just set up your TLS certificate, because that's how it negotiates that. Your browser says, I want HTTP/2, in the TLS request.
33:22
And then it'll just let you have HTTP/2. And pretty soon, we're going to have all the server push stuff and the client support. And it's kind of cool that you can just do this in Python. And this also means that when we get deferred-to-future and future-to-deferred working, you can write asyncio code that uses HTTP/2.
33:42
We also do have established library support. We've been around for a very long time. And we do have a lot of handy little things. Some of my favorite libraries are txacme and txsni, which are a Python interface to Let's Encrypt. And it lets you do automatic certificate renewal. So if, for example, you go to my website, which is, at least for now, .net, it'll go onto HTTPS.
34:05
Yes, now I don't actually have to provision the certificates or anything like that. I just turn on txacme. And it goes and automatically gets it. It gets the certificate and does the challenge. It handles all of that. And it sets it up. So I have like a straight A on the Qualys SSL test
34:21
without ever actually having to look at a certificate. There's Hendrix, which is like a WSGI runner, which uses Twisted. It lets you do WebSockets and TLS and run Twisted code inside your blocking code, like Django or Flask or whatever. So it's a pretty cool project.
34:41
There's Autobahn, which is one of the things I work on, which, for example, because Twisted and asyncio share the same sort of protocol and transport abstractions, is a WebSocket library that has a single protocol and then sort of shims for asyncio and Twisted. So you have the same sort of dependable base of WebSockets.
35:01
And it works the same on AsyncIO and Twisted. And it works pretty well under PyPy, the optimizing JIT compiler, which is also very good. And also the HTTP2 stuff here also works very well under PyPy and is actually probably one of the faster HTTP2 implementations out there.
35:21
We're also a very dependable base, because we try not to break your code. Now, as some people that have been on the receiving end of my releases may know, we don't always manage this. But Twisted is a very big project, and we try not to. And that's kind of at least half of it, not breaking your code.
35:40
We have deprecation cycles. So we don't have 2.0. We don't have 3.0. We say that we want to get rid of the usage of this. So we're going to have a new version that does things right. And we're going to deprecate the old one. And in a year or more, sometimes depends, we'll actually just remove it.
36:00
So when you upgrade from, for example, 16.3, something might be deprecated. You see the deprecation warning. You fix it. And then come by like 17.3, it's gone. So it means that we're constantly getting the new stuff and updated stuff. And it's a lot more fluid than if you have, for example, the big 2.0 release that breaks everything.
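A deprecation cycle of this kind can be sketched with the standard library's `warnings` module. Twisted ships its own helpers for this in `twisted.python.deprecate`; the sketch below deliberately avoids them so it runs without Twisted, and `old_name` and `new_name` are hypothetical functions.

```python
import warnings


def new_name(x):
    """The replacement API that does things right."""
    return x * 2


def old_name(x):
    """Deprecated spelling, kept working for one deprecation cycle."""
    warnings.warn(
        "old_name() is deprecated; use new_name() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return new_name(x)
```

Callers keep working while the warning tells them what to change, and the shim can be deleted in a later release once the cycle has run its course.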
36:22
And then you end up never porting. You just need to make sure you're on the latest version of Twisted, which is pretty easy, because the releases are every couple of months. So they're not huge changes. They're rather small. So you can upgrade with basically impunity. You can just see that there's deprecation warnings. And you run your tests against it, because you have tests.
36:42
Don't you? Yes, you run your tests. And then you can go, OK, everything is fine. And then when something does break, you're just fixing one little thing, not the entire bunch. We also have code review. Code review is sort of the thing that Twisted did, and now everyone else is doing it.
37:02
Because it's great. We have lots of automated tests, like thousands and thousands and thousands of tests. So we try and make sure that everything in our code base will work, because we have a test to prove it. We're at about 90% coverage. So your code will only break if you're using the 10%,
37:20
which is actually probably stuff like, what was it? If you use the MSN support, which I removed, because it sucked. You can't actually use MSN anymore. It's kind of terrible. You can also add PyPy to it, because most of our tests pass. We're working on getting the last 10 or 15 tests.
37:42
They're all CPython assumptions, like they're just about the garbage collector, sorry. The tests assume that when something goes out of scope, it'll be immediately garbage collected. Not the case on PyPy, and we need to alter our tests for that. We have a lot of people that run it in production. We run it in production ourselves. And the speed benefit is absolutely amazing.
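The garbage-collection assumption mentioned above can be illustrated with a weak reference. On CPython, reference counting usually frees an object as soon as its last strong reference disappears; on PyPy that is not guaranteed, so a portable test asks the collector to run first. This is a sketch of the general issue, not one of Twisted's actual tests; `Resource` and `is_collected` are hypothetical names.

```python
import gc
import weakref


class Resource:
    """Placeholder for an object a test expects to be cleaned up."""


def is_collected(force_gc):
    obj = Resource()
    ref = weakref.ref(obj)
    del obj  # the last strong reference goes away here
    if force_gc:
        gc.collect()  # portable: explicitly ask the collector to run
    # The weak reference returns None once the object has been freed.
    return ref() is None
```

A test that calls `is_collected(force_gc=False)` and expects `True` is relying on CPython's reference counting, which is the kind of assumption that has to be removed for PyPy.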
38:01
You can handle twice as many TCP connections just by switching out your Python interpreter. You can serve a bajillion more DNS requests. And you can do so much more templating per second, because it's a just-in-time compiler. So those core inner loops are all compiled to machine code. And then they go really fast.
38:23
We support a bunch of platforms. So that means that the tests pass on the platform, and we gate any merges on them passing. So we end up with a huge bunch, so pretty much nearly anything you run, it'll work. There's even a couple of other platforms unofficially supported that work pretty well.
38:43
We have people running it on like HP-UX, like that Unix thing. And it works, and I don't know how, but you know. We support Python 2.7 on all platforms, Python 3.4 and 3.5 on Linux.
39:02
Python 3.3 as well, but I don't think there are any current platforms with it that aren't end-of-life, so we don't test on it anymore. PyPy's close. A few tests remain. And PyPy3, which is actually 3.3.5, is getting worked on. So that means that you'll be able to have your Python 3 code and also fast code, rather than picking clean Python
39:20
3 code or fast Python 2 code. And support's coming to Windows soon for Python 3, 3.4 and 3.5, just cleaning up the last little things. And most of all, the reason why I think Twisted and Tornado and all of those other frameworks that aren't asyncio have real value is that competition is really good.
39:42
We fit in this ecosystem, if only as competitors, because we can have some things that are good, and then AsyncIO can go and do things better, and that means that we have to go do things better to compete. So it means that we just keep moving forward, all together, as a community. And as the interoperation gets better, it means that we all benefit.
40:02
Now, where to from here for asyncio? Well, interoperation is the big thing, because then that means that you can use all of your old Twisted code and your new asyncio code and your new Twisted code and all of that on Python 3. There's the async-sig mailing list, which I haven't actually subscribed to yet.
40:22
But we are going to be talking a lot more in the coming weeks, coming months, coming years, especially, about interoperation between all the frameworks and making it so that everything sort of works together. Now, if you want to know some more about, for example, protocols, I recommend Cory Benfield's talk.
40:41
He's one of the Requests maintainers and the author of the Twisted HTTP/2 support. It's his PyCon US talk, Building Protocol Libraries the Right Way. Alternately titled, you do it wrong, and all of my libraries except one do it right. No, wait, other way around.
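As the Q&A later in this talk puts it, that approach keeps the protocol itself synchronous, like a state machine, with a thin wrapper per framework around it. A minimal sketch of the shape, assuming a toy newline-delimited protocol (`LineProtocol` and `AsyncioLineAdapter` are hypothetical names, not code from either talk):

```python
import asyncio


class LineProtocol:
    """Synchronous core: bytes in, decoded lines out, no sockets or futures."""

    def __init__(self):
        self._buffer = b""

    def receive_data(self, data):
        self._buffer += data
        # Everything before the last newline is complete; keep the remainder.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]


class AsyncioLineAdapter(asyncio.Protocol):
    """Thin asyncio wrapper; a Twisted wrapper would be similarly small."""

    def __init__(self, on_line):
        self._core = LineProtocol()
        self._on_line = on_line

    def data_received(self, data):
        for line in self._core.receive_data(data):
            self._on_line(line)
```

Because the core never touches I/O, it can be tested by feeding it bytes directly, and each framework only needs its own small adapter.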
41:01
And Thinking in Coroutines by Łukasz Langa at PyCon US. So that's also a good talk about how coroutines themselves work and how they especially work in asyncio code and how they work internally, which is kind of good to know if you ever run into some issues with it. And questions.
41:21
So ask questions. If you would like to yell at me about how I'm wrong, please wait until afterwards. You can yell at me how I'm wrong. I love it. Yes. Scrapy, I would say that there's probably not any value in it,
41:43
especially once the interoperation stuff comes into account. So Scrapy is, I believe you write, it's like a whole bunch of tools and then you do write asynchronous stuff for fetching the pages and processing them, right? So in that case, because Scrapy is large,
42:04
I don't think that something like Scrapy could really survive a transition to AsyncIO without some major rift. You can't completely break all the existing code and all of that without some ramifications. Now, with, like, interoperation stuff, then that means that you sort of,
42:21
it doesn't matter what Scrapy itself is written in because it'll work with your AsyncIO database adapters, the AsyncIO other things. But as far as, like, Scrapy and other projects, it's kind of worth just keeping on whatever you're on and waiting for everything to catch up talking with each other.
42:40
Because you just can't rewrite that amount of code without something going wrong. And that's the unfortunate bit, is that you can't just take one and go, oh, this is better because all the existing code won't work. And that's really valuable to some people.
43:02
This one? Okay, so this will wait here and this will wait here?
43:21
Okay, so the reason is that this get here, this is actually a thing about treq's interface, is that .get will return when you have headers and then the .content on the thing that's returned there will return when the entire body is fetched.
43:42
So that's just a particular thing for this API, that you get an early response that's not the entire body and then you get the rest of the body, which might take hours, minutes, days, depending on how big it is and what your connection is. So that's just a purely sort of thing there. You can use as many await statements as you want
44:02
and all of that, so it's really just limited by your imagination. But, yeah, this is sort of an interesting example showing the early interop. As you can see, there's some really ugly, terrible stuff here for how aiohttp thinks headers are
44:20
and how Twisted thinks headers are. Because we think ours are lists of... Yeah, it's different. But, yes. Orange, yep. Yes, yep.
44:53
Yeah, so that's actually up in... almost up in review right now, is that, so that's what this twisted.internet.asyncioreactor,
45:01
which is almost in review, is pretty much doing, is that what it is, is that so you have the asyncio event loop and then all this here just maps the function calls to the asyncio function calls. Yes, so asyncio manages all that. Yep. And that's just because
45:21
the interface to the reactor is very similar, but we use camelCase and asyncio uses not camelCase, snake_case, and we use different names for things. So it's just purely a thin mapping layer, so that things keep working on top.
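The name-mapping idea can be sketched as a small facade over an asyncio loop. This is a heavily reduced, hypothetical illustration (`TinyReactor` is an invented name), not the reactor that was up for review: a real Twisted reactor has a much larger interface, and its `callLater` returns a Twisted delayed-call object rather than an asyncio handle.

```python
import asyncio


class TinyReactor:
    """Hypothetical reactor-style facade over an asyncio event loop."""

    def __init__(self, loop):
        self._loop = loop

    def callLater(self, delay, func, *args):
        # Twisted-style camelCase name mapped onto asyncio's snake_case call.
        return self._loop.call_later(delay, func, *args)

    def run(self):
        self._loop.run_forever()

    def stop(self):
        self._loop.stop()
```

asyncio still owns the scheduling; the facade only translates names so that code written against the reactor-style interface keeps working on top.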
45:41
This is pretty much up for review and there exists something currently on PyPI called txtulip, which is actually just this, and I've just improved it a little bit to make it work on, I think, epoll a bit better, but it's been done before, so yeah. Yes.
46:04
txaio, so, yes, that's one of the things Crossbar and Tavendo, yeah, Crossbar, formerly Tavendo, actually work on, that's what we use for Crossbar and, sorry, we use it for Autobahn. So the thing about txaio is that
46:22
because Autobahn needs to work on Python 2.7, we can't really use the coroutine way of interop. Which will be, sadly, Python 3.5+, because coroutines sort of give that little gap where we can do the compatibility. So if you need to support Python 2, txaio is sort of good at that,
46:44
but the better way, well, it is sort of two things. You don't want to use it too heavily because you want to do in the way of, where is it, Corey Benfield's thing. So, where he's essentially got his protocol,
47:01
which is the meat and potatoes of the HTTP/2 support, and all of that is synchronous. It doesn't use Futures or Deferreds. It's like a state machine and then you have a wrapper around that that handles making the Futures, making the Deferreds. Now, txaio is that for Autobahn, and it's useful for some other projects,
47:21
but I would say that going forward, for writing new code, the coroutines way of interop would be better because it's a lot more Pythonic, while with txaio, you're sort of reduced to a common denominator, the lowest common denominator of what Futures and Deferreds both do
47:41
to make it work. So you don't end up using like deferreds how you would use deferreds or futures how you would use futures because you sort of have to use them both at the same time. So, it's good for current software, but there will be more optimal solutions in the coming years, and when we drop Python 2 support
48:01
as like a community, then it will get a bit better. So, the only competitor that Twisted on PyPy has in the asyncio world
48:21
is uvloop. So, where is uvloop? Sorry, I'm jumping between these so much. So, uvloop, no, that's not uvloop. This is uvloop. So, uvloop here, that is the only thing that can come anywhere close to Twisted on PyPy.
48:40
Now, the problem with uvloop is that uvloop is the core event loop, it's in C, but if you look at Yury's benchmarks, it doesn't actually make something like aiohttp any faster, because there you're restricted by Python's interpretation. So, while PyPy in that case, even if our reactor is slower,
49:02
all of the actual protocol code is much, much faster. So, I'd say that when PyPy 3 comes out and uvloop gets a port to CFFI, so it works better on PyPy and other sorts of things, then you're going to have a truly fast asyncio,
49:20
but that's just like, it's plenty fast already and uvloop makes it really good for most things, but Twisted on PyPy is still top of the pack as far as I'm concerned, just because the JIT works on all your protocol code as well, which is, in real-world applications, the bulk of the processing that's happening,
49:40
not the reactor. Sure. Okay, so, like, uh, tornado has the multi-process implementation where people just wait too late and they implement, like, uh, locking ports. They just say, oh, to remove that option,
50:04
I actually can teach people the proper... Yeah, so, yeah, so the problem with multi-processing is that multi-processing is great, well, multi, as in, running it in multiple processes is great in the context where you have CPU-bound workloads, so if you're doing lots of math,
50:21
doing lots of things like that. So, in that case, you're still going to need multi-processing, you're still going to need, I think, asyncio has a thing in concurrent called, like, process executor or something like that, which is, sort of, you run some code in a process and it returns a future. Now, that sort of thing is still going to be valuable going forward, simply because
50:41
until we have true threading in Python, which, although Larry Hastings' Gilectomy is coming up and, I think, going well, we've still got all these Python versions that won't have that, and it might not even land in Python, because it might break the C API too much. So, both, yes, it's good to say, don't do blocking calls,
51:02
because blocking networking calls are the devil and should not be done, but we also need Twisted and asyncio and all of that to make it easier to run code in processes for the sort of thing where it's not networking, where it is CPU workloads, where it's processing images or doing natural language processing
51:22
and all that sort of thing. So, sort of half and half is that you probably shouldn't use multiprocessing for talking to the network, but it does still have a lot of value. And we need to get better at supporting it going forward, sort of having one central way of doing it, which asyncio has one, but that's not two, I think.
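The "run some code in a process and get a future back" idea mentioned above corresponds, in the standard library, to `concurrent.futures.ProcessPoolExecutor` combined with the event loop's `run_in_executor`. A minimal sketch, assuming a CPU-bound job small enough to stand in for real work (`factorial_digits` is a hypothetical helper):

```python
import asyncio
import math
from concurrent.futures import ProcessPoolExecutor


async def factorial_digits(n, executor):
    loop = asyncio.get_running_loop()
    # The CPU-bound call runs in the executor; the event loop stays free.
    value = await loop.run_in_executor(executor, math.factorial, n)
    return len(str(value))


def main():
    # Worker processes sidestep the GIL for CPU-bound work.
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(asyncio.run(factorial_digits(500, pool)))


if __name__ == "__main__":
    main()
```

The coroutine only depends on the executor interface, so the same function also works with a thread pool when the work is blocking I/O rather than computation.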
51:43
Yes. You first and then you. Sorry? Yep.
52:01
I would say not. The eventual future is probably going to be that we ditch all of our reactors and twisted becomes the protocols, and that's it, rather than the other way around. And it's more likely that we're going to split out more and more of our Python utilities, for example, deferred, which is purely just a Python utility,
52:23
and split all that into different sort of Python packages so it can be more widely used. So, that is the sort of future I optimistically see Twisted going towards, is that it is essentially just a bunch of protocols for asyncio, but that's not going to happen until
52:40
at least 20, like 30, sorry, 2023 because we are giving, so with the 2020 Python 2 drop, Twisted's not subscribing to that because we could only start porting at Python 3.3, it was the first version we could realistically port code, and we want to give our users
53:01
the five years death notice. So, once it goes, we drop 2.7 support, which is most likely in 2023, maybe longer, it depends, maybe shorter, maybe everything is ported tomorrow and then we can just drop it tomorrow. I would say that that would be the eventual best case
53:21
is that we don't have the reactor, we don't have all these utilities, we are just protocols. Thank you, sorry, did the person yep, yep, little hour.
53:52
So, talking to databases, so for example, one great thing is, okay, so a lot of the current ones talk to a socket, so they wouldn't work
54:01
as it stands. There is a library for Twisted called txpostgres, which wraps the native async parts of the Postgres C library, so it just uses that and does all of that, but you will ultimately need to write brand new database drivers that are natively asynchronous, so
54:23
you're going to have to write a lot of code for that. The usual way to do it is just wrap it in the thread pool and go shrug, but that's not the optimum way. Native ones will always work better, will be more efficient, but yes, it will require a lot of code rewriting, which is fun.
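The "wrap it in the thread pool and shrug" approach looks roughly like this on the asyncio side, using sqlite3 as a stand-in for any blocking driver. Twisted's equivalent helper is `deferToThread`; this sketch stays with the standard library, and `blocking_query` and `query` are hypothetical names.

```python
import asyncio
import sqlite3


def blocking_query(value):
    # Ordinary blocking driver code; the connection is created and used
    # entirely inside the worker thread.
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute("SELECT ? + 1", (value,)).fetchone()[0]
    finally:
        conn.close()


async def query(value):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_query, value)
```

This keeps the event loop responsive, but each in-flight query still occupies a thread, which is why a natively asynchronous driver is the more efficient option.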
54:44
Anyone else? Oh, yes. In paradigms that don't require like the GIL and stuff, so can you think of, for example, concurrency, or actually in particular whether the GIL is a true constraint.
55:03
The GIL is only a true constraint when you are dealing with C extensions. With C extensions. So the thing that PyPy has is, it's got this experimental thing called STM, which is software transactional memory. Now when that kind of gets forward a bit more, and it doesn't crash as much,
55:20
because STM is a bit... it's a wonderful technology. Basically what it means is that rather than having a global interpreter lock, you have a lot of finer locks. And you lock specific bits of memory. Now when you do something like Twisted or asyncio and all of that, where you have the core reactor and then
55:40
all the things that come down from it. But these individual sort of handlers don't talk to each other. So that means that in the sort of GIL-free world, you just say all of these have a software transactional memory lock for their own sections and then they all run in parallel. So the GIL, in a world where there aren't
56:02
C extensions, or there are better C extensions, and you're writing Twisted and asyncio and all of that, then the GIL is not required. Essentially. Because it's just a thing that the C API needs and also prevents race conditions. That's also the other thing, while software transactional memory
56:21
also prevents race conditions by having finer locking. So if one tries to touch memory that the other one has locked, it will actually run in sequence, not in parallel. So it will work around it. The GIL is not required, basically. But it's there because everything is horrible.
56:41
Anyone else? No? No more questions? Come on. You look like you want to ask a question. All of you I just pointed out. So I'm being general. Am I? Oh, okay. Okay, how about this? Who wants coffee?
57:02
Okay. I love being in the last slot because I can just run over time. It's great. Sorry?