
PyPy meets Python 3 and Numpy


Formal Metadata

Title
PyPy meets Python 3 and Numpy
Title of Series
Number of Parts
160
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
PyPy meets Python 3 and Numpy [EuroPython 2017 - Talk - 2017-07-14 - Anfiteatro 2] [Rimini, Italy] PyPy is an alternative Python implementation whose JIT often gives seriously better performance than CPython. Now PyPy supports, in beta version, two major new application domains: Python 3.x, and Numpy and the rest of the scientific stack. These are each an important milestone for a subset of the Python community. Thanks to a grant by Mozilla, "PyPy3" now largely supports Python 3.5 with one or two extensions from Python 3.6. Full support should be very close. (Note that PyPy2 will not disappear, if only because PyPy itself is written in Python 2.7.) Numpy and the major packages of the scientific stack are now starting to work well with PyPy (PyPy2 mostly, but also PyPy3). This is thanks to progress in "cpyext" emulating the CPython C API, as well as fixes to the packages in collaboration with the upstream developers. We will also mention some more "what's new in PyPy" topics from the last couple of years
Transcript: English (auto-generated)
So, I'm going to give you what is basically five lightning talks, end to end, because I'm going to present what is new in PyPy, mostly since last year. A lot of small things are new, "small" in quotes.
So what is PyPy, first? PyPy is another implementation of Python; I guess you kind of know that if you're here in my talk.
It's mostly a drop-in replacement, which means that if you have a project that uses CPython and you start it with python foo.py, then you can start it with pypy foo.py instead, and it should work the same way.
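As a rough illustration of the drop-in claim, here is a hypothetical foo.py. The file name comes from the talk; its contents are an assumption made for this sketch. It is plain Python with nothing interpreter-specific, so the same file can be started with either python foo.py or pypy foo.py. It reports which implementation ran it and how long a small pure-Python loop took. No timings are quoted here, since they depend on the machine and on the interpreter version.

```python
# foo.py: hypothetical script; nothing in it is specific to CPython or PyPy.
import platform
import time


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately plain Python)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    start = time.time()
    result = count_primes(50000)
    elapsed = time.time() - start
    # platform.python_implementation() returns e.g. "CPython" or "PyPy".
    print("%s: %d primes below 50000 in %.2f s"
          % (platform.python_implementation(), result, elapsed))
```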
So far PyPy supports Python 2.7, the language version. Now, within these few months, it is about to support Python 3.5, the language.
And the advantage: why would you use PyPy instead of CPython? The most common reason is just performance. There are a lot of programs that are just much faster on PyPy. Okay, so first let's talk about PyPy 3.5.
So it's Python 3.5 support, which is released in, well, beta was the name I wanted to say. I was told that maybe I should use another name, so I'm using gamma instead. So why is beta not a good name?
Because it works, it's very stable, it's very fast and everything, but it could be wrong in a few details here and there, which means it's probably missing some functions in the socket module: this new socket method is not implemented, or that new extra keyword argument of some function is not implemented. It's this kind of thing that is not done yet. But it is there and it works, and it's fast.
So the current status is: please try it, and report issues if you actually find this kind of missing functionality, because just by running the CPython test suite we find most of the missing functions, but not all of them.
A special thank you goes to Mozilla, because Mozilla has given us a big grant, something like four people for one year, to do this work on Python 3.5.
And this is a set of benchmarks running on Python 3.5. They are benchmarks for the new async and await keywords in 3.5, and they basically show that it's fast. The red lines are the speed in requests per second on PyPy, and the blue lines are on CPython, and it's async, sorry, aiohttp, gevent, Tornado web, curio and h11, and on the right, Twisted and Klein.
So it's fast, cool. I should mention for completeness that if you are writing an async HTTP server, and you are writing it for performance on CPython, then you would probably not use any of these, but instead use things like libuv, which is pretty fast on CPython, but which is not included here because it's all C, basically. These plots are more about the performance of pure Python code.
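To make "pure Python async code" concrete, here is a minimal sketch of the kind of code these benchmarks exercise. It is not the benchmark code itself; handle_request, serve_many and run are hypothetical names, and the handler only simulates waiting on I/O. It uses only the async and await keywords plus the standard asyncio module, so all the work stays in Python rather than in a C library.

```python
import asyncio


async def handle_request(request_id):
    # Yield to the event loop once, as a real handler would while waiting on I/O.
    await asyncio.sleep(0)
    return "response %d" % request_id


async def serve_many(count):
    # Run all handlers concurrently and collect their results in order.
    return await asyncio.gather(*(handle_request(i) for i in range(count)))


def run(count):
    # An explicit event loop is used so the sketch also fits Python 3.5.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(serve_many(count))
    finally:
        loop.close()
```

Calling run(1000) drives a thousand small coroutines through the event loop, which is the sort of interpreter-bound work where a JIT can help.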
Okay, the status: it's roughly complete, so complete 3.5 support plus f-strings, which is a new feature in 3.6 that we also added to 3.5, because why not, and because people tend to like this feature.
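For readers who have not met the feature, a tiny f-string example follows. The variable names and values are illustrative only; the point is that expressions inside the braces are evaluated and formatted in place.

```python
# Illustrative values only; any expression can appear inside the braces.
name = "PyPy"
major, minor = 3, 5
message = f"{name} {major}.{minor} supports f-strings"
print(message)
```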
It has good performance. It is tested mostly on Linux so far, which means that some Windows- or OS X-specific interfaces are probably missing. The first final version, I mean the first version that will not be called beta or gamma or whatever, will be released soon. Once we get there, I'm confident that we'll continue and progress to Python 3.6, and probably 3.7 next year, et cetera.
Okay, so this was my first lightning talk. The second one is about the scientific stack. If you are using the scientific stack, you are using, well, first NumPy. So the news, what is new since last year: first, there are two different modules that you can use inside PyPy.
One is NumPy, the standard version of NumPy that you download with pip install numpy, and there is another one called NumPyPy, which is a built-in module of PyPy and which is not complete. So what we did is basically this:
NumPyPy is deprecated, don't use it anymore. Instead, just use the standard NumPy. What we did is support enough of the C layer, the CPython C API layer, to be able to run NumPy.
The end result is that NumPy really works. Here I put 99.9% because there are two failing tests out of the thousands and thousands, and they are bad docstrings or something. Another interesting bit is that it works on the standard PyPy that implements Python 2, but also on the PyPy that implements Python 3, 3.5 in this case.
From this point you can build on top of it, and things mostly work. As in, you can run Jupyter, Matplotlib, pandas, et cetera, and all these packages mostly work on PyPy. So here is an example that is not at all impressive: it's a standard Jupyter notebook, I can run this, and I get Matplotlib to print me, oops, sorry, this nice graph. The only thing special here is that it happens to be running inside PyPy 2.7.
It would work just as well inside PyPy 3.5. So it works, cool. What I said about C extensions is more general: Cython modules work, and more generally any CPython C extension module mostly works nowadays, out of the box, on PyPy. This is all thanks to cpyext, which is our own re-implementation, or emulation, of the CPython C API.
And yes, it's an emulation, which means that it is a bit slow. So it is slower to run NumPy and the complete scientific stack. If you run this and compare with the performance you would get on top of CPython, you will find that PyPy is slower.
Less so than up to last year, but still slower. There are workarounds. What is slow is really the boundary between Python and C. So, for instance, if your program happens to be doing quite a lot of array indexing, it's slow, because every single array indexing operation crosses this boundary. On the other hand, if what you are using NumPy for is computing the eigenvectors of one huge matrix, then that is a complex and lengthy process which is just as fast, because you cross the Python/C boundary exactly once.
And, so far, there is this speed hack here that, well, don't use it, basically, but it's very good right now. You take your ndarray, you consider it as a buffer, you get a CFFI pointer to its address in memory, you cast it to a double star, which means an array of doubles, and you get p, which is a pointer to the first, and to all, items in the array. Once you have this p, you can do p[0], p[1], et cetera, and index it.
Just by doing that, you are replacing this, where every single time you write it you get a very slow process that needs to go through layers and layers. If you replace it with p[index], then it is JITted into one machine instruction. So it's great for speed. Obviously, what we are going to do in the future, probably the near future, is add a few special cases so that this ndarray indexing becomes as fast as that hack.
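The speed hack described above can be sketched roughly as follows. This is a reconstruction from the spoken description, not the code shown on the slide; the array contents and the two helper functions are assumptions made for illustration. It relies on cffi's ffi.from_buffer and ffi.cast and requires both numpy and cffi to be installed.

```python
import cffi
import numpy as np

ffi = cffi.FFI()

# A contiguous array of C doubles; the values are arbitrary.
arr = np.arange(5, dtype=np.float64)

# View the array's memory as a buffer, then cast it to "double *".
# The cast pointer does not own the memory, so keep `arr` referenced
# for as long as `p` is in use.
p = ffi.cast("double *", ffi.from_buffer(arr))


def sum_via_pointer(ptr, length):
    """Index through the CFFI pointer; no bounds checking is done here."""
    total = 0.0
    for i in range(length):
        total += ptr[i]
    return total


def sum_via_ndarray(array):
    """Index the ndarray directly; each access goes through NumPy's C code."""
    total = 0.0
    for i in range(len(array)):
        total += array[i]
    return total
```

Both functions compute the same sum; the difference the talk points at is only in how each element access is carried out on PyPy. Because p is a raw pointer, indexing past the end of the array is undefined behaviour, which is one reason the speaker says not to rely on this.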
Okay, so we have plans to improve this, and some funding would be welcome. For now, just try it out on your own code and see if it helps. Basically, if you have something that just uses NumPy and does nothing else, then no, it's not going to be faster on PyPy; it's probably going to be slower. But if you have some bigger program that has parts that really do logic in Python as well, then your complete program might get faster by running on PyPy.
Okay, so now a completely unrelated topic: software transactional memory. This is a topic that I have talked about for the past few years.
PyPy-STM, software transactional memory, would be a way to get rid of the global interpreter lock, and I've been asked about 10 times during this week what the status is. So here is the answer: sorry, it does not seem to work.
Why does it not seem to work? The idea that I thought might work is that you take your program, a big and complicated program that does tons of things, and at one point it has a loop that does something on every element of some collection, where every single thing it does is mostly independent. You would add a hint or something, and then the mostly independent parts would really be run in parallel, with a conflict detection system to figure out which of the things that ran in parallel actually gave the correct answer, the correct answer meaning the same answer as they would have given if they had not been run in parallel. So it's a very nice idea. However, in practice, you get conflicts between the threads, and the conflicts are hard to find.
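To show what a "conflict" between mostly independent iterations means, here is a toy model. It is not how PyPy-STM works internally and uses none of its API; the read and write sets below are hypothetical, written by hand, whereas a real system would record them at run time. Two iterations conflict when one writes something that the other reads or writes.

```python
from itertools import combinations


def conflicts(a, b):
    """Two iterations conflict if one writes something the other reads or writes."""
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & a["reads"])


def find_conflicts(iterations):
    """Return index pairs of iterations that could not safely run in parallel."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(iterations), 2)
            if conflicts(a, b)]


# Hypothetical access logs for three loop iterations. Each writes its own
# result slot, but iterations 0 and 2 also update a shared "total".
iterations = [
    {"reads": {"item0", "total"}, "writes": {"result0", "total"}},
    {"reads": {"item1"}, "writes": {"result1"}},
    {"reads": {"item2", "total"}, "writes": {"result2", "total"}},
]

conflicting_pairs = find_conflicts(iterations)
```

In this toy the shared "total" is easy to spot. The difficulty described in the talk is that in a real program the shared state is buried somewhere in a large code base, and nothing points at it directly.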
It means that you're going to spend two hours using some tools, which I don't even know what they are and which still need to be invented, only to find one conflict. And then, yes, it's obvious, it's this thing that conflicts with that thing, so you fix that, try again, and, poof, exactly the same bad performance as before. So you need to repeat the process several times, you have no clue how many times, and then at the end, suddenly, you get good performance. Yes, cool, it works.
And then the main problem is that it is your program, and the program is not frozen, right? The next time you change anything in your program, poof, you're going to reintroduce some conflict, and you're back to square one.
I think this is enough of an issue to kind of kill the idea. You cannot test for the result unless you write some kind of complex test that checks that the execution runs in less than 30 seconds, and then you hope that your machine is not doing anything else while the test is running, for example. So yes, the idea does not work. I mean, it's put on hold for now, and the kind of thing we are thinking about instead is: okay, let's do a standard GIL-free PyPy. It would not be an STM PyPy. It would be basically the same thing that Larry Hastings is doing for CPython. It's a lot of work for us.
Well, see the lightning talk later. Okay, reverse debugger. The reverse debugger is something that I gave a lightning talk about here last year.
A reverse debugger is an essential tool that everybody needs about once a year or once every two years, something like that. Myself, I've needed it twice since last year. But it's essential.
It's a debugger with the ability to go forward and backward. You run your program up to a point where you get nonsense, and then, in 95% of the cases, you can think for 10 minutes, figure out why it's nonsense, and fix the bug.
But in the remaining 5% of the cases, you really have no clue why you are getting this nonsense. And in order to figure it out, well, use RevDB, basically. RevDB lets you follow values back to where they come from, in reverse. So you put a watchpoint on a value, for example, and you run your program backwards, and the program will run until the point where that value changed.
It's just very useful. So here is a URL. It's implemented as a different version of PyPy, but it is a version of PyPy with almost no restrictions; it can run CPython C extension modules, for instance.
That's fine. You cannot, of course, go inside the C extension and step backwards through it, but you can go backward over a complete call to a C extension module, for example; that works.
Okay, so last, a few random changes that occurred in PyPy over the last year. We improved the JIT, mostly by some reduction in warm-up time, which means that if your program really took a lot of time to start, then now it takes a bit less time; probably still a lot, but less. We reduced the memory usage, mostly the memory consumption that occurred during startup. So this is just general JIT improvement.
We have vmprof nowadays. vmprof is a high-performance profiler for Python code. Think about hotshot, for example, which is a very old profiler for CPython. It's something similar, but the advantage is that it works the same way on CPython and PyPy. And also, if you use the PyPy version, you can follow and see that this loop inside this function was turned into machine code, and you can actually see the machine code.
So you can see when things don't go right, for some definition of right. CFFI: CFFI is a way to call C from Python. It works identically on CPython and PyPy.
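For anyone who has not used CFFI, a minimal sketch of calling C from Python follows. The choice of strlen is just an illustration and is not taken from the talk. It assumes a POSIX system, where ffi.dlopen(None) gives access to the standard C library, and it requires the cffi package.

```python
import cffi

ffi = cffi.FFI()

# Declare the C function we want, using ordinary C syntax.
ffi.cdef("size_t strlen(const char *);")

# dlopen(None) loads the standard C library on POSIX systems
# (this particular call is not expected to work on Windows).
libc = ffi.dlopen(None)

# Bytes objects are passed to C as "const char *".
length = libc.strlen(b"hello")
print(length)
```

The same file runs unchanged on CPython and on PyPy, which is the property the speaker highlights.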
This alone is the reason why I think CFFI is great, but from what I regularly hear at conferences, CFFI is actually used by a lot of projects, and that includes a lot of projects that don't have anything to do with PyPy. The biggest improvement since last year is embedding, which basically means calling Python from C. So if you have a program written in C, or whatever, actually: C#, C, C++.
So if you have this big program, you can write a bit of Python, then run the CFFI compiler on it, and the output is a standard C library, one that has the interface that you specify. It really means that you can write any C library with any interface, and write it purely in Python. And people are actually using it on RIN2, so it works, cool.
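The embedding workflow just described can be sketched with CFFI's embedding calls (embedding_api, set_source, embedding_init_code, compile). The module name my_plugin, the function add_numbers and the output name libmy_plugin are hypothetical and chosen only for this example. The build step is wrapped in a function because it needs a C compiler and the Python development headers.

```python
import cffi

ffibuilder = cffi.FFI()

# The C interface that the resulting library will export.
ffibuilder.embedding_api("""
    int add_numbers(int, int);
""")

# No extra C source is needed; "my_plugin" is an illustrative module name.
ffibuilder.set_source("my_plugin", "")

# Python code that is run inside the library the first time it is used from C.
ffibuilder.embedding_init_code("""
    from my_plugin import ffi

    @ffi.def_extern()
    def add_numbers(x, y):
        return x + y
""")


def build():
    # Requires a C compiler; "*" is replaced by the platform's
    # shared-library extension (.so, .dylib or .dll).
    return ffibuilder.compile(target="libmy_plugin.*", verbose=True)
```

A C program would then declare int add_numbers(int, int); and link against the generated library like any other C library, without calling the Python C API itself.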
Okay, so next year: what is going to happen inside PyPy during the next year is probably something like this. We will continue to polish PyPy 3.5 and probably start on 3.6.
NumPy and the scientific stack: yes, we will continue working on it. By now it's mostly about fixing the major performance issues, and we actually have ideas about how to fix them, so it's "just" work, in quotes.
We will play with PyPy without a GIL, and do things like the reverse debugger: we need to port it to PyPy 3.5, which basically means writing an extra module that needs to be adapted from PyPy 2.7 to 3.5.
This is the kind of thing that we are looking forward to. So, thank you, here is the website.
Thank you, Armin, great talk. We have six minutes for questions. Okay, just one question there.
Hi, do you hear me? Because some people have trouble hearing speakers. Exactly, I understand. Thanks for the very interesting talk. There was another talk that mentioned a new frame evaluation API from a PEP, I don't know, 523 or something, which is used by PyCharm; I saw it in a PyCharm debugging talk. It's apparently also intended for JITting CPython. Can you shed some light on how this relates to PyPy, and whether, I don't know, you know more about this?
Yes, it is indeed probably a good thing for CPython, but it's completely unrelated to PyPy. That's the end of the story, I would say. What we are going to do about it in PyPy is just add support for this indirection for CPython extension modules, and that's it.
So you said that you now support Cython really well. Do you support Cython specifically? Did you add support for Cython, or is it just that the improvements in cpyext make it possible to run the extension modules that Cython generates? And if that's the case, did you do special things to make sure Cython in particular worked well, or did it just fall out of all the work you did on cpyext?
Right, so no, it is all standard cpyext work, but we really had the problem of running big Cython modules like pandas, and had to tweak cpyext until it supported enough of the C API, enough of the details of the C API, to be able to run them, because Cython does a few crazy things like making up PyFrame objects or something like that.
I have a question about STM. Would it be possible or useful to have just a limited subset of PyPy supporting STM, like a list, a dictionary, and a set implemented with STM, which you then import as an extra module? Would it be possible, and does it fit the philosophy you follow in PyPy?
Yes, I don't know. The answer is I don't know.
I mean, yes, maybe it is indeed a good idea to have an independent module to which you send code, not generic Python code but specialized code, and have it run using STM, maybe.
You said that calling across cpyext is slow. What order of magnitude of slow are we talking about? Are we talking nanoseconds, milliseconds?
What order of magnitude, I'm not sure; maybe five times, something like that, I think. It's a number that can be improved by adding more special cases. CPython has tons of special cases for calling a C implementation of a built-in, for example, and we don't have them, just because. So it's a matter of adding them, but yeah.
So it will always be maybe two or three times slower, as an order of magnitude. Yes, too bad.
Hello. As you know, the end of life of Python 2 is coming up fairly soon, in 2020. Are there plans to keep progressing with PyPy 2 after that, or will that be seen as beating a dead horse?
Yeah, that's a very loaded question. I don't exactly know how to answer it. Well, I would say, imagine that we'll just continue supporting Python 2 forever, basically, just because PyPy itself is written as a pile of Python 2 code. That's about all I can say.
So, sorry, we are running out of time. Well, let's thank Armin again. Great talk, great project. Thank you very much.