Performant Python


Formal Metadata

Performant Python
Title of Series
Part Number
Number of Parts
Kloss, Burkhard
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Burkhard Kloss - Performant Python Python is a great language. Easy to learn, friendly to use, widely used. It is not, however, renowned for being fast. In a lot of situations that does not matter. Sometimes it really does. This talk will introduce you to some tools and techniques for making sure your Python code becomes fast enough – without turning into a maintenance nightmare. Warning: may contain small bits of other languages.
Goodness of fit Word Lecture/Conference Code
Revision control Computer programming Goodness of fit Computer animation Code Code Energy level Right angle Quicksort Measurement Writing
Point (geometry) Writing Domain name Word Web service Computer simulation Computer animation Lecture/Conference Database Moment (mathematics) Code Limit (category theory)
Programmer (hardware) Lecture/Conference Java applet Formal language
Slide rule Lecture/Conference Code Java applet Interface (computing) Abstraction Number
Root Computer animation Multiplication sign Order (biology) Computer program Mereology Quicksort Mathematical optimization Software maintenance
Software engineering Torus Computer Complex (psychology) Code Writing Programmer (hardware) Computer animation Lecture/Conference Supersonic speed Compilation album output Right angle Quicksort Data type Reading (process)
Software engineering Complex (psychology) Computer animation Code Multiplication sign Forest Computer worm Units of measurement Condition number Maxima and minima
Writing Computer programming Computer animation Lecture/Conference Code Measurement Data management Perspective (visual)
Point (geometry) Computer programming Batch processing Theory of relativity Code Computer Bit Mereology Measurement Perspective (visual) System call Hypothesis Formal language Arithmetic mean Causality Lecture/Conference Operator (mathematics) Core dump Hydraulic jump
Context awareness Computer animation Term (mathematics) Code Bit Line (geometry) Library (computing)
Computer programming Computer animation Lecture/Conference Code Mereology Total S.A.
Point (geometry) Computer animation Lecture/Conference Code Letterpress printing Total S.A. Measurement
Random number generation Computer animation Lecture/Conference Code Computer Software testing Letterpress printing Metropolitan area network
Computer programming Computer animation Right angle Total S.A.
Scaling (geometry) Computer animation Lecture/Conference Meeting/Interview Code Multiplication sign
Computer animation Random access Iteration Total S.A.
Functional programming Perfect group Computer animation Iteration Bit Right angle Canonical ensemble Total S.A. Proper map Functional (mathematics)
Revision control Computer animation Similarity (geometry) Line (geometry) Measurement
Number Computer animation Multiplication sign Bit Functional (mathematics)
Computer animation Lecture/Conference Bit Right angle Formal language
Summation Computer animation Multiplication sign Order of magnitude Number
Summation Sparse matrix Numeral (linguistics) Computer animation Lecture/Conference Electronic mailing list Data structure Number
Summation Slide rule Computer animation Code Energy level Core dump Floating point Bit
Summation Slide rule Computer animation Core dump Floating point Subtraction Compiler
Revision control Summation Computer animation Code Set (mathematics)
Revision control Summation Slide rule Computer animation Core dump Floating point Subtraction
Slide rule Arithmetic mean Lecture/Conference Computer hardware Object (grammar) Data management Number Computer architecture
Point (geometry) Process (computing) Computer animation Code Projective plane Area
Computer programming Nachlauf <Strömungsmechanik> Computer animation Lecture/Conference Code Internetworking Software developer Lemma (mathematics) Noise
Word Computer programming Data dictionary God Word Computer animation Electronic mailing list Streaming media Data dictionary
Word Metropolitan area network Data dictionary Computer animation Code Streaming media Physical law Unit testing
Laptop Reading (process) Word Computer programming Computer animation Multiplication sign Mereology 2 (number)
Point (geometry) Meeting/Interview Horizon Office suite Mereology
Point (geometry) Slide rule Computer animation Profil (magazine) Real number Pattern language 2 (number) Formal language
Standard deviation Source code User profile Raw image format Computer animation Computer file Profil (magazine) Computer program Library (computing)
System call Computer file Multiplication sign Interior (topology) Code Primitive (album) Bit Set (mathematics) Total S.A. System call Functional (mathematics) Number Cumulant Computer animation Lecture/Conference Personal digital assistant Function (mathematics) Lie group
Point (geometry) Computer programming Installation art Computer animation Lecture/Conference Profil (magazine) Line (geometry) Multiplication sign Line (geometry) Functional (mathematics)
Slide rule Data dictionary Game controller Algorithm Time line <Programm> Line (geometry) Multiplication sign Interior (topology) Counting Function (mathematics) Limit (category theory) Data dictionary Functional (mathematics) Number User profile Mathematics Content (media) Computer animation Linear search Lecture/Conference Personal digital assistant Profil (magazine) Units of measurement
Computer programming Slide rule Raw image format Observational study Real number Streaming media Set (mathematics) 2 (number) Word Word Computer animation Personal digital assistant Data structure Data structure
Run time (program lifecycle phase) Computer programming Mathematics Computer animation Real number Multiplication sign Interpreter (computing) Bit
Point (geometry) Computer animation Lecture/Conference
Numerical analysis Computer animation Lecture/Conference Gravitation
good morning thank you that's the we just welcome thank you very much for coming to my talk I am in going to talk about performed Python now um slightest organized forgot to bring batteries from a taking things of the wondering back and forth there will be
what I mean by performed by I like us to think about writing good fast Python code now the 3 words in there good fast and Python and to my mind they kind of have equal significance this talk what good code 1st not good code as complicated as it goes a concept we all know good
code when we see it right um there's there's all sorts of measurements you can apply A and this academic literature about what is good code and what's not good code as a working definition I just say it's cold that we can easily read and understand
and the next thing is Foster now of bias in the programming in Python fold 0 1 lessons about version 1 . 6 um and then also there's a level about Python but I don't write Python because it's false trade but is fast enough for pretty much everything I need to do and with a few tricks it is actually fast
enough for everything that is at the moment now I have to current limit that slightly my
domain is um mainly financial simulations so for me know I do floating point arithmetic all day long and for that is fast enough if your domain is web services so I huge databases you might not get that much of this talk but they wouldn't open you enjoy it and the last word
and we want to talk about this and weakening there's the old joke about it Cooper no real problem at right 420 language and it's um
as a card-carrying steepest programmer I can certainly write C + + or Java in Python um and
particularly when people write java and python with lots of abstract interfaces Nafta methods it kind of irritates me but weakening so I want to focus also on writing perform a code for a for a number reasons and 1 is because it's it's uh actually much nicer
to read and write C + + and Python or Java and Python and but also is faster so let's move on with the slide that's going
of thank you before we talk about speed let's reconsider the famous quote from news uh basically 9 7 % of the time we shouldn't worry about small inefficiencies um the 3 % of the time that we should worry about them it really matters so that mind the mean 1 of the things and trying to get across to people is that they're my slides out of
sort of order might and that
computers cheap programmers are expensive right and every now and then you hear about skills shortages shortage of programmers had never heard about a sort of computers recently now
again so let's move on what I'm up Sonic you've has anybody
ever done this read this center Python could hands excellent most of you have those of you haven't just type input this at the Python from to get a whole lot of this and and the like there's a that the top 3 kind of
making make the most sense to me beautiful is better than ugly who could argue with that explicit is better than implicit and I think that's also very important and simple is always better than complex uh I have to always tell myself I'm a bearable brain and complex code bothers me and and was
at the center In a debugging you code is 10 times more difficult than writing it uh so if the code you write is as complex as you can write it you have no chance of understanding of being able to do about it and so that's always something I have to bear in mind I forest conditions
OK so seems 0 or come in through here and phoning nights about my my working definition of Python can and happy to have a philosophical discussion about this uh over coffee later is just a clean readable as we all will
talk a little about profiling um and I don't believe the quote about that you can't manage what you can't measure I think for people management perspective that's bullshit and from a programming
perspective is absolutely correct right there's no point making any assumptions on building a hypothesis about you code this you can measure what you get is doing and once you have a measure of course uh and the key things to speed things up here is to do less and the fossils most reliable coder somebody much point out and I pointed out is code that you don't write narrative code in there and executes instantaneously has no books so whenever you can do less um or jump ahead to cheat uh and get somebody else to do it for you that is win in the and programming and the other 2 topics that come up a little bit later is vectorizing and
paralyzing slightly interchangeable and vectorizing from my from my perspective over somebody writes and also this language and like to work quickly basically means that uh when the past couple of 2 the past data off to um foster language faster subsystems we do it in 1 go and operational batch of data and 2 basic pay the overhead of of the core of the call wants and was paralyzing means using as many cause computers as it can so let's go back to Python performance now we know on on the interwebs and all of and you constant here about this part the slope compared to the XYZ what does it mean for it I don't think that's In a Python being being slow in relation to the question is is part slow I don't think it is for any anything that I do and I do a fair bit of the performance and
discuss in that domain and and you know we have a a
couple of thousand lines of library code um and most about this not perform sensitive at all very few hotspots aware that you were about performance but once you are a little bit about them and then properly um it actually terms other prices fast enough so that's what code obvious big enough for the
people in the back seat so this is this is kind of their Python 101 code this is the kind of
code were written in 1999 part of Python programming and you it's that simple it's clear it's readable
it works you all 3 of those are pretty pretty good that
should be used for a little piece of code to have it as a starting point is it false well let's right little gadget to
measure it and this is really not a similar to IPython climate uh humming iPod users in why don't
you of like those of you don't use IPython why this when the user no books myself but as a command line you completed
and let's look at this code right so if I run this and if we
use Miller Test go there and then
around 10 out of 10 million random numbers and it comes back and just over half a 2nd that seems pretty damn quick
to me yeah I paid to play the old man got here but you know the computers I started programming on couldn't handle 10 million random numbers much less add them
up half a 2nd so that's not too bad so why because the moral performs well they're were about Python performance compared to C + + a c right and so
let's look at the same program in C. through this and and way that we had see again good I'll take mine to this cluster of them later again and so this is very similar but if
retirement will find out that
it is about 40 times faster than a broader which is a bit irritating and you notice for bragging rights over
coffee if you talk to C + + program or if you really have something very critical so that's molecular scale again this code is is
actually pretty much
exactly the same as the now I
think we can make it faster by making a better so if for just go
to slightly more modern more
Pythonic style and instead of you access using random access iterator which and instead of using random access and using a C-style falling
if we just use a um proper Python and iterate over a collection is actually I think a little bit faster than uh yeah and below so to me this is kind of the canonical perfect python functions and for to do some
then In the current fashion is forgoing all functional programming and right a
functional version of it not desperately keel of myself but you know as a for a fashion
that to and so what does does
this and changing exchanger to to make it more following actually make it faster the yes it
does so if you look at the the 2nd line down you know we've got a 100 % of 50 % which is the similarity measure and and performance
improvement by making a little bit simpler than that of the cleaner and it'll be tidier which is great but it's still about 20 times slower than in the so and let's think about what else we
have an optional uh and obviously you all know Python so you know that we have some function built into Python and we should just use that instead of reinventing the wheel badly um and that
takes us to pretty much the same ballpark as C
+ + is still a little bit slower but you know at least we're in the same town right so I think you know this kind of shows that actually if you write Poisson properly uh and noble language and use the facilities in the language you know I get something that's not 3 or 4
times slower than C + + but that isn't you know an order of magnitude 2 orders of magnitude so it's the from most things is probably not something have to worry about but we have a few more so than
tricks of sleep so this is what I do
the work that I basically is number like um and this is this is what I mean by cheating right so I don't want to write it
building meanwhile because In writing optimized numerical efficiency of sparse is hard work and much rather get somebody else to do for me and then I speak the number I have done that and so this is really good now let's look at the performance of that that we're cheating
slightly because we're we're converting from Python a list to a number array a because it's more efficient data structure but possible where she
fossil in the 1st place uh that surprised level but last nite when I put this slide together and but you know it's so this is kind of what I would you know what we do all that is not very surprising because we're uh you know we're
comparing C code written by somebody hasn't rewritten see for living for 20 years and to some domestic but for those the world and the faster this way so this other things we can do that because sometimes known by his little bit
awkward to use a funeral using a major not doing uh basically matrix arithmetic and so we can change again in a different way um over the
vectorized compiler by following this cheap and this is the slide
from the forward see Python if
we use pipeline we had a set of what that with any of the version of the code we just as fossils it was passed so you know if you have performance problem look
at some of the new jets and my point great as you know we didn't even have to worry about rewriting anything ominous still think you should write them all by forming version of the code but the
performance gain you can get from the cheating with pipeline is fantastic so let's look at that and now there's something different
about this slide from the previous slide and that's the fact that we're missing the nonpolar version now pipeline employed haven't and have had
problems integrating in the past and they're slowly getting better that the meaning and you can actually now installed on pi and pi pi together and it works I just didn't have a chance to try it on my laptop and before coming to the slides and yet be probably and the thing you might want to look at so uh if if you wanna look at objectives is not my number which is gives number of deployment opportunities to to use GPU so if you have
a kind of problem that fits a GPU the in my experience that's 1 of those situations where if your problems fixed the the hardware architecture it's fantastic um but it has to be pretty close to that of your problems to the hardware architectures so you management very not that was
a cheat and remember this uh now this is interesting this code has probably been stolen more than any other code at run across from the project of associated attributed to
Picasso Tennison and jobs and I don't really care and if it if it's good enough for any of them it's good enough for me so I think the key 1 of the really key things to get a new point fathers to actually you know see what other the work that there
already checked uh what you can take from other people and their use of improve it profiling and doing this so
let's start with some routine although not the code and so this underground yeah that than they used to be something that people did before the internet was invented to keep themselves amused like and but then they really tedious to work out by hand and quite fun sometimes with political developments their so that's that's even in a computer program to work them out so all whatever
the there we and 1st of all read the this addiction dictionary so we have a list of all the words will to look at and then we have a very simple and of this program to final anagrams now
let's look at this and that and a
couple things about this code it worked
and it was really really easy to write and no you might not believe me that
this works but I do have unit test prove afterwards if you really don't want it and so if we just
look at how fast this runs
then uh again when we're looking at a little timing honest and if you if you just interested in finding a few anagrams discover runs in subsection of a lot of matter uh if you want to uh you know do this at all then you
start having to profile I think that's a useful to remember um you know it's only worth spending time on uh speeding up so that's actually used at a level of if you run the program once a week and it takes 5 hours to the best if you all the time has around 3 hours then you have a problem so that's that's no this so this part 3 on an old laptop takes 1 minute 14 seconds um you know it's not not slow but it's
interestingly enough when I around this point of point 7 problems put forward to to be faster what's a good because we all want to win switch over to 3 uh another poll who's still using positive and then so using Python bond know that the
horizon about half of you using part 3 and then just point out this is this is not a general general Python 3 is generally slower than Python to so that's don't upgrade get home and get back to the office switch over to Python 3 the world will be a better place and so that run this pipeline and OK we've gone from
you know 1 minute something important 3 uh 0 40 seconds and by the 2 to 20 seconds which is pretty damn good you for a not doing anything apart from changing the command line um but let's let's assume that you know well
really mean Boston he says 22 seconds so too long to find all the anagrams in English language um so let's look at profiling now patterns great point has a profile included uh just so I can gage how quickly to go for the profiling slides who knows the profile of no OK the rest of you should learn
it is worthwhile noting so the pardon profile is really really
simple use of them if you can see this and really all we do is call Python minus and to invoke the module see profile uh which evokes the profile and that spits out a huge text file here um you II probably have included anyway so just have a quick look and and
run it and that's great set of filed that looks a bit like this now that I'm a
laser pointer but basically shows the number of calls time span has been called the time spent In the function and the function calls the and then the file name and alignment and in unsurprisingly um if you look at the cumulative time column we spend pretty much all the time and the final anagrams functions now in this case is not
terribly useful if you're profiling a big program was to finding out what functions take all your time um is this is the 1st step and then about optimizing all the functions you have you need to kind of apply at times and so and let's see if we can do better than this you know if you just look at this function we can't really tell what's going on here let's see what all the tools we have no there's another problem with having doesn't come installed with point but the full that is easy to install it in the line profile takes no time at all
we have to do is to add profile that decorate on the function that the previous the hot spots and then we're run control of the slide is command line not a difficult and this gives us some really interesting output so 1st of all it tells us how long the trial function is executed and that's in this case is entirely surprising what is and sometimes find the actually the hit count more interesting than the timing because they account kind tells you a lot of but in this case you know what time 99 . 4 % of our time is spent in this kind of dictionary look of which is slightly surprising why was surprising until I thought about it and the we have figured this out already that the problem is we're doing a linear search the dictionary inside the limit in an incentive falling so and time then gives us no and squared algorithm not a great idea if you
have a large number of items and so we can make a 1 change to this function which is the commodity and this is again slightly
cheated so we just change it to a set of words that uh this study cheating because a online trained actually were fundamentally changing the data structure of the program operates on makes it nice convenient to do that in this case we're looking at it and and that now runs in 1 . 4
seconds this this i think fast enough compared to the woman important things that will this slide and surprise
ahead of this when I run and point and and apply private was actually slower now I mean you know that isn't entirely surprising if you if you think about a little bit because I just takes takes warm-up time and has to do some more work than interpreters
and if your program is morning has such a short execution time then that not dramatize afterward but then so that we have uh with a very simple change
comes to you at the end of so that was the point of the so this will
thank you very much for very moving fj
November questions that they will all be around uh for the day so if you wanna gravity gravity thank you


  726 ms - page object


AV-Portal 3.9.1 (0da88e96ae8dbbf323d1005dc12c7aa41dfc5a31)