Writing faster Python

Video in TIB AV-Portal: Writing faster Python

Formal Metadata

Writing faster Python
Title of Series
Part Number
Number of Parts
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Sebastian Witowski - Writing faster Python Presentation on how you can write faster Python in your daily work. I will briefly explain ways of profiling the code, discuss different code structures and show how they can be improved. You will see what is the fastest way to remove duplicates from a list, what is faster than a for loop or how “asking for permission” is slower than “begging for forgiveness”. ----- Did you know that Python preallocates integers from -5 to 257 ? Reusing them 1000 times, instead of allocating memory for a bigger integer, can save you a couple of milliseconds of code’s execution time. If you want to learn more about this kind of optimizations then, … well, probably this presentation is not for you :) Instead of going into such small details, I will talk about more "sane" ideas for writing faster code. After a very brief overview of how to optimize Python code (rule 1: don’t do this, rule 2: don’t do this yet, rule 3: ok, but what if I really want to do this ?), I will show simple and fast ways of measuring the execution time and finally, discuss examples of how some code structures could be improved. You will see: - What is the fastest way of removing duplicates from a list - How much faster your code is when you reuse the built-in functions instead of trying to reinvent the wheel - What is faster than the good ol’ for loop - If the lookup is faster in a list or a set (and when it makes sense to use each) - How the “It's better to beg for forgiveness than to ask for permission” rule works in practice
Arithmetic mean Code
Writing Presentation of a group Code Code
Writing Slide rule Programming language Process (computing) Code Game theory Formal language
Programming language Software developer Software developer Universe (mathematics) Computer
Source code Touchscreen Googol Computer program 3 (number) Website Pattern language Mathematical optimization Computer programming
Computer program Sampling (statistics) Website Rule of inference Mathematical optimization Power (physics) Vacuum
Logical constant Area Semiconductor memory State of matter Multiplication sign Computer hardware Video game Bit Mathematical optimization Product (business) Number
Point (geometry) Word Software Code Multiplication sign Computer cluster Code Software testing Line (geometry) Rule of inference Mathematical optimization
Computer cluster Code Software testing Mathematical optimization
User profile Software Profil (magazine) Multiplication sign Weight Mereology
Module (mathematics) User profile Graphical user interface Functional (mathematics) File format Multiplication sign Active contour model Statistics Computer programming Library (computing)
Complex (psychology) Variety (linguistics) Code Mereology Goodness of fit Mathematics Different (Kate Ryan album) Energy level Data structure Mathematical optimization Fiber (mathematics) Social class Computer architecture Physical system Area Programming language Algorithm Constraint (mathematics) Projective plane Database System call Software Summierbarkeit Energy level Mathematical optimization
Source code Algorithm Presentation of a group Algorithm Mereology Field (computer science) Number Energy level Energy level Summierbarkeit Mathematical optimization Error message Data structure Mathematical optimization Resultant
Source code Algorithm Software developer Virtual machine Computer programming Formal language Compiler Web 2.0 Energy level Mathematical optimization Data structure Mathematical optimization Computer architecture
Source code Programming language Implementation Run time (program lifecycle phase) Algorithm Run time (program lifecycle phase) Multiplication sign Compiler Formal language Revision control Latent heat Software Semiconductor memory Personal digital assistant Energy level Mathematical optimization Data structure Compilation album Mathematical optimization
Area Semiconductor memory Direction (geometry) Multiplication sign Spacetime Computer network MiniDisc Semiconductor memory Mathematical optimization Mathematical optimization Computer programming
Revision control Source code Code Source code Code Mathematical optimization Mathematical optimization
Tuple Building Run time (program lifecycle phase) Multiplication sign File format Instance (computer science) Counting Disk read-and-write head Special unitary group Different (Kate Ryan album) Object (grammar) Hash function Set (mathematics) Endliche Modelltheorie Category of being Repetition Complex (psychology) Electronic mailing list Amsterdam Ordnance Datum Range (statistics) Measurement Element (mathematics) Pattern language Absolute value Programmschleife Resultant Writing Point (geometry) Rounding Digital filter Functional (mathematics) Inheritance (object-oriented programming) Division (mathematics) Electronic mailing list Number Element (mathematics) Revision control Computer multitasking output Loop (music) Data type Multiplication sign Default (computer science) Standard deviation Online help Magneto-optical drive Database Letterpress printing Binary file Social class Library (computing)
Digital filter Functional (mathematics) Building Multiplication sign Electronic mailing list Electronic mailing list Element (mathematics) Field (computer science) Element (mathematics) Number Number Programmschleife Personal digital assistant Function (mathematics) Sinc function Resultant Form (programming)
Digital filter Functional (mathematics) Code Multiplication sign Electronic mailing list Theory Field (computer science) Product (business) Attribute grammar Sign (mathematics) Operator (mathematics) Form (programming) Social class Multiplication sign Computer virus Planning Variable (mathematics) System call Element (mathematics) Number Wave Personal digital assistant Function (mathematics) Statement (computer science) Resultant
Vector graphics Multiplication sign Personal digital assistant Multiplication sign Execution unit Right angle Rule of inference Condition number Attribute grammar Thumbnail Exception handling
Multiplication sign Computer file Code Multiplication sign Electronic mailing list Data storage device Set (mathematics) Element (mathematics) Connected space Element (mathematics) Number Internetworking Personal digital assistant Statement (computer science) Software testing Software testing Data structure Form (programming)
Multiplication sign Random number Functional (mathematics) Multiplication sign Electronic mailing list Motion capture Set (mathematics) Electronic mailing list Data dictionary Element (mathematics) Element (mathematics) Number Order (biology) Different (Kate Ryan album) Order (biology) Set (mathematics) Uniqueness quantification Software testing Endliche Modelltheorie Quicksort
Operations research Pairwise comparison Functional (mathematics) Random number generation Multiplication sign Expression Data storage device Set (mathematics) Range (statistics) System call Variable (mathematics) Number Database normalization Computer configuration Personal digital assistant Function (mathematics) Operator (mathematics) String (computer science) Square number Condition number
Randomization Functional (mathematics) Length Multiplication sign Electronic mailing list Electronic mailing list Rule of inference Variable (mathematics) Different (Kate Ryan album) Order (biology) Convex hull Data structure Lambda calculus Online chat Lambda calculus
Revision control Functional (mathematics) Positional notation System call Structural load Different (Kate Ryan album) Function (mathematics) File format Endliche Modelltheorie Line (geometry) Lambda calculus Number
Multiplication sign Greatest element Functional (mathematics) Overhead (computing) Personal digital assistant Time zone Electronic mailing list Inequality (mathematics) Resultant
Source code Functional (mathematics) Building State of matter Multiplication sign Source code Spiral Line (geometry) Cartesian coordinate system Variable (mathematics) Element (mathematics) Variable (mathematics) Revision control Category of being Number Arithmetic mean Different (Kate Ryan album) Function (mathematics) Network topology Energy level Mathematical optimization Mathematical optimization
Functional (mathematics) Multiplication sign Moment (mathematics) Kerr-Lösung Right angle Data structure
Physicalism Whiteboard
Profil (magazine) Semiconductor memory Internet service provider 1 (number)
Voting Source code Electronic program guide Disk read-and-write head Buffer overflow Condition number
Cycle (graph theory)
good without further ado like to present our next speaker Sebastian semester will be talking about writing faster code given that again if I have lot getting means so I
like to that would be about right invested goes and let's analyse giving a short talk some lightning talk about right faster code I remember someone pointed out that well basically you can take a Python called rewriting CRC Press Bassanio's it will be faster and I mean the guy was
right so take any piece of Python called rewriting 0 suppressed present Granny will be up and automatically
passed so thinking if I say just writing faster code it might not be clear if I mean on vitamin a lot so that fix the title my presentation and I was
very happy when you mean it makes it clear we're not gonna talk about CEO job there but then I realized that long I mean even though it's very clear is on the slide so I had to change the game so I get that depression that means exactly the same thing by the charter so this I and would write and classified them as a title for my today stopped so
that will decide more about which programming languages that we don't know you that we you here I'm glad that was not created to be a fast language that you would
use for some computations were every nanosecond pounds and that's final of me I mean by is a great programming language that very easy and fun to use I can it's very easy to learn the fact that it's so easy to read and write called in Python it's very encouraging for people of the new for people you too suffer development and I see that it's getting more and more popular
in schools or universities as the 1st programming language and I'm not surprised I mean imagine a
completely new programming and someone tells you pay let me show you how much fun programming is and that's start something so simple that's just bring some text on the screen and then he shows those 2 examples I mean 1 of them is clearly not something that you want to show the beginning to encourage the more have to start programming but as
a pattern is not only useful a useful for many there are many companies that are using Python companies with millions of users and billions of request amount so your website is gonna be found using Python so by Python is usually fast enough but what if you decide
that it's not fast enough anymore for example the websites that's giving the timeouts or maybe a constant called will bring more money to company so a sample
optimization but how do you optimize cold wall probably need to
follow some roles solicited over that and if we open the 1st thing we see that there are only 2 year-old power not easier than expected so let's take another look at those rules 1st Earl of optimization done done OK that was easy any questions laxity now there's not that so what does it mean don't well
9 out of 10 times when you think that we need optimization you done especially in the area the state of the product's life you might think it would be nice to optimize is called a bit but will 1st of all you will waste time doing something that is probably not needed but it you want to call to run faster you can stop in getting a faster hardware in the 1st place and 2nd of all optimization constant cost from most often the only cost is the time that you spend optimizing approach well sometimes you is also the time that the need to fix what we just broke the optimisation and but also optimizing the optimise called might not be as readable as it was in the 1st place then maybe is running faster but is now using more memory so and this you have really good reasons to optimize don't building and if you know that you have reasons to optimize then we can move the 2nd law of
optimization number this year
and this is how we understand this rule so 1st make sure your called words them make sure you have a good and only then you ready for optimization I love it always reminds me how many time I broke it I'm in so many times I was working on a piece of software and I started thinking from maybe I can change this piece of code and it will
be faster and maybe I will save like lines of code was a good idea no and not only because I am that breaking things but most often I did end up breaking things but also I started jumping around the cold and at some point I get confused with those writing in the 1st place did
make Michael Foster I have no idea because I had nothing to compare it to in the 1st place if I this writing the cold and then so try to improve it I could measure both solutions and compare them but in that in that scenario I could only get and that went into the last little optimization don't
gets always provide oracle human admirable in predicting
the bottlenecks of your cold so but if you think that the coldest slow 1st profiling and then see what probably takes most of the time otherwise you made up the you may end up spending time on right 1 piece of your just get like 1 per cent of speed improvement while the other parts of Europe software where you can gain weight more improvement
with less effort so that kind providing tools they were already
quite a few talks during Europe item about providing so I will not go into the details but we don't know what to static can take a look at the C program module and it'll show a clear overview of how many times each function is called and when the called and where the coldest and if you want to have some more advanced formatting
you can add the piece that's more than all could prefer the graphic interface you can take a look at the wrong thing around think these libraries so
once we ready for optimization we have to decide on which area
we want to focus so that different levels of optimization Starting from the highest level you have to design and optimization so depending on the constraints and priorities
of the system you can optimize by redesigning it in my require rewriting of software in the different programming language or change in the database fiber changing the architecture perform database queries so this kind of optimization we'll usually giving the best improvement but it also takes most you do this so I don't encourage a variety of software from the scratch but if you have some critical part of the call around often you can optimize in but very right that means you're the 1st class and because it's faster you you'll have some good speed from and for free put not way for free now you have vitamin C code in the same project the 1 that lower we have good algorithms and data structures so knowing good algorithms together with the complexity is definitely helped to creating a good and efficient software for example if you want to get the sum of
numbers from 1 to n the 1st idea might be to get a look that goes through all the elements and ask them to have do will want but it won't be fast instead in is the algorithm for the error committed by medics some which will give you the same results and it will what it will be more efficient so the next level is this article optimization and this is something that I will talk about in the 2nd part of the presentation and now we're moving to
the field level which involves setting up some specific field
let's so in your daily work it's not something that you will do all of them you can optimize spike for a specific architecture but is there a web
developer like me this is you know something that you will do was the machine or you won't bother at all next we have a compiler so we can make some optimizations if the program has if you programming language
has some ahead of time combined and say something about the bike and they which doesn't really have ahead of time compiler we asking the father well and the last but not least we have runtime that so it's related with the the compiler that using some some compilers faster than the other so for example if you replace it by
identified by you can get some improvement depending on the use case of software so but but innovate depends on what kind of the support your writing so most of the time want to set up on a specific language implementation there's nothing you have to do to benefit from this kind of optimization it's usually up to the creators of the compilers to optimize them so simply updating their new brought new version of the programming language you using can make called Runnymede faster so when you optimize probably wonder called run faster and also use this memory and basically less of everything the bad news is you
can't have all the optimization what area will usually deterioration in other areas so you always have to decide which resources are crucial and you have to optimize in that direction so it's possible that optimization will have nothing
to do with the speed because they're all their resources more important and there'll be for example who cares that you're the program is now 10 times faster when it's crashing income the time because that is running out of memory was up
another important resource that people are often for getting is the sanity assigning the other person that will be maintained in your code so please be nice to the person you never know that might be yeah so this is really writing it's always called if you making your called harder to read and maintain than you're probably doing it wrong so having those things clearly translate of how you can write faster by also known as source code optimizations my
examples I'm using the version
3 . 5 . 1 of I don't but there like although the example should work in both them 2 and pattern 3 so much for measuring the execution time of my cult I will be using the magic and function it has some overhead comparing to the standard time library but it doesn't really matter because as long as we use the same mental tools measure execution time of different functions we only need to know which method is fast and by how much so for each of my examples I will write different versions of gold measure the execution time and compare them so let's start with something simple let's say you want to count the number of elements in the list you can easily write a simple that to increment the counter and well this will was it will be very so you can achieve the same results using the building and function and as you can see for only 1 million of results the differences in the huge so my 1st advice is not to reinvent the wheel but 1st check if there is a function that you can use by country . 5 this
68 giving functions so it's nice to take a look at them and keep them in the back of of your head because they might be handy at some point falls off before you start writing your own version of all the dictionary or dictionary with default values take a look at the collection model from the standard library even though it contains only like 10 different database those are probably the
provides you are looking for you to stand the answers are not enough so let's say you have a list of 1
million elements and you want to select only 2 of numbers so the nite question would be to use the form so for each element of the least you check if it's hot and if it is added to another but I already show you in the previous example that in most cases for loops can be replaced with something better In this case you could use the building field of functions that and in fact until you the world of directly by countries with amateur radio so I have call the least function get the same results as in case of the form and even though this function has some impact of on the performance it's negligible comparing to the time spent the filter function yet we can see that the the performance is than the
former why does this happen well defined the field there is returning now and theory is a clear sign that it's wrong use case for this kind of functions so if
you want to get the call as a result it's better to use comprehension is around 75 per cent times faster than the formal plan of this form we use more clear when you want to execute piece of code but you are not sure if it will be successful maybe some variables are not set off but in this case the the class might be missing so much so you want to protect yourself from house the 1st wave and this is called look before you leave for ask for permission what it means is that your 1st check if the cost of a specific attributes and then you performed operations usually just checking is done the statement however there is defined the product years and it's called back for forgiveness and so In this scenario who performed the operation without taking the
conditions for but in case you expect you expect that something might break you wrap your called unit right love and you catch the exceptions of and as you can see in the
simple example being for forgiveness is like 3 times faster but it gets even better
if you've taken from our conditions so here we are checking if 3 up to the present and asking for permission is still slower and now it's also getting more difficult to read so following the vector from the this approach will result in a faster and more readable prose so we could say that asking for forgiveness instead of taking the permissions is always a better way but we won't say no because it's not true exception handling is still quite expensive so if the attribute is actually missing invading for for units will be slower than asking for permission so as a rule of thumb you can use to ask for permission way if you know that it's very likely did that after that will be missing or there will be some other problems that you can predict
otherwise if you're expecting that the called will work in most of the times try using using tri-accept will result in faster and quite often more readable code so for example
interventions some files from the internet then we expect that everything will be fine and if there is no income connections so instead of checking the
internet connection must and not there finals to store for the but then again I strongly advise you to measure both solutions and maybe in your case it would be different so it's tackle another problem the membership test so if you have a list and you want to check if it contains a specific elements you can use a form but the problem is you are iterating over the whole these given those you've not really doing anything with all those elements so we can replace the formal with the in statement it will check that specific element belongs to a given set of data and it will do this twice as fast the but there is still 1 the problem with this approach the lookup time depends on where you're element is located in that it is at the beginning of the list you're lucky and you will get fast if it's at the end of the list you have wait for what will be really nice scary who had the data structure that would have a constant look up and actually invite Python we have we have both sets and a dictionary that can
from the profession so if we
replace the least with set and then they look time becomes faster from just 2 times faster to hundreds of thousand times faster so was capture but you buy some time to come to refer to assess and in this scenario combating this leads to a set takes more time than any of the look some of these so it doesn't really make sense however if you're checking membership of different elements uh quite often it makes sense to 1st convert it to set so speaking of since they have another interesting feature they don't contain duplicates so basically have a list of elements and you want to remove the duplicates the fastest way to do this is to combat this leads to a set of and the ordered sets are not out of order so if you need to preserve the order they can look at the order dictionary from the collection model so if you want to sort 2 you can either do this in place using various of sort function or you can call the sort of function that will create a new lease and on this you really need to have a new sorting in-place will be like 6 times faster in this scenario and this is
for 1 million of random numbers if you want to perform the same operation on a large set of data then you have 2 options you
can write a function that performs the operation and call this function 1 thousand times or you can call a function that takes 2 set of data that and performs operation inside and the 2nd approach would be faster so if you can in an easy way to replace multiple calls to 1 function with just 1 function then quite often it's a good idea so what's the best way to check if a viable expressionist true well we can explicitly compute their this variable to true but in most cases you adding additional redundancy so you can simplify your condition to just give viable and political and there's 2 variable false non 0 and the string into the store or the falsity expressions and by doing that you're comparison get faster by the late 70 per cent
and the same rule applies when checking for so the fastest way to do this
is to if not viable unless you really need to distinguish false from it's a non 0 order false values it also applies to and data structures so simply doing if not only is this will be almost 3 times faster than explicitly checking the length of the list so let's take a look at different ways of defining functions in Python the most common 1 is to create a function with the keyword and the other ways to declare an anonymous function of lambda random if yes I understand that to a variable
it will act in the same way as a function quicker than we would that the work and as you can see the ball equally fast why because both
versions do exactly the same thing we can disassemble the cold of all versions with this library and we'll see that inside the same cold so is there any difference what if a function has more than 1 line you can use them and you can't really full documentation inside of land also you could have great enable you called editorial complain expanded to to assign them viable and this is why the numbers were really nice we need a simple one-liners called Fourier functions especially for functions that the model reduces and there are
also some quite narrow use cases where it might be necessary to use land as a covert so if you want to read more about integrating the the bottom but about in any other case I would definitely recommend to is that it's a much cleaner and and documented properly and the performance is exactly the same so
there always how you can create a name to so you get inequalities function or attentive the needles and that's and has disability thousand that faster why is it faster because if you call a function like 1st result is functions and with the literal syntax and there is no overhead for them and the exact same thing happens for creating a dictionary we have more examples that should be treated with caution and given that the
cold in faster I would not advise you to do this kind of optimization so sometimes even though you can squeeze some additional performance from your called um it doesn't mean that you should do this the 1 thing
is a viable assignment and if you have a bunch of variables that we need to assign you in do this the normal sequential way or you can go for this crazy spiral of science and I mean you can gain some speed but with this constant state of your colleagues that will be reading this gold later so we might opinions and work with OK and another interesting property of Python is that they look for local variables is faster than the lookup for robust or buildings so you can save save some time if you start the reference to the building function over the whole function in the local variable so in this example the only difference is on the line tree where storing the reference today blob often in the local often viable and thanks that this function is like that if I wasn't faster but then again if you see this called for the 1st time that it's not very clear what is this what is supposed to do it might be confusing to see this kind of and function because we are most you more used to see those used to build up and version some of the different kind kind of optimization it's quite often about this speed but not always and also different levels of optimization so sometimes if you come to rewrite the whole application maybe you can use a different approach M. even though the source code optimization it's not the fastest way to optimize of called small improvements will add up and the main advantage of it is she so you can write you can
optimize the code of at the moment of right only need to rewrite something and as long as you're writing idiomatic cold and you don't reinvent the wheel but already used existing functions and the destruction the title the new already doing it correct so
because when you call if you think that the different called structure can be faster you can quickly check it with the time and then you can improve it all right my name is the most
interesting and I work that's and so we that in book about physics and you probably in their own conference what if you want to buy the book quite intention somewhere on the board begin the the 2
minutes the questions as much as you have to take 1 or 2 questions
surely have them but does take as our question is wasn't all I have a quick
question use you should have some follows corpora followers to have any preference every indeed it's on the jury
depends what you want to profile because if you care about the speed and the basic ones are fine but if you want to provide a memory usage than you might need to use different providers so it really depends on what the what the profile it
westerns yeah uh and a good a condition known book so source where we can find the best practices regarding this idiomatic Python and not from the top of my head but what didn't there is uh some guides and no official by the annotation but also for me it's a lot of going for best practices also reading others said that overflow there are some books but I can't give you the names right now
more questions to yes vote was that you stick your hand up so just explaining
something excitedly pretty remote questions any more questions no negative that cycles because a fantastic talking


  390 ms - page object


AV-Portal 3.21.3 (19e43a18c8aa08bcbdf3e35b975c18acb737c630)