Writing faster Python
Formal Metadata
Title 
Writing faster Python

Title of Series  
Part Number 
41

Number of Parts 
169

Author 

License 
CC Attribution  NonCommercial  ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
Identifiers 

Publisher 

Release Date 
2016

Language 
English

Content Metadata
Subject Area  
Abstract 
Sebastian Witowski  Writing faster Python Presentation on how you can write faster Python in your daily work. I will briefly explain ways of profiling the code, discuss different code structures and show how they can be improved. You will see what is the fastest way to remove duplicates from a list, what is faster than a for loop or how “asking for permission” is slower than “begging for forgiveness”.  Did you know that Python preallocates integers from 5 to 257 ? Reusing them 1000 times, instead of allocating memory for a bigger integer, can save you a couple of milliseconds of code’s execution time. If you want to learn more about this kind of optimizations then, … well, probably this presentation is not for you :) Instead of going into such small details, I will talk about more "sane" ideas for writing faster code. After a very brief overview of how to optimize Python code (rule 1: don’t do this, rule 2: don’t do this yet, rule 3: ok, but what if I really want to do this ?), I will show simple and fast ways of measuring the execution time and finally, discuss examples of how some code structures could be improved. You will see:  What is the fastest way of removing duplicates from a list  How much faster your code is when you reuse the builtin functions instead of trying to reinvent the wheel  What is faster than the good ol’ for loop  If the lookup is faster in a list or a set (and when it makes sense to use each)  How the “It's better to beg for forgiveness than to ask for permission” rule works in practice

00:00
Arithmetic mean
Meeting/Interview
Lecture/Conference
Code
00:36
Writing
Presentation of a group
Computer animation
Lecture/Conference
Code
Code
00:55
Writing
Slide rule
Programming language
Process (computing)
Computer animation
Lecture/Conference
Code
Game theory
Formal language
01:28
Programming language
Computer animation
Software developer
Software developer
Universe (mathematics)
Neuroinformatik
01:55
Source code
Touchscreen
Computer animation
Googol
Computer program
3 (number)
Website
Pattern language
Mathematical optimization
Computer programming
02:34
Computer animation
Lecture/Conference
Computer program
Sampling (statistics)
Website
Rule of inference
Mathematical optimization
Power (physics)
Vacuum
03:10
Area
Logical constant
Semiconductor memory
State of matter
Multiplication sign
Computer hardware
Video game
Bit
Mathematical optimization
Product (business)
Number
04:04
Point (geometry)
Word
Software
Code
Multiplication sign
Computer cluster
Code
Software testing
Line (geometry)
Rule of inference
Mathematical optimization
04:45
Computer cluster
Code
Software testing
Mathematical optimization
05:09
User profile
Software
Profil (magazine)
Multiplication sign
Weight
Mereology
05:39
Module (mathematics)
User profile
Graphical user interface
Functional (mathematics)
Computer animation
File format
Multiplication sign
Active contour model
Statistics
Computer programming
Library (computing)
06:03
Complex (psychology)
Code
Variety (linguistics)
Mereology
Mathematics
Goodness of fit
Different (Kate Ryan album)
Energy level
Data structure
Mathematical optimization
Fiber (mathematics)
Physical system
Computer architecture
Social class
Area
Programming language
Algorithm
Constraint (mathematics)
Projective plane
Database
System call
Software
Summierbarkeit
Energy level
Mathematical optimization
07:15
Source code
Presentation of a group
Algorithm
Algorithm
Mereology
Field (computer science)
Number
Computer animation
Energy level
Energy level
Summierbarkeit
Mathematical optimization
Error message
Data structure
Resultant
Mathematical optimization
07:42
Source code
Algorithm
Software developer
Virtual machine
Computer programming
Formal language
Compiler
Web 2.0
Computer animation
Energy level
Mathematical optimization
Data structure
Mathematical optimization
Computer architecture
08:02
Source code
Programming language
Implementation
Run time (program lifecycle phase)
Algorithm
Run time (program lifecycle phase)
Multiplication sign
Compiler
Formal language
Revision control
Latent heat
Computer animation
Software
Personal digital assistant
Energy level
Mathematical optimization
Data structure
Compilation album
Mathematical optimization
09:01
Area
Computer animation
Semiconductor memory
Direction (geometry)
Multiplication sign
Spacetime
Computer network
MiniDisc
Semiconductor memory
Mathematical optimization
Mathematical optimization
Computer programming
09:32
Revision control
Source code
Computer animation
Code
Source code
Code
Mathematical optimization
Mathematical optimization
10:06
Tuple
Building
Run time (program lifecycle phase)
Multiplication sign
File format
Instance (computer science)
Counting
Disk readandwrite head
Special unitary group
Different (Kate Ryan album)
Object (grammar)
Hash function
Set (mathematics)
Endliche Modelltheorie
Category of being
Repetition
Complex (psychology)
Electronic mailing list
Amsterdam Ordnance Datum
Range (statistics)
Measurement
Element (mathematics)
Pattern language
Absolute value
Programmschleife
Resultant
Writing
Point (geometry)
Rounding
Digital filter
Functional (mathematics)
Inheritance (objectoriented programming)
Division (mathematics)
Electronic mailing list
Number
Element (mathematics)
Revision control
Computer multitasking
output
Loop (music)
Data type
Multiplication sign
Default (computer science)
Standard deviation
Online help
Magnetooptical drive
Database
Letterpress printing
Binary file
Computer animation
Social class
Library (computing)
11:29
Digital filter
Building
Functional (mathematics)
Multiplication sign
Electronic mailing list
Electronic mailing list
Element (mathematics)
Field (computer science)
Element (mathematics)
Number
Number
Programmschleife
Computer animation
Personal digital assistant
Function (mathematics)
Sinc function
Form (programming)
12:20
Digital filter
Functional (mathematics)
Code
Multiplication sign
Electronic mailing list
Field (computer science)
Theory
Product (business)
Attribute grammar
Sign (mathematics)
Lecture/Conference
Operator (mathematics)
Social class
Form (programming)
Multiplication sign
Computer virus
Planning
Variable (mathematics)
System call
Element (mathematics)
Number
Wave
Computer animation
Personal digital assistant
Function (mathematics)
Statement (computer science)
Resultant
13:19
Multiplication sign
Multiplication sign
Execution unit
Rule of inference
Attribute grammar
Computer animation
Vector space
Lecture/Conference
Personal digital assistant
Right angle
Condition number
Exception handling
Thumbnail
14:19
Multiplication sign
Computer file
Code
Multiplication sign
Electronic mailing list
Set (mathematics)
Element (mathematics)
Connected space
Element (mathematics)
Number
Computer animation
Lecture/Conference
Internetworking
Data storage device
Personal digital assistant
Statement (computer science)
Software testing
Software testing
Data structure
Form (programming)
15:35
Multiplication sign
Random number
Functional (mathematics)
Multiplication sign
Electronic mailing list
Motion capture
Set (mathematics)
Electronic mailing list
Data dictionary
Element (mathematics)
Element (mathematics)
Number
Order (biology)
Computer animation
Different (Kate Ryan album)
Order (biology)
Set (mathematics)
Uniqueness quantification
Software testing
Quicksort
Endliche Modelltheorie
16:48
Operations research
Pairwise comparison
Functional (mathematics)
Random number generation
Multiplication sign
Expression
Set (mathematics)
Range (statistics)
System call
Variable (mathematics)
Number
Database normalization
Computer animation
Computer configuration
Data storage device
Personal digital assistant
Function (mathematics)
Operator (mathematics)
String (computer science)
Square number
Condition number
17:47
Functional (mathematics)
Randomization
Length
Multiplication sign
Electronic mailing list
Electronic mailing list
Rule of inference
Variable (mathematics)
Computer animation
Different (Kate Ryan album)
Order (biology)
Convex hull
Data structure
Lambda calculus
Online chat
Lambda calculus
18:29
Functional (mathematics)
System call
Structural load
File format
Line (geometry)
Number
Revision control
Positional notation
Computer animation
Different (Kate Ryan album)
Function (mathematics)
Endliche Modelltheorie
Lambda calculus
19:10
Multiplication sign
Greatest element
Functional (mathematics)
Overhead (computing)
Lecture/Conference
Personal digital assistant
Time zone
Electronic mailing list
Inequality (mathematics)
Resultant
20:02
Source code
Building
Functional (mathematics)
State of matter
Multiplication sign
Source code
Spiral
Line (geometry)
Variable (mathematics)
Cartesian coordinate system
Element (mathematics)
Variable (mathematics)
Revision control
Category of being
Number
Arithmetic mean
Computer animation
Different (Kate Ryan album)
Function (mathematics)
Network topology
Energy level
Mathematical optimization
Mathematical optimization
21:49
Functional (mathematics)
Multiplication sign
Moment (mathematics)
KerrLösung
Right angle
Data structure
22:20
Physicalism
Whiteboard
22:45
Lecture/Conference
Semiconductor memory
Profil (magazine)
Internet service provider
1 (number)
23:17
Voting
Lecture/Conference
Source code
Electronic program guide
Disk readandwrite head
Buffer overflow
Condition number
23:56
Cycle (graph theory)
00:00
good without further ado like to present our next speaker Sebastian semester will be talking about writing faster code given that again if I have lot getting means so I
00:22
like to that would be about right invested goes and let's analyse giving a short talk some lightning talk about right faster code I remember someone pointed out that well basically you can take a Python called rewriting CRC Press Bassanio's it will be faster and I mean the guy was
00:38
right so take any piece of Python called rewriting 0 suppressed present Granny will be up and automatically
00:44
passed so thinking if I say just writing faster code it might not be clear if I mean on vitamin a lot so that fix the title my presentation and I was
00:58
very happy when you mean it makes it clear we're not gonna talk about CEO job there but then I realized that long I mean even though it's very clear is on the slide so I had to change the game so I get that depression that means exactly the same thing by the charter so this I and would write and classified them as a title for my today stopped so
01:19
that will decide more about which programming languages that we don't know you that we you here I'm glad that was not created to be a fast language that you would
01:28
use for some computations were every nanosecond pounds and that's final of me I mean by is a great programming language that very easy and fun to use I can it's very easy to learn the fact that it's so easy to read and write called in Python it's very encouraging for people of the new for people you too suffer development and I see that it's getting more and more popular
01:50
in schools or universities as the 1st programming language and I'm not surprised I mean imagine a
01:57
completely new programming and someone tells you pay let me show you how much fun programming is and that's start something so simple that's just bring some text on the screen and then he shows those 2 examples I mean 1 of them is clearly not something that you want to show the beginning to encourage the more have to start programming but as
02:20
a pattern is not only useful a useful for many there are many companies that are using Python companies with millions of users and billions of request amount so your website is gonna be found using Python so by Python is usually fast enough but what if you decide
02:35
that it's not fast enough anymore for example the websites that's giving the timeouts or maybe a constant called will bring more money to company so a sample
02:44
optimization but how do you optimize cold wall probably need to
02:49
follow some roles solicited over that and if we open the 1st thing we see that there are only 2 yearold power not easier than expected so let's take another look at those rules 1st Earl of optimization done done OK that was easy any questions laxity now there's not that so what does it mean don't well
03:10
9 out of 10 times when you think that we need optimization you done especially in the area the state of the product's life you might think it would be nice to optimize is called a bit but will 1st of all you will waste time doing something that is probably not needed but it you want to call to run faster you can stop in getting a faster hardware in the 1st place and 2nd of all optimization constant cost from most often the only cost is the time that you spend optimizing approach well sometimes you is also the time that the need to fix what we just broke the optimisation and but also optimizing the optimise called might not be as readable as it was in the 1st place then maybe is running faster but is now using more memory so and this you have really good reasons to optimize don't building and if you know that you have reasons to optimize then we can move the 2nd law of
04:01
optimization number this year
04:05
and this is how we understand this rule so 1st make sure your called words them make sure you have a good and only then you ready for optimization I love it always reminds me how many time I broke it I'm in so many times I was working on a piece of software and I started thinking from maybe I can change this piece of code and it will
04:28
be faster and maybe I will save like lines of code was a good idea no and not only because I am that breaking things but most often I did end up breaking things but also I started jumping around the cold and at some point I get confused with those writing in the 1st place did
04:47
make Michael Foster I have no idea because I had nothing to compare it to in the 1st place if I this writing the cold and then so try to improve it I could measure both solutions and compare them but in that in that scenario I could only get and that went into the last little optimization don't
05:06
gets always provide oracle human admirable in predicting
05:10
the bottlenecks of your cold so but if you think that the coldest slow 1st profiling and then see what probably takes most of the time otherwise you made up the you may end up spending time on right 1 piece of your just get like 1 per cent of speed improvement while the other parts of Europe software where you can gain weight more improvement
05:34
with less effort so that kind providing tools they were already
05:39
quite a few talks during Europe item about providing so I will not go into the details but we don't know what to static can take a look at the C program module and it'll show a clear overview of how many times each function is called and when the called and where the coldest and if you want to have some more advanced formatting
05:57
you can add the piece that's more than all could prefer the graphic interface you can take a look at the wrong thing around think these libraries so
06:06
once we ready for optimization we have to decide on which area
06:09
we want to focus so that different levels of optimization Starting from the highest level you have to design and optimization so depending on the constraints and priorities
06:19
of the system you can optimize by redesigning it in my require rewriting of software in the different programming language or change in the database fiber changing the architecture perform database queries so this kind of optimization we'll usually giving the best improvement but it also takes most you do this so I don't encourage a variety of software from the scratch but if you have some critical part of the call around often you can optimize in but very right that means you're the 1st class and because it's faster you you'll have some good speed from and for free put not way for free now you have vitamin C code in the same project the 1 that lower we have good algorithms and data structures so knowing good algorithms together with the complexity is definitely helped to creating a good and efficient software for example if you want to get the sum of
07:15
numbers from 1 to n the 1st idea might be to get a look that goes through all the elements and ask them to have do will want but it won't be fast instead in is the algorithm for the error committed by medics some which will give you the same results and it will what it will be more efficient so the next level is this article optimization and this is something that I will talk about in the 2nd part of the presentation and now we're moving to
07:40
the field level which involves setting up some specific field
07:43
let's so in your daily work it's not something that you will do all of them you can optimize spike for a specific architecture but is there a web
07:51
developer like me this is you know something that you will do was the machine or you won't bother at all next we have a compiler so we can make some optimizations if the program has if you programming language
08:02
has some ahead of time combined and say something about the bike and they which doesn't really have ahead of time compiler we asking the father well and the last but not least we have runtime that so it's related with the the compiler that using some some compilers faster than the other so for example if you replace it by
08:22
identified by you can get some improvement depending on the use case of software so but but innovate depends on what kind of the support your writing so most of the time want to set up on a specific language implementation there's nothing you have to do to benefit from this kind of optimization it's usually up to the creators of the compilers to optimize them so simply updating their new brought new version of the programming language you using can make called Runnymede faster so when you optimize probably wonder called run faster and also use this memory and basically less of everything the bad news is you
09:02
can't have all the optimization what area will usually deterioration in other areas so you always have to decide which resources are crucial and you have to optimize in that direction so it's possible that optimization will have nothing
09:18
to do with the speed because they're all their resources more important and there'll be for example who cares that you're the program is now 10 times faster when it's crashing income the time because that is running out of memory was up
09:33
another important resource that people are often for getting is the sanity assigning the other person that will be maintained in your code so please be nice to the person you never know that might be yeah so this is really writing it's always called if you making your called harder to read and maintain than you're probably doing it wrong so having those things clearly translate of how you can write faster by also known as source code optimizations my
10:05
examples I'm using the version
10:06
3 . 5 . 1 of I don't but there like although the example should work in both them 2 and pattern 3 so much for measuring the execution time of my cult I will be using the magic and function it has some overhead comparing to the standard time library but it doesn't really matter because as long as we use the same mental tools measure execution time of different functions we only need to know which method is fast and by how much so for each of my examples I will write different versions of gold measure the execution time and compare them so let's start with something simple let's say you want to count the number of elements in the list you can easily write a simple that to increment the counter and well this will was it will be very so you can achieve the same results using the building and function and as you can see for only 1 million of results the differences in the huge so my 1st advice is not to reinvent the wheel but 1st check if there is a function that you can use by country . 5 this
11:07
68 giving functions so it's nice to take a look at them and keep them in the back of of your head because they might be handy at some point falls off before you start writing your own version of all the dictionary or dictionary with default values take a look at the collection model from the standard library even though it contains only like 10 different database those are probably the
11:29
provides you are looking for you to stand the answers are not enough so let's say you have a list of 1
11:36
million elements and you want to select only 2 of numbers so the nite question would be to use the form so for each element of the least you check if it's hot and if it is added to another but I already show you in the previous example that in most cases for loops can be replaced with something better In this case you could use the building field of functions that and in fact until you the world of directly by countries with amateur radio so I have call the least function get the same results as in case of the form and even though this function has some impact of on the performance it's negligible comparing to the time spent the filter function yet we can see that the the performance is than the
12:20
former why does this happen well defined the field there is returning now and theory is a clear sign that it's wrong use case for this kind of functions so if
12:31
you want to get the call as a result it's better to use comprehension is around 75 per cent times faster than the formal plan of this form we use more clear when you want to execute piece of code but you are not sure if it will be successful maybe some variables are not set off but in this case the the class might be missing so much so you want to protect yourself from house the 1st wave and this is called look before you leave for ask for permission what it means is that your 1st check if the cost of a specific attributes and then you performed operations usually just checking is done the statement however there is defined the product years and it's called back for forgiveness and so In this scenario who performed the operation without taking the
13:20
conditions for but in case you expect you expect that something might break you wrap your called unit right love and you catch the exceptions of and as you can see in the
13:31
simple example being for forgiveness is like 3 times faster but it gets even better
13:36
if you've taken from our conditions so here we are checking if 3 up to the present and asking for permission is still slower and now it's also getting more difficult to read so following the vector from the this approach will result in a faster and more readable prose so we could say that asking for forgiveness instead of taking the permissions is always a better way but we won't say no because it's not true exception handling is still quite expensive so if the attribute is actually missing invading for for units will be slower than asking for permission so as a rule of thumb you can use to ask for permission way if you know that it's very likely did that after that will be missing or there will be some other problems that you can predict
14:20
otherwise if you're expecting that the called will work in most of the times try using using triaccept will result in faster and quite often more readable code so for example
14:30
interventions some files from the internet then we expect that everything will be fine and if there is no income connections so instead of checking the
14:38
internet connection must and not there finals to store for the but then again I strongly advise you to measure both solutions and maybe in your case it would be different so it's tackle another problem the membership test so if you have a list and you want to check if it contains a specific elements you can use a form but the problem is you are iterating over the whole these given those you've not really doing anything with all those elements so we can replace the formal with the in statement it will check that specific element belongs to a given set of data and it will do this twice as fast the but there is still 1 the problem with this approach the lookup time depends on where you're element is located in that it is at the beginning of the list you're lucky and you will get fast if it's at the end of the list you have wait for what will be really nice scary who had the data structure that would have a constant look up and actually invite Python we have we have both sets and a dictionary that can
15:37
from the profession so if we
15:40
replace the least with set and then they look time becomes faster from just 2 times faster to hundreds of thousand times faster so was capture but you buy some time to come to refer to assess and in this scenario combating this leads to a set takes more time than any of the look some of these so it doesn't really make sense however if you're checking membership of different elements uh quite often it makes sense to 1st convert it to set so speaking of since they have another interesting feature they don't contain duplicates so basically have a list of elements and you want to remove the duplicates the fastest way to do this is to combat this leads to a set of and the ordered sets are not out of order so if you need to preserve the order they can look at the order dictionary from the collection model so if you want to sort 2 you can either do this in place using various of sort function or you can call the sort of function that will create a new lease and on this you really need to have a new sorting inplace will be like 6 times faster in this scenario and this is
16:49
for 1 million of random numbers if you want to perform the same operation on a large set of data then you have 2 options you
17:00
can write a function that performs the operation and call this function 1 thousand times or you can call a function that takes 2 set of data that and performs operation inside and the 2nd approach would be faster so if you can in an easy way to replace multiple calls to 1 function with just 1 function then quite often it's a good idea so what's the best way to check if a viable expressionist true well we can explicitly compute their this variable to true but in most cases you adding additional redundancy so you can simplify your condition to just give viable and political and there's 2 variable false non 0 and the string into the store or the falsity expressions and by doing that you're comparison get faster by the late 70 per cent
17:49
and the same rule applies when checking for so the fastest way to do this
17:53
is to if not viable unless you really need to distinguish false from it's a non 0 order false values it also applies to and data structures so simply doing if not only is this will be almost 3 times faster than explicitly checking the length of the list so let's take a look at different ways of defining functions in Python the most common 1 is to create a function with the keyword and the other ways to declare an anonymous function of lambda random if yes I understand that to a variable
18:30
it will act in the same way as a function quicker than we would that the work and as you can see the ball equally fast why because both
18:39
versions do exactly the same thing we can disassemble the cold of all versions with this library and we'll see that inside the same cold so is there any difference what if a function has more than 1 line you can use them and you can't really full documentation inside of land also you could have great enable you called editorial complain expanded to to assign them viable and this is why the numbers were really nice we need a simple oneliners called Fourier functions especially for functions that the model reduces and there are
19:12
also some quite narrow use cases where it might be necessary to use land as a covert so if you want to read more about integrating the the bottom but about in any other case I would definitely recommend to is that it's a much cleaner and and documented properly and the performance is exactly the same so
19:32
there always how you can create a name to so you get inequalities function or attentive the needles and that's and has disability thousand that faster why is it faster because if you call a function like 1st result is functions and with the literal syntax and there is no overhead for them and the exact same thing happens for creating a dictionary we have more examples that should be treated with caution and given that the
20:02
cold in faster I would not advise you to do this kind of optimization so sometimes even though you can squeeze some additional performance from your called um it doesn't mean that you should do this the 1 thing
20:17
is a viable assignment and if you have a bunch of variables that we need to assign you in do this the normal sequential way or you can go for this crazy spiral of science and I mean you can gain some speed but with this constant state of your colleagues that will be reading this gold later so we might opinions and work with OK and another interesting property of Python is that they look for local variables is faster than the lookup for robust or buildings so you can save save some time if you start the reference to the building function over the whole function in the local variable so in this example the only difference is on the line tree where storing the reference today blob often in the local often viable and thanks that this function is like that if I wasn't faster but then again if you see this called for the 1st time that it's not very clear what is this what is supposed to do it might be confusing to see this kind of and function because we are most you more used to see those used to build up and version some of the different kind kind of optimization it's quite often about this speed but not always and also different levels of optimization so sometimes if you come to rewrite the whole application maybe you can use a different approach M. even though the source code optimization it's not the fastest way to optimize of called small improvements will add up and the main advantage of it is she so you can write you can
21:52
optimize the code of at the moment of right only need to rewrite something and as long as you're writing idiomatic cold and you don't reinvent the wheel but already used existing functions and the destruction the title the new already doing it correct so
22:09
because when you call if you think that the different called structure can be faster you can quickly check it with the time and then you can improve it all right my name is the most
22:20
interesting and I work that's and so we that in book about physics and you probably in their own conference what if you want to buy the book quite intention somewhere on the board begin the the 2
22:43
minutes the questions as much as you have to take 1 or 2 questions
22:45
surely have them but does take as our question is wasn't all I have a quick
22:53
question use you should have some follows corpora followers to have any preference every indeed it's on the jury
23:04
depends what you want to profile because if you care about the speed and the basic ones are fine but if you want to provide a memory usage than you might need to use different providers so it really depends on what the what the profile it
23:17
westerns yeah uh and a good a condition known book so source where we can find the best practices regarding this idiomatic Python and not from the top of my head but what didn't there is uh some guides and no official by the annotation but also for me it's a lot of going for best practices also reading others said that overflow there are some books but I can't give you the names right now
23:52
more questions to yes vote was that you stick your hand up so just explaining
23:58
something excitedly pretty remote questions any more questions no negative that cycles because a fantastic talking