Add to Watchlist

FAT Python: a new static optimizer for Python 3.6


Citation of segment
Embed Code
Purchasing a DVD Cite video

Formal Metadata

Title FAT Python: a new static optimizer for Python 3.6
Title of Series EuroPython 2016
Part Number 64
Number of Parts 169
Author Stinner, Victor
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
DOI 10.5446/21236
Publisher EuroPython
Release Date 2016
Language English

Content Metadata

Subject Area Computer Science
Abstract Victor Stinner - FAT Python: a new static optimizer for Python 3.6 The Python language is hard to optimize. Let's see how guards checked at runtime allows to implement new optimizations without breaking the Python semantic. ----- (Almost) everything in Python is mutable which makes Python a language very difficult to optimize. Most optimizations rely on assumptions, for example that builtin functions are not replaced. Optimizing Python requires a trigger to disable optimization when an assumption is no more true. FAT Python exactly does that with guards checked at runtime. For example, an optimization relying on the builtin len function is disabled when the function is replaced. Guards allows to implement various optimizations. Examples: loop unrolling (duplicate the loop body), constant folding (propagates constants), copy builtins to constants, remove unused local variables, etc. FAT Python implements guards and an optimizer rewriting the Abstract Syntax Tree (AST). The optimizer is implemented in Python so it's easy to enhance it and implement new optimizations. FAT Python uses a static optimizer, it is less powerful than a JIT compiler like PyPy with tracing, but it was written to be integrated into CPython. I wrote 3 PEP (509, 510, 511) targeting Python 3.6. Some changes to support FAT Python have already been merged into Python 3.6. We will also see other pending patches to optimize CPython core, and the bytecode project which allows to modify bytecode, it also includes a peephole optimizer written in pure Python.
can and welcome into the place of the on that Python by
Victor steneck under that welcoming community views on be so hi my name is the
because and jointly working for
the record company I'm working on the OpenStack projects 1 part of my what is balls the giant beast chordal constructed by them free as the good news is that I'm or where is the horn was done because they both in the more than 90 %
of the projects so by the phrase coming and some and most a Python called rubble since the something like
5 and today I'm handing it to present you a new project called fats by them the 1st thing that
I will try to explain why it might turn is slow and way this specific language is more difficult to optimize and some of those and if you would like to say that's law you you must compare it to something as uh come on and comparison with these 2 users assume region because by turning sometimes is
as on was the same speed but in some chronic is it's up
to 20 times so strong and when I say gives a signal which the CDs compared to much so the carjackings instituted by the CPU are compared to buy on which is interpreted as means that you you get a byte code and byte code is executive vise and international uh case was a case of C Python and you can also compare patent was JavaScript because JavaScript is also a dynamic language as a patent but JavaScript is very very efficient and compiles of conference them in many brothels and compare chew Javascript Python is due to the storm that's what we
already have a much faster 2 implementation of Python as the most famous 1 ends the
most advanced and more stable is applied by which is the absence of something like 10 uh it's fully compatible with the Python uh it's really fast like 5 or 10 times faster than the Python all sometimes even more depending on the specific kind of application on your workflow uh you of a new project corridors based on which is the response or by a drop right so a fork of the 7 and based on the evidence here using these tools give the compatibility reads this extension of birds uh
in some cases tried to call that's the Python code to mention code to operate motion code and as the project is
good the patient a is it's made by Microsoft's it's only be guaranteed to be channels and based on uh I think that based on 2 years old and vision is 1 year old the vision is based on the Microsoft cost you know at some and as a kind of a project is number number is not there for the implementation of light magic compiler but you have to annotate your function with something like uh out so G T to and it's this for number also uh for example it's very efficient formed by but it's not that generic and implementation of Python you cannot make a jingle for example much faster with number and there's all common example is the site of Satan is nothing really and implementation of Python it's more compiler to taking patents slopes and comments the code as well something like a extension but you can also annotated type 2 to make even more optimization so if you start to annotate the type of its normal Python but it looks very close to so the the
1st question is why do we
need a new optimizer and this is the fact is that I'm walking on the upon side project on the upper side we have still using C by because instances of reference implementation and some people tried to use by private there is not enough support and the problems the community to fix some simple issues so and about our by starting from the books the issue is that is this started from bottom to and we are on moving slowly to plate them free and that is they they don't belong to support the free right now and PGO vision is too early to be and I'm not sure that's a course yeah there from Microsoft is really optimized to 4 perform like line exomic speaks so I'm trying to do do
my best to make survived the narrative it's faster uh
and this effect is that by is not was passed down wasn't see by the outcome of the unknown about bottleneck of is when you use this extension because 2 supports extension abide by the US to in relates to the name of the object in memory you have to you need to have 2 views of the data the optimized you of by by that's all way to that falls this extension Europe they also have to relates difference containing and of so many years complex traits and because of that surrounding extension on by by and because by bipolarity was returns from scratch so they don't have the that you would year period the Annals of factors
that pattern remains a reference implementation for new features for example if you compare to its 7 for the future and 3 . 6 theories
right range of individuals and there maybe 10 or 20 new materials and that's also a new changes in the syntax legs I weights and the single keywords but it's also the new at string and by country that's it so by by is moving and it's moving fast and Sisub and I suddenly
in many libraries on the applications rely on C Python implementation details but I have to put quotes because it's not really detailing the so and intimately complex but education continue to rely on on them and implementation details of the patent office for example this year period of the side and as all good example is the garbage collector because in those 2 bytes and you have some reference point dimension based on the reforms continue its means that there's when you really is the latest reference to an object is distorted immediately but you know by by they decided to use a more efficient garbage collector and a consequence is that you object maybe destroyed later you don't know when exactly and then if you write your code is that for example if you open a file for reading you put that identifies a uniform got to close explicitly that that millions not Billions reduce the bonding Wednesday district score so good practices to cause to cause a close-knit of the use of the context manager that's there is still a lot of code in a way in which is not too much to written correctly and that's all for your
information in and we know well the stops wanting to detect such issues to to simplify simplifies the is goal of my application of the start item projects is the idea is to replace the court of the learned function computing is the length of the string ABC and we proceed to jointly with the results a number of free this is God who looks quite simple but they really explain why it's not as simple as you expect and this is the 1st part of the broken appointees that's uh everything in between by 10 is me table and when I say everything is just everything's wage to you some examples of Britain function like is and function it can be replaced by something you can leave it modifies the byte code of a function at runtime you obviously is the value of the bond variables challenges and any time there is no such thing like constants in right on wage
so you cannot rely on the value of a global labor because he can change any time so you have to realize the value of each time at the beginning examples of for the
function you can view places land function at runtime and uh when you cut it instead of getting is the length of the string you get the stream of and this
example is maybe not very useful but is very common in the unit test to use a mock modular tool to reduce the complexity of many testimony test 1 specific functions efforts by
10 is not my 1st attempts to what may see vitamin and in the past I wrote to ice-t optimizer which is a simple case the humans and I also wrote produced of yeah which is on you and implementation of the group
evaluating byte instead of using a stack by user and user reduced also you have to register was not to be reduced also and above project implemented in optimization like grazing lands at ABC we free uh because of that's a good
thing about feedback on a project because of the changes to the patent semantic and that people uh explained me deeply that a really wants to buy time to remaining dynamic because the choose this on which because it's dynamic and because you are able to be placed everything at runtime but even if she looks so ugly as a host look in some specific cases it's very useful tool to be able to move to modify in anything so if you if you would like
to write a new optimizer you you have to respect some rules some constraints as the 1st 1 is too much ends of latent semantic uh I had something really important for Azerbaijan community has obviously you and you should not break application it means that if you of code using the optimizer it should continue to walk as it was before the optimizer and are good property would be to not have too
many faces so screwed them because I don't want to write something like number which require to put some of the correct doesn't function of use this stuff on the code yet is more to be able to optimize any kind of applications because I want to
to other the fastest growing regions the community and I and I hope that you but then becomes faster more people we review it again I really present you some some ideas through to
rock work around this limitation and you need to respect by semantic that's a analysts to of of Mexico and so 1st thing is tool to to implement optimization fact to implement efficient optimization which provides uh is about speed up on
real application and Newtonian tiny micro-benchmarks you have to make assumptions on the code and to to make assumption tool is called gods because of zirconia check made at runtime and for example I got be checking Britain and function was replaced on that at runtime of every point
on this a very important feature of my of namespaces namespaces are used like everywhere to stone that are for example in in a more you through the provided variable our our use and the space is the functions of local variable our of stored in in space class it's useful classifiable but of methods for instance to various forms of attributes of the object to um technically and space
and pattern is a dictionary and is a technical challenge to to write the gods on and then space these 2 of which increased which is faster than
dictionary lookup because you may not know but I look up in Python is very is very fast and you if you would like to have lots of the look at the checking is even faster so I propose a solution for that it's a new perhaps tools simply add the option to dictionaries and I really of that later as 2nd optimizer goodies to specialize across uh as is the idea is to to to make some assumption in the code and the neighbor of optimization for this assumption and scored as squad situations and the to to be able to use the distance
covered you will have to check gods time to decide if you use this bacillus goes also regular the because and I wanna 1 example of specialization is if you have a function which to bomb tells the X and Y hence to vomit cells are known to be usually to be integrals you can specialize efficient walk to be optimized for integrals because uh if you know that it integrals you can enable a lot of different optimization which are not possible in the common case when you don't know types and upset a code to to go function becomes the 1st you have to check so gods and you pass the function bimetals implement guards on the type of the parliament of the physical find from changed you can use this percentage tells you something changes just for about to the global uh by generating you as see bright and already has a tumors of codes of people optimizing is the it's not you is a walking on byte code it implements a means a simple recommendation like constant for doing that that coordination on the optimization on on genomes uh some sort of an important is
that it's written in C and because it's the tendency it's not easy to extend the tool to implement new optimization and moreover it has a very narrow world view on the byte code that you only see a few inspection before maybe 1 of tool instruction in Netherlands so you you can you only have a very tiny knowledge of the code for example you don't know the war function you don't know the word what you so you are very limited in the kind of information that you can do uh the plate and provide something more
interesting could HT based is abstract syntax tree when you when patent comparison applied 5 to it could in fact you have governor mediate steps the 1st 1 is the tokenization to take a thousand were observed to tokens and so can solve compared to waste the highest is at a higher a presentation of the code so it contains all information about as a tree
which is very convenient to to to analyze to process and later so other types of nodes and so it's sort uh it's even more easy to to analyze its and so waste is compiled to bytecode developed among show you an example of ice for the course and then of the string ABC so you can see that so the code as a type course so you can know directly that's quality test coupon the function on arguments the function in this case it's we have to load is that is the name of many forms of robot all from the beauty and and there is 1 argument which is a string so you get a type string and content to use the most
simple and I see optimization to just to replace the current research results you can use
a know that I still module which is part of the stomach neighboring and the and as a material as I have visit admitted and depending on to make use the name of the middle reliant on 1 another thank this category but we present we we the optimization of the so we we have we have got so we have specialization uh what we can do it that's that we can implement sort of condition so this the following optimisation already implemented in another 5 by 10 projects for example you when you call a built-in function you can replace it with the value of the is the idea is that instead of having to call each time you you get kick gets a constant so it's you don't have to compute the result everything you can also simplify it every for example of the place to the range function we fucked upon because uh later if you combine readable optimization in becomes much more interesting to other constants as you read the uh on this when you optimize a built-in function in the got on the replaced by abusing function and underage a range you have also your gone on the ranch function and find are interesting optimization is a loop and wording and this is the idea is that instead of paying the coast of the king q 4 q we chose to create our finite objects flux take the 1st take a 2nd them and continue until you get an exception the disease to to duplicate salubrity enough time for each iteration and generates on assignments for example it's to get 1 thing that we free and that this optimization on on on is not really interesting but it's neighbor even more information and that we will see later on for example a a a a a super optimization is to provide a constant uh because here you'll centers those of value 1 who was a it's so you have to store the value just after that you have to reload the value of because Python based the and so you have to push for that user every time so 1202 is isn't the load from the volume you can just call phi the value of the variable directly to the courts so instead of Prince makes you just call prints 1 article transferred is a set of operations on the constant values so all the girls as strings of circular for integrals uh to give you some examples of issue ask for the present value of 5 is just above 5 if you would like to check if 1 elements of the analysis and instead of creating is released at runtime you can uh compiler company call gravity to act upon which is going to be once universal object of the operation and strings operation of some string and the latest 1 is interesting because it's not the constants least but even if you set the table is you know there's a wizard will always be the same so you can replace Rosenberg operation jointly with the value of something as is that you can and you can avoid on a load development institution
because it when you call a computing information like a man you have to know each then you have to
reload the function from the from the robot got to check integral and after that you have to check contributing because in fact the function is used in between so the quails to look up uh and this means the distinction local robot can can be replaced with constant means that you have to inject a built-in function in reverse runtime and if you do that you don't have you ever do look at the to look ups uh the simpler challenges through we move the date code so for example the issue of the test and if we find each block in his work as the rookie is empty you can just involves the condition and removes the it's useful to avoid the germs in the use of a problem that uh it's you have a test and a test is known to we're always before you can just remove the waters if you have a final institution like return rates of something else that's she's as the end of the contour from it you can you can just remove which is after the final inspection
we can know about the implementation of the good news is that I already got the changes managed in the cold as the 1st 1 is a new type of ice the which is constants and that simplifies the In the optimizer because instead of having to check each time the type of user for example name constant string of bytes you single types so manually check but moreover if you have a topic of constant object auto parts of the of constant objects you can replace at once laughter that's the optimizer you only have to move 1 and of change which was mentioned by 10 feet at 6 is to you got a negative link and all that stuff because in Python we don't store the Yankees have a number of rule to each instruction because it would cost too much memory and it's not efficient so we we start compressed table mapping instruction of sets to land and when you implements optimization like loop and learning so the number of goes back part sometimes it's little tools because we don't start lender magically we which we saw that time so I changes and I'll to store negative and then go to 12 hours and then we suppose and the latest changes tools to to project the Department of frozen said constant compiler and because of this optimization already existing debate people optimizer but it is implemented in the ones about called on that I and I would
like to win that implements the same optimization but I still have them so we've mentioned you can generate directly constant Apollo frozen sets
general we present you free that's which are return to manage my walk into C Python as the 1st 1 is to add a new version to the addition to the dictionary uh you a prior that is not visible at the pattern of money at this university has a probability that national is increased at at every challenge and events unique for dictionaries and a and B and the 2nd property unit for dictionaries means that you not only you know if something changes but you also know a few of you still see you are still using the same dictionary because technicality in some kind of in some cases you can replace enhanced plays from material of class or something else and you would like to make sure that namespaces saying and
using a the option you can implement a got on the namespace you because it is the ones that come and go as if nothing changed you just have to compare the version and you have a to give you an example of sort of got the you you get to
the version of the dictionary is imaginings exactly the same you words and look at your otherwise you you look at falsity of the queries to assignments means that's something has changed the doesn't matter in own or use case so we still a new value the other 1 and I otherwise it means that the value change but invites own issue issue look at bridging function of also class middle of very we're very how to modify something things and space so the the hope of expectation is that you always go to the hospital the 2nd that is a path to a specific function that the newest the function to the step yeah I'm going to buy functions besides you you can use it to to register you specialists and code uh using God's to mean that user account of true you corridors this as go and they modify existing around the city uh fire which is the most important looking by turns to look which evaluates the byte code something mean there changes in the Czech gods and depending on the results of the cards you choose which could should be easy and not only in Uganda and generates as those but you can also call any kind of corroborate function and you can generate specimens scholars using any any taller and mechanism using a fact of tumors in which walks on the ice but you can also imagine you know that you use the site and a generate mission code you can use spectral generates a this to suppress postcode all maybe
you can also use about the number who use this as college votes to keep his and semantic
so want give you an example of specialization instead of contains a built-in function to generate a cocktail you can just be places so course with the value and when you will use this as a function you pass a god on the beach and function and the last step is that they have to distance consumers uh and this paper as a new common legible option Dutch for it as a new function corridors sees that that separate consumers of could customer summer can walk on the debate could
but he has to walk sums I still for example we might that the people of the Mozambicans are adequate consumer so it's it becomes part of the same process and you can even disabled and supply part of if you want all use you all 1 of the means of implementation which may implement more changes more optimization and the question is the real world happened for Python
and free that's 6 uh 1st they got the good which feedback in my free perhaps on on the project engineers there's a broker point is that people are asking me ask me asking me to show concrete cesspit upon application on the on the on micro-benchmarks and the sadly and to be honest today it's only faster on micro-benchmarks because I spend a lot of time just to implement the gods who implement a spaces specialization too many physical violence through to support to raise the on the of somebody so I not have much time to implement amazing optimization small as the foundation of the project and many European uniting need at least a few months to implement more innovation through
2 something visible on applications and what's coming next uh and so
say that we can implement modernization so here are just some ideas when you moral look for when you and you get this study discovered which looks inefficient because you have signed are valuable uh it's guy 1 meeting that we get it to go free but still it's valuable reason used so in this case you can just remove about because it's my an another example
is to replace the robot
because uh if you know all of the future that's probably not changed instead of having to call you to load to go about it each time a function you can just complaint in the function body under much more conditions like constant for the and as user need a
god on those accused of and as under optimization is
a function inlining because in invite some of the naming as our important function corridor as an important coast so instead of recordings of function as user ideas to apply the function body where you cost-efficient and is that in this case you can if you combine it's because optimization you can produce much more efficient code and obviously in in god on the enlightened function because is a function is modified somehow still have to cause of original would different function uh as a larger project is to implement for finding the this is done at once Institute compiler JIT compiler faster profiles across the Web occurred is running and depending on
some threshold and some trivial so you can emits much encoded uh and but I don't feel able to walk through to implement such thing at current time because it's really complex by guys to took have many years to prove program something efficient so my ideas to runs provider a and known what goes for example red
unit test so you ask me to stop but I know that they
have 45 minutes and the
all can alone it just to finish quickly at about Prof
material which is what to implement benchmarks and as the idea is to support multiple processors and computers rage because if you run a single process that you you're young you get 1 specific performance but if you in is meant to time and you have to do you you get
a bit of a realistic value and it's not it's very efficient tool to get more stable management and you can also store data as a user and because of and thanks to that you can display compare and analyze that the it's a set of library of already modified to survive 10 benchmark suites with to use it so we really gets much more stable benchmark OK here we have 3 of integration be at there at that
time very maybe 2 questions how hi thank you for the call of the talk but what I wanted to ask you is you said that when you you modified see evolve to see if the gods are valid and then going to the function and what kind of world that cycle that was a function that jet was because it's not only when you get into the function you have to check that what regards you mostly have to check every time you call the functions you call the vault or even some other things because buildings could have changed in and yeah values and at the end of other stuff how the deal with that's uh we felt bytes and you
know you get some of the gods which objects as the entry points but when you specialize the code that you you can inject you own a gas inside the code so you are free to generates God's inside the function body through we decide so the function would be if you take a 1st plus for 1 line of all 4 but to the you know who then I have all of which is why and
modify C of all why don't you check the guards inside of the specialized and then they all of it is just as
decision right the 1st time soulful technical reasons it's more easy to do it like that there be and things that you think you much mixed in the excellent
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation
Computer animation


  392 ms - page object


AV-Portal 3.8.2 (0bb840d79881f4e1b2f2d6f66c37060441d4bb2e)