The Python Compiler

Formal Metadata

Title The Python Compiler
Title of Series EuroPython 2015
Part Number 167
Number of Parts 173
Author Hayen, Kay
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
DOI 10.5446/20148
Publisher EuroPython
Release Date 2015
Language English
Production Place Bilbao, Euskadi, Spain

Content Metadata

Subject Area Computer Science
Abstract Kay Hayen - The Python Compiler. The Python compiler Nuitka has evolved from an absurdly compatible Python to C++ translator into a **statically optimizing Python compiler**. The mere peephole optimization is now accompanied by full function/module level optimization, with more to come, and only increased compatibility. Witness local and module **variable value propagation**, **function in-lining** with suitable code, and graceful degradation with code that uses the full Python power. (This is considered kind of the breakthrough for Nuitka, to be finished for EP.) No compromises need to be made: full language support, all modules work, including extension modules, e.g. PyQt just works. Also new is a plugin framework that allows the user to provide workarounds for the standalone mode (create self-contained distributions), do his own type hinting to Nuitka based on e.g. coding conventions, and provide his own optimization based on specific knowledge. Ultimately, Nuitka is intended to grow the Python base into fields where performance is an issue. It will need your help. Scientific Python could largely benefit from future Nuitka. Join us now.
Keywords EuroPython Conference
EP 2015
EuroPython 2015
Transcript
Yes, hello, and welcome to my talk. I am here as a private person, not for a company, with my project Nuitka, which you can see on the right side. I have a quick overview of the topics. I gave two previous talks at EuroPython, two and three years ago, and this time I am going for a faster talk. I have some actual good news, and I have some problems to share with you.
My name is Kay Hayen. I work as a professional in industry, and I do this as my hobby, as a spare-time effort. The title of the talk kind of discloses it already: it is about the Python compiler. I got a bit preposterous with that title, but if you see the goals, it makes sense. I will show you what it takes, which is nearly nothing. We are going to compile a simple program and a full-blown program, Mercurial, although there is not going to be a lot of time to look at this, so that is going to be fast. I am going to present my goals and how we get there. The last point on this slide, "join me", is the most important, because this project has really high potential and is limited by the number of contributors. So far I am mostly on my own. I have a few people who help me and sustain parts of the project, but it is not enough, so it is going slower than it could. Still, it is progressing pretty well.
Then we will look at some details of Nuitka. As you know, Python is a very dynamic and complex language, and I have taken steps to reduce the problem. There is a common complaint here: everybody knows that Python is highly dynamic, so how would a compiler work? Then we look at optimization and what we have so far; that list grew longer recently. Actually, when I wrote the abstract for this talk, the inlining of functions did not work, and it did not work until maybe last week. I see this as a breakthrough for the compiler. Practically it probably is not, but technically it is a very good achievement. And then we look at what else is going to come.
On the name: it is named after my wife. She is Russian, and the name comes from a Russian form of her name, which is tricky because it is pronounced differently than it is written.
So much for the name. I started it after mingling with other projects, PyPy and Cython, to have a fully compatible compiler that does not have to make any compromises and does not have to invent a new language. I was thinking out of the box. Most people see Python as a very powerful tool for some part of the language landscape, but not all of it, and I wanted to take it to where the performance-critical stuff is.
I do not have any time pressure, so I can do this the right way, which is very liberating. It is free software. Most of the major milestones are now achieved, and it is basically working. It works on all the major operating systems. Android and iOS need some work; in theory they should work, and I know that some people have done some things there, but it is still future work. Obviously the mobile space is one where Python could use some help, and maybe Nuitka can provide it. It supports older Python versions and new ones alike, the latest being 3.5 beta; anticipating the question from you, I added support for that, so it passes the Python 3.5 test suite when running the compiled code. It takes a C++ compiler, which we will cover in more detail later, and it takes your Python code. That is it: you really just need a C++ compiler and Nuitka, and you can compile.
Coming up with a new language that is separate from Python means you lose all the things I like; I put them all on the slide here. I am trying to be a bit fast with the presentation, but you know there are lots of things that you are used to, and if it is not Python but something else, their value is just lost. So I put a kind of stop sign there. Very important to me is that if we have a fast Python, it should be Python, not a dialect, so that you can switch back and forth. The thing I am trying to achieve is that if you start using Nuitka, there is no price attached. That means if you encounter a bug in Nuitka and it stops working, you can just use something else.
My ideas here for performance are very old ideas, and I have not actually done anything in this direction yet. I know Guido is running around presenting some ideas for typing, and everybody asks whether I will support them. The answer is yes, although technically I would like something that also works during the run time of Python. In his proposal it is something that Python very explicitly ignores and does not use, and I do not like that at all. I want it to be code that actually improves quality and makes actual checks, and then the compiler just gets to benefit from the knowledge extracted from such checks.
The first goal, and one which I met a couple of years ago, was feature parity with Python. It is compatible with all the language constructs, and it is also compatible with the runtime, so PyQt, lxml, whatever extension modules there are, you can use them. The compatibility that I have achieved, and increased since, is amazing. Basically my first aim with Nuitka was to demonstrate that something like a Python compiler can actually fit in without a price attached, and I now consider this a true statement. Some of these projects needed patches, PyQt, PySide and so on. Sometimes they make type checks on what a function is, and I have a compiled function type, and they were not tolerant about this without a patch.
The next goal is to generate efficient code. On the pystone benchmark I achieved a speedup of about two and a half fold. This is something I looked at, but it was only a proof of concept.
It only showed what we can gain from compilation alone, and that is not really worth it, so I think this sort of speedup is unimportant. What is new is that code generation is now starting to remove code that is not used, and it is using traces to determine whether objects need releasing. As we will see later, exceptions are now fast; I have a slide about that. Then there is constant propagation, which is basically just peephole optimization: it identifies as many values as it can and pushes them forward, so if you assign a constant to a variable and use it later on, efficient code is generated. I have just recently achieved that.
What I have not got yet, and which will be an important part of getting any improvement that is worthwhile for anybody, is type inference: treating strings, integers, lists and so on differently. That is only starting to exist. Then there is interfacing with C code, the so-called bindings. I had a discussion with the Cython people this morning. Nuitka will and should be able to understand ctypes and CFFI; I have a slide about that too. And hinted typing does not exist yet, so not this year.
Here I have an outside view of Nuitka. On the top left you have your code. You put that through Nuitka; it recurses according to your PYTHONPATH, finds the code of the other modules, produces from that a bunch of C++ files, puts them in a build directory, and then runs SCons. What typically happens is that people tell me, for reasons that I do not really understand, that SCons is somehow bad. I do not think it is. It does the job, and I have it included in Nuitka. Nuitka can be used to produce a module, so if you want to deliver extension modules built from your Python code, that is feasible, even for whole packages, or you can produce an executable. From a user standpoint you throw in your code and that is basically it. SCons handles the C++ details, and I get very nice e-mails from people who say that it even found their Microsoft compiler and just worked. It is very easy to use, so I have a very low barrier to entry.
When we look inside, you will find that I have a couple of phases. It is based on the abstract syntax tree, the same one that Python uses, so in a sense I am using the Python parser, which is one of the benefits of not having a separate language. Then I apply what I call reformulations. For example, in Python 2.6 the with statement was added. I could have a with node and generate code from that, and the first versions of Nuitka actually did that: I had a C++ template that generated code which just happened to do the proper, compatible thing. But that is not how it is done now. We now have reformulations, and with these reformulations the with statement ends up as simpler constructs. We are going to see a few examples of that.
I am speaking very fast, and I try to be fast; the idea is also that you can ask your questions. So if you have a question, raise your hand and ask whenever you like.
Once these two phases are done, we go into optimization, which is basically an endless loop. When optimizing a Python program you cannot have a single-pass or two-pass approach, because after every optimization any other optimization may become feasible again. So it is a loop, but it finishes at some point. Then comes finalization, which just annotates the tree a bit more, and the final tree is handed to code generation, which fills the build directory we saw. None of that is very untypical. What is probably special is the reformulation step, which tries to make things simpler for the steps that follow.
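To make the idea of a reformulation concrete, here is a minimal sketch in plain Python of how a with statement can be expressed through simpler constructs (try, except, finally and explicit calls). It follows the expansion given in the Python language reference, not Nuitka's internal node tree, which the talk does not show. The names `Resource`, `with_statement` and `reformulated` are made up for this illustration.

```python
import sys


class Resource:
    """Tiny context manager that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not swallow exceptions


def with_statement(resource):
    # The construct as the programmer writes it.
    with resource as r:
        r.events.append("body")
    return resource.events


def reformulated(resource):
    # Hand-written expansion into simpler constructs, following the
    # semantics documented for the with statement.
    manager = resource
    enter = type(manager).__enter__
    exit_ = type(manager).__exit__
    value = enter(manager)
    hit_except = False
    try:
        r = value
        r.events.append("body")
    except BaseException:
        hit_except = True
        # __exit__ decides whether the exception is suppressed.
        if not exit_(manager, *sys.exc_info()):
            raise
    finally:
        if not hit_except:
            exit_(manager, None, None, None)
    return manager.events
```

Both functions leave the same trace of events on a fresh `Resource`. Once the statement is spelled out this way, an optimizer only has to understand try blocks, assignments and calls, and no with-specific logic is needed in later phases.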
Time for a demo. Here is a Python function. It has a nested function, it does local variable assignments, and then it makes this call, which can be inlined. You and I as humans can see what happens. The thing I am very proud of is that I now have variable tracing and SSA sufficiently strong to say that, on a global scale, Nuitka will be able to understand this sort of code and produce a simpler result.
It has a verbose mode, and here we get a look inside at what happens when it runs. There are try blocks, which is sort of surprising; they come from the reformulations. For example, this statement here secretly has try/finally semantics: if you get interrupted while unpacking, you have to release something. But static analysis finds out that the try blocks can be reduced. It finds out that the assignment to g can be propagated entirely and therefore be dropped, and the same for x and for y; the values are then actually propagated. Then, in line 9 here, we have a constant tuple, a constant result. We can replace the call to g with a direct call, and we can inline the function. We discover that previously there was a variable g, but now it is no longer used, so it is not assigned and not initialized anymore. So it is understood to be an uninitialized variable, and we can drop the releasing of it; had there been an exception in the Python function, it would have had to be released. That is all that is behind it. Then we propagate the inlined variables, the tuple and so on, remove all the try handlers, and ultimately we are done.
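The demo source itself is not captured in the transcript. As a rough illustration only, a hypothetical function of the shape described (a nested function `g`, local assignments to `x` and `y` via tuple unpacking, and a call that can be inlined) might look like the sketch below, together with the form it reduces to once all values are propagated. The function bodies are assumptions, not the code from the slide.

```python
def f():
    # Nested function: the assignment to the name g can be propagated
    # to its single use, so the variable g itself becomes unnecessary.
    def g(x, y):
        return x, y

    # Local assignments with constant values (tuple unpacking).
    x, y = 1, 2

    # A call with known callee and known arguments: a candidate for inlining.
    return g(x, y)


def f_reduced():
    # What remains after value propagation and inlining:
    # a single return of a constant tuple.
    return (1, 2)
```

Both functions return the same tuple; the difference is that `f` creates a function object, binds three local names and performs a call each time it runs, while `f_reduced` does none of that.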
For example, this is an even simpler program, but it will help me to make a better demonstration. Better in the sense that, although I started to have analysis for that years ago, it is not yet sufficient, so right now I cannot have tuple unpacking and still show the full reduction. When we run this, as you can see, it outputs a lot of findings. For easier debugging I have invented an XML representation of the node tree, and I use this to test that something is entirely optimized. As you can see here, we have a return statement with just a constant tuple. So this function f, all it does is a lot of churn around the notion of producing one constant. Obviously your code is not going to be like this, but it could be, if for example x were an input value of some sort and the rest was an already partially optimized function; there are reasons why these things make sense. So we got this. Any questions about this?
The question was whether it is storing the reduced Python code anywhere. Actually, a cool project idea that I have is to generate Python code from the reduced form. Right now I only generate C. I would love for somebody to take the internal representation after optimization and generate Python code from it, Python code that is just faster than the original Python code. But since we are making a Python compiler for a reason, I am going to C directly. Technically the final internal representation is not entirely Python anymore; as we will see in the reformulation part, while loops and for loops, for example, do not exist. So it is a reduced set, but it would be feasible to create Python code from it.
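The idea of optimizing a tree and then writing Python source back out can be tried on a very small scale with the standard library. The sketch below is a toy: it works on CPython's own `ast` module rather than Nuitka's internal representation, it only folds `+`, `-` and `*` on numeric constants, and it needs Python 3.9 or later for `ast.unparse`. The names `FoldConstants` and `reduce_source` are made up for this example.

```python
import ast
import operator

# Only a few safe operators; division is left out to avoid ZeroDivisionError.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldConstants(ast.NodeTransformer):
    """Replace binary operations on numeric constants with their value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _OPS.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def reduce_source(source):
    """Parse Python source, fold constants, and return new Python source."""
    tree = FoldConstants().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

For instance, `reduce_source("def f():\n    return (1 + 2) * 3\n")` yields a function whose body is a plain `return 9`. A real source-to-source optimizer would of course need the value tracing described in the talk, which this toy does not attempt.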
Maybe quickly here: because the XML is easily confusing, I removed the code that is not used.
I have opened it already, but here we go again. This is what the generated code looks like, for example. We have a local variable, the return value, initialized to nothing. Then we assign it the result, which is a prepared constant, and we go to the function return exit, which checks that there actually is a return value and then returns it. In Python terms, this is the most efficient kind of code that you can have. Obviously there is more to it.
We can also do that with Mercurial. This is something that already worked two years ago: it is passing the test suite of Mercurial. So we could compile it now, and actually I was doing it. It took about 30 to 45 minutes to compile all of Mercurial, which is a huge body of code. Right now Nuitka is not doing enough optimization and not discovering enough dead code, but half an hour is pretty OK on this laptop.
Generated code works like this; I will be quick. When I initially started out, I was aware that this is a very ambitious project. The only reason I even started was that C++11, the new C++ language standard, had so many cool new features that it convinced me code generation would be relatively simple. But the overlap between C++11 and Python turned out to be relatively small; C++ exceptions and the in-place operations of Python, for example, did not fit well. So I went to C++03, and then to C-ish C++, which is basically just C with some C++ elements but no classes anymore, and it is going to be C99 soon enough. I am going to skip something here. As an evolution: four years ago, the blue part of the plot was code generation. I had achieved something phenomenal, a compiler which was capable of integrating with all of Python and even made things faster, which was tremendous. But the other part is so small you can barely see it: optimization. There was basically only local peephole optimization four years ago. Two years ago I had gone to C++03, code generation had become a lot dumber, reformulations had started to appear, and optimization had grown. Right now code generation has become really stupid, and the optimization is doing the work. These reformulations create some overhead; they use temporary variables and so on, which I can now optimize away.
On availability: I have a high focus on correctness. It is available as a stable and a develop branch, and the develop branch is also better than other projects' stable releases, I contend. I also have a factory branch with a couple of things that are not finished yet; for example, the inlining is right now on the factory branch. Lots of people are already using Nuitka. This is my most important slide: I want you to join the project. We have a guide.
One thing I have covered for correctness is the oracle of Delphi. The oracle means I can use CPython for comparison, so testing for correctness is a dream; it is very easy. For performance it is much harder. It is a race, and I have ideas. What I would like is for you to help me come up with and develop a tool that gives the user feedback on performance. Because if you compile your code, it may not be faster at all, it may be slower, and we would not know why. There is no feedback; there is no idea which function is faster and by how much. I need somebody with an interest to help me out with this. This is the most important thing I meant to say. I will leave the rest of the time, still ten minutes, for questions, if you have any.
The question is which language constructs, in my opinion, were making code generation hard. I think that, technically, once we are able to inline classes, they will not be an issue. Classes and instances need a lot of babysitting, especially in the C API, to be correct, so that was an issue. I had a huge amount of difficulty with in-place operations and with exceptions; exceptions totally blew my mind. And yes, reference counting is no fun, which is why I develop a compiler, so you do not have to write C code.
Next question. The question was how I handle that Python does not have types. Right now Nuitka is basically using no type information.
In the future we will be able to understand, for example, ctypes, and then make direct calls. But right now everything is an object, be it a list, a string or an integer, and I am not using that knowledge yet. I will start to: now that I have this tracing capability, I can produce proper traces of types, and I will be able to make optimizations dedicated to types. But I do not have that yet. So I am integrating with CPython and its PyObject; it is standard Python, like a hand-written C extension. You have a question?
I tried various C compilers. Obviously on the C level there is always something to gain. I find the Microsoft compiler to be terrible on the code I generate, and others would probably be better. But technically Nuitka should be in a position to understand the Python, and in the example that I showed you we gain an order of magnitude in performance by inlining a function. I do not have a slide that shows it, but if I just avoid a function call in Python, I can have speedups of that order, and on top of that the C compiler can only add a few percent.
There is something called standalone mode. I am barely mentioning it because I am not that interested in it, but it means that Nuitka will also bundle everything together and allow distribution to other machines. That is something people expect from a compiler: to be able to take the result to another machine. My interest is mostly in acceleration, and I am solving this on the side, but it is one of the biggest features. In accelerated mode it is at the point where I am surprised if something does not work. For standalone, I am not surprised if something does not work, because it is very hairy with extension modules.
This means you end up with a binary which contains the Python implementation. It is larger, but I do not think that is anywhere near as important an issue. It is larger, but these are still small binaries.
We can have a look at hg.exe: it is about 31 megabytes. Now, to confuse users, which is a very important part of the project: suppose the program you compile is named hg, what is going to be the resulting name without overwriting it? I want to put the result alongside, and I have made good experience with ".exe"; it is relatively rare that a Python program exists with that name. But I am getting questions about it, so please take note: this one will run on Linux despite its name, and you can rename it and it will still work. So if you despise the name, you can change it.
Using stack memory allocation? Yes, that would definitely be something I would do. The implementation does not do it currently, but if I know the size, if I know enough, I will use that.
The question is whether I take advantage of that kind of thing. I tried to do this, but it is a very marginal gain. The real direction must be to avoid Python objects whenever we can, and we will see how far that gets us.
He is asking about standalone distributions, if you want to copy the result to another machine. Basically, due to the incompleteness of dead code removal, it includes the standard library and all of the libraries that the standard library uses, so you end up with a large set. I do not think the distribution is huge, but it is sizeable.
Yes, I do not have real-world benchmarks. I have ideas about doing my benchmarking in a way that gives me ticks, so I do not have to run things many, many times. I want your help to create tools which will run the program in Python, run the program in Nuitka, and compare them, highlighting what parts are fast and slow, so that you as a user can get an image of how much gain there is in your program. It should be simple enough to just run a program under another tool and get a report. It is all well and good to have benchmarks with synthetic code, but I would actually like to empower the developers of Nuitka and the users. I do not know if we can make this tool. Right now there is nothing, basically; I just have random numbers for some things. I am working very hard on getting somewhere, but like I said, I have time, and I am in no panic to be fast. Right now I am only starting to wonder about actual performance, so this is not yet where I want it to be.
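The comparison tool asked for here does not exist in the talk. As a very small starting point, the sketch below runs the same program two ways, uses the first run as the oracle for correctness, and reports wall-clock time for both. It is an assumption-laden illustration: `run_timed` and `compare` are made-up names, the compiled binary path in the usage note is hypothetical, and a single wall-clock run is far too noisy for real benchmarking.

```python
import subprocess
import sys
import time


def run_timed(command):
    """Run a command, returning (exit code, stdout, elapsed seconds)."""
    start = time.perf_counter()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return result.returncode, result.stdout, elapsed


def compare(reference_cmd, candidate_cmd):
    """Use the reference run as the oracle and time both runs."""
    ref_code, ref_out, ref_time = run_timed(reference_cmd)
    cand_code, cand_out, cand_time = run_timed(candidate_cmd)
    return {
        "same_output": (ref_code, ref_out) == (cand_code, cand_out),
        "reference_seconds": ref_time,
        "candidate_seconds": cand_time,
    }
```

A call such as `compare([sys.executable, "program.py"], ["./program.exe"])` would compare the interpreted program with a compiled binary of that hypothetical name. The tool described in the talk goes further, since it should attribute the difference to individual functions.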