
Cython to speed up your Python code


Formal Metadata

Title
Cython to speed up your Python code
Title of Series
Number of Parts
132
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Cython is not only a very fast and comfortable way to talk to native code and libraries, it is also a widely used tool for speeding up Python code. The Cython compiler translates Python code to C or C++ code, and applies many static optimisations that make Python code run visibly faster than in the interpreter. But even better, it supports static type annotations that allow direct use of C/C++ data types and functions, which the compiler uses to convert and optimise the code into fast, native C. The tight integration of all three languages, Python, C and C++, makes it possible to freely mix Python features like generators and comprehensions with C/C++ features like native data types, pointer arithmetic or manually tuned memory management in the same code. This talk by a core developer introduces the Cython compiler by interactive code examples, and shows how you can use it to speed up your Python code. You will learn how you can profile a Python module and use Cython to compile and optimise it into a fast binary extension module. All of that, without losing the ability to run it through common development tools like static analysers or coverage test tools.
Transcript: English(auto-generated)
So, welcome to my talk, it's about Cython, as the title suggests. The thing is, I regularly speak at Python conferences here and there, and tell people about Cython, teach them a bit here and there. I had a tutorial on Tuesday here. Don't know if anyone attended it.
Yeah, still a couple people, cool. So, question now to the audience. I can present a couple of different things, please, depending on what you want. Who's interested in a general introduction to Cython, understanding a bit what it is, who's never used it and wants to know what it is?
That is quite a number of people. So the actual topic that I proposed for today was more like, I have some Python code, and I want to optimize it without losing the ability to run it in Python. So, compile it, speed it up. That's a bit of a less general topic.
Who's interested in that? Who came for that? That's about the same number of people. Wonderful. Okay, so I'll just jump back and forth between whatever I can tell you. So, yeah, hi, I'm Stefan Behnel. Quick intro about myself.
I'm a software or data engineer, if you want. I'm also a trainer, so I give Cython trainings here and there. If you want me in-house, just talk to me, and I'll come over and teach your people there. I've been using Cython since 2002. I'm a Cython core developer.
Ever since we forked the project from an existing project at the time. Who knows Pyrex, has heard of the name before? Yeah, about a dozen people. Yeah, wonderful. That's what we forked, and Pyrex has basically not been updated much since then.
So we've pretty much taken over the project, and everyone's using Cython these days. But thanks, Greg, for writing Pyrex, because it was a wonderful tool to build Cython on. So I do trainings, I do consulting, and I actually joined TrustYou,
spawned at this conference last year. And as motivating example a bit, I'll tell you what we do. So if you look for a hotel in Google, you'll see something like this, right? You'll get a lot of information about the hotel, and if you click around a bit,
you'll also see that some of this information was actually provided by TrustYou. So we know what a hotel is, and how good a hotel is. How do we do that? Well, we crawl the web, and we have partners, and we read hotel reviews by actual people
all around the world, collect them, at a rate of about three million hotel reviews every week. And then we do text analysis on them, we do data processing on them, and we do that in lots of different languages, including Spanish, Thai, Mandarin, Japanese, lots of languages.
And we build this kind of information from it, so we summarize all this. This is a meta review, so just a summarization of everything that 45,000 people have been saying about a hotel in this case, and we can tell you what's special about the hotel, what people like, what people dislike,
all of these things. You can actually go to trustyou.com, enter a hotel name there, and see this information about any hotel you want, pretty much all over the world. So at TrustYou, it's Python, it's data. We do all of the data processing in Python, and we use the usual tools that you would expect,
we use NumPy, we use SciPy, scikit-learn, pandas, we use spaCy to a certain extent, lxml for the data extraction from the web. And many of these tools actually use Cython, are implemented in Cython,
some of them completely, others mostly, and they're part of the Python data ecosystem. So why is that an ecosystem? Well it's an ecosystem because it's integrated. Everything works nicely together, it chains together, and a big part of that came from the fact
that people developed NumPy as basically a data integration layer that all these tools could use, could put their data into, could use to share their data across different libraries, even pass the data on into C libraries, into native code to process this data.
So NumPy integrates the data layer, and you can say that Cython is the way to integrate the code layer, because what does Cython give you? It allows you to talk to native code, it allows you to use all these tons of native libraries, C libraries, Fortran libraries out there,
connect them to Python, use them from Python, integrate these libraries from your Python runtime. So when NumPy integrates the data, Cython integrates the code. So what is Cython good for? Well it integrates native code into Python, that's one use case. Many people use it to speed up Python code
by compiling it, compiling it into native code, and surprisingly many people actually use it to write C code without having to write C code, because writing C code is actually hard, writing Python code is much more fun,
and Cython allows you to write Python code that translates to C, so writing C without writing C. So we write the C code so you don't have to. Our topic for today is speeding up Python code, so I'll kind of focus on that.
Why would you use Cython in general? It's a very pragmatic programming language. It's actually a programming language, so it extends Python, but it is Python: you can take arbitrary Python code, drop it into Cython, compile it, and it usually runs faster. But you can also use extended syntax or type annotations in there to speed up your code,
to tell the compiler how to optimize your code beyond what the Python language allows. So it's an optimizing compiler, and it's actually very production proven. I mean, I showed you a couple of tools that use it; it's used to process, I don't know,
like terabytes, maybe petabytes of data out there in all sorts of Python libraries, it is really widely used, and the cool thing about the language is that it's really about getting things done in the same way that Python is, it keeps your focus on the functionality rather than having to look into all the boilerplate
that you would need to talk to a C library or to speed up your code, to optimize it. If you optimize code by rewriting it in a native language, in C or C++, it gets really messy. It's way more difficult than taking your existing Python code,
your tested Python code, and just making it faster rather than rewriting it. So the main property of the language is that it allows you to freely move between Python and C or C++. So it takes Python code,
but it allows you to mix in C data types or C++ data types right in your Python code, okay? And that's the way for the compiler to decide that you as a user are actually opting out of Python semantics, Python object operations, and saying, you know, all I really need here
is a calculation of C doubles, right? Native data types, be fast, I don't care about object operations anywhere, I don't wanna pass around stuff through Python and so just make it fast, compute it as fast as you can. And the language makes it very easy. So it allows you to write code that's as Pythonic as you want,
but as low level as you need. Okay, first demo. Does everyone know the Jupyter Notebook? Who does not know it? Nice, it's just, you know, Jupyter's just wonderful. It's a wonderful piece of software. It's a very interactive way to do programming,
even data analysis. Lots of people use it to play around with data, visualize it in their browser. And one cool thing about Jupyter is that it does not only support Python, it supports lots of languages. Last time I checked, which was years ago,
there were already a dozen more languages. It's probably way more now, and one of those languages is Cython. All I have to do in Jupyter to make it support Cython is %load_ext Cython, and I should restart my kernel to show you that it actually works properly,
and there you go, the output, okay. And then just a quick overview of what I'm using here. The latest Python release, the latest unreleased Cython, yeah, I'm a core developer, I can take the risk of using an unreleased Cython version in the talk.
Some NumPy version, GCC 7, and an important bit to understand is, you know, in Python, when you write Python code, you write a module, you save it, you say import blah, and it just, you know, imports and runs. In Cython... [audience question] Is it an alpha version? Or do you mean 0.29?
Yeah, it's 0.29. Well, it's production proven, but we still say it's 0.something because,
so at some point ages ago, we said that version 1.0 should be the version that runs arbitrary Python code, and we still have one or two bugs in there that, well, we have to say, well, we have to fix those first before we can say that we can compile
any Python code out there. And so it's still at 0.something, and we haven't reconsidered the goal for 1.0. I think we should, because no one cares about full Python compatibility, actually, it's, you know, it works, it's perfect, it's wonderful, and it's just, yeah,
it's 0.something, yeah. Okay, I considered doing it like Chrome and Firefox and just saying, you know, it's actually not 0.29,
it's 29.something. There you go. Okay, anyway. Quick intro. So here's a bit of Python code. I say from math import sin, so I'm using the sine function from the Python math module.
So I calculate sine of five, and it says, well, that's about minus one, more or less. So it works. Now, in order to do that with Cython, I can just add this line here, which instructs Jupyter to compile this cell in Cython
instead of just, you know, pushing it into Python and running it there. It compiles it in Cython and builds an extension module from that, which it then imports. Okay, so it builds a shared library, and that is also something to keep in mind with Cython. You step away from the simple cycle of write code,
import, try, to write code, compile, import, try. Okay, so there's a build step involved from that point on. It's usually worth it. Okay, so run it. It's actually pretty quick, apparently. I think it was pre-compiled before.
So Jupyter caches these cells if it knows that they didn't change. You see a little change in here. What I did before, up there, is I just said sine of five, and for Jupyter, what Jupyter does is, you know, it takes the last expression,
the last value that fell out of your cell and displays it. For a Cython-compiled cell that does not work, why? It's an extension module, you know? It's native code, it's external. Jupyter can't look into it. It doesn't know anything that falls out of it. There is actually nothing falling out of it. Just, you know, it gets executed, it gets imported,
and it's imported as a module, but there's no value that comes out of the runtime. So if I just ran this, it would display nothing. I have to be explicit then and say, print this. Okay, one difference. This is still, you know, it's a bit boring, right?
I'm doing the same thing. I'm taking the Python function and calling it. It doesn't make a difference if I call a Python function in compiled code or in pure Python code. It's pretty much the same thing. The nice thing about Cython now is I'm not limited to using Python things. Cython takes my code, takes my Python code,
compiles it to C, okay? So what I can do now is I can start using C things because, you know, my code ends up in C anyway, and doing C stuff is a totally natural thing to do from C. So instead of importing the math sin function,
I do a static import, and this is a syntax extension here in Cython code. So this is no longer Python code, it's Cython code now. I say cimport from libc.math, and that gives me the math header file from libc. Okay, and now I can use the C sin function.
Yeah, okay. What I'm doing here is I'm assigning it to a Python variable in my module, compiling that, executing it, and the question now is what happens when I assign a C function to a Python variable? It becomes a callable, right, a Python callable.
Because it's kind of obvious what this should do, right? I mean, I have a function here, Cython can see the function's signature, and I want it to be a Python thing, so the most obvious thing to do is convert it into a Python callable. What Cython does is it generates code for me
that wraps the C callable in a Python callable object. I can now call it directly. Okay. You'll see that in a minute. Okay. So, I have a Python callable now,
and when I call it from Jupyter, it outputs the same thing as before. Okay, this is kind of the quickest way to wrap a C function that has a Python-like signature. sin is simple, it just gets a double in, a double out, C double in, C double out. That means from the Python side, you can pass in any float object
and get a float object back out. Okay. What this basically does internally is this, spelled out, so this is a long version of this, right? I'm writing a Python function, which is called C sin and takes a C double, another extension
that I can use in Cython now. I can type my arguments. I could just write this, fine. I would do that in Python. In Cython, I can be explicit here and say, what I actually want there is a C double. Okay, whatever you put in as an argument should be converted into a C double,
and Cython will generate the conversion code for me. So, what I get here is a Python callable function, which takes an argument, converts it into a C double. I get a type error if that's not possible. So, if I put in a string, we get a type error, obviously. And then, inside of my function, I just call the C sin function.
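The conversion behaviour described here can also be observed from plain Python, because `math.sin` is itself a C function that takes a C double. The following is only an illustrative sketch in plain Python, not the Cython code shown in the talk; the helper name `describe_call` is made up for this example.

```python
import math

def describe_call(value):
    """Call math.sin and return either the result or the conversion error text."""
    try:
        # math.sin converts its argument to a C double before calling C sin().
        return math.sin(value)
    except TypeError as exc:
        # Raised when the argument cannot be converted to a C double.
        return "TypeError: %s" % exc

print(describe_call(5))     # an int is converted to a C double first
print(describe_call(5.0))   # a float is passed through directly
print(describe_call("5"))   # a string cannot be converted
```

The point of the sketch is only that the type check happens at the boundary between Python objects and C values, before the C function runs.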
That's what you asked for, right? And I can do that from Jupyter again, call my Python function, and it runs the sin function internally. Okay. Nice feature of Cython.
When I say cython -a, it generates the C code for me, but it additionally generates a little HTML snippet for me that shows me how the compiler understood my code. So, this is a copy of my code, and it has information that I can use
to understand what came of my code. It helps me in optimizing my code. And when I click on one of these lines, I can actually see what C code Cython generated for this line, okay? This is the function signature. It's pretty involved because it has to do error checking,
it has to do argument conversion, it has to do lots of things in there. It has to register a function, callable in the module argument and all that. So, this is what comes out of this line, and this line is actually, the second line is actually more interesting. I can see that there's an X argument going in. So, we are just, you know, we are mangling the argument names
when we're generating C code to avoid naming collisions. This is basically the variable x that I put in. I call the C sin function on it, and you can see that it's really a straight C call to C sin, and then since this is a Python function, the end result has to be passed back into Python
to my Python caller, and for that it converts it from a C double to a Python float, and this is what this C API call is doing here, okay? So, this is a nice way for me to, you know, for me as a, as a Cython user to understand what Cython is doing with my code,
how it's interpreting it, or what becomes of my code. You can see that there are a couple of yellow lines in here, and those yellow lines give me additional information that tell me how many object operations there are in each of these lines, which is interesting when I want to start
optimizing my code, when I want to drop it into C and make it faster there, because every yellow line tells me there's some object operation going on, there's some exception handling going on, some error checking in that, so anything that uses interaction with the CPython runtime, and if my goal is to convert my code into fast C,
not using objects, but using native data types, then any yellow line is probably worth looking at, okay? And the darker the yellow, the more object operations, object interaction there is going on, which is kind of obvious in this case, I mean, there's a signature,
so like type conversions, error checking, exception handling, stuff like that, lots of interaction going on there, and this line is really just converting the end result of a C call back into a Python object, so there are fewer operations going on here. Okay. Now, what makes, and this is just a different way, basically,
of spelling the initial example that I had, right? Wrapping a C function, that's it. What makes it more interesting, then, is in Cython, once I have a wrapper function here, I can move more code below the Python level, I can take functionality from Python, right?
And make my wrapper thicker. I can put functionality into the wrapper between the C library that I'm talking to, and the Python API that I'm providing to Python users. So I can make my wrapper more intelligent, smarter, better looking,
more Pythonic for a Python user, by putting more functionality at the lower level, and hiding all the little dirty quirks in the C API that this Python user shouldn't have to bother with. Okay? Simple example here, instead of calculating, just calling the sine function,
I call sine of x squared, right? I could do that in Python, I could use my wrapped sine function, and do x squared, pass it into the sine function, and then we'd calculate the sine of that, but doing that in C is just much faster, right? I push down the x, it calculates x squared,
sine of x squared, and then passes back that result, much faster than doing half of that in Python. I can do that here, and it runs. Okay. Just two little things. Since we're in C now, I can use manual memory handling in C.
This is what that would look like. I have malloc and free, so I can allocate some memory. I mean, in Python, memory handling is completely automatic, right? You have objects, you have references to them, and when the last reference to an object goes away, then the object just dies and gets collected.
That is not the case in C, like totally not. C is completely manual. And so, in C, I would say, you know, allocate some memory here. If that fails, I get a NULL pointer back, and I just say, raise MemoryError. You wouldn't really do that in C, right?
What would you do in C? Yeah, you would return some error code, right? Say, return minus one, or return something that tells the caller, you know, something went wrong and you have to handle it. In Cython, well, Cython is Python, right? It integrates with Python, so you can just say, you know, raise MemoryError.
Raise an exception from your Cython code. Right, totally normal. And this is totally what you would do. Well, if the allocation worked, then I can use my memory for something, and we have a couple of nice features in the language that make array operations, pointer operations, a bit more, you know, handy and Pythonic
than they would be in C. So, the question was, are those bounds-checked? No, because in C this is just memory. A memory pointer is one place in memory, and here, I'm actually explicitly telling Cython, you know, to assign something to the first two entries,
which is still much nicer than, you know, in C. But pointers are just pointers, they're not arrays. Arrays are actually bounds-checked. Okay, so I can do this, and there's a couple of nice things. I'll say, just, you know, print.
If anything goes wrong here, then I definitely want to clean up my allocated memory. So I want to free the memory, and I do that with try-finally, as I would in Python. So if print, for example, raises an exception, can't print, no standard out anymore, for example, we're closed already,
anything can go wrong here, I'm doing a Python thing. So this can raise an exception, and it just, you know, if that happens, I say, you know, finally, still make sure you free my memory, regardless of what happens. Nice feature. Okay, quick example for calling external code.
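The cleanup pattern just described (allocate, raise MemoryError on failure, release in a finally block) can be sketched in plain Python. This is not the Cython code from the talk: `FakeAllocator` is a hypothetical stand-in for C's malloc and free, used here only so that the control flow can be run and checked.

```python
class FakeAllocator:
    """Hypothetical stand-in for malloc/free that counts live blocks."""

    def __init__(self):
        self.live = 0

    def malloc(self, n):
        self.live += 1
        return [0.0] * n

    def free(self, block):
        self.live -= 1


def use_memory(alloc, fail=False):
    mem = alloc.malloc(20)
    if mem is None:
        # In the Cython version this is the NULL check after malloc().
        raise MemoryError()
    try:
        mem[0] = 1.0
        mem[1] = 2.0
        if fail:
            # Simulates print() failing, e.g. because stdout was closed.
            raise OSError("cannot print")
        return mem[0] + mem[1]
    finally:
        # Runs on success and on failure, so the block is always released.
        alloc.free(mem)


alloc = FakeAllocator()
print(use_memory(alloc), alloc.live)
```

Whatever happens inside the try block, the allocator's live count returns to zero, which is the guarantee the try-finally gives for manually managed memory.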
You've seen the libc sin function that I called. Here's an example for calling an external library that is not just, you know, there, like libc, but really an external library. This is Lua. Who knows what Lua is? Who has used it before?
Okay, cool. So Lua is little, it's really a small language, right? It's a language that people commonly use for embedding in C or C++ projects, but you can obviously also embed it in Python, why not? So here's a little embedding for Lua in Python.
First thing I have to do is, I have to tell Cython what the C API that I'm using here, the Lua C API, looks like. Previously, you may remember that I said, you know, cimport from libc.math. What that does is, it looks up an external declaration file and finds the libc math declarations in there.
Here, I'm doing that inline in a module, and this is the syntax for it. I'm basically just saying, you know, there's an external declaration, it describes external code, which comes from the lua.h file, so that's the header file that describes the Lua API, and it has a couple things in there: there's a struct in there, there's a function for creating a Lua runtime,
for cleaning things up, for loading code into the Lua runtime, and so on and so forth, a couple functions, and note that I did not copy the complete, vast Lua C API in there, I just copied the functions that I need,
because that is all I need in my code. Remember that Cython translates my code to C, right? And then there's a C compiler afterwards that compiles that to a shared library. And the C compiler obviously sees the whole thing, right? But Cython doesn't need to know about the whole thing, it just needs to know what I'm using
in order to generate proper C code that the C compiler can compile and understand, okay? Those are very nice, so it's really just a dozen functions here that allow me to execute Lua code from Python. So what's my interface? I'm defining a Python function that takes Lua code as a string,
and I'm doing a couple things, blah, blah, here. It might be a Unicode string. In Python 3, it's quite likely going to be a Unicode string. In that case, I have to convert it, because C doesn't know Unicode, it only knows byte strings, and that's how the Lua API works also,
so I convert it to UTF-8 to be able to execute it there. Then I create a new Lua runtime. If that fails, I just say raise MemoryError, because that's what the documentation tells me. If the creation of the Lua runtime fails, the documentation says, well, it's probably a memory problem, right?
Couldn't be allocated. So raise MemoryError is the right thing. And then whatever happens, I use a try-finally to make sure I close down my Lua runtime afterwards and clean it up. That's what I'm doing down here in my finally. So I'm cleaning up the Lua stack and closing the Lua runtime to clean everything up, to release the memory.
And then I have a function to pass in the Lua code into Lua. If that fails, I raise a SyntaxError. Well, probably wrong syntax. Then I can execute my code using the C API function and I get some result back. I'm a bit lazy here, I'm just expecting some number back.
I could look at what the result is and properly convert it to some corresponding Python type, but here I'm only expecting numbers. One number is relevant, that's it. There's a function for converting it to a C number, and then when I return that from my Python function, Cython will see, okay, that's a C number and you want to return it from a Python function, so it returns it as a Python object.
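The string handling step mentioned above (accept either a Unicode string or bytes, and hand UTF-8 encoded bytes to the C API) might look roughly like this in plain Python. The helper name `to_c_bytes` is an assumption made for this sketch and does not come from the talk.

```python
def to_c_bytes(code):
    """Return UTF-8 encoded bytes for a piece of source code.

    Hypothetical helper: text is encoded, bytes are passed through unchanged.
    """
    if isinstance(code, str):
        return code.encode("utf-8")
    return bytes(code)


print(to_c_bytes("return 1 + 1"))
print(to_c_bytes(b"return 1 + 1"))
```

Doing the conversion once at the Python-facing boundary keeps the rest of the wrapper working purely on byte strings, which is what a C API expects.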
So I compile this. Here's a couple of lines of Lua code. Recursive Fibonacci, the usual stupid benchmark. And when I run %timeit on that, it's going to tell me that Fibonacci of 24 could be calculated in 200 seconds, a bit more.
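For comparison, the same kind of micro-benchmark can be written in pure Python and timed with the standard `timeit` module. This is a sketch, not the Lua code from the talk, and the definition of `fib` used here (with fib(0) = 0 and fib(1) = 1) is an assumption.

```python
import timeit

def fib(n):
    """Naive recursive Fibonacci, deliberately inefficient."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

runs = 3
# Average wall-clock time per call over a few runs.
seconds = timeit.timeit(lambda: fib(24), number=runs) / runs
print("fib(24) =", fib(24))
print("average seconds per call: %.4f" % seconds)
```

The absolute timing depends entirely on the machine and the Python version, so no number is quoted here.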
Okay? This is how you use C libraries from Cython. Okay. Still have 10 minutes. So, I'm gonna get back here. Second example. Optimizing Python code.
I chose difflib. Who knows difflib from the standard library? Not so many people. You should look through the standard library documentation. There's lots of nice goodies in there. So what difflib does is you can pass two sequences in there and it's gonna compare them and tell you where the difference is. It's just like the Unix diff but at an API level.
You can use it directly from Python. It's been there for a long time. You know, it's in the standard library. Most things in the standard library have been there for a long time. So I'm going to optimize difflib a bit. Okay. So as a benchmark, I'm using a tool called FuzzyWuzzy
which comes with a little benchmark. That's why I'm using it. FuzzyWuzzy basically just does fuzzy comparison between texts. Okay? So you pass in two texts and it's gonna do fuzzy comparison and try to match the parts. And it's not really relevant. I'm just using it as a benchmark here.
And I'm using cProfile for profiling, to see how fast my program's running and where the time goes. Okay. So let's look at difflib. Oh, let's run it first. Okay. I'll just clean up my stuff
and then run the benchmark. So that's taking a couple of seconds. It's running multiple examples. The timing, so the benchmark, is just the time it takes to run the whole benchmark for FuzzyWuzzy. And so you'll see in a minute that it's dominated
by the time it takes difflib to do the comparisons. Takes a bit of time.
One thing you shouldn't forget when running benchmarks is to switch off the power management of your laptop. It's probably okay in this case because it's been doing computations before and so on. Probably not slowed down much. Yeah, it's about what I expected. So 46 seconds for the whole runtime.
Okay. Now the next thing I'm gonna do is run cProfile on the benchmark to see where the time goes. So I'm gonna run it again. It's gonna take longer to run. Who does not know cProfile?
Okay, a couple of people. So what cProfile does is basically it traces your code. On each function call it records which function is being called and which function has been exited, and then afterwards it's going to dump the profile somewhere
and it's going to tell you what were the functions in there that took most of the time. And where was most of the time spent and what was less relevant. So it's a very quick way of assessing the runtime profile of your code. And pinpointing the place
that you want to optimize first. Okay, again it takes a while. Obviously it takes longer now because as I said it instruments your code, right? So it's really doing stuff while your code is running.
There are profilers out there that slow down your code much less and it's really worth looking into them. One is the perf tool in Linux. So if you have Linux, definitely look at perf. But most of these tools operate at a much lower level.
So they give you C level information rather than Python level information. And cProfile really just gives you Python level information. So I've executed lots of function calls. How many are there? 153 million function calls along the way. Took 78 seconds to run now.
And the most costly function here is find_longest_match in difflib. And it took 25 seconds of runtime all by itself. Plus it took an accumulated 34 seconds
if you add its own function calls, its internal function calls, to it. So the whole time spent in that function was 34 seconds. That's the function plus the code that it calls.
And 25 seconds only directly in the function itself. So that's a lot of processing time. So I'll look into that function. And I have it somewhere here. Can you read this? Is this large enough? Larger? Okay. Like this? Better?
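The difference between a function's own time and its accumulated time can be reproduced with a tiny cProfile run. This is only a sketch: the functions inner and outer are hypothetical stand-ins, not the benchmark from the talk. In the printed report, the tottime column is the time spent directly in a function, and cumtime also includes the functions it calls.

```python
import cProfile
import io
import pstats


def inner(n):
    # Does the actual work, so it accumulates most of the "tottime".
    return sum(i * i for i in range(n))


def outer(n):
    # Does almost nothing itself, but its "cumtime" includes every inner() call.
    return [inner(n) for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
outer(2000)
profiler.disable()

# Render the collected statistics into a string, sorted by own time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats()
report = stream.getvalue()
print(report)
```

Sorting by "cumulative" instead of "tottime" lists the functions by accumulated time.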
Okay. So this is the find_longest_match function. It's actually a method, which makes it a bit more difficult to optimize, but I'll just do a couple of quick speedups here. And it basically starts here. You can see someone's already tried to optimize
this function quite a while ago apparently. This is one of the standard tricks that you would pull in Python. It's a bit of an ugly trick. Normally what you would write is a in b. And what they do here is they take the bound method, b's __contains__, and call that with a.
Okay, so replace the operator by a function call. Apparently that was faster at the time. I actually tested it and it's still a bit faster in 3.7. But it's something that you don't need in Cython because operators in Cython are actually much faster. So one thing I would do is remove that.
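The trick can be sketched in a few lines. The names junk and is_junk are made up for this illustration. Both spellings give the same answers; the only difference is that the bound method is looked up once and then called like a function.

```python
# A made-up container to test membership against.
junk = {" ", "\t"}

# The "ugly trick": fetch the bound method once, outside the loop.
is_junk = junk.__contains__

text = "a b\tc"

# Membership test through the cached bound method ...
via_method = [is_junk(ch) for ch in text]

# ... and through the plain operator, which is what you would normally write.
via_operator = [ch in junk for ch in text]

print(via_method == via_operator)
```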
But there are actually quicker gains to make here. And I'll start with those. When you look at the code, what it does is it has a nested loop and runs over a couple of data structures. You don't need to understand what those data structures actually do. There's a dict involved somewhere that keeps track of stuff. Dicts and lists, and it's doing things here and there.
But that's a nested loop. A nested loop always means that there's a lot of work going into looping and to doing stuff there. And loops are very easy to speed up in Cython because you can replace them by C loops and do stuff directly in C rather than running through Python's iteration protocol.
So I'll do that first. How do I do that? Well, this is using a loop over a range. So I know that these are actually integers. And I can replace them by C integers. Okay? Most obvious change first.
And when I look into the usage of I, then I can see that it's used to index into a data structure. Well, it's used as key in the dict and it's also used as index somewhere. And that means that, so a difference between C integers and Python integers
is that Python integers are unbounded. Right, they have an arbitrary value range. They can be as large as your memory allows. C integers are not. C integers are always range bounded. So there's the 32-bit integer, which can take numbers up to two to the 31, stuff like that.
If it's signed. And when you're converting Python code to Cython now, to faster C code, you have to take care of that, right? So I'm replacing Python integers, safe Python integers which can grow arbitrarily large, by bounded C integers.
And I have to make sure that I'm not restricting the value range in an invalid way, because C integers wrap around and do weird stuff when they go out of bounds. But there is a C integer type that I can use here. And that is, it's called ssize_t, or size_t if you want.
Which is defined to be large enough to fit the size of memory. So on a 32-bit system, it's 32 bits. On a 64-bit system, it's 64 bits and so on and so forth. And I'll type all variables in here using that type.
I'll actually use a different type. It's called Py_ssize_t. But that's just a different name for it. That's used in the Python world. I'll use that for my integer variables and I'll type them in Python annotation syntax.
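The bounded value range can be demonstrated from plain Python. This sketch uses the standard ctypes module purely as a stand-in for C integer types; it is not what Cython does internally. ctypes performs no overflow checking, so an out-of-range value simply wraps around.

```python
import ctypes

# A Python integer just beyond the range of a signed 32-bit C integer.
too_big = 2**31

# Storing it in a signed 32-bit C integer wraps around to a negative number.
wrapped = ctypes.c_int32(too_big).value
print(wrapped)

# ssize_t is as wide as a pointer on common platforms:
# 32 bits on a 32-bit system, 64 bits on a 64-bit system.
ssize_bits = ctypes.sizeof(ctypes.c_ssize_t) * 8
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8
print(ssize_bits, pointer_bits)
```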
So I have an i, I have a j as an index. I have, what else do I have? alo, ahi. So they're all index variables here. And there is blo and bhi when I look through my code. Okay, those are all the index variables that are being used here.
And there's also bestsize, which is also just an index. besti and bestj. Okay, yeah, all of them. besti and bestj, okay. Yep, I'm through with my 40 minutes. I'll just go a bit into the question time and then run this there.
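What typing the index variables in Python annotation syntax might look like is sketched below. This is a simplified, standalone version of the nested loop and not the actual difflib method: junk handling is left out, build_b2j is a hypothetical helper, and the parameters are left untyped so that the file also runs when Cython is not installed. Plain Python does not evaluate annotations on local variables, so the annotations only take effect when the module is compiled with Cython.

```python
try:
    import cython  # provides cython.Py_ssize_t when Cython is installed
except ImportError:
    cython = None  # fine in plain Python: local annotations are not evaluated


def build_b2j(b):
    # Hypothetical helper: map each element of b to the indices where it occurs.
    b2j = {}
    for j, elt in enumerate(b):
        b2j.setdefault(elt, []).append(j)
    return b2j


def find_longest_match(a, b, b2j, alo, ahi, blo, bhi):
    # Index variables declared as C integers for the Cython compiler.
    i: cython.Py_ssize_t
    j: cython.Py_ssize_t
    k: cython.Py_ssize_t
    besti: cython.Py_ssize_t = alo
    bestj: cython.Py_ssize_t = blo
    bestsize: cython.Py_ssize_t = 0

    j2len = {}  # length of the match ending at a[i-1], b[j]
    for i in range(alo, ahi):
        newj2len = {}
        for j in b2j.get(a[i], ()):
            if j < blo:
                continue
            if j >= bhi:
                break
            k = newj2len[j] = j2len.get(j - 1, 0) + 1
            if k > bestsize:
                besti, bestj, bestsize = i - k + 1, j - k + 1, k
        j2len = newj2len
    return besti, bestj, bestsize


a, b = "abxcd", "abcd"
result = find_longest_match(a, b, build_b2j(b), 0, len(a), 0, len(b))
print(result)
```

The result is a tuple (start in a, start in b, length) for the longest common block.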
So I'll first compile it. I'm using cythonize here. So, compile my difflib. What it does is it just generated a C file for me. Now it's calling the C compiler. So cythonize is a nice tool for doing it all in one. Right, just say cythonize my code. cythonize -i builds in place.
Compiles it, generates the shared library for me and you can just import it. And now when I run a benchmark again, take a while, it's gonna be a lot faster already. And while it's running, I'll look into the code again.
And just tell you that there are a couple of more things that we can do. One I already mentioned is I can replace this Python hack here by just the expected operator. So undo a Python optimization that is, you know, that you can do but that is not beautiful.
And a couple of other things are I'm using dicts here, for example, dicts and lists here. I'm calling the get method of a dict, which in Python is looked up every time. In Cython, that's actually very fast.
If Cython knows that b2j is a dict. And so if I type b2j as dict, which is the next thing I can do. Is that actually, that's probably an argument.
So b2j is a dict. Python dicts. Then that's also going to be sped up. And you can see we're down from 46 seconds to 35 seconds just by typing the integers.
And one thing that that did is it allowed Cython to replace this integer range loop by a C for loop. Okay, because all the index variables are typed, so it knows that, you know, there's nothing that can go wrong. I'm actually asking for a C for loop here.
Okay, and I'll try the dict optimization next and then I'm already through with my talk. There should be time for one or two questions afterwards. So it's compiling, and then the benchmark again.
And this is the usual way how we go about optimizing code in Cython, right? You take Python code, you start typing variables, replace them by C data types here and there. You use static typing for instructing Cython
to opt out of the general "this is some kind of object" semantics into "this is a specific data type, please optimize the code for it". And Cython will generate code for the data types that is usually faster and definitely more adapted
to what you're doing in your code. Okay, and that gives another five seconds runtime. So we're down from 46 seconds to 30 seconds just by typing a couple of variables here. Okay, and that's it. We've got time for a few questions, one or two.
Just a couple of questions. The first one is like, we have seen that you can call random, I mean custom C libraries from Python just by defining the interface for them. Is it possible to do the opposite, I mean call Python code from C using Cython?
Yes. You can generate so-called cdef functions in Cython, which are just C functions with the expected C signature, and you can drop any Python code in there and you can call the C function from C
and then you can execute any Python code in there. Oh, cool. And the second question is whether Cython benefits in some way from using typing hints, like in new Python we have the function type hints and so on. Yeah, I mean I've used the syntax for this. It does not really benefit all that much
from normal PEP 484 typing annotations in Python because there's not so much to gain from them for optimization. They are for code checking, they are not for optimizing. One last question. So we use Cython a lot, these are really great tools.
One of the things we miss in the Cython ecosystem is a linter, some kind of like flake8 or pep8 or whatever. We use some subset of what flake8 does, like basically ignoring syntax errors, but we wonder if there is any project or idea of maybe exposing the abstract syntax tree or something
so we can hook into existing linter tools. Okay, so tool support. You can do two things. One is, as long as you're really just optimizing Python code, you can use Python syntax for it, and that keeps your code, your optimized code, your compilable code in Python syntax and you can use all your Python tools.
As soon as you start calling into C using native code, you lose that ability because it can't be mapped to Python and that code cannot run in Python anymore. So you use Cython syntax for that in which case there are a couple of, well there's at least PyCharm
definitely, which supports Cython. You can use that. There are probably also other IDEs that have some kind of basic support, but I'm not aware of many tools for code analysis and this kind of stuff
on Cython code, specifically Cython syntax. All right, don't forget to rate the talk, and let's give Stefan one round of applause. Thank you very much.