
Extending Numba


Formal Metadata

Title
Extending Numba
Title of Series
Number of Parts
561
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Genre
Abstract
Over the years the Numba project has proven itself to be a pragmatic and effective tool to accelerate numerical computations significantly. Numba however puts some restrictions on the code that can be accelerated. These restrictions can force you to compromise, making your code less readable or more difficult to integrate with the rest of your code. This tradeoff makes Numba less attractive. In this talk I'll share what I've learned integrating Numba in our simulator. We'll look at the architecture of Numba and explore how we can extend it to ensure our code can be accelerated without having to trade in expressiveness.
Transcript: English(auto-generated)
Good afternoon everyone. I hope you're all enjoying FOSDEM, learning a lot of stuff and, hopefully, also enjoying a lot of peace. I'll be talking a bit about extending Numba. The goal of my talk is to give you an overview of how you can extend Numba to better solve your problems.
Let's start at the beginning, with a short introduction to what Numba is, for those who don't know it yet or need some refreshing. Numba is a just-in-time compiler. You see here an example that I shamelessly stole from the Numba website. It's a project supported by Anaconda, and its goal is to accelerate scientific Python.
This is basically how it works: you have a decorator, the just-in-time decorator. It says nopython is true, which means that no Python is involved in the generated code: the Python interpreter is not called. What's nice about Numba is that it has very good NumPy support, it is also able to generate code for CUDA, and it is extensible, which is exactly the topic of this talk.
So I'll give you a bit of context on what I've been working on in my day job. I don't want to talk too much about it, but it gives you an idea of what problems you can solve. What you see here is a photonic integrated circuit. That's basically just a regular electronics chip, which is only special because you can connect an optical fiber to it. Through this fiber, light will come into the chip and will be guided into the chip. At Luceda, we build software to design and build those circuits.
Part of this is an optical circuit simulator, in which we try to simulate the behavior of those circuits, and we want our users to be able to build models of their components. People who have a bit of background in electronics might know SPICE, which is a circuit simulator for electronics; it's very similar to that. So we want our users to build models in Python, so a high-level API, but at the same time it has to be really, really fast. We want to run a lot of simulations in a very short time, so that we can figure out how to better build our circuits.
You already see some problems here. This calculate S-matrix function has to be called by our solver, which is written in C/C++. We have here a sort of dictionary-like structure, and we have some C++ objects that we want to use from our simulator. These are all things that are not directly supported by Numba. Dictionaries are not supported. In the latest version there is some basic support for strings, but even there it's limited.
But luckily you can extend Numba. Okay, let's go to the next slide. Numba is a compiler, and it's basically a very boring compiler. But that's good: in software we want things to be boring. Boring is good in software. That's basically what we do as software engineers: we take a bunch of boring stuff and we turn it into something very exciting. The one thing that makes it special is, as I said, that it's extensible. You have a few extension points here: one, two, three, four. With those extension points you can add your own stuff to the compiler pipeline.
Let's start at the beginning. You start with Python source code. This is translated into Python bytecode, and then Numba transforms this Python bytecode into its own representation. This is the Numba intermediate representation; it's basically an abstraction over bytecode. Next you have the opportunity to rewrite this intermediate representation, for example to do parallelization or all kinds of optimizations, and you can add your own rewriters. The next step is type inference, where you can also add your own types and do type inference for your own functions.
Then you have the opportunity to do another rewrite phase, but this time with the actual types. When all that is finished, you can actually get to the point and start generating the code. It's important to remark that Numba does not generate machine code directly; it generates LLVM intermediate representation, which is taken by LLVM to actually generate the machine code for your machine. Also there, in the lowering phase, you have the possibility to add your own stuff, so you can add custom data models and custom code generation. After that you have very fast, near-native speed for your Python code. I will now go into a bit more detail on all those extension points.
Let's start at the beginning. I was talking about the rewrite phase. Numba likes decorators very much, so this is basically how all the extensions work. Here we say: okay, we want to register our rewrite, and we say it's before inference. As I said before, you have one rewrite step before type inference and one after inference. A rewrite basically consists of two steps. You have a match, where you look for the expressions, statements, or instructions you want to replace. Then, when you return true, the apply method is invoked, and in that phase you can actually replace the function block here with a new one, in which you do an optimization, for example.
Next you have the type inference. There you have the concept of types. Maybe I have to clarify here: a Numba type is a bit different from a regular Python type. You can compare it more with what the mypy project offers, where you have the opportunity to add type annotations to your functions. So you have to compare it with that.
You have the possibility here to add a MyPointType. In this example we have a point, which basically has an x and a y coordinate. Then you can use that to do type inference for your own functions. Here we have type_callable on the MyPoint constructor, which, if you use it in your Python code, will create a MyPoint object. This type_callable is basically going to say: okay, I want to infer the types of the MyPoint constructor. This is going to generate your typer, and for given x and y arguments you say that the return value is going to be of type MyPointType. That's basically how it works. Those are the possibilities you have during type inference.
Then, again, you have a rewrite phase, where you are able to use the types. When all that is finished, you can start with the actual lowering, which means generating LLVM intermediate representation. Again a decorator, same principle. We have a MyPointType, so we're going to register a model for a certain type. In this case, for our point, we want to use a struct-like model, like a C struct, with x and y attributes, which we assume here to be integers. That's basically the data layout of your point. This is telling Numba: okay, I have here a data structure with this data layout. As you also see, this is a list, because the order is important. This information can then be used when you actually lower the implementation of your callable.
As we had before, we had our MyPoint constructor, which takes two arguments: an integer argument x and then y. lower_builtin is a decorator to say: okay, I have a callable that I want to lower, I have an instruction that I want to lower. This can also be a setattr, a getattr, an addition, basically any operation. And you say: okay, for this particular signature, this is the implementation in LLVM code. But what is important is that with Numba you never, or only very rarely, have to generate LLVM intermediate representation yourself. What makes Numba very nice is that it provides a lot of functionality to easily generate LLVM intermediate representation. This is a nice example. As I said before, we had the point, for which we use a struct-like model. In the codegen utilities you have this create_struct_proxy, which generates this kind of syntactic sugar that you can use here. It looks as if we assign a value x to point.x, but what it actually does in the background is use the builder to generate the correct code for this operation. So it looks like you're actually assigning the value, but in reality it's generating the code.
That's basically it. To summarize a bit: first we have a rewrite step, then we have type inference, and then we have the actual lowering. These are the extension points you have.
Back to my problem: remember, we had to integrate with C/C++. In reality that's going to be the case very often when you're working on scientific computing. You may already have a solver that has been going around your university or your company for a long time. You don't want to throw all that away, because you've put all your experience into it, and starting from scratch would be very difficult. So it is important to be able to integrate with other languages. Luckily, we have this very nice love triangle: we have Numba, NumPy, and the C programming language, and they all love each other.
Let me clarify a little bit. For people who don't know NumPy: internally, it stores its data by default as a C-contiguous array, which means that you can access it from C and do all operations from C as well. And there's this very nice library, the ctypes library, which you can use to generate pointers to that data.
It's a bit the same for Numba: Numba actually has quite good integration with C. The carray construct, for example, allows you to wrap a C array and pretend it is a NumPy array. That means that you can apply slicing and use all the NumPy operations on a C array as if it were a NumPy array. It works actually pretty well.
Then we have the cfunc decorator, which is very similar to the just-in-time decorator that you typically use with Numba. The main difference is that you have to provide up front the types that you want your function to be invoked with. This means that at the time you use the decorator, the function is already compiled to machine code. Then you can get back the address of the code underneath, so you get a pointer to the actual address in memory where the function is located. That's very nice, because now we can take that pointer, give it to a simulator, and call the generated code from a C program.
That's basically what we also do in our solver. We pass the pointer of the function generated by Numba to the solver. We also pass the pointers to the C++ objects that we want to call; we wrap them, and then, using CFFI, we are able to call the functions of our solver itself. That's basically how it works. And of course, from C you are also able to call NumPy and Numba using the Python C API.
Okay, that's it. I have also prepared examples; they are on the website. Time will be a bit short to go over them, but if what is in here is interesting to you, I highly recommend you go there. If something is not clear, contact me; I'll also be around to talk about it. And of course we still have quite some time for questions, so please, if there are questions, go ahead. No questions?
Okay, I will probably not be able to run this on this computer, but this is basically an example of the rewrite. As I said, it's obviously a very useless example. We have here a very meaningful variable that we want to be replaced. As you see here, you have the function block, and you can search for instances of an assign expression. So this is an assign expression. You search for any assign expression that has the target name, so here the target is meaningful. We return true if we have any matches, and then you can replace it with a constant of 42. Then we can run it, and, believe me, it will return 43, because 42 plus 1 is 43.
The other example is... It's a Mac. Yes, so this is the point example that I also explained during the presentation. We have here our MyPoint constructor, with x and y, which are supposed to be integers. Here we have the Python type, sorry, the Numba type that corresponds with the MyPoint constructor. This is the typer I was talking about. And then we have... yes, scroll... this is the binary data layout, and then the actual implementation.
When you only do that, and this is something I didn't talk about yet, you can actually do very little with your point. Basically, you can create a point, which will be allocated in memory, and you can add it to a list.
And then you can return the length of the list. That's obviously not very useful. The first thing you still need to add is support for getting the attributes. This is something I didn't explain during the presentation, but you can use something like a template, which allows you to do inference for the attributes and also for other instructions. Basically, here, if my attribute name is either x or y, I want to return a 64-bit integer type. Then I can again lower this: for this instruction I can again generate code. I use create_struct_proxy, same name, so I do a getattr on the actual struct, and this will generate the LLVM code using the builder. Then you use the utilities from Numba to say how it should keep track of the memory.
Once you've done that, you're able to access the attributes of your point. So it will create, using the lower... Yes, there's a question.
So the question was whether you can use structured data structures for your fields. The answer is yes. In the Numba documentation there are some examples, or at least in the development branch of the latest version of Numba there is some explanation of how you can make records and use new types. So basically you're able to use any type there. And because Numba supports NumPy arrays, it has a lot of built-in types for NumPy arrays, and also for CFFI. It has native support for CFFI, so it will take the CFFI representations and turn those into native Numba types, so that it's able to work directly with those types. Yes, another question there at the back.
So the question is about the time that it takes and the complexity of understanding Numba. Is that correct?
Well, I'll be honest: it took me some time to figure out how Numba works. That's one of the reasons why I wanted to talk about it today. In the last half year they've definitely been putting a lot of effort into documentation, but before that, documentation on this was sparse.
But what is very nice is that Numba uses all of this to implement Numba itself, which is always a very good idea: use your own stuff to build your own stuff. That means that you can look at the code. When I was talking about the NumPy support: it's advanced, because NumPy is quite a complex thing, so building a just-in-time compiler for it takes a lot of time. But you can reuse all that work, and by reading the Numba code you learn how to use all those things as well. So the Numba project itself is very good documentation on how to use Numba. If you are really interested in using this, I really suggest that you get into reading the Numba code.
Okay, my time is up. Thank you all for listening. If there are any follow-up questions, I'll be glad to talk to you.