GraalPy - Fast Python Implementation

Formal Metadata

Title
GraalPy - Fast Python Implementation
Title of Series
Number of Parts
131
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
GraalPy is the fast Python implementation built on GraalVM. We run PyTorch and TensorFlow and ML models from Huggingface.co. We execute the test suites of the top 600 PyPI packages every day and are the most compatible alternative implementation of Python to date. We can JIT pure Python code to the same speed as code rewritten in Cython: https://twitter.com/timfelgentreff/status/1760597779250839820. We are the most seamless and performant choice for integration with Java in both directions, including Jython compatibility mode. In this talk, we want to show what's possible today with GraalPy and why you might choose it for your projects: for its performance, integration with Java and other languages, or sandboxing and distribution features.
Transcript: English (auto-generated)
Yes, I'm a GraalPy developer. I work for Oracle Labs, and we work on GraalPy, which is a fast Python implementation. So what GraalPy really is, is an alternative implementation of Python. The Python that you know and are used to using is called CPython.
That's the reference, the standard implementation of Python. We are implementing the same thing, but differently; we are a clean-room implementation. So what is GraalPy? It's a compatible implementation of Python, compatible with the reference CPython implementation.
We believe that we are more compatible than any alternative Python has ever been. I will show in more detail why we think this is the case. GraalPy provides high-performance execution of Python code, and GraalPy seamlessly integrates
with Java, including a Jython compatibility mode. So let's see how you can get started with GraalPy. The first mode is where you want to use GraalPy as a CPython replacement. So basically, your code will run on GraalPy, but you would use it like CPython.
This is what we call the GraalPy native standalone distribution. What it means is that it has no dependencies; it contains only the native executable Python binary, which is the interpreter, the runtime that runs your Python code, and the Python standard library, which we mostly share with CPython, except for
a few minor changes, including the native extensions that you can find in the Python standard library. If you want to use GraalPy in this mode, you can get it through pyenv or conda, so you can just say pyenv install graalpy- followed by the desired version.
If you feel adventurous, you can say pyenv install graalpy-dev, which will install the latest development build of GraalPy. Then you can say pyenv shell with your GraalPy version, and your shell will switch to using GraalPy instead of the normal system Python.
So then if you say python3 --version, you will see that you are, in fact, running on GraalPy, and you can use it as you're used to. You can create a virtual environment, you can pip install packages, et cetera. If you really want to be sure that you're running on GraalPy, you can also use the
graalpy launcher. That's basically an alias for python3, or actually python3 is an alias for graalpy, but if you run through the graalpy launcher, you can be sure that you're not running any other Python implementation. The same goes for conda.
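Besides checking the version on the command line, the running interpreter can also be inspected from inside Python. The following is a minimal sketch using only the standard library. The helper name describe_interpreter is made up for this illustration, and the comparison against the name "graalpy" is an assumption that should be verified on your own installation.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation name, language version) of the running Python."""
    name = sys.implementation.name  # "cpython" on the reference implementation
    version = platform.python_version()
    return name, version


if __name__ == "__main__":
    name, version = describe_interpreter()
    print(f"Running on {name} (Python language version {version})")
    # Assumption: GraalPy identifies itself as "graalpy" here.
    if name != "graalpy":
        print("Note: this interpreter does not identify itself as GraalPy.")
```

Running the same script under different interpreters is a quick way to confirm which one a virtual environment or shell is actually using.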
If you don't like either pyenv or conda, then you can just download the binary distribution as a tarball or zip archive, extract it on your system, and run it. There's another mode in which you can use GraalPy.
I didn't say that yet, but GraalPy is based on the GraalVM technology, which itself is based on Java technology. That's an implementation detail that you don't need to care about if you don't want to, but if Java happens to be your thing, or if you, for whatever reason, need to use Java, then GraalPy is also available as a Java library.
It's available on Maven Central, which is something like the Python Package Index for Java people. This is how you can add it to your Maven script, and below you can see how you can create an interpreter from Java,
and how you can invoke Python code from Java. You can send Java objects to Python and Python objects to Java, and work with them on either side. I will give more details later if we have enough time. There is also a Jython compatibility mode that you can use if you want to migrate from Jython to GraalPy.
GraalPy works best on a Java distribution that's called GraalVM JDK, but it does work on any other JDK of version 21 or higher. So let's take a closer look at GraalPy. As I said, it's not a fork of CPython,
it's a different implementation. It is based on the GraalVM technology. More specifically, it's based on something that's called the Truffle Language Implementation Framework, and the idea behind it is that you use one high-performance virtual machine to implement and run multiple programming languages. It's all very fascinating,
but I won't go into any details here because we don't have the time for that. GraalPy is open source. It's licensed under a permissive license, and it is supported on Linux and macOS, in both cases on the AMD64 and AArch64 architectures,
so also on Mac M1 and M2. It has limited support for Windows, not because we want to ignore Windows, but because this is a work in progress, and we would like to eventually reach full support for Windows as well.
There are two types of distributions that you can get. The first one is the community edition, which is built fully from open-source components. The other one we call Oracle GraalPy. It is built from the same source as far as GraalPy itself is concerned,
but it contains some more advanced just-in-time compiler optimizations that are not open source. Those optimizations are, however, language agnostic. So anyway, let's talk about GraalPy compatibility.
GraalPy supports the C extension API, so you can install many well-known packages, such as NumPy and Pandas, or run machine learning models from Hugging Face, and we ourselves run the test suites
of the top 600 Python Package Index packages every day. We monitor the tests and fix incompatibilities. It's not that all 600 packages pass 100%, but we have quite a good pass rate.
It's changing every day, and so we believe that we can run quite a lot of the existing Python code out there. Given all that, to the best of our knowledge, we are the most compatible alternative Python to date. GraalPy is included in the manylinux images,
if there are any package or native extension authors here, and GraalPy is also available in the GitHub Actions setup-python action, so if there are any Python package authors here, nothing stops you from adding CI jobs for testing GraalPy compatibility.
Yes, but let's be honest, there are some rough edges that we are working on smoothing out. For now, most packages, believe it or not, don't provide binary wheels for GraalPy, so when you pip install package xyz, what pip is going to do is download the source distribution of xyz,
and build it from source, which in itself should be fine, but if you've ever built things from source, you know that sometimes this requires a non-trivial setup, and it also takes some time. However, we want to improve that,
so we are contributing GraalPy support to cibuildwheel, and we are also thinking about supporting the stable ABI. That's basically an ABI that you build against once, and the result can run on any CPython version from the version that you built for up to any higher one,
and if we supported it, it could also run on GraalPy. So I think the stable ABI in itself is a very interesting thing that deserves much wider adoption. There are talks about it tomorrow if you're interested.
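Whether pip can use a prebuilt binary or has to fall back to the source distribution depends on the compatibility tags in the wheel file name. As a rough sketch, the tags can be read off the name with the standard wheel naming pattern. The function names and the xyz file names below are hypothetical, and the sketch says nothing about which tags GraalPy itself uses.

```python
def parse_wheel_tags(filename):
    """Split a wheel file name into name, version, and compatibility tags.

    Wheel names follow the pattern
    {name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def uses_stable_abi(filename):
    """True if the wheel declares the CPython stable ABI (abi tag 'abi3')."""
    return "abi3" in parse_wheel_tags(filename)["abi"].split(".")


# Hypothetical file names for the placeholder package xyz from the talk.
print(uses_stable_abi("xyz-1.0-cp38-abi3-manylinux_2_17_x86_64.whl"))
print(uses_stable_abi("xyz-1.0-cp312-cp312-manylinux_2_17_x86_64.whl"))
```

The first name describes a wheel built once against the stable ABI for CPython 3.8 and later; the second is tied to one specific CPython version.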
So, GraalPy performance. GraalPy uses just-in-time compilation for Python code. Let's illustrate this with an example: the PyBoy application had been rewritten in Cython to get enough performance on CPython, but if you run
the original pure Python code on GraalPy, it happens to be within a few percent of the performance of the natively compiled Cython code running on CPython. If I may simplify this a lot, Cython is like C with Python syntax,
so it allows you to quickly port your Python code to C, so basically we are comparable to C code in this case. Here are some results from the pyperformance suite. Take them with a grain of salt, because it's a benchmark suite.
There's no guarantee that it will be representative of your concrete Python application, and it's also a moving target. Everything in this graph is a moving target: Python, GraalPy, the performance suite itself. Yeah, but life is not all roses, and there's no free lunch.
Because GraalPy uses just-in-time compilation, it needs some time to warm up before it reaches its peak performance. The idea, roughly, is that there is a specializing interpreter that needs to interpret your code for a while to specialize the bytecodes,
and those specialized bytecodes are then the input for the just-in-time compiler, which can then produce better code, because there is this feedback from the runtime. This feedback provides more accurate information to the just-in-time compiler than if it just analyzed the code without executing it.
So this whole thing takes some time, and then when you actually start the just-in-time compilation, the compilation itself also takes some time. And the more advanced the just-in-time compiler is, the more time it's going to take to compile the code. Another big problem here is native extensions, because the native code in those extensions
is opaque to the just-in-time compiler. It cannot see through it. What that means is that when the just-in-time compiler compiles your Python bytecodes and reaches a call to a native extension, the only thing it can do is emit
the assembly for making the call; it cannot go into the native code, see what it's doing, and use that for optimizations. And not only that: because it cannot see what the native code is doing, it must assume the worst about the native code. It must assume the worst about how it mutates
the global environment and things like that. So this hinders many important optimizations. By the way, most of what I'm saying, I believe, also applies to the CPython just-in-time compiler that is in the making, because those problems are not specific to GraalPy; they are specific to
just-in-time compilation technology in general. So when do you get the best performance with GraalPy? Well, when most of your code is Python, which, if you think about it, is great, because we are at this conference because we like Python, not because we like C++, right? Or Rust.
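Because peak performance is only reached after warm-up, a benchmark should separate warm-up iterations from measured ones. Here is a minimal sketch of that idea using only the standard library. The function names, the iteration counts, and the workload are placeholders chosen for illustration, not recommendations from the GraalPy team.

```python
import time


def benchmark(func, warmup_iterations=50, measured_iterations=20):
    """Run func repeatedly and time only the iterations after warm-up."""
    for _ in range(warmup_iterations):
        func()  # results discarded; gives a JIT time to specialize and compile

    timings = []
    for _ in range(measured_iterations):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings), sum(timings) / len(timings)


def workload():
    """Hypothetical pure-Python workload standing in for application code."""
    total = 0
    for i in range(100_000):
        total += i * i % 7
    return total


if __name__ == "__main__":
    best, mean = benchmark(workload)
    print(f"best {best * 1e3:.3f} ms, mean {mean * 1e3:.3f} ms")
```

How many warm-up iterations are enough depends on the application, so it is worth increasing the count until the measured times stop improving.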
Why GraalPy? Because of the performance, but it depends, of course. You need to benchmark your use case and see if it can be faster. If you happen to have an application where most of the time is spent in Python code,
GraalPy can provide a great improvement. And when you do benchmark your use case, make sure to warm up the application because, as I said, GraalPy needs some time to warm up. Another tool that we provide in GraalPy is what
we call GraalPy standalone binaries. The idea is that you can package your whole Python application, its dependencies, and any resources like data files, et cetera, into a single executable binary. The trick here is that the Python implementation in GraalPy uses a file system abstraction
that normally goes to the actual file system, but you can replace it with whatever you like. You can implement your own file system abstraction. So in this case, we have an implementation that reads the data directly from the binary that we produce.
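The general idea of a replaceable file system abstraction can be sketched in plain Python. This is only an illustration of the concept: the class and function names are invented, a zip archive in memory stands in for the data embedded in the binary, and none of it mirrors GraalPy's real classes or its binary layout.

```python
import io
import zipfile


class DiskFileSystem:
    """Default behaviour: read files from the real file system."""

    def read_bytes(self, path):
        with open(path, "rb") as handle:
            return handle.read()


class ArchiveFileSystem:
    """Alternative: serve the same read requests from one in-memory archive."""

    def __init__(self, archive_bytes):
        self._archive = zipfile.ZipFile(io.BytesIO(archive_bytes))

    def read_bytes(self, path):
        return self._archive.read(path)


def build_archive(files):
    """Pack a {path: bytes} mapping into a single zip blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, data in files.items():
            archive.writestr(path, data)
    return buffer.getvalue()


def load_greeting(file_system):
    """Application code depends only on the read_bytes interface."""
    return file_system.read_bytes("data/greeting.txt").decode("utf-8")


blob = build_archive({"data/greeting.txt": b"hello from inside the bundle"})
print(load_greeting(ArchiveFileSystem(blob)))
```

The application code in load_greeting does not know where the bytes come from, which is what makes it possible to swap the disk for a single bundled file.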
So the only thing that you need to distribute is that one file, the binary, and everything is in there. There is one downside that we are working on improving, and that is that it produces quite large binaries. Here is an example of a pygame game
that you can package as a single executable like this. It even produces an executable on Windows, as far as I know. And this is the invocation of the standalone module that you would use to create this standalone binary.
Another big use case for GraalPy is of course the Java integration. This is an example of how you can create one interpreter, create a function foo in it, then retrieve the function into a Java variable, and from Java you can invoke the function,
giving it 40 and 2. Python will actually execute the addition, and you will get back the result, which should be 42. Then, in the same Java application, you can create another context. If you try to call foo in that context, you will get an error saying that the name foo is not defined, because that's a different, isolated context,
so you can create multiple isolated contexts. You can also catch exceptions from Python, as we see in the example. This is a slightly more advanced example, where we use the transformers pipeline. We create a pipeline, and from the Python code
we return a lambda to the Java code. We save that in a variable, and then from Java we can actually invoke the pipeline. So this is an example of integrating a quite non-trivial Python package into Java with GraalPy. And you can go even further.
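The isolation behaviour described for the two contexts can be mimicked in plain Python with separate namespaces. This is only an analogy: the example on the slide uses the GraalVM embedding API from Java, which is not shown here, and the helper names new_context, evaluate, and call are hypothetical.

```python
SOURCE = """
def foo(a, b):
    return a + b
"""


def new_context():
    """Create a fresh, empty namespace that plays the role of one context."""
    return {}


def evaluate(context, code):
    """Run source code inside the given context only."""
    exec(code, context)


def call(context, name, *args):
    """Look up a function by name in one context and call it."""
    if name not in context:
        raise NameError(f"name {name!r} is not defined")
    return context[name](*args)


first = new_context()
evaluate(first, SOURCE)
print(call(first, "foo", 40, 2))

second = new_context()
try:
    call(second, "foo", 40, 2)
except NameError as error:
    # foo was defined only in the first context.
    print(f"second context: {error}")
```

Definitions made in one namespace are invisible in the other, which is the property the talk relies on when it creates several contexts in one Java application.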
If you, for example, need to share your Python library with another team that uses Java, and you want to create a Java binding for it, you can use features like this one, where you just need to define an interface on the Java side,
and then you define a class on the Python side, or maybe you already have a class on the Python side, that follows that interface. We have a facility that allows you to cast the Python object to the Java interface, and then the rest of the Java application
can work with that object as if it were just a plain Java object implementing that interface, and be completely oblivious to the fact that, under the hood, it calls Python. Another use case for this, we believe, is providing Python scripting capabilities
in Java applications. In this example, imagine you have an existing web-based application with a Java backend that's showing some table retrieved from a database or something, and you want to provide your users with a way to filter the data with something more advanced
than just checkboxes and combo boxes, et cetera. So in this case, we allow the user to write the filter in Python. And GraalPy can be used as a Jython replacement.
GraalPy is Python 3, Jython is Python 2.7, so if you want or need to migrate to Python 3 and you still want to use Jython features, that's something that we can offer. We support many of the Jython features, not all. If you have a Jython use case, then please reach out to us and report it on GitHub.
We want to know how people use Jython and what we can support. There are some features that, if enabled by default, would negatively impact performance, so those must be enabled explicitly through options. And this is an example of code that would run on Jython
as well as on GraalPy and that creates a window from Java. So in summary, GraalPy is an alternative Python implementation that is compatible. We believe it is the most compatible alternative Python to date.
It provides high performance, especially for pure Python code. It seamlessly integrates with Java, including a Jython compatibility mode. If you want to get GraalPy, you can install it through pyenv. The second code listing should be the installation through conda.
I will change that. And if Java is your thing, you can get the GraalPy Java library so that you can embed Python into your Java applications. Yes, and that's all. Thank you for your attention.
Thank you so much, Stefan, for this nice presentation. If you have any questions, you can move to microphones or you can ask in the Discord channel.
Thanks for the talk. How good is it at running native applications, like GUIs? Yeah, do you mean Java GUIs or Python GUIs? Like, if you want to build a GUI, PyQt is terribly slow. I'm not happy with it. I'm looking for an alternative. Is there something you could advise?
Well, in theory, you could use the Java GUI libraries from within GraalPy, like I just had in this example. I believe this should be quite fast, because, and it's something I should have mentioned, the boundary between Java and Python in GraalPy is very fast,
because GraalPy is running on Java. So that's one option you have. Libraries like PyQt or Tkinter, et cetera, are going to run at about the same speed as on CPython, because those are native libraries, and, as I explained with the just-in-time compiler,
we just can't do anything other than call the native code, and the native code will be the same for us as for CPython. So, yeah, does that answer your question? So much, so, thank you. Yeah, I mean, feel free to talk to me after the talk.
So I have one question. When you, for example, export to Windows, which I know is limited, is it like an EXE file? How do you run the program? So you mean this one? Yeah, this one. How do I run which program? The result of this? Yeah, so is it like a JAR file?
Is it like an EXE file? No, it produces an EXE file, and it has no dependencies. So it should be portable to any Windows system. Okay, and do you have some approximate size of this large binary? No, this is large, so we're talking hundreds of megabytes. Hundreds, okay, I see, right. And what about the Kotlin language?
Can you use it with Kotlin? Yes, absolutely, any JVM language like Kotlin or Scala works fine with this. You would be using Java APIs, but I believe Kotlin has a good story for interoperability with Java. So it should be just fine. In Scala, the interoperability with Java
is maybe not as great, but it's still quite good. We've had users using GraalPy and other languages from Scala, and I think Kotlin should be the same. Oh, that's okay, thanks. Thank you. In the Discord channel also, there is no question.
Okay, thank you so much for this presentation. There will be another session at 11.55. You can stay here, or you can change room. Please accept this small gift from us. Thank you.
Thank you.