
Managing Python Environments with pypi2nix


Formal Metadata

Title
Managing Python Environments with pypi2nix
Title of Series
Number of Parts
14
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2017

Content Metadata

Subject Area
Genre
Abstract
This talk gives an overview of the usage and development of pypi2nix, a tool to generate and maintain Python environments with Nix. pypi2nix is already a viable tool to generate and manage Python environments with their respective dependencies inside of Nix. The first part of this talk focuses on its current functionality, including features and shortcomings. The second part of the talk is about nixpkgs-python and how we use pypi2nix to maintain semi-automated, self-updating sets of Python packages. Finally we will discuss how pypi2nix could be used to maintain Python package sets inside of nixpkgs in the future.
Transcript: English (auto-generated)
I'm going to present to you Sebastian Jordan. He contributes to nixpkgs occasionally, he's the maintainer and co-developer of pypi2nix, and in his day-to-day he watches way too many cartoons. Give it up for Sebastian Jordan.
Good morning. I want to talk about pypi2nix and nixpkgs-python. The goal of this talk is basically to give you an overview of what these tools are
doing and also to ask for help a little bit, but more on that later. What is PyPI to begin with?
It's the official way to install Python packages, basically. Probably everybody who has used Python software has at one point run pip install package-x, and that's how it was done until now.
And we basically want to change that to use Nix instead. There are a lot of packages on PyPI: as of the day before yesterday, 120,000.
We probably won't be able to support all of them, but shoot for the stars, right? Python packaging has some issues. If you've ever written a Python package, you probably know that you have
to write a setup.py to turn your Python code into a package, and this setup.py is just a Python script. So you can put any code in there, and a lot of people do that.
For example, they check whether it's running on Windows, or check for certain C libraries and do something else if they are not there, and stuff like that. So there's actually no way in general to get, for example, dependency information before
you run the setup script. But pip gives some meta information about installed dependencies when you install
or build a package. There's wheel metadata; wheel, by the way, is a binary format for Python packages. In there you can find some information about what was done and what was installed. And we will see later that we parse exactly that kind of information.
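As a minimal sketch of what reading wheel metadata can look like: a wheel carries a METADATA file with email-style headers, which Python's standard library can parse. The sample content, package name, and the `read_wheel_metadata` helper are made up for illustration and are not taken from pypi2nix.

```python
from email.parser import Parser

# Hypothetical METADATA content, shaped like the file found in a wheel's
# *.dist-info directory. The package name and values are invented.
SAMPLE_METADATA = """\
Metadata-Version: 2.0
Name: example-package
Version: 1.2.3
License: MIT
Requires-Dist: requests
Requires-Dist: six (>=1.10)
"""


def read_wheel_metadata(text):
    """Return name, version, license and declared dependencies."""
    message = Parser().parsestr(text)
    return {
        "name": message["Name"],
        "version": message["Version"],
        "license": message["License"],
        # A header may appear several times, so collect all occurrences.
        "requires": message.get_all("Requires-Dist") or [],
    }


info = read_wheel_metadata(SAMPLE_METADATA)
print(info["name"], info["requires"])
```

The point of the sketch is that this information only exists after pip has built the wheel, which is why the setup script has to run first.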
And there's another problem: because Python is not a compiled language, there's no configure phase or anything. So if we want to build our Python environments, we also have to make sure that the
linker path is set correctly when we call our executable or import certain modules. As I've already said, we parse the wheel metadata, and because it's not such a simple
step as, for example, with cabal2nix, if anybody knows that, or node2nix, we have to do a lot more to get where we want.
We do that in three stages, and the first stage is basically just pip install. We build an environment where pip is available, and also the C dependencies that you need
for installing your Python packages. And in this environment, which we create just with nix-shell, we run pip install. Stage two is there to extract the metadata from the wheel, such as license, dependencies,
author and that kind of stuff. In stage two, we also calculate the checksums that we put in our to-be-generated Nix expression.
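The checksum step in stage two can be sketched as follows. The talk does not say which hash algorithm is used, so SHA-256 is an assumption here, and `sha256_of_file` is a hypothetical helper name.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file standing in for a downloaded source archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name

checksum = sha256_of_file(tmp_path)
os.remove(tmp_path)
print(checksum)
```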
And stage three is just generating the Nix expression from that. The Nix expression that we generate with pypi2nix has two parts, three parts maybe.
Of course, we have the machinery in place to build Python packages with the tooling from nixpkgs, but mostly it is just the generated expressions for downloading and building Python packages as you would find them in nixpkgs, basically, because we just use the Python
packaging from nixpkgs. And there's an overrides part, which is similar to an overlay. First of all, the overrides are there to give the user the ability to insert special
information that pypi2nix couldn't find: for example, dependencies that pypi2nix wasn't able to pick up because they were not declared properly in the Python package, or special environment variables that should be set at
build time. And we use a fixed-point calculation similar to the overlays. I will show later how this might look.
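The fixed-point idea behind overrides can be illustrated in Python, under the assumption that laziness is emulated with zero-argument functions. All names here (`LazyView`, `apply_overrides`, the package entries) are hypothetical; the real mechanism is written in Nix.

```python
class LazyView:
    """Read-only view over a dict of thunks; an entry is evaluated on lookup."""

    def __init__(self, thunks):
        self._thunks = thunks

    def __getitem__(self, name):
        return self._thunks[name]()


def apply_overrides(base, overrides):
    """Combine a base package set with override layers.

    `self` always refers to the final set (the fixed point), while
    `super_` refers to the set as it was before the current layer.
    """
    final = {}
    self_view = LazyView(final)
    layers = base(self_view)
    for override in overrides:
        previous = dict(layers)
        layers = {**previous, **override(self_view, LazyView(previous))}
    final.update(layers)
    return self_view


def base(self):
    return {
        "lxml": lambda: {"version": "1.0", "env": {}},
        # "app" looks up lxml through `self`, so it sees later overrides.
        "app": lambda: {"uses_lxml": self["lxml"]["version"]},
    }


def my_override(self, super_):
    return {
        "lxml": lambda: {**super_["lxml"], "version": "2.0"},
    }


packages = apply_overrides(base, [my_override])
print(packages["app"])
```

Because `app` resolves `lxml` through the final set rather than the base set, overriding one package is visible to every package that depends on it.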
This is just to show you that it's nothing special: this is part of the generated package set for cffi, and it looks just like you would expect it in nixpkgs. This is an example of overrides.
Fortunately, this is not an overlay, so we can have things like pkgs and python there at the top, because our self and super attributes in this case are only the Python packages and not nixpkgs.
It's a little bit cut off, but you get the idea. We just override the derivation to remove some stuff that we don't want to have in
our setup.py, for example version pinning, because we want to take care of that. But it's nothing special, I guess, if you've ever overridden a derivation. Now comes the fun part, the live demo.
I prepared a Makefile, so let's first look into that. Is this big enough to read? People who work with Python probably know about requirements.txt,
but just to summarize: the requirements.txt file is usually there to specify what should go into, for example, the development environment as Python dependencies, or even into the production environment.
The format is that you just list the packages, one per line, basically. You can put more than just package names in there, for example version pins, but we won't do that in this example. Or you could reference Git repositories.
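To make the format concrete, here is a deliberately simplified reader for such a file. It only covers the cases mentioned above (plain names, version pins, Git references); the sample content, URL, and `parse_requirements` helper are invented, and pip's real format supports considerably more.

```python
import re

# Hypothetical requirements.txt content; the Git URL is a placeholder.
SAMPLE = """\
# development environment
lxml
requests==2.18.4
git+https://example.org/some/repo.git#egg=somepkg
"""


def parse_requirements(text):
    """Split each line into a package name plus specifier, or a VCS URL."""
    entries = []
    for raw in text.splitlines():
        # A comment starts at '#' only at line start or after whitespace,
        # so the '#egg=' fragment of a URL is left alone.
        line = re.sub(r"(^|\s+)#.*$", "", raw).strip()
        if not line:
            continue
        if line.startswith("-"):
            continue  # pip options such as -r or -e are ignored in this sketch
        if line.startswith("git+"):
            entries.append({"kind": "vcs", "url": line})
            continue
        match = re.match(r"([A-Za-z0-9._-]+)\s*(.*)", line)
        if match:
            entries.append({
                "kind": "package",
                "name": match.group(1),
                "specifier": match.group(2),
            })
    return entries


entries = parse_requirements(SAMPLE)
for entry in entries:
    print(entry)
```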
And we also have to install pypi2nix; that's this part. Then we generate our Nix expression with pypi2nix, and this is basically the example invocation for this case.
I will quickly go over the flags. pypi2nix supports Python 2 and Python 3, and the first flag, -V 3, just says: use Python 3. The -r flag is similar to pip's -r flag.
It specifies the requirements.txt file that should be used to generate the Nix expression. The -E flag lets the user specify packages from nixpkgs that should be available at build
time or pip install time. And lowercase -v is just verbose output. So let's do that. So, yeah, nothing in here.
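The invocation described above can be assembled as an argument list. This sketch uses only the four flags named in the talk; passing one -E per package is an assumption, and `pypi2nix_command` is a hypothetical helper that builds the command without running it.

```python
def pypi2nix_command(python_version, requirements_file,
                     extra_nix_packages=(), verbose=False):
    """Build an argv list for pypi2nix from the flags described in the talk."""
    argv = ["pypi2nix", "-V", python_version, "-r", requirements_file]
    for package in extra_nix_packages:
        # Assumption: each extra nixpkgs package gets its own -E flag.
        argv += ["-E", package]
    if verbose:
        argv.append("-v")
    return argv


command = pypi2nix_command("3", "requirements.txt",
                           extra_nix_packages=["libxml2", "libxslt"],
                           verbose=True)
print(" ".join(command))
```

Such a list could be handed to `subprocess.run` on a machine where pypi2nix is installed.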
And now we want to make our requirements.txt file, nothing special, just, yeah, there's everything in there.
Now let's generate the requirements.nix file. This takes, I don't know, 30 seconds or a minute because it has to compile some stuff.
In this case we build lxml, which is one of the more famous XML parsing libraries for Python. I chose this example to demonstrate that we can handle C dependencies. And here you can see, yeah, we just run... I don't know, maybe we
have to wait a little bit. Maybe there are some questions so far; we can take them in the meantime.
Sorry? Can you reuse overrides? I'm really sorry, I didn't get the first part. If we can reuse overrides.
Ah, like that you created for a specific environment. Yes, there is a way to, there is a way to, let's just show this.
There is a flag that lets you specify additional overrides.
I don't know, on the screen it's still there. Anyway, there is an --overrides flag that lets you specify any overrides file that you previously created. You can even get your overrides file from URLs or Git repositories, at the exact revision
that is currently at master, or you can specify another revision. So there is a way to reuse overrides, and I will talk about that a little bit later.
So this lxml library also depends on libxml2, right? Yeah. How can the generator find that dependency? Do you have to specify it manually? Yeah, that's a little bit the dirty secret of pypi2nix.
When you specify that you need to add libxml2 and libxslt, they will just be there for every Python package, because we cannot in general get information about which package exactly
uses the C dependency, so we just give it as a build input to all the Python packages in the whole set that we just generated. And the Python package dependencies can be derived? Yeah, the Python dependencies are derived from the pip build output. OK, I will continue now with the demo.
We just generated the requirements.nix file.

And this screen, which is also connected to the computer, shows everything constantly. I don't know.

Here is the generated expression for lxml.

I mean, in this case, lxml has no Python dependencies, so the propagatedBuildInputs
are an empty list. And here is the dirty hack I mentioned earlier. The commonBuildInputs are now libxml2 and libxslt, and this would be true for all the Python packages that we would have put in there.
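The effect of that hack can be sketched in a few lines: every package receives the same C-level build inputs, while the Python dependencies stay per package. The data layout and the `build_inputs_for` helper are hypothetical and only mirror the behavior described, not the generated Nix code.

```python
def build_inputs_for(packages, common_build_inputs):
    """Attach the same common build inputs to every package in the set.

    `packages` maps a package name to its Python dependencies, which in
    the talk come from the pip build output.
    """
    return {
        name: {
            "buildInputs": list(common_build_inputs),
            "propagatedBuildInputs": list(python_deps),
        }
        for name, python_deps in packages.items()
    }


# "example-app" is an invented package that depends on lxml.
package_set = {"lxml": [], "example-app": ["lxml"]}
result = build_inputs_for(package_set, ["libxml2", "libxslt"])
print(result["lxml"])
```

The trade-off is that packages which never touch the C libraries still get them as build inputs.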
And just to show that it works, let's make the interpreter...
Ah, yeah. This builds instantly because I previously tested it on this computer, so you have
to believe me that Nix ran. But let's quickly start up the Python interpreter and import lxml.
Yeah. Works. OK, let's go on with the presentation.
OK, so now you've seen what works. You've seen that we can generate Python environments and that we can use them, but there is a lot of stuff that doesn't work for now. This is especially tests. And this is a big issue because we really have to support tests to make this a viable option,
maybe for use in nixpkgs. And even if not, this is just necessary if you want to use it in production. We don't support multiple PyPI instances, so if you have a private PyPI server,
you cannot use pypi2nix to selectively pull from the official PyPI and your custom PyPI server. This is because, at least to my knowledge, pip has no way of telling the user
where it got the packages from. Of course, we could parse it from standard out, but we don't want to do that because it would break every time pip updates. We don't want to rely on the front end of pip to generate our Nix expressions. And if anybody wants to do that, feel free to contact Rok or me.
Now I want to quickly talk about nixpkgs-python. nixpkgs-python is a collection of Python packages that are generated with pypi2nix. There is a GitHub repository called nixpkgs-python, which Rok owns.
And it's semi-automated and self-updating. What that means, I will explain later. It offers binary substitutes because we push build artifacts to the binary cache via SSH.
And it also serves as a library for common overrides. Semi-automated, self-updating sets of Python packages: we automated this with Travis CI. There is basically a cron job running every night which executes pypi2nix
and generates expressions which can be used by the user. And it commits the changes that it made, maybe because of a Python package update upstream or whatever, back to this repository.
There are multiple Python sets. All the great stuff you know or use. But by far not everything. For example Django, I think it's only Django. No plugins, no extensions by now.
But we plan on expanding that. And they are organized in groups that make sense. You would put for example pytest with its extensions in one package set because you have to make sure that these work together. And this is how we organize it for now. Especially to mitigate really long build times.
Different versions for Python 2 and 3. I mean, this won't be an issue for too long, but for now it is.
It's a library of common overrides. That means there are already predefined overrides that you can use. And pypi2nix offers a --default-overrides flag that automatically pulls the overrides from nixpkgs-python into your Nix expression.
It pins the revision that was current when you executed this command. So if you generate a Nix expression, it won't break in the future. Even if we update something that might break you, your old expressions would still work.
This is not the default behavior, but it is planned to be the default behavior soon. What we want to do with nixpkgs-python is provide an easy way to install your Python packages.
And for that we would like to use overlays, maybe as intended. There is already an overlay definition in the repo that you could use. But as we saw yesterday, it still needs some overhauling.
And we could generate a static PyPI from our expressions. That would also be really nice to help with, I don't know, generation of Docker images or whatever. Because if you have a PyPI repository with only the versions you really need for the job,
you don't run into trouble with version fuckups or whatever. To enable tests, we need to rework the Python infrastructure a little bit.
If that's possible at least. And there is currently no way to differentiate between build inputs for a package like test inputs, especially like pytest, and runtime inputs from the way we generate our expressions.
So we have to figure out something there. So now, I don't know, the future. We would very much like to use an automated tool in nixpkgs
to generate Python packages for basically end-user consumption, or, I don't know, for packages that need these dependencies. But we definitely don't support this now.
Don't expect us to integrate well with your nixpkgs needs for now, because we are just not there yet. We would like to support that, but for the reasons I already explained, it's a little difficult.
So what we could do instead is generate a set of overlays and offer that to Nix users. This would probably be the least intrusive way to get people to know the tool and how it works.
Building pull requests on Hydra would be nice, but that's probably what a lot of people think. That was my talk. Any questions?
Any questions? There is a new tool called pipenv, by Kenneth Reitz, I think.
It uses a Pipfile instead of requirements.txt, and it generates Pipfile.lock, which has the dependency, the version, and the hash of the package.
Would it make sense to use that in pypi2nix? It's officially recommended by the Python organization. OK, thanks for the hint. We'll look into it. Actually, I didn't know about this pipenv command.
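For reference, a lock file of this kind is JSON and could be read as sketched below. The key names follow the Pipfile.lock layout as commonly documented and should be treated as an assumption; the sample content, hash values, and `locked_packages` helper are invented placeholders.

```python
import json

# Hypothetical lock file content with placeholder hashes.
SAMPLE_LOCK = """
{
  "_meta": {"hash": {"sha256": "0000"}},
  "default": {
    "lxml": {
      "version": "==4.1.0",
      "hashes": ["sha256:aaaa", "sha256:bbbb"]
    }
  },
  "develop": {}
}
"""


def locked_packages(lock_text, section="default"):
    """Return the pinned version and hashes for each package in a section."""
    lock = json.loads(lock_text)
    return {
        name: {
            # Versions are stored as exact pins such as "==4.1.0".
            "version": entry.get("version", "").lstrip("="),
            "hashes": entry.get("hashes", []),
        }
        for name, entry in lock.get(section, {}).items()
    }


locked = locked_packages(SAMPLE_LOCK)
print(locked)
```

Since the version and hashes are already recorded, such a file carries much of what stage two otherwise has to compute.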
Last question. I was just wondering, how do you select which packages to put into nixpkgs-python? Or do you only build dependencies of these three packages?
The workflow until now was: OK, I need some Python packages for my work, so I put them in nixpkgs-python. So it was basically derived from what Rok and I needed for our purposes until now. So there is no formal specification of what goes into which package set in nixpkgs-python until now.
But we are open for proposals, of course. OK. Thank you.