
What Python can learn from Haskell packaging


Formal Metadata

What Python can learn from Haskell packaging
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Domen Kožar - What Python can learn from Haskell packaging

The Haskell community made lots of small, important improvements to packaging in 2015. What can the Python community learn from it, and how are we different?

-----

The Haskell community has been living in "Cabal hell" for decades, but the Stack tool and the Nix language have been a great game changer for Haskell in 2015. Python packaging has evolved since the very beginning of distutils in 1999. We'll take a look at what the Haskell community has been doing in their playground and what they've done better or worse. The talk is inspired by Peter Simons' talk given at the Nix conference: [Peter Simons: Inside of the Nixpkgs Haskell Infrastructure]

Outline:
- Cabal (packaging) interesting features overview
  - Cabal file specification overview
  - Interesting Cabal features not seen in Python packaging
  - Lack of features (introduction into next section)
- Cabal hell
  - Quick overview of Haskell community frustration over Cabal tooling
- Stack tool overview
  - What problem Stack solves
  - How Stack works
  - Comparing Stack to pip requirements
- Using Nix language to automate packaging
  - how packaging is automated for Haskell
  - how it could be done for Python
Good morning, I'm Domen Kožar. Welcome, everyone; I'm really excited to be here. Just before I start the talk, I'd like to say something about myself, so you'll better understand the context of it.
My background is in Linux software distributions. When I was a student I was using Gentoo, and for Google Summer of Code I developed a project to package Python automatically for the Gentoo platform. For the last three years I have been working on NixOS, a Linux distribution you have probably heard of, and I am still tackling the problem of how to distribute all those packages to people and make them easy to use. It turns out it's not easy. So I'll talk about how Haskell does it, how that compares to Python, what we can learn, and what we already know but can't get to because of our legacy. Currently I'm working for a company called Snabb, where we're building open-source networking software; as one of the infrastructure engineers I'm setting up the whole pipeline for testing and benchmarking.
We got types in Python, so clearly we're improving Python even though it's more than 25 years old, and Haskell is definitely an inspiration here. So clearly there are things to improve and to learn.
Let's start with how Haskell does packaging. The tool is called Cabal, and you would have a file like this. It has a special kind of syntax: at the top you see some metadata about the package, and at the bottom you can say that the source tree has a library, that there is also an executable, that it has these dependencies, which source directory it lives in, and so on. One thing you'll notice, compared to Python, is that this is just a file you can parse, whereas in Python we have a script that has to run to actually do anything. I'll explain later why that's a big difference and how it affects pretty much everyone.
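To make the "file you can parse" point concrete, here is a minimal sketch in Python. It is not a real Cabal parser: it only reads top-level `key: value` lines from a simplified, made-up package description (`example-pkg` is hypothetical), and it ignores sections, conditionals and comments. The point it illustrates is that the metadata becomes available without executing any code shipped by the package.

```python
def parse_fields(text):
    """Collect top-level 'key: value' lines into a dict without running any code."""
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, '--' comments and indented lines (nested sections).
        if not stripped or stripped.startswith("--") or line[0] in " \t":
            continue
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (section headers) are ignored
            fields[key.strip().lower()] = value.strip()
    return fields


# Hypothetical, simplified package description used only for illustration.
EXAMPLE = """\
name: example-pkg
version: 0.1.0.0
build-type: Simple

library
  build-depends: base
"""

metadata = parse_fields(EXAMPLE)
print(metadata["name"], metadata["version"], metadata["build-type"])
```

With a `setup.py`, by contrast, the same information only exists after the script has been executed, so any tool that wants the metadata has to run the package's code first.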
If you think about the API: in this case you just parse the file and get metadata back, while in Python the API is the setup function, which does everything. So the format is more approachable, and we'll see that a bit later.
If you were watching carefully, you noticed the build-type line in the file. If it specifies Simple, that means you parse the file and you have all the information you need to install that package in Haskell. But you can also set the build type to Make or to Custom. With Make, it will run the makefiles and skip the Haskell build process; with Custom, it will run a Haskell program with specific hooks where you can write code. So you have the power to go from something very simple to writing arbitrary code. Fortunately the Custom type is not used much, because it is very poorly documented, which is also a good thing, because people stick to Simple.
In Python we have PEP 518, which I think is not accepted yet. It talks about basically how to hijack the setuptools build process so that you can define your own build process. This is in progress, and you'll be able to not even touch the setuptools machinery and do whatever you want; you'll have the freedom, for example, to write a makefile for a Python package. This will of course be integrated into pip and all the tools, which is really nice: we will finally be able to move forward from the legacy infrastructure.
Now a little about the advanced features in Cabal. For example, in Haskell you can say: I want to have this flag that you can toggle. If we have a flag like debug, we can describe it and provide a default, and then throughout the file we can write conditionals such as: if this flag is enabled, then this option is configured, and so on.
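The flag mechanism described here can be sketched as plain data plus a tiny evaluator. This is an illustration in Python, not Cabal syntax or Cabal's actual resolution logic; the structure of `PACKAGE`, the dependency name `debug-helpers` and the function `resolve_dependencies` are all assumptions made for the example.

```python
# Hypothetical package description expressed as plain data (not real Cabal syntax).
PACKAGE = {
    "flags": {
        "debug": {"description": "Enable debug output", "default": False},
    },
    "build_depends": ["base"],
    "conditionals": [
        # "if flag(debug) then add these dependencies"
        {"if_flag": "debug", "then": {"build_depends": ["debug-helpers"]}},
    ],
}


def resolve_dependencies(package, overrides=None):
    """Start from the flag defaults, apply user overrides, then evaluate conditionals."""
    flags = {name: spec["default"] for name, spec in package["flags"].items()}
    flags.update(overrides or {})
    deps = list(package["build_depends"])
    for cond in package["conditionals"]:
        if flags[cond["if_flag"]]:
            deps.extend(cond["then"]["build_depends"])
    return deps


print(resolve_dependencies(PACKAGE))                   # default: flag off
print(resolve_dependencies(PACKAGE, {"debug": True}))  # flag toggled on
```

Because the conditionals are data rather than code, any tool can read them and compute the dependency list for a given flag assignment without running a build script.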
It's like a very simple language with just if statements and nothing hard. This gives you the flexibility of saying, for example, if you have a library, whether you want HTTPS or not. But there are downsides as well. In Haskell, for example, once a package is compiled there is no way to know at runtime which flags were used; you just don't know. Also, you can say that if the HTTPS flag is enabled, then these dependencies are added, but it also works the other way around: if for some reason those dependencies are in the environment, that flag will be enabled by default. So there is some magic, and that has downsides too. One thing I learned in packaging is that features are really problematic: once you start introducing them you have to support them, and these kinds of things are really painful in the long run.
In Python we have PEP 508, which covers environment markers. For a dependency you can say, for example, that it is only installed on Python 3 and on Windows. This is already supported in pip, but not many people use it because they don't know about it. The idea is that you don't write imperative Python code saying "if we're on Windows, add this dependency"; you state the dependency and the marker for Windows. That gives everyone else the possibility to get this information as well, to parse the marker and do something with it. I'll talk later about what we're doing with that.
Hackage is the Haskell equivalent of PyPI, the package index; you publish your packages there. Here is just one example of a feature that is really painful to support in the long run: on Hackage you can edit the Cabal files in place through the website. That means that if you release version 0.1, somebody can go and edit that file and remove a dependency, and then it's not really 0.1 anymore. It's slightly modified, so it's not the same thing. In that case Hackage adds an x-revision line to the Cabal file. Now think about it: I have this local process where I release the software, and then I can also edit it online.
But then what happens if I bump the version and push it to Hackage, and so on? Suddenly there are a lot of stateful things going on. This might have seemed like a good idea, and maybe some people wanted it, but for everyone else who uses Hackage to download packages and has to figure out the state, it is really problematic, especially if you want reproducible builds. Once you edit that file, the hash changes. So all the people who said "download this file, and this is its hash" would suddenly get a mismatch, and we really don't want to enforce a culture where you just update the hash to whatever it is now, because then there is really no point. These kinds of features are present in Haskell and they are also present in the Python world, and they give us headaches every day. On Hackage you can request the .cabal file for a package and version and get these revisions, but you basically have two versions: first a version and then a revision, and handling those becomes a pain.
Haskell is one year older than Python, and they have also been on this path of improving the packaging ecosystem. Until about two years ago they had this problem: in your Cabal file you have to specify the dependencies, and we all know that not all software packages work well together. In the case of Haskell, because of the types, a package would get a new version in which the types changed, and suddenly your usage of that package wouldn't work: the Haskell code wouldn't compile. This was the biggest problem they had, and it is called Cabal hell. When a package got a new version and things stopped compiling, you would start putting in constraints and so on, so every developer would do this for himself or herself.
It's just a big waste of time trying to figure out which packages compile together. I'll talk about how Haskell solved this, but first an interesting aside: Elm, which is also a functional language, solved it differently. They basically said that in your dependencies you always have to specify the limits of the major version. So if you depend on the package http, it has to be, say, between version 5 and 6. And if you upload a package and its API has changed, the package index won't let you upload it unless you bump the major version. It is basically semantic versioning enforced on the package: the package manager forces you not to change the types, the signatures, unless you bump the major version. That's really nice. We can't do that in Python, unfortunately, because there is no real way to check whether the API changed. Of course we could parse the API and so on, but that's a gray area; hopefully it is something we will be able to do one day.
So how did Haskell solve it? The tool, Stack, was released in 2015, so just one year ago. Stackage is a stable source of Haskell packages; it guarantees that packages build consistently and pass their tests before generating nightly and LTS snapshots. What does that mean? They built a site where you, as a maintainer, can log in, provide some information and say "I am maintaining these packages on Hackage." Then they take the whole dependency tree of your package, build it, and see whether everything passes. Then they say "we used these versions, and this set compiled," and they provide an API for that, so you can get those versions. If you think about it, in Python we have requirements.txt, but everyone has their own set of versions. In Haskell they pretty much crowdsourced that: they have a website where all those versions are tested and compiled.
People use that as a community effort; it is not something you commit to your repository and hope for the best. If you want backwards compatibility, for example, you depend on Stackage LTS 6, and then for all the minor versions of 6 the API of the packages in that set does not change, but you still get security updates and so on. When you're ready, you move on: a new major version usually means a newer GHC, which is the main Haskell compiler. Then you fix the compiler errors and you go to the next LTS. I think that's very interesting, because they are doing all the work together in one place instead of everyone in their own garden. I'm not sure we could do something like this in Python, because it is way more complicated than just compiling a package and saying it works. But I still think it would be worth the effort to have the major software we use in Python with versions that are community-managed, instead of having this work done by each individual or company. Today our solution is requirements.txt.
Together with Stackage they use a tool called Stack, which is like a wrapper around Cabal, so it can do more things than Cabal alone. You specify a configuration file like this, and you say: use these flags when building the software, and use these packages. Here you say that the package is in the current directory, that the Cabal file is there, and that it is the one that will be used to build this project. You can have multiple of those. Think about how this works in Python: for each package you have to run something like an editable pip install.
That is imperative: you actually have to run it, and when you develop a new feature that spans two packages you have to run it for both of them. This case is declarative: anyone can open that file and know which packages are being added. There is nothing imperative to do first; you just say stack build and it executes the whole thing. So it's way more declarative. At the bottom you see the resolver, which is where you get this big set of versions; here it is LTS 6.7. There you go: you have most of the Hackage packages pinned down, and those work together. There is also a section called extra dependencies, for the dependencies that are not in the LTS. Everything is done as a community effort, so of course if people don't add a package, it's not there. For all the packages that are not part of the LTS, you can specify the versions there, and Stack will complain if you don't.
It has a bunch of simple commands. stack setup is something like virtualenv for us: it will download the compiler and set it up for you, based on the resolver you are using. stack init will generate a kind of mini template for Haskell packages, and so on.
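The resolver idea can be sketched in Python as a lookup rather than a constraint-solving problem. This is only an illustration of the concept, not how Stack is implemented: the package names and versions in `SNAPSHOT` are made up, `pin_versions` is a hypothetical helper, and letting extra dependencies take precedence over the snapshot is an assumption of this sketch.

```python
# Hypothetical snapshot: one tested version per package (names and versions are made up).
SNAPSHOT = {"pkg-a": "1.2.0", "pkg-b": "0.9.1", "pkg-c": "3.0.0"}


def pin_versions(dependencies, snapshot, extra_deps=None):
    """Map every dependency name to exactly one version, or fail loudly."""
    extra_deps = extra_deps or {}
    pinned = {}
    missing = []
    for name in dependencies:
        if name in extra_deps:        # explicitly listed by the project
            pinned[name] = extra_deps[name]
        elif name in snapshot:        # covered by the community-tested set
            pinned[name] = snapshot[name]
        else:
            missing.append(name)
    if missing:
        raise LookupError("not in snapshot or extra deps: " + ", ".join(missing))
    return pinned


pins = pin_versions(["pkg-a", "pkg-x"], SNAPSHOT, extra_deps={"pkg-x": "0.1.0"})
print(pins)
```

The design choice worth noticing is that the project file only names a snapshot and lists the few packages outside it; the long list of exact versions lives in one shared, tested place instead of in every repository.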
So that's what Stack is, and the community was really, really happy with it. Right.
So now we have Hackage with all the packages, and Stackage with sets of versions. My job, what I'm doing, is figuring out how we distribute all of this to users so that they can get it seamlessly, and so that it works on whatever platform they use. We're doing this with Nix. Nix is a purely functional language based on a PhD thesis by Eelco Dolstra. I still recommend the thesis to anyone interested in packaging, and in how functional language concepts can change the thinking and improve a lot of the things we have problems with.
For our purposes, what we have is Nixpkgs, a collection of Nix expressions that specify how software should be built, similar to APT or something else in the distributions, except that we are not tied to a Linux distribution; we support Darwin and Linux. Why would you need this layer on top of upstream Hackage or PyPI? Because we can take care of system dependencies, we have a build system that will compile these packages into binaries for you, and we have a really powerful API, which you'll see later, so that you can actually go in and change those packages and treat them the way you want: apply some patches, bump versions, or whatever you need. So it's not that you have to either use what we have or get nothing; you have the power to change it.
Most importantly, in Nixpkgs we have all the Haskell packages. We don't compile them for every compiler version, because of the CPU power and disk space that would need, so we take only one GHC version, the latest stable one, and with that compiler we compile all the packages, well, most of them. That way we can distribute binaries. The user can then say: I have this project, these are its dependencies, and I want binaries. The package manager will download them, and you haven't compiled anything except your own package. That's really nice, especially because you can share that environment with others.
So how do we get that done for Haskell, and why is it so hard for Python to accomplish this? This is the Haskell infrastructure that we have, so let me explain what is going on here. In the upper left corner you see Hackage, which has all the packages. Then there is a script that goes and downloads all of them, calculates the hashes and commits everything into a Git repository called all-cabal-hashes, which has all the Cabal files. So you can go through all of them, parse them, generate dependency trees and so on, whatever you want to do. Then those hashes and Cabal files are taken, and there is the Stackage Nightly thing, which gives you a view of what currently builds; that is a continuous process. Based on Stackage Nightly, when things have stabilized, they make the LTS Haskell releases that you've seen before, which say: this set compiles together, so let's take those versions. This is what the upstream Haskell community provides. Then we have a package that parses the all-cabal-hashes repository and the Stackage repository, and generates hackage-packages.nix and configuration-lts.nix. In hackage-packages.nix, every version of every package is specified along with how we should build it. This is all generated from the Cabal files; it's a one-to-one mapping.
Some features we don't support and some we do, so there is room for improvement, but in general it works. The configuration-lts file basically just says: based on the LTS version, here is a long list of pinned dependency versions to be the default ones when you use these Haskell packages. It's just pinning: it says take these versions, because with hackage-packages alone you would always use the latest version, which doesn't always mean that things work. Then there are two more files, configuration-common and configuration-ghc-X. These are the files that have to be manually crafted and maintained. If the Cabal file, for example, doesn't specify system dependencies, we override that and say: for the package http, also take this system dependency, and so on. So basically everything that is not in the upstream Cabal file we override in the common configuration, and the GHC-specific configuration does the same, but based on the GHC version; for example, you disable tests because they don't work, and so on. Those two files we maintain, and everything else is provided upstream by the Haskell community.
Then you have cabal2nix in the middle, and this is what the user gets. When you have your project, you run cabal2nix on your Cabal file and it automatically generates a Nix expression out of it, specifying all the dependencies. There you can say that you want a specific LTS version, or the latest one, or whatever. As a user you just run cabal2nix, and you get basically the whole set of dependencies that you know are going to work. The generated Nix file also has a function called override, so you can override anything from upstream: take this package but a different version, take this package but apply this patch, or whatever you want.
Then you install this software, and you have a binary-distributed Haskell pipeline. I hope that was not too fast and that it's clear enough.
This is probably the hardest slide, but I would really like to say a few words about the infrastructure in Nix and how these files all work together. It all fits on one slide; it's just not that easy to explain. Basically, what we want is some kind of inheritance: we have different files and we want those files to override each other, so we want a powerful overriding mechanism.
At the top you see a function called fix. That's a fixed point, which is how you do recursion in a functional language: it is basically calling itself, applying the function to its own result. How it works is that it takes its output and feeds it back in as the input, and because the language is lazy, nothing is evaluated until you reference it. For example, in the middle I define something you would call a dictionary; in Nix it's called an attribute set, but it's pretty much the same. I say I have an attribute foo with the value "foo" and an attribute bar with the value "bar", but foobar is actually the sum of the two, self.foo plus self.bar. And self is really just the input of this function. The function gets self as a parameter, but self is actually the output of the function itself. So when it references self.foo, self is the very set that is being defined; it looks up foo and gets the value back. It's just recursion through a function, nothing really fancy. When you call fix on this function and ask for foobar, you get the value "foobar" back. This is how we do dependencies, and how you can reference different things in the set.
Now that we have that, we want a little more flexibility, so we define a function called extends. I won't go into how fix and extends are defined, but look at how they are used. The override function you give it accepts two things, self and super. self is the input, and super is the output of the previous dictionary. So you have the power to take the previous configuration and reference either its input or its output; you have both. In this case I say: take foo, take the output super.foo, and reverse it. If I then call fix on extends with this override function and the dictionary, which means extend the dictionary and override it with this function, you will see that the foobar value is different, because we have reversed foo. That gives us the power to override the dictionary using either inputs or outputs. If you apply the override twice, which is not shown here, you get "foobar" back. So that gives you all the power to override things.
And this is how we use it to combine all these files. At the top is fix, which takes care of the recursion. Then I take all of the Haskell packages, the common configuration, the compiler-specific configuration and the package-set configuration, and at the bottom there are the overrides, so you can hook into it and change everything from upstream.
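The fix and extends pattern can be sketched in Python. Python is not lazy, so this sketch emulates laziness by storing each attribute as a zero-argument function that is evaluated on first lookup. It is an illustration of the idea, not the Nix definitions: the `LazyAttrs` class and the names `base` and `reverse_foo` are assumptions made for the example.

```python
class LazyAttrs:
    """Attribute set whose values are computed only when looked up."""

    def __init__(self, thunks=None):
        self.thunks = thunks or {}  # name -> zero-argument function
        self.cache = {}

    def __getitem__(self, name):
        if name not in self.cache:
            self.cache[name] = self.thunks[name]()
        return self.cache[name]


def fix(f):
    """Tie the knot: f receives the very set it is producing."""
    result = LazyAttrs()
    result.thunks = f(result)
    return result


def extends(overlay, base):
    """Layer overlay(self, super) on top of base(self)."""
    def combined(self):
        base_thunks = base(self)
        super_ = LazyAttrs(base_thunks)  # the set as it was before this overlay
        return {**base_thunks, **overlay(self, super_)}
    return combined


def base(self):
    return {
        "foo": lambda: "foo",
        "bar": lambda: "bar",
        # Refers to the final set, so later overrides of foo are picked up.
        "foobar": lambda: self["foo"] + self["bar"],
    }


def reverse_foo(self, super_):
    # Override foo using the previous layer's value.
    return {"foo": lambda: super_["foo"][::-1]}


print(fix(base)["foobar"])                                              # foobar
print(fix(extends(reverse_foo, base))["foobar"])                        # oofbar
print(fix(extends(reverse_foo, extends(reverse_foo, base)))["foobar"])  # foobar
```

Stacking several `extends` calls mirrors the layering described in the talk: generated packages at the base, then hand-maintained configuration, then user overrides on top, with every layer able to see both the final result (`self`) and the layer beneath it (`super`).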
So in Python, currently in Nix, we add packages manually. Why? Because of this problem: we have a set of setup.py scripts, and we would have to run all those scripts to actually figure out what is going on. Someone would need to take that on and, for everything on the Python Package Index, generate some JSON or something similar with all this metadata, which we could then use to automate this. And we would need to maintain a global requirements.txt for the full Python Package Index. These are the two big projects one would need to tackle in order to have the same infrastructure. Then we would be able to build the whole Python Package Index, basically, and distribute it to people.
The first problem is kind of being solved; the community is trying to get there, but we still don't have a way to do it today. The infrastructure is improving, though: we got wheels, and we are getting the new Python Package Index, called Warehouse, which is going to be tested and changeable, and so on. Everything around it is changing, but this is still not available today. With the build system PEP I mentioned before, it will be possible to have tools other than setuptools to build Python packages, and hopefully one day we will have a standard one that is static and declarative instead of a script you have to run. As for the second problem, I don't currently know whether anyone is solving it, crowdsourcing the versions, but it's definitely something we will have to solve ourselves, or someone will have to solve it for us.
So Python is actually doing quite well, in the sense that all of these things are being worked on. One thing that's really missing, if you think about it, is that it's still not declarative enough. We have so many files: you have setup.py, setup.cfg, requirements, the manifest, and now pyproject.toml is coming for the problems we just discussed.
It's just a lot of different things you have to deal with, whereas in Haskell there are just two: the Cabal file and stack.yaml. It's really hard to get rid of these, because they are legacy, but it's a lot of information people have to know to actually use it. This is improving, but it's still a work in progress.
This talk was based on Peter Simons' talk, Inside of the Nixpkgs Haskell Infrastructure. If you want to see that talk, it goes a little more into the details of how it all works. I hope you've seen what the current limitations are. At the same time, I would still like to thank the Python Packaging Authority and everyone who is working on improving the ecosystem. It's really hard to have 25 years of legacy and then just replace all of it and say it's going to work out. It's going slowly, but there is progress. So thank you.
We have time for questions.