Nix as HPC package management system


Formal Metadata

Title
Nix as HPC package management system
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
2018
Language
English

Content Metadata

Abstract
Modern High Performance Computing systems are becoming larger and more heterogeneous. The proper management of software for the users of such systems poses a significant challenge. These users run very diverse applications that may be compiled with proprietary tools for specialized hardware. Moreover, the application life-cycle of this software may exceed the lifetime of the HPC systems themselves. These difficulties motivate the use of specialized package management systems. In this presentation, we outline an approach to HPC package development, deployment, management, sharing, and reuse based on the Nix functional package manager. We report our experience with this approach inside the GRICAD HPC center in Grenoble, France, and compare it to other existing approaches. --- After a PhD in applied mathematics, I have worked in the scientific computing field, with a strong HPC component.
Our next talk is by Pierre-Antoine Bouttier, who will talk to us about using Nix in the environment of high-performance computing. Take it away, and give a warm applause and welcome. Thank you. So, I'm Pierre-Antoine Bouttier. As you may hear, I'm French, so I'm terrible at speaking English. Also, I'm coming from applied mathematics and numerical computing, so I'm terrible at writing in functional languages and at administrating systems. Given these statements, this is the right time to talk about how we administrate software environments on our HPC clusters with a tool written in a functional language. All right. First, who are we? GRICAD is a
public entity in France, in Grenoble more precisely, and among other services we provide computing power and efficient storage solutions (we have distributed file systems) for all Grenoble researchers. We have two main clusters: an HPC (high-performance computing) cluster named Froggy, and a more data-analysis-oriented cluster named Luke (yes, as in Luke Skywalker). So first,
what is an HPC cluster? Very briefly, because maybe some of you know nothing about high-performance computing: basically, a cluster is a set of interconnected Linux computing nodes. A node has CPUs, memory, maybe graphics cards, and one or several head nodes share a common network file system with the computing nodes. Users access the cluster through SSH: a user connects to a head node and launches simulations from there onto the computing nodes. So, we have two clusters.
The first one is Froggy. It's an HPC-oriented cluster: it has many CPUs, a lot of memory, and a high-performance network between computing nodes, which is very important for us in the HPC case. And it has a direct liquid cooling system, which is nice. It's dedicated to simulations that need a lot of communication between nodes, so typically the codes that run on this cluster are parallelized in distributed memory, classically with the MPI library. The second main cluster,
because we have other smaller, older, or not-in-production clusters, is called Luke. It is a bit different from Froggy because it's a heterogeneous cluster: its nodes have different numbers of cores and different amounts of memory. Researchers can buy nodes, we plug them into Luke, and we take care of the maintenance of those nodes, etc. Memory sizes range between 24 and 512 gigabytes depending on the node. This kind of cluster is more adapted to a larger range of uses: batch simulations, experimentation, shared-memory parallelization. And above all, this
cluster, and a bit more than this cluster: all the GRICAD computing platforms are integrated into a grid that aims at optimizing resource usage. Users can launch batches of jobs through the grid onto the unused resources of our platforms. Just a graph to explain this
briefly: we have several clusters, and each cluster has the same batch scheduler, which is OAR (similar to PBS or Slurm, for those who know them), and with the grid, users can launch their jobs on all the unused resources of these different platforms.
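Concretely, a user session looks roughly like this (a sketch; the host name and the OAR resource request are illustrative):

    # Connect to a head node, then ask the batch scheduler (OAR) for
    # computing nodes -- host name and resources are made up here.
    ssh alice@froggy.example.org
    oarsub -l nodes=4,walltime=02:00:00 ./run_simulation.sh   # submit a batch job
    oarstat -u alice                                          # check its state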
Now, about the software environment among all these clusters: we have some specificities in our
environment. We have very heterogeneous users from different scientific fields; we address the whole research community in Grenoble, so we have high-energy physicists, geophysicists, biologists, economists, etc. We have users that are complete newbies, and we have power users, advanced users better than me. We have different use cases: batch simulations, MPI codes with millions of lines, old software in Fortran 77 next to Python 3.6, with a lot of differences in low-level languages, etc. And proprietary versus free software: MATLAB, for example. From this you derive different requirements: performance for some of them, a more human-friendly access to the machines for others, and obviously, for research, simulation reproducibility, portability, etc. Also,
another issue is that we have different clusters with different hardware characteristics, different operating systems, and different physical hosting, which can cause network issues. So, how can we respond
adequately to everyone's needs with our resources? How did we do it in the past? We used the
module command. It's a tool used in lots of computing centers today: system administrators compile the software and libraries needed by the researchers on each cluster, and after that, users can load those software packages via the module command: module load openmpi, module load matlab, for example. The problem: for each software, for each version, someone has to compile everything that is needed, and that work has to be done on each cluster; there is no portability and no reproducibility guaranteed at all from one cluster to another, and no sharing of the work done on one module; and obviously, modules cannot be used on the laptop or desktop of a researcher.
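For the record, a typical module-based session looks something like this (a sketch; module names and versions vary from one cluster to another, which is exactly the problem):

    # Legacy workflow: administrators pre-compile everything, and users
    # load whatever happens to be installed on this particular cluster.
    module avail                  # list software compiled on this cluster
    module load openmpi/3.1.1     # an administrator-built MPI
    module load matlab
    mpirun -np 64 ./my_simulation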
So, why Nix on our clusters? For these points. For maintenance: it's a functional package manager, with no side effects, which is cool for us, and it allows package creation and installation without root privileges, which is crucial for us, the cluster administrators. Also reproducibility and immutability, which are crucial for research. Portability, obviously: researchers can work on their laptop and install the same environment on the cluster, and they are happy with that, very happy. And finally, one channel and one binary cache serve many clusters: the job is done once and for all.
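For example, an unprivileged user can build the same environment on a laptop and on the cluster (a sketch; the package selection is illustrative):

    # As a normal user, no root privileges needed -- the same commands
    # produce the same environment on a laptop and on the cluster.
    nix-env -iA nixpkgs.python36 nixpkgs.openmpi
    nix-env -q     # list what is installed in the user's profile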
We also have some specificities that are well covered by Nix. We are a heavily multi-user installation. We can integrate restricted or community-specific packages in our own channel. We can handle many build options, because researchers often ask us to compile this code with this version of the Intel compiler and this version of this MPI library, itself compiled with this version of this Intel compiler, etc. We call that the combinatorial nightmare, in terms of the versions of the software we have on the cluster.
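In Nix, one point of that combinatorial space can be expressed with overrides, along these lines (a sketch: mySimulation and intelStdenv are hypothetical attribute names, not packages from our channel):

    # Build a research code against an MPI that is itself built with a
    # specific compiler -- mySimulation and intelStdenv are hypothetical.
    nix-build -E 'with import <nixpkgs> {};
      mySimulation.override {
        mpi = openmpi.override { stdenv = intelStdenv; };
      }'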
And finally, we want a system that is operating-system independent. The cherry on the cake: many Nix packages already exist, so we are very happy. Nix seems totally appropriate for us. So now, our Nix setup.
It is more or less classic: we have our clusters with a common distributed storage; we set up our channel on a web server, where we also host the binary cache and the GRICAD channel, and users can access them from the various clusters.
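Subscribing to that channel from a cluster is then the usual channel dance (a sketch; the URL is a placeholder):

    # Add the GRICAD/CIMENT channel -- the URL is a placeholder.
    nix-channel --add https://nix.example.org/channels/ciment ciment
    nix-channel --update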
The /nix directory is shared on all the head and computing nodes of each cluster, with a bind mount on an NFS filesystem on all of our nodes.
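Such a mount can be set up along these lines (a sketch; the server name and export path are assumptions):

    # On each node: NFS-mount the shared store, then bind-mount it so it
    # appears at its canonical /nix path -- server and paths are made up.
    mount -t nfs storage:/export/nix /mnt/nix
    mount --bind /mnt/nix /nix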
So, the main steps to install Nix on one cluster. First, install the Nix tools; I'm not an expert on this part, so you can ask questions about it, but I won't answer them. Nix and module commands can coexist easily, and that's necessary for us for now. Then set up the nix-daemon. And when a user connects to the cluster, they have to source a script: the script adds the path of the Nix
binaries, sets the NIX_PATH variable, initializes the per-user directories and configuration files, and sets the environment variables necessary to use Nix. Then, the CIMENT channel: CIMENT is another name for the GRICAD computing center. It's a copy of the nixpkgs release channel, plus some packages, for example the Intel compilers or research codes, that could possibly go upstream; in fact, some of them have gone upstream already. Our channel and binary caches are already set up, and currently we have 17 packages in the channel, not counting those which have gone upstream.
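The sourced script does roughly the following (a sketch; the channel location is an assumption):

    # Sourced at login on the head nodes -- the channel path is made up.
    export PATH="$HOME/.nix-profile/bin:$PATH"    # per-user profile binaries
    export NIX_PATH="nixpkgs=/nix/ciment-channel" # hypothetical channel path
    export NIX_REMOTE=daemon                      # go through the nix-daemon
    mkdir -p "$HOME/.config/nix"                  # per-user configuration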
So, our workflow, to summarize (but I have already said most of this): on each cluster we have head nodes with the nix-daemon running on them; we have the web server with the binary cache and the channel; any power user can contribute to the channel; and sometimes, when we have tested our code on our clusters, we do a pull request upstream.
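On the client side, pointing a cluster at the shared cache comes down to a couple of lines of nix.conf (a sketch; our own cache URL and key are placeholders here):

    # /etc/nix/nix.conf -- the first URL and key are placeholders.
    substituters = https://nix.example.org/cache https://cache.nixos.org
    trusted-public-keys = nix.example.org-1:AAAA...= cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=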
Some feedback now, after our experiments, which began approximately two years ago. Yes, it is reproducible and reliable, which is crucial for us; it's a fact, and we are very happy with that. Users can install their packages by themselves. It's portable, which is very important for the users, and for us, to do tests. For the most common packages, I mean the packages that users use a lot, installation with a shared binary cache is very quick. We are happy to contribute to a living community, the Nix community, and we are happy to have a living community that has good documentation, a lot of packages, and lots of issues, but also answers to those issues. And finally, our main issue at the beginning was to install an identical software environment easily on different platforms, and that is the case now; it is no longer an issue. So, the cons, more or less.
I think they are not really cons, more a matter of time. As was said yesterday, the Nix language learning curve is steep; it was for us at first, but it is especially steep for IT beginners, our users. In fact, most of our users do not create their packages; we have to do that, and we are not so many to do it. For some languages, the solution for packaging or setting up an environment can be confusing for users, especially for Python. Some packages are tricky to write, and researchers want particular software in a lot of different configurations, so "packages as configurations", which was presented yesterday morning, is a feature we are waiting for with a lot of enthusiasm. And finally, I think doing things properly, especially in public research, takes a lot of time. Thank you for your attention.
[Applause] Thank you. Any questions? Yes. Q: Do you have to do a lot of manual patching for some of the HPC-specific applications that haven't been packaged for Nix yet, like having to alter the rpath and such, in ways that you wouldn't normally do in a normal package? A: Patching the applications? Not so much. It has happened, but not so much yet. I have a bunch of packages still to do, and I think I will have to do that for some of them, but we try to avoid it. Q: On your main cluster, you mentioned that you have nodes with the Nix file system mounted over NFS; I'm curious about the NFS setup. Have you encountered any issues with sharing the Nix store? A: Yes, sometimes it's slow. In fact it's a bit slow, but it is not a breaking point for the moment. Comment from the room: If you have problems with the Nix store on NFS, the NixOS wiki has a paragraph on what you may need to do when exporting it. A (colleague): Maybe I can add to that. The NFS sharing only provides access from the nodes: the users do Nix operations, like package installation and building, only from the head nodes, and the computing nodes just access the binaries; the computing nodes are not writing into the Nix store. So it works. Q: Were you involved in the decision to switch to Nix, and how did that go? Because we're trying to do that at work too, for HPC. A: No, I have been here for approximately one
year and a half, so no; but Bruno was here, and I think he decided to do that. In France, for now, we are the only computing center using Nix. Q: The users can create their own packages, and I saw you have a fork of nixpkgs; do they make their own pull requests to that? A: In fact, yes: they include their packages in our channel, and after that, later, they do a pull request upstream. More questions? No? Okay, then thank you very much. [Applause]