Growing a Workflow Language with GNU Guix

Formal Metadata

Title
Growing a Workflow Language with GNU Guix
Subtitle
Extending a reproducible software deployment system for HPC
Number of Parts
637
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
There are dozens of domain-specific languages that allow scientists to describe complex workflows. From the humble, generic GNU Make to large-scale platforms like Apache Airflow, you would think that there is something to satisfy everyone. All of these systems have one thing in common: they focus strongly on partitioning large computations and scheduling work units, but when it comes to managing the software environments that form the context of each planned computation, they are often remarkably reluctant to offer opinionated solutions. Software management and deployment often seem like an afterthought. Workflow language designers increasingly seem to follow the DevOps trend of resorting to opaque application bundles to satisfy application and library needs. While this strategy has some advantages, it also comes with downsides that rarely seem to be weighed carefully. We present the Guix Workflow Language, not as a solution to the question of software deployment in HPC workflows, but as an instance of convergent evolution: growing a workflow language out of a generic reproducible software management and deployment system (GNU Guix) instead of sprucing up a workflow language with software deployment features. We hope to encourage a discussion about the current state of workflow languages in HPC: when it comes to software and distributed computations, are we approaching the peak, or are we circling a local maximum?