
Nix for data pipeline configuration

Formal Metadata

Title
Nix for data pipeline configuration
Title of Series
Number of Parts
27
Author
License
CC Attribution 3.0 Unported:
You may use, modify, and reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they have specified.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
My team develops a data pipeline to generate music recommendations. It consists of many batch jobs that read data from somewhere and write their output somewhere else, with complex dependencies and parameter tuning. Historically, we have configured these batch jobs with hand-written Bash configuration, or with dedicated Python-based tools such as Airflow. However, both lack flexibility, often forcing the developer to bypass them and run jobs manually during development.

The tasks of data pipeline configuration and package definition share some requirements: both involve running many programs in a specific order and with specific parameters. Since Nix is a language dedicated to package definition, and it allows packages to be expressed in a succinct and highly flexible way, we decided to try using it for data pipeline configuration. Nix-the-tool is too centered on package management for our use case, so we built our own tool around Nix-the-language.

In this talk, we'll explore how to apply Nix to data pipeline configuration. This will give us the opportunity to look at Nix as a language, abstracted from its current ecosystem. We'll also explore how to structure a Nix codebase, encountering the same questions nixpkgs encountered a long time ago, but in a much smaller environment. The main goal of this talk is to share the different perspective on Nix that comes from applying it to a different problem and starting from scratch. We also hope to inspire others to explore other Nix-based DSLs.

Bio: Georges is a Software Engineer at SoundCloud, in Berlin. He is part of the team that generates music recommendations. He loves exploring new ways to solve engineering problems, which led him to look into exciting technologies such as Haskell and NixOS. Some of his favorite hobbies are playing board games and learning German.