Simplifying the creation of Slurm client environments

Formal Metadata

Title
Simplifying the creation of Slurm client environments
Subtitle
A Straw for your Slurm beverage
Title of Series
Number of Parts
542
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Slurm is the most widely used batch scheduler for HPC systems. The open source software community is very active in the Slurm ecosystem, contributing CLI tools for accounting, monitoring, and notebooks, among others. Many of these client environments are now built in containers, which have become a ubiquitous part of running applications. However, this way of working poses new challenges in HPC environments, especially with Slurm: it requires careful management of shared cluster secrets and cluster-wide configuration files, which must be kept in sync for the cluster to work efficiently and securely. This talk proposes a novel and simple tool called straw, which allows the creation of secret-less and config-less Slurm client environments, thereby simplifying the creation of (containerised) environments by removing the burden of maintaining config files, sensitive munge secrets, and additional daemons. The talk will first provide an introduction to Slurm, followed by a description (mostly drawn from personal experience) of common patterns and pitfalls when creating containers that interact with Slurm clusters for different purposes (monitoring, notebooks, etc.). Next, I will introduce Straw, explaining why it was needed and why, despite its simplicity (it mostly just fetches a bunch of config files), it is able to perform a task that regular Slurm tools can't. Finally, I will conclude with a simple example of how the tool can be used and how it compares to the usual scenarios in which config files, extra daemons, and secrets need to be carefully managed. If time allows, I might detail some of the weaknesses of this approach: the Slurm protocol isn't really documented, so the tool relies on "reverse engineering" (as much as one can call it reverse engineering when no documentation exists but the code is available) to keep up with new Slurm releases.