
Keeping the HPC ecosystem working with Spack CI

Formal Metadata

Title
Keeping the HPC ecosystem working with Spack CI
Subtitle
Scaling a modern CI workflow to the Spack ecosystem
Title of Series
Number of Parts
542
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The Spack package manager is widely used by HPC sites, users, and developers to install HPC software, and the Spack project began offering a public binary cache in June of 2022. The cache includes builds for x86_64, Power, and aarch64, as well as for AMD and NVIDIA GPUs and Intel's oneAPI compiler. Currently, the system handles nearly 40,000 builds per week to maintain a core set of Spack packages. Keeping this many different stacks working continuously has been a challenge, and this talk will dive into the build infrastructure we use to make it happen. Spack is hosted on GitHub, but the CI system is orchestrated by GitLab CI in the cloud. Builds are automated and triggered by pull requests, with runners both in the cloud and on bare metal. We will talk about the architecture of the CI system, from the user-facing stack descriptions in YAML to backend services such as Kubernetes, Karpenter, S3, and CloudFront, as well as the challenges of tuning runners for good build performance. We'll also talk about how we've implemented security in a completely PR-driven CI system, and the difficulty of serving all the relevant HPC platforms when most commits come from untrusted contributors. Finally, we'll talk about some of the architectural decisions in Spack itself that had to change to better support CI.
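
To make the "user-facing stack descriptions in YAML" concrete, below is a minimal sketch of a Spack environment (spack.yaml) with a CI section in roughly the newer ci:/pipeline-gen: schema. The specs, mirror URL, container image, and runner tags are illustrative placeholders, not values taken from the talk or from Spack's actual pipelines.

# spack.yaml -- illustrative sketch only; all names, tags, and URLs are placeholders
spack:
  specs:                        # the software stack this pipeline keeps building
    - zlib
    - hdf5 +mpi
  mirrors:
    buildcache: s3://example-spack-binaries        # placeholder binary cache backing the builds
  ci:
    pipeline-gen:
      - build-job:
          image: ghcr.io/example/spack-builder:latest   # placeholder build container
          tags: [spack-runner, x86_64]                  # placeholder GitLab runner tags

From an environment like this, the spack ci generate command emits a GitLab pipeline definition with one job per package build, which is how a single pull request fans out into the many per-package jobs behind the roughly 40,000 builds per week mentioned above.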