Workflow managers in high-energy physics: enhancing analyses with Snakemake

Formal Metadata

Title
Workflow managers in high-energy physics: enhancing analyses with Snakemake
Series Title
Number of Parts
798
Author
Contributors
License
CC Attribution 2.0 Belgium:
You may use, modify, reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of Publication
Language

Content Metadata

Subject Area
Genre
Abstract
Workflow management tools have long been used in scientific computing to organise and operate workflows. Many such tools, e.g., Snakemake, Luigi, and Toil, have grown from the foundation of Make (wherein users define simple rules with interdependent inputs and outputs), incorporating additional features to suit increasingly complex user needs. Initially seeing a widespread uptake in bioinformatics, workflow managers have become commonplace in many fields, for example, high-energy physics (HEP). Analyses in HEP typically consist of many non-trivially related processes with widely varying requirements. Workflow managers can vastly simplify such analyses, providing user-friendly methods to define, review and run analysis workflows. Snakemake has emerged as a leading workflow manager for HEP, with an established user base spread across major experiments. Dialogue between developers and HEP has led to integrations for distributed storage/transfer frameworks, e.g., XRootD, FTP and Amazon S3, and scheduling frameworks, e.g., HTCondor, Slurm, and DRMAA. These integrations enable analysts to better leverage the distributed computing resources made available by experiments, significantly improving the efficiency of HEP analyses. Further collaboration between analysts and developers has seen Snakemake form the core of several standardised analysis frameworks aimed at improving analysis reproducibility such as REANA. This contribution discusses the current use of workflow managers in HEP, including best practices for their application. Additionally, the anticipated requirements of analysts are considered within the context of ever-increasing data scales in HEP.
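The Make-style model the abstract refers to, in which rules declare inputs and outputs and the tool works out the execution order, can be illustrated with a minimal Python sketch. This is not Snakemake's API or implementation; the `Rule` class, the `plan` function, and the file names (`raw.root`, `selected.root`, and so on) are hypothetical and chosen only to resemble a small HEP analysis chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A Make-style rule: a named step that turns input files into output files."""
    name: str
    inputs: tuple
    outputs: tuple


def plan(rules, target, existing=frozenset()):
    """Return rule names in an order that produces `target`.

    `existing` holds files that are already available and need no rule.
    """
    # Map each output file to the rule that produces it.
    producers = {out: rule for rule in rules for out in rule.outputs}
    order = []
    visiting = set()

    def visit(path):
        if path in existing:
            return
        if path not in producers:
            raise ValueError(f"no rule produces {path!r}")
        rule = producers[path]
        if rule.name in order:
            return  # already scheduled
        if rule.name in visiting:
            raise ValueError(f"cyclic dependency at rule {rule.name!r}")
        visiting.add(rule.name)
        # Schedule everything this rule depends on before the rule itself.
        for dep in rule.inputs:
            visit(dep)
        visiting.discard(rule.name)
        order.append(rule.name)

    visit(target)
    return order


# Hypothetical three-step analysis, deliberately listed out of order.
RULES = [
    Rule("plot", ("fit_result.json",), ("mass_plot.pdf",)),
    Rule("fit", ("selected.root",), ("fit_result.json",)),
    Rule("select", ("raw.root",), ("selected.root",)),
]

print(plan(RULES, "mass_plot.pdf", existing={"raw.root"}))
# → ['select', 'fit', 'plot']
```

Requesting only the final target is enough: the order of the intermediate steps is derived from the declared inputs and outputs rather than written out by the analyst, which is the property that workflow managers build on.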