
Troika: Submit, monitor, and interrupt jobs on any HPC system with the same interface

Formal Metadata

Title
Troika: Submit, monitor, and interrupt jobs on any HPC system with the same interface
Series title
Number of parts
542
Author
Contributors
License
CC Attribution 2.0 Belgium:
You may use and modify the work or its content for any legal purpose, and copy, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
There is a wide variety of HPC systems across the world, and nearly as many ways of interacting with them through job submission systems. As a result, migrating complex HPC workflows from one system to another can prove challenging. We present Troika, a tool that aims to abstract the details of the job submission system away from the user, providing a single entry point for submitting, monitoring, and interrupting jobs on multiple HPC systems. Troika allows for a site-agnostic job script with directives, which can be translated, based on configuration, into a script that the job submission system understands. Troika has been designed with extensibility in mind, to support as many job submission systems as possible, as well as differences in how such systems are used. Troika is free software written in Python, exposing multiple entry points for hooks and plug-ins. It is a fundamental part of ECMWF's 24/7 time-critical operational and research workflows, acting as the glue between the batch scheduler and the workflow manager, where it handles hundreds of thousands of jobs each day. We will present how Troika works and give insights into its current and future applications.
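
To make the idea of a site-agnostic job script more concrete, here is a minimal sketch of configuration-driven directive translation. It is not Troika's actual code, API, or directive syntax: the `#DIRECTIVE` marker, the `SITE_CONFIG` table, the directive names (`name`, `walltime`, `queue`), and the `translate` function are all hypothetical and chosen only for illustration. The only real elements are the scheduler options that the sketch emits (`#SBATCH --job-name`, `--time`, `--partition` for Slurm, and `#PBS -N`, `-l walltime`, `-q` for PBS).

```python
# Illustrative sketch only: NOT Troika's implementation or directive syntax.
# The marker, config layout, and directive names below are assumptions.

# Hypothetical per-site configuration: maps generic directive names to the
# option syntax of a given job submission system.
SITE_CONFIG = {
    "slurm": {
        "prefix": "#SBATCH",
        "options": {
            "name": "--job-name={}",
            "walltime": "--time={}",
            "queue": "--partition={}",
        },
    },
    "pbs": {
        "prefix": "#PBS",
        "options": {
            "name": "-N {}",
            "walltime": "-l walltime={}",
            "queue": "-q {}",
        },
    },
}

# Hypothetical marker for generic directives in the job script.
MARKER = "#DIRECTIVE "


def translate(script: str, scheduler: str) -> str:
    """Rewrite generic directive lines into scheduler-specific ones.

    Lines that do not start with MARKER are copied unchanged. An unknown
    scheduler or directive name raises KeyError.
    """
    site = SITE_CONFIG[scheduler]
    out = []
    for line in script.splitlines():
        if line.startswith(MARKER):
            key, _, value = line[len(MARKER):].partition("=")
            template = site["options"][key.strip()]
            out.append(f"{site['prefix']} {template.format(value.strip())}")
        else:
            out.append(line)
    return "\n".join(out)


GENERIC_SCRIPT = """#!/bin/bash
#DIRECTIVE name=forecast
#DIRECTIVE walltime=00:10:00
#DIRECTIVE queue=debug
echo "running"
"""

if __name__ == "__main__":
    print(translate(GENERIC_SCRIPT, "slurm"))
    print()
    print(translate(GENERIC_SCRIPT, "pbs"))
```

In this sketch, the same generic script yields `#SBATCH` lines for the `slurm` entry and `#PBS` lines for the `pbs` entry, so moving to another site would only require a different configuration entry rather than changes to the job script itself.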