
Reproducible bioinformatics workflows: A case study with software containers and interactive notebooks

Formal Metadata

Title
Reproducible bioinformatics workflows: A case study with software containers and interactive notebooks
Title of Series
Number of Parts
23
Author
License
CC Attribution 3.0 Germany:
You are free to use and adapt the work or its content for any legal purpose, and to copy, distribute, and make it publicly available in unchanged or adapted form, provided you credit the author or rights holder in the manner they specify.
Identifiers
Publisher
Year of Publication
Language

Content Metadata

Subject Area
Genre
Abstract
Specifying bioinformatics workflows reproducibly is challenging because of their complexity. We developed a new statistical method in the field of circadian rhythmicity that makes it possible to rigorously determine whether measured quantities such as gene expression are not rhythmic. The statistical method itself is implemented in the R package "HarmonicRegression", available from the CRAN repository. However, the bioinformatics workflow is much larger than the statistical test. For instance, to ensure the validity of the statistical method, we simulated data sets of 20,000 gene expression profiles over a wide range of parameter combinations (e.g. sampling interval, fraction of rhythmicity, number of outliers). We now demonstrate the use of Jupyter notebooks to document and distribute our statistical method and its application to both simulated and experimental data sets. The notebook runs inside a Docker software container, which ensures complete long-term reproducibility of the workflow.
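To make the idea of fitting a harmonic model to simulated expression data concrete, here is a minimal Python sketch. It is not the authors' workflow and not the API of the R package "HarmonicRegression". It assumes a single cosine component with a fixed 24-hour period, and every function name, parameter, and default value below is hypothetical and chosen only for illustration. It also covers only the model fit, not the statistical test for the absence of rhythmicity described in the abstract.

```python
import math
import random


def simulate_profile(times, period=24.0, amplitude=1.0, phase=0.0,
                     mean=10.0, noise_sd=0.0, rng=None):
    """Simulate one time series: mean + cosine + Gaussian noise (illustrative model)."""
    rng = rng or random.Random(0)
    omega = 2.0 * math.pi / period
    return [mean + amplitude * math.cos(omega * t - phase) + rng.gauss(0.0, noise_sd)
            for t in times]


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_harmonic(times, values, period=24.0):
    """Least-squares fit of values ~ m + a*cos(wt) + b*sin(wt)."""
    omega = 2.0 * math.pi / period
    design = [[1.0, math.cos(omega * t), math.sin(omega * t)] for t in times]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, values)) for i in range(3)]
    m, a, b = solve_linear(xtx, xty)
    # a = A*cos(phase), b = A*sin(phase) for the model A*cos(wt - phase)
    return {"mean": m, "amplitude": math.hypot(a, b), "phase": math.atan2(b, a)}


if __name__ == "__main__":
    # Hypothetical design: one sample every 4 hours over 48 hours.
    times = list(range(0, 48, 4))
    values = simulate_profile(times, amplitude=2.0, phase=1.0, noise_sd=0.3)
    print(fit_harmonic(times, values))
```

In a parameter scan of the kind the abstract describes, a function like `simulate_profile` would be called once per gene and per parameter combination, with `amplitude=0.0` standing in for non-rhythmic genes. Fixing the random seed, as the default `random.Random(0)` does here, is one simple way to keep such simulations repeatable between runs.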