Transforming scattered analyses into a documented, reproducible and shareable workflow

FOSDEM VZW

Rochette, Sébastien

Formal Metadata

Title

Transforming scattered analyses into a documented, reproducible and shareable workflow

Title of Series

FOSDEM 2020

Number of Parts

490

Author

Rochette, Sébastien

License

CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/46922 (DOI)

Publisher

FOSDEM VZW

Release Date

2020

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

This presentation shares lessons learned from helping a researcher transform a series of scattered analyses into a documented, reproducible and shareable workflow. The time researchers can allocate to coding the analyses required to answer their scientific questions is usually low compared to other tasks. As a result, multiple small experiments are developed, and their outputs are gathered as best as possible to be presented in a scientific paper. However, science is not only about sharing results but also about sharing methods. How can we make our results reproducible when we have developed multiple, usually undocumented analyses? What do we do if the program only works with the directory structure of our own computer? It is always possible to take time to rewrite, rearrange and document analyses when we want or have to share them. Here, I will take the example of a "collaboration fest" where we dissected the R scripts of a researcher in ecology. We started a reproducible, documented and open-source R package along with its website, automatically built using continuous integration. However, can we think, earlier in the process, of a better way to use our small programming time slots, by adopting a method that will save time later? To this end, I will present a documentation-first method that takes little time while writing analyses but saves a lot when the time comes to share your work.

Session type: Lecture

Session length: 30 min

Expected prior knowledge / intended audience: No prior knowledge expected. The example will be about building documentation for R software, but any developer, using any programming language, may be interested in the method adopted.

Speaker bio: Sébastien Rochette has a PhD in marine ecology. After a few years as a researcher in ecology, he joined ThinkR, a company giving courses and consultancy around the R software.
Along with commercial activities, he is highly involved in the development of open-source R packages. He also shares his experience with the R-community through free tutorials, blog posts, online help and other conferences.