We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Reproducible and shareable notebooks across a data science team

Formale Metadaten

Titel
Reproducible and shareable notebooks across a data science team
Serientitel
Anzahl der Teile
56
Autor
Mitwirkende
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
At CybelAngel we scan the internet looking for sensitive data leaks belonging to our clients. As the volume of alerts could count billions of samples, we use machine learning to throw away as much noise as possible to reduce the analysts' workload. We are a growing team of data scientists and a machine learning engineer, planning to double in size. Each of us contributes to projects and we use Notebooks before code industrialisation. As for many other data science teams, a lot of effort and valuable work is encapsulated in a format that is tricky to share, hardly reproducible and simply not built for production purposes. During the talk, we will present what we did to overcome some of these issues and our feedback about notebook versioning and implementation in Google Cloud Platform using open JupyterHub and Jupytext. This talk is addressed to a technical audience but all roles gravitating around a data team are welcome to grasp the challenges of the interaction of data science within the organisation.