Analyzing Data with Python & Docker

Formal Metadata

Title
Analyzing Data with Python & Docker
Title of Series
Part Number
92
Number of Parts
169
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Andreas Dewes - Analyzing Data with Python & Docker

Docker is a powerful tool for packaging software and services in containers and running them on a virtual infrastructure. Python is a very powerful language for data analysis. What happens if we combine the two? We get a very versatile and robust system for analyzing data at small and large scale!

I will show how we can make use of Python and Docker to build repeatable, robust data analysis workflows that can be used in many different contexts. I will explain the core ideas behind Docker and show how they can be useful in data analysis. I will then discuss an open-source Python library (Rouster) which uses the Python Docker API to analyze data in containers, and show several interesting use cases (possibly even a live demo).

Outline:
1. Why data analysis can be frustrating: managing software, dependencies, data versions, workflows
2. How Docker can help us to make data analysis easier & more reproducible
3. Introducing Rouster: building data analysis workflows with Python and Docker
4. Examples of data analysis workflows: business intelligence, scientific data analysis, interactive exploration of data
5. Future directions & outlook