
Feeding data to AWS Redshift with Airflow

Formal Metadata

Title: Feeding data to AWS Redshift with Airflow
Title of Series: EuroPython 2017
Number of Parts: 160
Author:
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared, also in adapted form, only under the conditions of this license.
Identifiers:
Publisher:
Release Date: 2017-07-13
Language:

Content Metadata

Subject Area:
Genre: Talk
Abstract
Feeding data to AWS Redshift with Airflow [EuroPython 2017 - Talk - 2017-07-13 - Anfiteatro 1] [Rimini, Italy]

Airflow is a powerful system for scheduling workflows defined as collections of interdependent scripts. It is the perfect companion for extract/transform/load pipelines into data warehouses such as Redshift. This talk introduces the basics of Airflow along with some concepts specific to data pipelines, such as backfills and retries. It then shows examples of integrating Airflow with Redshift, along with lessons learned in doing so. The final part is dedicated to Redshift: how to structure data there, how to do some basic pre-load transformation, and how to manage the schema using SQLAlchemy and Alembic.
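
To make the Airflow side of the abstract concrete, here is a minimal sketch of a daily DAG that retries failed tasks and loads one day of data into Redshift with a COPY statement. This is not the speaker's code: the DAG id, connection id, S3 bucket, table name, and IAM role are placeholders, and the imports follow the Airflow 1.x layout that was current at the time of the talk.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.postgres_operator import PostgresOperator

    # Retries handle transient failures; a start_date in the past is what
    # makes backfills possible (Airflow creates one run per day from there).
    default_args = {
        "owner": "data-team",
        "retries": 3,
        "retry_delay": timedelta(minutes=5),
        "start_date": datetime(2017, 1, 1),
    }

    dag = DAG(
        "redshift_daily_load",  # placeholder DAG id
        default_args=default_args,
        schedule_interval="@daily",
    )

    # Redshift speaks the Postgres wire protocol, so a Postgres connection
    # works; "redshift_default" is a placeholder connection id. The templated
    # {{ ds }} keys each run to its logical date, which keeps retries and
    # backfills idempotent per day.
    load_events = PostgresOperator(
        task_id="copy_events",
        postgres_conn_id="redshift_default",
        sql="""
            COPY events
            FROM 's3://my-bucket/events/{{ ds }}/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
            FORMAT AS CSV;
        """,
        dag=dag,
    )

With a DAG shaped like this, replaying history is a single CLI call, e.g. airflow backfill redshift_daily_load -s 2017-01-01 -e 2017-01-31, and each backfilled run copies only its own day's S3 prefix.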
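
For the schema-management part, a sketch of what an Alembic migration for a Redshift table might look like; the revision ids, table, and columns below are illustrative assumptions, not material from the talk.

    # versions/20170713_create_events.py -- a minimal Alembic migration
    import sqlalchemy as sa
    from alembic import op

    # Revision identifiers used by Alembic to order migrations.
    revision = "a1b2c3d4e5f6"
    down_revision = None

    def upgrade():
        # Create the table the COPY above loads into; the column set
        # here is hypothetical.
        op.create_table(
            "events",
            sa.Column("event_id", sa.BigInteger, primary_key=True),
            sa.Column("event_type", sa.String(64), nullable=False),
            sa.Column("occurred_at", sa.DateTime, nullable=False),
        )

    def downgrade():
        # Revert the migration so the schema can roll backward too.
        op.drop_table("events")

Keeping the schema in SQLAlchemy models and evolving it through migrations like this one means the warehouse structure is versioned alongside the pipeline code rather than edited by hand in Redshift.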