Things I wish I knew before starting using Python for Data Processing

110

Cite

EuroPython

Cabrera, Miguel

Formal Metadata

Title

Things I wish I knew before starting using Python for Data Processing

Title of Series

EuroPython 2016

Part Number

Number of Parts

169

Author

Cabrera, Miguel

License

CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this

Identifiers

10.5446/21189 (DOI)

Publisher

EuroPython

Release Date

2016

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Miguel Cabrera - Things I wish I knew before starting using Python for Data Processing In recent years one of the ways people get introduced into Python is through its scientific stack. Although this is not bad, it may lead to learn solely one aspect of the language, while overlooking other idioms and functionality included in Python as well as some basic software development good practices. I will share some useful tricks, tools and techniques and software design and development principles that I find beneficial when working on a data processing / science project. ----- In recent years of the ways people get introduced into Python is through its scientific stack. Most people that learned Python this way are not trained software developers and many times it is the first contact with a programming language. Although this is not bad, it may lead to learn solely one aspect of the language while overlooking other idioms, standard and common libraries included in Python as well as some basic software development good practices. This may become a problem when a data science project is moved from an experimentation phase to an integration with technical environment. In this talk I share some useful tricks, tools and techniques and as well as some software design and development principles that I find beneficial when working on a data processing / science project. The talk is divided into two parts, one is Python centered, where I will talk about some powerful Python construct that are useful in data processing tasks. This include some parts collections module, generators and iterators among others. The other I will describe some general software development concepts including SOLID, DRY, and KISS that are important to understand the rationale behind software design decisions.