Miguel Cabrera - Things I wish I knew before starting using Python for Data Processing
In recent years, one of the ways people get introduced to Python is
through its scientific stack. Although this is not bad, it may lead
them to learn only one aspect of the language while overlooking other
idioms and functionality included in Python, as well as some basic
good practices of software development. I will share some useful
tricks, tools, and techniques, along with software design and
development principles that I find beneficial when working on a data
processing or data science project.
-----
In recent years, one of the ways people get introduced to Python is
through its scientific stack. Most people who learn Python this way
are not trained software developers, and often it is their first
contact with a programming language.
Although this is not bad, it may lead them to learn only one aspect of
the language while overlooking other idioms and the standard and
common libraries included in Python, as well as some basic good
practices of software development. This may become a problem when a
data science project moves from an experimentation phase to
integration with a technical environment.
In this talk I share some useful tricks, tools, and techniques, as
well as some software design and development principles that I find
beneficial when working on a data processing or data science project.
The talk is divided into two parts. The first is Python-centered: I
will talk about some powerful Python constructs that are useful in
data processing tasks, including parts of the collections module,
generators, and iterators, among others. In the second, I will
describe some general software development concepts, including SOLID,
DRY, and KISS, that are important for understanding the rationale
behind software design decisions.
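
As a rough illustration of the kind of constructs mentioned above, here
is a minimal sketch that combines a generator with two classes from the
collections module. The function name read_records, the sample data, and
the "city,amount" line format are assumptions made up for this example;
they are not taken from the talk itself.

```python
from collections import Counter, defaultdict


def read_records(lines):
    """Lazily yield (city, amount) pairs from 'city,amount' lines.

    Hypothetical helper: being a generator, it processes one line at a
    time instead of building the whole list in memory.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        city, amount = line.split(",")
        yield city, float(amount)


# Made-up sample input; in practice this could be an open file object.
raw_lines = ["Berlin,10.5", "Madrid,3.0", "", "Berlin,4.5"]

# Counter tallies how many records each city has.
counts = Counter(city for city, _ in read_records(raw_lines))

# defaultdict(float) starts every missing key at 0.0, so no key checks
# are needed while accumulating totals.
totals = defaultdict(float)
for city, amount in read_records(raw_lines):
    totals[city] += amount
```

Because read_records is a generator, the same code would work unchanged
on a large file, reading it line by line rather than loading it all at
once.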