We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Top 15 Python Tips for Data Cleaning/ Understanding

Formal Metadata

Title
Top 15 Python Tips for Data Cleaning/ Understanding
Subtitle
With two bonus tips!
Title of Series
Number of Parts
130
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Data cleaning is one of the most important tasks in data science but it is unglamorous, underappreciated and under-discussed. These are some common tasks involved in data cleaning but not limited to: - Merging/ appending - Checking completeness of data - Checking of valid values - De-duplication - Handling of missing values - Recoding Most, if not all, of the time, the datasets that we have to analyze are unclean. i.e. they are not necessarily complete/ accurate/ valid. This will impact the accuracy of our analysis if we do not clean them properly. This talk covers how to perform data cleaning and understanding using primarily Pandas and Numpy. If you’re new to data analytics/ data science and are interested how to use Python to perform analysis, or if you're an Excel user hoping to move to Python, this talk might be for you. Participants should be at least familiar with the basics of Python programming.