PySpark and Warcraft Data

Transcript
Hello everyone, thanks for having me. This conference is amazing. Today I will talk to you about how much data I got out of the World of Warcraft API, and how I tackled that surprising amount of data.
My name is Vincent Warmerdam. I work at a company called GoDataDriven, where we build clusters and data-driven applications for clients; some of the models we build even end up being used by radio telescopes. I also often give open training sessions, in case you are interested. I originally started using Scala and Spark for work, but over time this became a hobby: I am a huge gaming fanboy, and I should be clear up front that I am in no way affiliated with Blizzard. Everything you see here exists because I like gaming and I like analyzing data. Again: I am in no way affiliated with Blizzard whatsoever.
Tonight I will first give you a brief description of the dataset, and I will explain why this task is a big technical challenge. Then I will go on to explain why Spark is an excellent solution to this challenge, show some code, and show how to get set up with your own Spark cluster. After that I will show you some surprising conclusions from the data, and if there are no more questions and there is enough time, I will also give you a brief demo of a Spark cluster I have running right now.
And if you are not that interested and you would rather think about other topics, fine, but just remember one thing, because I think it alone makes this talk worthwhile: if you know Python and you want to do big data, Spark performs, it scales, and it plays well with the current Python data stack. Note that the Python API is currently a little more limited than the Scala one, but the project is gaining abstractions quickly, so you can expect way more in the future. For those of you who haven't heard about it yet: there is this game called World of Warcraft. It is amazing, millions of people play it, and there are all these orcs and humans, and they are always fighting.
The game basically always looks something like this. The most important part of the game, however, is not necessarily the daily fighting itself. For much of the player base, the reason people play is the loot: some monster will drop, say, a very shiny sword, and that is the whole reason why you play. At the beginning the loop is very simple: you keep getting stronger, which means that you can fight stronger monsters, which means that you can get stronger equipment, which you can use to fight stronger monsters, and so on; it becomes a recursion. All of which is fine, but the items are one of the main parts of the game, and they play the main role in what I will show you. What you can also do, which is an interesting part of the game, is take professions: you can collect flowers and make potions, or you can collect a bunch of iron ore and make goods you can use or sell. World of Warcraft has a huge auction house that you can use to trade virtual goods for virtual gold, and you can use that gold to get better, faster, stronger gear, which you then use to collect more items, and so on. This auction house was one of the things that was opened up through Blizzard's API (which nowadays works a bit differently due to the Warlords of Draenor expansion, but about a year ago, when I was looking into it, I could see every single auction that was open at that moment). So the data that I have is not the actual sales; it is a snapshot of all the listed prices at a moment in time. So what does this data look like?
Well, this dataset is extremely cool. I checked last week, and we still have about ten million people playing this game. There are about a hundred-plus identical instances of this game: every server that you can play on in Europe is an exact copy of another world, yet people on it behave slightly differently, maybe because they differ in economic background. This is interesting because all the economic laws that we have in our normal life should also work there, and in real life it is very hard to get a perfect measurement of an economy: in the real world there are different prices for a packet of milk in every region, and it is very hard for me to measure that, but in World of Warcraft I have perfect measurements. So this experiment is actually very interesting. These slides will go online, so you can also read the full description, but basically, for every auction I have an ID of a product (you can just type it into Google and get an actual picture of the product), the current bid price and the buyout price (it is an auction house just like eBay, so the current bid can differ from the price you would pay to buy it outright), the quantity of the product (you might not sell just one flower but a whole stack), the owner of the product, and the server the product is on. There are a lot of questions you can think of when you have data like this; you can start formulating nice hypotheses to test. Do basic economic laws like supply and demand hold? Is there such a thing as an equilibrium price? There are all these different servers, and economically you would argue that if the price for an item is ten gold on one server, it should be about the same on another server, since people are playing essentially the same game. And is there a relationship between how many pieces of a certain item are listed and the price? It is very hard to do any of this research in real life, considering the problem of collecting the data, but here the data is already there: World of Warcraft is a very nice experiment. Then comes the downside. The API gives snapshots every two hours, and that adds up to gigabytes of data very quickly. If I were to analyze one snapshot, it would probably fit in memory on one machine, so I could use something like pandas. But if I want to do weeks' worth of data, then we start hitting limits. So what to do? It is not trivial; we cannot just open this in Excel. A possible approach, if you are thinking in terms of file formats, is an easy one: drop JSON for something like CSV, and that will save you a bunch of megabytes. I also tried things like HDF5, which was better and technically works, but these approaches won't scale: they work for today, but in a week, when there is more and more data, they fail. The problem is that this approach scales vertically: you buy a bigger server every time the data expands. So we have a problem.
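To make that single-machine dead end concrete, here is a minimal sketch of that first attempt; the file name and column names are made up for illustration:

    import pandas as pd

    # One two-hour snapshot still fits comfortably in memory.
    snapshot = pd.read_json("auctions_snapshot.json")      # hypothetical file name

    # Swapping formats (CSV, HDF5) shaves off megabytes but only scales
    # vertically: the file, and the machine, just have to keep growing.
    snapshot.to_hdf("auctions_snapshot.h5", key="auctions")

    print(snapshot[["item", "bid", "buyout", "quantity"]].describe())  # hypothetical columns

This is fine for one snapshot; it is the weeks of snapshots that break it.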
People call this a big data problem, and the best definition of a big data problem I have heard came from a guy working on Hadoop: you are dealing with a big data problem whenever your data is simply too big to fit on a single disk. That seems like a sensible definition.
My personal way of thinking about how to tackle this problem is to think about what you would do with buildings. If I want a bigger building, the non-scaling version is to build a bigger, more expensive one; an alternative is to just use many small ones. The idea behind big data is similar: instead of one big server that holds the full dataset, you split the data into small bits, spread those bits among a cluster of peers, and analyze all the data in parallel. So let's take the many-small-ones approach. What does that look like? There is this thing called Hadoop, which I am sure you have heard of before, and the idea is that you have a distributed file system. How it works, in layman's terms (I won't go into mechanical detail here): on a computer cluster you have one name node, or master node, that basically keeps track of where all the files are. Every file is split into chunks, and each chunk is replicated across the entire cluster, so if a single node goes down you will always have the data left and can still do the analysis. Since the master node's only job is to keep track of where all the files are, you can connect as many slave nodes as the data and the situation demand, and these can work in parallel; the only thing they ask the master node is where the data is. And this scales: every time I have more data, I just add more slave nodes. Horizontal scaling, just like that. The idea is then that we write MapReduce code, so that we bring the code, the analysis, to the data, instead of bringing all the data to the analysis.
On top of that you can write MapReduce jobs, and this is where the new technology, Spark, comes in. Spark can run on top of a setup like this and can express all the queries MapReduce can, but the main idea is to try to always keep the computation in memory.
If you go to the website you will see these very nice performance benchmarks (this was, I think, a linear regression benchmark) where, out of the box, Spark can be up to 100 times faster than MapReduce if the data fits in the memory of the cluster, and about 10 times faster on disk, among other optimizations.
You can try this out very easily if you download Spark, install it locally, and run some Spark jobs. Because Spark is written in Scala, it is built upon the JVM, and if you run a query against it you will notice that there is a Java process taking up all the CPUs you have: even if you just run Spark locally on one machine, it will try to do as much parallelizing for you as possible. This comes out of the box; if you follow the API, you don't have to think about threads, and it will probably just parallelize for you. That is very nice, especially because I don't really have time to figure out how Spark works internally. It gives me a set of patterns, and as long as I am aware of them, I can do lovely, lovely analyses on a lot of data.
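As a minimal sketch of that out-of-the-box parallelism (assuming a plain local install of Spark 1.x; in the bundled pyspark shell the context sc already exists for you):

    from pyspark import SparkContext

    # local[*] tells Spark to use every core on this one machine.
    sc = SparkContext("local[*]", "try-spark-locally")

    # A trivial parallel job: the JVM workers square a million numbers
    # spread across all your cores, with no thread handling on our side.
    squares = sc.parallelize(range(1000000)).map(lambda x: x * x)
    print(squares.take(5))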
The API, as it turns out, is not too difficult. You can declare a variable called text_file; you then tell Spark that there is a text file located somewhere. It can be on HDFS, it can be on the local disk, it can also be on S3; all of these approaches work. From then on, all the commands you run are basically done in a functional style. Suppose this were a text file and I wanted to do a word count; here is what the code would look like. From the text file I do a flatMap to split out every word, then I map every word into a tuple of the word and the value one, and then I reduceByKey, basically summing everything up. The Spark docs have tons and tons of examples like this word count. I believe Spark held the world record for the fastest sort at that moment, set maybe a year or two ago. So it is a functional style, which is not how I always write code, but it works.
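Spelled out in PySpark, that word count looks roughly like this (the path is a placeholder; hdfs://, file:// and s3n:// style paths all worked at the time):

    text_file = sc.textFile("hdfs:///some/text/file")   # or local disk, or S3

    counts = (text_file
              .flatMap(lambda line: line.split())       # every word becomes a record
              .map(lambda word: (word, 1))              # word -> (word, 1) tuple
              .reduceByKey(lambda a, b: a + b))         # sum the ones per word

    print(counts.take(10))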
Spark has nice features like being fully lazily evaluated, and it is super fast because it uses distributed memory: only when it needs to does it spill to disk, and it doesn't load the entire dataset if it doesn't have to. So we have distributed memory, it scales by literally just adding nodes, it has very good Python bindings, and as of recently it also has support for SQL statements and a thing called the data frame. What they have done is build a distributed data frame: basically like pandas, but on a cluster. It runs on top of the Hadoop technology stack, you have a connection with S3, so if you are running a huge web store you can use this and have your models run there too. It has even got machine learning libraries that work in parallel: even if you have an algorithm like linear regression, which usually only works on one machine because you are doing the (X'X) inverse calculation, in Spark, if you use gradient descent methods, it will just work in parallel.
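For that machine learning claim, a hedged sketch with the MLlib API of that era (Spark 1.x); the three data points are invented just to show the shape of the call:

    from pyspark.mllib.regression import LabeledPoint, LinearRegressionWithSGD

    # LabeledPoint(target, [features]); say, price against quantity listed.
    points = sc.parallelize([
        LabeledPoint(10.0, [1.0]),
        LabeledPoint(9.0, [2.0]),
        LabeledPoint(7.5, [4.0]),
    ])

    # Gradient descent distributes across the cluster where the closed-form
    # (X'X)^-1 X'y solution would not.
    model = LinearRegressionWithSGD.train(points, iterations=100, step=0.01)
    print(model.weights, model.intercept)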
These machine learning libraries just work, and Spark works with the Hadoop setup your company has already invested in, so it is very easy to get started. One of the coolest things about Spark is its resilience, and that it is lazily evaluated: you run operations on immutable data, and before it runs a single operation, Spark first builds a directed acyclic graph of all the operations you mentioned on the data and then internally tries to optimize the query, which means you don't have to do a lot of that yourself. There is also multi-language support: there are even bindings for R nowadays (I have been able to play with RStudio in the web browser connected to Spark), and if you like Scala, Scala is supported and there is lots of stuff there. And of course users can enjoy Hadoop and MapReduce underneath.
So how do we get a cluster? The cool thing about Spark is that there is a commercial company, Databricks, which is heavily invested in building most of the tools around it. If you download Spark and just go to the ec2 directory under the root directory of the distribution, there is a command there with which you can set up a cluster on Amazon. Here, for example, I have a permissions file against which I am saying: these are my credentials, I want to be in this region, whenever Spark clusters I want this many machines, I want them to be of this type, and go launch it. That one command will start the whole cluster for you (it takes about fifteen minutes, give or take), and you can tear the cluster down just as easily. Remember there is also S3, so any data that you have can sit in S3, and if you log into a machine, everything described above is set up for you. In terms of getting started this is way easier than anything you would usually do with, say, Ansible scripts and other ways of provisioning: spark-ec2 does this for you, out of the box. Then, if you want an IPython notebook connected to Spark, that is also very easy: you basically point to the pyspark library, which comes on the master node, provisioned for you, then you say "my Spark master is here" and create a Spark context, and this makes sure that all the commands run on the cluster. Then you can start with simple commands. All of this will also be on my slides (I am going through it very quickly), but starting Spark clusters is something basically anyone can do: it is just one command. Next, say I have a huge file, for example 40 gigabytes of World of Warcraft auction data, sitting on S3. I tell the Spark context: read the file from S3 here, and make 30 partitions of it, so it takes the entire file and cuts it up into 30 pieces across the cluster. And again, because Spark is lazily evaluated, what will happen is that the first time I ask for something like a count, only then will it pull all this data in from S3, which is why, if I run it a second time, it is actually quick: what Spark will do, if you tell it to, is keep data in memory that you will use later on. This gives you a huge performance boost over something like Hadoop, which really does require you to write to disk first before you can look at the data again. Also note that if you just run a transformation, it doesn't really do anything yet: it just creates the operation graph. The things that actually run against the cluster are called actions, and only when you run one of those does the work actually happen and the memory actually fill. That is the old-school Spark API, the RDDs; the newer data frames are getting more and more attention. You can take any text file you want, describe a lambda to parse it, for example with json.loads, then declare that there is a Row structure in it, and from that moment I have a distributed data frame, distributed across the whole cluster, which means that many pandas-like operations become usable on big datasets.
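Pieced together, the setup he walks through looks roughly like this; the cluster name, key names, bucket, master address and column layout are placeholders, and the spark-ec2 flags are the ones documented for Spark 1.x. The resulting auctions data frame is reused in the sketches below:

    # On your laptop, from the ec2/ directory of the Spark download:
    #   ./spark-ec2 -k mykeypair -i mykey.pem -s 8 --region=eu-west-1 launch wow-cluster
    # ...about fifteen minutes later, on the provisioned master node:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext, Row
    import json

    sc = SparkContext("spark://master-ip:7077")   # point at the master spark-ec2 reported
    sqlContext = SQLContext(sc)

    # Point lazily at ~40 GB of auction snapshots on S3, in 30 partitions.
    raw = sc.textFile("s3n://my-bucket/wow-auctions/*.json", 30).cache()
    raw.count()                      # first action: data is pulled in and cached
    raw.count()                      # second run: served from cluster memory

    # The newer API: parse each line into a Row and get a distributed data frame.
    auctions = sqlContext.createDataFrame(
        raw.map(lambda line: Row(**json.loads(line))))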
Let me now go through a couple of very simple queries, just to give you a feel for it; if you know pandas, this should feel very familiar. For context: these queries were run on AWS, on a cluster with a master and eight slave nodes, each of which has about 7.5 GB of memory, and the file all these queries analyze is about 20 gigabytes.
I will give you a moment to figure out what this does, but if you have used pandas, it should feel very familiar. This is a distributed data frame, there is a column in it, and I group by that column; the grouping then comes with a couple of built-in aggregation functions, basic ones like sum and mean, and what you do is specify a dictionary to say: for this group I want the sum of the buyout value. I can also tell Spark to collect the result and convert it to a pandas data frame: if you are doing a huge computation and the result is small enough to fit on one machine, this is a fairly simple way to do the big data on the cluster and the small data locally again, which is a very common use case. It is also nice because it lets us do things Spark doesn't necessarily do out of the box yet, in easy ways. Here is a slightly more complex query, but it should still feel rather familiar. Suppose I have an item (this is the item id); I can filter on it, and I can also say that the buyout has to be above zero; then I can take all the rows and count them, and I can calculate the mean buyout. So here I am counting this one item across all realms, trying to see how many of these items are on a server and what the mean buyout is, and then I show the first results. There is even support for more complicated things: you can take the functions that Spark already has, import them, and put those into the aggregate function. This way you can also give the resulting columns your own custom names, which is useful if you want to do more pipelining after the aggregation. This is what that looks like: take this query, notice I am grouping, then aggregating, then filtering, and at the very end I say take only the first five. And this is what the actual plan looks like for that query: these are all the operations it is going to do. You can see it says things like partitions, aggregate, exchange, filter, limit, map; Spark will figure out the best way to do this computation, in memory where possible, and this plan is also shown to you through the Spark UI, which automatically comes along when you install Spark. There is even some support for user-defined functions. Python is not a statically typed language the way Scala is, so the trick right now is that you define a user-defined function by handing Spark a lambda and then telling it what type should come out; the function can then be used to create a new column. This is, for now, the Python way to use user-defined functions in Spark, and it feels a bit verbose, but if you want a performance boost out of it you do need the types, so it makes sense that it forces you to provide them.
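Reconstructed as hedged PySpark 1.x sketches; column names like buyout, item and quantity follow his description of the dataset, and the item id is a placeholder:

    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    # groupBy with an aggregation dictionary, then pull the small result local:
    per_item = auctions.groupBy("item").agg({"buyout": "sum"})
    pdf = per_item.toPandas()                   # big computation, small pandas result

    # Filter one item id, count the rows, and take the mean buyout:
    one_item = auctions.filter(auctions.item == 12345)          # placeholder id
    one_item.count()
    one_item.agg(F.mean("buyout").alias("mean_buyout")).show()

    # Ask Spark for the plan it optimized for you (also visible in the web UI):
    per_item.explain()

    # User-defined functions: hand Spark a lambda plus the output type.
    per_unit = F.udf(lambda buyout, qty: buyout / float(qty), DoubleType())
    auctions.withColumn("unit_price", per_unit(auctions.buyout, auctions.quantity))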
Now, all these clusters cost money; a common reaction to this part of my talks is that we are not just buying one computer, we are renting many, and you might get the impression that this is expensive. That is not necessarily the case: big data can be expensive, but not really here. If you pick the right topology, transferring data from S3 to a machine within the same region is free; you don't pay any transaction costs for that. Storing, say, 40 gigabytes on S3 costs a small amount per gigabyte per month, usually around a euro or so in total, so the storage of the data is nothing. The actual cost is the CPU time of the machines I am renting, so let's be generous with the estimate. Say I am only going to be using this cluster for six hours in a work day, I have eight or so machines, and the machines I have been using cost about seven cents per hour each: after one whole day of crunching data, it will cost me a total of about fifteen dollars, max. Being able to throw machines away when you don't need them and spin them back up later makes all the sense, and the start-up time is about fifteen minutes. So if you have a lot of data, tens of gigabytes, and you only need to analyze it, say, once a week for a recommendation batch script or something like that, there is no need to keep a permanent Hadoop system anymore; with this technology you are way better off tearing the cluster down. You do have to be willing to put all the data on S3, so this is not a likely solution for a bank, but if you just want to get a dataset analyzed it is very workable; I mean, I am willing to spend fifteen euros on my hobbies. OK, let's talk about a few results, because I think that is the reason most of you are here anyway. All these queries have been done with some form of Spark, and I should apologize a little: these figures are from about a year ago, so take them as a snapshot in time. These were the most popular items back then, around the Warlords of Draenor expansion: stuff like Netherweave Cloth, of which supposedly something like a million listings were up across the entire World of Warcraft, an item called Golden Lotus, and Spirit Dust. These are all items that you can collect with professions, by being a herbalist and so on, and that leads to a question: if you can take professions, which one pays best? The main use case for collecting these items is to sell them on the auction house and get gold back. Doing this analysis properly is actually a bit tricky, because at different character levels you collect different items, and analyzing every single slice of experience levels is a lot of work; so I just looked at the items you can collect at levels 10 to 20 and computed the mean gold you can get. For skinning it is 2.6 million, for herbalism 2.3, and for mining only 1.5.
Now again, these are early-level items, so take this with a grain of salt, but these are some very quick things you can just do with Spark. Another thing that turns out to be fun: you can look at the buyout value of all products, and because I know the owner of every listed product, I can ask who holds the value. If you sum it per user, rank the users, and look at the top slice, it turns out that 1% of Warcraft owns about 25% of the auction house value, which is an interesting result. The query that does this groups by owner, sums the buyout value per user, sorts those sums, and then basically takes the top of the list. In a way this is just a few lines of code in Spark, even though you are handling gigabytes of data.
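That concentration number is, as he says, only a few lines; one hedged way to get it, with column names carried over from the earlier sketches:

    from pyspark.sql import functions as F

    # Sum the listed buyout value per owner on the cluster, then rank locally.
    per_owner = (auctions.groupBy("owner")
                 .agg(F.sum("buyout").alias("gold"))  # alias avoids version-specific names
                 .toPandas()
                 .sort_values("gold", ascending=False))

    top = per_owner.head(int(len(per_owner) * 0.01))   # the richest 1% of sellers
    print(top["gold"].sum() / per_owner["gold"].sum()) # ~0.25 in his data, reportedly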
Another thing I thought seemed interesting: players can sell items in stacks, so you can list one item, or five, or twenty items in the same box on the auction house. So I was wondering, taking Spirit Dust as an example: which stack sizes appear more often? It can be anything from 1 to 20, and it turns out 20 is kind of popular, 5 and 10 are kind of popular, and anything else doesn't really happen as much. What might be interesting to know, whether you are a consumer or a seller: does the average price per item depend on whether I sell in stacks or not? If the market priced things perfectly, it shouldn't, and that is actually what we see in this one picture: if I look at stack sizes 1 through 20 and how the per-unit price is distributed, it doesn't really matter. It also doesn't matter much whether you are on the Alliance auction house or on the Horde one: the price distributions per item just shift a little from realm to realm, and if I look at the medians they also move around a little, but nothing too significant. Being part of the Alliance or part of the Horde does not really matter for prices. That is not something that will shock the economists here, but if you are a World of Warcraft player it is useful knowledge. Also keep in mind that I didn't check whether these things actually got sold; the only thing I can see is that they are listed at this price. Then another interesting thing, and this was my main interest in this data: for every server and side I can calculate the mean buyout price, and I can calculate the number of items that are around. Logically, if there is a lot of Pixie Dust around, you would expect prices to drop, whereas if the item is rare it becomes more expensive. Turns out it is very hard to find an item that actually behaves this way. If you look at this graph, every red dot denotes a Horde auction house and every black dot an Alliance auction house, one of each per server; this axis is the market size and that one is the mean buyout. It seems like there are a couple of situations where there is little volume in the market but very high prices, however those are only a few points, and they might as well just be outliers. So now we get to the more complicated stuff: does basic economics hold in all of this? One useful thing you can do is calculate regression coefficients per group. If you have a linear regression, let's say that my y variable, the thing I want to predict, is the price, and my x variable is the size of the market. If the coefficient, the slope itself, is positive, then a bigger market comes with a bigger price; if it is negative, that means that when the market goes up, the price drops. So if I calculate this number for every product, for every server, for both factions, Horde and Alliance, then I should be able to filter out the values that are negative, because a negative slope means that the quantity on offer pushes the price down. In the end, not one single item had this characteristic. I might have made a mistake, that is quite possible, but it also makes you wonder what that actually means, and that is the odd part of this part of the talk.
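A hedged sketch of that per-item slope hunt, simplified to treat every auction row as one observation; the faction column and the use of numpy's polyfit are my assumptions for illustration, not necessarily his exact method:

    import numpy as np

    def slope(pairs):
        # pairs: [(quantity_on_offer, buyout_price), ...] for one group.
        qty, price = zip(*pairs)
        if len(set(qty)) < 2:
            return None                      # cannot fit a line through one x value
        return np.polyfit(qty, price, 1)[0]  # leading coefficient = the slope

    slopes = (auctions.rdd                   # drop to the RDD API for the custom fit
              .map(lambda r: ((r.item, r.server, r.faction),
                              (r.quantity, r.buyout)))
              .groupByKey()
              .mapValues(lambda pairs: slope(list(pairs))))

    # Supply and demand predicts negative slopes: more on offer, lower price.
    slopes.filter(lambda kv: kv[1] is not None and kv[1] < 0).count()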
That is kind of where the analysis stands. So yeah, there is still lots of work I should be doing on this, but all in all, being able to poke at this amount of data on my own is already something. So, quick conclusions. Spark is easy to pick up: if you know pandas, it already feels very familiar, it is easy to get started, and there is more coming. There are things I haven't talked about: there are streaming tools for real-time work (not available in Python yet, but that might come), and there is GraphX, so you can have graph algorithms working for you. And the machine learning algorithms will just work, even on bigger and bigger datasets, so there is less need to downsample your data before it goes into a machine learning algorithm. Some final things to keep in mind. Don't forget to turn the machines off: the whole economic benefit assumes that you turn your machines off. Don't be like me and leave the machines on for over a week because you have gone on holiday to Turkey; my boss was not amused. Also, this is not really meant for multi-user setups: if you have a cluster with many analysts on it, what Spark will do is go to the resource manager and say "give me all these resources, those are mine", and that is that. So if you have a notebook open, it is hogging the Spark cluster's resources, and if you leave it open over the weekend, you are going to come back on Monday to colleagues who had no resources to use. Keep that in mind: a Spark cluster like this is for a single user. And then the main point of all of this: if your dataset is simply too big for the tools we have right now, that is where Spark starts to shine. Pandas is still way preferable for smaller data at the moment in terms of flexibility, and Spark is getting there but is still younger. So please test it for yourself before you commit: if you have a dataset that is, say, forty gigabytes or bigger, a very small benchmark is a very good way to decide. Now there are two things I can do: let you ask questions, or give you a quick demo of Spark, live, in real time. Let's do the demo.
So what I have here is a distributed data frame that has already been preloaded; this is the World of Warcraft dataset, live, right now. What I also have is a URL: this is my Spark UI. This was a job I just ran, and the UI visualizes it. If I now group and count things on the distributed data frame, I should be able to see that some of these executors are working and how much memory is being used, and this is the distributed job being handled in real time on the cluster. And if I want something a little more interesting, I think I have an example right here: grouping and summing over the buyout values. That obviously takes a little longer, but again, this is distributed work happening live; you can cache this result, and there are plotting tools as well. I am doing this through the REPL, but you can also set up a notebook, and I am going to try to see if I can get some plots in there as well.
I think that is mainly it. Most of the images in my presentation come from The Noun Project; credit where credit is due. And that is all I have. Thank you.
Q: We have a few minutes for questions. Thanks for the example, but I'd like to know: would you also recommend this for small datasets? I would love to use it on my clients' data.
A: So you are asking whether I am advocating this for everything? No. If your dataset is small, pandas is definitely fine; you don't have to operate a cluster for that stuff, and for small data pandas is the easiest way to do it.
Q: And did you try to combine it with pandas, in fact?
A: In fact, yes. I recently committed a patch to Spark so that you can get parts of your distributed data frame back as a pandas data frame. The downside is that at that point you only have a local data frame again, so you lose the distributed machine learning support.
Q: OK. And is there already support for scikit-learn, for training models on Spark?
A: Not yet. I am not sure if a team is working on it, but I think it is more something you can expect around version 1.5; I would want to double-check that, so don't quote me on it, but my guess is it will take a bit more time.
Q: One more question: since Spark has this pandas-like syntax, does it also support joins?
A: Oh yeah, definitely, just like in pandas; I think it is called join rather than merge, and you can also pick the type of join you want (a small sketch follows below). Mind you, if you are doing a very nasty left outer join, that can blow up the memory at some point. It scales in the sense that if the problem gets bigger you add more machines and it scales roughly linearly, and Spark does all sorts of things to make sure the problem gets no bigger than it has to be, but you can think of joins that will simply always be painful; without the right keys it becomes a brute-force approach. It will probably still work; it just gets more expensive. And on the economics of that: renting when you need it is still usually cheaper than buying and maintaining the hardware yourself in the lab.
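For completeness, the join he mentions might look like this in the DataFrame API; items_df is a hypothetical small lookup table of item ids to names:

    joined = auctions.join(items_df,
                           auctions.item == items_df.item_id,
                           "left_outer")                # join type is your choice
    joined.select("item", "name", "buyout").show(5)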
Q: OK, thank you.
A: Thanks everyone. I'll be around afterwards to answer questions in person.

Metadata

Formal Metadata

Title: PySpark and Warcraft Data
Series Title: EuroPython 2015
Part: 123
Number of Parts: 173
Author: Warmerdam, Vincent
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use and modify the work or content for any legal, non-commercial purpose, and reproduce, distribute and make it publicly available in unchanged or changed form, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in changed form, only under the terms of this license.
DOI: 10.5446/20222
Publisher: EuroPython
Release Year: 2015
Language: English
Production Place: Bilbao, Euskadi, Spain

Content Metadata

Subject Area: Computer Science
Abstract: Vincent Warmerdam - PySpark and Warcraft Data. In this talk I will describe how to use Apache Spark (PySpark) with some data from the World of Warcraft API from an iPython notebook. Spark is interesting because it speeds up iterative processes on your hadoop cluster as well as your local machine. I will give basic benchmarks (comparing it to numpy/pandas/scikit), explain the architecture/performance behind the technology and will give a live demo on how I used Spark to analyse an interesting dataset. I'll explain why you might want to use Spark and I'll also go in and explain when you don't want to use it. The dataset I will be using is a 22Gb json blob containing auction house data from all world of warcraft servers over a period of time. The goal of the analysis will be to determine when and if basic economics still applies in a massively online game. I will assume that everyone knows what the ipython notebook is and I will assume a basic knowledge of numpy/pandas but nothing fancy. The dataset has been chosen such that people who are less interested in Spark can still enjoy the analysis part of the talk. If you know very little about data science but if you love video games then you should like this talk.
Keywords: EuroPython Conference, EP 2015, EuroPython 2015
