
Finding the Where in Big Fuzzy Data


Speech Transcript (automated)
All right, how's everyone doing? Is this anyone's first FOSS4G? Awesome, welcome. It's the last talk of the day, so thanks for sticking around. I'll get started; we don't have long, so I'll use the time wisely. I'm Andrew Turner, CTO of Esri R&D DC. I enjoy having a fully acronymic title. We do a lot of big data work there, and open source, along with other projects that relate to this whole process of analyzing data, and hopefully by the end of this talk it will all make a little more sense.
What's interesting is what's going on now in the big data world in general. There's a book by Anthony Townsend about smart cities, and it tells an intuitive story about the growth of cities. Living in an urban environment, things happen densely and fast. During the end of the 19th century there was vast growth in how fast people were moving into cities, and the urban share of the population has since crossed over 50%. Prior to the 1840s to 1860s there were fewer than 2 million people living in cities, only about 10% of the world, but by 1920 there were 50 million people in cities. A huge boom of people moving into cities and urban environments: dense, packed areas, with people immigrating and moving around a lot, hard to track.
In the US we have, as a constitutional mandate of our government, a decadal census: every 10 years, count all the people, as best we can measure. By 1880, with everyone moving into cities here in the US, it took seven years to calculate all the results, six or seven years, and then three years later you start the next one and it happens all over again. They estimated the 1890 census covered too many people and would take too long; they wouldn't have published the results of the 1890 census until after the next census had already taken place. So essentially it was an unbounded problem they couldn't solve: people were building and filling cities faster than we could count them, and moving too fast.
So a young, enterprising census clerk named Herman Hollerith saw an opportunity. He borrowed an idea from looms, which used these cool things called punch cards to program a pattern, and did the same thing for tabulating the census. Each punch card recorded how many people were in your household, what race you were, where you came from. You put the card in the machine and pulled the handle, and underneath were these little cups of mercury with pins that dipped down; wherever a pin passed through a hole in the card it closed the circuit and moved a dial forward. You loaded the cards and pulled the lever and kept doing that, and when a counter reached 99,999 and rolled back to 0, you wrote down the number, posted a 0, and carried on. It was so effective that the 1890 census, which they thought would take 10 to 15 years to calculate, was publishing results for certain cities within two months, and for the entire US within two years. That is kind of the beginning of big data: more data than you can handle, so you have to start automating the processing. Hollerith went on to found the Tabulating Machine Company, which became IBM. That is how IBM started, with these little mercury cups and punch cards.
So, I'm from Esri R&D DC. Our office is based there; we're local to the government, building tools that are actually used to help solve these important problems. Technically we're based in Virginia, but it's close enough. We're really trying to help make tools more accessible and understandable to solve these problems, and a lot of it is becoming open source. I'll explain why, for big data in particular.
Big data is what's happening now with this explosion of data and information: how do we actually dig into it in meaningful ways to discover answers to problems before it's too late?
The common concept with big data is the three Vs. It's a kind of simple, overly simplistic view: you have huge volumes, or it's moving too fast, or you have a lot of different heterogeneous data types. That's all true, and it varies widely, from a field in a spreadsheet to things like the Internet of Things.
We have bigger problems coming. What happens when every single vehicle, and your car already has thousands of sensors in it, starts publishing all that data? When every light post, street corner, intersection, and building window starts transmitting data? It's huge.
What really happened is that we now have a ubiquitous global network over which we can push anything anywhere in the world in milliseconds, and hard drives and computers became amazingly cheap and commodity. So the data arrives as lots and lots of small packets of information, but it's actually still hard to move these huge volumes of data once they accumulate. We became data hoarders: capture everything, and figure out later which of it is useful.
What big data really means now is that you stop moving the data. It's something you capture and keep at rest, and then you start throwing algorithms at it, and the algorithms are very small by comparison.
Think about the typical GIS workflow before, for anyone here who has been doing geo calculations for a long time: you downloaded the data, ran it locally or loaded it into your own database on a server, and then processed it. That was fine if you wrote a lot of code against data that was relatively small. But now the data would take too long just to pipe down to your server. Instead you say: the data is big, my functions are small, so move my functions to where the data resides. That is one of the key principles: stop moving your data around, and push your algorithms to the data. I'll talk about tools that do that.
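That "ship the function, not the data" principle is essentially map and reduce. Here is a tiny local simulation of the idea (plain Python, with hypothetical county names as the data):

```python
from collections import Counter

def map_phase(partition):
    """Map: runs where each data partition resides, emitting compact
    (key, 1) pairs instead of shipping raw records over the network."""
    return [(county, 1) for county in partition]

def reduce_phase(mapped):
    """Reduce: combine the small intermediate results into final counts."""
    counts = Counter()
    for key, n in mapped:
        counts[key] += n
    return dict(counts)

# Two "servers", each holding a shard of the data locally.
shard_a = ["Multnomah", "Clackamas", "Multnomah"]
shard_b = ["Washington", "Multnomah"]

# Only the compact mapped output moves, not the raw data.
partials = map_phase(shard_a) + map_phase(shard_b)
print(reduce_phase(partials))  # {'Multnomah': 3, 'Clackamas': 1, 'Washington': 1}
```

In a real cluster the map phase runs on the machines holding each shard; only the tiny intermediate pairs travel to the reducers.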
The other piece is open source, something I think everyone here likes. We've seen its power over our careers; it's our passion and belief. What's interesting here, specific to big data, is that this is a new kind of domain: how do we actually analyze these things using non-traditional methods?
It's like Legos. What open source means is that I can discover and try out new ideas that were never imagined before. In the Unix philosophy you have lots of little modules that you glue together to do things, combining them in ways the original authors could never have imagined. Now, how do I do that across vast numbers of machines: pipe and play with it, try an idea, see a result in a few seconds, and say, I like that, now run it forever? I'll show examples of how that works. And as a tool builder, you can't build the one tool anymore that solves every problem. Every problem now is unique: the data you need to pursue is unique, the heterogeneity is different, the volumes are different, the velocities are different. So we need to enable developers, and the end users themselves, as much as possible, to put their own intelligence against that data.
So, there are about three different types of big data processing we think about, a kind of framework for the different methodologies and the tools we provide. The traditional one is batch processing: taking what you would have run on a desktop and running it across lots of machines, wrapping up the processing and analysis, and then visualizing the result, or getting a number and making a decision or taking an action on it. MapReduce is one type of batch processing. It's pretty good, but it can be very slow: you kick it off, you wait, and you don't know whether your answer is right or wrong until the process finishes, every time. Stream processing is becoming much more interesting: you have tens or hundreds of thousands of features per second, and you want to know when something interesting happens as the stream goes by, and then learn based on that. I don't want to watch the stream; let me know when it crosses a certain threshold. The third kind of analysis is search and discovery: I have an idea of the shape of the needle I'm looking for in my haystack, the general kind of problem, so let me know whenever any of these things cross those thresholds. Essentially you ask a number of questions up front and get notified whenever the answers show up.
free that kind of emerging as is become a terminology capture listings together this land architecture on what is in these 2 sides you of the uh the left side is such stream processing engine ominous or go through a couple tool that do this where you're watching the data as streams by armed Europe you're running various aggregate statistics I you're looking for moving window is on and its caption for general learning of of what's going on in once in his crosses us something happened there was more crime just happen a neighborhood than I would've expected that that kicks off about crossing jobs and how this happened what's happened the last day that's now I need a process to understand what this is so it's applying these 2 things that are in synchronous to Western crossings just in your general learning here something happened and about pressing helps to understand why it happened so and then
Here is a visualization, kind of a mesmerizing view, but you can start doing things like just watching all the data stream through: are my algorithms doing what they need to be doing, alerting me when, in this case, a storm is about to form or hurricanes are about to get violent? You visualize it not for someone to sit and watch, but so it tells me when a certain threshold is crossed. This one is built on a canvas-based globe visualization we've built. Anyway, my point is: how do we make all these tools available so people can put their unique knowledge against them? So we've open sourced a set of tools that we call GIS Tools for Hadoop. It's really a set of tools that lets you go do these different kinds of batch, streaming, and search-and-discovery learning mechanisms.
Applying these tools means working against the common frameworks, and in some cases building pieces ourselves. Here is the ecosystem of tools we're working with and helping build out; if you haven't seen them, I recommend diving in and checking them out. Hadoop came out of Yahoo originally as an open source MapReduce framework. It's pretty well established; its success is why the last five years of big data have been almost synonymous with doing Hadoop, but it's not the only answer. There are really powerful open source tools on top of Hadoop: Wukong is a really nice Ruby wrapper that lets you prototype a job on small samples, Pig is another kind of query interface, and Hive lets you write SQL-like queries. There are also several really nice big databases that we're helping spatially enable, and Kafka and Storm are stream processing engines; I'll show an actual application example using those for stream processing, since we're spatially enabling these engines too. Elasticsearch is essentially the new big thing for search: Lucene and Solr have now grown up into this Elasticsearch engine, which has some really advanced spatial capabilities in it, everything from basic queries of data and features across billions of records very easily, to even alerting when a certain threshold is crossed, asking whether something is significant in an area, with analytics built into the engine. And Apache Spark is coming up to be kind of the new in-memory stream and micro-batch engine, batch processing in tens or hundreds of milliseconds, so it looks like it may replace MapReduce for many things. Again, I'm kind of blowing by these ideas; there are links at the end you can follow.
The core of this, something that is kind of amazing for Esri and why we're here, is that we took one core Java library, the Esri Geometry API (we're not great at naming, but it is the Java engine for doing spatial processing), and open sourced it. It's akin to JTS, which is probably still the best-known open source geometry engine out there, but JTS is under a copyleft license, the sort-of viral kind (that's changing, I believe, which is also encouraging), while ours is under the Apache License. So take it, use it, everyone: with that permissive license you can even use it in commercial products. Go forth and apply open source against the problems you work on. Our tools are going to build on it, and we also want everybody to be able to apply their unique ideas and concepts against it. We want to enable that, to really get to the best answers possible.
If you've used any other geometry engine, this will feel very familiar. It's really full-featured: it handles all the numerous geometry types, topological operations, and relational combinations between them, all built natively into the library.
It imports and exports different formats, GeoJSON, Esri JSON, WKT, so it handles all kinds of serialization. Here is what that looks like and how easy it is, especially when it's wrapped in a REPL rather than writing plain Java, for presentation purposes. The idea is that with the Java library you can take in JSON, convert it to a native geometry object, then dump it back out in another format, or apply spatial processing to it; I'll show an example of that in a few slides. It gives you the low-level operations that you then keep wrapping up to do higher-level operations. The library also has validation, to make sure you have good geometries and closed loops.
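As a rough illustration of that round trip, text in, native object, operate, text out, here is a hand-rolled GeoJSON example using only Python's standard library (a stand-in sketch, not the Geometry API itself; the coordinates and the "shift" operation are arbitrary):

```python
import json

# Text -> native object.
geojson = '{"type": "Point", "coordinates": [-77.0, 38.9]}'
geom = json.loads(geojson)

# A toy "spatial operation": shift the point one degree east.
x, y = geom["coordinates"]
shifted = {"type": "Point", "coordinates": [x + 1.0, y]}

# Native object -> text again, ready for another system.
out = json.dumps(shifted)
print(out)  # {"type": "Point", "coordinates": [-76.0, 38.9]}
```

The real library does the same dance with efficient native geometry objects and supports several text and binary formats on both ends.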
There are all the other operations too: boundaries, buffers, clips, all my basic operations, and then even quadtrees, which become really important for doing large-scale distributed spatial processing. What that looks like: I can build a quadtree, push objects into it, and then I have this index that I can use in memory, or serialize out, and run queries against. These are the little pieces, the base of the Legos, that you need to start building things around. And we have wrappers around it, MapReduce wrappers and Python wrappers, to make it easy to use from different languages, and it's very fast.
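A minimal quadtree along those lines might look like this (a from-scratch teaching sketch, not the Geometry API's implementation; capacity and coordinates are arbitrary):

```python
class QuadTree:
    """Minimal point quadtree: each node covers a bounding box and
    splits into four children once it holds more than `capacity` points."""
    def __init__(self, x0, y0, x1, y1, capacity=4):
        self.box = (x0, y0, x1, y1)
        self.capacity = capacity
        self.points = []
        self.children = None

    @staticmethod
    def _contains(box, x, y):
        x0, y0, x1, y1 = box
        return x0 <= x <= x1 and y0 <= y <= y1

    def insert(self, x, y):
        if not self._contains(self.box, x, y):
            return False
        if self.children is None:
            self.points.append((x, y))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(c.insert(x, y) for c in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.box
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my, self.capacity),
                         QuadTree(mx, y0, x1, my, self.capacity),
                         QuadTree(x0, my, mx, y1, self.capacity),
                         QuadTree(mx, my, x1, y1, self.capacity)]
        for p in self.points:  # redistribute points into the children
            any(c.insert(*p) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        """Return points in the query box, pruning subtrees that miss it."""
        x0, y0, x1, y1 = self.box
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []
        hits = [p for p in self.points
                if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        if self.children:
            for c in self.children:
                hits.extend(c.query(qx0, qy0, qx1, qy1))
        return hits

tree = QuadTree(0, 0, 100, 100, capacity=2)
for pt in [(10, 10), (20, 20), (80, 80), (85, 90), (90, 85)]:
    tree.insert(*pt)
print(sorted(tree.query(75, 75, 100, 100)))  # [(80, 80), (85, 90), (90, 85)]
```

The point of the structure is the pruning in `query`: a lookup only descends into the quadrants that overlap the search box, which is what makes the distributed joins described later cheap.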
Most people don't really want to work at that low level; they want to work at higher abstractions. So we've also open sourced and built tools above that, to let you do more familiar, high-level processing across these data. We have the Geometry API in Java: if you want to build your own MapReduce jobs, great, do that, but if you don't, it can be a steep learning curve. Above that is Hive, a framework that gives you SQL-like queries: you write SQL, and it turns it into MapReduce jobs; we extended Hive to have spatial operations in the queries. On top of that are the GIS Tools for Hadoop, which include a lot of different samples, examples, and patterns. And then for ArcGIS itself we put it into the desktop: just click a button. The nice thing is that someone can push a button on the desktop, or go under the hood into the libraries and get as crazy and custom as they want.
Just a very quick picture of how this works with quadtrees and distributed processing: we actually build spatial indices on different servers across the cluster. When a feature comes in, it's pushed via a high-level spatial index that determines which machine it should operate against; once it gets to that machine, a finer-grained quadtree index takes over within it. So when a request comes through for something like a join, at the high level it says: here is your feature, somewhere in New Mexico, go against this server and this index; and within that, we'll find out which county, police district, or neighborhood you're in. That is the concept of map and reduce: mapping the work out to the local machines where the data resides, doing the lookup on each machine, and reducing the results back. The quadtree is the technique that makes that lookup fast.
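A toy sketch of that two-level lookup (all index contents, server names, and bounding boxes here are hypothetical, just to show the routing idea):

```python
# Coarse spatial index: region -> server holding that region's data.
coarse_index = {"NM": "server-1", "OR": "server-2"}

# Per-server fine-grained index: county -> bounding box (x0, y0, x1, y1).
fine_index = {
    "server-1": {"Bernalillo": (-107.2, 34.8, -106.1, 35.2)},
    "server-2": {"Multnomah": (-122.9, 45.4, -122.3, 45.7)},
}

def locate(region, x, y):
    """Two-level lookup: route to the server for the coarse cell,
    then scan that server's fine-grained index for a containing box."""
    server = coarse_index[region]
    for county, (x0, y0, x1, y1) in fine_index[server].items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return county
    return None

print(locate("NM", -106.6, 35.1))  # Bernalillo
```

In the real system the fine index on each server is a quadtree rather than a linear scan, but the routing shape is the same: coarse index picks the machine, fine index answers the question locally.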
At a higher level there is Hive, which I mentioned. Most people know SQL, it's a very common language, and Hive gives you SQL across these big datasets. In this case I'm looking at the number of earthquakes by county, and I just want a calculation against that. It looks like a query against a relational database, but now it's handling potentially millions or billions of records, distributed across however many machines you choose.
Here is an example of what this looks like. We took FAA data, 14 million flight location points, and computed counts by county, kind of a silly little example. The first time we did this, a few months ago, using our tools on top of Hadoop, it took 30 minutes to run the operation. Not horrible: kick it off, go get a coffee, come back to your 49 records. Not bad, but we knew we could do better. We optimized how we shard the data, and optimized some of the spatial processing on the server itself, and in the end it now takes about 56 seconds to calculate against 14 million points, which is not too shabby. This is still batch; you can imagine streaming would be near real time. But it means asking a question no longer leaves you time to go get a coffee, which starts to improve your productivity. This was the previous version; version 1.2 actually just got released this morning, I heard from the team, with more optimizations in it. We haven't run benchmarks against that yet, but it should be a little faster, and it will be supporting more than just points and polygons, so there are more benchmarks to come. We'd love to hear from you if you have use cases like that.
Another interesting example: this was a project for an automotive company in Japan, and you can kind of guess which one. The question was where they should try to promote carpooling. The idea is that people who tend to live next to each other and work in the same area could be connected together for shared commutes, which helps with traffic reduction. So, from 249 million vehicle track points of commutes, we looked at where drivers were in the morning within 15 minutes of each other, and whether they ended up at the same place within 15 minutes of each other, within 500 meter grid cells. From that we could drill down to finding where there are, say, a hundred people who could actually be carpooling, all starting in the same neighborhood and going to the same workplace. I think that took about 30 minutes to run, and again we can improve it. The idea is letting users get into some pretty interesting questions. And the answer isn't even a map: it's a list of addresses and origin-destination pairs, not a map at all, but underneath it was a spatial question.
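The grid-cell matching idea reads like this in miniature (a toy Python sketch assuming projected coordinates in metres; the 15-minute time-window check from the talk is omitted, and all trip data is made up):

```python
from collections import defaultdict

CELL = 500  # grid cell size in metres

def cell(x, y):
    """Snap a coordinate to its 500 m grid cell."""
    return (int(x // CELL), int(y // CELL))

def carpool_candidates(trips):
    """Group commuters whose morning origin and destination fall in the
    same grid cells; any group of two or more is a carpool candidate."""
    groups = defaultdict(list)
    for driver, (ox, oy), (dx, dy) in trips:
        groups[(cell(ox, oy), cell(dx, dy))].append(driver)
    return [g for g in groups.values() if len(g) >= 2]

trips = [
    ("a", (120, 340), (9100, 8120)),   # same origin/destination cells as "b"
    ("b", (180, 460), (9050, 8300)),
    ("c", (5100, 200), (9100, 8120)),  # different origin cell
]
print(carpool_candidates(trips))  # [['a', 'b']]
```

Snapping to cells turns a fuzzy "near each other" question into an exact group-by key, which is exactly the kind of operation that distributes well across a cluster.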
Similar work, which we use a lot of these tools for, though I can't talk about all of it publicly, is for the Port of Rotterdam. They want to know, from all the ships that come in and out of the different ports, where the most traffic and congestion is, and where they should put better signaling and things like that. So we took a year of AIS data, the ship-tracking system, a year of all the ships coming in and out, and did some coarse spatial aggregations, hexbins, so check that off the bingo card, to see where the congestion was over time. And the idea is to now model it in real time: is this changing over time, rather than just an answer about the past? That kind of thing again is the Lambda architecture: they've processed the data to know the pattern, they alert when that pattern changes, and that kicks off more processing afterwards.
As you might remember from the earlier example, the same pattern also comes up in cities that have audio detectors on rooftops that can detect gunshots, with lots of different sounds streaming in. When something crosses a certain decibel level, an event kicks off a batch processing job to triangulate, from all the microphones, where that gunshot probably was.
Another example, something we've shown in the past and some of you have seen, was looking at social media tweets during a disaster, in this case Hurricane Sandy a few years ago. We did a normalized aggregation of social media mentions of power outages: people tweeting about power outages compared to people tweeting in general in Manhattan, compared to people talking about the hurricane globally, compared to everyone talking on Twitter globally. So we ask that question: OK, whenever this threshold is crossed, whenever we're talking more about power outages than I would expect, let me know, because part of the disaster is probably a power outage, among other things. The visualization keeps running, and in one panel I want to know what's important. Using Storm we were able to start processing these tweets, strip out the added values, and push them over a WebSocket to the browser, and then visualize it. Within a few seconds you start seeing what happened: a threshold was crossed, something I care about just happened, and I can now dive into the specific features and call out the individual tweets, because I had defined those specific ones. What did they say? Our power transformer exploded. You can grab photos of the explosion, verify it, and over time see how people moved in response to that power outage. By the next day people had all moved north toward Grand Central Station in Manhattan, where there was still power, so water and blankets could be sent to where people were going to be, not where they had been living. That is the kind of real-world, real-time learning.
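The normalization described here, the local mention rate divided by a global baseline, can be sketched like this (all numbers are hypothetical, and the threshold of 5 is arbitrary):

```python
def outage_signal(local_outage, local_total, global_outage, global_total):
    """Normalized rate: the share of local tweets mentioning outages,
    divided by the global share. A value well above 1 means 'more
    outage talk here than the world at large would predict'."""
    local_rate = local_outage / local_total
    global_rate = global_outage / global_total
    return local_rate / global_rate

# Hypothetical counts: 200 of 1,000 Manhattan tweets mention power,
# versus 1,000 of 100,000 tweets globally: roughly a 20x baseline.
signal = outage_signal(200, 1000, 1000, 100000)
print(signal > 5)  # True -> threshold crossed, alert and kick off batch job
```

Dividing by the global rate is what keeps the alert from firing just because everyone on Twitter is talking about the storm; it isolates the locally anomalous part of the signal.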
In case the benchmark numbers matter, just to look at this: it's a geo-enabled Storm process running on a Mac, just for benchmarks. Reading through about 10,000 tweets per second, parsing tweets at about 6,000 per second, and geo-joining them by grid cells at about 5,000 tweets per second, on a single Mac. And that's just Storm. Storm is essentially the new Hadoop for streams, like Hadoop Streaming; it's what Twitter open sourced after acquiring the company that built it, and it's what they use for their ad engine. So it can handle the volumes, and spatially enabled it can handle these location-based questions.
Going forward, I was hoping to have another demo of this; I don't yet, so I will soon, and I'll blog about it. We recently released something called ArcGIS Open Data, which gives a way for every government in the world to make its data open and accessible via GeoJSON and other formats. We have a thousand-odd sites created globally, with thousands of datasets, and more showing up. What we're hoping is that now, with these big data tools, people will go off and take all this amazing open government data and turn it into meaningful questions around climate, disaster resilience, the location impact of schools, poverty, health, and other aspects, answering these important questions for society through open source tools and open data. That's personally what I'm driving for, and what we'll be doing with this over the next six months.
So, to wrap up, these are the URLs to check out. esri.github.io has over 300 open source projects you can go explore, including the ones I've shown here; "GIS Tools for Hadoop" is the string to look for for the specific stuff I've talked about. One of our engineers in particular is very prolific, an amazing engineer who did a lot of the analyses shown here, Mansour Raad. He blogs prolifically, and every bit of code he writes is on GitHub. His blog, Thunderhead Explorer, covers a lot of the specific applications around these tools, and there are a lot of other kinds of examples on my account as well.
So that's my super brief big open source data talk. Thank you very much, I appreciate you being here, and I'm happy to take any questions. [Audience question, inaudible, about downloading the tools.] The tools are all available; I'm happy to chat afterward, at the booth in this section, or over coffee. So again, thanks very much, and have a good afternoon.

Metadata

Formal Metadata

Title Finding the Where in Big Fuzzy Data
Series Title FOSS4G 2014 Portland
Author Turner, Andrew
License CC Attribution 3.0 Germany:
You may use, adapt, and copy the work or its content for any legal purpose, and distribute and make it publicly available in original or modified form, provided you credit the author/rights holder in the manner they specify.
DOI 10.5446/31651
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Publication Year 2014
Language English
Producer FOSS4G
Open Source Geospatial Foundation (OSGeo)
Production Year 2014
Production Location Portland, Oregon, United States of America

Content Metadata

Subject Area Computer Science
Abstract We've gone to plaid. It is now easier to store any and all information that we can, because it might be useful later. Like a data hoarder, we would rather keep everything than throw any of it away. As a result, we are now knee-deep in bits that we are not quite sure are useful or meaningful. Fortunately, there is now a mature, and growing, family of open-source tools that make it straightforward to organize, process, and query all this data to find useful information. Hadoop has been synonymous with, and arguably responsible for, the rise of 'The Big Data'. But it's not your grandfather's mapreduce framework anymore (ok, in internet time). There are a number of emerging open-source frameworks, tools, and techniques that each provide a different specialty when managing and processing fast, big, voracious data streams. As a geo-community we understand the potential for location to be the common context through which we can combine disparate information. In large amounts of data with wide variety, location enables us to discover correlations that can be amazing insights that otherwise were lost when looking through our pre-defined and overly structured databases. And by using modern big data tools, we can now rapidly process queries, which means we can experiment with more ideas in less time. This talk will share open-source projects that geo-enable these big data frameworks, as well as use case examples of how they have been used to solve unique and interesting problems that would have taken forever to run or may not have even been possible.
Keywords big data
hadoop
analysis
