GeoMesa: Distributed Spatiotemporal Analytics
Formal Metadata
Title | GeoMesa: Distributed Spatiotemporal Analytics
Number of Parts | 188
Author | Fox, Anthony
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/31648 (DOI)
Language | English
Production Year | 2014
Production Place | Portland, Oregon, United States of America
Transcript: English (auto-generated)
00:00
My name is Anthony Fox, I'm from Charlottesville, Virginia. I work for a company called CCRI, Commonwealth Computer Research, and I'm here to talk to you about distributed spatiotemporal analytics, specifically built on top of the Hadoop ecosystem and Accumulo distributed column family database.
00:24
GeoMesa is a LocationTech open source project, part of a family of excellent projects that include GeoTrellis, Spatial4j, JTS, uDig and a handful of other spatial capabilities under the Eclipse Foundation.
00:42
So I didn't want to make any assumptions about what people know about the Hadoop ecosystem, so I'm gonna cover just enough of it that the rest of the talk is not opaque to anybody. Just a show of hands: how many people have heard of Hadoop, and how many people have written a MapReduce job?
01:03
Okay, so there's quite a bit of experience here, but I'm gonna cover just enough for those of you who haven't. After covering the Hadoop ecosystem, I'm gonna talk a bit about how GeoMesa fits into that infrastructure and how we enable spatiotemporal indexing
01:23
on top of this distributed database. But then the primary part of this talk is a dissection of three example analytics and how they execute across the different components in a traditional Hadoop stack. So I'm gonna try to get to that as quickly as I can
01:42
without being too confusing. So the first thing we're gonna cover is the Hadoop ecosystem. We're gonna cover HDFS and MapReduce. Then we're gonna talk about Accumulo, which is the database that's built on top of HDFS, and then talk briefly about some of the libraries that extend and simplify the process
02:01
of developing for Hadoop. The stack that I'm going to cover and that's going to be exemplified in all of the analytics is essentially what you see on the slide here. So at its core, we have HDFS, the distributed file system.
02:20
Accumulo's built on top of HDFS, as are other databases that are similar to Accumulo, like HBase and Cassandra. GeoMesa is built on top of Accumulo, and it also has plugins inside of GeoServer, which is shown on the right. And then the computational libraries that are on top of the Hadoop stack,
02:42
you have your traditional MapReduce. That's the batch analytic processing out of the Google paper from years ago. You've got a fairly new capability called Spark that's gaining a lot of momentum. It's very good for doing low latency computations in a distributed fashion.
03:01
And then you've got Storm, which is the streaming analytic platform that's on top of Accumulo. Tuples come in as a stream, it executes a computation and stores the results wherever you want them to be stored. Kafka's shown on the left. It enables things like Storm within this environment. It's a high performance queuing system,
03:22
message queuing system. But what we've done with GeoMesa is we have built plugins for GeoServer so that any access, any OGC access can come in through GeoServer and then be transparently executed on this stack. So any OGC client can make a WMS request,
03:42
which then distributes across the resources that are backed by this stack. Or a WFS query or a WPS request might come in and execute in one of MapReduce, Spark, or Storm. And I'm gonna talk through some examples of that. So first, HDFS and MapReduce.
04:03
HDFS is a block file system. It takes very large files on the order of terabytes, breaks them into blocks of, by default, 128 megs, but it's obviously configurable, and then distributes those blocks across all of your resources. So it also replicates those blocks
04:21
for redundancy and failover. The blocks establish data parallelism. So Hadoop is very good at data parallel computations, and MapReduce is very data parallel. You send a map task, one map task per block, it executes on that block, and if you have 100 blocks in your file,
04:41
you're gonna get 100 parallel tasks that execute. MapReduce is also predicated on associative computation. So there's, in the MapReduce paradigm, you've got map tasks which go over your raw data and aggregate it in some form and emit results. And then you have a shuffle step which organizes your data, sorts it,
05:03
and then sends it through a reduce step which does some aggregation. The canonical example, the hello world of MapReduce is word count. So probably everybody that has written a MapReduce job has at least seen this or even written it. But the idea is that you have a huge text file and it's broken up into these 128 megabyte blocks.
05:23
Each map task processes one of the blocks and emits all of the words and the number one associated with each word. And then each execution of the reducer happens against a single word, and it aggregates the counts for all of the occurrences of that word
05:40
that came out of the map tasks. The shuffle sorting phase happens in between in order to get to that single reduce step for a single word. And the summation is the associative operation there, it's the reduction. But more importantly for this audience is how neatly a heat map computation
06:02
maps to the same paradigm. So let's say you have a huge text file of WKT geometries and it's broken up into these blocks. You send out a map task per block, you send your computation to the data. For each feature in the block, each line is a WKT geometry.
06:21
You compute the world-to-screen transform, the pixels that that geometry impacts, and you emit each pixel and the number one. Then in the reduce step, after everything's shuffled and sorted by pixel, you sum up all of the elements that hit a pixel and you have a heat map.
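As a rough sketch of that heat map pattern (not GeoMesa's actual implementation), here is what the map and reduce steps might look like in Spark, assuming a text file of WKT points and a made-up worldToScreen function:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.locationtech.jts.io.WKTReader

    object HeatMapSketch {
      // Hypothetical world-to-screen transform onto an assumed 1024x512 canvas
      def worldToScreen(lon: Double, lat: Double): (Int, Int) =
        (((lon + 180) / 360 * 1024).toInt, ((90 - lat) / 180 * 512).toInt)

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("heatmap"))
        sc.textFile("hdfs:///data/points.wkt")                    // one WKT point per line
          .map(line => new WKTReader().read(line).getCoordinate)  // parse each geometry
          .map(c => (worldToScreen(c.x, c.y), 1))                 // "map": emit (pixel, 1)
          .reduceByKey(_ + _)                                     // "reduce": sum hits per pixel
          .saveAsTextFile("hdfs:///out/heatmap")
      }
    }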
06:42
It's very simple and it's very effective in a batch computation. I'm gonna show how we've implemented that in Accumulo for low latency MapReduce. So MapReduce is a computational paradigm. It's not tied to the MapReduce infrastructure
07:00
that Hadoop has. You can do it in other contexts as well, including Spark. So how does that look? Okay, so Accumulo is a distributed database built on top of Hadoop. It was based on Google's Bigtable paper, which came out in 2006. That paper spawned a number of implementations,
07:22
including HBase, which I think is the most widely used column family database on the Hadoop platform. But Cassandra's another instantiation of the same concept. You get column oriented storage. It's a key value store. I'm gonna go into more details about this in a second. But column oriented storage gives you nice compression
07:43
and also gives you an arbitrary number of columns. Each row can have a different set of columns. You have a more schema-less structure, although it's not no schema. One of the primary constraints of these distributed databases is that they impose a single type of indexing capability
08:03
and that's lexicographic indexes. Your data has to be sorted and the database will sort your data for you. And if you think back to the blocks that I just mentioned, it naturally conforms to that block level partitioning. So a block corresponds to a lexicographic subset
08:22
of your key space. And that's one of the primary challenges with indexing spatial data in an Accumulo column family database. I'm gonna go into more detail about that. We leverage a couple of nice things about Accumulo. Bloom filters help us to filter files that we don't need to access
08:41
to satisfy a query or a predicate. And server-side iterators are something that I think actually distinguishes Accumulo from the other databases like HBase. It's a natural extension point. You can drop in jar files that implement iterators, and they are slotted into the iterator stack
09:02
that's executing and traversing over your data. We use these to do spatio-temporal type predicates. So we do all of the D9IM topological predicates within iterators inside of the database. And we can also do analytics with these iterators. So I'm gonna show you that.
09:22
In more detail, I mentioned it's a key value store. The key, though, is actually broken up into a five-tuple consisting of a row ID, a column family, a column qualifier, a visibility and a timestamp. Visibility is the security marking within Accumulo; Accumulo is very sensitive to security. The timestamp is used to do versioning.
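As a minimal sketch of what writing one of those five-part keys looks like with Accumulo's client API (the row key, column names and value here are invented for illustration):

    import org.apache.accumulo.core.data.{Mutation, Value}
    import org.apache.accumulo.core.security.ColumnVisibility

    val m = new Mutation("01~9q8yy~tweet-1234")                // row ID: shard ~ geohash ~ feature ID
    m.put("attrs", "tweet_text",                               // column family, column qualifier
          new ColumnVisibility("public"),                      // visibility (the security marking)
          System.currentTimeMillis(),                          // timestamp, used for versioning
          new Value("FOSS4G is underway".getBytes("UTF-8")))   // the value: uninterpreted bytes
    // batchWriter.addMutation(m)  // a BatchWriter (not shown) would send this to a tablet server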
09:42
You can have multiple versions of a piece of data. And the value is an uninterpreted blob of bytes. Data is distributed. It's a distributed database obviously. So tables are broken into tablets and tablets are sent out across all of the tablet servers.
10:03
The tablet servers are processes that are running on all of the nodes in your cluster. And they are responsible for a single tablet or perhaps multiple tablets of a single table. So they know that they're specifically responsible for a subset of the entire key space.
10:21
And if a query happens to hit that subset, then that tablet server will be communicated with to satisfy that query. Accumulo has fantastic write performance. If you pre-split your table, you're essentially saying to each tablet server that you're responsible for this small subset.
10:42
And as your data comes in in a streaming fashion, you assign it to a particular tablet server based on its key. And so you're getting this dispersed write capability. You're spinning multiple disks as quickly as you can. That's how you get really good write performance with Accumulo.
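A hedged sketch of what pre-splitting a table looks like with Accumulo's API; the instance, credentials, table name and the choice of two-digit shard prefixes as split points are all assumptions:

    import java.util.TreeSet
    import org.apache.accumulo.core.client.ZooKeeperInstance
    import org.apache.accumulo.core.client.security.tokens.PasswordToken
    import org.apache.hadoop.io.Text

    val connector = new ZooKeeperInstance("myInstance", "zoo1,zoo2")
      .getConnector("user", new PasswordToken("secret"))        // assumed connection details
    val splits = new TreeSet[Text]()
    (0 until 12).foreach(i => splits.add(new Text(f"$i%02d")))  // one split per shard prefix, 00..11
    connector.tableOperations().addSplits("geomesa_records", splits)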
11:01
Like these other column family databases, HBase and Cassandra, Accumulo is quite low level. You have to lay your data out according to the way that your access patterns dictate. So in an RDBMS, in Postgres,
11:23
you lay out your data in a tabular form, and the database takes care of structuring the data on disk for performant access. In Accumulo, you have to map your data to that lexicographic index, which has implications for data layout. So I'm gonna talk about that with GeoMesa.
11:44
There are many libraries that you can use to simplify the process of development on top of Hadoop. On the left are a few libraries that are great for doing batch analytics. Some of these libraries, like Cascading and Scalding and Pig,
12:02
take your computation and in a sense, compile it to a set of staged MapReduce jobs. Spark does something very similar, but executes in its own execution environment for low latency purposes. On the right, you've got streaming analytics.
12:20
The canonical example is Storm, which is a directed acyclic graph representation of your computation. And then there's Spark as well, which has a sort of a new capability now called Spark Streaming, which I haven't had enough time to work with,
12:40
but it's pretty shiny and I'm looking forward to working with it. So let's talk a bit about GeoMesa and how we actually store spatiotemporal data and query it. There are three aspects to that that are critical for understanding it. One is we have to deal with this lexicographic index, and to do that, we use what's called
13:01
space-filling curves. That implies a physical data layout that has implications for performance. Finally, we have to address query planning in this structure so that we can actually respond to queries with arbitrary polygons. So the problem is that we have multi-dimensional data.
13:21
The dimensions of primary interest to this audience are lat, long, and time, but often we have dozens of other attributes as well. For lat, long, and time, we have to map those into a single linear, lexicographic dimension. To do that, we linearize the key space using geohashes.
13:42
Geohashes I'm gonna describe in the next set of slides. They're a form of space-filling curve, but they have some very nice properties: they're recursive prefix trees. You get nice compression, because they often share a prefix within Accumulo, and there's tunable precision. You can add or remove bits
14:01
to your representation of a geometry so that you can scale up to worldwide data sets or scale down to regional data sets and still use all of your cluster's resources. So, geohashes are a z-order curve.
14:20
There are Hilbert curves and a handful of other space-filling curves, but basically they map a multi-dimensional space into a single dimension. So remember, we've just got this lexicographic dimension which we can work with. A geohash interleaves the bits of successive splits along the longitude and latitude dimensions.
14:42
So the first split is on longitude: on the left you get a zero, on the right you get a one. At the next level you split on latitude, and you append a one if you're above the line, a zero if you're below the line, and you keep going until you get down to the level of resolution that you care about.
15:01
And the level of resolution that you care about is a function of the data boundaries of your problem domain. So if it's the world, you might go to 25 bits of resolution, but if it's a region, if it's the mid-Atlantic, you might go to 35 or 40 bits of resolution.
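A small sketch of that bit interleaving, assuming longitude is split first and a one bit means the upper or right half; grouping the resulting bits five at a time gives the base-32-style characters mentioned below:

    def geohashBits(lon: Double, lat: Double, nBits: Int): String = {
      var (lonLo, lonHi, latLo, latHi) = (-180.0, 180.0, -90.0, 90.0)
      val bits = new StringBuilder
      for (i <- 0 until nBits) {
        if (i % 2 == 0) {                              // even bits split the longitude range
          val mid = (lonLo + lonHi) / 2
          if (lon >= mid) { bits += '1'; lonLo = mid } else { bits += '0'; lonHi = mid }
        } else {                                       // odd bits split the latitude range
          val mid = (latLo + latHi) / 2
          if (lat >= mid) { bits += '1'; latLo = mid } else { bits += '0'; latHi = mid }
        }
      }
      bits.toString                                    // e.g. 25 bits is roughly a five-character geohash
    }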
15:21
This interleaving of bits induces a linear walk through the space, and the linear walk is lexicographic. So if you take that binary string and apply what is essentially a base-32 encoding (it's not exactly base 32, but it's similar), the lexicographic properties
15:40
are such that it traverses the space as you see here. You can go to any level of resolution as well, which is a nice property of geohashes. So how does this translate to laying out data within Accumulo and GeoMesa? As an example, we're gonna look at events in downtown San Francisco. So the first thing that we've done
16:01
is we've gridded our space down to 25 bits of resolution, which corresponds to five characters in a base-32 encoding. So we're gonna look at one of these coarse resolution blocks of data. So the first thing that we have to do
16:21
is you see our tablet servers on the right. The tablet servers are distributed across the CPUs, which are distributed across the disks. We wanna spin those disks in an optimal manner. So the first thing we have to do is allocate a slice of space in a structured manner on those disks. Now, what you see is that within that block
16:41
that we care about, 9Q8YY, downtown San Francisco, we've allocated a slice on all of our tablet servers. So we're actually gonna uniformly distribute the data to all of those tablet servers. And we do that by prefixing the data with a shard ID modulo the number of tablet servers that you would like to get,
17:01
the level of parallelism that you wanna get. That's represented by coloring the dots in the map. So all of the green dots go to tablet server one. All of the yellow dots go to tablet server two, and the red dots to tablet server three. And that happens within that level of resolution. But it's repeated for every one of these grid cells
17:23
in a structured way. So if a query comes in, we know that we have to hit these three tablet servers, but when we hit them, we can quickly jump to the slice that corresponds to 9Q8YY. So in our case, we're spinning three disks, but we're spinning them in a structured way so that we quickly traverse our data.
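A hypothetical sketch of that key layout (GeoMesa's real row key format is more involved): a shard prefix, computed from the feature modulo the desired parallelism, followed by the geohash, so every tablet server holds a slice of every coarse cell:

    def rowKey(geohash: String, featureId: String, numShards: Int): String = {
      val shard = math.abs(featureId.hashCode % numShards)   // e.g. 0, 1, 2 for three tablet servers
      f"$shard%02d~$geohash~$featureId"                      // e.g. "01~9q8yy~tweet-1234"
    }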
17:40
So how do we do query planning in this context? What I'm showing you here is an OGC CQL query with three predicates: a spatial predicate, the BBOX query at the top, a temporal during predicate, and an attribute predicate.
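The query being described might look something like the following ECQL, parsed here with GeoTools; the attribute names (geom, dtg, tweet_text) and the literal values are assumptions:

    import org.geotools.filter.text.ecql.ECQL

    val filter = ECQL.toFilter(
      "BBOX(geom, -122.45, 37.75, -122.40, 37.80) AND " +             // spatial predicate
      "dtg DURING 2014-09-01T00:00:00Z/2014-09-30T23:59:59Z AND " +   // temporal predicate
      "tweet_text LIKE '%FOSS4G%'")                                   // attribute predicate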
18:00
The idea of query planning in general is to minimize false positive disk reads (we don't wanna traverse data that we don't have to) and to maximize true positive disk throughput. So we wanna spin those disks, or as many disks as we can,
18:22
to get the data off of disk and satisfy the predicate query. Since we have three attributes in our predicates, we've got space, time, and an attribute called tweet text. GeoMesa actually has secondary indexes
18:42
on any of the attributes that you have in your data. So we have to choose the primary index, the one that reduces the cardinality of the results the most. So if you're doing a Postgres explain on your query, you'll often see it chooses a particular index and then does a sequential scan
19:01
across the results of that index and applies the predicates that it didn't use as an index or it does two indexes and it does a bitmap intersection of the results of those sets. In our case, we're going to talk about the spatio-temporal aspect. So assume for now that in this query,
19:22
we decide that the spatio-temporal predicates combined reduce the result set the most. So we're gonna ignore the attribute query, we're gonna actually apply that in parallel across our data. So the first thing that we have to do is we have to take our polygon and decompose it
19:41
into the set of covering geohashes that correspond to the ranges that we have to scan in our Accumulo database. We recursively iterate over the polygon using a priority queue, where the priority is based on the distance from the center of the geohash that we're looking at to the center
20:01
of our predicate polygon. And in this manner, we can optimally discover the covering geohashes and ignore any of the other geohashes that we don't have to traverse down into. So we get a set of geohashes at different resolutions.
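A simplified sketch of that decomposition: expand cells that intersect the query geometry's envelope, preferring those closest to its centre, and keep a cell as a scan range once it is fully contained or at the target resolution. GeoMesa's actual query planner is considerably more sophisticated:

    import org.locationtech.jts.geom.Envelope
    import scala.collection.mutable

    case class Cell(bits: String, env: Envelope) {
      def children: Seq[Cell] = {
        val midX = (env.getMinX + env.getMaxX) / 2
        val midY = (env.getMinY + env.getMaxY) / 2
        if (bits.length % 2 == 0) Seq(                 // even depth: split longitude
          Cell(bits + "0", new Envelope(env.getMinX, midX, env.getMinY, env.getMaxY)),
          Cell(bits + "1", new Envelope(midX, env.getMaxX, env.getMinY, env.getMaxY)))
        else Seq(                                      // odd depth: split latitude
          Cell(bits + "0", new Envelope(env.getMinX, env.getMaxX, env.getMinY, midY)),
          Cell(bits + "1", new Envelope(env.getMinX, env.getMaxX, midY, env.getMaxY)))
      }
      def dist(q: Envelope): Double =
        math.hypot(env.centre.x - q.centre.x, env.centre.y - q.centre.y)
    }

    def cover(query: Envelope, maxBits: Int): Seq[String] = {
      val start = Cell("", new Envelope(-180.0, 180.0, -90.0, 90.0))
      val queue = mutable.PriorityQueue(start)(Ordering.by((c: Cell) => -c.dist(query))) // closest first
      val ranges = mutable.Buffer.empty[String]
      while (queue.nonEmpty) {
        val c = queue.dequeue()
        if (c.env.intersects(query)) {
          if (c.bits.length >= maxBits || query.contains(c.env)) ranges += c.bits // keep as a scan range
          else queue ++= c.children                                               // refine further
        }
      }
      ranges.toSeq
    }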
20:20
So you can see kind of in the center there that there's a fairly large geohash. That's at a lower resolution than some of the ones around the edges that have to cover the complexity of the border of the polygon. Each of those geohashes corresponds to a scan range. So then we send the scan out to all of the different tablet servers, and we send with the scanner the attribute filter.
20:43
So we say, we know now that you only need to scan these small ranges of data, and you have to apply this attribute filter, which is tweet text LIKE 'FOSS4G', in parallel on the server side. So that's the sequential scan part, but it's against such a reduced subset
21:02
of the data that it's much faster. And that's the general idea behind GeoMesa's spatio-temporal querying. So that's the background material. Now we're gonna decompose and dissect three analytics
21:22
and how they execute across all of those components of the Hadoop stack that I put up as an image before. The three analytics are density computations, streaming analytics for things like anomaly detection or tracking and the final one is spatio-temporal event predictions.
21:42
So starting with density computations, this is a minimal stack, a minimal use of components within the stack. We wanna take dots on a map that have some information but not that much information and turn it into a heat map that has much more information. We already talked about how you might do that
22:01
in a MapReduce fashion, but what we can do is do that all within Accumulo. So I said that we have uniformly spread the data across all of our tablets, but the data represents that single cell in all of the tablets. So what we do is we send out an iterator
22:21
which we've stacked on top of this stack of iterators, the extension point of Accumulo, that traverses our data and initializes a sparse matrix. The sparse matrix for each tablet server covers the whole cell, but it doesn't cover all the data in the cell. So the map task is initializing the sparse matrix.
22:43
It's sent back as a sparse matrix, which is a compressed representation of all this data, and then on the client side, all of those matrices are summed together. That's our associative operation, and the result is a heat map.
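The client-side reduction is just an associative sum of those per-tablet-server matrices; a minimal sketch, representing a sparse matrix as a map keyed by (row, column), which is an assumed representation rather than GeoMesa's own:

    object DensityMerge {
      type SparseMatrix = Map[(Int, Int), Long]

      // Sum per-tablet-server sparse matrices cell by cell; summation is associative,
      // so the matrices can arrive and be folded in any order.
      def merge(matrices: Seq[SparseMatrix]): SparseMatrix =
        matrices.foldLeft(Map.empty[(Int, Int), Long]) { (acc, m) =>
          m.foldLeft(acc) { case (a, (cell, count)) =>
            a.updated(cell, a.getOrElse(cell, 0L) + count)
          }
        }
    }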
23:00
So a request comes in. In this case, it's via WMS, and it requests a heat map via a styling parameter in the SLD portion of the request. GeoServer is acting as the client of Accumulo in this case. It sends out requests to Accumulo, to each tablet server that has a slice of the data that we care about. Each tablet server executes and computes a sparse matrix
23:25
of the data that it knows about for that cell, sends it back to the client which aggregates it into a single representation and sends it out over the OGC request. So that's pretty simple. The second analytic that I wanted to talk about
23:42
utilizes Storm for streaming analytics. Some of the use cases are epidemiology, how diseases propagate around the world which is particularly apropos with the Ebola stuff that's happening now. Geofencing, you might wanna put a virtual polygon around an area and see when things enter
24:01
or leave the area of interest. Tracking problems, I already mentioned. One of the interesting things recently has been event detection in streams of data. There's a company called Jawbone that makes a Fitbit-like health monitor, and they had a really interesting
24:22
analysis of the sleep patterns of their users after the Napa earthquake. They could see how the sleep pattern was disrupted as you went out from the epicenter. And that's what's showing up on the top right there in that line chart.
24:40
And the URL is listed there as well. It's pretty interesting to go to. So you could take in a Twitter stream, you can monitor it, you can infer the sleep patterns or the disruptions of sleep patterns. You can cluster mentions for impact analysis and for potential rapid epicenter analysis for emergency resource allocation,
25:01
those types of applications. The architecture stack looks like this. You have GeoServer sitting in front of a Kafka queue. You have a fire hose of data, Twitter's fire hose of data as an external source being published to Kafka topics. The storm topology that's running
25:21
and represented by those bolts in between Accumulo and Kafka has a spout that's listening to the particular topics. A spout is Storm's vocabulary for something that pulls data into the computational topology. So the spout reads messages off the topic and then sends them to this computation
25:41
which might do filtering for messages about earthquakes and then do some sort of clustering, like DBSCAN, in one of the bolts on the far right. It writes to Accumulo. It also potentially reads from Accumulo to get static contextual information to improve the analytic, and the result is written both from Accumulo and from Storm out through a topic that GeoServer's listening to.
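A rough sketch of wiring up such a topology with Storm's TopologyBuilder (package names from recent Storm releases); the pass-through bolt below is a stand-in where the real filtering, DBSCAN clustering and Accumulo writes would go, and the broker, topic and topology names are assumptions:

    import org.apache.storm.{Config, StormSubmitter}
    import org.apache.storm.kafka.spout.{KafkaSpout, KafkaSpoutConfig}
    import org.apache.storm.topology.{BasicOutputCollector, OutputFieldsDeclarer, TopologyBuilder}
    import org.apache.storm.topology.base.BaseBasicBolt
    import org.apache.storm.tuple.{Fields, Tuple, Values}

    // Placeholder bolt: just forwards each tuple; real bolts would filter, cluster, or write to Accumulo.
    class PassThroughBolt extends BaseBasicBolt {
      override def execute(input: Tuple, collector: BasicOutputCollector): Unit =
        collector.emit(new Values(input.getValue(0)))
      override def declareOutputFields(declarer: OutputFieldsDeclarer): Unit =
        declarer.declare(new Fields("message"))
    }

    object StreamingTopologySketch {
      def main(args: Array[String]): Unit = {
        val builder = new TopologyBuilder
        builder.setSpout("tweets",
          new KafkaSpout(KafkaSpoutConfig.builder("broker:9092", "tweets").build()))  // reads the Kafka topic
        builder.setBolt("filter", new PassThroughBolt).shuffleGrouping("tweets")
        builder.setBolt("cluster", new PassThroughBolt).shuffleGrouping("filter")
        builder.setBolt("store", new PassThroughBolt).shuffleGrouping("cluster")
        StormSubmitter.submitTopology("geo-stream", new Config, builder.createTopology())
      }
    }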
26:00
So what we've implemented within GeoServer is a data store that retrieves its data from Kafka, and it only retrieves the last 30 minutes of data. So every time you hit a WMS or a WFS against that,
26:21
you're gonna get that 30 minute cache. So it's a real time data source. So there's lots of applications of streaming analytics in this context and we use this sort of infrastructure and this set of components to do that. The last analytic that I wanna talk about
26:40
is spatiotemporal event prediction. The applications are real estate buying and selling patterns, and again epidemiology, but we're gonna work through a criminal incident prediction example, and this one's going to use traditional MapReduce as its computational backbone. So the idea is that we're going to model
27:01
a criminal's preferences for where they intend to commit a crime and we're gonna use spatial features as a proxy for the choice factors that they use when making these decisions. The underlying principle is that we're looking at crimes, economic crimes, not necessarily crimes of passion.
27:21
Economic crimes have a rational foundation to them, so we can model them and we can predict them. The example that I have up here explicitly is breaking and entering. So if you think about breaking and entering, there are some choices that you're going to make about where you might commit a breaking and entering crime.
27:42
You're gonna use factors like the nearest police station, the neighborhood demographics, the distance to the nearest highway on-ramp, lighting in the neighborhood of interest and a whole host of other factors, all of which are represented as vector features.
28:01
So in order to do this analysis, we need to take those vector features and the locations of historical events, historical breaking and entering events, and determine which of those vector features have a predictive impact on the activity.
28:21
So the first thing that happens is this comes in as a WPS request. The inputs to the model are a list of features to consider and the historical events. This breaks down into a two-stage MapReduce job. The first stage is a vector-to-raster transformation of those features.
28:40
We have to say, given any site on my map, what is its relationship to a police station? And we compute a distance to that police station. So we parallelize over the different features, and we have 50, 60, 100 different features. We send those out to map tasks within Hadoop.
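Each of those map tasks is essentially doing something like this simplified sketch of the vector-to-raster step, which computes, for every grid cell, the distance to the nearest feature (here, police stations as plain coordinate pairs; the grid parameters are assumptions):

    // For each cell in a width x height grid, compute the distance to the nearest station.
    def distanceRaster(stations: Seq[(Double, Double)], width: Int, height: Int,
                       minX: Double, minY: Double, cellSize: Double): Array[Array[Double]] =
      Array.tabulate(height, width) { (row, col) =>
        val cx = minX + (col + 0.5) * cellSize                  // cell centre x
        val cy = minY + (row + 0.5) * cellSize                  // cell centre y
        stations.map { case (sx, sy) => math.hypot(cx - sx, cy - sy) }.min
      }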
29:00
So the top task tracker might be working on the police stations problem, the middle one is working on the demographics problem, and so forth. Each one of those requests the data out of GeoMesa and brings it back and computes a raster representation of that data. Those raster fields are then sent back to GeoServer,
29:22
which is acting as the client again. GeoServer takes that and fuses it with the locations of historical events and it does that using a statistical model. So now it's estimated a model. It knows the weights of different factors and it needs to predict across the entire geospatial context
29:43
where the most likely places are that a criminal act is going to occur. So in order to do that, we have to apply this model, which may have an 80-dimensional matrix representation, to every discretized cell in our map
30:02
and that's an expensive operation. So we can parallelize that again by blocking out different portions of the map and sending each block to a different map task for execution, which is shown here. Each one is applying the model to a different region of space and the aggregation, the reduction is done in GeoServer
30:24
and the result is a threat surface, a breaking and entering threat surface in downtown San Francisco. So that concludes the talk that I was giving today. I've listed a bunch of references here.
30:41
Hopefully this is available online so if anybody's interested, you can go to any of these websites and I'm happy to take any questions from anybody.