GeoMesa: Distributed Spatiotemporal Analytics

Formal Metadata

Title
GeoMesa: Distributed Spatiotemporal Analytics
Title of Series
Number of Parts
188
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer
Production Year
2014
Production Place
Portland, Oregon, United States of America

Content Metadata

Subject Area
Genre
Abstract
The rapid growth of traditional and social media, sensors, and other key web technologies has led to an equally rapid increase in the collection of spatio-temporal data. Horizontally scalable solutions offer a technically feasible and affordable way to manage this growth, allowing organizations to incrementally scale their hardware in tandem with data increases.
GeoMesa is an open-source, distributed, spatio-temporal database built on the Accumulo column-family store. Leveraging a novel spatio-temporal indexing scheme, GeoMesa enables efficient (E)CQL queries by parallelizing execution across a distributed cloud of compute and storage resources, while adhering to Accumulo's fine-grained security policies. GeoMesa integrates with GeoTools to expose the distributed capabilities through a familiar API. GeoServer plugins also enable integration via OGC standard services with a much wider range of technologies and languages, such as Leaflet, Python, uDig, and QuantumGIS. In this presentation, Anthony Fox will discuss the design of spatio-temporal indexes in distributed "NoSQL" databases, the performance characteristics and tradeoffs of the GeoMesa index, and how it can be leveraged to scale compute-intensive spatial operations across very large data sources. This discussion will detail how GeoMesa distributes data uniformly across the cloud nodes to ensure maximum parallelization of queries and other computations. Specific computationally intensive analytics include distributed heat map generation over time, nearest neighbor queries, and spatio-temporal event prediction. He will present common analytic workflows against spatial data expressed as batch map-reduce jobs, dynamic ECQL queries, and real-time Storm topologies. Using the Global Database of Events, Language, and Tone (GDELT) dataset as a working example source, Mr. Fox will demonstrate how a completely open-source architecture stack, including GeoMesa, enables ad-hoc and real-time analytics.
This presentation will be of interest to data scientists, geospatial systems developers, DevOps engineers, and users of massive spatio-temporal datasets.
Keywords