GeoMesa: Scalable Geospatial Analytics

Automated media analysis (beta)

Speech transcript
My name is Anthony Fox, from Charlottesville, Virginia, with a company called Commonwealth Computer Research. We developed GeoMesa, which we open-sourced about a year ago, and then connected with the LocationTech team; GeoMesa is now under the LocationTech umbrella. I am going to talk today about GeoMesa: what it is and how it came to be. I will talk a little about distributed databases, which is what GeoMesa is built on, then dive into how we do the indexing that enables geospatial data in a non-relational database, and then move on to a few analytics that show how we leverage what GeoMesa builds on top of it [unclear].
GeoMesa: many of you have been dictated by customers to move to a cloud, and some of you have an actual justification and need to move to a cloud. In our case it was a little bit of both. We were directed to migrate an analytic that had intense computational requirements to a cloud-based system, and the tools we had come to rely on, the rich geospatial functionality available in PostGIS, were just not available to us on the back end. So we developed all of the geospatial capability that we needed to support our analytic, quickly realized that it was useful on its own, decoupled it from the analytic, and open-sourced it. GeoMesa is the result. It is a distributed spatio-temporal database, built in particular on the Accumulo column-family database, which I will go into in more detail shortly. The goal of GeoMesa is to be a runtime-only dependency: it implements the GeoTools standard interfaces and also exposes data in these distributed databases through standardized OGC services such as WFS and WMS [unclear]. The point is that you should never have to import a GeoMesa class into your application. You should work with the GeoTools interfaces directly, and the geospatial computations are transparently executed on the back end. The visualizations you are seeing here are all associated with the GDELT dataset, the Global Database of Events, Language, and Tone [unclear]. It has 150 million geolocated events since 1979, about 100 gigabytes of uncompressed data. To give you some numbers, we can ingest it into the system on a small virtualized [cluster] in approximately 15 minutes, and then run analytics and visualizations against that dataset.
Some justification for cloud-based geo: why would you need it? You do not have to look very far to see high-velocity spatio-temporal data. Twitter generates hundreds of thousands of tweets [unclear]; a small percentage of those are geotagged, but a small percentage of a large number is still a large number. Click streams [unclear]: everybody's geolocated click streams. Mobile advertisers are very interested in geolocated click streams, and I will show you how you can correlate geolocated click streams in an analytic we developed. Satellite imagery is pretty obvious. Vehicle and traffic sensors [unclear] generate tons of high-velocity data. So you have a need for a distributed database because you have more data than can fit on a single machine's disk or be processed within a single machine's memory, or because the customer dictates that you use a column-family database. Google published a paper in about 2006 or 2007 on a system called Bigtable, and quickly after that a number of derivative databases came about. Among those derivative databases are HBase, which is built directly on the Bigtable model, Cassandra, and Accumulo, which focuses on security and has cell-level security, which we are starting to [support] in GeoMesa. These distributed databases, in particular these key-value stores, have very flexible schemas, so it is easy to get up and going with them. But they are not schema-less: the application always imposes a schema on the data [unclear]. What this means is that the complexity of querying is pushed up into the application layer. You have to make interesting trade-offs where you potentially have a multi-tenant database and each application dictates how it does scans and how it uses the resources, so you have to balance between flexible schemas and potentially conflicting data access patterns. They are horizontally scalable, which is a very nice feature; I will talk about how we leverage that. In particular, Accumulo has the notion of
tablets: a single table is spread across many tablet servers, and we can take advantage of the processing and I/O bandwidth of multiple tablet servers. Accumulo will handle failover, balancing, rebalancing, and all the nitty-gritty distributed complexity. But here is another trade-off. In PostGIS you have very nice and sophisticated R-trees and quadtrees, your traditional spatial indexes, which work incredibly well on relatively [modest]-size datasets. When you go to one of these column-family databases, you do not have many indexes. You have to design your tables such that the table itself is the index, and the only index actually available to you is the implicit lexicographic order of the keys in your key-value store. This is a critical element that has to be thought through carefully when pushing geospatial data into such a store, because geospatial data is obviously multi-dimensional, and at times three-dimensional.
So how do we go about leveraging distributed databases for the qualities that are their advantages? We get partitioning very easily. Partitioning almost comes for free, whereas with relational databases you have to do a lot of work and administration to do partitioning. With distributed databases partitioning is effectively given to you, and you are able to distribute queries across multiple machines. These are concurrent queries by different applications or users, and the load can be spread across the [cluster]. We use striping to distribute computations within a single query. A single query can be split up across all of the tablet servers: with 100 tablet servers, you can potentially have 100 processors executing your query. This makes some very large datasets operational and interactive. The trade-off you are making is that when you stripe data across all the resources, you incur a cost when you have concurrent queries from many users, because all the resources are brought to bear on every single query. We have a flexible element in the indexing structure that allows you to tune the amount of parallelism per table. Accumulo has an extension point called server-side iterators, which we leverage in GeoMesa to provide geospatial CQL and ECQL query semantics at the data rather than in a secondary scan. We also embed custom analytics, and I will show some examples, inside the server-side iterators, which essentially become ad hoc, interactive, MapReduce-like computations.
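The striping idea described above can be illustrated with a minimal Python sketch. This is not GeoMesa's actual key layout: the shard count, the key format with "~" separators, and the function names are all assumptions made for the example. It only shows how a shard prefix spreads rows over a fixed number of key ranges, and how one logical scan fans out into one range per shard.

```python
import zlib

NUM_SHARDS = 4  # hypothetical value; the talk only says parallelism is tunable per table


def shard_of(feature_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Derive a stable shard number from the feature id."""
    return zlib.crc32(feature_id.encode("utf-8")) % num_shards


def row_key(feature_id: str, index_value: str, num_shards: int = NUM_SHARDS) -> str:
    """Prefix the sortable index value with a zero-padded shard number."""
    shard = shard_of(feature_id, num_shards)
    return f"{shard:02d}~{index_value}~{feature_id}"


def fan_out(start: str, end: str, num_shards: int = NUM_SHARDS):
    """Turn one logical (start, end) range into one key range per shard."""
    return [(f"{s:02d}~{start}", f"{s:02d}~{end}") for s in range(num_shards)]
```

More shards mean more servers can work on one query, but also that every query touches every shard, which is the concurrency cost mentioned in the talk.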
[unclear] I will summarize this really quickly. The key to working with column-family, key-value data stores is that they have an implicit lexicographical order, and space-filling curves are a one-dimensional data structure that allows you to project multiple dimensions into a linear space.
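As a rough illustration of projecting two dimensions into a linear space, the following Python sketch interleaves the bits of a discretized longitude and latitude into a single integer (a Z-order, or Morton, curve). The function names and the 16-bit resolution are assumptions for the example; the talk does not say which bit layout GeoMesa uses.

```python
def normalize(value: float, lo: float, hi: float, bits: int) -> int:
    """Scale a value in [lo, hi] to an integer cell in [0, 2**bits - 1]."""
    cells = 1 << bits
    cell = int((value - lo) / (hi - lo) * cells)
    return min(cells - 1, max(0, cell))


def interleave(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y; x takes the odd bit positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)
        z |= ((y >> i) & 1) << (2 * i)
    return z


def z_index(lon: float, lat: float, bits: int = 16) -> int:
    """Map a longitude/latitude pair to one sortable integer."""
    x = normalize(lon, -180.0, 180.0, bits)
    y = normalize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)
```

Because the most significant bits of both coordinates come first, points that fall in the same coarse cell share a key prefix, which is what lets a sorted key-value store act as a spatial index.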
There is a nice video here [unclear] that shows how the space-filling curve fills the space that we are interested in. [unclear] The z axis is time. [unclear] That is the striping: each portion of the geographic-time space contributes to each tablet, so the data is striped across all the tablets in a structured way [unclear].
For complex polygons and linestrings, we essentially decompose the polygon into multi-resolution geohashes. Geohash is the implementation of space-filling curves that we are using. [unclear] We store each decomposed geohash as an element in the index space.
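For readers unfamiliar with geohashes, here is a small Python sketch of the publicly documented geohash encoding for a single point. It is offered as an illustration only; the talk does not state whether GeoMesa's encoder matches it in every detail, and the function name is an assumption.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lon: float, lat: float, precision: int = 5) -> str:
    """Encode a point by repeatedly halving the lon/lat ranges, longitude first."""
    lon_range = [-180.0, 180.0]
    lat_range = [-90.0, 90.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash(-5.6, 42.6))  # "ezs42"
```

A shorter hash is a prefix of a longer one for the same point, so a coarse cell contains all of its finer cells. That nesting is what makes a multi-resolution decomposition of a polygon possible: large cells cover the interior, small cells follow the boundary.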
Query planning amounts to computing the ranges that are candidates for results of your query [unclear].
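To make "computing the ranges" concrete, here is a deliberately naive Python sketch on a small integer grid. It enumerates every cell in a query box, converts each to its Z-order key, and merges consecutive keys into scan ranges. The brute-force enumeration and the function names are assumptions for illustration; a production planner would decompose the box recursively rather than visit every cell.

```python
def interleave(x: int, y: int, bits: int) -> int:
    """Z-order key for grid cell (x, y); x takes the odd bit positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)
        z |= ((y >> i) & 1) << (2 * i)
    return z


def plan_ranges(xmin: int, ymin: int, xmax: int, ymax: int, bits: int):
    """Return sorted, merged (start, end) key ranges covering the box (inclusive)."""
    keys = sorted(
        interleave(x, y, bits)
        for x in range(xmin, xmax + 1)
        for y in range(ymin, ymax + 1)
    )
    ranges = []
    for key in keys:
        if ranges and key == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], key)  # extend the current range
        else:
            ranges.append((key, key))  # start a new range
    return ranges
```

A box aligned with the curve collapses into one range, while a box that straddles a cell boundary produces several. Each range is only a candidate: rows inside it still have to be checked against the exact query geometry.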
To get to the analytics that we developed: we implemented them all as WPS services, and they are deployed in GeoServer, so they are discoverable [unclear]. This one is a spatio-temporal prediction analytic; it predicts events in space and time. One of the interesting things about this analytic is that this is Santiago, Chile, and these are robbery events in Santiago, with predicted robberies shown as the hot areas [unclear]. The area just below the red, I believe, is a stadium. We recently transitioned from a linear model to a [nonlinear] model that is able to detect from the data that the threat of robberies is higher around the stadium. All of that is implemented as a service. What is interesting is that any type
of associative computation can be cast as a fold, and this is a native MapReduce method implemented within iterators. For instance, density computations can be done very rapidly by computing a sparse density matrix [unclear] within each tablet server. You have hundreds of workers that are brought to bear on your computation, and the reduce side applies an associative operation, which in the case of densities is just summation. You can express many different types of computations this way.
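The density example can be sketched in a few lines of Python. Here each "tablet" is simply a list of points, each one produces a sparse grid of counts, and the partial grids are combined by summation. The data layout, the cell size, and the function names are assumptions for illustration; in the system described in the talk the per-tablet step runs inside server-side iterators.

```python
from collections import Counter
from functools import reduce


def partial_density(points, cell_size: float = 1.0) -> Counter:
    """Map side: count the points of one tablet per grid cell (sparse)."""
    grid = Counter()
    for lon, lat in points:
        grid[(int(lon // cell_size), int(lat // cell_size))] += 1
    return grid


def combine(a: Counter, b: Counter) -> Counter:
    """Reduce side: an associative operation, here cell-wise summation."""
    return a + b


def density(tablets, cell_size: float = 1.0) -> Counter:
    """Fold the per-tablet grids into one density grid."""
    partials = (partial_density(points, cell_size) for points in tablets)
    return reduce(combine, partials, Counter())
```

Because summation is associative and commutative, the partial grids can be combined in any order or grouping, which is what allows the work to be spread over many servers.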
Another interesting analytic we developed [unclear], implemented within the server-side iterators, is interpolated time-space queries. Basically, I would like to see who you might have interacted with on a trip that you made, so it is time interpolation as well as space interpolation. We like to think of it
as tweeting on the New Jersey Turnpike. You have a person who is tweeting as they travel on the New Jersey Turnpike, and I would like to know who they might have [met] with, or where they might have stopped and interacted. So you have to do a complex series of queries that shift in time [unclear] and interpolate a track based on the road network [unclear] to find the possible interactions.
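The time-interpolation part of such a query can be illustrated with a simplified Python sketch. It estimates a position between two timestamped observations by straight-line interpolation. This is a stated simplification: the analytic in the talk interpolates along the road network, which this example does not attempt, and the track format and function name are assumptions.

```python
from bisect import bisect_right


def position_at(track, t):
    """Estimate (lon, lat) at time t from a track of (time, lon, lat) sorted by time."""
    times = [point[0] for point in track]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the time span of the track")
    i = bisect_right(times, t)
    if i == len(track):  # t equals the last timestamp
        return track[-1][1:]
    t0, lon0, lat0 = track[i - 1]
    t1, lon1, lat1 = track[i]
    fraction = (t - t0) / (t1 - t0)
    return (lon0 + fraction * (lon1 - lon0), lat0 + fraction * (lat1 - lat0))
```

Each interpolated position, together with its time, could then seed a small space-time query for other features nearby, which is the "series of queries" the talk refers to.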
That is the result of this. It is executed via WPS, and with gap filling and tracks snapped to roads we can get more information [unclear].
A bit about where GeoMesa is going: we are pushing for a 1.0 release in June [unclear]. We are implementing [role]-based authentication and authorizations that integrate with the cell-level security in Accumulo. We have audited queries, and query planning with relational projections allows us to subset the data and rapidly return just what we need for each query. In the future we are looking at integrating deeply with GeoServer and the Hadoop ecosystem, so that WPS processes can be executed across an array of compute paradigms like Storm, MapReduce, and Spark, and [unclear] caching can be pushed into [unclear] within Accumulo. We are also looking at other data [structures] [unclear] and how those might improve query performance. A couple of links here: the LocationTech website lists some information about GeoMesa, and we have a GeoMesa website that has tutorials and other demonstrations, as well as a users mailing list and a dev mailing list. I will take any questions [unclear].

Metadata

Formal metadata

Title: GeoMesa: Scalable Geospatial Analytics
Series title: LocationTech Summit 2014
Number of parts: 14
Author: Fox, Anthony
License: CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
DOI: 10.5446/15339
Publisher: LocationTech, Andrew Ross
Year of publication: 2014
Language: English
Year of production: 2014
Place of production: Washington, DC

Content metadata

Subject area: Computer science
Abstract: The proliferation of smartphones with embedded geolocation sensors has led to an explosion of geospatial data in all domains. Every mobile app now asks users to enable location services and generates copious geotagged data. Existing solutions for managing this data rely on traditional approaches using geospatial relational database platforms. GeoMesa is an open source, scalable spatio-temporal index built on top of the Accumulo distributed column-family database that provides efficient, OGC standards-based access to and querying of very large datasets. GeoMesa provides WMS or WFS services over HTTP for data access as well as an API based on GeoTools. Spatial analytics in GeoMesa can leverage Hadoop to perform computations in parallel on a cloud. Sensitive personal information inherent in consumer geolocated data can be protected using Accumulo's cell-level security. This talk will cover the indexing structure in GeoMesa and how it enables scalable geospatial analytics in a cloud platform.
