How GIS-friendly are Graph Databases today?

Speech transcript
OK, welcome everyone to my talk. It's great to be speaking to you.
"Graphs are everywhere" is a typical marketing phrase that you will find if you search for graph databases. Here I use one of the most quoted graph pictures, I think: the knowledge graph. If you think about it, the graph model is pretty fascinating, because it's pretty intuitive for humans, it's really easy to read, and you can model nearly any data model and any domain as a graph. I am fascinated by this idea, by this model, and by the databases out there that rely on it. So I asked myself: how GIS-friendly are graph databases today? Not coincidentally, the university I am working at right now is also interested in this field and has run some research projects. Today I will talk about the experience that they have made and that I have made, including one research project that I supervised.
Starting off with some theory: what is a graph database? I think most of you who came to this talk already know this. Usually graph databases are listed under the umbrella of NoSQL databases. They are one subfamily, and they are also schema-less, which makes them comparable to the other NoSQL databases, but the data model also makes them pretty unique. There are two different kinds of graph databases, as far as I know. There are the ones which store nodes and edges as first-class citizens. You can have undirected and directed edges, with labels and so on. Both are first-class citizens and both can have properties, which is why this is called the property graph model. There are some open source property graph databases out there. I did not list every graph database, only the ones where I know that they have spatial support, either directly or through other systems.
There is also another family, which is related to Semantic Web technologies. Its data model is based on triples: you have a subject, a predicate and an object, and if you have a list of triples, you automatically have a graph. There are also some very prominent standards, such as RDF and the query language SPARQL. Again, the open source ones I found with spatial support are these. I also found out that you can run GeoSPARQL queries on top of PostGIS through a rather weird infrastructure: you need lots of different projects, but in the end you have PostGIS underneath, so you can do GeoSPARQL on PostGIS.
For the rest of my talk I will focus on the property graph model. I will also not give you an introduction to the specific features that these databases offer. That is something you will mostly hear in talks at other conferences, and I think you can do it by yourself: with half an hour of research you will find out. In summary, most graph databases have a built-in point type, they support the WGS 84 reference system only, for spatial indexing they have geohash indexing [unintelligible] and so on, and they support within-bounding-box queries, within-distance queries and so on. So by looking at this small list you can probably answer the question "are graph databases GIS-friendly?" by yourself and say: well, not really. And we could finish the talk right here.
But I thought it would be interesting for you to hear what we found out when we used graph databases in different use cases. These are the three I will present in the next slides.
Routing, together with spatial R-trees, is probably the most obvious use case. If we have a graph, why not just route between the nodes that we have there and find the shortest path between them? We were also thinking: if we click on the map, we want to find the nearest edges and start routing from there. The scenario was this: we got data from the traffic management system of the city of Dresden in Germany. We got a road network with travel times for each link, and we also got penalties for turns and traffic lights. This was all done by a group of students, who implemented routing algorithms in a graph database. The project was related to the research project ExCELL, which I am working for.
In this dataset we got the streets and the intersections, interestingly, both as nodes. So we had to create edges between all the nodes and move the travel time information from the street nodes to the edges, and then we could start working with the routing algorithms that are available in Neo4j. This was implemented using Neo4j in embedded mode, as a library; you can also run Neo4j as a server.
The results: routing works, and it's pretty easy to set up, but it is slower than state-of-the-art routing engines by a factor of about ten, even though we really tuned Neo4j as hard as we could. Of course, it is a database. It's fine, but it's not the fastest solution. Databases are designed more for general-purpose tasks, even though a graph database is pretty specific. You can compare it to pgRouting, which is also not the fastest but still flexible. What was also a bit disappointing was that the R-trees were really slow. I have another figure, but it's really embarrassing, and it showed that even adding the spatial index didn't really work. The graph I show here shows that the longer the route, the more computation time it takes. So that's it for routing.
Coming to the next part: this was a research project where we thought, wouldn't it be cool to have all your metadata from different data sources, different databases, stored in a graph database, and to persist matches between equal columns and intersecting data, especially spatially intersecting data, with additional edges? The scenario here: we had a compound of data from the Botanical Garden of Berlin, which was maintained in many different datasets. We got loads of CAD data, which we first had to georeference and transfer to geodatabases. This was all done in a research project called [unintelligible]; there was also a talk about it at one of the German FOSSGIS conferences, in 2014 I think.
As I told you, we migrated most of the CAD data into Postgres using ETL tools, and we stored all the metadata in Neo4j. The goal was also to create an application where you can browse through the data. You can see it here: a very simple web mapping application, with different colours for the different data sources. In theory it works something like this: you have one database node, then you get nodes for the tables and nodes for the columns. Equal columns get an extra edge created between them, based only on the name. And where we found out that datasets intersect by their global bounding box, we create an extra edge as well. So now a user can query this graph to find out how the data is connected between the data sources.
The results here were OK. It's pretty easy to store metadata in a graph database, because it's easy to build and to extend, and querying is also fast enough, because we don't really store the data in the graph database, only the metadata. We also used the Neo4j Spatial plugin here, but at that time there weren't really many functions in it, so in the end we switched to GeoTools here and there. Looking at this kind of structure, I am also thinking that maybe it would be better suited to Semantic Web tools, to create SPARQL endpoints and so on for the data. But these are just open questions.
OK, the last use case. Here I can tell the most, because it was a master thesis which I supervised. The idea was to map OGC data models, with all their classes, into a graph database. Before my university job I worked at a company where I had to deal with CityGML a lot. CityGML is a rather complex data model which can also be extended by the user: you can create your own data types, you can extend existing data types, you can basically do whatever you want with CityGML. But of course you need software that can consume these user-defined extensions, and that is not freely available.
There are some solutions which provide generic mapping. But most often, if you map into relational databases, in the end you will have a very complex database schema with a huge number of tables, which is not really performant. So you rely on external tools or web feature services to really work with the data, and you are not really using the database directly. For me this often felt like I was only using the database as a storage container, not really doing anything intelligent in there. So why can't I just use a schema-less database anyway, which is also said to be scalable? That's why I came up with this idea.
You can find a paper that was published this summer at a conference we had in Switzerland. We did it like this: we transferred the XML schema definition files and also the XML documents into JSON files, so we had JSON files with the nodes and JSON files with the edges, and then we stored them in Neo4j. Fortunately there is a cool project called Jsonix, which provides this XML-to-JSON mapping, so it was really helpful here. The developer of this tool was a big help for us.
The result can look like this. This is a CityGML file; you will probably not be able to read what's inside the nodes. The purple node there is the top element of the XML file, the CityModel, and from there you get to all the city objects, buildings, geometries and so on. We came to this point pretty quickly, so this wasn't really hard to do, and it simply works with all the OGC data models that are out there, because Jsonix supports all of them.
But if you look closer, there are some caveats. For example, this is a polygon. On the right you have a polygon node, and it results in a structure where you have a GML Polygon, then the GML exterior, the LinearRing and the posList. This is not really useful to work with in the end. Consider triangulated irregular networks; it's just the same, and so on. So we wanted more logic for how to separate objects: what can we keep together as one document, and where should we split the hierarchies into additional nodes? We also wanted to be sure that when we work with the database, the client can validate our actions, so that we stay compliant with the OGC data model.
Here we used another database, called ArangoDB. It's a pretty new database, from Germany I think. ArangoDB is called a multi-model database, because it offers not only the graph model but also a document store model and a key-value model. The usual way to work with ArangoDB is via their API, called Foxx, which is a JavaScript API, and there you can integrate a validation layer, so that every action you do with the database is validated against a JSON Schema. I also tried to visualise the nodes as rectangles here, because we decided to keep the data in documents: for example a building in one document, with the sub-features of a building separated into additional nodes. Everything together makes up the graph again, but these are actually documents which are interrelated with each other.
One thing that is praised about graph databases is that queries are pretty easy; well, it depends on what you do. The task here was: get all the information that is related to this one building. Of course you need to know the query language which these databases offer, and most of them use their own dialect. For example, in Neo4j you have Cypher, and in ArangoDB you have AQL. But what you can see is that the query for retrieving all the information of one building is pretty easy, whereas on the relational side you need a lot of joins. And this is not even everything; I just joined the geometry and the appearance information. This is an example from the 3D City Database schema.
Coming to a conclusion, I would say: hmm, no. Although at first sight you would think it's a cool data model that might be totally suitable for our complex data models, it's not really GIS-friendly, because it doesn't really support a lot of geometry data types and so on. I think for you, and also for me, that's not such a big surprise, because the whole development in this field is very much vendor-driven: the vendors decide what they want to implement. So you would need at least some big group of users, or a client, that wants certain specific spatial features, and then you can get them into the graph database. There will also be new additions to the spatial features in the newest version of Neo4j, but still, it's not much.
But what has been shown by the different use cases I presented here, especially the last one, is that graph databases are easily capable of serving very complex data models pretty natively, pretty generically. This is kind of cool. And if we think about using a database just as a container, why not use a graph database? We can directly feed in all our different GML models, they live there, and we can still create external tools to work with them.
Still, when I look at GIS data, it's mostly really flat, and not many objects are related with each other, because we make the relation by their location. So maybe there are not many use cases for pure graph databases. But a multi-model approach might be very interesting, to have the ability to switch between different models as you go. Of course you can also do this by just using different databases [unintelligible].
It is often said that graph databases are not as fast and scalable as other NoSQL databases. That's kind of true, but you can still have very good performance with graph databases, provided you have a lot of RAM. For example ArangoDB, as far as I know, builds up all its indexes in main memory, so it is very memory-consuming but then also very performant.
So yeah, that's it. You can find the slides at this link, and if you have questions, e-mail me or talk to me. I will attend the rest of the conference.
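The document-and-graph idea from the third use case (keep a building as one document, split its sub-features into additional nodes, then fetch everything related to one building by traversal) can be illustrated with a small sketch. This is not the code from the thesis and it uses neither Neo4j nor ArangoDB; it is plain Python under stated assumptions. The sample document, its field names and the helper functions `split_document` and `related_to` are all hypothetical.

```python
from collections import deque

# Hypothetical, heavily simplified building document (not real CityGML).
building_doc = {
    "id": "bldg_1",
    "type": "Building",
    "function": "residential",
    "boundedBy": [
        {"id": "wall_1", "type": "WallSurface",
         "geometry": {"id": "poly_1", "type": "Polygon"}},
        {"id": "roof_1", "type": "RoofSurface",
         "geometry": {"id": "poly_2", "type": "Polygon"}},
    ],
    "address": {"id": "addr_1", "type": "Address", "city": "Berlin"},
}


def split_document(doc, nodes, edges):
    """Keep scalar fields as node properties; turn nested objects into child nodes."""
    props = {}
    for key, value in doc.items():
        children = value if isinstance(value, list) else [value]
        if children and all(isinstance(child, dict) for child in children):
            for child in children:
                # One labelled edge per nested object, then recurse into it.
                edges.append((doc["id"], key, child["id"]))
                split_document(child, nodes, edges)
        else:
            props[key] = value
    nodes[doc["id"]] = props


def related_to(start_id, edges):
    """Breadth-first traversal: ids of every node reachable from start_id."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)
    seen, queue = [start_id], deque([start_id])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen


nodes, edges = {}, []
split_document(building_doc, nodes, edges)
everything = related_to("bldg_1", edges)
```

In a graph database the traversal would be a single variable-length pattern or traversal statement in the database's own query language. On the relational side each nesting level typically becomes another join, which is the contrast drawn in the talk.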
I think you finished early, so we have time for a few questions.
[Audience question, unintelligible]
I would say the performance is really just several times slower than other routing engines, and those are of course optimized in many ways, for example by dropping a lot of data. This is just what we found out: our queries took some hundreds of milliseconds, while the other routing engines finish in tens of milliseconds. So there is still a huge gap, and I don't know whether graph databases are really a suitable place to do real-time routing. Sure, you can implement all the flexible routing possibilities that you have [unintelligible].
[Audience question about the edge weights, largely unintelligible]
We didn't represent that in the database. Yes, exactly: the weights were known beforehand, so it wasn't like [unintelligible].
You said that the multi-model approach would be quite suitable for GIS data. Do you think that choosing a multi-model database is the most suitable approach for storing geodata?
Yes, because we still have the ability to switch between different models. In GIS data I don't see that many hierarchies that would call for this kind of graph structure.
Sometimes there is some hierarchy, like this information about the country and the region [unintelligible].
It is true that they also tried to implement an R-tree in Neo4j [unintelligible], but it was so-so, and that's why they are relying on [unintelligible]. Any more questions? OK, thanks very much.
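The routing setup described in the talk and picked up in the questions (travel times stored on the links, extra penalties for turns and traffic lights, weights known beforehand) can be illustrated with a small sketch. It is not the students' Neo4j implementation; it is a plain-Python Dijkstra search that runs over links rather than nodes, which is one common way to make turn penalties expressible. The toy network, the penalty table and the function name `fastest_route` are hypothetical.

```python
import heapq

# Hypothetical toy network: directed link (from_node, to_node) -> travel time in seconds.
travel_time = {
    ("A", "B"): 30.0,
    ("B", "C"): 20.0,
    ("B", "D"): 10.0,
    ("D", "C"): 5.0,
}

# Hypothetical penalties for turning from one link onto the next,
# for example because of a traffic light.
turn_penalty = {
    (("A", "B"), ("B", "D")): 15.0,
}


def fastest_route(source, target):
    """Dijkstra over links instead of nodes, so that turn penalties can be applied."""
    if source == target:
        return 0.0, [source]

    outgoing = {}
    for link in travel_time:
        outgoing.setdefault(link[0], []).append(link)

    # Queue entries: (cost so far, last link used, node path so far).
    queue = [(travel_time[link], link, [link[0], link[1]])
             for link in outgoing.get(source, [])]
    heapq.heapify(queue)
    settled = set()

    while queue:
        cost, link, path = heapq.heappop(queue)
        if link[1] == target:
            return cost, path
        if link in settled:
            continue
        settled.add(link)
        for nxt in outgoing.get(link[1], []):
            if nxt not in settled:
                penalty = turn_penalty.get((link, nxt), 0.0)
                heapq.heappush(
                    queue,
                    (cost + penalty + travel_time[nxt], nxt, path + [nxt[1]]),
                )
    return None  # target not reachable


route = fastest_route("A", "C")
```

With the penalty in place the detour via D costs 30 + 15 + 10 + 5 = 60 seconds, so the direct route A, B, C with 50 seconds wins. Without the penalty the route via D would be faster at 45 seconds.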

Metadata

Formal metadata

Title How GIS-friendly are Graph Databases today?
Series title FOSS4G Bonn 2016
Part 09
Number of parts 193
Author Kunde, Felix
Contributors Kunde, Felix (Beuth University of Applied Science)
License CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/20297
Publisher FOSS4G
OSGeo
Year of publication 2016
Language English

Content metadata

Subject area Computer Science
Abstract In the crowd of NoSQL data storage solutions that support spatial data, graph databases are one of the more fascinating technologies. Their data model is easy to understand and provides high flexibility for handling deeply nested relationships. In this session I will introduce different case studies where graph databases have been used in research projects at Beuth University: as an integration layer for querying metadata of different data stores (Neo4j); as a routing engine (part of the Smart Data project ExCELL; Neo4j); and as a platform for generic mapping of standardized OGC data models, e.g. SensorML and CityGML (Neo4j, ArangoDB, Jsonix; ongoing master thesis). We will discuss the strengths and drawbacks of each approach in order to give a proper answer to the talk's title. Felix Kunde (Beuth University of Applied Science)
Keywords Beuth University of Applied Science
