How GIS-friendly are Graph Databases today?

Video in TIB AV-Portal: How GIS-friendly are Graph Databases today?

Formal Metadata

How GIS-friendly are Graph Databases today?
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
In the crowd of NoSQL data storage solutions that support spatial data, graph databases are one of the more fascinating technologies. Their data model is easy to understand and provides high flexibility for handling deeply nested relationships. In this session I will introduce different case studies where graph databases have been used in research projects at Beuth University: as an integration layer for querying metadata of different data stores (Neo4j); as a routing engine, part of the Smart Data project ExCELL (Neo4j); and as a platform for generic mapping of standardized OGC data models such as SensorML and CityGML (Neo4j, ArangoDB, Jsonix; ongoing master thesis). We will discuss the strengths and drawbacks of each approach in order to give a proper answer to the talk's title. Felix Kunde (Beuth University of Applied Science)
Keywords Beuth University of Applied Science
OK, welcome everyone to my talk. It's great to be speaking to you.
"Graphs are everywhere": that is a typical marketing phrase you will find when you search for graph databases. To use one of the most quoted graph pictures, think of the knowledge graph. And if you think about it, the graph model is pretty fascinating, because it is intuitive for humans, it is really easy to read, and you can model nearly any data model and any domain as a graph. I am fascinated by this idea, by this model, and by the databases out there that rely on it, and I asked myself: how GIS-friendly are graph databases today? Not coincidentally, the university I am working at right now is also interested in this field and has some research projects on it. Today I will talk about the experience that they have made, that I have made, and about one research project that I supervised.
Starting off with some theory: what is a graph database? I think most of you who came to this talk already know this. Usually graph databases are listed under the umbrella of NoSQL databases. They are one subfamily, which is also schema-less. That makes them comparable to the other NoSQL databases, but the data model also makes them pretty unique.
There are two different kinds of graph databases that I know of. First, there are the ones that store nodes and edges as first-class citizens. You can have undirected and directed edges, with labels and so on. Both nodes and edges are first-class citizens and both can have properties, which is why this is called the property graph model. There are some open-source property graph databases out there. I have not listed every graph database here, only the ones where I know that they have spatial support, either built in or through other systems.
There is also another family, which is related to Semantic Web technologies. There the model is based on triples: you have a subject, a predicate and an object, and if you have a list of triples you automatically have a graph as well. There are some very prominent standards here, such as RDF and the query language SPARQL. Again, these are the open-source tools I found with spatial support. I also found out that you can run GeoSPARQL queries on top of PostGIS through a rather weird infrastructure: you need lots of different projects, but in the end you could do GeoSPARQL on PostGIS. For the rest of my talk I will focus on the property graph model.
In this talk I will also not give you an introduction to the specific features these databases offer. That is something you will mostly hear in talks at conferences, and it is something you can find out yourself with about half an hour of research. In summary, most graph databases have a built-in point type, they support the WGS 84 reference system only, for spatial indexing they have geohash indexing [unintelligible] and so on, and they support within-bounding-box and within-distance queries and so on. Looking at this short list, you can probably answer the question of whether graph databases are GIS-friendly by yourself and say: well, not really.
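To make the geohash indexing mentioned in the summary more concrete, here is a minimal sketch of how a WGS 84 position can be encoded as a geohash string. It illustrates the general technique only and is not taken from any particular graph database; the function name `geohash` and its parameters are chosen for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash(lat, lon, precision=8):
    """Encode a WGS 84 position as a geohash string of `precision` characters."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits, bit_count, use_lon, chars = 0, 0, True, []
    while len(chars) < precision:
        # Alternate between longitude and latitude, halving the interval each time.
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid
        else:
            bits <<= 1
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

# Dresden city centre (approximate coordinates, used only as sample input)
dresden_hash = geohash(51.05, 13.74)
```

Because a shorter geohash is always a prefix of a longer one for the same position, an ordinary string index can answer coarse "same cell" lookups, which is one reason the approach is attractive for databases without native geometry types.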
We could finish the talk right here, but I thought it would be interesting for you to hear what we found out when we used graph databases in different use cases. These are the three I will present on the next slides.
Routing is probably the most obvious use case for spatial data in a graph database. If we have a graph, why not route between the nodes we have there and find the shortest path between them? We were also thinking that when we click on the map, we want to find the nearest edges and start routing from there. The scenario was that we got data from the traffic management system of the city of Dresden in Germany: a road network with travel times for each link, plus penalties for turns and traffic lights. This was all done by a group of students, who implemented routing algorithms in a graph database, and the project was related to the research project ExCELL, which I am working for.
In this dataset we got the streets and the intersections, interestingly, both as nodes. So we had to create edges between all the nodes and move the travel time information from the street nodes to the edges, and then we could start working with the routing algorithms that are available in Neo4j. This was implemented using Neo4j in embedded mode, as a library; you can also run Neo4j as a server.
The results: routing works, and it is pretty easy to set up. It is slower than state-of-the-art routing engines by a factor of about ten, and that is after we tuned Neo4j as hard as we could. Of course, it is a database. It is fine, but it is not the fastest solution; databases are designed more for general-purpose work, even though a graph database is pretty specific. You can also compare it to pgRouting, which is not the fastest either but still flexible. What was a bit disappointing was that the spatial queries were really slow.
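The preprocessing step described above (streets delivered as nodes, travel times moved onto edges) followed by a shortest-path search can be sketched as follows. This is a plain-Python illustration with made-up sample data, not the students' Neo4j implementation; turn and traffic-light penalties are left out to keep it short.

```python
import heapq
from collections import defaultdict

# Hypothetical input: street segments delivered as nodes, each naming the two
# intersections it connects and a travel time in seconds.
street_nodes = {
    "s1": {"from": "A", "to": "B", "travel_time": 30.0},
    "s2": {"from": "B", "to": "C", "travel_time": 20.0},
    "s3": {"from": "A", "to": "C", "travel_time": 70.0},
    "s4": {"from": "C", "to": "D", "travel_time": 10.0},
}

def streets_to_edges(streets):
    """Turn street nodes into directed, weighted edges between intersections."""
    adjacency = defaultdict(list)
    for street in streets.values():
        adjacency[street["from"]].append((street["to"], street["travel_time"]))
    return adjacency

def shortest_travel_time(adjacency, source, target):
    """Dijkstra's algorithm; returns (total_time, path) or (inf, []) if unreachable."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        time, node, path = heapq.heappop(queue)
        if node == target:
            return time, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in adjacency.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (time + weight, neighbour, path + [neighbour]))
    return float("inf"), []

graph = streets_to_edges(street_nodes)
total, route = shortest_travel_time(graph, "A", "D")
```

In this sample the detour via B is faster than the direct street from A to C, so the search returns the three-hop route.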
I also have another figure on the slow spatial queries, but it is really embarrassing: even adding the spatial index didn't really help. The chart I show here shows that the longer the route, the more computation time it takes. So that's it for routing.
The next part was a research project where we thought: wouldn't it be cool to have all your metadata from different data sources and different databases stored in a graph database, and to persist matches between equal columns, and especially between spatially intersecting data, as additional edges? The scenario was that we had data from the Botanical Garden of Berlin, which was maintained in many different datasets. We got loads of CAD data, which we first had to georeference and transfer to geodatabases. This was all done in a research project called [unintelligible]; there was also a talk about it at one of the German FOSSGIS conferences, FOSSGIS 2014, I think.
As I told you, we migrated most of the CAD data to PostGIS using ETL tools, and we stored all the metadata in Neo4j. It was also the goal to create an application where you can browse through the data, something like a very simple metadata application, with different colors for the different data sources. In theory it works something like this: you have one database node, then you get nodes for the tables and nodes for the columns. Equal columns get an edge created between them, and that is based only on the name. Then we have some extra edges: where we found out that datasets intersect by their global bounding box, we create an extra edge. So now a user can query this graph to find out how the data is connected between the data sources.
The results here were OK. It is pretty easy to store metadata in a graph database, because the graph is easy to build and extend, and it is also fast enough for querying.
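The metadata graph described above (a database node, table nodes, column nodes, edges between equally named columns, and extra edges for intersecting bounding boxes) might be built roughly like this. The catalogue contents, edge labels and function names are all hypothetical; the project itself stored the graph in Neo4j.

```python
from itertools import combinations

# Hypothetical catalogue: table -> (database, column names, bounding box).
# Bounding boxes are (min_x, min_y, max_x, max_y) in one shared reference system.
tables = {
    "trees":  ("cad_import", ["id", "species", "geom"], (0, 0, 10, 10)),
    "paths":  ("cad_import", ["id", "surface", "geom"], (5, 5, 20, 20)),
    "plants": ("inventory",  ["id", "species"],         None),
}

def boxes_intersect(a, b):
    """True if two axis-aligned bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def build_metadata_graph(catalogue):
    """Return (nodes, edges); edges are (source, label, target) tuples."""
    nodes, edges = set(), set()
    for table, (database, columns, _bbox) in catalogue.items():
        nodes.update({database, table})
        edges.add((database, "HAS_TABLE", table))
        for column in columns:
            column_node = f"{table}.{column}"
            nodes.add(column_node)
            edges.add((table, "HAS_COLUMN", column_node))
    # Compare every pair of tables once: name matches and spatial overlap.
    for (t1, (_, cols1, box1)), (t2, (_, cols2, box2)) in combinations(catalogue.items(), 2):
        for name in set(cols1) & set(cols2):
            edges.add((f"{t1}.{name}", "SAME_NAME", f"{t2}.{name}"))
        if box1 and box2 and boxes_intersect(box1, box2):
            edges.add((t1, "INTERSECTS", t2))
    return nodes, edges

nodes, edges = build_metadata_graph(tables)
```

Matching columns purely by name, as in the talk, is cheap but will also link unrelated columns that happen to share a generic name such as `id`.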
Querying is fast enough because we don't really store the data in the graph database, only the metadata. [unintelligible] We also used the Neo4j Spatial plugin here, but at that time it didn't really offer many functions for this, so in the end we switched to GeoTools here and there. Looking at this kind of structure, I am also thinking that it might be better suited to Semantic Web tools, for example to create SPARQL endpoints on top of the data. But that is just an open question.
OK, the last use case is the one I can tell the most about, because it was a master thesis that I supervised. Here the idea was to map data models, such as the OGC data models with their classes, into a graph database. Before my university job I worked at a company where we had to deal with CityGML a lot, which is a fairly complex data model.
The model can also be extended by the user: you can create your own data types, you can extend existing data types, you can basically do whatever you want. But of course you need software that can consume these user-defined extensions, and that is often not freely available. There are some solutions that provide generic mapping. But most often, if you map into relational databases, you end up with a very complex database schema with a huge number of tables that is not really performant. So you rely on external tools or web feature services to really work with the data, and you are not really using the database directly. To me this often felt like I was only using the database as a storage container and not really doing anything intelligent in there. So why not just use a schema-less database, which I think is also scalable? That is why I came up with this idea. You can find a paper on it, which was published this summer at the conference we had in Switzerland.
We did it like this: we transferred XML schema definition files and also XML documents into JSON files, so that we had JSON with the nodes and most of the edges, and then we stored it in Neo4j. Fortunately there was a cool project called Jsonix, which provides this XML-to-JSON mapping, so it was really helpful here. The developer of this tool was also a big help for us.
The results can look like this. This is a CityGML file, and you will probably not be able to read everything that is inside the nodes. The node there at the top is the top element of the XML file, a CityModel. From there you get to all the city objects, buildings, geometries and so on. We came to this point pretty quickly, so this was not really hard to do, and it simply works with all the data models that are out there.
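The step from a JSON document to nodes and edges can be illustrated with a small sketch. It assumes a simplified, hand-written JSON structure that only loosely resembles CityGML; real Jsonix output looks different, and `json_to_graph` is a name invented for this example.

```python
import itertools

def json_to_graph(document, root_label):
    """Flatten nested JSON into property-graph style nodes and edges.

    Nested objects become nodes, scalar members become node properties,
    and the key under which a child object sits becomes the edge label.
    """
    nodes, edges = {}, []
    ids = itertools.count()

    def visit(value, label):
        node_id = next(ids)
        properties = {}
        nodes[node_id] = {"label": label, "properties": properties}
        for key, child in value.items():
            children = child if isinstance(child, list) else [child]
            for item in children:
                if isinstance(item, dict):
                    edges.append((node_id, key, visit(item, key)))
                else:
                    properties[key] = child  # scalars (or lists of scalars) stay on the node
        return node_id

    visit(document, root_label)
    return nodes, edges

# Hypothetical, heavily simplified city model
city_model = {
    "name": "Sample city",
    "cityObjectMember": [
        {"id": "b1", "measuredHeight": 12.5, "lod1Solid": {"srsName": "EPSG:25833"}},
        {"id": "b2", "measuredHeight": 8.0},
    ],
}
nodes, edges = json_to_graph(city_model, "CityModel")
```

A purely mechanical mapping like this creates one node per nested object, which is exactly what leads to the deep chains for geometries discussed next.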
The mapping works for all of these data models because the tool is designed to support all of them. But if you look closer, there are some caveats. For example, this is a polygon. On the right you have a polygon node, and it results in a structure where you have a GML Polygon, then the GML exterior, then the LinearRing, then the posList. That is not really useful to work with in the end. If you consider triangulated irregular networks, it is just the same, and so on.
So we wanted to have a bit more logic for how to separate objects: what can we keep together as one document, and where should we split the hierarchies into additional nodes? We also wanted to be sure that when we work with the database, the client is able to validate our actions, so that we stay compliant with the OGC data model. Here we used another database, which is called ArangoDB. That is a fairly new database, from Germany I think. ArangoDB is called a multi-model database because it offers not only the graph model but also a document store model and a key-value model. The usual way to work with ArangoDB is via their API called Foxx, which is a JavaScript API, and there you can integrate a validation layer so that every action you do with the database is validated against a JSON schema.
I also tried to visualize the nodes as rectangles, because we decided to keep the data in documents: something like a building goes into one document, and the sub-features of a building are split off into additional nodes. Everything makes up a graph again, but these are actually documents that are interrelated with each other.
One thing that speaks for graph databases is that queries are pretty easy, depending on what you do, of course. In this case the task was to get all the information that is related to one building.
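The retrieval task just described, collecting every document that hangs off one building, amounts to a traversal of arbitrary depth. A minimal sketch with hypothetical document keys and edge labels (not the thesis data or ArangoDB's API) could look like this:

```python
from collections import deque

# Hypothetical documents (the rectangles on the slide) and the edges linking them.
documents = {
    "building/1": {"type": "Building", "measuredHeight": 12.5},
    "part/1": {"type": "BuildingPart", "roofType": "flat"},
    "geom/1": {"type": "Polygon", "posList": [0, 0, 0, 10, 0, 0, 10, 10, 0, 0, 0, 0]},
    "building/2": {"type": "Building", "measuredHeight": 8.0},
}
edges = [
    ("building/1", "consistsOfBuildingPart", "part/1"),
    ("part/1", "lod2MultiSurface", "geom/1"),
]

def related_documents(start, documents, edges):
    """Breadth-first traversal along outgoing edges, to any depth."""
    outgoing = {}
    for source, _label, target in edges:
        outgoing.setdefault(source, []).append(target)
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for target in outgoing.get(current, []):
            if target not in seen:  # guards against cycles and repeated visits
                seen.add(target)
                queue.append(target)
    return {key: documents[key] for key in seen}

result = related_documents("building/1", documents, edges)
```

The traversal does not need to know in advance how many levels of sub-features a building has, which is the property that makes this kind of request convenient in a graph model.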
To write such a query, of course, you need to know the query language these databases offer. Most of them use their own dialect: in Neo4j, for example, you have Cypher, and in ArangoDB you have AQL. What you can see is that the query for retrieving all the information about one building is pretty easy, whereas on the relational side you require a lot of joins. And this is not even everything; I just joined the geometry and the [unintelligible] information. This is an example from the 3D City Database schema.
Coming to a conclusion, I would say that although at first sight you would think it is a cool data model that might be totally suitable for our complex data models, it is not really GIS-friendly, because it doesn't really support a lot of geometry data types and so on. I think for you, and also for me, that is not such a big surprise, because the whole development in this field is very much vendor-driven: the vendors decide what they want to implement. You would need at least some big group of users, or a client, that wants certain specific spatial features, and then you can get them in the graph database. There will also be new additions to the spatial features in the newest version of Neo4j, but still it is not very much.
What has been proven by the different use cases I presented here, especially the last one, is that graph databases are easily capable of serving very complex data models pretty natively and pretty generically, which is kind of cool. And if we think about using a database just as a container anyway, then with a graph database we can directly feed in all our different GML models, they live there, and we can still create external tools to work with them.
Still, I would think that GIS data, as I look at it, is mostly really flat, and not many objects are related to each other, because we make the relation by their location. So maybe there are not many use cases for pure graph databases. But a multi-model approach might be very interesting, to have the ability to switch between different models as you go. Of course you can also do this by just using different databases [unintelligible].
It is often said that graph databases are not as fast or as scalable as other NoSQL databases. That is kind of true, but you can still have very good performance with graph databases, unless you have a lot of writes. ArangoDB, for example, as far as I know builds up all its indexes in main memory, so it is very memory-consuming but then also very performant.
So that's it. You can find the slides at this link, and if you have questions, e-mail me or talk to me; I will attend the rest of the conference.
I think you also finished a bit early, so we have time for a few questions.
[Audience question, largely unintelligible]
I would say that the performance is just several times slower than that of routing engines, which of course are optimized in many ways to drop a lot of data. What we found was that a query took some hundreds of milliseconds, whereas the other routing engines finish in tens of milliseconds. So there is still a huge gap, and I don't know if graph databases are really a suitable place to do real-time routing. Sure, you can implement all the flexible routing possibilities that you have [unintelligible].
[Audience question about how the weights of the network were represented, largely unintelligible]
We didn't represent that in the database, so yes, that's exactly right. The weights were known beforehand. [unintelligible]
[Audience question: whether a multi-model database would be the most suitable approach for storing GIS data]
Because we still have the ability to switch between different models. In GIS data I don't see that many hierarchies from which to create such graph structures. Sometimes there is some, like the information about the country and the region [unintelligible].
[unintelligible] They also tried to implement an R-tree in Neo4j [unintelligible], so that is why they are relying on [unintelligible].
Any more questions? OK, thanks very much.