Graph Databases: Talking about your Data Relationships with Python

Video in TIB AV-Portal: Graph Databases: Talking about your Data Relationships with Python

Formal Metadata

Graph Databases: Talking about your Data Relationships with Python
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Graph Databases: Talking about your Data Relationships with Python [EuroPython 2017 - Talk - 2017-07-14 - PyCharm Room] [Rimini, Italy]
Have you ever considered how many relationships you have in your virtual life? Every friend or page liked on Facebook, each connection on LinkedIn or Twitter account followed is a new relationship not only between two people, but also between their data. In Brazil alone, we have 160 million Facebook users. How can we represent and manipulate all these relationships? Graph databases are storage systems that use a graph structure (nodes and edges) to represent and store data in a semantic way. This talk will begin by approaching the challenge of representing relationships in relational databases and introducing a friendlier solution using graphs. The definition of a graph database, its pros and cons, and some available tools (Neo4J, OrientDB and TitanDB) will be shown during the presentation, as well as how these tools can be integrated with Python.
Outline: Relationships Relationships in Relational Databases Graph Definition Graph approach to represent relationships Graph Databases Definition Advantages Neo4J Usage Examples Integration with Python Comparison between Graph Databases Comparison between Neo4J and Relational Database Application
Hi everyone. I'm going to talk with you today about graph databases and how to handle the relationships within your data with Python.
First, let me introduce myself. I am a full-stack developer at Labcodes, and I work mainly with Django in the back end and AngularJS in the front end. I'm also a student at the Federal University of Pernambuco, where I'm trying to get my master's degree, and I'm almost there. The final part of my master's research is related to performing OLAP analysis on graph databases, so that's why I'm interested in this topic and why I'm talking to you about it today. I'm also a member of the Python User Group in Pernambuco and of the PyLadies group in Recife, which is my city, in Brazil. Recife, Pernambuco: I said a lot of words that you probably didn't understand. I came all the way from Brazil to Rimini to attend EuroPython, and it's a really long trip, 19 hours between airplanes and trains to get here. I live in Recife, in the northeast part of Brazil, and we have a very active community of Python users back there. We organized a Django Girls event last month, where I was one of the organizers, and we also organize the meetings of these two user groups. So we're very active and we're really proud of it.
I work at Labcodes, as I said before, and it's a software studio. What is Labcodes? It is a software studio from Recife to the world. Being a software studio means that we develop solutions, be it a process, a web app or a product, for our clients, according to the client's desires and needs. We can solve a problem, develop a new product or implement a new process in their business. We have five years of experience with clients in Brazil and in the US, and the technologies that we use are mainly Django and JavaScript, mostly with AngularJS and React and a little bit of Vue.js. We also work with PostgreSQL, Elasticsearch and a lot of others. Labcodes grew with the help of the community, so that's why we are always trying to give something back to it. We help organize some of the Python user group meetings in our state, we helped organize a conference in 2014, since 2012 we have been to all the Brazilian Python conferences every year, and we have also participated in a lot of Django Girls events as coaches and organizers. So we are always trying to give back to the community, because we came from it: we came from a group of Pythonistas that met at a Brazilian conference, so it's just fair to give back.
About this talk: as I mentioned, I will be talking about relationships and what I mean by that. Then I will introduce the concept of graph databases. I don't know if anyone has heard of it. Has anyone heard of graph databases? Well, that's really good. Then I will proceed to talk about Neo4j, the most popular graph database that we have in the industry. We also have others, so I will be comparing some of the solutions that are available today. I will also make a comparison between Neo4j and relational databases, because we as web developers are used to working with relational databases, so I brought some of those concepts here so we can compare. And I will talk about some cool applications that we can have for graph databases.
So let's start with relationships. What is it that I keep talking about?
Relationships in data are pretty much related to the relationships that we have in real life. Every time you add a new friend on Facebook, follow someone on Twitter, pin an image on Pinterest or accept a new connection on LinkedIn, you are creating a new relationship, not just in your personal life, where you are now friends with someone or follow someone, but also in a database, between your data and someone else's data. When you add someone as your friend, your profile data is now linked somehow to the profile data of the other person. And we have a lot of relationships on the Internet. In Brazil alone we have 160 million Facebook users, and each one of them has friends, likes pages or participates in groups, so that's a lot of relationships. How can we represent and manipulate all these relationships in a good way?
Let me present a small scenario, a very common one: a social network where a user can be friends with another user or like a page, working in a similar way to Facebook. It's just a small scenario that we can work with. Let's try to represent the data of this scenario using tables, as we would do in a relational database. First we would have a User table to keep track of the data of our users, and each user has a name, gender and age. OK, that's cool. Now we need to store which user is friends with which user, so let's create a table called FriendsWith. This table holds two user IDs, and each row represents a friendship between two users. Nice and understandable so far. Then let's create another table to represent pages: each page has a name, an ID and a category. We also need to store which user likes which page, so let's create a table called Likes that connects a user ID to a page ID.
Now I have all my data in tables, my application is running, and I need to know which pages the user named John likes. We know the user's name is John, so we go to the User table and find out that John has ID 1. Then we go to the Likes table and get the IDs of the pages that the user with ID 1 likes, and we get the pages with ID 2 and ID 1. Then we go to the Page table and see that the page with ID 1 is Cola and the page with ID 2 is the Beatles. So to answer a really simple question I had to query three different tables just to get the pages that the user John likes, and that was not fun. It was a lot of going back and forth trying to figure out which one is which. Basically, tables suck for this kind of thing; they are not the best way to do it. So how can we use a better data structure to represent this data, so that we can answer this kind of question in a faster and more intuitive way?
That's where I present to you graphs. A graph is a data structure that is usually represented in mathematical notation by the letter G and is formed by a set of vertices V and a set of edges E. It is a pretty simple data structure; anyone who has studied it in the past knows that it is a very simple, visual way to represent your data. Given this concept, we can represent our whole scenario using graphs. Here I have a graph representing the same data that we had before in tables. Each circle is a vertex, or node, and each line represents an edge, or relationship, between the nodes. The green ones are the users and the red ones are the pages. As you can see, we have labels for each relationship: a relationship of friends with and a relationship of likes.
Given this representation, it is much easier to find out which pages the user John likes: we just find the node with the name John and follow the lines until we get to the pages that the user likes. So that's pretty good: we can have a graph view of our data, and it's really nice. But how can we use this kind of structure in our database?
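The idea of a graph as a set of vertices V and a set of edges E, and of answering the question by following edges, can be sketched in a few lines of plain Python. This is only an illustration: the identifiers, the relationship labels and the third user are assumptions, not taken from the slides.

```python
# Vertices V: id -> (label, properties). All values are illustrative.
vertices = {
    "u1": ("User", {"name": "John"}),
    "u2": ("User", {"name": "Mary"}),
    "u3": ("User", {"name": "Peter"}),  # hypothetical third user
    "p1": ("Page", {"name": "Cola"}),
    "p2": ("Page", {"name": "The Beatles"}),
}

# Edges E: (source id, relationship label, target id).
edges = [
    ("u1", "FRIENDS_WITH", "u2"),
    ("u1", "LIKES", "p1"),
    ("u1", "LIKES", "p2"),
    ("u3", "LIKES", "p2"),
]


def liked_pages(user_name):
    """Find the user vertex by name, then follow its LIKES edges."""
    starts = {
        vid
        for vid, (label, props) in vertices.items()
        if label == "User" and props["name"] == user_name
    }
    return sorted(
        vertices[dst][1]["name"]
        for src, rel, dst in edges
        if rel == "LIKES" and src in starts
    )


print(liked_pages("John"))  # ['Cola', 'The Beatles']
```

The lookup never leaves the edge list, which is the point made in the talk: the relationship itself is stored, so there is no hopping between tables.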
That's where graph databases come in. Graph databases are just systems that store data in graph-like structures, which allows users to explicitly store the relationships between data. With explicit relationships we get a direct information retrieval method: we can directly retrieve the information of a relationship. Besides explicitly storing the relationships in the database, graph databases have other advantages. They also allow more elaborate data analysis, because we can use common algorithms from graph theory. I don't know if anyone here has heard of them, but there are algorithms for community detection, pattern recognition and centrality measures, and you can run these algorithms on your own database and find out new analytical information about your data.
Another advantage is that graph databases have a very flexible schema. What does that mean? Let's imagine that in our earlier scenario we would like to introduce the concept of groups, as we have groups on Facebook, where groups have members. To do that in a relational database we would need to create a new table, probably change the columns of another table, and do a lot of things to make this work. In a graph database we only have to add a node with the type Group and connect it to the existing nodes. We don't even have to look at how the database is organized beforehand; we can just add. That's why I didn't show everything, because it's not necessary. I just have to add that node and connect it to whichever nodes I want. It's that simple.
Another advantage is that recent graph database implementations use NoSQL storage mechanisms, so they carry the advantages of NoSQL databases with them. That means they have horizontal scalability: we can improve the performance of a database just by increasing the number of simple machines that run it. We don't need a huge, amazing server; we can have small and simple machines that run our application, which means we can also do some distributed processing to improve the performance of our application. It's really good.
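As a rough illustration of the flexible schema and of running a graph algorithm over the data, the sketch below adds a Group node to an in-memory graph without touching any existing structure, then counts the connections per node (an unnormalized degree, the simplest centrality measure). The node ids, the group name and the MEMBER_OF label are assumptions made for the example.

```python
from collections import Counter

# A tiny property graph: node id -> properties, plus a list of labeled edges.
nodes = {
    1: {"label": "User", "name": "John"},
    2: {"label": "User", "name": "Mary"},
    3: {"label": "Page", "name": "The Beatles"},
}
edges = [(1, "FRIENDS_WITH", 2), (1, "LIKES", 3)]

# Introducing groups: no new table, no altered columns, just a node and edges.
nodes[4] = {"label": "Group", "name": "Python Users"}  # hypothetical group
edges.append((1, "MEMBER_OF", 4))
edges.append((2, "MEMBER_OF", 4))


def degree_counts(edge_list):
    """Count how many edges touch each node, ignoring direction."""
    degree = Counter()
    for src, _rel, dst in edge_list:
        degree[src] += 1
        degree[dst] += 1
    return degree


degree = degree_counts(edges)
most_connected = max(degree, key=degree.get)
print(nodes[most_connected]["name"], degree[most_connected])  # John 3
```

A real graph database would run such measures inside the engine or through an analysis library; the sketch only shows why no schema migration is involved.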
So I decided to go a little bit further by explaining Neo4j as our graph database, because it is the most popular graph database according to DB-Engines. DB-Engines is a website that contains a list of all the databases available, whether relational, NoSQL or graph, and keeps a ranking of the most popular ones; Neo4j is the most popular in the category of graph databases. It is implemented in Java and has its own query language, called Cypher, of which I'm going to show some examples. The data can be accessed through a REST API or a Java API. Now I will run through some examples of Cypher for you.
Let's say that we want to create a vertex, or node, as it is called in Neo4j. This is the command that we use: we just use the keyword CREATE, and inside the parentheses john is just an alias for the node, and we say that the type of the node is User. Inside the braces we pass all the attributes that the node has: name, gender and age. For this query specifically I want to return the node, just so I can show it to you. This is the result of running this command in Neo4j: just a single node with the name John. Pretty basic. When we do this using the REST API we get a JSON object, and inside the JSON object we have a graph object that comes with a list of the nodes and a list of the relationships in our graph. We can see that for now we only have one node, with the label User and the properties that we set, named John, and no relationships. So let's create a relationship, because that's the point of graph databases.
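The CREATE command described above might look like the Cypher string in the sketch below, wrapped in the JSON body that Neo4j's transactional HTTP endpoint accepts. This is a reconstruction under assumptions: the property values are made up, the endpoint path in the comment is the one used by Neo4j 3.x, and nothing is actually sent to a server.

```python
import json

# Cypher reconstructed from the description; gender and age values are made up.
CREATE_JOHN = (
    "CREATE (john:User {name: 'John', gender: 'M', age: 28}) "
    "RETURN john"
)


def build_payload(statement, parameters=None):
    """Build the request body for one Cypher statement.

    Asking for "graph" in resultDataContents makes the server answer with
    lists of nodes and relationships, as described in the talk.
    """
    return {
        "statements": [
            {
                "statement": statement,
                "parameters": parameters or {},
                "resultDataContents": ["graph"],
            }
        ]
    }


# Assumed target (Neo4j 3.x): POST http://localhost:7474/db/data/transaction/commit
body = json.dumps(build_payload(CREATE_JOHN), indent=2)
print(body)
```

The body can be posted with any HTTP client; authentication and the exact URL depend on the server version and configuration.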
How can we create relationships using Cypher? Let's say that we already have two nodes in our database, John and Mary. First we need to retrieve these nodes using the keyword MATCH, getting them by their names; then we create a relationship with the label friends with between these two nodes, and we return it so we can see it.
At first we had Mary and John totally separated, just two nodes in our database. Now, after creating the relationship, we have this arrow going from John to Mary, because we used the arrow notation in the Cypher query, where an arrow indicates the direction of the relationship. But this is totally optional: we can have relationships without directions, or with directions both ways, so you can add even more information about your relationship. In this one I preferred to add a direction; that's totally up to you.
As for the REST API response, after creating the relationship our graph object now has two nodes, John and Mary, and one relationship that connects those two nodes. So we can see how to do that using the REST API.
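A possible form of the relationship query, together with a small helper that reads the nodes and relationships out of a "graph"-style response, is sketched below. The response is a hand-written sample in the shape Neo4j 3.x returns, with illustrative ids, and is not real server output. One caveat worth stating: as far as I know, Cypher's CREATE always needs a direction on the relationship, while MATCH patterns may leave it out.

```python
# Cypher reconstructed from the description of the slide.
CREATE_FRIENDSHIP = (
    "MATCH (a:User {name: 'John'}), (b:User {name: 'Mary'}) "
    "CREATE (a)-[r:FRIENDS_WITH]->(b) "
    "RETURN a, r, b"
)

# Hand-written sample shaped like a "graph" result; ids are illustrative.
sample_response = {
    "results": [
        {
            "columns": ["a", "r", "b"],
            "data": [
                {
                    "graph": {
                        "nodes": [
                            {"id": "0", "labels": ["User"],
                             "properties": {"name": "John"}},
                            {"id": "1", "labels": ["User"],
                             "properties": {"name": "Mary"}},
                        ],
                        "relationships": [
                            {"id": "0", "type": "FRIENDS_WITH",
                             "startNode": "0", "endNode": "1",
                             "properties": {}},
                        ],
                    }
                }
            ],
        }
    ],
    "errors": [],
}


def describe_relationships(response):
    """Render each relationship as 'start -[TYPE]-> end' using node names."""
    lines = []
    for result in response["results"]:
        for row in result["data"]:
            graph = row["graph"]
            names = {n["id"]: n["properties"]["name"] for n in graph["nodes"]}
            for rel in graph["relationships"]:
                lines.append("{} -[{}]-> {}".format(
                    names[rel["startNode"]], rel["type"], names[rel["endNode"]]
                ))
    return lines


print(describe_relationships(sample_response))
```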
Now let's load all the information that we had before in our scenario into Neo4j. This is what we get: all three users and the two pages, with their relationships. Now let's query and try to retrieve some data from this database. To query we use the keyword MATCH, and then we can just say: give me the user with the name John that has a likes relationship to a page, and return me those pages. That's easy, and it returns the pages.
These graphs I took from the Neo4j embedded browser, a browser interface that you get as soon as you install it. It's really easy to use and really graphical: you can see your data like this, and it makes things easier if you are starting with Neo4j. It's really nice. If we do the query through the REST API, the JSON object that it returns now has a data object, which is the data returned by the query. It returns two rows, because the user John likes two pages: each row holds a relationship, John with the Beatles and John with Cola. So we have that information in the JSON. That's pretty good, but so far we have only used Cypher. Where's Python?
To integrate Neo4j with your Python applications we use py2neo, a Python module that integrates Neo4j into your application and supports Python 2 and 3. I will show you an example of how to do all the things I showed before: create a node, create a relationship and query the database. This is the Python code to do it. We import the Graph, Node and Relationship objects from py2neo. Assuming that we already have a database running, we can get the graph object just by passing the password. Then we start a transaction with graph.begin. Within a transaction we can create nodes and relationships and commit every change that we want at once, at the end of the transaction. First I created the node for John and then called the create method of the transaction. I did the same thing for Mary, and then created the relationship. But everything only goes to the database when I call commit: when I call the commit method, it pushes everything to Neo4j.
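The steps just described could be written roughly as follows. This is a sketch, not the code from the slide: it assumes the py2neo API of that period (version 3 or 4; later releases changed parts of it), a Neo4j server on the default local address, and made-up property values. The import sits inside the function so that the snippet can be loaded without py2neo installed.

```python
def create_friendship(password):
    """Create John, Mary and a FRIENDS_WITH relationship in one transaction."""
    # Assumption: py2neo v3/v4 is installed and a local Neo4j server is running.
    from py2neo import Graph, Node, Relationship

    graph = Graph(password=password)  # default local connection settings
    tx = graph.begin()                # start a transaction

    john = Node("User", name="John", gender="M", age=28)  # made-up values
    tx.create(john)
    mary = Node("User", name="Mary", gender="F", age=27)  # made-up values
    tx.create(mary)
    tx.create(Relationship(john, "FRIENDS_WITH", mary))

    tx.commit()                       # all changes are committed together here
    return john, mary
```

Calling `create_friendship("your-password")` needs a running database, so it is left out of the snippet itself.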
But how can we query? Querying is simple. We have our graph object, and we use a node selector. With this node selector we can select a node with the label User and the name John, and take the first one that corresponds to this query. Then we can match the relationships that start at this node, which represents the user John, and that have the type likes. We get all these relationships and print the end node of each one, which corresponds to the pages that the user John likes; I just printed them here so you can see the response. py2neo also allows you to run a Cypher query directly: this is the usual py2neo way to do it, but you can also call graph.run and pass a whole Cypher query to it, and it will return the response to you. So far I have talked a lot about Neo4j.
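The second option, handing a whole Cypher query to graph.run, can be sketched without committing to a particular py2neo version. The function below only assumes that the graph object has a `run(statement, **parameters)` method whose result offers `data()` returning a list of dictionaries, which is how py2neo's Graph behaves as far as I know. The `$name` parameter syntax assumes a reasonably recent Neo4j; the query text and the `page` alias are illustrative.

```python
LIKED_PAGES = (
    "MATCH (u:User {name: $name})-[:LIKES]->(p:Page) "
    "RETURN p.name AS page"
)


def pages_liked_by(graph, name):
    """Return the names of the pages liked by the user with the given name.

    `graph` is expected to be a py2neo Graph (or anything with the same
    run()/data() behaviour).
    """
    cursor = graph.run(LIKED_PAGES, name=name)
    return [record["page"] for record in cursor.data()]
```

With a live database this would be called as `pages_liked_by(Graph(password="..."), "John")`. The node selector route described in the talk gives the same answer through py2neo objects instead of a query string.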
You are pretty used to Neo4j by now, but what other options do we have? I took the other two most popular graph databases from DB-Engines: OrientDB and Titan. I put together a small comparison between those three. OrientDB is a multi-model database, which means that it is not only a graph database but also supports a key-value store, documents and a bunch of other ways to store data, while Titan works with graphs but needs a backend database to work with; it can be Berkeley DB or another kind of database that Titan connects to. All three are implemented in Java, and each one has its own query language: Cypher for Neo4j, an extended version of SQL for OrientDB, and Gremlin for Titan.
Comparing those three query languages, we have the same question: which pages does the user John like? For Cypher, we already saw how that's done; it's pretty straightforward. OrientDB's language is more familiar to those who are used to SQL, because we can see the SELECT ... FROM ... WHERE structure that we are used to, but it adds some keywords to work with graphs. Here we select over the likes relationship, going both ways, and expand it to get the nodes at the end of this relationship, from the users with the name John. So it gets the user node with the name John, expands to the other nodes that it is connected to, and brings back the pages. Gremlin has a totally different syntax. It's not something that we are used to, but someone who has worked with Groovy should find it easy. It is doing basically the same thing: it goes through our graph g and our vertex set V, gets the vertex that has the name John, and gets the nodes connected to it. It's basically the same idea behind it.
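For reference, the three queries described above might read as follows. They are reconstructions from the spoken description, kept as plain strings so nothing is executed; the class, label and edge names (`User`, `Page`, `Likes`, `likes`) are assumptions.

```python
# The same question, "which pages does John like?", in three query languages.
QUERIES = {
    "Neo4j (Cypher)":
        "MATCH (u:User {name: 'John'})-[:LIKES]->(p:Page) RETURN p",
    "OrientDB (extended SQL)":
        "SELECT EXPAND(BOTH('Likes')) FROM User WHERE name = 'John'",
    "Titan (Gremlin)":
        "g.V().has('name', 'John').out('likes')",
}

for engine, query in QUERIES.items():
    print("{:<24} {}".format(engine, query))
```

The OrientDB version keeps the familiar SELECT/FROM/WHERE shape and adds graph functions, while the Gremlin version is a chain of traversal steps starting from the vertex set.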
Now let's compare some things about performance. I didn't perform these experiments myself; I took them from some papers that I found online comparing these three databases, and I thought it would be good to bring them here in case someone was wondering how the three perform against each other. One of the most common operations that we do in a database is retrieving a node by its ID. So this test measures the average time for each of these three databases to retrieve a node given its ID, in a graph with 100 thousand vertices and with four clients performing the operation 100 times. The experiment was done by some researchers in Belgium, and the chart shows the average response time. Here we see that Titan is slightly slower than the other two. OrientDB is the fastest one, but Neo4j is quite close to it. So they are pretty close, and Titan is the one that's a bit slower. But this can change if you change Titan's backend database: for this experiment they were using Cassandra, so in other experiments using another kind of backend database this could change.
The other performance experiment is related to the amount of memory required by each of these databases, based on a graph with 32 thousand vertices and 256 thousand edges, and it was done by some researchers at the Georgia Institute of Technology. We see that OrientDB requires a lot of memory to store this kind of graph, while Titan is the one that requires the least, and Neo4j is in the middle.
Now, bringing this to relational databases, a concept that we are more used to: if we wanted to perform in SQL the query that I was doing before in the examples, this would be the query. We would do a SELECT ... FROM ... WHERE, and we would have to join three tables to get this information. Besides that, it is not as legible as the first query, so I think that's an advantage of Cypher over SQL. And we know that join operations are really great for us, but they bring some performance complications, so that's not so good.
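The three-table join can be tried out with Python's built-in sqlite3 module. The sketch below rebuilds the User, Page and Likes tables from the earlier scenario in an in-memory database; the column types and the gender, age and category values are assumptions, and the FriendsWith table is left out because the query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, gender TEXT, age INTEGER);
    CREATE TABLE page (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE likes (user_id INTEGER REFERENCES user(id),
                        page_id INTEGER REFERENCES page(id));
    INSERT INTO user VALUES (1, 'John', 'M', 28);
    INSERT INTO page VALUES (1, 'Cola', 'Drink');
    INSERT INTO page VALUES (2, 'The Beatles', 'Band');
    INSERT INTO likes VALUES (1, 1);
    INSERT INTO likes VALUES (1, 2);
""")

# Three tables have to be joined to answer "which pages does John like?".
rows = conn.execute(
    """
    SELECT page.name
    FROM user
    JOIN likes ON likes.user_id = user.id
    JOIN page  ON page.id = likes.page_id
    WHERE user.name = ?
    ORDER BY page.name
    """,
    ("John",),
).fetchall()

pages = [name for (name,) in rows]
print(pages)  # ['Cola', 'The Beatles']
```

Compared with a single MATCH pattern, the join has to spell out how each table connects to the next, which is the legibility point made above.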
I also brought a performance experiment that some researchers at the University of Mississippi did, comparing Neo4j with MySQL. They basically submitted two types of queries: structural queries and data queries. A structural query basically goes navigating through your data via the relationships, doing a kind of graph search and navigating it like a tree, while a data query only retrieves nodes by their attributes. We can see here that Neo4j is actually not that good for data queries. If your application is only trying to retrieve nodes by some attributes and is not using the relationships, the topology of the graph, that much, maybe it's not a good idea to use Neo4j, because MySQL can perform better than that. But if you have a lot of navigation through the relationships of the data, then Neo4j is the way to go, because it's easier to use and faster than MySQL. So you have to analyze what your application is doing and how it uses the data to choose wisely whether or not to go for a graph database.
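The difference between the two query types can be illustrated with a toy in-memory graph: a structural query walks the relationships, while a data query only filters on attributes. The breadth-first traversal, the names and the ages below are choices made for the illustration and are not taken from the experiment.

```python
from collections import deque

# Toy data: who points to whom, and one attribute per user (all made up).
adjacency = {
    "John": ["Mary", "The Beatles"],
    "Mary": ["Peter"],
    "Peter": [],
    "The Beatles": [],
}
ages = {"John": 28, "Mary": 27, "Peter": 40}


def structural_query(start, max_depth):
    """Breadth-first traversal: nodes reachable within max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                reached.append(neighbour)
                queue.append((neighbour, depth + 1))
    return reached


def data_query(min_age):
    """Attribute lookup: no relationships are followed at all."""
    return sorted(name for name, age in ages.items() if age >= min_age)


print(structural_query("John", 2))  # ['Mary', 'The Beatles', 'Peter']
print(data_query(28))               # ['John', 'Peter']
```

The first function is the kind of work a graph database is built for; the second is essentially an indexed column lookup, where a relational engine is at home.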
of the cases because I'm talking about so in theoretical so where they use its where it's it important to use and graph databases there are several areas where
we could use graph databases 1 of them is social network that's the example that I've been doing so far aid an the yeah we are we we can see that's pretty straightforward to relate those user as nodes and relationships as edges it's pretty straightforward but we also have some work done in this area for bioinformatics in the genetic analysis I'm not from this area so I am not I don't know exactly what they do but it appears to be that the particles of our DNA Dave have some interactions between them in those interactions are relationships between particles so they start this if this kind of information in a graph databases so they could process this information in a graph databases in a better way than it would be in a relational database so that's an area that's taking a lot of the of a lot of that event is from a graph structure another interesting area that's using graph databases nowadays EC telecommunications because as they can represents the formation of the person calling another person and the connections of the cables
easily using graph databases, so they can visually see their network using this kind of database. But what I personally brought to you today, more specifically, because for me it's important, though I don't know about you, is an
application about the graph of Game of Thrones that uses graph databases. It's a really, really interesting application of graph databases that someone did, and it was amazing. So I brought you here the work of these guys, which just blew my mind. It was Andrew
Beveridge and Jie Shan, and
I hope I said those names correctly, but they took the time to analyze the network formed by the characters of the book A Storm of Swords, from Game of Thrones. They went through the whole book and registered the relationships between each of the characters, and they also gave a weight to each relationship. So if a person was really close to another person, the relationship has a higher weight; if they're not so close, the weight is low. They did this and put it in a CSV file, formatted like this: the character Aemon to the character Grenn has a weight of 5, but Aemon with Samwell has a weight of 31. They did this for the whole book and all the characters of the book, and you know that's a lot, so it was really amazing. Then someone else, a guy called William Lyon, had the brilliant idea to load this information into Neo4j. He started to play around with it and do some analysis on it, and to do this analysis he used a module called igraph. I don't know if anyone has ever heard of it. It's a module that allows you to manipulate a graph using network analysis algorithms such as centrality and community detection. It's really easy to call these algorithms from this module, and it has a really neat way to connect with Neo4j. So let's take a look at how William did this using igraph and Neo4j with Python. He got the Graph from py2neo, which connects to Neo4j, and then he loaded all the information from the Neo4j graph into igraph. Then he just called the community_walktrap method, considering the weight that each relationship had, and using this method he was able to identify the clusters, or communities, of the characters.
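The data flow described here (CSV rows, then a weighted graph, then groups of characters) can be sketched with the standard library alone. This is not Walktrap: as a much simpler stand-in, it keeps only the relationships at or above a weight threshold and returns the connected components. The column names are an assumption, the first two data rows are the ones quoted in the talk, and the remaining rows are placeholders made up for the sketch.

```python
import csv
import io

# Assumed layout: a header plus source, target and weight columns.
# Only the first two data rows come from the talk; X, Y and Z are
# hypothetical placeholders.
CSV_TEXT = """Source,Target,Weight
Aemon,Grenn,5
Aemon,Samwell,31
X,Y,12
Y,Z,9
Grenn,X,1
"""


def load_edges(text):
    """Parse the CSV text into (source, target, weight) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Source"], row["Target"], int(row["Weight"])) for row in reader]


def strong_tie_groups(edges, min_weight):
    """Group characters connected by relationships of at least min_weight.

    A simple stand-in for community detection: connected components
    over the strong ties only.
    """
    adjacency = {}
    for source, target, weight in edges:
        adjacency.setdefault(source, set())
        adjacency.setdefault(target, set())
        if weight >= min_weight:
            adjacency[source].add(target)
            adjacency[target].add(source)

    groups, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:  # depth-first walk over one component
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups
```

With python-igraph installed, the call mentioned in the talk is the graph's `community_walktrap` method, which takes a `weights` argument; the threshold approach above only illustrates the idea of grouping by relationship strength.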
The result of that method was a table looking like this. We have eight clusters, and each cluster has an ID and a set of characters that are part of this community. The last one is quite small, as you can see over here, but since it's not deterministic, some stuff like this may happen to your data. He took this information and he also computed some centrality measures, and he was able to come up with this graph. This graph represents all the characters of the book. Each color represents a community, or cluster. The size of a node represents the importance of that character given the centrality measure, and the width of the edges represents the weight of the relationship between the nodes. I don't know if you can see it, but the big blue node there is Jon Snow, and if you look around it you see that the blue community represents the people on the Wall, so you see the other characters from there. In the green one, you see that the big node is Daenerys, and if you look around you see all the characters that are from her part of the story. In the big yellow one we have Robert, the king, and Tyrion, and you see other important people, so that represents the main part of Game of Thrones.
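The idea of sizing nodes by centrality can be sketched as follows. The talk does not say which centrality measure was used, so this example assumes the simplest one, weighted degree (the sum of the weights of a node's relationships), and the size range is an arbitrary choice for illustration.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum the relationship weights touching each node."""
    strength = defaultdict(int)
    for source, target, weight in edges:
        strength[source] += weight
        strength[target] += weight
    return dict(strength)


def node_sizes(strength, min_size=10.0, max_size=50.0):
    """Scale each score linearly into a drawing size."""
    low, high = min(strength.values()), max(strength.values())
    span = (high - low) or 1  # avoid dividing by zero when all scores match
    return {
        node: min_size + (score - low) / span * (max_size - min_size)
        for node, score in strength.items()
    }


# The two relationships quoted in the talk.
EDGES = [("Aemon", "Grenn", 5), ("Aemon", "Samwell", 31)]
STRENGTH = weighted_degree(EDGES)
SIZES = node_sizes(STRENGTH)
```

The edge width in the picture can come straight from the `weight` value in the same way, scaled to whatever range the plotting tool expects.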
So yeah, this is really fun. I found it a really interesting application of graph databases: how you can play around with the data and analyze it to extract more information about it. It's pretty nice to see how your data behaves in this analysis. If you worked with a relational database, you couldn't have this visual of your data, which is nice. So yeah, that's it.
If you have any questions, please feel free to ask now or later; I'll be around here. This talk is on Speaker Deck, at /labcodes. I have also written a blog post about this presentation, on the same thing; you can find all the notes online in our Medium account, Labcodes, and these are my contact details. So if you have any questions, feel free. Thank you.
We also have stickers over here, if you want stickers.
Thank you so much. OK, we have a lot of questions.
Thank you for the amazing talk. I come from a NoSQL and relational database background, and my question is mainly how to scale such graph databases. Maybe with Titan, being on Cassandra, I can see it, but with the others I don't have experience. How do you scale when your graph gets to a size where you basically don't have enough memory to store everything on one server?
Yeah, for Neo4j you have distributed processing, so you can do that easily. OrientDB also provides a really nice way to distribute your processing, because it's also a NoSQL kind of database, so they already have that in there. I didn't bring it here, but they have really nice tutorials on their website on how you can do this. It's basically the same thing: you can bring up instances of the database and spread them out, and they have nice mechanisms to make everything distributed and have it come together without conflicts.
Is it like sharding, or something like that?
I'm sorry, I didn't research much about that. I assume it's something like that, but I didn't read much about it.
Hello, thank you for the talk. I have two questions. The first one: can I somehow specify a schema for the nodes in the database? For, let's say, a user with a name and so on.
Actually, Neo4j is schemaless, and you don't have a way to put your schema inside the database. You probably should do this at the application level instead of in the database. The big thing is that you don't have to come up with a schema. So yeah.
Yeah, thank you.
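Following the suggestion in that answer, an application-level check could look something like the sketch below. The `USER_SCHEMA` mapping and the `validate_node` helper are hypothetical names invented for this example; they are not part of any Neo4j driver.

```python
# Hypothetical schema: property name -> expected Python type.
USER_SCHEMA = {"name": str, "email": str}


def validate_node(properties, schema):
    """Return a list of problems found in a node's properties.

    An empty list means the properties satisfy the schema, so the
    application can go ahead and write the node to the database.
    """
    errors = []
    for key, expected_type in schema.items():
        if key not in properties:
            errors.append(f"missing property: {key}")
        elif not isinstance(properties[key], expected_type):
            errors.append(f"{key} should be {expected_type.__name__}")
    return errors
```

The application would call `validate_node` before sending the write, and reject or fix the data when the returned list is not empty.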
The second question: can I somehow specify the storage mechanism for the graph? For sparse graphs or dense graphs, it might be more optimal to store the relationships in a table, or in something like a linked list, inside Neo4j.
Yeah, I'm not sure about that. I never looked into whether there are different ways to do it. As far as I've gotten, Neo4j is pretty much a black box that takes care of these things for you, and I'm not sure if you are able to tweak this to customize it based on the specifics of your data. I know that with Titan you can change your backend as your application needs, but with Neo4j I'm not sure you're able to do that.
We have time for two or three more questions.
These databases are perfect for storing relationships, but relationships change over time. How do you keep track of that? Because if you change a relationship, the old one is no longer there.
Yeah, the next step for me would be to try to figure out how you would implement time series applications using graph databases, but I'm not sure. You can store attributes in your relationships, so you could add something like a list of attributes to your relationship, in which you could store the history of what your relationship has been. I'm not sure how you would implement a real time series application using Neo4j, but you can do it by implementing it as an attribute on your relationship.
Thank you for the presentation. I have two questions. If I'm not wrong, in Neo4j there's a user interface where we can actually see the graph. How does this interface react if you have a big graph, with hundreds of millions of nodes?
Yeah, it limits. I'm working now on a database that
was supposed to have 16 thousand nodes, and when I search for all the nodes it just returns a thousand to me. So it limits the result for you, because it can be really heavy on the front end. So yeah, it limits.
And the second question: I did a quick search on Titan, and there were some posts saying that it has been discontinued.
Yeah, this morning, actually, someone tweeted me about TitanDB. I haven't had the time to look at it, but it seems that they have a discontinuation going on, so I'll probably have to update my talk. I just heard of this today, and I'm going to take a further look at it. It seems it was involved in some process where they changed the name. I'm going to take a look, but yeah, I saw this this morning. Really weird.
A very good presentation, thank you. I'd like to know: when you use a graph, do you use, for example, Dijkstra's algorithm to find the shortest path between two nodes? In graph databases, do you have this type of algorithm available, or is it necessary to get the data and use igraph to look for these paths?
You have some libraries that you can add to your Neo4j, and you call these functions from Cypher, so they are coupled with the graph database. The igraph one is really nice, but you can use it without a graph database. But you do have some libraries that calculate centrality measures and do pattern recognition using Neo4j only, without igraph. You have those libraries too.
Thank you.
Thank you for the great talk, and remember to rate it.
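For readers unfamiliar with the algorithm named in the last question, here is a minimal sketch of Dijkstra's shortest path in plain Python. It runs on an in-memory dictionary, not inside Neo4j, and the node names and costs are hypothetical. Note that it treats edge values as costs (lower is better), which is the opposite of the closeness weights used in the Game of Thrones data.

```python
import heapq


def dijkstra_path(graph, start, goal):
    """Return (total_cost, path) for the cheapest path, or None.

    graph maps each node to {neighbor: cost}; costs must be non-negative.
    """
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue  # a cheaper route to this node was already expanded
        visited.add(node)
        for neighbor, step_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + step_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical undirected graph with made-up costs.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
```

Calling `dijkstra_path(GRAPH, "A", "D")` finds the route through B and C, which is cheaper than either direct alternative.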