
NetworkX Visualization Powered by Bokeh


Formal Metadata

Title
NetworkX Visualization Powered by Bokeh
Title of Series
Part Number
157
Number of Parts
169
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Björn Meier - NetworkX Visualization Powered by Bokeh. Visual data exploration, e.g. of social networks, can be ugly manual work. The talk will be an introduction to the combined usage of NetworkX and Bokeh in a Jupyter Notebook, to show how easy interactive network visualization can be. ----- During some work on social network analysis, my favoured tool for studying the networks was NetworkX. It provides a wide set of features and algorithms for network analysis, all in Python. But its functionality for visualizing networks is not very strong, not to mention the missing interactive manipulation. During data exploration, exporting the data, feeding an extra tool for visualization and then manipulating the data manually was a tedious workflow. As I also had the optional goal of presenting networks in a browser, I improved this workflow by creating a Flask web application providing interfaces to my networks. On the browser side I created a JavaScript client based on D3.js. In retrospect, the required programming effort in Python and in JavaScript was too much for such a task. Exactly this goal, interactive visualization in a browser (and as a bonus in a Jupyter Notebook), can now be achieved quite easily with Bokeh. The talk will be a step-by-step introduction, starting with the basic visualization of a network using Bokeh, NetworkX and a Jupyter Notebook. Next comes how to create interactions with your network, which will be used to change the network structure, e.g. when a person leaves. As we want to see the impact of these changes on a network directly, I will finally show how to update networks and visualize directly how the importance of the remaining people changes. All this can be achieved with Python and maybe a bit of JavaScript.
Transcript: English (auto-generated)
and he is going to show us some programming with networks. We are coming to a close. Hi everybody. I hope you are not here only for the comfortable chairs while you wait for the lightning talks. In the meantime I will tell you something about networks and Bokeh, and how you can plot networks even though Bokeh has no direct support for plotting them.
About me: I am a junior software engineer at Blue Yonder. I do not use this at work. It is just a side project.
At the moment it is not used in our company. That is it. I hope most of you heard Fabio's talk yesterday. Did you hear it? Then most of you know what Bokeh is. It is a great visualization library.
I will show you the basics: how you can handle data, how you can manipulate it, and how you can go back and change something or get feedback from the plot. Why did I do this? During my master's thesis I was working with networks, some kind of social networks. We wanted to explore them, and the problem was that we wanted to see them. We wanted more than just tables or columns to read through.
We wanted to visualize them and see some properties. We wanted to see them in a browser. Maybe we wanted to include them in an app. I came up with this: I generated the networks and the properties and stored them in a database.
I wanted to visualize them with D3. It is a nice Swiss Army knife, or whatever you call it. It is extremely complicated, but it is powerful. I had to provide the data from the database, so I created a RESTful Flask app. That is a lot of overhead and a lot of programming just to do some visualization for exploring.
The question was, can we do this better? I thought, okay, I will try the same now with this library. It is much easier.
I did not have to handle any JavaScript code so far. I do not have to care about how I get the data to the client running in my browser. Bokeh does this for me. I can also explore and start my visualization app in a notebook, a Jupyter notebook.
This is really great. On top of this, I can change a network. I can change my graph. I can manipulate it and I can get feedback: if I select something, I can get that selection back. I will now show you how it is done. I will create a network and show it to you. All the code you need for it is part of these slides.
I did not leave any code out. At the end there is a more complex example where there may be some missing, but here you see what is necessary. I needed some example data. I was thinking about using one of the usual example data sets or something like that.
Yesterday I had the idea: we are at EuroPython and people like to use Twitter. There are some nice Twitter modules like Tweepy. I used it to get the information about the EuroPython account. The EuroPython account is sometimes linked to a lot of people, or an author mentions EuroPython and links to other people.
I can use this data to create some kind of social network. Authors are connected to each other. Maybe they tweeted more, so the weight on an edge is higher. This will be useful for the network. I have my data now. What is the next step? Yes, we need a network.
As I said, sadly, at the moment Bokeh doesn't support this out of the box, but we can do it on our own. We use NetworkX and load our EuroPython data. I could have done it live here, but I was a little bit afraid of the Twitter limits and the Wi-Fi, so I did not. I stored it in a GML file.
I created the network using NetworkX, which has an awesome function to write it to a GML file. I now import this file back and get my network. What I do now: NetworkX can draw, but it usually draws with matplotlib and the result is static. Instead, I can use NetworkX to create a layout.
I can feed this layout into Bokeh, and there I get an interactive visualization. I put in my network and some value for k. k just says how much distance you want between nodes. It's an iterative algorithm, so I can give a number of iterations.
If you're a little more interested in what exactly it does, you can go to the Wikipedia page on force-directed graph drawing. What it basically does: it creates spring forces between nodes. Imagine a 3D model that is put on a table, and then over a few iterations, with some friction, the nodes come to rest at some positions.
This is basically a spring layout, or force-directed graph drawing. I will use this layout later. Now we have to do some, not exactly workarounds, but we have to get the data into a format we can use in Bokeh.
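The spring layout step described above can be sketched as follows. This is a minimal illustration, assuming NetworkX (with NumPy) is installed; the small graph, its node names and the values for `k` and `iterations` are made up for the example and are not the ones from the talk.

```python
import networkx as nx

# Small stand-in graph; the talk loads the Twitter network from a GML file.
graph = nx.Graph()
graph.add_weighted_edges_from([
    ("alice", "bob", 3),
    ("bob", "carol", 1),
    ("carol", "alice", 2),
    ("carol", "dave", 1),
])

# k sets the preferred distance between nodes, iterations limits the
# number of simulation steps. Both values are illustrative.
layout = nx.spring_layout(graph, k=0.5, iterations=50)

# The result is a dictionary: node name -> array with the x and y position.
for node, (x, y) in layout.items():
    print(f"{node}: x={x:.2f}, y={y:.2f}")
```

The positions differ from run to run because the algorithm starts from random positions.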
The cornerstone in Bokeh is, I would say, the column data source (ColumnDataSource). It's one of, I think, three or four kinds of data sources, but I think it's the most important one. It's the one you would probably see first. It's a class where you can store data column by column.
On the left you see an ID, of course, because these are all just lists. I store the x coordinate, the y coordinate and the node name there. The first row says that Dabierny, it's my Twitter handle, is located at the position 21-3.
The nice thing about this column data source is that you can change it. You can add data, you can add columns, and you also get feedback from it. If someone selects a node in your graph, this is the place where you get the information about which node is selected.
You can use plain lists, or you can use Pandas data frames to create those columns, but in NetworkX you usually have a dictionary first, so we have to transform the data a little bit. This is a drawback at the moment: you have to copy the data.
I get the layout and its items. The key in the layout is the node name, and the value is a tuple with the coordinates of the node. We have to extract those values and put them in lists so that we can create our column data source. We just extract them, put the lists in a dictionary and pass it to the column data source.
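The transformation just described, from a layout dictionary to column lists, might look like the sketch below. The layout values and the helper name `layout_to_columns` are hypothetical; only plain Python is used, so that the shape of the data is visible without Bokeh installed.

```python
# Hypothetical layout in the shape NetworkX layouts have:
# node name -> (x, y) tuple.
layout = {
    "alice": (0.21, 0.30),
    "bob": (-0.45, 0.10),
    "carol": (0.05, -0.62),
}


def layout_to_columns(layout):
    """Split a {node: (x, y)} layout into equally long column lists."""
    names = list(layout)
    xs = [layout[name][0] for name in names]
    ys = [layout[name][1] for name in names]
    return {"x": xs, "y": ys, "name": names}


node_columns = layout_to_columns(layout)
print(node_columns)
# With Bokeh installed, this dictionary can be handed over as
# bokeh.models.ColumnDataSource(data=node_columns).
```

Every column has one entry per node, and the position in the lists plays the role of the row index.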
Now we have our node source. Now we can finally plot something. How is this done? Here is a little bit of code. You can ignore the hover code at first and just look at the figure and the plot. figure just creates your drawing area. You define
how big it is and you say which tools you want: tap means you can click on nodes, and hover is the hover tool from above, so that when you move your mouse over a node it shows the name, because I know that I have the column name in my data source, and also the ID or index.
It's a property which is always there, so when you hover over a node you will see its ID and name. The next step: I want to see my circles, and this is done with plot.circle. It creates a renderer, r_circles, a circle renderer, and I put my data source in here. So you say source is my node source, and now I want to have x and y, so I give the column name x and the column name y, and they are used for positioning the circles. I have some fixed values: the color is blue, the level overlay just means the circles are above the lines that come later, and the size is ten. So now we have this. It's not really a network, it's just points.
Okay, so we need some more work. It's not so much, but we have to add some edges, and to add the edges we have to prepare them again. We take the layout and the network and extract the positions of the nodes again, because we want to connect nodes. I get the data of the edges: if I say network.edges with data=True, I get the edges and the weight, which is the data attribute of every edge. Then I calculate the maximum weight, because I want to do some alpha coloring of the lines, so I can calculate a value between 0.1 and 0.6. I put all this into a dictionary of lists, and those lists go into a column data source for the edges. Now I have a line source.
Now I can plot multi-lines, and I do the same as for the circles: I put in the source and say the source is the line source. You have pairs in the first two lists: a line is defined by x and y for the starting point and x and y for the end point, so every entry in xs holds the x coordinates of the start and end point, and every entry in ys holds the y coordinates. These are just the names of the columns. Here you already see that for alpha we use the name alpha, so the alpha value is taken from the column data source.
You cannot see it directly here, but the lines have different alpha values. You will maybe see it a little better later. Okay, this was just a boring network. We want to see a little bit more. We want to see some properties like centrality or maybe clustering.
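The edge preparation described earlier (positions for both end points of every edge, plus an alpha value derived from the edge weight) can be sketched in plain Python. The talk only says that the alpha ends up between 0.1 and 0.6; the linear scaling by the maximum weight used here is an assumption, and the data and the helper name `edges_to_columns` are hypothetical.

```python
# Hypothetical layout and weighted edges, in the shape that
# network.edges(data=True) would roughly provide after unpacking the weight.
layout = {"alice": (0.2, 0.3), "bob": (-0.4, 0.1), "carol": (0.0, -0.6)}
edges = [("alice", "bob", 4), ("bob", "carol", 1), ("carol", "alice", 2)]


def edges_to_columns(edges, layout, min_alpha=0.1, max_alpha=0.6):
    """Build multi-line columns: one [start, end] pair per edge and axis."""
    max_weight = max(weight for _, _, weight in edges)
    xs, ys, alphas = [], [], []
    for start, end, weight in edges:
        x0, y0 = layout[start]
        x1, y1 = layout[end]
        xs.append([x0, x1])  # x coordinates of both end points
        ys.append([y0, y1])  # y coordinates of both end points
        # Assumed scaling: heavier edges are drawn more opaque.
        alphas.append(min_alpha + (max_alpha - min_alpha) * weight / max_weight)
    return {"xs": xs, "ys": ys, "alpha": alphas}


line_columns = edges_to_columns(edges, layout)
print(line_columns)
```

The resulting dictionary has the same column-per-key shape as the node data, so it can go into a second column data source for the lines.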
So we add this information to our column data source, and it's not so complicated. NetworkX provides some really cool algorithms, for example several centrality algorithms. I have chosen betweenness centrality here. It means the following: you have shortest paths in your network, and a node that a lot of shortest paths have to pass through has a high betweenness centrality.
Now I have a centrality. Again it's a dictionary, so we have to transform it a little bit, and then I can map the values, shifted, onto a range. I want to use this value for the size of the circles, so I say the least important nodes have size 7 and the most important have maybe 17. It's just a range mapping. Then I say the new column in my column data source is centrality, and I add it to my node source. So my node source now has a centrality value for every node. Okay.
The next point: I wanted to have some clustering, to see which nodes and people are maybe a little bit more connected because they have tweeted about each other. For this I use a Python module that is an addition to NetworkX and creates the clustering for you. Clustering is NP-hard, so you will not always get the same result, and it may be quite a calculation.
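Going back to the centrality step, here is a small sketch of betweenness centrality followed by the range mapping to circle sizes between 7 and 17. It assumes NetworkX is installed; the star graph is a stand-in for the Twitter network, and the helper `map_to_range` is a hypothetical name for the mapping the talk describes.

```python
import networkx as nx

# Stand-in network: node 0 sits in the middle, nodes 1 to 4 around it,
# so every shortest path between two outer nodes passes through node 0.
graph = nx.star_graph(4)
centrality = nx.betweenness_centrality(graph)  # dict: node -> value


def map_to_range(values, low=7, high=17):
    """Linearly map values so the smallest becomes low and the largest high."""
    smallest, largest = min(values), max(values)
    span = largest - smallest
    if span == 0:
        return [low for _ in values]
    return [low + (high - low) * (value - smallest) / span for value in values]


names = list(graph.nodes)
sizes = map_to_range([centrality[name] for name in names])
print(dict(zip(names, sizes)))
# The sizes list can then be added as a new column, for example under the
# key "centrality", next to the x, y and name columns.
```

The central node gets the largest circle and the outer nodes, which no shortest path passes through, get the smallest.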
It needs some time to calculate. But for this size it's still fine, and even much bigger sizes will work. So I get a partition, and again I split it up: I get out the nodes and the communities. The nodes I already have, so I don't need them. We have the communities, and now I can again add a new
column to our column data source. It's called community, and now I have communities in my data source. Then I do a color mapping, because I want to have different colors. I have a list of colors and I use the modulo operator to give every group a color, and now I can
see another plot. No, I missed something. You have just added a new column, but you're not using it. The problem is that the renderer, r_circles, still has a fixed size and a fixed fill color, so I change them to refer to my column data source: I say now use centrality for the size
and now use community color for the fill color. Now I can plot it, and you can see different colors, different sizes, and a big dot in the middle. That's EuroPython. Yes, I left it in there on purpose, because I want to remove it interactively now: a social network of
people tweeting about EuroPython doesn't make much sense if EuroPython itself is in it. So we have to change it, and we want to do it interactively. I want to click on a node here. This is a little buggy because it's a slide show; it usually also works in notebooks.
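The color mapping with the modulo operator mentioned above can be illustrated with a few lines of plain Python. The partition below is a hypothetical result in the form node name to community number, and the palette is made up.

```python
# Hypothetical clustering result: node name -> community number.
partition = {"alice": 0, "bob": 1, "carol": 0, "dave": 3}

# Fewer colors than communities on purpose, to show the wrap-around.
palette = ["red", "green", "blue"]

names = list(partition)
community = [partition[name] for name in names]

# Modulo keeps the index inside the palette, so community 3 reuses color 0.
community_color = [palette[number % len(palette)] for number in community]

print(list(zip(names, community, community_color)))
```

Both lists can be added as new columns; the renderer then has to be told to take its fill color from the color column instead of a fixed value.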
I can show it up here. You can click on something and mark it, and then I want to remove it, because I say, okay, it's bad data, I want to remove it and do some recalculation. So I can do interactions, and I can get out of the column data source which nodes
I selected. The data structure in here is a bit tricky. You have 0d, 1d and 2d. 0d is just for line and patch glyphs. All other glyphs, like circles, are under the 1d key, and 2d is for things like multi-line drawings
or something like this. So we just go there, use the 1d key, and we have the indices of all currently marked nodes in our plot. What we can do now is remove them. This is just example code; you can do it better, I think. I get the index and use it to get the node from my network.
Now I can remove the node from my network. I pop it out of my layout, but I also have to restructure the data in my column data source, because currently the network and the data source do not share their data. So I iterate
over all columns and remove the row at that index. You could also remove several of them. Then you update the data with the new values for every column, and you add the dictionary for the updated edges. This way you can remove an edge, you can remove a node. But there's a problem.
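Removing the selected rows from every column, as described above, can be sketched like this. The column data, the selected index and the helper name `drop_rows` are hypothetical; in the talk the indices come from the selection stored in the column data source.

```python
def drop_rows(columns, selected_indices):
    """Return a copy of the columns without the rows at the given indices."""
    drop = set(selected_indices)
    return {
        key: [value for index, value in enumerate(values) if index not in drop]
        for key, values in columns.items()
    }


node_columns = {
    "name": ["europython", "alice", "bob"],
    "x": [0.0, 0.4, -0.3],
    "y": [0.0, 0.2, 0.5],
}

selected = [0]  # hypothetical: index of the node that was clicked
node_columns = drop_rows(node_columns, selected)
print(node_columns["name"])
```

Building new lists instead of deleting in place keeps all columns the same length, which a column data source expects. Because a set of indices is used, removing several selected nodes at once works the same way.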
Bokeh is great, but it still has some problems. Not everything works in a notebook, and as you see, I'm still in a notebook. It's a slide show, but it's a notebook. In a notebook the plot is not redrawn automatically when you change a column data source. You can push your changes,
and then it will be redrawn, or you run it in a Bokeh server, where it is redrawn automatically, because the server checks for changes or you trigger a change yourself. Another problem is that you cannot get those selection values back. So what I showed you here does not currently work in a
notebook: the list will always be empty. You have to do this in a Bokeh server. Okay, it's still great. I can use a Bokeh server to run my app, and it's not much of a problem. Another drawback, I have to say, shows up if you want to add widgets. In a notebook you can add widgets like sliders, buttons and stuff like this.
They will run JavaScript code there. If you run the app in a Bokeh server instead, you stay with pure Python functions and pure Python callbacks. Good, now I want to show you that you can do those interactions. As I said, this is the EuroPython account on Twitter and I want to remove it.
So I mark it, I remove it, and you see it's gone and some other connections show up. You see some stronger lines: those are connections between people who might have tweeted more about each other than others. And I can switch back. Do you see a problem?
There's still no central person in there, because we removed the very central account, EuroPython. I still have to update the properties. I push the button, which calls the update function: it goes back to my network, does some calculations, gets the information and puts it back into my column data source. Now I see more interesting people,
who might be interesting to you because they are tweeting a lot. Here, I think, is the OpenStack account. They have connections to other people. But we still have the old layout, so we can update the layout. It takes a while, and now we get this layout. It looks a little bit weird at the moment, because
in the EuroPython network we have a lot of people who just tweeted about each other, and then you get one-to-one connections. You also still have nodes, like here, that don't have any connection, because we removed EuroPython but did not remove the nodes that have
no other node attached. We can fix this, so we remove them now. You see this one is gone. I can reset the zoom and I'm back here, and I can update the layout again. Now I can explore, so we can dig a little deeper. I am looking around here.
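The clean-up described above, removing a central node and then dropping the nodes that are left without any connection, can be sketched in plain Python. The node names, the edge list and the two helper functions are hypothetical; in the talk these operations are done on the NetworkX graph.

```python
def remove_node(edges, node):
    """Drop every edge that touches the removed node."""
    return [(a, b, weight) for a, b, weight in edges if node not in (a, b)]


def connected_nodes(nodes, edges):
    """Keep only the nodes that still appear in at least one edge."""
    linked = {name for a, b, _ in edges for name in (a, b)}
    return [name for name in nodes if name in linked]


nodes = ["europython", "alice", "bob", "carol"]
edges = [
    ("europython", "alice", 5),
    ("europython", "bob", 2),
    ("europython", "carol", 1),
    ("alice", "bob", 3),
]

# Remove the central account, then every node that has no edge left.
edges = remove_node(edges, "europython")
nodes = [name for name in nodes if name != "europython"]
nodes = connected_nodes(nodes, edges)
print(nodes, edges)
```

In this example carol was only connected to the central account, so she disappears together with it, while alice and bob keep their edge.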
Here's my colleague. He's sitting there, and I think he tweeted the most of all our colleagues. I can zoom in here and you can see which people he's tweeting about. Here's another colleague. Cool. But I can now also explore
what happens if he has a meltdown and decides to go to Java or something like this. I can remove him and then update the properties again, and so on. So you see, I did mostly interactive network plotting in just a few minutes, and I think it's
quite handy if you just want to explore. You can go further and do some more stuff. And of course you can swap out NetworkX. It's a great library, but you can switch it for something else in other situations. Maybe you want to use NumPy or something like that, you are doing some development and you want to plot it: just think about it. You can do it.
It's not so complicated to bring it into Bokeh, interactively change what you're doing and feed in some values you want to change. I think that's it. I hope you enjoyed it and maybe learned something. If you want the documents, the
notebook, the Twitter data and how I got the data, you can go to our company's Blue Yonder documents. There are the presentations for this year and last year. Here are the links for NetworkX and Bokeh. That's it.
Is it working with all the layouts that are possible in NetworkX, or not? Can you customize the layouts more? Are there five or ten different network layouts?
NetworkX has some layouts, like a random layout and a circular layout, but they are not so sophisticated.
If I have a specialized one, like my own stuff, can I use it through this as well? Will it work? Have you tried it?
No, I did not try, but if you can generate a layout, where you just generate positions, it should not be a problem. For example, if you want a spring layout where you move clusters nearer together, I think you have to copy it, maybe fork NetworkX, and then you can bring in some additional forces to draw some nodes closer together. It should not be such a problem.
So it's like in pyplot, right? You first draw the nodes, then the edges, and then I can put it into this one as well.
I just wanted to understand a bit better the connection between Bokeh and NetworkX. Once you've done the initial graph with Bokeh, when you do some more things live, does it go back to NetworkX again, or not?
When you do things at this point? Yes, I go back to NetworkX. If I want to see a different centrality here, closeness centrality for example, it goes back to NetworkX and calculates it. It's not pre-calculated. These are just Python callback functions: they go back to NetworkX, call algorithms, remove a node in NetworkX, and then you have to transform the data back and then you can use it.
Thanks. Thanks for the talk, by the way. The buttons I see here, are they from Bokeh or have you added them yourself?
Okay, this is something that's not in these slides. They are basically buttons from Bokeh. In two lines you say I want to have a button, then you add an update function to the button and put it in a layout. So it's two lines and maybe another, and then you have both buttons and they do what you want.
Any more questions? No? Then please give a big round of applause.