Interactive data Kung Fu with Shaolin
Formal Metadata
Part Number | 103
Number of Parts | 169
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/21146 (DOI)
Transcript: English (auto-generated)
00:00
Hello everybody, thank you for being here. I am, as I said, Guillem Duran Ballester, and I will be talking about Interactive Data Kung Fu with Shaolin. These are my Twitter and GitHub accounts in case you want to check me out. And well, I'll start by explaining what I'll be talking about today.
00:20
Shaolin is a new Python library that allows for easily adding a layer of interactivity to a Python object when working in the Jupyter notebook. This presentation will not be a tutorial, but an introduction to some of the main capabilities of Shaolin, alongside some real examples in which Shaolin has been used to build a graphical user interface for some actual models.
00:44
And of course, the final goal of this talk is just to convince you to use Shaolin, and maybe try to build a community around it to make it grow. How am I presenting it? Well, we will basically talk about data kung fu: the philosophy behind Shaolin and its features,
01:00
inspired by science, martial arts, hacking, and teaching, combining all those different concepts and merging them in order to build interactive dashboards. Shaolin stands for Structure Helper for Dashboard Linking. But of course, it's also called this way
01:21
because it allows you to perform some nice data kung fu, a term which can be translated as mastering data through hard work and practice. So what is it? Well, this is just how we work at my company. It all started as a way to let my boss play
01:42
with all the code I was developing without him having to actually learn Python. He's a really good programmer, but he works in Delphi, an object-oriented Pascal language, and he just didn't want to learn Python. So when I tried to do something about it,
02:10
I honestly had no way of doing it. So when I asked him for help, he told me something like, okay, if you've got a problem you don't know how to solve, just step aside and ask yourself,
02:23
what would Feynman do? So I tried to turn the peculiar way that Feynman had of looking at the world into features of my software, which I will explain in the next four chapters of my tale about data kung fu. You will also find some of the Feynman quotes that inspired me to build these features
02:41
in case you get bored of my weird English accent and you're just here for the video. So if you want to ignore me and just read them, it's fine by me. So let's get serious. Chapter one, science. As a data scientist in my work, I should really be doing something to honor that term.
03:03
So let's talk about science. The core of science is about thinking, asking questions, doubting, and experimenting with nature in order to build knowledge, knowledge that we try to embed
03:22
into something we call models. We've been using language plus reasoning, which is also called math, to communicate our models for the last few centuries with great success. But nowadays, we find Python a simpler way to explain
03:41
the process of language and reasoning. Because if you write your models using code, when you try to explain them to another person, you will find yourself evolving from a blackboard into a dashboard. So science is also, well, about simplifying.
04:03
Shaolin aims to bring your code closer to the Jupyter notebook by shortening the process of writing complex interfaces using the ipywidgets package. Although its syntax is a bit complex, it allows you to really reduce the amount of code needed to write a graphical user interface.
04:22
And thanks to the capabilities provided by the notebook and the ipywidgets package, you don't even have to know Python to use it. So please, join me in converting Excel models into something more visually pleasing. How can we do that? Well, here's where Shaolin comes into play.
04:41
Shaolin achieves this by merging object-oriented programming with the ipywidgets package, using the Jupyter notebook as a backend. Here, for example, we can see a Python object which we transform into a dashboard by just inheriting from the Dashboard class and adding the lines of code that you see inside the red box.
05:01
How does this weird structure work? Well, first of all, you need to define your widgets. In Shaolin, widgets are defined by strings formatted in the following way. First, you write the kind of widget that you're going to use, then a dollar sign, and then you state its parameter values as shown in the picture. You can find all the weird syntax rules
05:22
that this package has in the example folder of our Jupyter notebooks. Well, just go online and check how it works. Because once you have your widgets defined, it's time to define your dashboard layout.
05:42
So well, here you can see a nested list structure in which the widgets are stacked. It just establishes how the parent-child relationship works. It is like when you are working with the ipywidgets package and you want to stack widget boxes. Well, it's an analogous process
06:01
but just with lists. You don't have to write children= and all your children in the dashboard. If you print this stuff, you get this. This is how a dashboard is defined in Shaolin. You can see that in the first line,
06:21
the c means that our dashboard will be a column comprised of two rows. The first one will be an int range slider, which is defined in a syntax mimicking the one provided by the interact function of the ipywidgets package, and the second one will be a row comprised of two widgets, a dropdown and a checkbox.
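The layout idea described here can be sketched in plain Python. This is only an illustration under stated assumptions, not Shaolin's actual parser or syntax: the widget names, the `name=...` parameters, and the `[container, child, ...]` list shape are all hypothetical. The only things taken from the talk are that a widget string holds the widget kind, a dollar sign, and then its parameters, and that nesting lists expresses which widgets live inside which container.

```python
def parse_widget(spec):
    """Split a 'kind$params' string into its two parts (toy format)."""
    kind, _, params = spec.partition("$")
    return {"kind": kind, "params": params, "children": []}


def parse_layout(layout):
    """Turn a nested list of widget strings into a tree of dicts.

    Assumed shape: a plain string is a leaf widget; a list is a container
    whose first item describes the container and whose remaining items
    are its children.
    """
    if isinstance(layout, str):
        return parse_widget(layout)
    head, *children = layout
    node = parse_widget(head)
    node["children"] = [parse_layout(child) for child in children]
    return node


def outline(node, depth=0):
    """Return an indented text outline of the widget tree."""
    lines = ["  " * depth + node["kind"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


# A column holding a range slider and a row with a dropdown and a checkbox,
# mirroring the dashboard described in the talk (all names are made up).
layout = [
    "column$name=main",
    "int_range_slider$name=limits",
    ["row$name=options", "dropdown$name=function", "checkbox$name=absolute"],
]

tree = parse_layout(layout)
print("\n".join(outline(tree)))
```

The point of the nested list is that the parent-child structure is visible in the data itself, so no explicit `children=` arguments are needed.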
06:44
And how does it look when coded in the notebook? Wait a minute.
07:11
This is just the example I've shown you. And this is just an object that is in charge of processing a NumPy array, scaling it,
07:21
and applying an arbitrary function. This is not really interesting on its own but it serves as a good example in order to illustrate how different dashboards can be combined into building something bigger. Here, for example, we have another one that just takes a matplotlib plot and another dashboard and converts its output
07:44
into a new one so you can see what you're doing while you are formatting your data. Of course, this example on its own is not something really great. So if you are looking into building something more complex,
08:06
you can also use Shaolin in order to get that. Some functionalities are already embedded in it. In case you wanna build plots
08:22
or you wanna do something quite complex, you may find a lot of dashboards in our library that can save you a lot of time when stacking different dashboards. For example, here you have a color palette which has embedded all the Seaborn functions and all the color maps. So you don't have to worry
08:40
about choosing your color palettes again. In this case, this dashboard is also using cufflinks in order to plot an arbitrary data frame. And you can actually select what kind of plot you like and which columns you wanna plot from the data frame.
09:01
And also, I think I added a theme board so you can choose among different cufflinks themes for making your plots. This is what Shaolin is about and what we're trying to build here.
09:26
So well, let's have a quick recap.
09:52
Okay, quick recap. Write your models using Python objects and then you can turn them into interactive dashboards
10:01
just by adding a nested list of strings or dashboards. And now let's go to chapter two. This has been influenced by me learning martial arts for the last few years. And I'll talk about what I call judo, which literally means gentle
10:22
or flexible way. Judo is a martial art that is basically based on two principles. The first one, be flexible, of course. It's all about flexibility. And the second one is about taking advantage of your opponent's strength.
10:40
As I'm kinda tired of showing you slides, let's do a quick demo to show you what I mean. Here's what we're talking about with flexibility. As I just showed, all the dashboards can be stacked
11:02
and merged into each other in order to accomplish something bigger than what was originally intended. Here, you can see how I built the dashboards in order to map some data, actually an arbitrary data frame, or pandas Panel, or even a Panel4D, to visual data in the form of a scatter plot.
11:24
Basically you have different dashboards, all combined. Every color means a different dashboard, and they have been stacked into each other. That allows you to select almost any visual property of a Bokeh plot. Now we'll see how it works.
11:43
You basically have here your dashboard with your Bokeh plot, and you can choose any visual property you like. For example, now this is the well-known Iris dataset, and we wanna do something new with this plot. Well, I don't think anything new can be done with it
12:03
because everyone knows it, but I'll try to have some fun with it. You can, of course, tune every default parameter interactively, like here the size of the marker, the type of the marker, even the alpha of the colors, if you wanna check for overlapping in your data.
12:26
But of course, you can also define mappings based on data. In this case, we are using a Seaborn color palette in order to distinguish among the three different species of iris.
12:42
You have to select it with your color mapping picker and then apply your mapping. And voila. You can also map, scale, and apply any function before converting your data into visual properties.
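A data-based color mapping like the one in the demo can be sketched without any plotting library. This is a minimal illustration, assuming a categorical column and a palette given as a list of hex colors; the function name, the sample rows, and the hex values are made up for the example and are not Shaolin's API or an official Seaborn palette.

```python
def map_categories_to_colors(values, palette):
    """Assign one palette color per distinct category, cycling if needed."""
    categories = sorted(set(values))
    lookup = {cat: palette[i % len(palette)] for i, cat in enumerate(categories)}
    return [lookup[value] for value in values], lookup


# Hypothetical sample of the species column and an arbitrary 3-color palette.
species = ["setosa", "versicolor", "setosa", "virginica", "versicolor"]
palette = ["#4c72b0", "#55a868", "#c44e52"]

colors, lookup = map_categories_to_colors(species, palette)
print(lookup)
print(colors)
```

Each point then carries its own color, which is the form a scatter plot needs when a visual property is driven by data instead of a single default value.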
13:03
So it's actually really flexible. It even allows you to do something like, okay, I would like to just plot two types of iris, color them with a color gradient, and the species that has not been colored,
13:21
I want it to be purple and to map their size between 20 and 50 according to its petal width. I don't know, you can do really weird plots with that. You just have to select what kind of data
13:43
you wanna map, and the one that is not mapped will get the default value. So you can actually rank your data and scale it. You can actually do anything you like. And of course, you still have automated
14:03
tooltip information. You can choose which columns you wanna plot in your Bokeh chart, and well, it works just like this.
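The "map the size between 20 and 50 according to petal width, and give everything else the default" idea from the demo can be written as a small min-max scaling. This is a sketch under assumptions: the row layout, the function name, the default size of 10, and the sample values are all hypothetical, and linear min-max scaling is just one plausible choice of scaling.

```python
def map_sizes(rows, selected, low=20.0, high=50.0, default=10.0):
    """Scale petal width to [low, high] for selected species only.

    Rows whose species is not selected keep the default size.
    """
    widths = [r["petal_width"] for r in rows if r["species"] in selected]
    if not widths:
        return [default] * len(rows)
    w_min, w_max = min(widths), max(widths)
    span = w_max - w_min
    sizes = []
    for row in rows:
        if row["species"] not in selected:
            sizes.append(default)  # unmapped data gets the default value
        elif span == 0:
            sizes.append((low + high) / 2)  # all widths equal: use the midpoint
        else:
            sizes.append(low + (row["petal_width"] - w_min) * (high - low) / span)
    return sizes


# Made-up rows, not real Iris measurements.
rows = [
    {"species": "versicolor", "petal_width": 1.0},
    {"species": "virginica", "petal_width": 2.0},
    {"species": "versicolor", "petal_width": 1.5},
    {"species": "setosa", "petal_width": 0.25},
]

sizes = map_sizes(rows, selected={"versicolor", "virginica"})
print(sizes)
```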
14:34
So now the recap. You can use Shaolin to make dashboards and stack dashboards with other dashboards
14:42
so you can get real weird and cool stuff. All the dashboards I programmed can be instantiated with some kind of data, but you can change the data they are mapping on the fly so you don't have to worry about getting
15:00
any weird crashes or trail of errors or that stuff. Well, now it's time to get serious, cause it's time to be a hacker. Yeah, what does it mean to be a hacker? First of all, a hacker should be able to find bugs and know the limitations of his software.
15:23
So I'll talk about what Shaolin is meant for. First of all, if you are trying to please your boss and get some nice dashboard that will make your investors really happy in a short time, please don't use it. Take Caravel or take something Flask-based
15:43
and go do your business easily. This is a nerd-oriented framework meant for science. And on the other hand, if you are just writing some quick function and you want to add quick interactivity without worrying about making complicated layouts or scaling your functions
16:01
on your models, you can just use interact. It's really great, it works nicely, and everything that's been built in this demo has also been built using just the ipywidgets package and no Shaolin, and it's possible to do that. It will take you about 50% more lines of code, but there's actually no problem with that.
16:22
But if you wanna make something able to snowball, I mean snowball by starting to make some small dashboards and making it grow in order to stack them and to make some more complicated plots, you should be using Shaolin.
16:41
Interact is meant for functions; Shaolin is actually meant for object-oriented Python programming. What's next? Information should be free. Of course, Shaolin is free software. You can do with it anything you want. Of course, the ultimate goal, the final goal of
17:04
this software would be to make the Jupyter notebook something like the ultimate science sharing platform. If you collaborate with me, it would be really great to map all the PyData libraries so we could use it without having to know Python.
17:20
So if you do anything, just show me the code and I'd be glad to interact with you. Anything you wanna find about it, you can find it at the GitHub repository of our company. And finally, if you wanna be a hacker, you should try to use stuff differently than it was intended when it was made.
17:42
Let's see what I mean with that.
18:05
Now it's time to feel like a hacker. Here you can find a dashboard that I coded a while ago, mainly as a joke. Of course, you can see that it is possible
18:21
to fully customize any CSS of your dashboards and make it look like pretty much anything you want. Well, it also includes pre-made dashboards for wrangling with IPs and ports. And, well, if you know Spanish,
18:45
you can see that this is actually a joke that I made. I was trying to implement the proof of concept of some port scanning that worked by loading malicious HTML code into an image in order to perform port scanning past firewalls.
19:03
It was really complex and it took me a lot of badly written scripts to do it, so I tried to embed everything into a dashboard. And once I discovered that the malicious HTML code could be loaded from the same HTML widgets
19:23
where you need to display your outputs, the Jupyter notebook became something like a port scanner. So, well, you see here that you can play and do a lot of stuff with it. Here you can see the snowball effect where by just defining three new widgets,
19:43
I can also add everything that I had programmed before in order to get beautiful outputs and customizable plots. Here, well, I'm just testing this, starting my scans and sending all my FOMBI requests
20:03
in order to get information about the delays of my forged requests from my malicious HTML code. Here, well, I've used the bar plot from the former example in order to make a customized report that matches all the other CSS.
20:24
This is basically done by mapping all the Plotly interface into a dashboard and merging it with another one that is just about making a cufflinks plot
20:42
with a custom-defined layout. Cufflinks is a library that binds the pandas data library to Plotly plots, and so if you wanna do, I don't know, any weird plot and fully customize it,
21:00
you can choose actually any property you like.
21:32
Well, let's continue.
21:45
The hacker recap. Just remember that this is nerd-oriented software. Don't use it to please your investors. They will actually be disappointed. They don't want you to waste time coding every functionality of the layout
22:00
in Plotly. And of course, it is free software, and it's mainly focused on trying to create a snowball effect when writing your data analysis tools. And now, well, let's get into our final chapter.
22:23
Nothing more to add, just teach and play. I call it teach because Shaolin aims to make teaching and learning science easier. I call it play because for doing real science, you should actually be having fun. So now, I'll have some fun by showing you
22:42
how it is possible to turn an entire PyData talk into a dashboard.
23:00
I honestly hate how this video. Here is my graph plot dashboard. This is a kind of tribute to Miguel Vaz and his talk about finance networks with Python, which was, I think, a bunch of months ago,
23:23
at PyData, in which he talks about representing correlation matrices as graphs. Here, we have joined everything that we made in order to feed this plot with a time series data frame,
23:40
calculate a lot of correlations and metrics, and combine everything together into a graph. Here, you can see the graph representing something like seven time series of Forex data, and every link in the graph represents a value of the correlation matrix.
24:00
We're trying to represent all the information, like a snapshot of how the market behaved during a given period of time. So, first of all, we just define a mapping for every time series, and now we are defining
24:20
the mapping for our correlation matrix. As we are choosing a color palette of just five values, all of the correlation values of our matrix are represented in five different bins, so you can, just with a glance, know what's happening in the market. Here, actually, you can see that the two correlations
24:43
that are higher are the ones between AUSD and $1,000,000, and the lower ones are the bluer. I chose five bins because now the correlations that are almost zero become almost indistinguishable.
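Binning correlation values into five colors can be sketched as follows. This is an illustration only, assuming equal-width bins over the interval from -1 to 1; the talk does not say how the bin edges are chosen, and the function name and sample values are made up.

```python
def bin_correlation(value, n_bins=5):
    """Return the index (0 .. n_bins - 1) of the equal-width bin holding value."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    width = 2.0 / n_bins
    index = int((value + 1.0) / width)
    # A correlation of exactly 1.0 would land one past the last bin.
    return min(index, n_bins - 1)


# Made-up correlation values, chosen away from the bin edges.
samples = [-0.7, 0.05, 0.95]
bins = [bin_correlation(v) for v in samples]
print(bins)
```

With five bins, every value between -0.2 and 0.2 shares the middle bin, which is why near-zero correlations all get the same color.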
25:02
Of course, you can choose this mapping to do anything you like, like coloring your different time series by the properties of the graph they are defining. For example, here, you can color them by their weighted betweenness centrality,
25:21
which is a measure of how this graph is structured internally. So, we define a new color mapping, more pleasing, and now we know that the greener it is, the bigger its betweenness centrality,
25:40
and the more brown it is, the lower its betweenness centrality. Here, we have also mapped almost all the capabilities of the NetworkX library, so you can have different layouts and change how your graph is filtered. In this case, we have filtered the correlation matrix
26:00
applying an algorithm which is called minimum spanning tree, and you can also use another algorithm that's called planar maximally filtered graph. Just in case you have a really big correlation matrix, you can filter the values in order to get only the bigger or the lower ones. You want to get the bigger, just do it,
26:22
and if you want to get the lower ones, just click on invert distance, and you get your matrix filtered by its most relevant values of negative correlations. You can also choose any layout you like, which is implemented in the NetworkX library,
26:43
and well, it's actually kind of fun to play with this. You see here how it's displayed. You can filter not only your correlation matrix; if you have any other graph whose structure corresponds
27:02
to an adjacency matrix, it can also be used to plot your graphs. So if you get Miguel Vaz's video of his talk at PyData,
27:23
and you get just another screen with this open side-by-side, you will be able to follow everything he's saying in the talk without having to worry much about how it is coded, and all the examples that he's showing, because they are actually really complex to understand if you are just coding it.
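For readers following along without the dashboard, the minimum spanning tree filter mentioned in this chapter can be sketched with Kruskal's algorithm in plain Python. This is a hedged illustration, not the code behind the demo: converting a correlation rho to a distance of 1 - rho, the `invert` flag (distance 1 + rho, so the most negative correlations are kept), the labels A to D, and the correlation values are all assumptions made for the example.

```python
from itertools import combinations


def minimum_spanning_tree(names, corr, invert=False):
    """Kruskal's algorithm on a correlation matrix.

    Assumed distances: 1 - rho keeps the strongest positive correlations,
    and with invert=True, 1 + rho keeps the most negative ones.
    """
    def distance(rho):
        return 1.0 + rho if invert else 1.0 - rho

    # Every unordered pair becomes a candidate edge, shortest first.
    edges = sorted(
        (distance(corr[i][j]), names[i], names[j])
        for i, j in combinations(range(len(names)), 2)
    )

    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    tree = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


# Made-up symmetric correlation matrix for four hypothetical series.
names = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.9, 0.1, -0.4],
    [0.9, 1.0, 0.5, 0.0],
    [0.1, 0.5, 1.0, 0.3],
    [-0.4, 0.0, 0.3, 1.0],
]

tree = minimum_spanning_tree(names, corr)
inverted_tree = minimum_spanning_tree(names, corr, invert=True)
print(tree)
print(inverted_tree)
```

A tree over n series always keeps n - 1 links, so a dense correlation matrix is reduced to its most relevant connections.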
28:14
Oh, okay.
28:21
Now before summing up, I will show you another video I have here, as we would have liked, no? Perfect.
28:43
And also I'll show you how you can style any dashboard you like, just by using its state manager attribute. Here you will automatically load a bunch of widgets that are in charge of defining how your CSS layout will be. It has all the features that the ipywidgets library has,
29:03
and it allows you to set all the CSS you need by just using widgets. As we can see here in this example, it is really easy to fully customize all the dashboards you are making, just by using widgets.
29:21
If you need to change the width or size of something, this can make it really easy to do so.
29:59
Finally, I'm just saying just one more thing
30:01
about having fun or not, more than a minute, great. Finally, I would just say try to be accessible. I'm here to try to prove myself wrong,
30:22
to get some feedback, and I don't know, listen to your comments if you find some better way to do what I do. So thank you very much for coming here. Thank you everybody for listening to my first talk ever. And, well, I hope you ask some questions,
30:41
because there's still five more minutes to go. Thank you very much. Thank you very much, Guillem, for this amazing talk. Now the floor is open for.