The needle and the haystack: visualizing single datapoints out of billions
Formal Metadata
Number of Parts | 141
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/68743 (DOI)
EuroPython 2023 (30 / 141)
Transcript: English (auto-generated)
00:07
I'm a senior software developer at Anaconda. Just a little bit about myself: I've been at Anaconda since 2015, and I spend a lot of my time working in open source, particularly in the HoloViz ecosystem, which is something I'll be talking a bit about.
00:20
So this is my talk, Seeing the Needle and the Haystack: single data point selection for billion-point data sets. So first a tiny bit about HoloViz. HoloViz is a collection of libraries. They're all open source, all community open source, where everybody who contributes uses the libraries before they actually start developing on them. And it's this collection of different libraries, all about visualization, that each have a sort
00:42
of focus and sort of a topic that they try to focus on, and our goal is to try to make all these different things work together. So in this talk I'll be talking mostly about Datashader, but all the plots you'll see are based on Bokeh, with HoloViz and HoloViews in the background, and Datashader itself is powered by Numba, which isn't a HoloViz project, but is something that is open source
01:02
and makes everything possible. So the first thing to say is that nowadays a gigabyte of data is really not that much. We have huge amounts of data floating around, and the thing to understand is that when you have a huge amount of data, it becomes harder and harder to understand the structure in that data. So what you want to do is find a way to get the insights that you actually need into the data sets that you have.
01:23
And from the beginning of computing we've always used some simple visualizations to view our data: simple plots, scatters, whatever it is. But as the data gets bigger and bigger, we hit this problem of how you visualize large data sets in a way that you can actually get insight into them. So data is cheap and plentiful, but insight is expensive. And a lot of the plotting tools that exist nowadays don't really deal with large data
01:43
that well, even with WebGL, and I'll talk a little bit about how our approach differs from WebGL. In particular, what's really interesting about large data is that the same approaches don't work at all, because of issues such as overplotting, which I'll describe; I'll show you what that means. So Datashader is where I'm starting.
02:00
So my whole talk is going to be a little bit about the history of Datashader, and as I go through it I'll motivate the features that we have in Datashader and I'll show you how it works and so on. So it's going to be both a little bit historical as well as giving you insights into our philosophy. So datashader.org is something that's been around since 2016, as well as Datashader of course, the library that it talks about. And it's been around and it's been able to make really nice images and pretty plots
02:23
for a long time. So this, for example, is the US Census. So 300 million samples or so, and you can actually see the United States quite clearly just from every single person that lives there. This is a really interesting plot. Every single person in the United States is actually contributing to this image, maybe
02:42
a tiny, tiny amount. And what you can see is you have of course your highly populated areas, or densely populated areas, like New York, and you have your sparsely populated areas. And what's fascinating about this is you can really see the spatial structure of the country and where people live. And part of what makes this possible is something called histogram equalization.
03:01
If you try to use a normal color map to make an image like this with either log or linear, you will fail, or normally you'll fail. And the reason is you'll have these bright spots like New York City, which have huge populations compared to the sparse points where there's no one living. And what that will do is blow out your color map. Your max will be so high that everything else gets washed out.
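To make the contrast between linear scaling and histogram equalization concrete, here is a minimal standard-library sketch. It is only an illustration of the idea, not Datashader's actual implementation; the function names and the per-pixel counts are made up for the example. Each count is mapped to the fraction of pixels whose count is at or below it (the empirical CDF), so a single very dense pixel no longer squeezes everything else toward zero.

```python
from bisect import bisect_right


def linear_scale(counts):
    """Scale counts to [0, 1] linearly between the smallest and largest value."""
    lo, hi = min(counts), max(counts)
    return [(c - lo) / (hi - lo) for c in counts]


def eq_hist_scale(counts):
    """Scale each count to the fraction of pixels whose count is <= it."""
    ordered = sorted(counts)
    return [bisect_right(ordered, c) / len(ordered) for c in counts]


# Hypothetical per-pixel counts: sparse areas plus one very dense city pixel.
pixel_counts = [1, 2, 3, 5, 8, 100_000]

linear = linear_scale(pixel_counts)      # all sparse pixels end up near 0
equalized = eq_hist_scale(pixel_counts)  # values are spread evenly over (0, 1]
```

With the linear mapping, every pixel except the dense one lands below 0.001 of the color range, which is the "blown out" color map described above; with the equalized mapping, the same pixels are spread evenly over the range.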
03:20
Even if you use log, you kind of have this problem. So the earliest thing that Datashader did is something called histogram equalization, whereby it uses the distribution of your data to warp the color map so you can actually see the spatial structure of the data that you're looking at. Here's another example of a Datashader image. It's categorical. This time it's using the racial categories in the census data set, and each color is
03:42
a different racial category. So this is Manhattan. You can probably make it out if you know the area. Okay. Okay. Let's try that. That's my... Okay. Let's see.
04:00
Is that any better? Okay. You'll see some of the other plots I think will be okay. I'm doing a lot of geo examples in this talk, but I want to emphasize that Datashader is more than just geo plots. For instance, we have attractors. These are mathematical functions or mathematical objects that are really pretty to look at.
04:22
Actually I have something that will fix the width of the slides in a second. So there are two things. Overplotting is one of them, which I'll show you. And the problem is that if you just take a normal plotting library that draws a little marker for each point in a scatter, for example, you're just going to get a big mass of points and you won't be able to make out what it is.
04:40
And then you have this dynamic range problem, which is handled by this histogram equalization in Datashader. Those are the two parts: this dynamic range issue and this overplotting issue. So I'm going to give you an example. So here is a dataset with 2.42 million samples, which for Datashader is actually really small. One of the things I should say about Datashader is it really scales to as much data as you
05:02
have. Because it's server-based, you can throw Dask at it and you can have a cluster, a huge amount of compute, machines, whatever you have, as much data as you have. If you throw compute and resources at it, you can actually get it to work for any size dataset. And I mean terabytes and up; it really comes down to how much you can afford to work Dask
05:21
and get the compute working for you. So this dataset has these columns. I'll just show you an example. Longitude, latitude, temperature, as well as heading and height above mean sea level. So this is information from gulls. So this is taken from Belgium and the north coast of Belgium, where they were tracking gulls.
05:40
They had little bracelets with GPS on them and other sensors, and this is basically the dataset of all these birds and where they went. So let's see what Datashader could do from the beginning. This is from 2016. You can make a pretty picture, and you can see that there's some spatial structure. But the problem is, number one, it's a static image, and you have no context.
06:01
Like you don't know where this is located. You have no way of seeing inside this data. It's just a static image. So this is what we had with the first sort of releases of DataShader. Now we can compare it to interactive plotting.
06:21
So here we have an example of a Bokeh plot, which is kind of classic plotting in the browser. But we have a very small subset of the samples. Because if we actually pushed all our 2 million samples, we'd probably crash the browser tab. You wouldn't be very happy. So here it's like, what, 0.5% or something? It's a small number of the samples. And now we have all sorts of things we can do, which we couldn't do before.
06:43
We can zoom, we can pan, we can hover, we can sort of get an idea of what's going on by interacting with our data. Now what we really want to do, and our big goal for HoloViz, is to make this kind of easy interaction possible with the biggest datasets you can imagine, again, if you have the resources. And everything you see is running on my laptop, so I'm not using any sort of fantastic
07:05
resources here. Now the problem with this is it's not the whole dataset, as I said. This is the overplotting issue I mentioned. It's hard to see what's going on here. You just have all these sort of circles overlapping each other. What's going on? You don't see the structure anymore. You don't have this dynamic range. You don't see anything about the data and how it's distributed in terms of where it's
07:23
concentrated and where it's not, which is related. And also you can't actually query this. By this I mean compute statistics over samples, all the samples under the cursor. You get maybe three or four gulls here, but there might be hundreds of gulls, and you can't tell, because they're going off the screen. This is just a really long list of data.
07:41
But it does have inspect, so if you have a single gull, you can just easily see what that data is. Okay. So this is really something that we want, and in 2017 we added all the functionality in Bokeh to make an event system, hooked it up with Datashader and HoloViews, and we ended up with this ability, which is we can now zoom in, put tiles behind it, get the context,
08:04
see the coast, see the structures that the gulls are sort of aligning themselves with, whether they're piers or coasts or whatever it is, and now you can zoom in and see the structure of the data using eq_hist. And this is server-side histogram equalization; the server is doing it. So this is a big improvement, but it still doesn't do all the things that we had
08:21
before. There's no color bar, for instance. This is an image that's generated by Datashader, and Datashader goes through the whole dataset and builds a 2D histogram. That's essentially what the image is, and that's what it uses to color map things with histogram equalization. With that, you could actually have a color bar, but this one doesn't have it. And again, you don't have query, you don't have inspection, but now we do have
08:41
panning and zooming, and it's the whole dataset. So a big improvement, but still not where we want to be. The next step is to say, what could we actually do if we do it slightly differently? Well, we could use something called rasterize. This time, there's no color mapping happening on the server side, it's in the browser. So this is a log color mapper.
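The aggregation step described here, where the whole dataset is reduced to a 2D histogram of counts per pixel that can then be color mapped on the server or in the browser, can be sketched in a few lines of plain Python. This is an illustrative stand-in under simple assumptions (a rectangular viewport, one count per sample); `rasterize_counts` and the sample points are hypothetical names and data, not Datashader or HoloViews API.

```python
def rasterize_counts(points, x_range, y_range, width, height):
    """Count how many (x, y) points fall into each cell of a width x height grid."""
    x0, x1 = x_range
    y0, y1 = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # outside the current viewport
        # Map data coordinates to pixel indices; clamp to guard against rounding.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


# Four hypothetical samples; the last one lies outside the viewport.
points = [(0.10, 0.10), (0.15, 0.12), (0.90, 0.90), (2.00, 2.00)]
counts = rasterize_counts(points, (0.0, 1.0), (0.0, 1.0), width=2, height=2)
```

The output size depends only on the pixel grid, not on the number of samples, which is why overlapping markers stop being a problem. Panning or zooming corresponds to calling the function again with a new `x_range` and `y_range`.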
09:01
We've got a color bar, which is really nice. We can hover and get the counts, the number of gulls that contributed to the pixel that you're hovering over. So now we have some hover, which is good. Not all the information, but some hover. We can still zoom and do all these other things, but it's not as easy to see what's going on here, because we don't have histogram equalization. It's just the log, and linear is even worse.
09:21
You can't read it. So now we have some more things, like the color bar, but we've lost the full dynamic range. Again, Bokeh 2.2, so I think this is 2021, and we actually got the histogram equalization working in Bokeh on the client side. So now it looks like what we had before, except we now have our hover, and we have
09:43
a color bar. So now we're getting very, very close to the sort of data exploration experience that you want. And note that this is something that could probably just about be handled by WebGL, at the limit of two million points, two and a half million, but it's really at the limit, and Datashader really excels beyond that. Okay.
10:00
So now we have a whole lot of stuff. The thing that we don't have is information about the individual gulls. It's just this count, which is the number of gulls, gull samples that actually contribute to that pixel. That's 441 gulls in that pixel that I'm hovering over, for instance. Okay. So what we want to do now is to be able to query the data samples.
10:21
And by querying, I mean getting all the samples with some simple spatial filtering. So around the cursor, there's a sort of delta in X, delta in Y. In that area, can I get all the samples, and can I get the information out of those samples? So this is something that we have in HoloViews, where what you can do is you can set up
10:41
an inspect operation, and this is all API that we're going to put into hvPlot to make it easier. So it's not much code, but we're going to actually simplify this. And what it does... So what it's going to do is it's going to be able to essentially look at the samples in the original data, get that data, push it to the browser, and update it.
11:01
So for now, all I've done is I've basically made some very simple thing without that, which is just a tile source, which is the coast of Belgium, and the normal raster, which I just showed you earlier, with the color bar. And as you can see, the color bar updates due to histogram equalization. You can see these strange numbers here. Now with this operation that I've added, what you can do is you can now hover anywhere,
11:24
and what's happening here is, effectively, I get the X, Y position on my cursor, send it to the server, get a delta, so I do a spatial filtering on it, based on some X and Y delta, on the original data frame. So HoloViews keeps a pipeline of all the operations, all the way back to the raw data. It can figure out where are the samples that have those X and Y in that range, figure out
11:45
something here that's just doing the closest sample, the closest gull to the cursor, and then it can send it back to this little white circle, which is following my mouse, and it gives me this hover information. So this is really great, and this really works for this kind of size of dataset. We were kind of happy with this, but what we found is that that spatial querying
12:03
becomes quite slow for the sort of datasets that we really want to work with in Datashader. 300 million, a billion points, those sizes, doing all those sort of filtering queries becomes expensive, even if you do tricks like spatial indexing. We've tried that, and it helps, but spatial indexing did not get us the performance that we wanted.
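The query-based hover described above amounts to a box filter on the original rows around the cursor, followed by picking the closest sample (or computing a statistic over all hits). A minimal sketch, assuming rows are plain dictionaries with `x`, `y` and `height` keys; the function names and the data are hypothetical and this is not the HoloViews implementation:

```python
def query_box(rows, cx, cy, dx, dy):
    """Return every row within +/-dx and +/-dy of the cursor; scans all rows."""
    return [r for r in rows if abs(r["x"] - cx) <= dx and abs(r["y"] - cy) <= dy]


def closest(rows, cx, cy):
    """Return the row nearest to the cursor, or None if rows is empty."""
    return min(
        rows,
        key=lambda r: (r["x"] - cx) ** 2 + (r["y"] - cy) ** 2,
        default=None,
    )


# Three hypothetical gull samples.
samples = [
    {"x": 1.0, "y": 1.0, "height": 10.0},
    {"x": 1.2, "y": 0.9, "height": 30.0},
    {"x": 5.0, "y": 5.0, "height": 50.0},
]

hits = query_box(samples, cx=1.0, cy=1.0, dx=0.5, dy=0.5)
nearest = closest(hits, cx=1.0, cy=1.0)
mean_height = sum(r["height"] for r in hits) / len(hits)
```

Every cursor movement triggers a pass over the data (or an index lookup plus a filter), which is cheap for a few million rows but is the step that becomes expensive at hundreds of millions of samples.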
12:21
So here's the stuff that's new this year. Well, actually, first I'll say, I kind of said this already, but what's interesting about this approach is that you get all the samples in there. So if I was hovering in New York City, it'd be all those millions of people in that tiny little area. I could actually get some statistics, some analysis, some means, whatever it is, over
12:41
that entire set of samples. And that's really powerful, but it's slow, and it's always going to be slow if you're going to have code that has to do that kind of aggregation. What we decided is we want something that we call instant hover inspection. And the difference is that often when you're looking over the data, you're not ever going to be able to see the whole data.
13:00
Because as I said, with Bokeh, even if you have hover, this is not readable. Especially if you have hundreds and hundreds of these, it's just a giant list. It's not really that helpful. What you often want to do is just get an exemplar, a single sample, a single statistic, something very simple that gives you an idea of what's in that area, instead of trying to see
13:21
all the data in that area. And that's what we call inspect. So coming to inspect. So the way this works is that before, HoloViz was doing everything. But now what we're doing is we're adding support in Datashader itself to actually do aggregates, whereby we actually get the index of the pandas data frame at that position.
13:43
So we do it in one pass. That means Datashader goes through the entire data set, builds your 2D histogram for what you see, but also keeps track of the indices in the data frame, so that you can easily look them up without this expensive XY spatial filtering step. So I might actually demo it before I talk about how it works.
14:03
Actually this is part of how it works. So here's an actual example of this happening now. We had our gulls. And if we go back to our table, you can see that one of the columns was height above sea level. So height above mean sea level. So this is the height of the gull at that point when that GPS sample was made.
14:24
What we can do is we can now do something like don't show me all the gulls, just show me the highest gull. So in a way, you can think of it as looking down on the earth and you're looking at the top gull, the one that's flying highest, obscuring all the ones underneath. So what we have is something we call a selector, where we can take the max of the height
14:42
above mean sea level and use that together with the aggregation pass that makes the image that you see. And this is basically a visualization of that. So now when I hover, I'm getting an index into a pandas data frame of that sample, which is the one that's highest, the highest gull at that point. And that basically reduces this problem of having to do a spatial selection.
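The one-pass idea can be sketched as follows: while building the per-pixel counts, also remember, for each pixel, the row index of the sample with the largest value of the chosen column. Hover then becomes a direct grid lookup instead of a spatial query. This is a plain-Python illustration under simple assumptions (rows as dictionaries, a rectangular viewport); the function names and data are hypothetical and do not reproduce Datashader's API or internals.

```python
def aggregate_with_max_selector(rows, column, x_range, y_range, width, height):
    """One pass: per-pixel counts plus the row index with the largest `column`."""
    x0, x1 = x_range
    y0, y1 = y_range
    counts = [[0] * width for _ in range(height)]
    index = [[-1] * width for _ in range(height)]  # -1 marks an empty pixel
    for i, r in enumerate(rows):
        x, y = r["x"], r["y"]
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # outside the current viewport
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[row][col] += 1
        best = index[row][col]
        if best == -1 or r[column] > rows[best][column]:
            index[row][col] = i  # this sample is the highest seen in the pixel
    return counts, index


def lookup_pixel(rows, index, row, col):
    """Return the selected sample for a pixel, or None if the pixel is empty."""
    i = index[row][col]
    return None if i == -1 else rows[i]


# Three hypothetical gull samples; the first two share a pixel.
gulls = [
    {"x": 0.1, "y": 0.1, "height": 12.0},
    {"x": 0.2, "y": 0.3, "height": 85.0},
    {"x": 0.8, "y": 0.7, "height": 40.0},
]
counts, index = aggregate_with_max_selector(
    gulls, "height", (0.0, 1.0), (0.0, 1.0), width=2, height=2
)
highest = lookup_pixel(gulls, index, 0, 0)
```

The index grid has the same shape as the image, so looking up the sample under the cursor costs the same no matter how many rows went into the aggregation.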
15:04
It's just been done with Datashader as it went through the data the first time. So using this information, this is some of the new stuff we have now. This is what we call instant inspection. It looks a little bit like what I showed before, but there the little white circle, which is hard
15:22
to see, I guess, kind of lags my cursor. And that's because every time I move my mouse, it has to talk to the server, get a response, and it has to go into the spatial selection, which is fast for this size dataset, but as I said, becomes a problem when you have huge amounts of data. But this is instant. I can move my mouse around and I can do this with much, much bigger datasets, as I'll
15:42
show you in a second. So this is kind of where we are now. You have your whole dataset. You can visualize everything you've got, no matter how much it is. Far more than you could even manage with WebGL. There's no overplotting, because you can do histogram equalization for your color map. So you've got your full dynamic range. You've got all your interactivity.
16:01
So again, the panning, the zooming, all the stuff that we like for interactivity. Color bars. We've got the tiles underneath. We can see the context. We've got the query, which is still available. It's still there as an operation, if we need it. We need all those samples. But we also have this quick, easy, fast inspection, which is really, really great to give us kind of all the interactivity that we had with a sort of standard Bokeh
16:24
plot, right? Without all the large data. So I'm going to give a little demo. As I said, there's a lot of geo, and I know not everyone cares about geo. So I'm going to quickly... Where's my terminal going? Okay.
16:40
Yes. Actually, it might still be running. This is because it's full screen. That's why. Okay. So I'm going to switch to this tab. Let's see. This is still fine. Yeah. Okay. So this is a really cool little dataset. This is doing dimensionality reduction on language. This is not my data. This is data that comes from Christopher Akiki, with his dataset which is on Hugging Face.
17:05
What it's done is basically dimensionality reduction on a corpus of language across different languages. And it's done effectively some kind of clustering. And now what we have is we can actually look at all this data, and we can see all the nice colors. We can see nice structure here. And of course we can zoom in, and we can now see all the different languages, right?
17:24
Okay. Tamil, yes. It's all Portuguese there. But here we go. All the different languages that are in each cluster. And if you imagine doing this without the hover, you'd have to have a giant legend with all the languages and all the colors, and it'd be kind of hard to see what you're looking at. But this just makes it super easy. This is all, yeah, this seems to be Catalan, Portuguese, and so on and so forth.
17:43
So this is a really pretty, really nice little example of a non-geo dataset, sort of more of a machine learning type task, showing you the instant inspection. Okay. So back to the talk. So far these datasets have not been Datashader-sized datasets.
18:02
So since this is 300 million points, I mentioned a billion in the title of the talk. We do have Datashader examples with one billion points of OpenStreetMap data. Those are available, and that's the sort of thing that we want this tool to be working with. But I will show you something which is not quite a billion, but 300 million, which I think should be good enough.
18:24
So this is going to be... This will take a... I'm actually running it now, so I'm actually starting it up on my laptop right now. It takes a second or two to come up. This is ship traffic data. Every ship... It's all ship traffic around Vancouver in the US, and each ship has a transponder that
18:41
gives you GPS coordinates of that ship, as well as things like the heading and the ID of the ship. And of course, when you have the ID of the ship, you could look up that particular ship, the name of the ship, the length of the ship, the weight of the ship, whatever, without cargo, I guess. So here it is. This is 200 million AIS pings, which are these GPS pings, with our instant hover.
19:05
You can see the name of the ship and the type of the ship as we look around. This is our instant inspection. But it also has this querying. So this little square is right here. And what it does is it's doing the querying, whereby I can actually get all the ships
19:24
in that pixel. I can even look up the vessel by its ID and find a photo of that ship, as well as the name of the ship and everything else. So it's got a drill-down aspect to it as well. So I can click around. If I click somewhere else, that will update. I know the dashboard is...
19:40
So these are new values. Let's actually click somewhere else again. It takes a second. So this is what I'm telling you about the inspection. With querying being slow, you can see it takes a few seconds. So it's based on tap instead of hover, and that's because of the size of the dataset that we're working with. But the actual hover, which is based on instant inspection, is just right there, right? And of course, I can zoom in and then pan and so on and so forth.
20:05
So the red ones are, I think, the cargo ships. You can see, just by looking around, they're mostly cargo. And these are towing ships. You can see they actually take different paths around the coast, right? So this is the sort of insight into the data that you can get with these kinds of tools
20:22
that is quite difficult without having something like Datashader running on the server for, again, huge data sets, which go well beyond what you could do with something in the browser tab with WebGL or something like that. Okay. So the last thing I'll show you is this notebook is itself a dashboard.
20:42
Everything I talked about builds up to a final dashboard. It's only a small amount of code. It uses panel, which is another HoloVis tool, which lets you make dashboards really easily. And what it does is it lets you take all your pieces together, lay them out in columns and rows, put them in a template, and so on. And so for my last example, I can actually take this notebook and actually run it with
21:05
panel serve, and it hits the serverable call at the last bit of code, and it turns into a dashboard. So this is the sort of final example of the GULs now presented as a dashboard that you could run on a server and present to someone else, give them the URL, kind of share
21:21
your insights with someone else. So this is it with a little bit more text, a little bit more information, a nice title. You can switch the mode to dark mode if you want. Your color bar, your EQ hist, we're still working on it a little bit. That's why you get negative NaNs when there's missing data, which we'll tidy up a bit. But you can see that you get all the information about the highest seagull that you're hovering
21:45
over. Again, so all the stuff, the temperature, the heading, the height above mean sea level and so on and so forth. So we've achieved a lot. We're very happy with where we're going, and we're going to have this all released
22:01
very, very soon. Everything up to the very last step, the instant inspection, has been released, and we're getting that instant inspection out soon. In fact, that's actually in Datashader already. We're just integrating it into HoloViews and hvPlot. But there's still other work that's going on. One of the things that has happened in Bokeh is the categorical color mapping can
22:21
now be done client-side. So you can see this is the server-side color mapping, or color mixing, if you've got different categories. And this is done client-side, which lets you do things like hover again but with categorical, and also have a color bar. So there's definitely benefits to doing client-side color mapping.
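The difference between the two approaches can be illustrated with a toy example in plain Python. This is a simplified sketch, not Datashader's or Bokeh's actual implementation: the function `mix_colors`, the per-pixel counts, and the blue color for towing ships are assumptions made for illustration (only red for cargo comes from the demo).

```python
from typing import Dict, Optional, Tuple

RGB = Tuple[int, int, int]


def mix_colors(counts: Dict[str, int],
               palette: Dict[str, RGB]) -> Optional[RGB]:
    """Blend category colors, weighted by how many points of each
    category fall in one pixel. Returns None for an empty pixel."""
    total = sum(counts.values())
    if total == 0:
        return None
    return tuple(
        round(sum(palette[cat][i] * n for cat, n in counts.items()) / total)
        for i in range(3)
    )


palette = {"cargo": (255, 0, 0), "towing": (0, 0, 255)}  # towing color assumed
pixel_counts = {"cargo": 3, "towing": 1}  # made-up counts for one pixel

# Server-side mixing: only the blended color reaches the browser, so the
# per-category counts are no longer available there.
server_side_pixel = mix_colors(pixel_counts, palette)
print(server_side_pixel)  # (191, 0, 64)

# Client-side mapping: the counts themselves reach the browser, which can
# both color the pixel and report the categories on hover.
client_side_payload = pixel_counts
print(", ".join(f"{cat}: {n}" for cat, n in client_side_payload.items()))
```

Keeping the counts on the client is what makes a categorical hover and a color bar possible, since both need the underlying values rather than a finished RGB pixel.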
22:41
Also, we do focus on other things, like time series. We want to have very good support in Datashader for time series. We already have antialiased lines. I'm not showing any time series, but Datashader does work pretty well with time series, with antialiasing. Though there's another step, which is getting these operations, these index operations, to work with antialiasing as well.
23:01
And that's a tricky job that Ian Thomas is working on. So yes, there's actually a sprint. There's a Panel sprint, so it's not about this particularly. It's not about Datashader or HoloViews, but Panel, which is actually used by HoloViews. And is actually an important part of our ecosystem, actually, at the base of a lot of
23:22
our packages. If you want to help out, there's a sprint at EuroPython for HoloViz, the Panel sprint. So please do check that out. To wrap up, in acknowledgments, exciting news this year for HoloViz is that we're now NumFOCUS-sponsored, which is very exciting for us. And I'd also like to thank NumFOCUS for enabling me to attend this talk today.
23:46
And lastly, of course, this is the work of lots and lots of people. Here are just some of the people involved in the last releases of HoloViews and Datashader. There are many more, but this is what GitHub throws up when you look at the release notes. So that's HoloViews, and that's for Datashader.
24:02
Okay, so that's good. It looks like we've got five minutes for questions. Please go ahead.
24:28
Yes. So that's a very good point. So we are not using the geospatial stack in this example, because you need GDAL and lots of other expensive geo dependencies, but we have a project called GeoViews, which you
24:48
can find on the HoloViz website, so that is something I do recommend everyone goes to. If you go to holoviz.org, you can have an overview of all our different packages, including Panel, but there's also GeoViews here, which if you click on that, you can
25:03
see that we actually do projections, and we do actually support all this stuff, and you need to actually start thinking about the curvature of the earth, if it's really big spaces, and that's where you need to use GeoViews, because you need the full Cartopy stack and all that geo stuff, but we do support that, yeah.
25:23
3D we don't do, but we do projections in 2D. So 3D, we can do a little bit of 3D with Plotly, but not for geo, I don't think.
25:45
Okay. So I am not an expert on any of this data. This is an example that I found that is very pretty, has some nice plots, and I describe it to the best of my ability. If you want to learn more, you can get it from the real source, which is Christopher Akiki. It's his dataset, and it's on Hugging Face.
26:06
Any more questions? I think we'll wrap it up here. Congratulations on the NumFOCUS funding, and thank you very much for contributing to the open source community. Thank you. Thank you.