Using GPU-acceleration to Interact with Geotemporal Data at Planet-Scale
Formal Metadata
Number of Parts | 295
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/43340 (DOI)
Transcript: English (auto-generated)
00:07
So I'm introducing Aaron Williams. He's coming from San Francisco to join us; he is a vice president with OmniSci and works on adoption of new technology platforms. And I think the keynote is all about that. You have the floor.
00:28
Thank you. I did offer to do his presentation, if they would put it up here. Luckily for you, though, they declined my offer, so you're going to be stuck with just my presentation. So happy to be here. I am going to be talking about new technologies. I'm going to hopefully spend almost my entire presentation doing demos.
00:49
So we'll get to those in a second. I only have a handful of slides, but I've already put them up on Speaker Deck, and they do have some links in them. So if you want to, you can go and just take a picture of this slide and
01:03
then go pull the other slides down or take pictures of all the slides. I don't care. And happy to be connected with all the folks here. This is a great group. One other quick thing. We changed our name from MapD to OmniSci. So some of you may remember MapD as a participant from FOSS4Gs past. It's no change to our technology or what we're doing.
01:29
It's just a change in name. So MapD is now OmniSci. Primarily we did it because of the additional work that we're doing with data scientists and the sort of direction the company's going.
01:42
Too often MapD was confused with being a map provider, and we are definitely not a map provider. Okay, great. So let's get into the topic here. I joined OmniSci about two years ago, and it happened to be that my first week at OmniSci was FOSS4G in Boston.
02:01
And it was a great sort of trial by fire for me into not only geospatial technology but also this community. Certainly one that I feel I've really been happy to become part of now over the last two years. My background with the companies that I used to work with was more focused on the big data side, not so much on the geospatial side.
02:23
Primarily that was because my experience was that those two topics were handled by two different branches of the IT tree within those organizations. You'd have your big data over here and your geo over here, and they had some similar tools but not all of them. And they had some similar data formats but not all of them. And even their definition of what it meant to be big data or what it meant to visualize data was usually completely different.
02:46
And so those two sort of camps had a very limited overlap in terms of the Venn diagram of the kinds of problems they were solving internally for these companies that we worked with. And I think one thing that I realized when I joined this community was how that's really starting to change.
03:02
That overlap is now happening more and more frequently where we're bringing geospatial data together with other data sources to be able to tell a more complete story or solve bigger problems. It's not just about sort of keeping those things siloed to themselves. And so as we've seen those sort of data types come together, we've also seen that sort of there's this new class of data.
03:27
We call it VAST or we use the acronym VAST to talk about that. It's this class of data where it has big volume but it also needs agility and tends to have a spatio-temporal component to it. So big volume, we're typically talking about hundreds of millions to hundreds of billions of rows of data.
03:46
So that's a lot of data. Take the temporal piece of this alone: anytime you get time series data, it's easy to see how quickly that grows into a very large data set. The agility is the other important piece of this. It's not just about being able to get data in real time, but it's also about being able
04:02
to query that data in such a way that you can find the interesting insights in the data. It's not about just generating reports that take weeks to create, but it's about putting the data into the hands of more people so that they can actually get the insights they need and make the decisions that they want. I really like what Julia said about this diversification of users for this data.
04:23
It's turned into something now where we want that data to be out in more hands, and therefore it needs to be more agile. And I think the spatio-temporal piece of it is pretty familiar to this audience. So the common denominator for us to be able to take this vast class of data and actually have it live up to its potential was the speed.
04:47
The ability to actually process that data fast enough so that it could be agile. If we could get a faster processing of the data, we could get higher volumes and we could get that agility. And we needed, of course, the spatio-temporal data types to be able to make sense of it all.
05:04
And so what we kept coming back to was this idea of speed, and that's really where I'm going to focus most of my talk is on how we see GPUs as being a technology that helps us get that speed that we're looking for to be able to satisfy the demands of this space.
05:22
So GPUs, massively parallel, right? Also, the memory for GPUs is built into the chip, so the memory is very, very fast. You combine those two properties together, and you can actually speed up data processing dramatically. The core of what OmniSci is is an open-source SQL database that's built from the ground up to take advantage of GPUs,
05:46
take advantage of that technology, to be able to run SQL queries hundreds of times faster than what you see on typical SQL databases. So what do I mean by faster? I can take a 10 billion row data set, I can query it, and return the results in about 200 to 300 milliseconds of time.
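To put that figure in perspective, here is a back-of-the-envelope calculation. It is an editorial sketch, not a benchmark: the only inputs are the 10 billion rows and the 200 to 300 milliseconds quoted above, and it assumes the query scans every row.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Implied scan throughput, assuming the query touches every row."""
    return rows / seconds


ROWS = 10_000_000_000  # 10 billion rows, as quoted in the talk

for latency_ms in (200, 300):
    rate = rows_per_second(ROWS, latency_ms / 1000)
    # Report the rate in billions of rows per second.
    print(f"{latency_ms} ms -> {rate / 1e9:.1f} billion rows/s")
```

Under that assumption, the quoted latencies correspond to scanning tens of billions of rows per second.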
06:08
Once you get to that kind of speed, you get the opportunity to have the agility that I was talking about, and you get the ability to process the data at the volumes I was talking about. Okay, so enough talking about it, though. You get the sort of basics of what OmniSci is, open-source database.
06:22
Let's go look at it in action, because I think that'll help to make it a little bit more tangible. So for the first demo, we've taken a billion rows of New York City taxi data; that's about seven years' worth of taxi rides in New York City. We've combined it with a million building footprints in New York City; that's polygons for the buildings in New York City.
06:46
We associated each pickup point for the taxi ride with its nearest building, and then color-coded the buildings based on the tip percentage of those rides. So what you can see on the left, let me zoom in a little bit, make it a little bit easier to see, you'll see red buildings and blue buildings.
07:06
The blue buildings are where the not-so-great tippers live, the red buildings are where the very good tippers live. And what you see is our ability to sort of query those two very large data sets, join them together, and then visualize them in milliseconds.
07:22
I'm going to zoom back out here so you can see that we actually do have all of New York, we have all of the building footprints. And you can see how quickly, as I zoom out, we recalibrate that map to show all of those polygons. I can also filter the data, so for instance, if I just want to look at the rides that paid with a credit card, I can click on that.
07:44
It's gone from a billion rides now down to just 507 million rides. We've recalculated the colors for those buildings based on which rides are now still in the filter, and recolored the buildings now on the left-hand side to show just those 500 million rides.
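The join and filter described in this demo can be sketched in a few lines of plain Python. This is an illustrative toy, not OmniSci's implementation: the buildings, rides, and field layout are hypothetical, the nearest-building lookup is brute force over centroids, and "tip percentage" is assumed here to mean total tips divided by total fares per building.

```python
from collections import defaultdict
from math import hypot

# Hypothetical toy data: building id -> centroid (x, y) in arbitrary planar units.
buildings = {"A": (0.0, 0.0), "B": (10.0, 0.0)}

# Hypothetical rides: (pickup_x, pickup_y, fare, tip, payment_type).
rides = [
    (1.0, 1.0, 10.0, 2.0, "card"),
    (0.5, -1.0, 20.0, 0.0, "cash"),
    (9.0, 0.5, 10.0, 3.0, "card"),
    (11.0, 0.0, 30.0, 3.0, "card"),
]


def nearest_building(x, y):
    """Brute-force nearest centroid; a real system would use a spatial join."""
    return min(buildings, key=lambda b: hypot(buildings[b][0] - x, buildings[b][1] - y))


def tip_pct_by_building(rides, payment=None):
    """Tips as a percentage of fares per building, optionally filtered by payment type."""
    totals = defaultdict(lambda: [0.0, 0.0])  # building -> [tip sum, fare sum]
    for x, y, fare, tip, pay in rides:
        if payment is not None and pay != payment:
            continue  # ride is excluded by the filter
        b = nearest_building(x, y)
        totals[b][0] += tip
        totals[b][1] += fare
    return {b: 100.0 * tips / fares for b, (tips, fares) in totals.items()}


print(tip_pct_by_building(rides))          # all rides
print(tip_pct_by_building(rides, "card"))  # credit-card rides only
```

Applying the filter changes which rides contribute to each building, which is why the building colors are recomputed after every click.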
08:01
And this is what I'm talking about in terms of the interactive power of having those GPUs combined with really fast software on top. The coolest chart up there is the map on the left-hand side, because we don't just use the GPU for accelerating our SQL queries, we also use it to be able to render that map. If we're sitting on top of a GPU, we might as well use it for what it was originally designed for, which is creating graphics.
08:25
So we actually take the results of the query, in this case the 1 million polygons and 500 million taxi rides, pass the results of that query back to the GPU, and have the GPU actually do the rendering for us.
08:40
So all that's coming down to the client is a few hundred K of a PNG. And that is what we mean by sort of taking full advantage of this technology. We want a database that's fast, we want visualization that's fast. You combine those things together, now we can actually start to take advantage of these sorts of VAST use cases.
09:03
I'm going to also show you, just real quick, I can scrub over here for instance and show just one year worth of the data. I can easily show how, over time, if you watch the donut there on the right, I'm just going to sort of take it forward a year at a time. You'll see how the credit cards become a far more prevalent piece of their business,
09:23
to the point where when I get to the end of these seven years, credit cards are now almost two-thirds of their business. And in the first year, it was basically zero, because they hadn't actually rolled out the credit cards into all of the taxi cabs yet. So you see how interactive it was.
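The year-by-year donut chart boils down to a grouped share calculation. A minimal sketch follows, with hypothetical sample rides and years (the talk does not name the years covered):

```python
from collections import Counter


def card_share_by_year(rides):
    """rides: iterable of (year, payment_type). Returns {year: fraction paid by card}."""
    totals, cards = Counter(), Counter()
    for year, pay in rides:
        totals[year] += 1
        if pay == "card":
            cards[year] += 1
    return {year: cards[year] / totals[year] for year in sorted(totals)}


# Hypothetical sample: an early year with no card payments, a later year with mostly cards.
sample = [(2009, "cash"), (2009, "cash"), (2015, "card"), (2015, "card"), (2015, "cash")]
print(card_share_by_year(sample))
```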
09:40
Just in the last 30 seconds, I've probably hit that server with a few hundred SQL queries. Each of those SQL queries is querying a database that is not indexed, not pre-aggregated, we haven't done any downsampling, we haven't done anything to the data, we've just loaded it into GPU memory, and now we're running those queries against that data,
10:01
and it's coming back in these 100, 200, 300 millisecond timeframes. The other thing I'll point out about this dataset, if you look at the timeline at the top, and I know for some in the back it'll be hard to see that timeline, but trust me, there's a beautiful timeline at the top.
10:20
I'm going to zoom in on that a little bit so you can actually see an interesting anomaly. Do you notice there's a dip every year? And it's very predictable. It happens every year, it happens right the week of Christmas. When I first saw that, my heart kind of broke, because I was thinking, oh you know what this means, like people get really stingy at Christmas,
10:42
the tip percentage drops precipitously at Christmas, that means people are running out of money, they're just not giving the same tips. Turns out that's not true, faith in humanity can be restored. I'm not going to tell you exactly what the answer is, but the data tells you why that dip happens every year at Christmas.
11:02
I want to hear what your ideas are, come see us at the booth, and I'll actually show you in the data why that dip happens every year at Christmas. And remember, it's not us being stingy, I promise. Okay, so this is the first demo. This demo is public, it's out on our website, so if you want to go play with it yourself, or if you have a family member who's a taxi driver in New York City,
11:26
feel free to point them to this. I can show them exactly where they should be picking up additional rides down here in Dumbo. You'll see the red buildings here at the tip of Brooklyn. The other thing I'll point out about this is we're taking two public datasets,
11:43
the taxi ride dataset and this public building footprints dataset, we're combining them together and we're creating a tool that could literally be used by thousands of taxi drivers or thousands of Uber drivers on an hourly basis to figure out where they should be going to maximize their tip revenue for a particular day.
12:03
And I want to point that out because that's the kind of agility I was talking about before. When you get the query speed to a point where it feels interactive, suddenly you open up new kinds of possibilities that you can put into the hands of individual taxi drivers, of individual users of data, so that it's not just sitting in an ivory tower of data scientists somewhere
12:22
that are pumping out reports. We want this to actually be down into the hands of the decision makers, we want this to be out there in an agile fashion for people to take advantage of. Sorry about that. It will come back, I hope. I think.
12:50
The thing you don't expect. I think it's going to come back in just a second.
13:10
Let me jump back to my slide. So that was the taxi demo, that's the first demo. As I said, it's publicly available on our website or I encourage you to come visit us and we'll be happy to walk you through it.
13:21
The link for it is at the bottom of that slide. Next, now that you've seen a demo and you've seen what I'm talking about when I talk about speed, let's talk for a second about what OmniSci is.
13:41
So I said before that the core of OmniSci is this open source database. We built it from the ground up so it's our own source code, but we welcome folks to come and join us and participate in the open source project. You can see that sort of down here at the bottom, the OmniSciDB SQL engine.
14:00
On top of that, we've built this sort of BI tool called Immerse. That's the thing you see at the top of this visualization. We combine Immerse, which is a paid product, with that database and that's what we sell to companies. But the database, if all you need is really fast SQL, we have that database available for you as part of the open source.
14:22
For the most part, OmniSci is not your store of record. You're going to already have a store of record. So we provide a number of interfaces, this is on the left hand side, for being able to get data from either streaming sources or from your data at rest and bring it into our database, which basically brings it into GPU memory to make those SQL queries fast.
14:41
We also provide a standard set of interfaces so that you can use, I don't know why it keeps blinking. One more time. We also provide a standard set of interfaces so that you can use your favorite data science tools, for instance, so Python, a Python connector.
15:02
We also have a JavaScript connector. We have JDBC, ODBC. We're working on a Julia connector. So we have a number of different ways for you to actually get access to those really fast SQL queries on the database itself. I'll just point out that the database, again, is the thing that is open source and I'll be happy to help folks get access to it if you like.
15:23
I have two other demos that I'm going to show. The first is, and now we're going to shift to sort of talking more about the open data sources that are available. There's been two that have been sort of really interesting to us. The first is OSM, probably no surprise there. Everybody sort of has an awareness of OSM.
15:43
Usually it's used on a sort of maybe city scale or even state scale basis and usually it's just certain layers that folks are looking at at any given time. We decided to see what it would look like if we brought the entirety of OSM into OmniSci, so all the points, lines, and polygons, and put them into an interactive experience.
16:02
So I'll show you what that looks like. And then secondarily, we brought all of the North American building footprints in as well. It's about 135 million building polygons that we can easily interact with using our platform. So I'm going to show you a couple of those quick demos, again, just showing kind of scale of these open data sources and what's possible.
16:20
Let me go back to this one first. So this is the entirety of OpenStreetMap at global scale. We have 117 million points. We have 170 million lines and 400-some million polygons.
16:41
We brought all that into OmniSci and you can see here it is kind of on a global scale. My apologies that it's a little dark, so it may be a little hard to see, but I'll be happy to walk folks through it. If I zoom in, it's a little bit easier. So let's look at just Bucharest.
17:02
As I zoom in, you'll now see that we're down to 189,000 points. We're going to be looking at about 80,000 lines and 100,000 polygons. This shows sort of all of the data for the Bucharest area. The polygons are a wide swath of different kinds of polygons. We've got building polygons in there.
17:21
We've got state and region polygons in there as well. So let's limit that to just, for instance, the hotel polygons, and we'll limit the points to just the hotel points. As I make these changes, what's actually happening here is the system is doing a query back to those very large data sets that we have for OSM, each with hundreds of millions of rows,
17:42
filtering out to just the things that I care about, and then returning the results and putting them into that graphic at the top. I'm going to do the same thing for the lines, actually. I'm just going to show the bikeways around Bucharest. Let me zoom in here.
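The filtering step just described, restricting features to the current map view and to one kind of feature, can be sketched as follows. This is an editorial illustration in plain Python, not the query the dashboard issues: the feature records and the viewport are hypothetical, and only the `tourism=hotel` key/value pair is a real OpenStreetMap tagging convention.

```python
def in_bbox(lon, lat, bbox):
    """bbox is (min_lon, min_lat, max_lon, max_lat) in degrees."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def filter_features(features, bbox, key, value):
    """Keep features inside the viewport whose tags contain key=value."""
    return [
        f for f in features
        if in_bbox(f["lon"], f["lat"], bbox) and f["tags"].get(key) == value
    ]


# Hypothetical features with approximate coordinates.
features = [
    {"name": "hotel in view", "lon": 26.10, "lat": 44.44, "tags": {"tourism": "hotel"}},
    {"name": "cafe in view", "lon": 26.11, "lat": 44.43, "tags": {"amenity": "cafe"}},
    {"name": "hotel elsewhere", "lon": -114.07, "lat": 51.05, "tags": {"tourism": "hotel"}},
]

BUCHAREST_VIEW = (25.9, 44.3, 26.3, 44.6)  # rough, illustrative viewport

hotels = filter_features(features, BUCHAREST_VIEW, "tourism", "hotel")
print([f["name"] for f in hotels])
```

Panning to another city only changes the bounding box; the tag filter stays the same, which is what makes the city-to-city comparison later in the demo possible.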
18:02
You can see where the bikeways are. You can also see where we are. I think this is the hotel. Zoom in a little bit more. There it is. So here we are at the Intercontinental. And you can see how close...
18:22
There's a little bit of a delay here. There it is. You can see how close we are to the nearest bike path, which is right there on the left-hand side. I'll zoom back out a little bit so you can see again at kind of city scale. And as I was mentioning earlier, I think this is where we start to see an interesting reason to have the entirety of OSM in a single database.
18:42
It's not necessarily because you're going to want to do queries across the entirety of OSM, but it's more that you have the opportunity to start to compare cities. So for instance, we've looked at Bucharest and its sort of bike and hotel correlation, where the hotels are, where the bike lanes are. Let's switch over to Calgary as a comparison
19:04
to see where we're going to be next year and see what that looks like. So by just typing in Calgary now, we've completely changed to where we are on Earth, but I can still see that same sort of filter of data. I'm seeing just the bike lanes. I'm seeing just the hotels. And I can get a sense of, wow, significantly more bike lanes,
19:21
or at least significantly more reported bike lanes. And I still get that capability of seeing where are the hotels in relation to those bike lanes. I could do, for instance, a very simple query to find out which hotels were closest to the bike lanes or which hotels in the city were a certain distance from the bike lanes and use that as a means of comparing different cities. So we have 10 different cities we might want to go to.
19:43
Having the right number of rooms that are close enough to bike lanes is important to us. How do we actually do that comparison on a city scale, but look at it at a much broader scale, look at it in terms of cities all across Canada or all across North America. Okay, so it's this idea of being able to query very, very large data sets.
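The "hotels within a certain distance of a bike lane" question can be sketched as a point-to-polyline distance test. This is a minimal illustration under stated assumptions: coordinates are already projected to a planar system in meters, each lane has at least two vertices, the sample hotels and lanes are hypothetical, and the search is brute force rather than whatever spatial operators the database provides.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)  # degenerate segment: a single point
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_lane(p, lane):
    """Distance from p to a polyline given as a list of at least two vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(lane, lane[1:]))


def hotels_near_lanes(hotels, lanes, max_dist):
    """Names of hotels whose nearest bike lane is within max_dist, sorted by name."""
    return sorted(
        name for name, p in hotels.items()
        if min(distance_to_lane(p, lane) for lane in lanes) <= max_dist
    )


# Hypothetical data, in meters.
lanes = [[(0, 0), (100, 0)], [(0, 200), (0, 300)]]
hotels = {"near": (50, 30), "far": (500, 500), "corner": (-30, 250)}
print(hotels_near_lanes(hotels, lanes, max_dist=50))
```

Running the same function over each candidate city's hotels and lanes gives a per-city count that can be compared directly.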
20:06
OSM is one example of that. I'm going to jump to a second example, which is the North American Building Polygons. Let me just check my time. North American Building Polygons. So we brought in 137 million building polygons across all of North America.
20:22
This was a data set that was generated by Microsoft using satellite data. And so you can see we've color-coded the U.S. here, or all of North America, based on the density of these buildings. You can see we have sort of the building footprints distribution. You can see the total number of buildings we're looking at right now is all 137 million of them.
20:45
Let's create a quick chart that'll just show me where sort of the most buildings are and the largest buildings are, building footprints are, across the different states. So I'm going to come in here and I'm going to choose the dimension of state, province. I'll pick number of records as the width, and then I'll color-code it by the area.
21:04
I'm going to limit this to just 20 results so I get a slightly better look, and then I think maybe red and green would be a better color-code for it. So with a few simple clicks now, again, and this is all going back to that 137 million row data set,
21:22
I'm now able to see California, no surprise, has the most building footprints. It also has some of the largest building footprints by average. If I click on just California, you'll see how quickly it'll sort of reset itself to just show the data for California. You can see the sort of bump here at about 200 square meters.
21:41
Just make sure that you're seeing that, and then if I pick another state, let's pick Michigan. If I pick Michigan, about half the number of total building footprints, but also a significantly smaller size in terms of those building footprints. You think that's probably because it's more residential, probably because it has smaller or fewer of the multifamily homes than California does.
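The chart built here (count of footprints per state, colored by average area, limited to 20 results) corresponds to an ordinary grouped SQL query. The sketch below uses SQLite from Python's standard library purely as a stand-in so that it runs anywhere; it is not OmniSci, the table and column names are hypothetical, and the areas are made-up numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buildings (state TEXT, area_m2 REAL)")
conn.executemany(
    "INSERT INTO buildings VALUES (?, ?)",
    [
        ("California", 220.0), ("California", 180.0), ("California", 200.0),
        ("Michigan", 110.0), ("Michigan", 90.0),
    ],
)

# Count of footprints and average footprint area per state, top 20 by count.
rows = conn.execute(
    """
    SELECT state, COUNT(*) AS n_buildings, AVG(area_m2) AS avg_area_m2
    FROM buildings
    GROUP BY state
    ORDER BY n_buildings DESC
    LIMIT 20
    """
).fetchall()

for state, n_buildings, avg_area_m2 in rows:
    print(state, n_buildings, avg_area_m2)
```

Clicking a single state in the dashboard amounts to adding a `WHERE state = ...` predicate to queries of this shape.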
22:03
So the sort of peak here is down in maybe the 100 square meter range. Of course, I don't have to go just by states. I can also, you know, take a lasso, for instance, and say, let's look just at the West Coast, and I'm going to go all the way up to Alaska because I can.
22:22
So now I've created a fairly complex region that I want to look at. We go back and do the query now, back to that 137 million row data set. I'm looking at, instead of 137 million, I'm looking at just 20 million now. And I return the results in, you know, what probably felt like a couple of seconds, but remember, I'm going back to California to get this information.
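A lasso selection like this one reduces to a point-in-polygon test applied to every candidate feature. The sketch below uses the standard ray-casting algorithm in plain Python; it illustrates the idea only and makes no claim about how the database evaluates the region. The L-shaped lasso and the sample points are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


# Hypothetical L-shaped lasso and building centroids.
lasso = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
centroids = {"b1": (1, 3), "b2": (3, 3), "b3": (3, 1)}

selected = sorted(name for name, (x, y) in centroids.items() if point_in_polygon(x, y, lasso))
print(selected)
```

An arbitrary hand-drawn region has many more vertices than a rectangle, so each test costs more, which is consistent with the lasso query being the slow one on the dashboard.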
22:42
So a lot of that is actually network lag. Maybe I want to take that same region and bring it over here to see what's happening in the middle part of the country. Easy for me to now switch from that 20 million to this other 25 million and see what the distribution looks like there. Let me also just bring up, for those of you that might care,
23:05
I'm going to bring up the JavaScript console here on the right-hand side. The reason I'm doing this is because I want to show you that the interface between everything I'm showing you here in the user interface and this database that I keep talking about is actually just SQL. So on the right-hand side here, you can actually see the SQL queries that are going back and forth.
23:23
You can see how complex this query is. It's a render query. It's actually the thing that says render this particular PNG and bring it back to me based on this particular SQL query. And it's because it has that fairly arbitrary region in there that I've circled that it becomes a relatively complex query.
23:40
But if I come back down here to the bottom again, you can see the execution time was about two seconds. Not that surprising for a query that is so complex. But the execution time of the other queries, which are happening for the other charts that are on this dashboard, those execution times are in the 28 milliseconds, 20 milliseconds down here.
24:02
So very, very fast query times for the simple queries, and even two seconds for us to be able to get the data back for the larger one. Not that surprising. Let me delete that region, and I'm going to zoom back into our good friends in Calgary.
24:23
Again, just type Calgary. It's going to take me so that I'm now looking at just 509,000 building footprints. It's pretty easy to see kind of where the more industrial parts of the city are because they're going to show up here in red. These are going to be the larger building footprints, typically warehouses, larger office space probably.
24:41
If I zoom in here to show just kind of the downtown, you can see it even more clearly. The downtown has much larger building footprints than sort of the suburbs. Yeah, you can see that. Maybe we want to look at just the, and I'll sort of scrub here to look at just the smaller building footprints.
25:01
So we're going to get rid of all those big buildings that are in downtown. We're going to look at just sort of what it looks like over here in the suburbs, if I sort of move the map up here a little bit. So again, the point is that this is fully interactive. Over the last minute, what I've done is hit that database with dozens and dozens of queries. I can zoom in even further to see.
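To make the "dozens and dozens of queries" concrete, here is a minimal sketch of how each pan, zoom, or scrub of the size range could map to a fresh SQL query. The column names `lon`, `lat`, and `area_m2` are hypothetical, and values are interpolated directly only to keep the sketch short; real code would use parameter binding.

```python
from typing import Optional, Tuple

Bounds = Tuple[float, float, float, float]  # (west, south, east, north)


def crossfilter_query(
    table: str,
    bounds: Bounds,
    area_range: Optional[Tuple[float, float]] = None,
) -> str:
    """Build a COUNT query for the current map view and size brush.

    Hypothetical schema: one row per footprint with lon, lat, area_m2.
    """
    west, south, east, north = bounds
    clauses = [
        f"lon BETWEEN {west} AND {east}",
        f"lat BETWEEN {south} AND {north}",
    ]
    if area_range is not None:
        smallest, largest = area_range
        clauses.append(f"area_m2 BETWEEN {smallest} AND {largest}")
    return f"SELECT COUNT(*) FROM {table} WHERE " + " AND ".join(clauses)


if __name__ == "__main__":
    # Every interaction changes the bounds or the brush, so every
    # interaction produces a new query string like this one.
    print(crossfilter_query("footprints", (-115, 50, -113, 52), (0, 200)))
```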
25:22
It's interesting that there's some additional red buildings up here. So it looks like maybe these aren't single-family homes. Maybe there's more apartment buildings up here. Let's extend this out a little bit so we can see what that looks like.
25:40
Yeah, so these are larger homes and these sort of apartment buildings that are down here in the corner. So anyway, the point is fully interactive data. I've gone from something that was the entirety of North America down to something that is, you know, looking at a single individual block very, very quickly. I'll bring up the satellite background for this just so you can see how accurate the building footprints are.
26:03
They're not quite as accurate in Canada as they are in the U.S., from what I've noticed. But they're still pretty good, surprisingly accurate for how they were obtained. All right, so I think that's enough for this sort of piece of the demo. Again, trying to drive home the idea that these open data sources can be foundational
26:24
to what you would want to do with your data sources to create sort of an agile environment. Take the data that you've got, combine it with these open geo data sources, and be able to create entirely new interfaces for folks to be able to engage with that data. And in case, you know, you're thinking, okay, great, but, you know, maybe Immerse isn't the right tool for me,
26:43
that's perfectly fine. I use Immerse because it's the easiest tool for me to use. But, of course, you can also use it through all your other favorite tools. That's what we're showing here. What we've done is create a Flask app that runs next to the OmniSci database. It uses our pymapd interface, our connector, to be able to pull and query the database,
27:05
and then translate the results into GeoJSON and pass them back, acting as a WFS server. And so we can now import that data, bring that data into other tools like QGIS, and this is now an entirely sort of open source stack, open source way of engaging with that data.
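The translation step described here can be sketched with the standard library alone. This is an illustration under assumptions, not the actual app: the row layout `(id, area_m2, wkt)` is hypothetical, only single-ring WKT polygons are handled, and in a real service the rows would come from a database cursor and the result would be returned from a Flask route.

```python
import json
import re
from typing import Any, Dict, Iterable, List, Tuple

Row = Tuple[int, float, str]  # hypothetical layout: (id, area_m2, wkt polygon)

_POLYGON_RE = re.compile(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", re.IGNORECASE)


def wkt_polygon_to_coords(wkt: str) -> List[List[List[float]]]:
    """Parse a single-ring 'POLYGON((x y, ...))' into GeoJSON coordinates."""
    match = _POLYGON_RE.fullmatch(wkt)
    if match is None:
        raise ValueError("expected a single-ring WKT POLYGON")
    ring = [[float(v) for v in pair.split()] for pair in match.group(1).split(",")]
    return [ring]  # GeoJSON polygons are a list of rings


def rows_to_feature_collection(rows: Iterable[Row]) -> Dict[str, Any]:
    """Wrap query result rows in a GeoJSON FeatureCollection."""
    features = []
    for row_id, area_m2, wkt in rows:
        features.append(
            {
                "type": "Feature",
                "id": row_id,
                "geometry": {
                    "type": "Polygon",
                    "coordinates": wkt_polygon_to_coords(wkt),
                },
                "properties": {"area_m2": area_m2},
            }
        )
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    sample = [(1, 120.5, "POLYGON((0 0, 1 0, 1 1, 0 0))")]
    print(json.dumps(rows_to_feature_collection(sample), indent=2))
```

Desktop tools such as QGIS can read a FeatureCollection like this directly, which is what lets the rest of the stack stay open source.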
27:22
So if anyone out there has large data problems, interactive data problems, they need faster data, I think, you know, we've got a great tool here that folks can use. I'm going to jump back to my slides because I just have a couple to wrap up with.
27:41
We've got socks. I know, that's what you've been waiting for. For anybody that stopped by our booth yesterday, we've got these fantastic data socks. I encourage you to come by and pick up a pair of data socks. I've got really smart colleagues here with me this week. So Tyler and Mike and Akmal, far more technical and way smarter than I am.
28:02
So come by and talk with them. We'll be happy to walk you through how you can get started and give you a beautiful pair of socks. Anybody out there wearing their OmniSci socks today? Oh, I had people promise me they were going to wear them. Okay, that's fine. Okay.
28:20
So for self-discovery, how to get started. The demos that I've shown you, the two open data demos, so OpenStreetMap and the building footprints, are not public demos yet, but they will be soon. We'll be happy to provide you with the data if you want to play with it yourself. The taxi data is public and available. We've got another probably dozen demos out there on our website
28:41
where you can go and interact with the data using our GPUs. So please go out there and play with it and see for yourself what that level of interactivity looks like. We also have OmniSci as a service, so just go to omnisci.cloud and you can spin up an instance, again, on our GPUs that you can use, upload your own data, play with it, use our Python interface, really use any of our APIs to be able to test out the platform
29:03
and not have to worry about any of the infrastructure of setting up GPUs. Of course, if you want to run it on your own GPUs or in your own cloud, that's no problem. You can download that from our website as well. And last but not least, my team runs the community. So we're the ones that are out there talking to new users, getting folks excited about the platform, so if you're interested
29:23
or if you have questions, come join us in our community platform. We'd love to hear from you and see what you've actually been able to create with it. And we'd love to see pictures of you wearing your OmniSocks, please. And with that, that's it. Thank you so much.