We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

DGGS: a new paradigm for spatial

00:00

Formal Metadata

Title
DGGS: a new paradigm for spatial
Alternative Title
DGGS: The New Digitally Friendly Format for Spatial Data
Title of Series
Number of Parts
52
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Byron and Adrian's talk was the fourth and final talk in the "Standards / Formats" session at FOSS4G SotM Oceania 2019, organised by OSGeo Oceania and held at The National Library in Wellington, New Zealand from November 12-15 2019. FOSS4G SotM Oceania is the coming together of Oceania's geospatial open source and open data community - with four days of workshops, presentations, a community sprint and social events.
Open setProgramming paradigmMaxima and minimaThomas KuhnWorkstation <Musikinstrument>Exception handlingUniform resource locatorApproximationCoordinate systemDiscrete groupMereologyDistanceLine (geometry)AreaPhysical system1 (number)SphereObject (grammar)BitRange (statistics)Level (video gaming)CurvatureView (database)Nuclear spaceMappingDistortion (mathematics)Standard deviationQuicksortSurfaceDecision theoryZoom lensDirection (geometry)Cellular automatonNumberMetrePoint (geometry)Moving averageSpacetimeWebsiteRight angleCircleAngleProjective planeGoodness of fitMultiplication signPurchasingMeasurement
Numerical digitTrailWeb pageLatent heatSynchronizationSpeciesCellular automatonPoint (geometry)Vector spacePolygonPhysical systemSingle-precision floating-point formatPoint (geometry)Square numberPolygonAnalogyCellular automatonSubject indexingApproximationAreaMetreLine (geometry)Vector spaceStandard deviationOpen sourceLevel (video gaming)NeuroinformatikMappingMereologyRegular graphVisualization (computer graphics)CubeString (computer science)MeasurementCoordinate systemMultiplication sign5 (number)BuildingKey (cryptography)HexagonOrder (biology)1 (number)Computer fileDiscrete groupShape (magazine)IdentifiabilityCartesian coordinate systemÜberlastkontrolleAnnihilator (ring theory)MetadataQuicksortVolumeRaster graphicsGoodness of fitCASE <Informatik>Object (grammar)SpacetimeError messageProper mapPentagonCollatz conjectureRight angleComputer animation
Digital signalAnalogyWaveSquare numberTriangleCellular automatonDifferent (Kate Ryan album)Process (computing)Price indexUniqueness quantificationAddress spaceAreaSpacetimeCodierung <Programmierung>CurveHierarchyHexagonOrthogonalityLinear mapCartesian coordinate systemCoordinate systemEllipsoidSurfacePlane (geometry)Degree (graph theory)MetreFraction (mathematics)IntegerClassical physicsMaß <Mathematik>Real numberNumberOperator (mathematics)AlgebraSingle-precision floating-point formatImage resolutionMetadataSimulationFinitary relationInheritance (object-oriented programming)ImplementationBinary fileGeometryHash functionFunction (mathematics)Compact spaceDatabase normalizationoutputInterior (topology)Boolean algebraCovering spacePhysical systemSymmetric matrixOperator (mathematics)Equaliser (mathematics)Slide ruleQuicksortAreaCode2 (number)SpacetimeStandard deviationBitGroup actionWeb browserVolume (thermodynamics)Physical systemImplementationPoint (geometry)Address spacePolygonOverlay-NetzIdentifiabilityVoxelFundamental theorem of algebraType theoryEndliche ModelltheorieMultiplication signNeuroinformatikSet (mathematics)Series (mathematics)MultiplicationInheritance (object-oriented programming)Loop (music)Cellular automatonProcess (computing)DiagramCoroutineNormal (geometry)WebsiteCombinational logicPresentation of a groupCASE <Informatik>DatabaseSubject indexingWeb applicationNetwork topologyMusical ensembleReal-time operating systemTerm (mathematics)Library (computing)WeightWaveformRing (mathematics)Cartesian coordinate systemCurveString (computer science)WritingFitness functionPlanningAnalogySingle-precision floating-point formatData conversionHexagonDifferent (Kate Ryan album)DistanceDevice driverSoftware developerShape (magazine)Discrete groupTable (information)GeometryReferenzmodellHash functionMereologySpace-filling curveNumberComputer programmingParameter (computer programming)Scaling (geometry)Mathematical analysisCalculationExtension (kinesiology)Mechanism designDimensional analysisMathematicsVulnerability (computing)TriangleMappingSquare numberDegree (graph theory)Discounts and allowancesInstance (computer science)TheorySoftware testingOpen setCoordinate systemScripting languageJava appletInformationLecture/Conference
Transcript: English(auto-generated)
Thank you. Wow, big crowd. Yeah, it's interesting to follow a talk on coordinate systems because this is a system that doesn't use coordinates to locate things on Earth. And yeah, I'm gonna be talking about discrete global grid systems. This is a standard that from both ISO and the OGC led by people here in New Zealand.
Robert Gibb of Landcare and Matthew Purse from GA in Australia, and then there's a third party from Canada. But anyway, I just get into this a little bit, but I'm gonna tell a start with a story. Kim Jong-un a few years ago got the shiny new toy and ballistic missiles on top of which he can mount his
other shiny new toy, nuclear warheads. It scared a lot of people and the news agencies thought it'd be really good to draw some maps that showed the range of these missiles. And this is what the Associated Press actually published.
Anyone see an issue with the way this map is drawn? Yeah, it's a flat world view of the Earth. This is not the range of those missiles because that's a projected map. And it's taken an orange, projected it out, and flattened it out, and stretched and and, you know, sprunched it up, and that's what you end up with. It wasn't just the Associated Press though.
This map on that side was that, well, they're both from the same publication, but this was the Economist magazine. That looks like evener circles, so they didn't even try to adjust at all for it, but at least they issued a retraction. And you look at the map, this one, and you see, wow,
those missiles do actually cover most of the Earth. That's kind of scary. But it doesn't make intuitive sense to most people looking at that, you know. I'm kind of trained to think the Earth is flat. A better way of presenting it might have been like this, although the longest-range missile, you can't tell how close it comes to New Zealand on this map. You'd have to do another one that shows that side and such, so a little problematic there.
How did we get to this sort of understanding? Well, for years we've been needing to look at the Earth as a flat surface. So we have the Mercator projection that's really good at showing you which way to point your ship if you want to go from England to the United States, or which direction to pray if you're
Muslim and need to point to Mecca. You know, that works actually really well. It preserves angles really nice, but it doesn't do so well on areas or distances. And Google perpetuated this and other map engines, most of them. They all use a sort of Mercator view to map the Earth. Although Google, I have to give them credit. Now when you zoom out,
this is what you get, kind of more a Google Earth and a Google Maps sort of view. And so they've kind of decided, yeah, not so good to show the map as a flat Earth, since too many people believe that already. And some of the problems with it. Well, you know, you look at a map like this and you see why maybe
Donald Trump thought Greenland was a good purchase because it's as big as Africa. If he knew that was actually no bigger than some of the larger countries in Africa, maybe he wouldn't have been so excited. And there's other issues, you know, we don't just have area issues, we have distance issues. So, you know, you draw, take a map and you're going to draw straight lines from one point to another and those should be the
shortest distances, right? You know, got to be, you know, that's the way things work. Two points, a straight line between them, shortest distance. But that only works if the Earth is flat. The Earth isn't flat. You take those same lines, you put them on a globe, and they look funny curved like this.
And it's like, why would you go those directions as a shortest path? It doesn't make sense. But that's because it made sense on our conception of what the Earth is, not how the Earth really is. So if we didn't flatten things out and we use a system, this is a discrete global grid, all the cells all nested together there, and by using this we don't really flatten the Earth except for at that cellular level and
at the area where you really need to. All the data is stored in ways that are friendly to a sphere or a and that allows our shortest distance to be accurate.
So yeah, how'd we get here? Again, kind of getting back into the coordinate and discussion. We have several different coordinate systems out there, these blue ones that kind of have a curve and get closer at the top. That's your lat-long. You know, and that's how we how the the Earth is kind of mapped out. But lat-longs do not give you anything you can really measure area or distance on. They give you a coordinate space on the Earth.
That's what GPS gives you, too. So there's problems there. And what we've traditionally done is use projected coordinate systems. All those numbers on the side is a coordinate system with Eastings and Northings and meters that you can actually measure things now, distances and areas. That's really nice.
Of course, it's all approximation. This is a curved surface, too. It's just a smaller part of the Earth, so the distortions are less. But so that's that's how we've done things for a very long time to kind of get approximations of measurements. That's not really that accurate. Another problem that you have with the coordinate system is
that's kind of developed more in the computer age, when you start getting computer maps, is how big is a point? Yeah, you have you have points that when you draw them on a paper map, you kind of know the accuracy. It's accuracy of the paper map, right? You can't get more accurate than what the map is actually showing you. When you abstract those points away from a paper map
and it's in a computer, it's just sitting there as a point value. How accurate is that? You don't know without the metadata. In a discrete global grid system, we get around that because everything is an area. I'll get into that in a minute. And I'm back starting right now. So describing what a DGGS is.
Soccer ball's a good analogy. You take the spirit, you take a solid shape and you wrap it around the Earth. You shrink wrap it so it fits. Even and some of them are kind of like this. The ones that use hexagons intersperse it with pentagons in order to make it actually wrap properly. But the key is at any level, all those cell sizes are the same size and they have indexes on them.
Another way of doing is with squares. So you imagine taking a square and shrink wrapping that or a cube and shrink wrapping that to the Earth. But we're not shrink wrapping in this case in order to show kind of the indexing system. So you have all these spaces at that top level in in a volumetric sort of
shape and that's got a face on it. Face A, B, C. And then they have a regularly regular indexing system inside of each of those. So you have face you have one, two, three, four things. There's some divided in four parts in this example. And each of those is divided the same way and you can go on down. You could divide three into one, two, three, four the same way. That gives you a unique index for anywhere on the Earth
no matter how fine a point you want. It can be sub centimeter. It can be sub millimeter. Who really cares? Of course with the drifting continents, why would you ever do that? It's gonna be if you get that fine, it's gonna be by tomorrow be off. But yeah, it's but you can do it. You can do it in this. And it only takes a single
string to identify anywhere on Earth. Not a coordinate pair, which is a little more difficult for computers. So here, yeah, another just go through this quickly. It's kind of a visualization of the same thing taking the cube Dividing it into nine parts in this one, which is one that Robert Gibbs been using quite a lot.
Formerly of Landcare or Honorary Landcare or something. And yeah, keep on going down and going down and going down. That's what that kind of says in this regular indexing system. One of the beauties of that is because they're indexed the same way It's actually fairly intelligible to people because you know that nine is going to be next to eight or next to
What would it be? Well, they don't have nine. It's eight. Eight is the highest one on this one. That's right. Eight is next to six and next and diagonal from three and fives above it. So you kind of always can do your navigating just knowing that and building that in. It's really kind of cool. Actually kind of human friendly.
The other thing you can do with the discrete global grid system is describe areas on Earth with a single array, dimensional array of identifiers. So this is a map of California filling it in with a few shapes as possible that actually creates a pretty small file that creates that
polygon, filled in polygon of California. As you get to the edges the polygons get smaller and smaller and it all nested inside there and such as they normally are in DGGS and that gives you quite a simple way of describing an area. This happens to be,
Uber developed this one. It's called H3. It's open source. You can go and download all the tools and everything. They developed it in order to do congestion pricing. So most of the tools around it are optimized for that use and not so much general use. But it's an interesting application. They deviate in some small ways from a general DGGS as we've described it in the standards, but it's still highly useful and a good approach.
And vectors. You'd think it would be just for raster. It's kind of a raster idea. You have cells and cells and cells, but it actually works for vectors, too. You have to think though that instead of a point being a zero dimensional object, a point is
a cell. But it's the cell that's an approximation of the precision, you know, the error. It's not an ellipsis that you'd normally have as an error, but it's a good approximation of it. You know, it's like, well, the cell size is 10 centimeters. Well, I know my accuracy of my data is no better than 10 centimeters anyway, so what does it really matter?
A vector, you can do a line. Then you can just string all the cells together, or you could take the points and say it's from this cell to this cell to this cell to this cell, just like you would do with points in any GIS. And polygons, well, I kind of already explained those, and they could also be done using that vector system.
I like to think of DDGS as similar to what we did with digital music. You know, it would have been possible to take waveforms and the mathematics behind them, the same analog systems we kind of did on vinyl, and describe those in mathematics so that we could write computer programs and know how to play the music back.
But we didn't do that. Instead, it was far more efficient to approximate it with these kind of step curves. And if you make those steps small enough, then the difference between that and what you would get from the waveform is indistinguishable, and it's far easier to scale, far easier for the computers to do something with. We didn't do that with GIS.
Essentially what we did with GIS, we took our old analog systems and we just plopped them into the computer where the fit isn't necessarily all that good for a lot of things. So again, just kind of summarizing what a DDGS is, we have basically three shapes that really work to do it.
They have their strengths and weaknesses, but yeah, the triangle, the square, and the hexagon, and it won't go into that too much. Hexagons are nice because the distance from the center of any hexagon is the same to any hexagon around it, so doing distances is really cool, but they don't nest quite so well, and so I get really confused on how they do the indexing on hexagons.
People who know more than me, they insist it works right, and it's a very popular way of doing it. But one of the things that's really cool, and that's what makes it really computer-friendly, is that you have a single dimensional array of values that's really easy to put an identifier on. In fact, the string itself could be an identifier if you set it up right.
With that, it's really easy to link it to other data and other more traditional type systems. It creates a real lightweight way of doing geospatial operations. So yeah, this is just kind of a summary table that talks about some of the differences between traditional GIS and discrete global grid systems.
So yeah, space-filling curves as your identifier and the nested shapes of equal area is how it really works. You really don't have to work with projected plane type issues, to a certain extent you do, because at some point you're going to kind of flatten it.
But in all the calculations and everything, it's going to give you true shortest distance and something much truer to the area. And one of the ways it's been used quite a lot is to take big data and move everything into this and be able to do your spatial analysis on top of it.
It scales tremendously well. We're kind of more interested in some of the lower-end applications of it. Whoa, that's kind of strange on my little LSEPs, I don't know what that's about. But anyway, the spatial operators are vastly simplified. You can do your overlay
operations without having to do anything, without having to have a GIS essentially. It's just a little bit of code that you could actually embed in a browser if you want. And Adrian is going to kind of get into the more geeky bit of this talk and describe exactly how simple these things really are.
No? No? No, we're OK. Hold on to that. OK. Yeah. OK. Now I'm up here.
In fact, these operators are so simple, the code fits easily on our slides. But let's start with some fundamentals first. A computer stores all its data in a single array of numbers.
We can compact a single DGGS identifier inside one of those numbers, as shown here. And we can store multiple of those contiguously to describe an area.
Then if we arrange those correctly, we can use the remainder operator to get a good first guess at where to find a given cell in an area. This is known as a hash set and computers do them all the time.
Here's the code we use to do that detection. And the outer loop makes sure to also consider any parents of that cell because those contain the child. And here's the normalization routine to compact the data, like Byron talked about.
We look for any siblings we can replace with their parent and continue that until we've got a nice geometry like that California diagram.
Getting into some of the operators. For intersection, we actually do that process once both ways, so it doesn't matter which area contains the parent of a cell in the other area.
Here's contains and it's just like covens, except contains is defined as requiring there to be a point on the inside, which for DGGS just means is there a point or rather a cell.
For overlaps, we look to see whether there's any cells on the inside and or outside, and if both are the case, then it's true.
Moving on to some of the combinator operators. Union keeps all the cells from both areas and discards any duplicates using that normalization routine I showed previously and intersection takes all the cells that are in common between the two areas.
Difference takes all the cells that are in one area but not the other and symmetric difference is the union of both differences.
Yeah, so that's a kind of a quick glance at how the code is so simple. And that's quite a powerful thing if we want to move to web-based mapping, say, embedding a lot of your GIS inside a browser or just some very lightweight tools.
One application that I thought, that I've heard from W3C that's been of interest is to have a ring fencing. So drones can't fly into an airport or you can't play Pokemon Go in Auschwitz were the two examples that they brought up. And with this, it's light enough weight you could embed that code with all the values and
everything if you had a standardized GGS you were working against in the chips on those devices. It's that light weight that it could work that way. So yeah, as I said, you can find out more at the open geospatial address there. As I said, some local people here in this region have been leading that and our libraries are available at the NZOSS site.
And yeah, if you have any questions feel free to ask.
Sorry, my question was really about how does this work in terms of real-time performance. For instance, if I want to find all points in a database that are within my polygon area, how does that compare with, say, the more traditional spatial trees that the spatial indexes and databases use?
Very, very fast. In fact, that's one of the use cases that's been the most explored. PIXIS or GGS, as they're calling themselves now out of Calgary, has been using this quite extensively.
They're kind of a commercial outfit, been using this quite extensively. And they have some wonderful presentations on doing this in OGC experiment that was for Antarctic SDI or Arctic SDI, I should say. And the speed of being able to create analysis, that point type analysis, is just massive.
It's really, really fast. It's far beyond anything that you can do with other systems. You know, it's essentially raster processing in a way and just selecting cells that have a particular value. But yeah, it works really well in the nested system. And my answer is that I could personally do some more testing, but it does appear very fast in practice to me.
I've done some. And in theory, it should be significantly faster, and you can still use indexes on it.
Hi. Thanks for the presentation. It was great. Just having a quick look at libdggs online, and I can see the code that you're just putting up on the slide seems to be in that library, which is cool.
Yeah. For that first bit of code, I inlined a lot to make it fit, but otherwise. And it's small. It's very compact. Yeah. I'm just wondering, is there a reference implementation in JavaScript yet? You mentioned web-based applications a couple of times.
Has that been done already, or is that a non- Not that I know of. Oh, OK. That's something for someone to grab. Yeah. Yeah, that's something that we'd like to explore as a company, and I'm looking for excuses to do so, actually. Because as I said, most applications thus far have been at the big data end of the scale.
I'm just wondering how the conversion for things like GPS data to map into this.
Because most traditional GIS data will need to be massaged a lot of conversion between the two systems. Yeah, that's something that in some of those recent developments in the OGC on the coordinate systems and such, a big driver for it was DDGS that Matthew Purce and Robert Gidded in particular were working with them to make sure that you could do that.
So the conversions are all pretty much in place, I think, in those libraries, in those standards now, at least. And the thing you do need to do is make sure you have precision information,
because that's very much a part of DDGS, which it isn't with coordinate. Yeah, and that also follows all the same. What's your reference models you're using under the hood? How you're describing the shape of the Earth and all that is important. One of the things there isn't is a standard DDGS.
They've been very careful to allow people to define their own DDGSs, so they don't have a library in there that says this DDGS of this name can be converted to that yet. The mechanisms for doing so are all there. And so you just fill in these parameters and then you could do so.
Yeah, that's quite interesting. Actually I was kind of checking myself because talking to Robert just a few days back, he's kind of taken, you know, it's all based, it's always been talked about, all based on equal area or voxels, equal volume and such.
But he kind of realized with some other pushback from other groups that really you have something that's equal. It doesn't necessarily, it could be time, it could be, you know, I don't know, from China they're trying to push some that are really based on lat, long, so they're based on degrees, minutes, seconds sort of thing,
which aren't equal in area or time but it's really useful for a lot of purposes to do that and it otherwise fits the model. But yeah, they definitely have been pushing in a lot of dimensions, a lot of excitement there actually. And in that end it's been used, those problems have been solved in a lot of 3D design type things,
you know, building airplanes and such. Some of the aircraft manufacturers use a voxel approach that's very similar to DDGS but only in a localized space.