Cloud Web Services for the Open Data Cube

Formal Metadata

Title
Cloud Web Services for the Open Data Cube
Title of Series
Number of Parts
50
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Cloud Web Services for the Open Data Cube
Transcript: English(auto-generated)
Okay, thank you. I'm going to talk about cloud services for the Open Data Cube. Was anyone here not in Alex's talk? One? Okay, well, I wasn't going to say too much about the data cube anyway, but we'll just work our way through. We're living in a golden age of Earth observation satellites. Technology is
improving rapidly. There are multiple space agencies around the world maintaining Earth observation programs. New satellites are being launched regularly, as we already heard this morning, and the data from the old ones keeps on coming in. Broadband internet and cloud computing mean that we can now access this data in a way that we haven't been
able to before. I'm going to start by annoying any flat-earthers in the audience; hopefully none have gotten near this conference. There are two interesting orbits for Earth observation. The first is a geostationary orbit, where you stay above one longitude continuously. That's great
for focusing on one area of the world. You're a lot higher up, so you can see a much broader bit of the Earth at the same time, but you only see that one third or so of the globe. If you want to cover the whole thing, you've got to use the second sort, a polar orbit, where you're orbiting over the top of the poles and covering a little strip each time. The Earth rotates beneath you, and
you gradually cover the whole world with imagery, until those strips overlap. You get a lot of overlap at the poles and not much at the equator, again because the Earth is round. The data fresh off the satellite is not in a very useful form, so it has to go through
this preliminary processing first. Level 0 is that raw data, direct off the satellite. Once we've added georectification, so the lat-long of each point, we call that level 1. But the most interesting data is the analysis-ready data, where we
correct for things like satellite observation angle, solar incidence angle, atmospheric conditions, terrain angle, and so on. Yeah, I'll just leave that there. The most interesting satellites to us at the moment are Sentinel-2 and Landsat 8.
It's not because they're the biggest and the best or the most fancy; it's because they're the ones with the best free and open data policy. Sentinel-2 is by far superior. It has 10-metre resolution, as opposed to about 25 to 30-metre resolution for Landsat, and 13 spectral bands compared to 9, and it covers the Earth in about 10
days as opposed to 16. Plus, there are two Sentinel-2s, so you actually get it in 5 days. But the advantage of Landsat is that we have this program going right back to the 80s with high-quality images on the same orbits, so we can do some really nice longitudinal comparisons with the Landsat data.
Digital Earth Australia are funding this work. It's a platform operated by Geoscience Australia to manage Earth observation imagery for the scientific community and for the general public. They use the NCI supercomputing platform, and also commercial cloud computing platforms, or one in particular.
And so, yeah, thanks guys for letting me work on this stuff. We have the Open Data Cube. That's sort of the starting point. As Alex told you, it's an open source platform for managing Earth observation data.
It's built on xarray, which, if you're familiar with pandas, is like pandas for NumPy: it just adds an extra metadata layer over the top of your NumPy arrays. It's managed by an international consortium that includes Geoscience Australia and CSIRO, and we have outreach programs helping
developing countries access and make use of their satellite data through the cube. But until recently, we lacked a robust open source cloud publishing platform, so we wanted to change that, as it's seen as a good strategic thing for the international consortium. At the same time, DEA is pushing data into NationalMap. Some of that they were doing directly from the NCI,
but there was a perceived need for the additional flexibility and scalability of cloud deployment, and so that's where we stepped in. The first step, which Alex also talked about in more detail, was COG support.
I'm not going to repeat what Alex said, but the team at GA worked on that and did a stellar job. It works just beautifully. I was working on this Datacube OWS platform, OWS just standing for Open Web Services. It used to be just Datacube WMS, but then
we implemented WCS as well, and we had to change the name. It's an open source, lightweight web application server. It's written in Python 3, so it can integrate directly with the Open Data Cube. It basically sits on top of the data cube and publishes standard geospatial web service protocols, in particular WMS and WCS.
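To make the protocol side concrete, here is a minimal sketch of what a WMS client sends to a server like this. It only builds a standard WMS 1.3.0 GetMap URL; the endpoint, the layer name and the bounding box are hypothetical placeholders, not values from the talk.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, width=256, height=256,
                     style="", time=None):
    """Return a WMS 1.3.0 GetMap URL for a single PNG image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,  # an empty string asks for the layer's default style
        "CRS": "EPSG:3857",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    if time is not None:
        params["TIME"] = time  # optional time slice, e.g. an ISO date
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, roughly over Melbourne in Web Mercator.
example_url = build_getmap_url(
    "https://ows.example.org/wms",
    "hypothetical_landsat8_layer",
    (16100000, -4600000, 16200000, -4500000),
    time="2017-01-01",
)
print(example_url)
```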
The metadata is stored in a database, but the actual satellite data is downloaded on the fly from S3. This means that we can do a lot of on-the-fly calculations, effectively in the data cube.
We can do a lot of things dynamically very easily that would be impossible in a more traditional GeoServer-style web hosting solution, I guess. Just comparing the web map service and the web coverage service, if you're not familiar with them: the web map service is great for general-purpose web apps, whereas the web coverage service is more aimed
at scientists and data specialists. The WMS serves standard 24-bit RGB images, whereas the WCS would typically return a richer raster container format like NetCDF or GeoTIFF.
The question is how we get from this 13-band, 12-bit-per-channel raw data down to a 24-bit image. For the web coverage service, you can just request the bands that you want, and it'll package them up and send them to you.
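As a rough sketch of the "just request the bands that you want" case, the following builds a WCS 1.0.0 GetCoverage URL. The endpoint, coverage name and band names are hypothetical, and the `MEASUREMENTS` key used for band selection is an assumption: WCS 1.0.0 names range-subset parameters after a server-defined axis, so a real server may use a different key.

```python
from urllib.parse import urlencode


def build_getcoverage_url(endpoint, coverage, bands, bbox, time,
                          width, height, fmt="GeoTIFF"):
    """Return a WCS 1.0.0 GetCoverage URL requesting only the given bands."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy (lon/lat)
        "TIME": time,
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
        # Assumed parameter name for band selection; check the server's
        # DescribeCoverage response for the real range axis name.
        "MEASUREMENTS": ",".join(bands),
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical coverage and band names, for a box around Melbourne.
coverage_url = build_getcoverage_url(
    "https://ows.example.org/wcs",
    "hypothetical_sentinel2_coverage",
    ["red", "nir"],
    (144.5, -38.2, 145.5, -37.5),
    "2017-01-01",
    width=512,
    height=512,
)
print(coverage_url)
```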
But for the web map service, the protocol has this concept of styles. You can request what style you get your tiles in, and I think that's really intended for when the underlying data is vector data, so I want my roads to be green rather than red or whatever, but we can use it to supply different ways of combining those bands down into a
three-band image for visualizations and do different sorts of false color projections. In the web map service, we also have GetFeatureInfo to get the raw data on an individual pixel, which is nice. WMS is fairly clearly specified, and most clients are fairly well behaved.
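The idea of a style as a recipe for combining bands into a three-band image can be sketched in a few lines. This is an illustration only, not the server's actual style format: the band names, the style dictionaries and the 0 to 3000 scale range are all assumptions chosen for the example.

```python
def scale_to_byte(value, low, high):
    """Linearly map value from [low, high] onto 0..255, clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))
    return round(fraction * 255)


def render_pixel(pixel, style):
    """Turn one multi-band pixel (band name -> raw value) into an (R, G, B) tuple."""
    low, high = style["scale_range"]
    return tuple(scale_to_byte(pixel[band], low, high) for band in style["bands"])


# Two hypothetical styles: which band feeds each of the R, G and B channels.
TRUE_COLOR = {"bands": ("red", "green", "blue"), "scale_range": (0, 3000)}
FALSE_COLOR = {"bands": ("swir", "nir", "green"), "scale_range": (0, 3000)}

# A made-up pixel with raw values per band.
pixel = {"red": 1000, "green": 3000, "blue": 0, "nir": 6000, "swir": 1000}

print(render_pixel(pixel, TRUE_COLOR))
print(render_pixel(pixel, FALSE_COLOR))
```

The same raw pixel gives different colors under each style, which is why the style can be chosen per request rather than baked into the stored data.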
The web coverage service specification is a lot harder to read, and there's a much bigger divergence in client behavior as a result, which is challenging. So, current status. WMS is working superbly. I'll show you a demo later. We support 1.3, which is the most recent version.
We work well with most clients. There's one large commercial client that steadfastly ignores the advertised max height and max width values, and insists on requesting very large tiles, but apart from that it's pretty good.
We can do on-the-fly solar angle correction, which is handy for serving level 1 data if we don't have access to proper analysis-ready data, and we've got rich data coming from GetFeatureInfo. There are some issues around sparse data. Neither of those protocols works well with sparse data.
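The talk does not say how the solar angle correction is computed. A common first-order approach, shown here purely as an assumption, is to divide the observed value by the cosine of the solar zenith angle so that scenes captured under a low sun are brightened to match scenes captured under a high sun. The zenith cap is also an assumption, added to stop the result blowing up near the horizon.

```python
import math


def correct_for_sun_angle(value, solar_zenith_deg, max_zenith_deg=85.0):
    """Normalize a pixel value by the cosine of the solar zenith angle.

    solar_zenith_deg is the angle between the sun and the vertical:
    0 means the sun is directly overhead, 90 means it is on the horizon.
    """
    zenith = min(solar_zenith_deg, max_zenith_deg)  # avoid dividing by ~0
    return value / math.cos(math.radians(zenith))


# With the sun overhead nothing changes; at 60 degrees the value roughly doubles.
print(correct_for_sun_angle(1000, 0))
print(correct_for_sun_angle(1000, 60))
```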
So with these satellites, for any given day, any given time slice, we don't have data for the whole Earth. We just have it in these strips, and for any given bit of land, we don't have data for every day. We have data every 5, 10, 16 days, whatever it is, and the protocols aren't well designed
for that currently. That's an issue with the protocols. WCS is very much a work in progress. We're working with 1.0.0 because it's the easiest one to read and understand, but we do want to move on to the more modern protocols once we get our heads around it. TerriaJS has a native WCS client that it works very well with.
QGIS works okay. ArcGIS we're working on, and again we've got issues with sparse data. So for next steps, we're looking at WMTS support, which will get us around that commercial client that won't read the max height and max width values. That should be pretty easy, as it's a matter of converting between WMS arguments and
WMTS arguments. We're going to continue work on WCS. We're thinking about maybe doing some WPS work, so we'll have some really rich online, on-demand data processing. We need to do some initial processing of the data up front to do
some extra indexing that's not currently in the data cube itself. We're looking at folding that into the data cube. And more data, more deployments. So now I've just got time for a real quick demo, if I'm lucky. Is that going to work?
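The WMTS argument conversion mentioned among the next steps can be sketched as follows. A WMTS client asks for a tile by zoom level, column and row, and that maps to a fixed bounding box and tile size that a WMS-style renderer can use. The sketch assumes the widely used Web Mercator tiling scheme (256-pixel tiles, row 0 at the top); the talk does not say which tile matrix sets would be supported.

```python
# Half the width of the Web Mercator (EPSG:3857) world, in metres.
WEB_MERCATOR_HALF_WIDTH = 20037508.342789244


def tile_to_bbox(zoom, col, row):
    """Return (minx, miny, maxx, maxy) in EPSG:3857 for one tile."""
    tiles_per_side = 2 ** zoom
    if not (0 <= col < tiles_per_side and 0 <= row < tiles_per_side):
        raise ValueError("tile index out of range for this zoom level")
    tile_size = 2 * WEB_MERCATOR_HALF_WIDTH / tiles_per_side
    min_x = -WEB_MERCATOR_HALF_WIDTH + col * tile_size
    max_y = WEB_MERCATOR_HALF_WIDTH - row * tile_size  # rows count down from the top
    return (min_x, max_y - tile_size, min_x + tile_size, max_y)


def wmts_to_wms_args(layer, style, zoom, col, row, tile_pixels=256):
    """Translate WMTS tile arguments into the equivalent WMS-style arguments."""
    return {
        "LAYERS": layer,
        "STYLES": style,
        "CRS": "EPSG:3857",
        "BBOX": tile_to_bbox(zoom, col, row),
        "WIDTH": tile_pixels,   # fixed by the tiling scheme, not by the client
        "HEIGHT": tile_pixels,
    }


print(wmts_to_wms_args("hypothetical_layer", "hypothetical_style", 1, 1, 0))
```

Because the tile size is fixed by the tiling scheme, a client cannot ask for oversized images this way, which is the property that makes tiled requests attractive for a client that ignores advertised size limits.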
Yes, and now it's going to find... there it is. So let's maximize that. Okay, so this is Landsat geomedian data, so that's just averaged across the whole year. This one we're looking at now is Landsat 5 from 1988. This is Melbourne,
obviously, and this is Melbourne now, in 2017, from Landsat 8. You can really see the industrialization around the edges and that blowout to the west, and also the development of Docklands in there. Just a nice little comparison. I'm going to try and really quickly show you some
Sentinel-2. So grab that in Sentinel-2, we'll grab Sentinel-2A, add to the map. So as we see, sometimes those polygons show up, sometimes they don't. They're just showing us where the data is. If we zoom in a bit, they'll actually fill in with the overview data, but I'm going to pick out
an interesting date I found earlier. So up around here there's a really fierce bushfire burning. That's all smoke over here. If I switch over to this infrared false
color, so this is with green, shortwave infrared and near infrared, this here is the burnt-out area, which shows up really well, and those red bits, that's the actual fire. You can see it right through the smoke. And if I zoom back out and go down the bottom
here where it's actually green, again we've got another fire down here. You can see that fire burning quite clearly. If I go into RGB, you just see the smoke. We also have NDVI there. So this is a vegetation index; it's the
normalized difference between the red and near infrared. Green is growing, red is not growing. We conveniently have this little legend down the side for those index bands, and they are generated, again, dynamically in the code from the definition of the band math.
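The band math behind NDVI is the standard normalized difference, (NIR - red) / (NIR + red), which ranges from -1 to 1. Here is a minimal per-pixel sketch; the function name and the choice to return `None` when both bands are zero are illustrative, not taken from the server's code.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel.

    Returns a value between -1 and 1 for non-negative inputs,
    or None when both bands are zero and the index is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Healthy vegetation reflects strongly in near infrared, so NDVI is high.
print(ndvi(red=1000, nir=3000))
# Bare or burnt ground reflects more red relative to near infrared.
print(ndvi(red=3000, nir=1000))
```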
So I will leave it there and open up for questions. Just a quick question: does Open Data Cube also have Sentinel-1 SAR data? So DEA is not working with that currently,
but as Alex was saying, the Open Data Cube itself is a generic platform. It can certainly index that and ingest it, but it's not something that DEA is currently hosting for this purpose. We've got a number of products based around Sentinel-2 and the Landsat program, but
all those pictures are from my Twitter feed, so for more pretty Earth observation images, follow that Twitter. In the next steps slide, you said that you're going to work on WPS. Have you thought about WCPS? I'm not familiar with that one. That's
coverage processing, is it? Yeah, right. WPS is a sort of long-term plan. We've talked about it a few times and it's always been pushed back as a priority, but it's certainly something we're thinking about.
I have a question: what's the relationship between the Open Data Cube OWS system and the GSKY data server?
Yeah, so that's a sensitive topic. We're essentially competitors, really, but... firstly, GSKY's not open source, and there was a perceived need within the Open Data Cube consortium for a GSKY-like product that was open source, so that was one of the
motivations for setting up a competitor, essentially. And also NCI haven't been as cooperative as DEA would like in getting more services added within the NCI. But it's a sensitive topic; I've probably already said too much.