OpenEO: Earth Observation data cubes

Formal Metadata

Title
OpenEO: Earth Observation data cubes
Title of Series
Number of Parts
295
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
OpenEO is an H2020 project that aims for a new web service standard for Earth Observation data cubes that supports data extraction, processing and viewing. Both the standard and its implementations are open source projects, with an active community of contributors. Under the hood, the client and backend implementations rely on open source libraries, such as GRASS GIS, GDAL, GeoTrellis and rasdaman, or provide a standardized interface to proprietary systems such as Google Earth Engine. This talk will show an overview of the main capabilities, and available client and backend implementations.
Keywords
Transcript: English(auto-generated)
All right, awesome. OK, well, welcome. Without further ado, this is Jeroen from OpenEO. And he's going to be talking about some cool stuff.
Yeah, hi. Thanks for coming. So OpenEO, quick recap of what it is and why we started the project. It's an H2020 project funded by the European Union. And it started, of course, with the paradigm shift that we had in Earth observation processing,
moving towards the data, and moving towards cloud. You see multiple organizations starting to build processing platforms on their respective clouds. And of course, that ends up with a bit of a problem because users are isolated in the platform that they use.
They cannot easily switch platforms. It's harder to reproduce work, et cetera. It's a fairly well-known problem. And also, the solution is fairly well-known. That is, of course, to build a standard. So that's the core of OpenEO. It's an API specification.
But when we conceived the project, we immediately said, OK, no, we don't want just the API. It's very important that, from the start, we build back-end software, multiple back-end implementations, actually, and also multiple client implementations.
Because that's what you need to get a standard that actually works. You can only verify it by implementing it. And so when we talk about OpenEO, it's also that part that we think is very important. Another important part is, of course, the open part.
We conceived it as basically an open source project. So the community was very important to us, and also the aspect of community building. One example of that is that the API itself is developed on GitHub as a project there, in the open source
way. And it's based on the OpenAPI specification. Normally, when you think of a standard, it's lots of documents. Well, it's still documented, of course. That's important. But we tried to make it easier for users.
So from the OpenAPI specification, we generate nice documentation with examples. We try to explain it very well. And also, a very important part of the OpenEO specification is actually the processes themselves. So whereas, for instance, in WPS,
you know how to invoke a process and how to define it, the standard doesn't define the processes. An integral part of OpenEO is actually the processes themselves. So you see them here. We have a whole list of them. And they're also in the repository.
Another aspect that we took into account is, of course, data cubes. Everybody knows they're quite convenient. For us, it means that you're not dealing with files anymore. You don't have to open files and write files. You get pixels being aligned automatically
when, for instance, doing multi-sensor data fusion. And of course, scalability is also important. That is, of course, not a property of a standard, but more of a back-end implementation. But at least the standard tries not to get in the way of scalability.
So that's enough of the maybe more boring stuff. Let's try to show what it actually looks like at this stage. So I built a demo in my JupyterLab. That's where I usually do my small code experiments.
The demo is also available online. This demo is about processing some Sentinel-2 data for an agriculture use case, and mostly about the pre-processing part, because the other stuff simply isn't finished yet.
In the demo, I will be using the back-end that we host at VITO. Wait, I have to see how to scroll down here. Good. Yeah, first you need to have a back-end URL. OpenEO is a web service, so you need to paste a URL.
You specify the web service. Then you can list collections if you want. And the collection is like a Google Earth Engine image collection or a layer. You find the ID, and you connect to it. It's lazy, so I just did that.
But nothing really important happened because we are still specifying what data we want to access and so on. The next thing I want to do in the demo is I want to use a nice vegetation index called the enhanced vegetation index, which still has a relatively simple mathematical formula.
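For reference, the enhanced vegetation index is commonly computed from the near-infrared, red and blue reflectances. The sketch below is plain Python on single reflectance values, not the OpenEO client code from the demo. The function name `evi` and the Sentinel-2 band mapping in the comments (B08, B04, B02) are assumptions, since the talk does not say which bands were used.

```python
def evi(nir, red, blue):
    """Enhanced vegetation index for reflectances scaled to the 0..1 range.

    Uses the commonly cited coefficients: gain 2.5, aerosol terms 6 and 7.5,
    and a canopy background adjustment of 1.
    """
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Assumed Sentinel-2 mapping: nir = B08, red = B04, blue = B02.
# The reflectance values are made up for illustration.
pixel_a = evi(nir=0.5, red=0.1, blue=0.05)
pixel_b = evi(nir=0.3, red=0.25, blue=0.2)
print(round(pixel_a, 3), round(pixel_b, 3))
```

A larger gap between near-infrared and red reflectance gives a higher index, which is why `pixel_a` scores above `pixel_b` here.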
And the Python client of OpenEO allows me to use mathematical operators to express that. So we were a bit inspired by tools like NumPy and Pandas to really make the client-side code look very elegant and normal to researchers
that like to work like this. When we run this cell, in fact, we don't really do any processing yet. This is, again, what's happening in the background. What OpenEO does for us is it constructs what we call a process graph, which
is just a graph representation of our small workflow. And if we draw it like this, you really see that multiple elements are combined. And that's what makes it possible to send the same graph to multiple back ends that
should give the same result. So that's what it looks like internally. It's more complex than the client-side code. So this is mostly for the developers working on the OpenEO tools themselves. OK, now I want to actually see my image.
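The idea of writing ordinary arithmetic and getting a process graph instead of a result can be sketched with operator overloading. This is a minimal illustration in plain Python, not the actual OpenEO Python client. The `Node` class, the `band` helper and the `to_process_graph` function are hypothetical names, and the dictionary layout (`process_id`, `arguments`, `from_node`) is only modelled on the style of OpenEO process graphs.

```python
import itertools
import json


class Node:
    """A lazy value that records how it would be computed, without computing it."""

    _counter = itertools.count(1)

    def __init__(self, process_id, arguments):
        self.process_id = process_id
        self.arguments = arguments
        self.node_id = f"{process_id}{next(Node._counter)}"

    def _binary(self, process_id, other, reflected=False):
        # Reflected operators (e.g. 2.5 * node) keep the operand order intact.
        x, y = (other, self) if reflected else (self, other)
        return Node(process_id, {"x": x, "y": y})

    def __add__(self, other):
        return self._binary("add", other)

    def __radd__(self, other):
        return self._binary("add", other, reflected=True)

    def __sub__(self, other):
        return self._binary("subtract", other)

    def __rsub__(self, other):
        return self._binary("subtract", other, reflected=True)

    def __mul__(self, other):
        return self._binary("multiply", other)

    def __rmul__(self, other):
        return self._binary("multiply", other, reflected=True)

    def __truediv__(self, other):
        return self._binary("divide", other)

    def __rtruediv__(self, other):
        return self._binary("divide", other, reflected=True)


def band(name):
    """Hypothetical placeholder for selecting one band of a collection."""
    return Node("band", {"name": name})


def to_process_graph(node, graph=None):
    """Flatten the node tree into a dict keyed by node id."""
    graph = {} if graph is None else graph
    arguments = {}
    for key, value in node.arguments.items():
        if isinstance(value, Node):
            to_process_graph(value, graph)
            arguments[key] = {"from_node": value.node_id}
        else:
            arguments[key] = value
    graph[node.node_id] = {"process_id": node.process_id, "arguments": arguments}
    return graph


nir, red, blue = band("B08"), band("B04"), band("B02")
# Nothing is computed on this line; it only builds a tree of Node objects.
evi_node = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
graph = to_process_graph(evi_node)
print(json.dumps(graph, indent=2))
```

Because the graph is plain data, the same JSON could in principle be sent to any back end that understands the listed processes, which is the portability argument made in the talk.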
I have a nice download method for that. I'm not going to be adventurous and run it live, given that the connection may not be so good here. And I get this image. At this point, there is a bit of a problem. You see that there are some white areas.
That's clouds being removed. But there is still some cloud left at the edges. That's a well-known problem with Sentinel-2. If you use the Sen2Cor scene classification, then not all clouds are filtered out. So a very traditional step is then to use some cloud masking.
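The cloud-masking step that the talk turns to next (a binary mask from the scene classification, blurred so that it also covers neighbouring pixels) can be sketched in plain Python. This is not the demo's code: the set of classes treated as cloud, the 3x3 kernel, the sigma and the threshold are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math

# Assumption: scene classification values treated as cloud-related
# (3 = cloud shadow, 8 and 9 = cloud medium/high probability, 10 = thin cirrus).
CLOUD_CLASSES = {3, 8, 9, 10}


def binary_mask(scl):
    """1.0 where the classification marks cloud, 0.0 elsewhere."""
    return [[1.0 if value in CLOUD_CLASSES else 0.0 for value in row] for row in scl]


def gaussian_kernel(size, sigma):
    """Square Gaussian kernel whose weights sum to 1 (size should be odd)."""
    half = size // 2
    weights = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
         for dx in range(-half, half + 1)]
        for dy in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def convolve2d(image, kernel):
    """2D convolution with zero padding; the kernel is assumed symmetric."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for kr, kernel_row in enumerate(kernel):
                for kc, weight in enumerate(kernel_row):
                    rr, cc = r + kr - half, c + kc - half
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += weight * image[rr][cc]
            out[r][c] = acc
    return out


def fuzzy_cloud_mask(scl, size=3, sigma=1.0, threshold=0.05):
    """Blur the binary mask and mask every pixel with a blurred value above the threshold."""
    blurred = convolve2d(binary_mask(scl), gaussian_kernel(size, sigma))
    return [[value > threshold for value in row] for row in blurred]


# 5x5 tile of vegetation (class 4) with one high-probability cloud pixel in the centre.
scl = [[4] * 5 for _ in range(5)]
scl[2][2] = 9
mask = fuzzy_cloud_mask(scl)
for row in mask:
    print("".join("#" if masked else "." for masked in row))
```

With these parameters the single cloud pixel grows into a 3x3 masked block, which is the "extend the mask into pixels that are maybe clouds" effect. A larger kernel or a lower threshold would widen the margin.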
I have the scene classification layer available. And I also have these nice logical operators that I can apply to construct my mask. That gives me a simple binary mask. And now I'm going to do something more complicated. Because if I use this, then I
will still have cloud in my image. So I'm going to blur it out a bit to extend the mask beyond the pixels that are marked as cloud into the pixels that are maybe clouds. So it's a bit of fuzzy logic here. And I'm going to apply a Gaussian blur using
a 2D convolution operation. And in OpenEO, that's just called apply_kernel. And the kernel itself is constructed using SciPy. So when we look at that result here,
here you see that my raw image is on the right. And in the middle, you see that I now have some more clouds removed. And here is the fuzzy mask that I applied. So OpenEO allows me to retrieve all of that
in real time. I actually have the timings here in the notebook. So that's on image level. Actually, OpenEO also allows me to create a viewing service. That's quite nice. So you can say, hey, OpenEO, give me a WMTS
so I can browse my map a bit. That's not in the demo yet. But I'm going to continue with looking at some time series. So now I'm going to look into time. And first, I will aggregate, take the average pixel values inside a polygon that corresponds
to an agricultural field and plot that somewhere here. And I plot the data with my mask applied and without. And although my mask values are a bit better, it's not yet smooth. So now I wanted to go for a more complex
smoothing operation. But I'm lazy. I don't like complex. So I'm going to use something that's already available in SciPy, which is an algorithm called the Savitzky-Golay filter. Now, the thing is, well, first, let's apply it
on the time series. So the green line looks a bit smoother already. So I'm happy with that. But now I want to do that on the pixel level but without actually having to re-implement it in terms of OpenEO predefined functions.
And for that, OpenEO has a very nice feature called user-defined functions, or UDFs, where you can simply write a piece of Python code that can reuse libraries and send that to the backend, which is then run either on a single pixel or on a time series of pixels, depending on how you specify
or which function you use. So I'm going to show the UDF. I load it from a file here. It's a bit of Python code. The input is actually NumPy arrays. It can use pandas, and it can use the SciPy
Savitzky-Golay filter. And I have this nice apply_dimension method available in OpenEO that I'm going to run on my data cube and then extract the time series. And then we get this result.
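What such a user-defined function does per pixel can be sketched without any dependencies. The talk's UDF calls SciPy's Savitzky-Golay implementation on NumPy arrays. The version below instead hard-codes the well-known coefficients for a 5-point window with a quadratic fit and leaves the two samples at each end unchanged, so edge values will differ from SciPy's default behaviour. The function names and the `cube[t][y][x]` nested-list layout are assumptions made for this illustration.

```python
# Savitzky-Golay smoothing coefficients for window length 5, polynomial order 2.
SAVGOL_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SAVGOL_NORM = 35.0


def savgol_smooth(series):
    """Smooth one time series; the first and last two samples are left as-is."""
    smoothed = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SAVGOL_COEFFS, window)) / SAVGOL_NORM
    return smoothed


def apply_along_time(cube, func):
    """Apply func to the time series of every pixel of cube[t][y][x]."""
    n_t, n_y, n_x = len(cube), len(cube[0]), len(cube[0][0])
    result = [[[0.0] * n_x for _ in range(n_y)] for _ in range(n_t)]
    for y in range(n_y):
        for x in range(n_x):
            series = [cube[t][y][x] for t in range(n_t)]
            for t, value in enumerate(func(series)):
                result[t][y][x] = value
    return result


# One pixel observed on six dates, with a made-up dip on the third date.
cube = [[[v]] for v in (0.30, 0.35, 0.10, 0.45, 0.50, 0.55)]
smoothed_cube = apply_along_time(cube, savgol_smooth)
print([round(frame[0][0], 3) for frame in smoothed_cube])
```

The per-pixel function knows nothing about the cube, which is the point of the UDF mechanism: the back end decides how to split the data, and the user only supplies the logic for one time series.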
Maybe I should have compared it with applying it on the polygon, but actually, it's already, again, a bit better. And the nice thing is that if I now download an image, ah, no, that's gone. Normally here, there should be an image,
but that's probably not going to happen anymore. Yeah, time is short, so let's not wait for that. Good. Back to the presentation. So those were already a few important features,
capabilities of OpenEO. But there is more that's already implemented or planned to be implemented. An important one is batch processing. All of the operations I showed were happening on the fly because they were small, working on small bits of data. The backend that I used, though, is using GeoTrellis
and Spark, so it's scalable. It can process more. But, of course, a web request can time out. So if you have requests that will take multiple minutes, you can do batch processing. And there we are looking at, for instance, retrieving time series for tens of thousands of fields
by running a batch process. Machine learning is, of course, something very important. Another nice one is that we are connecting with Sentinel Hub. Those guys have done a ton of work in integrating numerous Earth observation data layers, and we don't want to re-do that,
so we just integrate it into the backend so that we can offer layers that will fetch the data from there. And another one is, of course, integrating higher-level algorithms than the ones you have seen here. So maybe doing image segmentation, a lot of machine learning algorithms are in that category,
or maybe atmospheric correction, typical Earth observation things. To wrap up, there's also some other nice tooling being developed. One is the Hub that lists all of the available backends.
And the nice thing about these backends is that a lot of them are really built on open-source components. For instance, the EURAC WCPS backend is built on rasdaman. We have mundialis running on the actinia platform that uses GRASS GIS internally.
And ours is built on top of the GeoTrellis library, which is also a LocationTech project. And we aggregate all of them into the Hub to make it easy for users to find. Then there's the editor of OpenEO.
That's also a nice tool. There, people can have an online workspace where they can also construct these process graphs.
Here, using the boxes, it's clear over there. And then they can create, for instance, a web service out of that and browse a map. This one is using the Google Earth Engine. So we also expose some of the functionality that's in Google Earth Engine,
but through the open interface. Okay. Yep. Back to the slide show. And that's it. Thank you. Any questions?
I don't see the audience so well, so... All right. Okay, so the question was...
Yeah, the question was if there's any way to run the computations on the GPU. Currently, there's no backend that does that, to my knowledge. But the nice thing is, of course, that we can...
Yeah, indeed. And also machine learning, for instance. So that's what we are looking at at VITO and are very interested in. So we already have some custom Spark algorithms that use the GPU, but exposing that through the OpenEO interface,
maybe even in a very transparent manner, would really be nice to have. Yep, that's another one.
Actually, not yet at the moment, but we hope to have it in two weeks or so. So we've been working on it for some months. And actually, that's also the VITO backend that we are deploying there. So some backends are more tied
to a specific data center than others because they already have a lot of infrastructure in there. But we are also looking at deploying it on DIAS. So it should hopefully not take too long anymore.
So the question was that we indeed expose Google Earth Engine layers and how we integrate. So the nice thing is actually that Google is part of the project,
although they are not funded, but they are a partner. So they were so nice to help us out a little bit by basically answering our questions. So the Google Earth Engine backend is built by the University of Münster.
And Google was helping them out, telling them how to best interface. But basically, they are translating to the Earth Engine web API, I believe. And that backend is also available open source, so I think you can even look it up. Thank you.