Semantic querying in earth observation data cubes
Formal Metadata
Number of Parts | 351
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/68906 (DOI)
Production Year | 2022
FOSS4G Firenze 2022, 336 / 351
Transcript: English(auto-generated)
00:00
Good, I think I can start right away. Thank you all for coming. My name is Lucas, I'm from the University of Salzburg, and I'm going to present on semantic querying in Earth observation data cubes. First I want to ask a small question: who in this room works a lot with Earth observation data? Can you raise your hands?
00:22
Who would think they can get value out of Earth observation data? Good, that's a good mix. So, let me start. First I will show a bit of background, because we're talking about Earth observation data, and for those of you who attended the keynote yesterday of
00:46
the European Space Agency, you already heard that there is a lot of Earth observation data. Terabytes of new Earth observation data come in every day. There was the metaphor that if you stack all these HD discs, you get the height of the Eiffel Tower in new data every day.
01:02
And much of this data is actually freely and openly available for everyone, for example through the Copernicus programme of the European Union. This is just open data; everyone can use it if they want. And because there's so much data and it's also often free and open,
01:23
it is used more and more in different application domains that maybe did not use this type of data earlier. So for example hydrology or ecology or urban planning: all these different application domains start to see that they can maybe get value out of Earth observation data.
01:41
And they use it to analyze entities, events and processes that happen in the real world. Think about floods or forest fires or soil sealing or green space in cities, all this kind of stuff. So they are interested in such real-world entities and real-world processes, and in analyzing them with Earth observation data.
02:05
And data storage, data management and data access have been greatly simplified in recent years with a new, let's call it, technique
02:20
which is called Earth observation data cubes. Very simply put, this basically means that you store all the different satellite images for an area in a single cube, which has two spatial dimensions and a temporal dimension. And as an end user, you don't have to worry anymore about
02:42
resampling of the data because it was in a different resolution, or about missing timestamps, or anything like that. It's all stored in a single hypercube, which makes data access, data storage and data management much easier. But what we have to remember is that the EO data themselves, you know, they are numbers;
03:00
for example, they are reflectance values of specific radiation that the satellite captures. These numbers, these data in themselves, are not yet knowledge about what we want to analyze in the real world. So there is a step needed: we need to get from the data to the knowledge that we want to
03:22
obtain, and this is not always easy. This is a hard task that requires technical expert knowledge. Okay, so how did this use to be? We have on one side the EO data and on the other side knowledge, so we want to go from there to here.
03:41
I constantly have to bend to get to the microphone. Dutch people problems. So, in the previous situation, there's one person there and they have a toolbox in their hand. And this toolbox needs to contain all the tools, all the skills, to get from EO data to EO knowledge.
04:01
So this is data access, data storage, merging of data, resampling of data, knowing about processing power, interpreting the data, analyzing the data: the whole road from EO data to knowledge needs to be in the toolbox of that single person. Now with Earth observation data cubes this becomes easier,
04:23
because we get a data cube in the middle and, as I said, the data storage, the management and how to access it are greatly simplified. As the analyst you don't have to worry anymore about that. For example openEO, which is a project that is doing great work in this regard: they create a standardized API that you can use to access a lot of different backends of satellite data.
04:48
Usually this also works in the cloud, so that you don't have to download the data on your local machine and make sure that there is enough processing power. It is all much easier to get the data; the access to the data is greatly simplified.
05:02
So your skill set, your toolbox, does not need to contain all these tools anymore. But there is still a gap that needs to be filled, because in the end you query this data cube and what you get are still these numbers, these reflectance values for example, and that
05:23
doesn't tell you anything yet about the real-world entity, the real-world process, that you as an analyst actually want to analyze. You have to give meaning to this data and you have to interpret it before you can move on and actually do the analysis that you want to do.
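To make the point about "still these numbers" concrete, here is a minimal sketch of a toy data cube in plain Python. The layout (`cube[band][time][y][x]`), the band names and all values are assumptions made up for illustration; a real EO data cube would sit behind a data cube backend, not in nested lists.

```python
# Toy EO data cube: reflectance values indexed as cube[band][time][y][x].
# Band names and numbers are invented for illustration only.
cube = {
    "red": [
        [[0.05, 0.30], [0.06, 0.28]],  # time step 0, a 2x2 pixel grid
        [[0.04, 0.31], [0.07, 0.29]],  # time step 1
    ],
    "nir": [
        [[0.50, 0.32], [0.48, 0.30]],  # time step 0
        [[0.52, 0.33], [0.45, 0.31]],  # time step 1
    ],
}


def query(cube, band, t, y, x):
    """Return the raw value stored for one band at one space-time location."""
    return cube[band][t][y][x]


value = query(cube, "nir", t=1, y=0, x=0)
# The result is a plain number; it says nothing yet about green space or water.
print(type(value).__name__, value)
```

Accessing the cube is easy, but the interpretation step from number to concept is still missing, which is the gap the talk goes on to address.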
05:43
And this interpretation, as I said, is hard. This is a hard task that requires expert knowledge in EO data analytics, and that is hard to obtain. So can we not move to a future situation where there are three persons on this road,
06:04
where the cube on the left is containing numbers, the reflectance values, but the cube on the right is containing what we call symbolic, categorical data? It tells you directly something about that entity, event,
06:25
process, that concept that you want to analyze. For example, if I am an urban planner, which I actually am, and I want to analyze green space in cities, every location in space-time will for example tell me: here green space was observed, and here green space was not observed.
06:44
This is a direct relation to the concept. And the interpretation part, how do we represent this concept green space in terms of the EO data, in terms of the numbers, is done by someone who has the advanced technical Earth observation expertise in their toolbox.
07:06
This is the Earth observation expert. They define how we actually represent this real-world concept in terms of the data, which means that you as a domain expert finally don't need to have these skills in your toolbox anymore. And you can focus on actually analyzing the concepts that you are interested in, by directly querying this concept
07:27
from the cube, because there is a step in between. And that is what we then call semantic querying: instead of querying the raw data values that are in the data cube, you query meaningful concepts that you're interested in and that have a meaning in the real world, for example green space,
07:45
forest, water, lakes. So how does it work? What does our framework look like? It looks like this. I will go through all the steps, so don't get overwhelmed at first sight.
08:01
So, as I said, we have three different roles. We have the application expert; this is the person that in the end wants to analyze something in the real world. We have the Earth observation expert, who knows very well how to interpret Earth observation data.
08:20
And we have the software expert, who knows very well how to set up a cloud infrastructure with the data, data cubes, resampling, all this more computer-science-oriented work. One point to make is that of course these roles can all be taken on by the same person. That is completely fine, but it doesn't have to be that way anymore. You don't need to have the skill sets of all three roles.
08:48
Then we have two abstract domains, let's call them that, in our framework. Here on the right is the image domain. This is the abstract domain that contains the numbers, that contains the data,
09:01
while on the left we have what we call the semantic domain. This contains the real-world concepts, so this is a conceptualization of things that we see in the real world when we look outside. So here we have the real world; that's in the end what we're all interested in, that's what we want to analyze.
09:21
The real world is captured by Earth observation data. We have the satellites going around the Earth and they capture the real world, and then these data are numbers and they are stored in this Earth observation data cube. You can also have extra data in there, for example DEMs, everything that can help the EO expert to
09:40
accurately interpret this data and to represent concepts with it. And the software expert is then the one that constructs this data cube and the whole technical infrastructure, the cloud infrastructure. On the other side, the real world is abstracted by semantic concepts. We have to define what actually exists out there in the real world.
10:02
I mean, we cannot really look outside now, but you see built-up areas, cities, forests, mountains, lakes. These are concepts that abstract what exists in the real world, and we formalize these concepts in a so-called ontology.
10:22
So in the ontology we have a formalization of concepts that exist in the real world, and that can mean for example that we say a lake contains water, green space is made up of vegetation, green space contains trees, all this kind of
10:42
formalizations. We define for ourselves what these concepts actually mean, and we define that in an ontology. But because this is in the semantic domain, this ontology does not contain any data. We don't say at this point that green space has
11:01
a red-band value higher than 60.09, and you know, all this kind of stuff. This is real-world terminology, in which we formalize what these concepts actually are. And the ontology is then agreed upon by the community, and this doesn't have to be everyone; this can be only your working group or your project.
11:25
We are going to formalize only those concepts that we are interested in in our project. In my urban planning project I'm going to formalize what is green space, what is built-up area and what is blue space. It doesn't have to be a huge ontology
11:41
which describes the whole world; that is too big. Keep it small, keep it simple and keep it local. And the community that will agree upon this can of course contain application experts, and can also contain Earth observation experts. But together they formulate: this is what these concepts mean, this is what they are.
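A small, local ontology of this kind could be written down as a plain dictionary. The following is only a sketch: the structure, the field names and the descriptions are assumptions for illustration, not the format used in the talk's framework. The helper checks the property stressed above, namely that the ontology holds terminology and no data values.

```python
# Hypothetical mini-ontology for an urban planning project.
# It uses real-world terminology only; no thresholds, no band values.
ontology = {
    "green space": {"kind": "entity", "description": "area made up of vegetation"},
    "built-up area": {"kind": "entity", "description": "area covered by buildings and paved surfaces"},
    "blue space": {"kind": "entity", "description": "area covered by water"},
}


def contains_data_values(node):
    """Return True if a number appears anywhere in a (nested) definition."""
    if isinstance(node, bool):
        return False
    if isinstance(node, (int, float)):
        return True
    if isinstance(node, dict):
        return any(contains_data_values(value) for value in node.values())
    if isinstance(node, (list, tuple)):
        return any(contains_data_values(item) for item in node)
    return False


print(sorted(ontology))
print(contains_data_values(ontology))
```

How each concept is expressed in terms of numbers is deliberately left out here; that belongs to the mapping written by the Earth observation expert.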
12:02
Then the core role here is for the Earth observation expert, because their task is to actually map these concepts that are formalized in the ontology to the data values that are stored in the Earth observation data cube. They're going to formulate rules that say: how is this concept, green space,
12:20
how is that represented by the data that is stored in the EO data cube? So they bring their expert knowledge into the system, so that the application expert can say: I'm interested in green space. They can write a query recipe, which we will come to soon, that says I'm interested in green space, and
12:43
they don't have to know how to actually represent green space in terms of these data values. So that is how the three roles make up the whole system, and everyone does what their expertise is. And as I said, it's perfectly possible that one person takes on all three roles.
13:01
This actually happens quite a lot. But the key point is you don't have to. Okay, now I will go again through the different components. Maybe I repeat a bit, but let's see. As I said, we have the ontology, which formalizes the conceptualizations of real-world entities, events and processes. It uses real-world terminology, no data values, is agreed upon by the community, and keep it small: it doesn't have to contain everything.
13:27
That was just a summary of what I just said. The EO data cube stores the Earth observation data, and it can also store other data sources like a DEM or anything that the Earth observation expert thinks is useful to define these concepts, and
13:41
it can be accessed with a standardized API, which for example openEO is very suitable for. And it's not limited to a single software. In our system you can use Open Data Cube, you can use a file-based system, you can use any kind of software to actually store your EO data cube in. But as I said, it's not the task of the application expert to set this up.
14:03
Then we have the mapping, which is of course a core part of the system. It's a knowledge-based expert system where the Earth observation expert brings in their knowledge about how to represent real-world concepts in terms of the data. So they formulate rules that then quantify a direct
14:22
relationship between the data and the concepts. And these rules can be binary: as I said, for each pixel, each observation in space-time, you just label it: this is green space, yes, or this is green space, no. It's either true or false. But it could also be, for example, probabilistic, where I say there is a high probability that this is green space, or a low probability.
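The difference between a binary and a probabilistic rule can be sketched as follows. Everything here is an assumption for illustration: the input is a hypothetical "greenness score" between 0 and 1, and the threshold and the linear ramp are arbitrary choices, not rules taken from the talk.

```python
def binary_rule(score, threshold=0.5):
    """Label an observation as green space (True) or not (False)."""
    return score > threshold


def probabilistic_rule(score, low=0.2, high=0.8):
    """Return a membership value in [0, 1] using a simple linear ramp."""
    if score <= low:
        return 0.0
    if score >= high:
        return 1.0
    return (score - low) / (high - low)


# Hypothetical greenness scores for three observations.
scores = [0.1, 0.5, 0.9]
binary_labels = [binary_rule(s) for s in scores]
probabilities = [probabilistic_rule(s) for s in scores]
print(binary_labels)
print(probabilities)
```

Either way, the output is a statement about the concept rather than a raw data value, which is what the application expert later queries.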
14:46
And this is just what the Earth observation expert thinks is suitable; they are the expert in this part. And your rules can be very simple. You can for example say we have the concept green space in the ontology.
15:00
We say green space has a high photosynthetic activity, which means it's green vegetation, for which the Earth observation expert says: okay, then we calculate an NDVI index, and if it's higher than 0.6 it's green space, yes; otherwise, no. And this can be super simple, but to make it more accurate they can also go more complex. Again, this is the expert
15:23
knowledge of the EO expert that is brought in here. So you can look through time series of different images: how did the numbers change over time, can we learn from that? You can look at spatial neighborhoods, at shapes. So the rules can range from very simple to more complex,
15:43
depending on what the EO expert finds suitable. They can also be hybrid, which basically combines a knowledge-driven with a data-driven approach, where for example you say we first run an automated algorithm like Google Dynamic World or some other thing on our data.
16:04
We have a set of classes for each image, and then in the knowledge-based part we further customize these classes, for example merging them to really represent those concepts that you're interested in. So a lot of different approaches are possible here. And then we have the query recipe.
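The two kinds of mapping rules described above, the simple NDVI threshold and the hybrid rule that merges classifier output, might look roughly like this in plain Python. NDVI is the standard index (NIR - red) / (NIR + red) and 0.6 is the threshold mentioned in the talk; the function names, the class labels and the class-to-concept table are hypothetical and are not claimed to match any particular classifier or library.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty pixels
    return (nir - red) / total


def is_green_space(nir, red, threshold=0.6):
    """Simple knowledge-driven rule: high NDVI means green space."""
    return ndvi(nir, red) > threshold


# Hybrid approach: labels produced by some automated classifier
# (hypothetical label names) are merged into the project's concepts.
CLASS_TO_CONCEPT = {
    "trees": "green space",
    "grass": "green space",
    "water": "blue space",
    "built": "built-up area",
}


def merge_classes(labels):
    """Map per-pixel classifier labels to the concepts of the ontology."""
    return [CLASS_TO_CONCEPT.get(label, "unknown") for label in labels]


print(is_green_space(nir=0.50, red=0.05))
print(is_green_space(nir=0.32, red=0.30))
print(merge_classes(["trees", "built", "grass", "snow"]))
```

In both cases the rule lives in one place, so the application expert only ever refers to "green space" by name.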
16:22
So then the application expert references the concept that they are interested in, in my example green space. And they ask the cube: okay, give me green space for the area in space-time that I'm interested in. And they get some cube like this, where each pixel, each observation has a direct
16:41
relationship to the concept they're interested in. Then they can use array-specific processes to further customize this symbolic, categorical cube. They can for example say: I want to count the green space observations over time, so that for each location in space I know, hey, in this year we found green space six times here, five times here.
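Counting green space observations over time amounts to reducing the categorical cube along its time dimension. A minimal sketch, assuming a boolean cube laid out as `green[t][y][x]` with invented values:

```python
# Categorical cube: green[t][y][x] is True where green space was observed.
green = [
    [[True, False], [True, True]],   # time step 0
    [[True, False], [False, True]],  # time step 1
    [[True, True], [False, True]],   # time step 2
]


def count_over_time(cube):
    """Reduce the time dimension by counting True values per location."""
    n_rows = len(cube[0])
    n_cols = len(cube[0][0])
    return [
        [sum(1 for layer in cube if layer[y][x]) for x in range(n_cols)]
        for y in range(n_rows)
    ]


counts = count_over_time(green)
print(counts)  # one count per spatial location
```

The result has only the two spatial dimensions left, so it can be shown directly as a map.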
17:03
So you can reduce it over time or over space, you can filter, you can merge different cubes. There are a lot of array-specific processes that you can apply to this queried categorical subset of the EO data cube. And we named each of these processes by a single action word, a verb, which makes it very clear,
17:24
hopefully, for the application expert what's happening. So don't get confused by the next slide. It's just to show we have a lot of different verbs that all do a specific thing on an array. The one I showed was the reduce one, which for example says we reduce it over time:
17:40
we count all observations through time. You can filter, you can evaluate expressions, you can group it, trim it; all different kinds of operations are possible. For details, please look at the documentation, which I will share soon, or at the paper. And then the final part, which I just want to show briefly. I think time's okay.
18:02
It's just to summarize the benefits, because, as I said, this in our view lowers the technical barriers for people to get value out of EO data, because they don't need to have the skill set to actually interpret the data and they can focus on their application. But I think it also improves the structure of existing EO analysis workflows, also those of expert users,
18:26
because this interpretation, how is this concept represented by the data, is defined only once, in the mapping, and not everywhere in each script, somewhere knitted into other code. It's defined clearly in one place.
18:44
You define it once, and the whole research group, the whole project can use it, and you can easily share it. And the recipes also remain constant, because they reference relatively stable concepts like forest, like green space. So for example when the data changes, or when the techniques to interpret the data change, or when you apply it in a different area,
19:06
your mapping will be different; you have to update your mapping. But your recipe, count green space, remains the same, because green space is still green space; that didn't change. The concept is still the concept. So these query recipes remain fairly constant.
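The separation between a stable recipe and a changeable mapping can be illustrated with a toy example. This is a sketch under stated assumptions: the recipe is a plain dictionary, the two mappings use invented rules, and `execute` handles one location's time series only. None of this is the actual API of the library mentioned in the talk.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


# The recipe references the concept by name only; it contains no data values.
recipe = {"concept": "green space", "verb": "reduce", "reducer": "count", "dimension": "time"}

# Two hypothetical mappings, e.g. for different areas or sensors.
mapping_v1 = {"green space": lambda px: ndvi(px["nir"], px["red"]) > 0.6}
mapping_v2 = {"green space": lambda px: ndvi(px["nir"], px["red"]) > 0.5 and px["nir"] > 0.3}


def execute(recipe, mapping, series):
    """Apply the recipe to the time series of pixels at a single location."""
    rule = mapping[recipe["concept"]]
    flags = [rule(px) for px in series]
    if recipe["verb"] == "reduce" and recipe["reducer"] == "count":
        return sum(flags)
    raise ValueError("unsupported recipe")


# Invented observations of one location at three time steps.
series = [
    {"nir": 0.50, "red": 0.05},
    {"nir": 0.40, "red": 0.12},
    {"nir": 0.20, "red": 0.18},
]

print(execute(recipe, mapping_v1, series))
print(execute(recipe, mapping_v2, series))
```

Swapping the mapping changes the result, while the recipe object is reused untouched.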
19:24
And you don't always have to update them when the data or techniques get updated. Final part: we did a proof-of-concept implementation of this in a Python library. I will show some demo code very quickly, but there is extensive documentation, which I will also give the link to, which explains it in much more detail.
19:42
The idea is, for example, you're an application expert. You have to load a mapping which is predefined by an EO expert, so you don't create a mapping yourself; you load one that is predefined. You basically link to an EO data cube, which is set up by the software expert. You don't have to set it up yourself; you only have to link to it.
20:00
Then you set your spatio-temporal extent and some additional context, like in what CRS and time zone you want to work, et cetera. With this, your recipe just looks like: okay, I'm interested in the concept, the entity water, and I want to apply the reduce process and use the count reducer over the dimension time. But you see that here you don't reference any data.
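The workflow just described (load a mapping, link to a cube, set an extent, write a recipe, execute) can be imitated with a toy stand-in in plain Python. To be clear, this is not the library's API: the recipe structure, the `execute` function, the water rule (low NIR reflectance) and all values are assumptions for illustration, and the extent is reduced to a simple time slice.

```python
# Mapping (would come from the EO expert): concept reference -> rule on a pixel.
mapping = {("entity", "water"): lambda px: px["nir"] < 0.1}  # made-up rule

# Data cube (would be set up by the software expert): cube[t][y][x] -> pixel values.
cube = [
    [[{"nir": 0.05}, {"nir": 0.40}]],  # time step 0, a 1x2 pixel grid
    [[{"nir": 0.04}, {"nir": 0.08}]],  # time step 1
    [[{"nir": 0.45}, {"nir": 0.42}]],  # time step 2
]

# Recipe (written by the application expert): concept name plus a chain of verbs.
recipe = {
    "water_count": {
        "with": ("entity", "water"),
        "do": [("reduce", {"reducer": "count", "dimension": "time"})],
    }
}


def translate(cube, rule):
    """Turn raw pixel values into a boolean cube for one concept."""
    return [[[rule(px) for px in row] for row in layer] for layer in cube]


def reduce_verb(arr, reducer, dimension):
    """Only counting over time is supported in this sketch."""
    if reducer != "count" or dimension != "time":
        raise NotImplementedError("sketch supports count over time only")
    rows, cols = len(arr[0]), len(arr[0][0])
    return [[sum(layer[y][x] for layer in arr) for x in range(cols)] for y in range(rows)]


VERBS = {"reduce": reduce_verb}


def execute(recipe, mapping, cube, time_slice):
    """Translate each referenced concept and apply its verbs in order."""
    subset = cube[time_slice]  # temporal extent only, for brevity
    results = {}
    for name, step in recipe.items():
        arr = translate(subset, mapping[step["with"]])
        for verb, params in step["do"]:
            arr = VERBS[verb](arr, **params)
        results[name] = arr
    return results


results = execute(recipe, mapping, cube, time_slice=slice(0, 2))
print(results["water_count"])
```

The recipe mentions only the concept name and the verb; the rule that ties water to data values is looked up in the mapping at execution time.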
20:25
You reference a concept by its name. This concept is defined in the mapping, and the mapping can translate this concept into the data values in the cube. Then you execute this and you get a map of, hey, how often was water observed over time in my
20:41
spatio-temporal subset. So this is a package; it's called semantique, for semantic querying. And you can find it at this GitHub link. As I said, there's quite extensive documentation, I think, so if you want to know more, please take a look here. It's open source, so the code is out there and everybody can use it. And of course we have a paper, because this is the academic track, where you can also find more details about
21:05
our ideas and what we did. So, thanks a lot. And now, since I'm also chair, I will check for questions.