Spatiotemporal modeling of environmental dynamics at global scale: building open multiscale data cubes


Formal Metadata

Title: Spatiotemporal modeling of environmental dynamics at global scale: building open multiscale data cubes
Number of Parts: 57
License: CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Production Place: Wageningen
Transcript: English (auto-generated)
So yes, I'll talk about some work we did, especially Leandro and me, preparing global data and doing some modeling.
We have this project, actually the first project we started, called OpenLandMap. It's really our flagship system for global data and global solutions. It was inspired by OpenStreetMap, but at the moment we don't have people digitizing or anything like that; we are focusing on hosting and making environmental data available. We would also like to establish open development communities. We have quite some data sets: land cover, soil, climate, land degradation, at different resolutions, up to 100 meters globally. Recently, when people ask me what OpenLandMap really wants to achieve, in one sentence: OpenLandMap would like to become the environmental history of the planet. We're planning some upgrades in 2021, and it will continue; we are adding more layers and keeping it up to date with new data layers.

This is the land cover map that was published on the Esri Living Atlas recently, I think maybe two months ago. It's a 10 meter resolution global land cover map. There was quite some discussion about it; you can follow it on Twitter, where Matt Husudan did quite some testing and analysis. But I don't want to get into that product now. I'm just mentioning it because of something I also asked Mark and Harold: we have this paradox today that remote sensing gets better and better. You can map the world at 10 meters; somebody who is good at programming, even a master student, could possibly do it in Google Earth Engine. But the paradox is that what happened in the past is actually far more interesting. Looking at the land cover today is interesting, but in any application we do, we notice that it's much more interesting what was on some pixel 50 or 100 years ago. There's this environmental history, and that's basically what I'm going to talk about today.
I want to talk about how to fill that gap in data availability. One of the global data cubes is a project in Germany called the Earth System Data Lab. They built what they call "Earth in a box": all possible biophysical grids, going from carbon and CO2 emissions to climatic variables, temperatures, vegetation, et cetera. So it's a kind of complete data cube, ready for analysis, and there's also a viewer, so it looks great. But it is, I think, 25 kilometer resolution, and it covers only about the last 20 years. So a very coarse resolution, and only 20 years. In the US there is NEO, NASA Earth Observations, which is similar: data stacked to the same grid definition, also a lot of data, primarily based on MODIS. It too covers only from 2000 to 2021 (it is kept up to date). And when you look at some layers, you will see that a bit of Canada and Russia is missing, so there are also gaps. We looked at all that and said, yes, there is this paradox: we actually need better data going backwards.
And it's important, when you think about it: why do you need good models of the past? Because how do you predict the future? You build models that explain what happened in the last 30 or 40 years; if you can build models that explain things, you can apply those models to the future. Otherwise it's difficult to predict the future without understanding the past. Also, when you look at something like land degradation, with all these different soil and land degradation processes, you really want to understand what caused them. Sometimes multiple factors caused some process, and to understand which component is most important and how they interact, the only way is to do some modeling through time.

When you look before 2000: 2000 was a big year, I think. GPS became public, and MODIS started. From 2000 on we have a lot of public global data, up to 100 meter, or 250 to 100 meter. But if you go before 2000, there's less and less data. And if you go back before 1984, the pre-Landsat era, you have very little global data based on Earth observation. Before 1984 you basically do detective work; you just have pieces. It's like reconstructing a scene from a dropped piece of hair or something. It becomes detective work.
So I did an inventory, and this is all the spatiotemporal data I could find that is published and available as public data sets. There is AVHRR, which goes all the way back to the 1980s, with daily and monthly NDVI at five kilometer. Then there's the recently published HILDA+. This is actually the most detailed data layer I could find, based on Earth observation, with a land cover time series; it even goes back to 1960. Then there's another similar product, but at five kilometer, GLASS GLC. There are also nightlight images, available back to the 1980s, I think. Then the data set that spans the furthest back in the past, even to the time of Jesus Christ, is HYDE 3.2. That's a land use time series at 10 kilometer. Then there are the Vegetation Continuous Fields. And there's TerraClimate, also five kilometer monthly, which goes all the way back to the 1980s.

This is an example of the nighttime light data set. There's been a lot of work to fill in the gaps and reconstruct past years. But you see, I said from the 1980s, but it's actually from 1992; that's the furthest back you can find the nightlight images. And this one is the VCF, the Vegetation Continuous Fields. Also really well prepared, but it's five kilometer too, and it misses the Northern Hemisphere a bit, so it needs to be gap filled.

Then, what Leandro did: we looked at AVHRR and MODIS and started gap filling, trying to produce a monthly data set which has basically no missing pixels, no artifacts, and minimized noise. We are producing these data sets now; Leandro actually made it. It's 250 meter from 2000 to 2020, and we also have the five kilometer version.
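As a rough illustration of the idea (not the actual processing chain, which involves much more careful noise filtering), the simplest form of temporal gap filling is per-pixel linear interpolation along the time axis. The array shapes and the plain NumPy approach here are assumptions for the sketch:

```python
# Minimal sketch of per-pixel temporal gap filling for a monthly image stack.
# Assumes a numpy array `stack` of shape (time, y, x) with NaN for missing pixels.
import numpy as np

def gapfill_linear(stack: np.ndarray) -> np.ndarray:
    """Linearly interpolate NaNs along the time axis, pixel by pixel."""
    t = np.arange(stack.shape[0])
    filled = stack.copy()
    flat = filled.reshape(stack.shape[0], -1)  # columns = individual pixel series
    for i in range(flat.shape[1]):
        series = flat[:, i]
        ok = ~np.isnan(series)
        if 0 < ok.sum() < len(series):
            # np.interp holds the edge values constant outside the observed range
            flat[:, i] = np.interp(t, t[ok], series[ok])
    return filled

# Example: a tiny 12-month, 2x2-pixel cube with one gap
stack = np.random.rand(12, 2, 2)
stack[3, 0, 0] = np.nan
print(np.isnan(gapfill_linear(stack)).sum())  # -> 0, no missing pixels left
```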
So maybe I can show you that. If I go to OpenLandMap, let me just start from scratch. I can go to the place where we are now. So we are here; let me zoom out a bit. By default, what you see in OpenLandMap is the land cover data. And you can go back in time. There's this 300 meter product, so we can go back to 1992. This is the European Space Agency CCI land cover project, and it's very nice that they made this data set and keep updating it; it goes up to 2018. So you can see changes around Wageningen. In the Netherlands it's usually urban growth.

There is also the 100 meter resolution data set, which, as I said, is great: 100 meter resolution, globally available, but it only covers 2015 to 2019. It's based primarily on PROBA-V, I think. And you can see much, much less; I literally have to zoom into Wageningen to be able to see some changes within a few years. So that's the 100 meter global product.

Then you have the HILDA data set. We're very happy that we managed to integrate all these land cover data sets in OpenLandMap. With HILDA you can go back to 1960, but it's one kilometer, so in order to see the big picture you have to zoom out and play the animation. Of course, it's much more drastic if I go somewhere like Brazil. If I go here and zoom out, you see much, much more drastic change, especially as you go back into the past. When you visualize HILDA like this, you get an idea of the scale of the land transformation in the tropics, in countries like Brazil and Indonesia. So that's the HILDA data set; it's very nice. It was made by Martin Herold's group, so we're very happy. And now, based on this HILDA,
we can go and remake all the other layers, downscaled to one kilometer. We would like to make a data cube at one kilometer. But at the moment, what we have for the NDVI are the 10% and 90% monthly quantiles, so the high and low NDVI.
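A minimal sketch of how such per-pixel low/high summaries can be derived, assuming the monthly NDVI sits in a NumPy array of shape (time, y, x):

```python
# Sketch of deriving per-pixel low/high NDVI summaries (the 10% and 90%
# quantiles mentioned above) from a monthly stack with missing values.
import numpy as np

ndvi = np.random.rand(468, 4, 4)   # e.g. monthly, 1982-2020 (synthetic here)
ndvi[ndvi < 0.05] = np.nan          # pretend some observations are missing

q10, q90 = np.nanquantile(ndvi, [0.10, 0.90], axis=0)
print(q10.shape, q90.shape)         # two (y, x) maps: the low and high NDVI
```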
We also gap filled TerraClimate and downscaled HYDE to five kilometer, and we derived cumulative values for the vegetation cover fraction and for HYDE; HYDE has a probability per land use class. So we did a lot of processing. I made all this gap filling and processing and then resampled everything exactly to a five kilometer grid. At five kilometer, one image of the whole world is about one megabyte, but with monthly time series data going all the way back to 1982, in the end it came to about 80 gigabytes. So it slowly grows. And if we do the one kilometer version, it will be 25 times bigger (5 km to 1 km is 5 times finer in each dimension, so 5² = 25 times the pixels), so on the order of two terabytes. That's what we would like to do.
So now we have this data, prepared in a cloud-optimized way and organized so it's all stacked; you can do a space-time overlay. And here's a point data set I prepared: a global soil point data set with soil chemical and physical measurements at different depths. It's available on GitHub.
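To make "space-time overlay" concrete, here is a minimal sketch: for every sample you look up the grid cell and the time slice that match its coordinates and date. The one-degree grid and the sample points are made up for the example, not the actual OpenLandMap grids:

```python
# Minimal sketch of a space-time overlay: for each soil sample (lon, lat, year)
# pick the covariate value from the matching grid cell and time slice.
import numpy as np

years = np.arange(1982, 2021)
cube = np.random.rand(len(years), 180, 360)   # (time, lat, lon), 1-degree demo grid

samples = [(-100.5, 47.2, 1995), (5.7, 51.97, 2010)]  # (lon, lat, year)

values = []
for lon, lat, year in samples:
    ti = int(year - years[0])     # time index
    yi = int(90 - lat)            # 1-degree rows counted from the north
    xi = int(lon + 180)           # 1-degree cols counted from the west
    values.append(cube[ti, yi, xi])
print(values)  # one covariate value per sample, ready to join to the point table
```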
So I prepared this data set, then I do a space-time overlay, and now I build the model. I focus on chemical soil properties. I have some static covariates; they only change with latitude and longitude. Then I have accumulation parameters: how long has a pixel been under some land use system, or how long has a pixel been under forest cover, so how many years. Then you have the position variables, which are kind of the opposite, like positions where rainfall accumulates or something like that.
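An accumulation covariate like this can be computed with a running count over an annual land-cover stack; a minimal sketch, with class codes invented for the example:

```python
# Sketch of an accumulation covariate: for each pixel, how many years it has
# been under forest up to each year, from an annual land-cover class stack.
import numpy as np

FOREST = 1                                              # hypothetical class code
landcover = np.random.randint(0, 3, size=(39, 4, 4))    # (year, y, x), 1982-2020

forest_years = np.cumsum(landcover == FOREST, axis=0)   # running count per pixel
print(forest_years[-1])   # total years under forest by the last year
```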
And then CL is the past climate. For soils it's not so interesting what the climate is in the year you make the observation; climate is a soil-forming factor, so you want to know what the climate was over the last five or ten years. We can do this with a moving window: we have monthly TerraClimate, and we run a moving window to estimate the average rainfall over the last 10 years up to that moment. So we have this past climate.
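A minimal sketch of such a trailing moving-window climate covariate, here using pandas rolling means over a (months × pixels) table; the shapes and the 120-month window are just the example:

```python
# Sketch of the "past climate" covariate: a trailing 10-year mean of monthly
# rainfall up to each time step, computed per pixel with a rolling window.
import numpy as np
import pandas as pd

rain = np.random.rand(468, 16)   # (months, pixels), e.g. a flattened monthly grid
df = pd.DataFrame(rain)

# 120-month trailing window; min_periods lets the early years use what exists
past10 = df.rolling(window=120, min_periods=12).mean()
print(past10.iloc[-1].shape)     # per-pixel mean rainfall over the last 10 years
```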
even like we have a accumulative rural population density. So it's kind of a pressure on soil and we prepared all this moving window averages. And so we have also covariates that represent sudden change in land cover, which usually changes so properties, et cetera. And we did a space time overlay and bingo,
it's a good model. So we can model pH and space time. And as you see, the main most important variable seems to be a tree cover. So is a forest, so it's more acidic usually. So it seems to be the tree cover. So as you lose the tree cover, the pH of the soil changes.
Then we have the HILDA plus also very good, this forest again mask. And then we have a precipitation and these are the usually the April and February and October and December. So these are the variables. And interestingly also the land use.
So rangeland, whether you switch to rangeland and how long you have up. So the F.CUM, it means accumulated fraction of rangeland. So how many years multiplied by the fraction. So they come as the best covariate, the predictions. So there's the soil pH 1984, 2018.
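As a hedged sketch of this modeling step (the actual workflow uses spatiotemporal ensemble machine learning with proper cross-validation, see the tutorial mentioned in the conclusions): fit a random forest on the overlay matrix and read off the variable importances. The column names and synthetic data are stand-ins for the covariates described above:

```python
# Hedged sketch: random forest on a space-time overlay matrix, then inspect
# which covariates matter most. Names and data are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "tree_cover": rng.random(500),
    "forest_mask_hilda": rng.integers(0, 2, 500),
    "precip_apr": rng.random(500),
    "rangeland_f_cum": rng.random(500) * 30,  # years x fraction under rangeland
})
# pretend pH drops as tree cover rises, as described in the talk
y = 7.5 - 2.0 * X["tree_cover"] + rng.normal(0, 0.3, 500)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in sorted(zip(X.columns, rf.feature_importances_),
                        key=lambda p: -p[1]):
    print(f"{name:20s} {imp:.2f}")   # tree_cover should dominate here
```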
If I switch between the years very quickly, you see that soil pH is actually relatively stable; there are no big changes. So you have to zoom in somewhere. Let's zoom in: this is, I think, Nebraska, or rather North Montana, North Dakota. If you zoom in here you can see that, yes, the pH looks like it's getting more and more acidic; the pH used to be higher and now it's lower. So there seems to be a change. So basically, yes, it's a proof of concept.
If you build up this space-time data, you can model soil and vegetation and try to explain maybe some land degradation processes. So, to come to conclusions: yes, spatiotemporal modeling is possible. If you're interested in how exactly I did it, we have a tutorial on our GitLab page on spatiotemporal ensemble machine learning: how to do a space-time overlay, et cetera. It's all explained there. And I think there is enough monthly and annual data, at up to one kilometer resolution, going back to 1982 to build a consistent data set.
I think it's possible to do that modeling. We would like to go beyond the Earth System Data Lab and beyond NEO and build a proper, complete, consistent, analysis-ready Earth system science data cube that covers this era. And we would like to make it available through OpenLandMap and allow groups to do modeling and analysis. Space-time analysis is way more challenging than just doing spatial analysis. To give you an idea, my estimate is that it's about 10 to 50 times more effort. Take, for example, NDVI or land cover: you can make a land cover map for one year, but when you want to make a land cover time series, it's way more effort to figure out how to process it, gap fill it, et cetera. And it is really challenging to go beyond 2000, or should I say before 2000? As you go to years before 2000, it becomes more and more of a challenge to gap fill and to really convince people that they have the best data. And to go before 1980, at the moment, for me it's really detective work, because there are almost no global Earth observation products. So those are my conclusions and I'm open to questions. If people online have some questions, I'll be happy to answer before we go for the coffee break.
There's the microphone. Yes, Leandro has a question or comment.

"Well, you talked about the depth of the approach and how we can step in. But I would like to ask about how we can go further and, for example, produce an annual aggregated product, like annual NDVI or whatever you need, without gaps and with higher and better resolution. With this kind of activity we are literally creating some kind of artificial data. How can we gap fill and still trust the result for the times when the sensors were not able to produce data?"

I think with gap filling there are two things. As I was also teaching in my course:
time series data has multiple components, usually two, like the trend: something systematic. That part you should definitely try to gap fill, because if it's very systematic, let's say 90% of the signal is just systematic variation, like a global temperature or something, then you are better off gap filling; it's better to have values when you know that 90% of the signal is systematic. But sometimes you have a chaotic part, and this chaotic part is tricky. If you gap fill it, you assume that you understand it, but it's chaotic, and you cannot model chaos; you can only simulate it. So it's tricky. My answer would be: for variables where you know the systematic component is large, I would recommend gap filling. But take even a variable like rainfall: daily rainfall is very chaotic, while monthly rainfall becomes systematic. We actually have that in OpenLandMap: if you visualize the rainfall for the whole world, you can notice that it's very systematic going from month to month. There's a bit of a climate change effect, but basically it doesn't change. Let me just show you that so everybody sees. Here's the climate, and we look at the rainfall. As I scroll through the months, looking at the whole world, these are two different months; it's related to the rotation of the Earth and the change of seasons. So monthly it's actually very systematic, but daily it's very chaotic. If you go and gap fill daily rainfall, it's really tricky, unless you have neighboring days. If I'm missing, say, eight days and I gap fill the rainfall, I probably wouldn't recommend it. But monthly rainfall, if you say you need to gap fill it, is quite systematic, and for most places it's recurring. So that's my answer.
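A minimal sketch of gap filling that leans only on the systematic, seasonal component: replace a missing month with the long-term mean of that calendar month. The data here are synthetic; as argued above, this is reasonable when the seasonal signal dominates, and not for daily rainfall:

```python
# Sketch of climatological gap filling for a monthly series: missing months
# get the long-term mean of that calendar month (the systematic component).
import numpy as np

monthly = np.random.rand(480) * 100   # 40 years of monthly rainfall (synthetic)
monthly[37] = np.nan                   # one missing month
months = np.arange(480) % 12

clim = np.array([np.nanmean(monthly[months == m]) for m in range(12)])
filled = np.where(np.isnan(monthly), clim[months], monthly)
print(filled[37], clim[37 % 12])       # gap replaced by the month climatology
```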
There's a question on Zoom. Let me read it: "What would be the compromise to go earlier than 1980?"
A question by Nick Ham: "Maybe coarser resolution in space and time, or fewer variables?" Yeah, it's difficult. I really want to say that to go before 1980 is really detective work. To do the same thing we do post-1980, moving back, you just have pieces of information; it's on the edge of science fiction. But somebody might manage it. HYDE, for example, is maybe a good example, because they reconstructed land use back to, as I said, the time of Jesus Christ; we measure our time from Jesus Christ, after all. So they managed it: the HYDE data set goes all the way back 2,000 years. How did they do that? They don't have any observation data. I read the paper a bit: they do some downscaling
and they use historic data. There's lots of historic data, right? You have lots of records of what was in some province or some geographical region; not detailed, but larger polygons. I think that's how they did it. But how do you validate such data? We have some climatic, meteorological measurements going 150 years back, but we don't have anything beyond that. So how do you validate? You have to use ice cores and such to estimate global temperature. You can measure some places, but in London, let's say, you don't have daily measurements of temperature going back beyond 150 years. So it becomes a bit of detective work.