Creating Wallonia's new very high resolution land cover maps
Formal Metadata
Number of Parts: 295
License: CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/43573 (DOI)
Transcript: English(auto-generated)
00:07
And so, I will start myself with the first talk, on OBIA and pixel-based methods combined to create land cover maps in Wallonia. And then we will have Ivan Rikin, who is going to present an automated GIS-based complex
00:27
developed for the long-term monitoring of growing season parameters using remote sensing data. So, I will start with my presentation. Just as a preliminary, you see the entire team of people who are working on this project.
00:46
And so, I'm kind of the speaker of the team here, but all of these people have contributed to the work that I'm presenting today. And this is all part of a project funded by the Walloon region called WALOUS.
01:02
So, the main messages I will present to you today are that we combined object-based image analysis and pixel-based image analysis to improve the results you can get with either of these two methods.
01:21
And we built fairly efficient high-performance computing tool chains using GRASS GIS and using OTB. So, this is different teams using different tools, but we then combine them together. And the other message is that using FOSS4G, using free and open-source software for this
01:42
kind of project, and I'll come back to the question of why, really ensures transparency and sustainability of the method, and of its use, in the context of a public administration. So, just to show you what we are talking about, this is Wallonia, which is the southern
02:00
part of Belgium. It's about 17,000 square kilometers. The political system in Belgium is a bit complicated, but these regions have quite extensive competencies. The entire spatial planning, economic development, urban planning, but also environment and transport,
02:21
all these are competencies of these regions. And the region is also, for a large extent of the data it uses, its own data producer. So there is hardly any data that is nowadays being produced at the national level; most is produced at the regional level.
02:43
And so, in that context, there is a need for new land cover and land use maps covering Wallonia. And the desire of the administration was not to just pay people to create these maps and sell them the maps, but actually to elaborate a methodology.
03:04
And that's why they asked us, the research teams, to actually work with them to elaborate a methodology which should be open, reproducible, and easily understandable by the administration. This is also in the context of different existing research projects that the different
03:24
university teams had already with the Walloon region, and which then fed into this new project which I'm presenting today. And the general obligation, obviously, is that the results should be compliant both with what the users need and with the EU INSPIRE directive, which has some clear
03:45
rules on land cover and land use maps and what their content should be. So to refine what we wanted to do in the project, we had quite intensive interaction with
04:00
the user base, meaning a lot of the different administrations of the Walloon region, but also universities, companies, anyone who might use the data, actually. And there was work on trying to define the legend, what classes are needed for a land cover map; the temporal resolution you need, how often you need
04:21
an update of a land cover map; and the accuracy that you need. And what we used as a kind of tool to force everyone to find their best compromise: they all had a certain number of points, and they could give these points either to
04:41
minimum mapping unit or to temporal resolution or to overall accuracy, so they had to distribute the points between them, which then showed, okay, if you don't have enough points to have, let's say, a very high temporal resolution, are you willing to go down a bit, but then put more emphasis on the overall accuracy? What are the most important aspects for you?
05:00
So what came out of that is that for most users, actually, an overall accuracy of at least 85 percent is pretty good for most of the applications. The minimum mapping unit that was requested is around 15 square meters, and an update frequency of, let's say, three to five years. This is still an ongoing process, so there might be some changes in that, but this is
05:23
what we have. And you can see the proposed land cover legend below: actually six classes if you want, but two of them are subdivided. The artificial surfaces are subdivided into, let's say, ground features and elevated features, and likewise the trees and shrubs, the
05:41
higher elevated vegetation is subdivided into coniferous and broad-leaved deciduous trees. We are also still discussing the question of dividing the water class into water bodies and water courses. The data that we have at our disposal is mainly the yearly coverage that the Walloon
06:05
region pays for in terms of orthophotos of its entire region, which are at a 25-centimeter resolution, and they provide four bands, so red, green, blue, and near-infrared. They were taken during different flights, they couldn't cover the entire region in one
06:22
flight, and it obviously depends on weather conditions and all that, so we have a whole series of different strata defined by different dates and different cameras used, and we actually work individually in each stratum to make sure that we take into account these differences in terms of camera and, let's say, weather or sun conditions.
06:45
We also have a height layer, so height above ground, which was derived by photogrammetry from the images taken during the flights, and then a whole series of auxiliary vector data, which already exists because the Walloon region already has a whole series of vector
07:03
data, which are not necessarily up-to-date, it really depends on each layer how up-to-date they are, and not always easily usable. For example, in the current situation, the entire road layer of the region consists of lines, not polygons, which, when you try to classify at 25-centimeter
07:21
resolution is not very useful. And the total size of the data set is around 2.5 terabytes. So what I'm going to present today is, let's say, the core of the entire project, which is on the one side an approach based on object-based image analysis, and on the other side
07:41
a pixel-based approach to classify, and then a semi-automated fusion of these results, in order to kind of get the best of both worlds. The OBIA method is one that some of you saw at the workshop on Monday, where we presented it in a more practical way.
08:02
It's a whole collection of modules that we have been working on over the last years within the GRASS GIS software, modules that take you all the way from, let's say, defining the best parameters for segmentation, so for creating the objects, all the way
08:20
into classification through machine learning. It's all GRASS GIS modules combined then with Python scripts in order to create an entire pipeline of treating the data. So the idea is the whole thing should run as automatically as possible, so with as little human interaction as possible, going from cutting up the images into objects, into
08:46
identifying training data, to then classifying. And this was all run in a highly parallelized fashion on the high-performance computing system of my university.
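The per-tile parallelization described here can be sketched in plain Python. This is a minimal stand-in under stated assumptions: the real tool chain dispatches GRASS GIS jobs on an HPC cluster, and `process_tile` below is a hypothetical placeholder for the per-tile segmentation and classification work.

```python
from multiprocessing import Pool

def process_tile(tile):
    # Hypothetical placeholder for the real per-tile work
    # (segmentation, training-data selection, classification).
    tile_id, pixels = tile
    return tile_id, [p * 2 for p in pixels]  # dummy operation

def run_pipeline(tiles, workers=4):
    # Tiles are independent, so the whole map can be processed
    # in parallel, as on the university HPC system.
    with Pool(workers) as pool:
        return dict(pool.map(process_tile, tiles))
```

For example, `run_pipeline([(0, [1, 2]), (1, [3])], workers=2)` returns `{0: [2, 4], 1: [6]}`; on the HPC the pool would instead be a job array over thousands of tiles.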
09:04
More detail, how do we do this? We actually, in our research in the last years, really noted how when you try to optimize segmentation for a large area, it doesn't really work because you have such a diverse landscape that what works well in terms of cutting up the objects in one
09:25
part of the image doesn't really work well in another part: between built-up areas, forests, agricultural areas, there are such different realities. And here you see, from a paper from last year on Ouagadougou, where we worked on this question as well, and this is just to show you different values
09:43
of the parameter called threshold within the GRASS segmentation module. And you can see the wide variety of values that we got when we actually optimized it locally, which shows that there is a need to do this local optimization.
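The local-optimization idea can be illustrated with a toy one-dimensional example. This is not the actual GRASS optimization module, just a sketch: a hypothetical scoring that balances within-segment homogeneity against over-segmentation, evaluated per tile so that each tile can end up with its own best threshold.

```python
import numpy as np

def segment(signal, threshold):
    # Toy segmentation: start a new segment wherever the jump
    # between neighbouring values exceeds the threshold.
    labels = np.zeros(len(signal), dtype=int)
    for i in range(1, len(signal)):
        labels[i] = labels[i - 1] + (abs(signal[i] - signal[i - 1]) > threshold)
    return labels

def intra_variance(signal, labels):
    # Mean within-segment variance: lower means more homogeneous objects.
    return float(np.mean([signal[labels == l].var() for l in np.unique(labels)]))

def optimize_threshold(signal, candidates):
    # Crude stand-in for unsupervised parameter optimization:
    # penalize heterogeneous segments and, mildly, their number.
    def score(t):
        labels = segment(signal, t)
        return intra_variance(signal, labels) + 0.1 * labels[-1]
    return min(candidates, key=score)
```

Running `optimize_threshold` on a smooth "farmland" tile and a noisy "urban" tile would pick different thresholds, which is exactly why a single global value does not work.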
10:03
And to do that, we cut up the image with cut lines, using a module (i.cutlines) which implements an algorithm that tries to create these cut lines in a way that they don't cut through objects that you're interested in. So you can see it goes around the house and then goes along the streets and things
10:24
like that in order to have tiles which then are more easily mergeable afterwards, and won't perturb the classification. We also used a super pixel approach, which means that pixels which are very similar
10:43
are regrouped before we go into the actual segmentation object delineation. This is mainly because it really accelerates the treatment because you divide the number of pixels or objects you have to treat by four or five, things like that.
11:01
So we then segmented, created the objects, little sub-tile by little sub-tile. This all runs in parallel on the HPC, and then it's put together again to have one large segmented raster.
11:21
And so this was run with the unsupervised segmentation parameter optimization module (i.segment.uspo) which we have been developing over the last years. Then, within these small segments, we automatically selected those for which we were fairly confident of the class, by using existing data. And so saying, if this falls into what we know should be a field, and if the NDVI is
11:44
high enough, then we will say, okay, this is low vegetated area. If it falls into forest, if it falls into known buildings, whatever. So we selected the training data automatically out of the objects, and then ran a random forest classifier implemented in GRASS GIS, but using
12:06
R as a back end to do the actual calculations. The other approach is the pixel-based approach, which uses OTB, the Orfeo Toolbox, also combined in Python scripts.
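The automatic training-data selection described above, where an object inside a known field polygon with high enough NDVI becomes a low-vegetation training sample, can be sketched with numpy. The function name, the 0.4 threshold, and the single-class output are invented for the illustration; the real chain applies such rules per class from several auxiliary layers.

```python
import numpy as np

def ndvi(red, nir):
    # Normalized difference vegetation index from the orthophoto bands.
    return (nir - red) / (nir + red + 1e-9)

def select_training(objects, red, nir, field_mask, ndvi_min=0.4):
    # Keep only objects that fall entirely inside a known field polygon
    # (from the existing vector data) AND look vegetated in the image;
    # those become "low vegetation" training samples automatically.
    labels = []
    for obj_id in np.unique(objects):
        sel = objects == obj_id
        if field_mask[sel].all() and ndvi(red[sel], nir[sel]).mean() > ndvi_min:
            labels.append((int(obj_id), "low_vegetation"))
    return labels
```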
12:21
So running everything on the command line. And this was run on the HPC system of the university, the Catholic University of Louvain. And what they did is they first ran a mean shift smoothing on the images in order to avoid, let's say, at least some of the salt-and-pepper effect that you get when you do pixel-based work.
12:43
They then used an existing land cover map that they had from another project, and the height layer. They also derived shadows from the height layer in order to have that information. And then they used these reference data sets, doing some mathematical morphology and testing, in order to identify
13:03
those regions where we're pretty sure this is really, we know what this is. And they used that as training data and classified then again. And they divided into two strata depending on the height so they classified separately high features and low features.
13:21
And they also created two other layers, but at a different resolution, because those are based on Sentinel-2. One is a two-date land cover map which is especially tuned to allow good forest classification, so to distinguish between deciduous and coniferous stands and
13:41
things like that. And the other one is based on the Sen2-Agri project, which is part of the Copernicus efforts of the last years: a multi-temporal toolbox doing multi-temporal classification to identify crops. And then we fused the results.
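The machine-learning fusion, reclassifying based on the outputs of the individual classifications, can be sketched as a second-stage random forest with scikit-learn. Everything here is synthetic: the noise levels, the three-class setup, and the `noisy` helper are invented to stand in for the OBIA, per-pixel, and Sentinel-based source maps.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_fusion(source_preds, reference):
    # Second-stage classifier: the class predicted by each source
    # becomes a feature, and a random forest learns how to combine
    # them; feature_importances_ then shows how much each source
    # contributes, as in the variable-importance figure.
    X = np.column_stack(source_preds)
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, reference)

# Synthetic stand-in data: three noisy copies of a reference map.
rng = np.random.default_rng(0)
truth = rng.integers(0, 3, 600)

def noisy(y, p):
    # Each source disagrees with the reference with probability p.
    flip = rng.random(y.size) < p
    return np.where(flip, rng.integers(0, 3, y.size), y)

sources = [noisy(truth, 0.2), noisy(truth, 0.3), noisy(truth, 0.4)]
clf = train_fusion(sources, truth)
fused = clf.predict(np.column_stack(sources))
```

On this synthetic data the fused map agrees with the reference at least as often as the best single source, which mirrors the qualitative improvement reported in the talk.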
14:02
So we have the OBIA results, the per-pixel results based on the orthophotos, and then the other two, pixel-based, Sentinel-based layers and some auxiliary data, and we fused all that. And part of the methodology development and scientific work being done is to develop methods,
14:21
to test different methods to see what works best. We used a rule-based, handmade fusion as a benchmark, but that obviously is not very reproducible and we cannot automate it. Then, on the pixel-based side we used a Dempster-Shafer approach, and on the OBIA side we used machine learning again to reclassify
14:40
based on the results of the different classifications. And what is quite interesting here, you see as an example what comes out of the random forest in terms of which variables were the most important during that fusion. And you can see by the colors here that all of the different sources intervene: the red ones are from OBIA, the yellow one is the crop layer,
15:03
then you have the blue ones from the pixel-based. So you can really see how these different layers actually really work quite well together to allow you to get a better classification out of this. So what are the results? So if you look at the different approaches separately,
15:24
the OBIA: you have quite good, sharp building edges, but often over-segmentation for vegetation, which then leads to higher classification uncertainty on the vegetation. This is what the map on the right shows: the darker values are higher uncertainty values of the classification.
15:41
And with the per-pixel approach, you had the classical issues of salt-and-pepper effects, as you can see here, but also quite a lot of problems within urban areas with correctly delineating the buildings; that was always an issue. So for the fusion, here you see the two approaches.
16:01
You really have to go more into detail to see the exact differences. But what we found is that the object-based approach allowed us to get sharper edges and, let's say, a smoother outcome in general. The Dempster-Shafer approach had a lot of difficulties combining the different resolutions, so that didn't really work quite well, whereas in the object-based,
16:21
we could just aggregate per-object information coming out of the different resolutions, and it didn't really matter. Thematically, some of the classes came out a bit better with Dempster-Shafer; that's really class by class, you have to look at it. There is quite a lot of difficulty with the class arable land, which we can already discuss whether that's actually land cover or land use.
16:41
It's a bit between the two. The problem is that you need multi-temporal data to work on that, but for that we only have the Sentinel images, and those are only available at 10-meter resolution, so you get some problems there. The final run is currently ongoing, but we expect that the completely automated system will be well over 0.85,
17:02
so the overall accuracy that was expected. So what we can take out of this is that the fusion really provides a qualitative improvement over individual classification, that in terms of, let's say, the approaches we tried,
17:21
the object-based, machine-learning-based approach to fusion was the one that looks most promising, but obviously there is garbage in, garbage out, in a certain way. So if the input classifications are really not good, then fusion can't correct all the issues either; you can really see that in some areas.
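The per-object aggregation that made the OBIA-based fusion robust to mixed resolutions can be sketched as a majority vote of pixels per segment. This is a minimal illustration; the function name and inputs are hypothetical, and the classified raster would first be resampled onto the object grid.

```python
import numpy as np

def aggregate_per_object(objects, classified):
    # For each object (segment), take the majority class of the pixels
    # it covers in a classified raster, whatever that raster's native
    # resolution: the object is the common unit of aggregation.
    out = {}
    for obj_id in np.unique(objects):
        vals, counts = np.unique(classified[objects == obj_id], return_counts=True)
        out[int(obj_id)] = int(vals[np.argmax(counts)])
    return out
```

This is why the resolution mismatch that troubled the Dempster-Shafer fusion "didn't really matter" on the object-based side: a 10-meter Sentinel pixel simply contributes many identical votes inside a 25-centimeter object.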
17:42
So what we did as well is kind of an iterative approach. We tried classification and fusion, and then sometimes we had to go back and say, okay, we have to improve this part of the classification because the fusion doesn't correct it. So how does this project contribute to, and take part in, the life of FOSS4G?
18:03
Actually, throughout this project, this has really allowed us to test, sometimes a bit to its limits, the entire OBIA tool chain that we developed in GRASS, and there is a whole series of enhancements where, while working with the tools, we said, oh, it would be cool if we had this feature. Well, then we implemented that feature, and all this is obviously fed back directly into the GRASS code base,
18:23
so available to everyone. And we even developed a few very simple new modules as well. All the scripts for the OBIA are available on GitHub, so anyone can look at them, use them, do whatever with them. And finally, perspectives.
18:41
So we are currently finalizing the whole project, or let's say the land cover mapping. There is going to be one phase of manual correction still, because the idea is here to have like a one-time T0, a very, very, very good land cover map, which we can then feed into an update procedure,
19:00
which will be then automated. But so we said, if we have a really good land cover map to start with, the updating will be better. The idea is also to integrate land cover information into domain-specific polygons, because we noticed that a lot of people actually, you give them a raw land cover map, they don't know what to do with it when they work on very domain-specific issues.
19:21
And so the whole idea is then to create a whole series of land cover layers. For example, you can see here, this is the cadastre layer with land cover inside it. This is forest management information. And so we can feed this land cover layer into these different polygons, which then allows easier use by the different administrations. And the other part of the project which is coming up
19:42
is that we have to also create a land use map, which, the idea is, will also be automated. It's going to be based on a lot of existing alphanumeric databases that the region has, but also on methods that we have been developing in the last years of taking the land cover map and applying landscape metrics to it
20:02
to then kind of classify that into land use. All the products, we hope, and normally that's what the region has promised, should be open data as well, so anyone can get them and download them. And obviously there's the updating issue that's going to come up. And the other idea is that once we have this very,
20:20
very nice land cover map, high precision, high resolution, across Wallonia, it's just going to be a gorgeous input data set for deep learning, because then we will have the entire data set of images together with the classifications, and we can train models easily that way. So thank you.
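The landscape-metrics step mentioned for the upcoming land use map can be illustrated with a minimal sketch: the proportions of each land cover class inside one land-use polygon, a simple metric that a land-use classifier could consume. The function name and class codes are invented for the example.

```python
import numpy as np

def landcover_metrics(landcover, polygon_mask, classes):
    # Class proportions of the land cover raster inside one polygon:
    # e.g. a parcel that is mostly buildings plus artificial ground
    # is a good candidate for a residential land-use class.
    inside = landcover[polygon_mask]
    return {c: float((inside == c).mean()) for c in classes}
```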
20:43
So questions. Hi, I have quite similar workflows for the detection of land use change. I'm not using GRASS; I'm segmenting in the Orfeo Toolbox with the mean shift. So I got interested because I also have a similar data set.
21:02
So two, three terabytes. Did you import the data in the database of GRASS or did you just link the data? This is the first question. And then you said that you are doing it for a temporal analysis. So we are talking about getting very high resolution data.
21:23
We are talking about getting data that is almost every time made on a different time period. Like I'm talking about April, May, and when we have the luck to have this area scanned. So sometimes we have a field that is the same field
21:42
and it has, I don't know, corn, and the next month it's bare soil, because of other things. How do you avoid that? This is the second question. And the third: how did you calculate your accuracy? Was it from the Orfeo Toolbox output, or with random or quantile regression forests
22:01
or something similar? So actually we haven't reached the real, solid validation stage yet. The actual validation is going to be ground points, and so calculating overall accuracy from that. In the OBIA, right now we work with the out-of-bag error in the cross-validation when the random forest
22:23
tunes itself, if you want, as a measure. And we have some other validation aspects that we use with that. It's still something we have to discuss with the region as well, how they want to validate these products. And the question again is what is reproducible for them, what can they do every year to test that. But at this stage, overall accuracy is calculated,
22:43
looking at stratified overall accuracy as well. So different approaches. But so the first question was about importing. In this case actually I decided at one point I imported everything into GRASS once and I put it on the HPC in that form.
23:01
Normally that is not necessary. You can actually leave everything as a TIFF and then just link it with r.external into GRASS. There are, let's say, advantages to both, but both are possible. Yeah, sorry, what I didn't say here,
23:21
I should have mentioned in the presentation, is that what I used is actually the new option to create virtual rasters within GRASS as well. There are about six or seven thousand tiles, I think, and all these tiles have been combined into one, let's say, layer, as a virtual layer,
23:40
because the problem is that the delineation of these tiles does not necessarily correspond to other delineations that we have to use, and that way you don't worry about that. You have the entire region, and you just say, okay, now I'm working on this, and it takes the tiles it needs. This also obviously accelerates treatment, because it doesn't have to read the whole layer every time, just the tiles it needs.
24:01
And there was a third question, the temporal one. That's why we integrate the Sentinel data: to have the multi-temporal aspect. What is the amount of training points that you used, more or less, for this type of data?
24:22
So it really depends on each stratum that we had and on the classes, on what is available. For example, for the water class it's not that easy to have a lot of water bodies and a lot of segments in there; for the forest class it could be in the millions.
24:41
So actually what we did is we had to reduce that size, because at one point R couldn't handle the huge training data set that we gave to it. So we're talking about, let's say, a maximum of 50,000 per class, but for some classes we're just talking in the hundreds.
25:00
So it really depends. Do these classes match with the CORINE land cover classes? Yeah, the idea is obviously, as I said, it has to be in conformity with the INSPIRE rules, so it has to be Copernicus land cover. I just have a question about the virtual raster. Is that one of those like GDAL VRT XML files
25:20
where you have all of these virtual layers? It's the equivalent of that, but within GRASS. Because GRASS has its own raster format, GRASS now also has the possibility to regroup different tiles into one virtual raster, handled internally in GRASS. Exactly the same idea. So I think, last question, because then we have to stop.
25:42
Yeah, has any research been done on how portable this kind of approach is? If I take this approach and use it for a different part of the world, which has a different morphological profile, would it be able to provide usable results? It should. The big question is what data you have available,
26:01
more than the methodology, actually. That's the problem. With this completely automated approach, you have to feed in the right training data and things like that, I mean the right data from which you can then select the training data. That's often the problem: in other regions you might not have enough auxiliary data to automatically select your training data.
26:22
But other than that, the segmentation, since it's locally optimized, shouldn't cause much of a problem. So I think we have to stop, because we have the next speakers coming up. So thank you.