
Remote Sensing Analysis for Urban Research at Scale


Formal Metadata

Title
Remote Sensing Analysis for Urban Research at Scale
Number of Parts
295
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Image classification of urbanization processes requires reference data, either for training or for validation. However, there is a scarcity of labeled reference datasets specifically for detecting urban areas. In this talk, I will present three approaches for the collection of open-source reference data that mark built-up areas.
Transcript: English (auto-generated)
OK, it's time for the last talk of the session. It's about remote sensing analysis for urban research at scale, by Ran Goldblatt from New Light Technologies.
Thank you. So thanks for coming. In the next 20 minutes, I want to show how free satellite imagery can be utilized for urban research, and specifically to map urbanization and built-up land cover at scale. And all the analysis that I will show now is done in Google Earth Engine, which is not open source,
but is free for non-commercial use. So as you all know, urbanization has been a fundamental trend of the past two centuries and a key force that shapes many dimensions of our world. In 2007, for the first time, more people lived in urban areas than in rural areas. And by 2050, more than two-thirds of the global population
is expected to be urban. Urbanization, of course, has many positive implications. It helps to grow economies, enhances opportunities for education, and in general, improves living conditions. But at the same time, it also creates immense challenges for society and the environment by damaging ecosystems, emitting greenhouse gases and air
pollution, and putting pressure on public infrastructure. So this means that tracking the rate and the patterns of urbanization is fundamental for any sustainable urban development. Now, traditional methods to map urbanization include ground surveys, administrative data,
and high-resolution aerial photographs, which are, of course, expensive and time-consuming, making it very hard to monitor changes on the ground in close to real time. On the other hand, there are currently close to 2,000 active satellites that constantly orbit Earth, collect data from every location on Earth, and provide this data in close to real time.
600 of them are designed specifically for Earth observation applications. And when you apply fancy or less fancy machine learning algorithms, you can extract meaningful information from this data. Now, I guess that most of you have heard about Landsat.
Landsat is one of the main satellites that has been used for urban research and, in general, for Earth observation applications. It's actually a series of eight satellites, where the first satellite was launched in the 1970s. And Landsat is collecting data from every location on Earth every 16 days, at a spatial resolution of 30 meters.
All this data is publicly available because it is owned by the US government. All this data has been stored since the 1970s, which, as you can already understand, allows us to track how Earth is changing across time. In 2015, a new satellite was launched, also a series of satellites, Sentinel, which is the European version
of Landsat. Sentinel collects data from every location on Earth at a temporal resolution of every five days, at a higher spatial resolution of up to 10 meters. So the red, green, and blue bands, what we are used to seeing with our eyes, are 10 by 10 meters, which is better than the 30 meters.
And this is also publicly available. I just wanted to show you this demonstration, because it shows that even with this Sentinel-2 data, which is free, you can see changes on the ground, detect individual structures, and also the construction of buildings. So we have all this data that is constantly
being collected since the 1970s. And right now, we are unable to analyze all this data the way we used to, on our own laptops. Everybody's talking about big data; this is big geodata. There is a vast amount of data that needs to be analyzed, which means that we need to reconsider the capacity and the methods
that we will use to analyze this data. Luckily, there are more and more cloud-based computational platforms that allow you to do this analysis and scale it up across space and time. One of them is Google Earth Engine. I will now present a few research studies that we did in Earth Engine, which is a cloud-based computational
service for planetary scale analysis. It has petabytes of imagery and also algorithms that allow you to do the analysis, which means that you don't need to download or upload the data, and you do all the analysis on the cloud. And it has many, many, many algorithms for machine learning applications.
So the trick, or the main challenge, is: how do we convert all this data that is collected into meaningful information that can really be used to improve decision making? There are many approaches for machine learning. We heard a few of them: supervised, unsupervised,
semi-supervised. In the remote sensing domain, supervised image classification is frequently used. And generally speaking, in supervised image classification, we use real, labeled examples to train the machine to learn the characteristics of whatever we want to classify. So in the case of urbanization,
we will need examples of real urban areas, which we will use to train the machine. We will ask the machine to learn what are the characteristics of a built-up pixel. What are the characteristics of a vegetation pixel? Based on what the machine learned with all the examples we provided it, it will try to predict the class of any pixel in the universe.
The main challenge in any supervised image classification is where do you get training data? And especially in the urban research domain, this is a big challenge. Because where do we get millions of examples that can be used for supervised image classification to map all the built-up land cover, all the urban areas in the world? So I want to show you three approaches that we
used to map urbanization, starting with the simple one and moving to the more complex one, which is more transfer learning oriented. So the first study that we did also in Google Earth Engine was to map all the built-up land cover in India.
For this application, we created a dataset of 20,000 labeled examples. We hired students at UC San Diego who helped us label these 20,000 examples. And by the way, this is publicly available as open data, so anyone can use these examples for machine learning applications.
So this is very basic: we created our own 20,000 labeled examples, as a stratified sample. Because imagine if you just throw 20,000 examples randomly on the map: many of the polygons that we ask students to label will not be urban. So we did a stratified sample and created these examples.
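The talk does not show code, but one way such a stratified sample could be drawn in the Google Earth Engine Python API is sketched below. The strata image, its band name, and the region are hypothetical stand-ins; the study's actual polygons were drawn and labeled by hand.

```python
# Hypothetical sketch: drawing stratified sample locations in the Earth
# Engine Python API. 'strata' is an assumed image with a rough
# urban / non-urban band; the study's real polygons were hand-labeled.
import ee

ee.Initialize()

samples = strata.stratifiedSample(
    numPoints=10000,        # points per stratum, so urban areas are not swamped
    classBand='stratum',    # assumed band holding the rough strata
    region=india,           # assumed ee.Geometry of the study area
    scale=30,               # Landsat pixel size in meters
    geometries=True)        # keep point geometries for later sampling
```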
So now we have 20,000 labeled polygons, labeled as built-up or not built-up. If more than 50% of a polygon is covered with built-up land cover, it was labeled as built-up; else, not built-up. So this is the response; these are the labels we want to predict.
Which data will we use to predict? In this case, we used the Landsat data. As we said, Landsat collects data from every location on Earth every 16 days. But some of the scenes have a lot of cloud coverage, right? So we need to remove those scenes that have more than 10% cloud coverage.
Then, for each pixel, we calculate the median value. This value represents the median value of the pixel over the year. In addition to that, we also calculated other spectral indices. For example, NDVI, which is an index to map vegetation, or NDBI, which is another index that is frequently used to map built-up land cover.
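As a minimal sketch of this compositing step in the Earth Engine Python API (the collection ID, year, and band names here are assumptions, not the study's exact parameters):

```python
# Sketch of the annual composite described above, in the Earth Engine
# Python API (collection ID, year, and band names are assumptions).
import ee

ee.Initialize()

# Landsat 8 scenes for one year, dropping scenes with >10% cloud cover.
scenes = (ee.ImageCollection('LANDSAT/LC08/C02/T1_TOA')
          .filterDate('2016-01-01', '2016-12-31')
          .filter(ee.Filter.lt('CLOUD_COVER', 10)))

# Per-pixel median over the year.
composite = scenes.median()

# Spectral indices: NDVI for vegetation, NDBI for built-up land cover.
ndvi = composite.normalizedDifference(['B5', 'B4']).rename('NDVI')  # (NIR - Red) / (NIR + Red)
ndbi = composite.normalizedDifference(['B6', 'B5']).rename('NDBI')  # (SWIR1 - NIR) / (SWIR1 + NIR)

inputs = composite.addBands(ndvi).addBands(ndbi)
```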
So now we have the polygons that we created, and we also have these per-pixel values. We sample all the pixels that overlap with these examples; each pixel will get the label of the overlapping polygon. So now we have what we want to predict,
and these are the inputs that we provide. We teach the machine to learn what are the characteristics of a built-up pixel, label one. What is the green color? What is the red color of it? What is the temperature of it? Based on that, we will try to predict the class of any pixel in the universe, very generally speaking.
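Continuing the sketch above with hypothetical names: 'polygons' stands in for the labeled examples, with an assumed 0/1 'built_up' property, and the random forest classifier is the one the talk settles on below.

```python
# Continuation of the sketch (hypothetical names): sample the composite's
# pixels inside the labeled polygons, then train and apply a classifier.
# 'polygons' is an assumed ee.FeatureCollection with a 0/1 'built_up' label.
training = inputs.sampleRegions(
    collection=polygons,      # labeled polygons
    properties=['built_up'],  # each sampled pixel inherits the polygon label
    scale=30)                 # sample at Landsat resolution

classifier = ee.Classifier.smileRandomForest(100).train(
    features=training,
    classProperty='built_up',
    inputProperties=inputs.bandNames())

# Predict the class of every pixel in the composite.
classified = inputs.classify(classifier)
```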
We evaluated different types of classifiers. I will not get into these details, but we found that with random forest, we get an accuracy rate of around 80%, which is quite good in the remote sensing domain. But we also wanted to map all the built-up land cover
in the country. So you can see that here, we can see the boundaries of the cities, and here, when we zoom in, you can see that even with Landsat, the free 30-meter resolution, we are able to detect the fine boundaries between built-up land cover and vegetation in the surrounding area.
Just to illustrate: once we had the 20,000 labeled examples, mapping the entire built-up land cover of India took us a day or a day and a half. So it's very fast, because we utilized Google Earth Engine. When we zoom out, you can see the fine boundaries of the cities, and then you can also use this algorithm
to map urbanization across space and time, and apply this algorithm also on past imagery. So this is one approach, which is obviously time-consuming, because if we needed 20,000 labeled examples for India, imagine how many examples we would need for the entire world. So the next application that we show utilizes
administrative data as a source for these labeled examples. We did this study for the World Bank to map built-up land cover and land use in Ho Chi Minh City in Vietnam. In this case, we used administrative data
as our source for training examples. We classified this administrative data into three classes: built-up residential, built-up non-residential, and not built-up. In remote sensing, these methods are usually used to map land cover.
In this case, we wanted to show whether we can also use remote sensing methods to map land use and differentiate between residential and non-residential land use. So we utilized two sources of free satellite data. One is Landsat, as we discussed, at 30-meter resolution. The other one is Sentinel: Sentinel-2,
which is the electro-optical imagery, and also Sentinel-1, which is radar-based, or SAR-based, imagery. We wanted to see whether, if we combine these two sources, we improve our classification. And we found that with this administrative data, we got an accuracy rate of 81%.
And also, in differentiating between residential and non-residential land use, we got 67% accuracy. Now, the problem that we have is class noise, meaning we have the polygons, but we do a pixel-based classification. If you automatically classify each pixel according to these polygons,
obviously you will have a lot of noise in the data. So we also created our own stratified random dataset, in this case of 15,000 labeled examples, and you can see that we improved the accuracy rate to 96%. And also, in differentiating between residential and non-residential land use, we got 79%,
which is very good in terms of land use. And here you can see a map of the classification, and you can see that we can even identify these individual structures on the map, just to show the difference between Landsat and Sentinel here on the right. Then, the third case study, which I will do fast,
but this is the one I am most proud of. This uses nighttime light data as a source to extract labeled data to map built-up land cover in high resolution. The idea is that nighttime lights are a very good proxy for GDP and population size, and they are also used
to map urbanization processes and changes in cities' boundaries. And the main assumption is that we take the intensity of light that is emitted at night, define a given threshold, and any pixel that exceeds this threshold can be defined as an urban area.
But you can also see in this image that it is very coarse. The resolution is coarse; there's a blooming effect, or diffusion of light. So what we did in this study, we relied on two sources. One is the nighttime light data. We defined this rough estimation of the threshold.
Then, many of the pixels here are vegetation. So we used Landsat data to determine where there are vegetation areas or bodies of water, and we removed them from this nighttime light data. We threw examples on the map, and we labeled the examples according to where they fall.
If an example falls in a highly lit area that is not vegetation, it is automatically labeled as built up, else not built up. This way, we can label millions of examples, and yes, they will not be as accurate as we want, but if we have many, many examples, we can get a relatively high accuracy rate.
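A minimal sketch of this automatic labeling rule, again in the Earth Engine Python API; the nighttime-lights image 'ntl', the rough light threshold, and the NDVI cutoff are illustrative assumptions, not the study's actual values.

```python
# Illustrative sketch of the automatic labeling rule (image names and
# threshold values are assumptions, not the study's actual parameters).
vegetation = ndvi.gt(0.3)          # rough NDVI cutoff for vegetation
lit = ntl.gte(rough_threshold)     # 'ntl' = assumed nighttime-lights image

# Built up: highly lit AND not vegetation; everything else: not built up.
built_up_label = lit.And(vegetation.Not())
```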
So again, Landsat data, nighttime light data, we use Landsat data to determine areas of vegetation. We remove them from the nighttime light, train the classifier, and you can see that the result is in much higher resolution. But there are many things that we need to consider. How do we define highly lit pixels?
How do we define vegetation pixels? Which inputs will we use for the classifier? What is better: a global classifier that can be generalized, or many, many small classifiers, each specific to a specific area? So we divided each country, India, Mexico, and the US, into hexagons, and we treated each hexagon
as an independent unit of analysis. Why did we use hexagons? This is a very interesting discussion on the advantages of using hexagons for analysis. It's also very sexy. We used, again, Landsat 8 annual composites. As I showed you previously, we created annual composites
and added other spectral indices. So now we have the pixels with all the inputs that we want to use for prediction. We defined the threshold of nighttime lights per hexagon, because the problem is that if we want to capture also small villages that do not emit
a lot of nighttime light, the threshold needs to be lower. This is why we defined this threshold for each hexagon independently, and you can see here the variations in this threshold. In each hexagon, we identified the nighttime light pixels
that exceed the threshold: if a pixel exceeds the 95th percentile in the hexagon, it is labeled as highly lit; if it is lower than the 75th percentile, it is labeled as not highly lit.
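As a sketch of this per-hexagon percentile rule in the Earth Engine Python API: the hexagon geometry, the nighttime-lights band name ('avg_rad'), and the scale are assumptions; the output key names follow Earth Engine's '<band>_p<percentile>' convention.

```python
# Sketch of the per-hexagon thresholds ('ntl' and 'hexagon' are assumed;
# key names assume a nighttime-lights band called 'avg_rad').
stats = ntl.reduceRegion(
    reducer=ee.Reducer.percentile([75, 95]),
    geometry=hexagon,
    scale=750)  # roughly the VIIRS nighttime-lights pixel size

p75 = ee.Number(stats.get('avg_rad_p75'))
p95 = ee.Number(stats.get('avg_rad_p95'))

highly_lit = ntl.gte(p95)      # above the 95th percentile in this hexagon
not_highly_lit = ntl.lt(p75)   # below the 75th percentile in this hexagon
```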
Then, in each hexagon, we remove the vegetation areas from the highly lit pixels, we throw the examples, and we label the examples as I showed you previously. If the example falls in an area that is highly lit and not vegetation, it is automatically labeled as built up.
If it does not fall in such an area, it is automatically labeled as not built up. And again, this allows us to collect millions of examples without labor-intensive methods. Then we train the classifier, random forest. The prediction of a random forest
is what is called a posterior probability: what is the probability that a pixel is built up? There is another question: if we want to take this posterior probability and convert it into a binary map, built up or not built up, how do we do that? So we used Otsu's method, which is a method for image segmentation.
Any pixel that exceeded this threshold was labeled as built up, else as not built up. So here you can see: this is the posterior probability map, and this is the binary map, all the pixels whose probability of being built up exceeded the threshold that we defined with Otsu's method.
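Since the talk does not show the implementation, here is a minimal NumPy sketch of Otsu's method applied to a posterior-probability array; the 'probs' array is a hypothetical stand-in for the exported probability map.

```python
# Minimal NumPy sketch of Otsu's method on a posterior-probability map
# ('probs' is a hypothetical stand-in for the exported map).
import numpy as np

def otsu_threshold(values, nbins=256):
    """Threshold that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=nbins)
    hist = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)               # probability mass of the "below" class
    w1 = 1.0 - w0                      # probability mass of the "above" class
    mu = np.cumsum(hist * centers)     # cumulative mean
    mu_total = mu[-1]
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)        # between-class variance per candidate
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(sigma_b)]

probs = np.random.rand(512, 512)                    # placeholder posterior map
built_up = probs >= otsu_threshold(probs.ravel())   # binary built-up map
```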
We also validated the accuracy of this classification, and we found that we got a balanced accuracy rate of up to 81%, which is pretty good. It's very close to what we found in the first study, where we used hand-labeled examples.
We also wanted to map urbanization density, not just built up or not built up. So we created a grid, a fishnet, and for each grid cell, we calculated the percentage of urbanization within the cell. This also allowed us to evaluate changes in density between cities.
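One way to compute such a density grid in Earth Engine is to average the 0/1 built-up map over each cell, since the mean of a binary image equals the built-up fraction; 'binary_map' and 'fishnet' below are assumed names.

```python
# Sketch of the density grid: the mean of a 0/1 built-up image over a cell
# equals the built-up fraction ('binary_map' and 'fishnet' are assumptions).
density = binary_map.reduceRegions(
    collection=fishnet,          # assumed ee.FeatureCollection of grid cells
    reducer=ee.Reducer.mean(),   # fraction of built-up pixels per cell
    scale=30)
```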
I also don't have time to show how we can apply this method to map urbanization also across time. But the idea is that once you have a trained classifier that is trained with Landsat data from 2016 or 17, you can use the same trained classifier to map any previous year,
because it's the same satellite. We are using the same features for classification. So to summarize, satellite data are becoming increasingly available at ever-improving spatial and temporal resolution. We saw Landsat, we saw Sentinel-1, Sentinel-2. With cloud-based platforms such as Google Earth Engine, it is now possible to monitor urbanization
across space and time. And AI and machine learning can be utilized for urban research, including through supervised image classification. But the trick is always how to get these examples. And I showed you three approaches to collect this reference data: manually, by using administrative data as a source,
and by a very basic approach of transfer learning, where we use nighttime light data as a source to collect these labeled examples. And there are, of course, trade-offs. If you hire people to label the examples, obviously the accuracy will be higher. But this is not very scalable
and will not allow you to map urbanization at global scales. So it depends on the application that you want to use this method for. If you want it for taxation purposes, obviously you won't be able to compromise on 70-something percent accuracy. But if you just want a rough estimation of urbanization in a country, then you can compromise.
So thank you very much.
The part on the stratification, when you went from 75% accuracy to 95, I think it was your second approach, can you elaborate a little bit on what it is? Yes, so the question was how did we improve the accuracy from 70-something percent with administrative data to 90-something percent
with the hand-labeled examples? And the reason we improved it is that the classification is done per pixel. If you get a polygon from administrative data, any pixel within this polygon will be labeled similarly, built up or not built up. If you rely on points,
then each point represents a single pixel. So there's much less noise. So in the urbanization example where you talked about the noise,
there seemed to be patches of grass and such, right? But in this case, you make the grid and you classify the grid cell, right? We classified each pixel, yes. The classification was done per pixel, not per object.
Yes. Oh, I guess that was actually my question: in that case, would modern, very high-resolution optical satellites just... Obviously, modern, very high-resolution satellite data would be much better, but this data is not always available for developing countries, for example.
Yes, if the pixel were 10 by 10 centimeters, obviously the result would be better, and then you could also do object-based classification. But if you want to scale it up and do the classification at the level of a country, a region, or the world, then you need to compromise on the size of the pixel.
Hello. I'm curious, how long does it take to run the classifications? How are you finding the times and the resources needed if Google Earth Engine was doing it? So Google Earth Engine is free.
You don't pay for the classification. Yes, you have limitations: you won't be able to train a classifier with millions of examples. But in the case of India, with 20,000 labeled examples, it took a few hours. At global scales, it takes much more,
and you will need to iterate per country, but a few hours. And again, the advantage, and I don't work for Google, right, I just see the advantage of using Google Earth Engine, is that you don't need to download or upload any data.
You can do all the analysis on the cloud, and then you can just download the result. So you can download the map of your urban areas. I saw in your slides that you also ran SVM.
But in the end, were the results different with SVM, or which kernel? SVM was slower. Yeah, I didn't go into all the specifics; I can show you all the specifics. But, sorry, I'm excited: what we did find with SVM
is that when we added more inputs to the classifier, such as NDVI and NDBI, we improved the classification. Random forest knows how to deal with non-linear combinations while SVM does not, and this is why, when you add more features to SVM, you improve the classification.
We have time. So I saw that on the Landsat set of images that you had over a year, I think, or a period of time, you were actually making an average across all of them. Yeah. And I was wondering,
because if you do aggregation, you kind of lose some information in the process, whether using all those images, after removing the ones with lots of cloud, would have an impact on the performance. Because I assume the trade-off there is
the computation, because you will end up with triple or more, like 10 times more, data. Yes, you are right. I think that creating composites per season, instead of annual ones, would improve the accuracy. But in terms of computation time, it would take more time.
We also did another few research projects for the World Bank on mapping agricultural land productivity. And there, obviously, even a quarter will not be enough. You need to do it every two weeks.
Hi, I wanted to know if you have a study on what went missing in the classification. So if you have an idea: what was the finest urban scale you managed to detect with these methods? Or what kind of problems did you find in this?
So the finest resolution obviously depends on the resolution of the satellite. Here we used Sentinel-2,
which has a spatial resolution of 10 meters. And you can see that we are even able to detect individual structures. So you won't be able to extract the footprint of a building, but generally you will be able to detect the location of a structure.
I would like a round of applause for Ran. Thank you for your attention. This session is over. And it's time for lunch on the first floor. Thank you.