
Urban Geo Big Data


Formal Metadata

Title
Urban Geo Big Data
Title of Series
Number of Parts
295
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Nowadays about 54% of the world population lives in urban areas and, according to the 2014 UN-ESA report, this percentage is expected to increase up to 66% by 2050. We are clearly facing a rapid and global trend that will affect daily life in the next few decades. It is, therefore, crucial to manage this social and cultural change in a much more sustainable way than was done in the past. Within this framework, the collection, integration, and sharing of reliable and open spatial information is a key factor, benefiting from both space technologies (Earth Observation (EO) satellites and Global Navigation Satellite Systems (GNSS)) and ground technologies (low-cost devices networked in the Internet of Things (IoT), of which 50 billion are expected by 2020). The contribution gives a general presentation of Urban Geo Big Data, a collaborative, acentric and distributed free and open source platform consisting of local data nodes for data and related Web service deployment, a visualization node for data fruition, a catalog node for data discovery, a CityGML modeler, data-rich viewers based on virtual globes, and an INSPIRE metadata management system enriched with quality indicators for each dataset. For data visualization and analysis, a 3D model of the urban environment was created. CityGML is an open standard that has been thoroughly tested in the past years. One of the activities in this project was to create an Extract, Transform and Load (ETL) procedure for converting information from cartographic sources into CityGML at LOD1 (Level of Detail 1). Data are viewable by means of Cesium or Web World Wind, depending on the specific case examined. Three use cases in five Italian cities (Turin, Milan, Padua, Rome, and Naples) are examined: 1) urban mobility; 2) land cover and soil consumption at different resolutions; 3) displacement time series.
Concerning mobility data and analysis, particular attention has been given to data modeling and processing algorithms, with the aim of delivering value-added information enabling standard and innovative services (Origin/Destination matrix, flow checking, routing options, etc.) based also on crowdsourced data. Land cover and soil consumption data derive from semi-automatic classification of Sentinel-1 and Sentinel-2 imagery, integrated with Copernicus land monitoring services at different resolutions and enhanced by photo-interpretation. Several environmental and landscape indicators are assessed at municipal level, exploiting spatial datasets. For displacement, SAR-derived time series and the related Web services (WMS, WFS, and WMTS) metadata in RNDT format (the Italian extension of the INSPIRE format) are automatically generated, thus relieving the data provider from the need to create them manually. Besides the case studies, the architecture of the system and its components will be presented.
Transcript: English (auto-generated)
So, I think we can start to introduce the second speaker of this session. Most of you know her, she's Professor Brovelli from Milan, and she is going to tell us about the outcome of this project, this Urban Geo Big Data project.
So please, the floor is yours. Thank you and good morning everyone. This project is about, you see, urban geo big data. You see also many names as authors, but as a matter of fact, we had to put many, many
other names, because this is a big project, it's not finished, but it started more than two years ago. And as you see, many research units and universities are involved, and we decided to put only the
name of the leader of the different units. But as a matter of fact, every unit is and was composed of four or five people. So for instance, there are some people who don't appear here, like
Eylül, who is in my research unit and will give the next presentation, about one specific topic we were working on. So my presentation is a very general one, it's just an overview about what we did.
I'm not entering into the details, but at the end of my presentation, you will find all the papers that we wrote by now. So if you are interested in one specific topic, you can contact directly the authors
of the paper, or if not, you can contact me and I will put you in contact with the people. I'm not able to answer every question because I was just following mainly the research done
by my unit, and I was managing the project as a whole because I'm the leader of the project. Okay, so first of all, we can start with the motivation. And the main motivation is that nowadays, 54% of the world population lives in urban
areas. And this is really impressive, but according to the 2014 report of the United Nations and the European Space Agency, this percentage is expected to increase up to 66% by 2050.
It means billions of people, and so we have to consider that having so many people concentrated in these areas can cause many, many problems that we have to face. Or better:
it can be an opportunity, or it can become a problem, depending on what we do beforehand and how we plan the future cities. The part that is very relevant for us is that if we consider all the data describing
everything around us, it is recognized that more or less 80% of the data is geographic. So it has a meaning, or a greater meaning, if we consider where it happens. Then the other point that is interesting for us is also that if we want to consider these
geospatial data, we are dealing with a great variety of data, starting from data from surveying, GNSS, photogrammetry and remote sensing, laser scanning, mobile
mapping, geolocated sensors, geotagged web content, volunteered geographic information, and so on. So it's very varied; it's not just one type of data, we have many, many types of data. And the main problem is to find efficient ways of handling and integrating all those
data. Okay, so this was the motivation of the project, and as I told you, many universities are involved. We started the project in 2017, and so the project lasts three years, so we have six
months now for finishing everything and wrapping up. The idea of the project is to develop innovative geographic information methodologies and
tools to exploit the integration of what we call traditional geomatics data, earth observation data, and statistical data with newly generated data, like the data generated by the crowd. The approach that we decided to adopt with this project is what is called the data-driven
approach. So obtaining information from the data, this is the main point. And the aim, the final aim, is to provide tools to be made available to the decision
makers in order to be able to better manage the cities. Okay, so the other relevant point is that, for the data mining, we decided to consider specifically open data.
Therefore, our procedures, tools, and so on are based completely on the open data available. One main problem that we had to face is that the situation, even referring
just to Italy, because this was and is an Italian project, so we are concentrating on some cities in Italy, is far from homogeneous. There are cities where there are many open data available, there are cities where there
are no open data available at all, or very few open data. So we had to deal with a very inhomogeneous situation. And the second point we had to deal with was the problem related to big data.
When we speak about big data, we are not only referring to the volume, so a huge amount of data. We have a huge amount of data in some cases, but we are referring also to velocity, so real-time generated data. We are referring also to variety, so data that can be structured and that you have
to merge or to integrate with data that are unstructured. And then we have to deal also with what is called veracity, so considering the accuracy and the uncertainties related to the data.
Okay, this problem is very general, so we had to decide to consider some use cases. And specifically, we decided to concentrate on three use cases. One is mobility, the second one is soil consumption, and the third one is displacement.
The reason why we decided to concentrate on these three is that they are very different with respect to the procedures and the methods that you have to adopt. So for instance, in the case of soil consumption and displacement, in order to deal with these phenomena, you have to consider long-term data, and you have
to consider that these phenomena are distributed over an area. By contrast, in the case of mobility, we have to consider that generally, from the point of view of time, this is more limited.
We are more interested in real-time phenomena, and it is a phenomenon that we can consider as a vector one, distributed on a linear graph. Then the other difference is that in the case of soil consumption and displacement, we are dealing
mainly with satellite images, and we can integrate these data with other ancillary data, but the main sources are satellite imagery. By contrast, in the case of mobility, we are dealing more with in-situ sensors
or sensors that are on the cars. Okay, as I told you, we are dealing with some Italian cities.
So we defined these three use cases, but we defined also the cities we wanted to work on. The choice of the cities was motivated, first of all, by the fact that the different research units are in these cities, and the second point
was that at least some of those cities are the most populated in Italy. So, for instance, if you consider Milan, which is the city where we have the university, you can see that the population of Milan, the municipality of Milan, is a bit more than
one million, we can say, but if we consider the metropolitan area of Milan, it is much more because we reach eight million people, and you see here the inhabitants of the different cities.
This is a research project. In every step and in every part of the implementation, we decided to use technologies that are innovative.
So, for instance, just an example: dealing with the viewer of our system, we decided not to consider 2D visualization. We could have decided to use OpenLayers, Leaflet, and so on for showing the data, but instead we decided to concentrate
on the usage of virtual globes, and you will see later, Eylül will explain to you what we did. Okay, so for the first case, mobility, the first problem we had to deal with was
the problem of collecting data. So, in the end, we were able to collect data from both private and public vehicles. At least in some cases, we had some samples of data, in such a way
as to start studying the situation in all the cities. What we had to deal with, with respect to the tools and methodologies, was to merge
what we have in the field of GIS and what we have in the field of what is called Intelligent Transportation Systems. Very often, these two worlds are not speaking to each other, so you have to understand the
ontologies and the procedures, the methodologies that are applied in the two worlds in order to merge them. So, the main problem was to put together all the pieces of information coming from people working in these fields in such a way as to define a unique GIS spatial data model.
And specifically, also in this case, we had to choose what we wanted to do, because obviously this theme is very, very broad. And we decided to concentrate for now on these themes:
traffic conditions, with almost real-time map generation; road graph impedance, which by contrast is given considering the mean travel speeds
and times; routing preferences; and the spatial origin-destination matrix. So, this is what we now have available for those five cities, starting from some samples that were made available as open data.
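Editorial illustration, not part of the talk: two of the mobility products named above can be sketched in a few lines. The sketch assumes that an edge impedance is simply the travel time implied by the edge length and a mean speed, and that an origin-destination matrix is a count of trips between zones. The function names, zone labels, and sample trips are hypothetical and do not come from the project.

```python
from collections import Counter


def edge_impedance_s(length_m, mean_speed_kmh):
    """Travel time in seconds along one road-graph edge at the mean speed."""
    if mean_speed_kmh <= 0:
        # A blocked or unobserved edge is treated as impassable (assumption).
        return float("inf")
    return length_m / (mean_speed_kmh / 3.6)


def od_matrix(trips, zones):
    """Count trips per (origin, destination) pair, rows = origins."""
    counts = Counter(trips)
    return [[counts[(o, d)] for d in zones] for o in zones]


# Hypothetical sample: three zones and four observed trips.
zones = ["A", "B", "C"]
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]
matrix = od_matrix(trips, zones)
```

With real data the trips would come from map-matched vehicle tracks, and the mean speed per edge would be estimated from the same samples.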
The second case that we are considering is the case of land cover and soil consumption. The reason why we are considering it is that soil is recognized
as the most important non-renewable resource. Generally, we speak about soil consumption when there is a change from a non-artificial to an artificial land cover.
And what is really crucial is that doing the opposite is a slow process. So, if something becomes artificial, it requires time to be converted again to non-artificial. Okay, so the point, also in this case, is to start from earth
observation data, and so from the earth observation platforms, to assess the status of the soil. Okay, so we did that.
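Editorial illustration, not part of the talk: the definition given above (a change from non-artificial to artificial cover) can be written as a comparison of two classified grids. This is a minimal sketch assuming two co-registered binary rasters, with 0 for non-artificial and 1 for artificial cells, and a cell area of 100 m², which corresponds to a 10 m pixel. The function name and the sample grids are hypothetical.

```python
def soil_consumption_m2(before, after, cell_area_m2=100.0):
    """Area that changed from non-artificial (0) to artificial (1)."""
    consumed_cells = 0
    for row_before, row_after in zip(before, after):
        for b, a in zip(row_before, row_after):
            # Only the 0 -> 1 transition counts as soil consumption;
            # the reverse change (1 -> 0) is deliberately ignored here.
            if b == 0 and a == 1:
                consumed_cells += 1
    return consumed_cells * cell_area_m2


# Hypothetical 2 x 3 classifications of the same area at two dates.
before = [[0, 0, 1],
          [0, 1, 1]]
after = [[0, 1, 1],
         [1, 1, 0]]
consumed = soil_consumption_m2(before, after)
```

Two cells change from 0 to 1 in this sample, so the result is 200 m².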
We are doing that using semi-automatic procedures and starting from earth observation data, specifically the Sentinel-2 images, but also the Sentinel-1 data. And using also as ancillary data, many other data that you can see, also VGI data,
in situ data and so on. And we have the statistics about what happened in the last years. The third case that we consider is a case of displacement. The main reason is because in Italy, we have some problems related to displacement
in the cities. And the other reason is that in this case, we have also huge archives of radar images; specifically, what we considered are ERS and Envisat satellite data from 1992 to 2011.
All the images that are available. So, we derive this dynamic map of displacement that you can visualize on the website, but
that you can also query in order to see, at the individual points, what happened in the last 20 years. The other point that we have to deal with, because as I told you we are using
virtual globes for 3D visualization, is the 3D modeling of the data, starting in most cases, in all cases in fact, from an Esri shapefile. And so, one part of the project was devoted to creating a procedure that starts from an Esri
shapefile and obtains CityGML, which is the standard that can be used for the visualization of the data. Okay. I have three minutes, three minutes. So, very quickly, going to the infrastructure, we decided that the best way was to have
a distributed architecture in which every unit makes the data it created available in a standard way, using OGC standards. And there is a common endpoint node in this spatial data infrastructure that provides the discovery facilities.
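Editorial illustration, not part of the talk: the shapefile-to-CityGML step mentioned a moment ago comes down, at LOD1, to extruding each building footprint into a flat-roofed block. The sketch below assumes the footprint (a counter-clockwise list of x, y vertices) and the height have already been read from the source file; reading the shapefile itself is left out. It uses CityGML 2.0 element names, and the function names are hypothetical. It is not the project's actual ETL procedure.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
BLDG = "http://www.opengis.net/citygml/building/2.0"
ET.register_namespace("gml", GML)
ET.register_namespace("bldg", BLDG)


def extrude_footprint(footprint, height):
    """Return closed 3D rings (floor, roof, walls) of a flat-roofed block."""
    n = len(footprint)
    # Floor is reversed so that its normal points downwards (outwards).
    floor = [(x, y, 0.0) for x, y in reversed(footprint)]
    roof = [(x, y, height) for x, y in footprint]
    walls = []
    for i in range(n):
        (x1, y1), (x2, y2) = footprint[i], footprint[(i + 1) % n]
        walls.append([(x1, y1, 0.0), (x2, y2, 0.0),
                      (x2, y2, height), (x1, y1, height)])
    # GML linear rings repeat the first vertex at the end.
    return [ring + [ring[0]] for ring in [floor, roof] + walls]


def building_lod1(footprint, height):
    """Build a bldg:Building element carrying an lod1Solid geometry."""
    building = ET.Element(f"{{{BLDG}}}Building")
    measured = ET.SubElement(building, f"{{{BLDG}}}measuredHeight", uom="m")
    measured.text = str(height)
    lod1 = ET.SubElement(building, f"{{{BLDG}}}lod1Solid")
    solid = ET.SubElement(lod1, f"{{{GML}}}Solid")
    shell = ET.SubElement(solid, f"{{{GML}}}exterior")
    surface = ET.SubElement(shell, f"{{{GML}}}CompositeSurface")
    for ring in extrude_footprint(footprint, height):
        member = ET.SubElement(surface, f"{{{GML}}}surfaceMember")
        polygon = ET.SubElement(member, f"{{{GML}}}Polygon")
        exterior = ET.SubElement(polygon, f"{{{GML}}}exterior")
        linear_ring = ET.SubElement(exterior, f"{{{GML}}}LinearRing")
        pos_list = ET.SubElement(linear_ring, f"{{{GML}}}posList",
                                 srsDimension="3")
        pos_list.text = " ".join(f"{c:g}" for pt in ring for c in pt)
    return building


# Hypothetical 10 m x 10 m footprint with a 9 m height.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
building = building_lod1(square, 9.0)
xml_text = ET.tostring(building, encoding="unicode")
```

A complete CityGML file would wrap such buildings in a `CityModel` with `cityObjectMember` elements and declare the coordinate reference system, which this sketch omits.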
The other relevant point with respect to the architecture is that everything is implemented using free and open source software. With respect to discovery, we have a catalog service that is based on GeoNode. As for the metadata,
there is something special with respect to a general catalog service, because in Italy we have a standard that is a bit different from the INSPIRE one. So, this is based on the Italian standard, that is, the RNDT. And the metadata offered by the different nodes of the infrastructure are periodically
harvested in such a way that the catalog is updated. About the 3D client, I'm saying nothing, because the next presentation will be related to the ground information visualization.
And then if you are interested in this specific topic, so virtual globes for web visualization, there will be also tomorrow the presentation in operator room in the EO challenge about the visualization of land cover and soil consumption.
So, this is the end of my presentation, just to conclude. It's a very general overview. The point is that in the future we will have to deal with a growing amount of spatial data. What we decided to do is to create some procedures and some guidelines and also a spatial
infrastructure for dealing with the data of these five cities in Italy. And the most relevant point, what we have in mind, is that we have to start from the spatial data and convert them into spatial information that can be used by
everyone, as a matter of fact, for planning and managing the future of cities. Thank you. Thank you for the presentation.
To be on time. Nice working. There are questions. No questions. Oh, it's good. I have, okay. Please. Hello. It's so hard to say a data is a big data or just a data.
So, yeah, what's the number of data you're using for a separate place? Maybe. Do you have the maximum number of layers or the maximum number of data that you have collected? Oh, I don't have in mind exactly, but if you consider only the last case that I
presented, the case about displacement and the fact that we considered all the radar images in 20 years, you can imagine how big is the archive of data.
I don't know exactly how many gigabytes, but it's 20 years of images. Or with respect to mobility, for instance, we have some samples, but these samples are
detailed in terms of time: they are given every, I don't know, five seconds. I don't remember if it is five seconds. And even if you are dealing with two months of data with a resolution of five seconds, you are dealing with a big amount of data.
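Editorial illustration, not part of the talk: the order of magnitude behind this answer can be checked with simple arithmetic. The sketch assumes two months means 60 days, takes the five-second interval at face value even though the speaker is unsure of it, and treats logging as continuous, so the result is an upper bound per vehicle. The fleet size is purely hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_vehicle(days, interval_s):
    """Upper bound on position samples for one continuously logged vehicle."""
    return days * SECONDS_PER_DAY // interval_s


per_vehicle = samples_per_vehicle(days=60, interval_s=5)

# Hypothetical fleet size, only to show how the volume scales.
fleet_size = 1000
fleet_total = per_vehicle * fleet_size
```

Under these assumptions one vehicle yields about one million samples, and a fleet of a thousand vehicles about one billion.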
And this is the same also for the land cover, because for the land cover, if you start considering all the satellite imagery, in the case of Sentinel-2 we now have a repetition every few days. And so if you are dealing with that, it is also a very big amount of data.
I didn't enter into the detail of the data, but it was big. It is big. There are other questions. Just a quick question. I know, I know, because I'm part of the project, but it's just maybe to explain
because we speak about open data, why vehicle data are really difficult, as we know, to get. So just share the experience of how much we struggled to get them. Yes, no, no. Obtaining data about vehicles is very complicated,
also in the case of public transportation. In fact, we had public transportation data for only two cities. We are still asking and waiting for the answer from the other cities.
By contrast, we were lucky enough to obtain some open data about private vehicles. And those data were related to the insurance of the car, because for some insurance policies you have to put this box in the car.
I don't know if you have it; I have it. And this is collecting data that can be anonymized and can be shared. So we obtained this kind of data as open data for a certain period. And it's very important because it's a really rich data set that you have to consider for
starting with, yeah. Okay, thank you. We have time for one fast question in case. Oh, probably I can add also something else because I did not enter in many other details like, for instance, when I say that the situation in the cities is very, it's definitely
not homogeneous. You have to consider that, for instance, for Milan, we have a very good database, what is called the topographic database. In the case, for instance, of Rome and Naples, which are two important cities, it doesn't
exist. The topographic database of Rome and Naples doesn't exist. So we had to use OpenStreetMap for those cities because it's the only available database. And so we had to deal also with problems related to the quality of OpenStreetMap.
So we developed also some approaches for assessing OpenStreetMap data, comparing them with the existing topographic database, and so on. So the research is much richer. But as a matter of fact, you find here at least all the papers that we published
by now. And in the next six months, for sure, we are going to publish more because we have to wrap up all the outcomes of the project. Okay, thank you so much. Perfect on time. So I will invite you to stay for the next talk.
We have the five minutes in case you need to move.