Geoscan: spatial data country profile
Formal Metadata
Title | Geoscan: spatial data country profile
Number of Parts | 237
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/57269 (DOI)
FOSS4G Argentina 2021, 140 / 237
Transcript: English (auto-generated)
00:17
So whichever of you wants to share, basically what you do is if you share your screen now,
00:23
one of you or the other should share the screen, then I basically add it to this stream like this. And basically in two minutes at 8.31 my time, I will introduce you and then I'll turn myself off
00:44
and then come back at the end to help field questions. Shall we switch off the cameras and microphones for each other? So first I'll start, then I'll pass to Giuseppe, or what's the recommendation, if anything?
01:01
What you can do, I think you can, do you have the ability to mute yourself? Yep. So yeah, that's what I would do. I'll just disappear and then whoever's gonna talk first, unmute yourself and you talk and then switch it over to the other guy.
01:23
Perfect, great, thank you. So I'll give people one minute and I'll let you guys introduce yourself. Again, I apologize for the mishap. I don't have your bio handy.
01:41
No worries, no worries. Okay, good morning. Welcome to day two of FOSS4G, in the Puerto Iguazú room.
02:04
I'd like to introduce Lyubov Filipov and Giuseppe Reimann. Apologies for the bad pronunciation. They will introduce themselves shortly as they enter the presentation
02:21
and teach us about GeoScan, a spatial data country profile tool. Thank you very much for participating and the stage is yours. Great, thank you, Michael. Good morning, good afternoon and good evening to everyone and thank you very much for joining our talk today.
02:43
My name is Lyubov Filipov. I'm a GIS consultant at the International Fund for Agricultural Development (IFAD), and today with my colleague Giuseppe Biomonte, we're going to try to show you something hopefully interesting, a project which we have done for IFAD.
03:02
So what is IFAD? First, a little bit of background on the topic. Just one second.
03:31
A little bit of background information on IFAD. I hope you guys can still see my screen. Giuseppe, if you can just confirm. No?
03:41
I don't. Michael, is the screen visible? Sorry. Okay, thank you. So a little bit of background on IFAD. IFAD is a specialized UN agency but also a financial institution
04:02
which has a very specific mandate to try to support the poorest of the poor, rural farmers in remote agricultural areas. So what we did in IFAD in terms of, let's say, GIS challenges or GIS requirements, we shaped in the form of this application called Geoscan.
04:23
First, we receive various requests for different geospatial data in a number of different countries, starting from, let's say, Solomon Islands, moving to Cambodia, India, Pakistan, shifting all the way to Latin America or the African region. And these data requests usually come
04:41
from different domain areas. So social, environmental, economic, climate-related data. And usually, of course, they have to be delivered within a very short timeframe. So one thing is to produce a number of data sets for Solomon Islands, but larger countries like China or India or Brazil are much more challenging in terms of data processing,
05:03
storing, and manipulation. A further requirement we explored is that very often the data needs to be provided for offline usage. So usually, to be used in the field or in areas without good internet coverage or internet connection,
05:23
or shared with a third-party agency or third-party companies outside the internal network of operations. We also receive different requests from various user types. So GIS professionals who want to export actual data to do some additional modeling, analysis,
05:41
and queries on the actual GIS files, or non-technical, non-GIS users who do not necessarily know all the technology behind GIS and simply need a report or a map to embed in a report or a presentation, or to simply share with a third-party agency. And also higher-level decision makers
06:01
who are basically looking to take advantage of the visualization power of the maps themselves in a more data-driven approach, so they can prove their point in various negotiation phases or various decision-making phases. So all these different types of users and data are very well structured in the workflow process of IFAD,
06:23
starting from a strategic country overview, going through a particular project design and the project monitoring and evaluation. All these activities should be aligned with the existing GIS infrastructure in IFAD, which is entirely free and open-source based,
06:40
built around Postgres, OGS, GeoServer, GeoNode, OpenLayers. So we try to stick with the good principles of open source. As additional requirements on our end as GIS guys, we try to follow good practice on naming conventions, on test, staging, and production deployment environments,
07:02
on security by third-party independent penetration testing companies, et cetera. And we also try to do the whole project in a more agile approach. So we started back in 2019 with just a few countries. We proved with different users the overall approach
07:21
and we came back in 2020 on a much bigger scale, covering an entire IFAD region, West and Central Africa. And now this year we are doing sort of a third version of the application, which now has global coverage and global scale. And hopefully we're going to expose this
07:41
outside the IFAD network as well, basically for public usage by the outside world. This is a snapshot or a description of the whole process we did in a nutshell. So these various socioeconomic, environmental, climate and a number of other domain areas we needed to cover with a proper literature
08:02
and data review in the beginning. So we benchmarked a number of data sources, some of which you see on the screen, approximately 28 different data providers. And out of these 28 different data providers we benchmarked more than 180 geospatial layers.
08:22
And we shaped all this into proper documentation and a metadata description. And then we moved to structure the various data sets into a common naming convention which basically covers our internal IFAD needs. We shifted then as the next step to processing:
08:42
various conversions, structuring, renaming, analytical and statistical calculations, which Giuseppe will touch on in a bit. And the output products were targeting first a nice GIS package with all these 180 layers of information with nice styles, nice visualization,
09:03
nice QGIS project all wrapped together for one particular country. So the users basically click and grab all the data they need and use it offline or share it with a third party local vendor, local provider, agency, et cetera. The next output is in a sort of automated reports
09:21
in a PDF format containing a number of maps, diagrams, and traditional visuals which are ready for the end user to grab, export, and take, let's say, offline. In order to align with the IFAD infrastructure, we placed the whole application in the IFAD Enterprise GIS system
09:43
which is based on GeoNode. So all the applications, all the layers and web applications are exposed through GeoNode, allowing the user to search by country, by topic, by different area and to go to the particular country data of interest they need.
10:01
And then at the very, very end, we focused our, let's say, not so good design skills, because we are map-oriented guys, but we tried to develop an interactive, user-friendly, high-level dashboard with a selected set of indicators, with high-level statistics on a regional level,
10:22
on a country level to be more engaging for a higher level decision-making audience. This is a beautiful and I guess very difficult to read snapshot of our data model but basically we use this very often to engage with the end user. It's not, let's say a UML diagram or the data model
10:41
but it's based on the ISO theme classification and we engage with the user explaining the different layers of information structured in the different data themes. And this has proved very useful to try to explain all the complicated content which we have behind the scenes
11:01
in the various different levels of different applications. These are the nice logos of all the data providers which we're using. Again, the emphasis here is that we do have requirements to provide offline or off the grid information. So all this is basically usually grab,
11:23
downloaded, pre-processed, packaged and then delivered to the user in these various application streams which I mentioned. The first application stream is this IFAD GeoNode portal which I mentioned, allowing the user to search by country, by topic
11:40
or just zoom to a particular area and get all the data which is available underneath the particular zoom level. And of course they have full access to all layers. They can search, they can export, they can even create their own web map applications and share them with other users or departments.
12:01
This is what our dashboard application looks like. And we aim to expose this towards the end of the year. For now it's available only for internal needs, with some penetration and security testing still pending before it can be exposed to the outside. It's a very standard way to present
12:21
a geospatial web application. We do have the table of contents on the left. What is powerful and what we find very useful is exactly engaging on higher level decision-making. For example, this simple snapshot is displaying the temperature increase in the future scenario in 2061
12:40
for the West and Central Africa region, displaying the areas above 30 degrees Celsius, which are getting bigger and bigger in the future scenarios, and also overlapping these with the current economic activities or GDP by district, basically showing that the areas of greater climate change impact
13:03
are going to affect the poor regions of these countries, which is of particular interest to IFAD. And I'll pass the floor to Giuseppe to share some final challenging tasks which we found on our way to developing the whole process.
13:21
Thank you. Thanks, Ljubo. So this is a very simplified scheme of how we treat the data. So after the data search, data review, data gathering phase, we end up with a lot of data, some of which is global, some of which is just local.
13:42
So for some countries or regions, both in vector and raster format. So what we do is we clip them so that we obtain data for the specific country if the source is global, or we verify and pass on the data if it's local.
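
The clipping step described here can be illustrated with a minimal sketch. It assumes a point layer and a single boundary polygon, and it uses a plain ray-casting test rather than the QGIS tooling the team actually used; the names `point_in_polygon`, `clip_points`, and the square "country" are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count the polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_points(points: List[Point], boundary: List[Point]) -> List[Point]:
    """Keep only the points that fall inside the boundary polygon."""
    return [p for p in points if point_in_polygon(p, boundary)]


# Hypothetical square "country" and a small "global" point layer
country = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
global_points = [(5.0, 5.0), (15.0, 5.0), (-1.0, 2.0), (9.5, 0.5)]
clipped = clip_points(global_points, country)
```

A production workflow would clip polygons and rasters as well, and would normally rely on a GIS library for that; the sketch only shows the idea of cutting a global source down to one country.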
14:04
And we have also created the styles for proper visualization for all of these data sets that are packaged all together, keeping a very precise and defined data structure and the precise data naming convention
14:21
that fits the needs of the agency. On top of this, of course, we use this quantity of data sets to calculate statistics. We have two geographical references. One is within administrative boundaries.
14:41
Let's think of district level. Another is a hexagon grid that has been designed in-house and has global coverage. So in the end, we obtain a lot of statistics, both by district and, more granular, on this hexagon grid.
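
As a rough illustration of computing statistics against two geographical references, the sketch below calculates a zonal mean over a tiny raster, once per district and once per hexagon cell. It assumes the zones have already been rasterized onto the same grid as the values; the function `zonal_mean` and all numbers are made up for the example and are not taken from the Geoscan code.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def zonal_mean(values: List[List[Optional[float]]],
               zones: List[List[int]]) -> Dict[int, float]:
    """Mean of raster cell values per zone id; None marks no-data cells."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is None:
                continue  # skip no-data cells
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in counts}


# Hypothetical 3x4 raster and two zone layers over the same cells
temperature = [
    [30.0, 32.0, 28.0, None],
    [31.0, 33.0, 27.0, 26.0],
    [29.0, 35.0, 25.0, 24.0],
]
districts = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
hexagons = [
    [10, 10, 11, 11],
    [10, 12, 12, 11],
    [13, 12, 12, 13],
]
by_district = zonal_mean(temperature, districts)
by_hexagon = zonal_mean(temperature, hexagons)
```

The same value raster yields coarser figures per district and finer ones per hexagon, which is the trade-off the two references are meant to cover.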
15:01
And also we make use of this amount of data that we have to calculate additional statistics doing just math calculations on the tables. So click please, Ljubo.
15:23
So basically we have this country data, the statistics and the visualization for every single one of them in this nice, well-organized GIS data package.
15:42
From that, we also create some automatic reports that now cover over 150 countries that go, yes, from Afghanistan to Zimbabwe. So all of the countries in which the agency works. And it's not just maps,
16:01
but also we have nice graphs and statistics. I mean, a lot of information. Again, a very interesting use case as Ljubo was mentioning is that, first of all, we have some users that are not necessarily GIS savvy. So they can use a pre-packaged product
16:23
that contains all the information that they need. And also we have a use case, which is offline usage. So our people on the ground, who maybe don't have access to the internet, working in very remote places, actually have some form of access to information
16:43
because they have a PDF or they can print it out. So they can carry all that amount of information with them. It was a long journey and yes, a long coding journey, which started as well as an innovation challenge.
17:02
So we started with a very humble script, which is all Python. And well, then it evolved over time with the needs, with the quantity of data that we had. And in its final form, it is a nice QGIS plugin
17:23
that makes everything quite simple. Also, you can barely see it probably because it's a bit small, but it's a parallel data processing, the first box, parallel statistics computation.
17:41
And that's because at some point, managing all of this quantity of data sets actually started to be a computational burden and to take a lot of time, so we implemented parallel processing and actually this allowed us to process all of the data
18:03
in a much shorter amount of time. And also to scale up our operations to global coverage without having to wait days to produce the data, because sometimes these are time-sensitive operations.
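
The idea of running one independent job per country on a pool of workers can be sketched as follows. This is not the plugin's code: `process_country` is a hypothetical stand-in for the real per-country work, and a thread pool is used only to keep the example small. For CPU-heavy raster work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and would be the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def process_country(iso_code: str) -> str:
    """Hypothetical stand-in for per-country work (clip, style, statistics)."""
    return f"{iso_code.lower()}_package"


def process_all(countries: Iterable[str],
                worker: Callable[[str], str],
                max_workers: int = 4) -> Dict[str, str]:
    """Run one independent job per country on a pool of workers."""
    countries = list(countries)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with countries
        results = list(pool.map(worker, countries))
    return dict(zip(countries, results))


packages = process_all(["AFG", "FJI", "ZWE"], process_country)
```

Because each country is processed independently, the jobs need no coordination beyond collecting the results, which is what makes this kind of workload easy to parallelize.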
18:22
So we need to actually have a very fast turnaround time to produce this data and give them to our colleagues that need to use them. This is the preliminary benchmark
18:41
of the impact of parallel processing, which is between the first iteration of GeoScan and the second one. But then after that, we improved on it. So we're saving even more time now. So it was an interesting journey. We had a lot of fun
19:01
and it was not a journey without challenges, but we ended up doing some interesting things. One of our activities was to play this game within the team, which is called foreign language or encoding error. And you're all invited to play that with us.
19:21
So sometimes you end up with a table that has characters that look like that. And so you question yourself and you start to be aware of your knowledge gaps and you ask, is this a foreign language or an encoding error? I don't know what your guess is, but this is actually Armenian.
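
One way to approach the "foreign language or encoding error" question programmatically is sketched below, using only the Python standard library. It is an illustration, not the team's method: `scripts_used` reports which Unicode scripts the letters belong to, and `try_repair_mojibake` reverses one common kind of damage (UTF-8 bytes that were read as Latin-1). Both function names and the sample strings are assumptions made for the example.

```python
import unicodedata
from collections import Counter


def scripts_used(text: str) -> Counter:
    """Count letters per script, using the first word of each Unicode name."""
    counts = Counter()
    for char in text:
        if char.isalpha():
            name = unicodedata.name(char, "UNKNOWN")
            counts[name.split()[0]] += 1
    return counts


def try_repair_mojibake(text: str) -> str:
    """Undo the common 'UTF-8 bytes read as Latin-1' mistake, if present."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not that kind of damage; leave unchanged


armenian = "Երևան"  # genuine Armenian text (Yerevan)
broken = "São Paulo".encode("utf-8").decode("latin-1")  # simulated damage
```

Text whose letters all belong to one consistent script, as in the Armenian sample, is likely a real language, while a string that changes under `try_repair_mojibake` was probably decoded with the wrong encoding. Neither check is conclusive on its own, so a human look at the result is still needed.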
19:42
This was an easy one. When other stuff like this showed up in a table after the Armenian one, we were questioning it. So again, foreign language or encoding error? This time it was an encoding error, yes. And this was probably the most interesting case
20:01
because yes, we were actually thinking it was an encoding error. And actually we just became aware of our ignorance, discovering that it's a proper language, a proper script, and that our data set was actually absolutely correct. Moving on, sometimes the game was: where is the border?
20:24
And again, this was challenging and interesting, although this is not as fun, because what we're showing here in this very streamlined map is several disputed borders
20:42
or disputed areas around the world. And actually this was challenging on the one hand, to actually get to define where the border was, and also extremely interesting because these are not just maps,
21:02
but these areas also indicate places of conflict, places where people have been displaced. So each and every case was actually an opportunity for the whole team to have a better understanding of what happens in parts of the world
21:23
that sometimes are very remote and not under the spotlight. Sometimes the issue was unexpected and hilarious, sometimes also frustrating. So one of the questions that we have to ask ourselves
21:43
is where is Fiji? And well, yes, you would say in the Pacific, but actually a map could just look like this, which is not ideal for visualization purposes, but you're all GIS people here,
22:03
so you know the answer to this, which is, well, reprojection, isn't it? So next slide, please. Yes, reprojection, right. Yeah, except that reprojection doesn't always work as intended. So these are the big guys that know how to do these
22:23
things. On the left we see Google, and you can see that there is a little data gap there. And even in Bing Maps, there is a data gap. So sometimes the solution seems straightforward, but it isn't, and we took this to the next level.
22:41
And this is the issue that we had to face. So we had this mirroring issue that was a bug, but in the end, and that's my last slide, I have to report that we managed to do what we were supposed to do and that Fiji is all in one place.
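
The reason a country like Fiji ends up split across both edges of a map is that its longitudes straddle the 180° meridian. A minimal sketch of one common workaround, shifting negative longitudes into a 0 to 360 range, is shown below. It is an assumption-laden illustration, not the fix the team applied to their mirroring bug, and the coordinates are rough, made-up points rather than real boundary data.

```python
from typing import List, Tuple

Coord = Tuple[float, float]  # (longitude, latitude) in degrees


def to_0_360(coords: List[Coord]) -> List[Coord]:
    """Shift negative longitudes by +360 so they follow on from +180."""
    return [(lon + 360.0 if lon < 0 else lon, lat) for lon, lat in coords]


def lon_extent(coords: List[Coord]) -> float:
    """Width of the longitude range covered by the coordinates."""
    lons = [lon for lon, _ in coords]
    return max(lons) - min(lons)


# Rough, illustrative points either side of the 180° meridian near Fiji
fiji_like = [(177.0, -17.5), (179.5, -16.5), (-179.0, -17.0), (-178.5, -18.5)]
shifted = to_0_360(fiji_like)
```

In the original coordinates the points appear to span almost the whole globe, while after the shift they sit within a few degrees of each other. A custom projection centered on the Pacific achieves a similar effect for map display.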
23:01
I thank you. And I leave the floor to Ljubo for his final thoughts. Thank you, Giuseppe, and sorry, Michael, for going a little bit over the line. Just on behalf of the whole team, thank you for joining the talk. I'm happy to take any questions from you guys. Over to you, Michael, thank you.
23:21
Very nice job, guys. Thank you very much for the talk. Has definitely generated some interest and some questions. I think we have about three minutes or so to go through those questions. So I'll start handing them off to you quickly. First question is, someone else has the same task
23:42
of collecting, embedding broad swaths of data across many countries. Do you have any published work with more detail about how you went about collecting data or your findings regarding the comparison between different data sources? Excellent question. We do have the final documentation,
24:01
which basically describes in detail the final product selection. Internal benchmarking was done among the team members, evaluating the different data sources: which is the better one? And there is never a straight answer to that, even for one team and for one particular country. So we tailored this based on the different needs
24:23
and different requirements, in some cases in discussion with other colleagues from the respective professional field. Great. The second question is, how much time does it typically take to update an existing data layer or add a new data layer for the end-to-end processing?
24:44
This was actually the trigger for the application. The approximate time for compiling data for one country was two weeks. And we managed to narrow that down to approximately two hours of work. And adding more and more data layers
25:02
of course brings longer and more sophisticated processing. But usually this is done in a matter of hours. Super. And the last question, what underlying tools do you use for the zonal statistics on the hex grid?
25:23
Yep, I'll pass this to Giuseppe as he is the statistics guru because we had quite a lot of discussions and challenges. Giuseppe, over to you. Yes, actually there were several options for the zonal statistics, but in the end to make it easier for our users,
25:41
we tried to make a very straightforward QGIS workflow, so the zonal statistics that are used are the QGIS zonal statistics in the current version. Great. Well, thank you guys very much. Very interesting and important initiative.
26:00
You have the team's email there if you want to try and get that documentation or engage with the team directly. So thank you very much. I'm going to start switching rooms, saying goodbye to Ljubo and Giuseppe
26:21
and getting Timur up and running. Thank you very much again. Thank you, Michael. Goodbye.