Building applications with FOSS4G bricks: two examples of the use of GRASS GIS modules as a high-level "language" for the analyses of continuous space data in economic geography

Video in TIB AV-Portal: Building applications with FOSS4G bricks: two examples of the use of GRASS GIS modules as a high-level "language" for the analyses of continuous space data in economic geography

Formal Metadata

Building applications with FOSS4G bricks: two examples of the use of GRASS GIS modules as a high-level "language" for the analyses of continuous space data in economic geography
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Content Metadata

In a world where researchers are more and more confronted with large sets of micro-data, new algorithms are constantly being developed that have to be translated into usable programs. Modular GIS toolkits such as GRASS GIS offer a middle way between low-level programming approaches and GUI-based desktop GIS. The modules can be seen as elements of a programming language, which makes the implementation of algorithms for spatial analysis very easy for researchers. Using two examples of algorithms in economic geography, one for estimating regional exports and one for determining raster-object neighborhood matrices, this paper shows how just a few module calls can replace more complicated low-level programs, as long as the researcher can change perspective from a pixel-by-pixel view to a map view of the problem at hand. Combining GRASS GIS with Python as general glue between modules also offers options for easy multi-processing, and supports the increasingly loud call for open research, including open source computing tools in research.
Let's get started. This is the last session of today for this room, and there are going to be the birds-of-a-feather sessions afterward, which are self-directed discussions. This is the last talk we have today. Moritz?
Thank you. I am going to go down a bit in level of sophistication in terms of the development side of what I am presenting. I am going to talk about the use of modular GIS tools such as GRASS in a field which is not talked about much here, which is economic geography.
What I am saying is also more generic than just GRASS. But GRASS, as I hope all of you know, was once called the primordial soup of GIS. It was one of the first, I think, but it is still very much alive and kicking, with code still going in and new lines being created every day. In economic geography we have the same problems
as in other fields: a move to bigger datasets, especially because there is more and more of a move to micro-data, meaning that instead of using aggregated datasets we use more individual data, for example on individual firms. The other thing, which might sound surprising when we talk about economic geography, is that in a certain way we are only beginning to integrate real space, because we are now using these micro-datasets, whereas very often before it was model-space work in which actual geography did not play such an important role.
In the research community that I work with, I see two types of people dealing with these issues. You have the programmers, people who might actually come to a FOSS4G conference, who know how to use different tools and can write their own programs. And then you have the others, who combine GUI users and, let's say, advanced Excel spreadsheet users, who do not want to touch any programming but have enormous difficulty getting things done, because with only those tools bringing things together is often difficult.
What I am trying to argue here is that a lot of FOSS4G tools offer the opportunity to go a middle way, where you build applications by using the modules of these different toolsets. There is a non-exhaustive list here of such tools in the FOSS4G world. So the idea is that the modules of these tools are elements of a programming language and allow very rapid construction of programs. In general you will need some general-purpose programming language as glue, but that depends on how complicated your application is. The important part is that you can concentrate more on the high-level logic of what you want to do as a researcher than on actually dealing with all the detailed issues of programming.
One very powerful aspect of these modular tools such as GRASS is that each individual tool is normally a program on its own, meaning that you have a very robust system: if one tool fails, it is not the whole system that fails. The other thing is that, as we have just seen in the examples on distributed computing, having this very modular structure also makes it very easy to create parallel processing algorithms. And then, obvious here but not as obvious in the scientific world, using open source allows adaptation, but also sharing and peer review of the code, something which is becoming more and more important in science. We already heard the call for open science in the keynote this afternoon. So GRASS GIS is over
thirty years old, which means it has a lot of accumulated experience. Two elements are particularly useful for what I am talking about here. One is the fact that it actually deals with all the projections you might want, obviously through the integration of PROJ.4 routines but also some internal ones. That is very important in some areas of economic geography where you work with distances: in some of the analyses I have seen, people build theories around distances that do not mean very much, without any reflection on the projection system being used. The other is the progressive development of very fast routines, especially for dealing with massive data. There are a lot of elements that are really fast in GRASS, so it is possible to really deal with very large datasets.
The glue I am using here is Python. I won't go much more into that, but GRASS provides two different Python APIs. There is grass.script, which is very simple when you just want to use the modules, and there is the more low-level PyGRASS, which allows you to go much deeper into the actual elements of GRASS. There is also a very nice graphical modeler. It is under constant development, but it actually allows you to develop the code graphically and then spits out a script in Python form, which makes it very helpful for non-programmers as well to actually build an application.
Then there is the question of large datasets and parallel processing. Often in these analyses you have a lot of looping, because you do a lot of repetitive analysis of different sectors, places, whatever. So what I have been using is simply the Python multiprocessing module to distribute the workload across multiple cores. It has its own overhead, as we already discussed, and there is also the problem of not overloading each node in terms of the RAM you use or the number of open files allowed on your node setup, but it does allow you to develop parallel processing very quickly. So what I am going to present are two examples of applications.
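The per-sector loop distributed over cores, as described above, might look roughly like the following sketch. It uses only the standard-library `multiprocessing` module; `analyse_sector`, `run_all` and the toy job list are hypothetical stand-ins, and in a real application the worker would presumably launch the GRASS module calls for one sector rather than sum a list.

```python
from multiprocessing import Pool


def analyse_sector(job):
    """Hypothetical stand-in for the analysis of one economic sector.

    A real worker would run the module calls for that sector; here it
    only sums the firm sizes so that the sketch stays self-contained.
    """
    sector, firm_sizes = job
    return sector, sum(firm_sizes)


def run_all(jobs, processes=2):
    """Run one job per sector, spread over a fixed number of worker processes.

    Capping `processes` is one simple way to avoid overloading a node
    in terms of RAM or open files.
    """
    with Pool(processes=processes) as pool:
        return dict(pool.map(analyse_sector, jobs))


if __name__ == "__main__":
    # Toy data: (sector name, list of firm sizes). Purely illustrative.
    jobs = [("food", [12, 3, 40]), ("textiles", [7, 7]), ("chemicals", [100])]
    print(run_all(jobs))
```

Because each sector is independent, the only coordination needed is collecting the results, which is why this kind of loop parallelizes so easily.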
I am not going to go really deeply into the code, because I do not think that is needed here, but the code is available if you want to see it. The first example comes directly from an analysis I am working on right now: an estimation technique that tries to estimate how much of the production of a given region is actually sold outside that region and how much is sold within the region. This is something for which we do not have any information, so it is generally all estimated. The algorithm works on individual firms; for France, for example, I am working on 1.2 million firms, all geolocated and with some attribute information. The second example is a tool which allows you to calculate neighborhood matrices from raster objects.
If we go to the first example, the basic hypothesis that we have for estimating the share of production in a region that is exported outside the region is that the more an economic sector is concentrated, the more the firms that produce in that sector will export somewhere. If the sector is distributed just like the population, there is not much need for exports, and the assumption is that production is generally consumed locally.
The classic approach is the location quotient. You take your regional production: you look at the share of the sector in that region compared to the sector at the national level, and you look at the share of general employment in that region compared to general employment at the national level, and you subtract one from the other. If that sector is more concentrated in the region than the population is, then you postulate that there are exports. That is the way it works.
The big problem is that it always works on aggregated data, so the famous modifiable areal unit problem comes up immediately. If you look here, you have two regions, because it was decided that space was cut up this way, and you have a completely equal distribution of production, the little circles, so there is no special concentration and there are no exports. But it is enough to cut up this space a bit differently and suddenly production is concentrated in two regions and you have exports. So the results of these analyses depend very much on the spatial delineation of the regions.
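The subtraction described above, and its sensitivity to how space is cut up, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `surplus_share`, the four-cell strip and the employment figures are invented, and the talk does not give the exact formula used in practice.

```python
def surplus_share(sector_emp, total_emp, region):
    """Region's share of national sector employment minus its share of
    national total employment, for the cells listed in `region`.

    A positive value is read as "this region exports in this sector".
    """
    sector_share = sum(sector_emp[i] for i in region) / sum(sector_emp)
    total_share = sum(total_emp[i] for i in region) / sum(total_emp)
    return sector_share - total_share


# Toy strip of four cells (illustrative numbers only).
sector_emp = [5, 0, 0, 5]      # employment in one sector, per cell
total_emp = [10, 10, 10, 10]   # total employment, per cell

# Two ways of cutting the same space into two regions.
partition_a = [[0, 1], [2, 3]]
partition_b = [[0], [1, 2, 3]]

surplus_a = [surplus_share(sector_emp, total_emp, r) for r in partition_a]
surplus_b = [surplus_share(sector_emp, total_emp, r) for r in partition_b]

print(surplus_a)  # [0.0, 0.0]
print(surplus_b)  # [0.25, -0.25]
```

With the first delineation no region shows any surplus; with the second, the same underlying data suddenly suggest that one region exports. That is the modifiable areal unit problem in miniature.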
The location quotient approach also ignores distance: it makes no difference whether a firm is here, here or here, the results are going to be the same for a given division of space. At the same time, recent research shows that distance is extremely important in terms of where things are sold. There has been work in the US which shows that sales within the same zip code area, about four miles in extent, are three times higher than sales outside it, and that almost all sales happen within a very short radius. So shipments from companies to other companies or to end consumers are still something very local, even if we live in a globalized world, and there is a real need to take the actual location into account.
The idea here is to use the Huff model, a model that was created for retail market share estimation, and to use it to estimate the exports of each firm based on real distances instead of aggregated data. I will go through this quite quickly. What it does is estimate, for each point in space, the probability that people at that point will consume something from a given firm. That probability depends on the attractiveness of the firm and on the distance between the given population and that firm. You can then estimate for each firm the total population that it caters to. If you compare this population that the firm caters to with the resident population, you can then estimate the share of exports at whatever level of aggregation you want. So that is the general idea.
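For reference, the textbook form of the Huff model can be written as below. The talk does not state which exact functional form or parameters were used, so the symbols here are assumptions: $A_j$ is the attractiveness of firm $j$, $d_{ij}$ the distance from point $i$ to firm $j$, $\beta$ a distance-decay exponent, and $\mathrm{pop}_i$ the population at point $i$.

```latex
% Probability that the population at point i consumes from firm j
P_{ij} = \frac{A_j \, d_{ij}^{-\beta}}{\sum_{k} A_k \, d_{ik}^{-\beta}}

% Total population that firm j caters to
C_j = \sum_{i} P_{ij} \, \mathrm{pop}_i
```

Since the probabilities at each point sum to one over all competing firms, the catered populations $C_j$ of one sector add up to the total population, which is what makes the comparison with the resident population of a region meaningful.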
As a graphical illustration, here you have a theoretical scenario with spatial units. This is the population and these are the firms, in three sectors, the green, the blue and the orange ones, which are more or less concentrated spatially. If you estimate how much of the production in each of these regions will be exported outside the region using the location quotient, you get something like this: all these regions are strictly identical, because distance does not play any role. Distance to the main cities, where we have more population, does not play any role whatsoever. With this new indicator you have a somewhat more differentiated picture, and that allows you to take that distance into account.
Now, the added value of GIS here is the following. You have this algorithm that you need to run to estimate the model, and there is some heavy computing going on there. If you work with the classical tools that a lot of economists use, you will have a matrix representing space and you will run through all the pixels of that matrix. With the GIS tools discussed here you can do that much more easily, and the intensive calculations have been thought through within these GIS tools.
Just as an example, take the first part of the algorithm, where you have the firm and each pixel with its population. For each pixel you have to calculate the distance of the pixel to the firm and then calculate the weight of the firm for that population. That is done in two calls to different modules: one module calculates the distance, and the basic map calculator does the rest for you. Then at one point you have to aggregate the scores of all the firms and afterwards calculate the ratio of one firm to its competitors; again, one tool does it for you. And in the last part you then calculate, pixel by pixel, the population that consumes from the firm and sum that population; again, just two calls to the GRASS tools do the work for you. These tools have been highly optimized in terms of calculation speed, so it makes it really easy to construct these algorithms.
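The map-level steps just described can be mimicked in plain Python to show the structure of the computation: one whole-map operation per step instead of hand-written bookkeeping per pixel. This is only a sketch under assumptions: the function names, the distance-decay exponent `beta`, the `min_dist` floor (used to avoid dividing by zero at the firm's own pixel) and the toy grid are all invented here. In GRASS, modules such as `r.grow.distance`, `r.mapcalc` and `r.series` could plausibly play these roles, but the talk does not name the modules actually used.

```python
import math


def distance_map(rows, cols, firm_cell):
    """Whole-map step 1: distance of every pixel to one firm."""
    fr, fc = firm_cell
    return [[math.hypot(r - fr, c - fc) for c in range(cols)]
            for r in range(rows)]


def weight_map(dist, attractiveness, beta=2.0, min_dist=0.5):
    """Whole-map step 2: weight of the firm at every pixel.

    min_dist is an assumed floor so the firm's own pixel does not
    produce a division by zero.
    """
    return [[attractiveness / max(d, min_dist) ** beta for d in row]
            for row in dist]


def sum_maps(maps):
    """Whole-map step 3: cell-wise sum over the weight maps of all firms."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*maps)]


def catered_population(population, firms, beta=2.0):
    """Steps 4 and 5: ratio of each firm to all competitors, multiplied
    by the population map and summed over the whole map."""
    rows, cols = len(population), len(population[0])
    weights = [weight_map(distance_map(rows, cols, cell), a, beta)
               for cell, a in firms]
    total = sum_maps(weights)
    return [sum(population[r][c] * w[r][c] / total[r][c]
                for r in range(rows) for c in range(cols))
            for w in weights]


# Toy example: uniform population, two equally attractive firms placed
# symmetrically on a 3 x 3 grid (illustrative values only).
population = [[10.0] * 3 for _ in range(3)]
firms = [((1, 0), 1.0), ((1, 2), 1.0)]
print(catered_population(population, firms))
```

In this symmetric toy case the two firms share the total population of 90 equally, up to floating-point rounding. The point of the sketch is the shape of the program: each function corresponds to one map-level operation, which is the change of perspective the talk argues for.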
My second example is the construction of neighborhood matrices. This is something which is quite important in general geographic analysis and in economic geography especially, for example for calculations such as spatial autocorrelation, where you need neighborhood relationships. Generally a lot of tools allow you to do that quite easily with vector data. But when you have raster data, where objects are defined by adjacent pixels that have the same pixel value, something that can come out of image segmentation for example, the usual way is to go to the vector version of that same map, calculate the neighborhood matrix there and then go back. That is highly inefficient, especially when you have hundreds of thousands of objects; just the vectorization phase is often quite slow. So the idea is to do this strictly in raster to make it much faster.
The idea is to say: for each pixel you check the neighboring pixel in each of the four directions, and you look at whether it has the same value, yes or no. If yes, then it is not a neighbor; otherwise it is. And again in GRASS, for each direction you have one call which takes care of the entire map: it just checks for each pixel whether it has a neighbor or not, and you get the statistics out of that. Then you just have to run one line in Python to uniquify the list, because an object might touch another across 10 or 15 or 20 or 100 pixels, so you will have duplicates, and the job is done.
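The four-direction check followed by de-duplication can be sketched in plain Python as follows. The function name `neighborhood_pairs` and the toy raster are assumptions for illustration; the explicit loops stand in for what, per the talk, is a single whole-map call per direction in GRASS.

```python
def neighborhood_pairs(raster):
    """Return the sorted, de-duplicated list of (object, neighbor) pairs.

    Objects are groups of adjacent pixels sharing a value. For each of
    the four directions, every pixel is compared with its neighbor in
    that direction; a differing value means the two objects touch.
    """
    rows, cols = len(raster), len(raster[0])
    pairs = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # One pass over the whole map per direction.
        for r in range(rows):
            for c in range(cols):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    if raster[nr][nc] != raster[r][c]:
                        pairs.append((raster[r][c], raster[nr][nc]))
    # Two objects usually touch along many pixels, so the raw list is
    # full of duplicates; a set removes them in one line.
    return sorted(set(pairs))


# Toy raster with three objects (values 1, 2 and 3).
raster = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]
print(neighborhood_pairs(raster))
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
```

The result lists each adjacency in both directions, which is the usual form for building a symmetric neighborhood matrix.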
Here, for example, is a map of soils, and you can see the neighborhood matrix that comes out of it: which soil type is next to which. It is an example from outside economic geography, but obviously it is applicable in all fields.
So what do we take home? The idea is really that modular GIS tools should be seen as programming languages which are high-level and allow you to do a lot of things. The main difficulty that I have seen with people trying to do that is to change paradigm, from "I have to deal with every pixel" to "no, I deal with the entire map at once and the program deals with the pixels". GRASS has excellent memory handling, which allows you to really work with these big files and not overflow your memory. And because you use these modules, you can very easily build separate loops, for example doing each economic sector separately, parallelize the process very easily across many nodes and have it done very quickly. All this is not always as fast as it could be if you worked directly in the code with multi-threading and so on, but as we just discussed, sometimes coding time is more precious than running time, so that is something that has to be balanced.
And lastly, again, the call for open science: using open source tools for this obviously allows peer review, which you cannot do when people tell you "we did this, we did our magic, and here is the result" and you have no way of looking at how they did it, or when they do give you code but it is code for a program that you do not have because it is proprietary, so you are not able to reproduce it.
So thank you, and a little advertising for what is coming up in Brussels, with [unclear], State of the Map, the HOT Summit and all these things happening. Before that, I am happy to take questions.
When is the bus leaving, at 6:30 or 7? OK, 6:30, so this should be fast.
I come from the artificial intelligence research field, and the people I have been working with process images and videos, a lot of data within seconds. So I am asking whether there is research on, or whether anyone has tried, processing this data using GPUs, graphical processing units.
I do not know if any GIS packages use GPUs right now, but it is obviously a field that could be interesting. I look at Markus; he probably knows the most about this.
So in GRASS there were a few Google Summer of Code projects in the past where some specific functions, like energy balance and things like that, have been implemented with GPU support. The problem is always that you have to maintain that code alongside the non-GPU code, and at some point you run into the trouble of having two source code bases which do not necessarily correspond anymore after a while. So it is not easy. I think if you go low-level, for example to the number crunching which is done in numerical functionality like matrix processing and the like, you had better invest there in GPU support rather than doing it at a higher level. This is, I think, the way to go.
Good. I ask because when I was doing research I was using Python with a library, [unclear], maybe many of you have heard of it, which lets you compose computation graphs that then get compiled to C or to GPU code if you want. So that is why I was asking.
Just one small comment: for image processing we often use libraries such as OpenCV, and there are other libraries with modules which use the GPU, so we might actually already be doing it even if we do not realize it. And ITK, which is kind of what OTB, the Orfeo ToolBox, builds on, I wonder if they do not have GPU support; it seems likely.
OK, do we have more questions? Last chance. Then thank you, Moritz. Coming up are the birds-of-a-feather sessions. In this room I think there is one on vector tiles, there is one in the Tunnel about routing, and there is a program at the welcome desk where you can see which sessions are going on. The birds-of-a-feather sessions are non-directed group discussions. Thank you.