Building applications with FOSS4G bricks: two examples of the use of GRASS GIS modules as a high-level "language" for the analyses of continuous space data in economic geography
Formal Metadata
Title 
Building applications with FOSS4G bricks: two examples of the use of GRASS GIS modules as a high-level "language" for the analyses of continuous space data in economic geography

Title of Series  
Part Number 
89

Number of Parts 
193

Author 

License 
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 

Release Date 
2016

Language 
English

Content Metadata
Subject Area  
Abstract 
In a world where researchers are more and more confronted with large sets of microdata, new algorithms are constantly being developed that have to be translated into usable programs. Modular GIS toolkits such as GRASS GIS offer a middle way between low-level programming approaches and GUI-based desktop GIS. The modules can be seen as elements of a programming language, which makes the implementation of algorithms for spatial analysis very easy for researchers. Using two examples of algorithms in economic geography, for estimating regional exports and for determining raster-object neighborhood matrices, this paper shows how just a few module calls can replace more complicated low-level programs, as long as the researcher can change perspective from a pixel-by-pixel view to a map view of the problem at hand. Combining GRASS GIS with Python as general glue between modules also offers options for easy multiprocessing, as well as supporting the increasingly loud call for open research, including open source computing tools in research.

00:08
OK, let's get started. This is the last
00:13
session of today for this room, and there are going to be the birds-of-a-feather sessions afterwards, which are self-steered discussions. This is the last talk we have today, so thank you. I'm maybe going to go down a bit in terms of level of sophistication, in terms of the development part of what I'm presenting, but I'm going to talk about the use of modular GIS toolkits such as GRASS
00:42
in a field which is not talked about much here: economic geography.
00:47
What I'm saying is also more generic than just
00:54
GRASS, but I hope all of you know it:
00:56
it has been called the primordial soup of GIS. I think it was the first, but it's still very much alive and kicking, with new parts going in and new lines of code being created every day. In economic geography we have the same problems
01:14
as in other fields: a move to bigger datasets, especially because there is more and more of a move to microdata, meaning that instead of using aggregated datasets we use more individual data — for example individual firms. The other thing, which might sound surprising when we talk about economic geography, is that there is actually, in a certain way, a beginning of integrating real space, because we are using these microdatasets, where very often before it was model space: geography did not play such an important role. In the research community that I work with, I see two types of people
01:55
dealing with these issues. You have the programmers — people who might actually come to a FOSS4G conference — who know how to use different tools and can write their own programs. And then you have the others, the GUI users or, let's say, advanced spreadsheet users, who don't want to touch any programming but have enormous difficulty getting things done, because doing things only through GUIs has often been difficult. What I'm trying to argue here is that a lot of FOSS4G tools offer the opportunity to go a middle way: you build applications, but using the modules of these different toolsets — and here is a non-exhaustive list of such tools in the FOSS4G world. So the idea is that the modules of these tools are
02:54
elements of a programming language and allow very rapid construction of programs. Generally you will need some general-purpose programming language as glue, but that depends on how complicated your application is. The important part is that you can concentrate more on the high-level logic of what you want to do as a researcher than on dealing with all the nitty-gritty issues of programming. A very powerful aspect
03:23
of modular tools such as GRASS is that each individual tool is normally a program on its own, meaning that you have a very robust system in which one module can fail without the whole system failing. The other thing is that, as we've just seen with examples of distributed computing, having this very modular structure also makes it very easy to create parallel processing setups. And then, obviously — in the scientific world this is not as obvious — the fact of using open source allows adaptation, but also sharing and peer review of the code, something which is becoming more and more important in science; we heard the call for open science in the keynote this afternoon. GRASS GIS is over
04:25
30 years old, which means it has a lot of accumulated experience, and two elements are particularly useful for what I'm talking about here. One is the fact that it deals with all the projections you want, through the integration of the PROJ routines but also some internal ones. That is very important in some parts of economic geography when dealing with distances: I believe some of the analyses one sees, of people building theories around distance, are done without much reflection on the projection system being used. The other is its long development: there has been progressive development of very rapid routines, especially for dealing with massive data, so there are a lot of elements that are really fast in GRASS, and thus the possibility of really dealing with very large datasets. The glue I am using here is Python; I won't go much more
05:36
into that, but GRASS provides two different Python APIs: there is grass.script, which is very simple if you want to use it, and there is the more elaborate PyGRASS, which allows you to go much deeper into the actual elements of GRASS. There is also a very nice graphical modeler, under constant development, which actually
05:57
allows you to develop the workflow graphically and then spit out the script in Python form, so that makes it very helpful for non-programmers to actually build applications. Then there is the question of large datasets and parallel
06:16
processing. Often these analyses do not have a lot of looping, because you do a lot of repetitive analysis of different sectors or places or whatever, and so what I've been using is the Python multiprocessing module to distribute the workload across multiple cores. We already discussed the problem of not overloading each node in terms of the RAM you use and the number of open files you have on each node, but it does allow you to very quickly develop multiprocessing setups.
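The pattern described above can be sketched as follows — the sector names and the workload function are hypothetical stand-ins for real per-sector GRASS analyses. Since every GRASS module call is a separate OS process anyway, even a thread pool dispatching module calls parallelizes real work; for CPU-bound pure-Python work you would use `multiprocessing.Pool` instead.

```python
from multiprocessing.pool import ThreadPool

def analyse_sector(sector):
    # Stand-in for one independent analysis run; in the real workflow
    # this would launch GRASS modules for the given sector.
    return sector, sum(i * i for i in range(10_000))

sectors = ["green", "blue", "orange"]  # hypothetical sector names
with ThreadPool(processes=2) as pool:  # cap concurrency per node
    results = dict(pool.map(analyse_sector, sectors))
print(sorted(results))  # ['blue', 'green', 'orange']
```

Capping `processes` is what keeps each node from being overloaded in RAM and open files, as mentioned above.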
07:08
What I'm going to present are two examples of applications. I'm not going to go deeply into the code, because I don't think that's the point here, but the code is available if you want to see it. One example is research I am working on right now, an estimation technique trying to estimate how much of the production of a given region is actually sold outside that region and how much is sold inside the region — something for which we don't have any information, so it is always estimated. It works on individual firms: for example, for France I am working on 1.2 million firms, all geolocated, with some information on each. The second example is a tool which allows you to calculate neighborhood matrices from raster objects. If you go to the first example, the basic
08:02
hypothesis we have for estimating the share of production in a region that is exported outside that region is the following: the more an economic sector is concentrated, the more the firms that produce in that sector will export somewhere; if the sector is distributed just like the population, there is not much need for exports, and generally the estimation is that production is consumed locally. The classic approach is the location quotient. What you do is you take your regional
08:37
production: you look at the share of the sector in that region compared to the sector at the national level, and you look at the share of general employment in that region compared to general employment at the national level, and you compare one with the other. If there is more concentration of that sector in the region than of the population, then you postulate there are exports. That's the way it works. The big problem is that it always works on aggregated data, so you have the famous modifiable areal unit problem that comes up immediately. If you look here, you have two regions — because it was decided that space was divided up this way — and you have a completely equal distribution of production (the little circles), so there is no spatial concentration and no exports. But it's enough to divide up the space a bit differently and suddenly production is concentrated in two regions and you have exports. So the results of these analyses are very much dependent on the spatial delineation of the regions.
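For reference, a minimal sketch of the classic location-quotient logic — the export-share step (1 − 1/LQ when LQ > 1) is one common variant, not necessarily the exact indicator used in the talk:

```python
def location_quotient(sector_emp_region, total_emp_region,
                      sector_emp_nation, total_emp_nation):
    # Regional share of the sector relative to its national share.
    regional_share = sector_emp_region / total_emp_region
    national_share = sector_emp_nation / total_emp_nation
    return regional_share / national_share

def export_share(lq):
    # Surplus production is postulated to be exported when LQ > 1.
    return max(0.0, 1.0 - 1.0 / lq)

lq = location_quotient(300, 1000, 1500, 10000)  # 0.30 / 0.15 = 2.0
print(lq, export_share(lq))  # 2.0 0.5
```

The modifiable areal unit problem criticized above enters through the region used to count `sector_emp_region` and `total_emp_region`: redraw the region and both shares change.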
09:57
So the idea is to say that this doesn't work, because it doesn't take distance into account: it doesn't make any difference where exactly things are located, the results
10:04
are going to be the same in terms of the regions. At the same time, recent research shows that distance is extremely important in terms of
10:13
where things are sold. There has been work in the US which shows that sales within the same zip code area — about four to five miles across — are three times higher than sales outside it, and that almost all sales happen within a very short radius. So shipment from companies to other companies, more than to end consumers, is very much something local, even if we live in a globalized world. There is a real need to take into account the actual localization. The idea here is to use the Huff model for
10:45
this: that's a model that was created for retail share estimations, and the idea is to use that model to estimate the exports for each firm based on real distances instead of aggregated data. I'll go through this quite quickly, but what it does is estimate a probability, for each point in space, that people at that point in space will consume something from a given firm; that probability depends on the attractiveness of the firm and on the distance between the given population and the firm. You can then estimate for each firm the total population that this firm caters to, and if you compare this population the firm caters to
11:36
with the resident population, you can then estimate, at whatever aggregation level you want, the share of exports. That's the general idea.
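A toy sketch of that probability step — the attractiveness values, distances and the distance-decay exponent `beta` are all hypothetical; the formulation (attractiveness divided by distance to the power beta, normalised over firms) is the standard textbook Huff form, not necessarily the exact variant used in the talk:

```python
def huff_probabilities(attractiveness, distances, beta=2.0):
    # P(point buys from firm j) = (A_j / d_j**beta) / sum over all firms
    scores = [a / d ** beta for a, d in zip(attractiveness, distances)]
    total = sum(scores)
    return [s / total for s in scores]

# One population point with 200 inhabitants, three hypothetical firms:
p = huff_probabilities([10, 10, 5], [1.0, 2.0, 1.0])
captured = [pi * 200 for pi in p]  # population this point contributes to each firm
print([round(x, 3) for x in p])  # [0.571, 0.143, 0.286]
```

Summing `captured` over all population points gives the total population a firm caters to, which is then compared with the resident population as described above.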
11:49
To make it a bit more graphical: here you have a theoretical scenario with spatial units, you have the population, and these are the firms. You have three sectors — the green, the blue and the orange ones — more or less concentrated spatially. If you do estimations of how much of the production in each of these regions will be exported outside the region with the location quotient, you get something like this: all these regions are
12:12
strictly identical, because distance doesn't play any role — distance to the main cities, where we have more population, doesn't play any role whatsoever. With this new indicator you have a somewhat differentiated picture, and that allows you to take
12:27
that distance into account. Now, what is the added value of these tools? The
12:33
idea is that you have this algorithm that you need to run to estimate the model, so you have some heavy computation going on there. If you work with the classical tools a lot of economists use, you will have a matrix representation of space and you will run through all the pixels of that matrix. The point of the GIS tools discussed here is that you can do that much more easily:
13:06
the intensive calculations have already been thought through within these GIS tools. So, just as an example, what you have here is the first part of the algorithm, where
13:18
you have the firm and each pixel with its population, and for each pixel you have to calculate the distance of the pixel to the firm and then calculate the weight of the firm for that population. That is done in two module calls: one module calculates the distances and the raster map calculator does the weighting for you. At one point you have to aggregate all these scores over all the firms and then calculate the ratio of one firm's score to the total; again, one tool call does it for you. And this is the last part: you then calculate each pixel's
14:12
consuming population, and again just two calls to the GRASS tools do the work for you. These tools have been highly optimized in terms of calculation speed, so it makes it really easy to construct these algorithms and they are fast. My second example is the construction of neighborhood
14:36
matrices. This is something which is quite important in general geographic analysis and in economic geography, for example for calculations such as spatial autocorrelation, meaning you need neighborhood relationships. Generally, a lot of tools allow you to do that quite easily with vector data, but when you have raster data — where objects are defined by adjacent pixels that have the same pixel value, something that can for example come out of image segmentation — you normally go to the vector version of that same map, do the calculation of the neighborhood matrices there, and then go back. That is highly inefficient, especially when you have hundreds of thousands of objects: just the vectorization phase is often quite long. So the idea is to do this strictly in raster to make it much faster. The idea is to say: for
15:36
each pixel, you check the neighboring pixels in the four directions and you look: does it have the same value, yes or no? If yes, then it is not a neighboring object; otherwise it is. And again, in GRASS, you
15:58
have one call which takes care of the entire matrix: it checks for each pixel whether it has a neighbor or not, you get the statistics out, and you just have to run one line in Python to uniquify the list — because you might have an object which touches another across 10 or 15 or 20 or 100 pixels, so you will have duplicates — and the job is done.
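The underlying logic — sketched naively in pure Python on a small label grid, where in practice GRASS does the heavy lifting on real rasters — is just this neighbour check plus uniquification:

```python
def neighbour_pairs(raster):
    # Unique pairs of adjacent object ids in a 2D label raster;
    # checking only right and down covers every 4-adjacency once.
    pairs = set()
    rows, cols = len(raster), len(raster[0])
    for i in range(rows):
        for j in range(cols):
            v = raster[i][j]
            for ni, nj in ((i, j + 1), (i + 1, j)):
                if ni < rows and nj < cols and raster[ni][nj] != v:
                    pairs.add(tuple(sorted((v, raster[ni][nj]))))
    return sorted(pairs)

raster = [[1, 1, 2],
          [1, 3, 2],
          [3, 3, 2]]
print(neighbour_pairs(raster))  # [(1, 2), (1, 3), (2, 3)]
```

The `set` plays the role of the one-line uniquification mentioned above: an object touching another across many pixels still contributes the pair only once.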
16:29
Here, for example, is a map of soils, and you can see the neighborhood matrix that comes out of it: which soil type is next to which. It's an example from outside economic geography, but obviously it's applicable everywhere. So, what do we take home from this? The idea is really that
16:49
modular GIS tools should really be seen as programming languages which are at a high level and allow you to do a lot of things. The main difficulty that I've seen with people trying to do this is to change the paradigm from "I have to deal with every pixel" to "no, I deal with the entire map at once, and the program deals with the pixels". GRASS has excellent memory handling, which allows you to
17:20
really work with these big files and not overflow your memory. And the fact that you can use these modules to very easily build separate loops — for example, for each economic sector you do it separately — lets you parallelize processing very easily across many nodes and get things done very quickly. All this is not always as fast as it could be if you worked directly in low-level code and multithreaded everything, but as we just discussed as well, sometimes the time of coding is more precious than the time of running, and that's something that has to be balanced. And again, the last call for open science: using open source tools obviously allows the peer review that you cannot do when people tell you "we did this, we did our magic, and here is the result" and you have no way of looking at how they did it — or even if they give you code, it's code for a program that you don't have, so you are not able to reproduce it. So thank you, and a little
18:32
advertising for events coming up in Brussels, with FOSDEM, State of the Map and other things happening. But before that, there is some time to take questions. One is about leaving:
19:00
do we leave at 6:30 or 7? I'd go for 6:30 if possible. [Question] I come from the artificial intelligence research field, where people work with things like images and videos, processing a lot of data in seconds. So I'm asking: in this research field, has anyone tried to process data using GPUs, graphical processing units? I don't know if any GIS packages use GPUs, but it's obviously something that could be interesting to look at. [Answer] In GRASS there were a few Google Summer of Code projects in the past where some specific functions, like energy balance, were implemented with GPU support. The problem is always that you have to maintain the GPU code alongside the non-GPU code, and at some point you run into the trouble of having two code bases which no longer correspond after a while, so it's not easy. I think if you go low-level — for example the number crunching which is done in numerical functionality like matrix processing — you'd better invest there in GPU support rather than doing it at a higher level; that, I think, is the way to go. [Comment] When I was doing research I was using Python with libraries like Caffe — many of you may have heard of it — where you create computation graphs that are then compiled to run on the GPU if you want. [Comment] Just one small comment: for image processing we often use libraries such as OpenCV, and OTB uses modules which use the GPU, so we might actually already be doing it even if we don't realize it; I wonder whether ITK, which is behind OTB, doesn't have GPU support — it seems
likely. OK, do we have more questions? Last chance. Then thank you, Moritz. Coming up are our birds-of-a-feather sessions at 7: there is one, I think, on vector tiles, there is one in the other room on routing, and there is a programme on the welcome desk where you can see which sessions are going on. The further sessions are discussions in groups, non-directed. Thank you.