A New GIS Toolbox For Integrating Massive Heterogeneous GIS Data For Land Use Change Analysis
Formal Metadata
Title 
A New GIS Toolbox For Integrating Massive Heterogeneous GIS Data For Land Use Change Analysis

Title of Series  
Author 

License 
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
DOI  
Publisher 
FOSS4G, Open Source Geospatial Foundation (OSGeo)

Release Date 
2013

Language 
English

Production Place 
Nottingham

Content Metadata
Subject Area  
Abstract 
Agricultural land use in Germany and related impacts on the environment and the use of natural resources are key research topics at the Thünen Institute of Rural Studies. As spatial context is essential for the analysis of causal connections, GIS data covering all necessary information was gathered during different research projects and prepared for processing in a database. In particular, data from the Integrated Administration and Control System, which was available for certain project purposes for several federal states (Länder) and years, serves as a very detailed data source for agricultural land use. We use different Open Source GIS software like PostgreSQL/PostGIS, GRASS and Quantum GIS for geoprocessing, supplemented with the proprietary ESRI product ArcGIS. After introducing the input data used and the general processing approach, this paper presents a selection of geoprocessing routines for which Open Source GIS software was used. As an exemplary 'use case' for the conclusions from the subsequent statistical analysis, we summarize impacts of increased biogas production on agricultural land use change, highlighting the trend in biogas maize cultivation and the conversion of permanent grassland to agricultural cropland.

00:00
[Inaudible] I think this isn't working. Hello, we want to present our paper, "A New GIS Toolbox for Integrating Massive Heterogeneous GIS Data for Land Use Change Analysis". First of all, I want
00:26
to give you a short overview. I will present the scientific background, the objectives and the input data used, and I will show the conceptual approach of the GIS toolbox with its general processing workflow, the software products used and the core requirements for vector processing. After that, my colleague will present the results of the GIS toolbox: the vector preprocessing, the spatial overlay, the raster data processing and the additional data. In the end she will draw a conclusion on the toolbox and show some pros and cons of the software products used. OK. Both of us are
01:23
working at the Thünen Institute of Rural Studies in Germany. Our research area is the use of resources and environmental and nature protection. In particular, we analyze land use change in the agricultural areas of Germany, focusing on environmental impacts and the effects of legal regulations. Here you can see some project examples; the first one is a current project. The GIS toolbox we present can be seen as the result of all of these projects. Our research objective was to develop a GIS toolbox for handling the processing and analysis of massive heterogeneous GIS data for statistical analyses. The toolbox therefore subsumes all necessary processing steps for data preparation and for the integration of all input data sets into one combined data set. There are some conceptual considerations. First, we decided to choose the vector polygon approach as the main data basis. Then we defined the necessary processing steps. Finally, there are prevailing requirements when it comes to software selection: for example, the software has to fit into our existing infrastructure, the efficiency of the software product has to be considered, and interfaces have to be minimized. In that context it was very important to integrate open source software. On this slide
03:34
I want to show the input data used. As you can see, the data comes in different formats. Most data is in vector format, but there is also raster format, or data in Excel sheets. The extent of the data can be nationwide, or the data is only available for some of the federal states, and there are different sources due to the federal system. As you can see, we have data sets for the analysis of land use, natural environment, special conditions and other data. IACS, the Integrated Administration and Control System, is a very detailed data set at field level in vector format. It combines GIS data on land use with land-use-related information, but it is only available for areas for which subsidies have been paid. In addition, we have the digital basic landscape model for Germany.
04:48
On this slide you can see the processing workflow and the software products used. Because of the different data models, the processing has two parts: the vector processing and the raster data processing. At the moment the data is stored in a PostgreSQL database with the PostGIS extension. Because most of the data is available as vector data, we use a spatial overlay of all vector data as the main data basis. The raster data processing is done in GRASS. After all this processing is finished, we link the vector data to the raster data and the additional data.
05:51
The additional data,
05:56
like the locations of biogas plants in Germany, is integrated into the database after georeferencing. The results of all data processing are what we use for our statistical analyses. As already mentioned, we use
06:18
a spatial overlay of all vector data as the main data basis, and the result, with area sums calculated for all geometries, is used for all statistical analyses of land use change. The data therefore has to meet several requirements: the data sets must be complete within the system boundaries, that is, the boundaries of Germany or of the federal states, depending on the data set; there should be no larger overlapping areas; and the geometries should be valid. The transformation of multi-polygons into single polygons is also integrated in our data preparation. OK, now [inaudible]. Thank you. Now I will go right
07:36
on to the results of our GIS toolbox. The first slide is about the vector processing, and the preprocessing in particular. The first step is the plausibility check: we look at the data and its structure, a rough visual examination of the data for a general understanding. Then we have to reproject the geometries to homogenize the coordinate system and repair all invalid geometries. Then we check for overlaps in our input data, and we do this with a self-intersection. Depending on the data, this step, together with the elimination of the overlaps, can be very time-consuming. At the end we convert all geometries into single polygons and create a unique ID for identification after the overlay process.
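The last preprocessing step mentioned here, splitting multi-part geometries into single polygons and giving every part a unique ID, could be sketched in plain Python as follows. This is only an illustration under assumptions: the feature layout (dicts with `source_id` and `parts`) and the function name `explode_to_single_polygons` are hypothetical, and the talk itself does this step in the spatial database, not in Python.

```python
from itertools import count


def explode_to_single_polygons(features, start_id=1):
    """Split multi-part features into single-part records with unique IDs.

    `features` is a list of dicts with a `source_id` and a list of `parts`,
    each part being a list of (x, y) vertices (hypothetical layout).
    """
    next_id = count(start_id)
    singles = []
    for feature in features:
        for ring in feature["parts"]:
            singles.append({
                "uid": next(next_id),               # unique ID for use after the overlay
                "source_id": feature["source_id"],  # link back to the input feature
                "ring": ring,
            })
    return singles


# Made-up example data: feature A has two parts, feature B has one.
fields = [
    {"source_id": "A", "parts": [[(0, 0), (1, 0), (1, 1)], [(5, 5), (6, 5), (6, 6)]]},
    {"source_id": "B", "parts": [[(2, 2), (3, 2), (3, 3)]]},
]
singles = explode_to_single_polygons(fields)
print([(s["uid"], s["source_id"]) for s in singles])
```

Keeping the `source_id` next to the new `uid` is what allows results to be traced back to the input data after the overlay.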
08:50
This slide is about our routine regarding the overlapping geometries. You see some examples of overlaps occurring in our data sets. We have several criteria for classifying the overlaps and, based on that, for picking the instruments for elimination. One is the shape of the intersection area, which can be a parameter for the identification of [sliver] polygons. The next is the proportion of the intersecting area in relation to the original polygon area, which enables us to identify duplicates or near-duplicates. And we have attribute data, which are relevant for understanding whether overlaps are intended, as in the case of [inaudible] structures or protected areas, or whether they contain similar content and can be aggregated. As instruments for the elimination we have delete, for the duplicates; clip, for [inaudible] polygons; and union of the geometries, for the aggregation case. In that case we use an allocation table to relate the new geometries to the old ones.
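The two geometric criteria named here (shape of the intersection area, and its share of the original polygon area) can be illustrated with a small, self-contained Python sketch. The thresholds, the class labels and the use of the Polsby-Popper compactness measure as the "shape" criterion are assumptions made for this example; the talk does not state which measure or cut-off values were actually used.

```python
import math


def ring_area(ring):
    """Area of a simple polygon ring (list of (x, y)) via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Length of the closed boundary of the ring."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:] + ring[:1]))


def classify_overlap(intersection, original, dup_share=0.95, min_compactness=0.1):
    """Classify an overlap from its area share and shape (hypothetical thresholds)."""
    share = ring_area(intersection) / ring_area(original)
    # Polsby-Popper compactness: 1.0 for a circle, close to 0 for thin slivers.
    compactness = (4 * math.pi * ring_area(intersection)
                   / ring_perimeter(intersection) ** 2)
    if share >= dup_share:
        return "duplicate"          # overlap covers (almost) the whole polygon
    if compactness < min_compactness:
        return "sliver"             # long, thin overlap along a shared boundary
    return "check attributes"       # decide from attribute data: intended or aggregate


parcel = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(classify_overlap(parcel, parcel))
print(classify_overlap([(0, 0), (10, 0), (10, 0.01), (0, 0.01)], parcel))
print(classify_overlap([(0, 0), (5, 0), (5, 5), (0, 5)], parcel))
```

The third class is deliberately left undecided, because the talk describes that decision as depending on attribute data rather than on geometry.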
10:23
This slide is about our core process, the vector overlay operation. In this example there are two data sets, and what we want to do is integrate them into one combined result in such a way that all outlines of the polygons are preserved and the resulting geometries keep the information from the source polygons. We do this in ArcGIS with the overlay Union function. In our case we control this with a Python script, because we have to cut our data into small chunks due to the mass of input data, and we have to adjust our Python script for every federal state because the input data differs for each one. In the end we import the results into our PostgreSQL database, append all parts together and do an area calculation. We then export this result into text files for the statistical analyses.
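The control flow described above (cut the data into chunks, run the overlay per chunk, append the parts, then calculate areas) can be sketched as follows. This is a minimal sketch, not the script from the talk: `overlay_stub` is a hypothetical stand-in that passes features through unchanged where the real workflow would call the overlay tool, and the chunk size is arbitrary.

```python
def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_chunks(features, size, overlay):
    """Run `overlay` on each chunk and append all partial results."""
    combined = []
    for part in chunks(features, size):
        combined.extend(overlay(part))
    return combined


def ring_area(ring):
    """Area of a simple polygon ring via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def overlay_stub(part):
    """Stand-in for the real overlay tool: returns the features unchanged."""
    return list(part)


# Five unit squares side by side as made-up input features.
squares = [[(i, 0), (i + 1, 0), (i + 1, 1), (i, 1)] for i in range(5)]
result = process_in_chunks(squares, size=2, overlay=overlay_stub)
total_area = sum(ring_area(ring) for ring in result)
print(len(result), total_area)
```

Comparing the total area before and after the overlay is one simple consistency check for a chunked run, since a union overlay should neither lose nor duplicate area.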
11:40
Here is the part on the raster data processing, together with the vector overlay result. Since we used PostGIS 1.5, there wasn't any raster processing available, and therefore we decided to use GRASS. So we have to import all our data into GRASS and calculate the slope and exposition maps from the elevation model. We tried a direct overlay of our vector data with the raster data, but that wasn't possible because of the mass of data, so we decided to use a point raster as the basis for the integration. We create a point raster with 100 metre resolution. In PostGIS we intersect it with our vector overlay result and query all polygon information, and in GRASS we query the raster values. Then we can join the data based on the point IDs. This is fine for most of the analysis questions, but we also have the IACS data, which have a very high resolution, and together with the elevation model, with a resolution of about 25 metres, we wanted a more precise approach. Therefore we decided to use an iterative overlay operation. Because we could reduce the vector data to the IACS vector data only, we were able to compute univariate statistics in GRASS and then join the statistics to the vector overlay result based on the IDs of the IACS geometries.
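The point raster idea, where the same set of points is queried once against the polygons and once against the raster and the two results are joined on the point ID, can be shown in a few lines of plain Python. Everything here is a simplified stand-in under assumptions: the talk performs the polygon query in PostGIS and the raster query in GRASS, whereas this sketch uses a ray-casting test and a nested list, and all names and values are made up.

```python
def point_in_ring(x, y, ring):
    """Ray-casting point-in-polygon test for a simple ring of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def make_point_grid(xmin, ymin, nx, ny, spacing=100.0):
    """Map point IDs to the cell-centre coordinates of a regular grid."""
    points = {}
    pid = 0
    for row in range(ny):
        for col in range(nx):
            points[pid] = (xmin + (col + 0.5) * spacing,
                           ymin + (row + 0.5) * spacing)
            pid += 1
    return points


def sample_raster(raster, xmin, ymin, cell, x, y):
    """Raster value at (x, y); row 0 is the southernmost row in this sketch."""
    return raster[int((y - ymin) // cell)][int((x - xmin) // cell)]


polygons = {"field_1": [(0, 0), (200, 0), (200, 100), (0, 100)]}
slope = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 rows x 3 columns, 100 m cells

points = make_point_grid(0, 0, nx=3, ny=2)
# "Vector side": which polygon does each point fall into?
vector_side = {pid: name
               for pid, (x, y) in points.items()
               for name, ring in polygons.items()
               if point_in_ring(x, y, ring)}
# "Raster side": raster value at each point.
raster_side = {pid: sample_raster(slope, 0, 0, 100.0, x, y)
               for pid, (x, y) in points.items()}
# Join both sides on the point ID.
joined = {pid: (vector_side[pid], raster_side[pid]) for pid in vector_side}
print(joined)
```

The sketch also makes the drawback of the approach visible: only what lies under a sample point is recorded, so anything between the points is lost.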
13:51
This is our additional data. As already said, we have address data for biogas plants in Germany as Excel sheets, and we wanted to integrate the locations of the biogas plants into our database, because biogas plants are regarded as a potential driver of agricultural land use change. For the georeferencing we used a database with reference address data from the Federal Agency for Cartography and Geodesy. After homogenizing the formatting of our address strings there were still problems with joining, mainly because of the field with the street address information. As you can see, it is a combination of strings of alphanumeric characters together with [inaudible] connectors, sometimes there are only numbers, and sometimes the street addresses contain only placeholders like "object without street", and all this information is not included in the reference database. So we are mostly only able to base our georeferencing on the combination of location and postcode, and only for 35% of the biogas plants could we use the first characters of the street address field to gain additional information for joining.
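The matching logic described here, joining on place and postcode and using the first characters of the street field only where that helps, might look roughly like this in Python. It is a sketch under assumptions: the field names, the prefix length of five characters, the abbreviation rule and all example records are invented for illustration and are not taken from the talk.

```python
import re


def normalize(text):
    """Lower-case, trim, collapse whitespace and expand one street abbreviation."""
    text = re.sub(r"\s+", " ", text.strip().lower())
    return text.replace("str.", "strasse")


def match_addresses(plant, reference, street_prefix_len=5):
    """Return reference records matching a plant on place and postcode,
    narrowed by the first characters of the street where that yields a hit."""
    key = (normalize(plant["place"]), plant["postcode"])
    candidates = [r for r in reference
                  if (normalize(r["place"]), r["postcode"]) == key]
    prefix = normalize(plant.get("street", ""))[:street_prefix_len]
    if prefix:
        narrowed = [r for r in candidates
                    if normalize(r["street"]).startswith(prefix)]
        if narrowed:
            return narrowed
    return candidates  # fall back to place + postcode only


# Made-up reference records and plant addresses.
reference = [
    {"place": "Musterdorf", "postcode": "12345", "street": "Hauptstrasse", "xy": (10.0, 20.0)},
    {"place": "Musterdorf", "postcode": "12345", "street": "Dorfstrasse", "xy": (14.0, 20.0)},
    {"place": "Anderswo", "postcode": "54321", "street": "Hauptstrasse", "xy": (99.0, 99.0)},
]
plant_a = {"place": " musterdorf ", "postcode": "12345", "street": "Hauptstr. 12a"}
plant_b = {"place": "Musterdorf", "postcode": "12345", "street": "Objekt ohne Strasse"}

print(len(match_addresses(plant_a, reference)))  # street prefix narrows the match
print(len(match_addresses(plant_b, reference)))  # placeholder street: place + postcode only
```

A placeholder street entry simply fails to narrow the candidates, so the plant keeps every reference point that shares its place and postcode.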
15:45
In the end we had a cluster of matching points for each biogas plant address, and we took the centroid of this cluster as the final point geometry.
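Taking the centroid of a cluster of matching points amounts to averaging the coordinates. A minimal Python sketch, with made-up coordinates and a hypothetical function name:

```python
def cluster_centroid(points):
    """Mean x and mean y of a non-empty list of (x, y) points."""
    if not points:
        raise ValueError("cluster must contain at least one point")
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Three reference points that matched one plant address (invented values).
matches = [(10.0, 20.0), (14.0, 20.0), (12.0, 26.0)]
print(cluster_centroid(matches))  # (12.0, 22.0)
```

The larger the matched cluster, for example when only place and postcode could be joined, the less precise the resulting point location will be.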
16:08
OK, I now want to come to the discussion part of the presentation. Our research objective was to handle this huge mass of GIS data within an adequate time frame, and we can say that our approach is able to derive a comprehensive GIS data set for statistical analyses. But we also have to state that this approach is very time-consuming with regard to the data acquisition, the preprocessing and the processing. So we have to think about changes to our toolbox, and there are a few options we are considering for the future. With the first option we would keep the approach that we are using now, but work with a fixed data set and update it only in very long cycles, just to minimize the effort. The second option would be a step back: we would only prepare the data sets to a state where they are ready for the overlay operation, and the overlay itself would only be done for small data sets tailored to a specific research question. The third option would be based on the point raster approach, at least for the vector-raster overlay. It would then be very easy to add new data sets to our existing database, but we would still have the drawback of losing information.

Finally, I would like to evaluate the software we used from our point of view. In general we can say that the software we used was very capable of processing the massive GIS data with acceptable processing times; for us, acceptable means something between hours and several days. Regarding PostgreSQL, we think it is a great tool for storing this massive data: each member of our team has access to every data set they need without having to exchange files. The table manipulation processes are also very effective, PostGIS is a very strong tool for many of our processing routines, and the community is amazing. Still, we have found some drawbacks of PostGIS for our approach. The main one was the high accuracy of some functions, which resulted in geometrical artifacts or topology exceptions. GRASS GIS has shown itself to be very efficient in the processing, but for us, not being GRASS experts, it was [inaudible]. We would say that with improvements in usability and documentation, the capabilities of GRASS could be better accessed. With this I would like to finish our
20:06
presentation. We are open for questions and discussion. Thank you.

[Audience question, inaudible]

Yes, we tried to use PostGIS only, just to minimize the number of software products we had to use, but for the massive raster data it wasn't really working, I have to say. So GRASS was the only solution for us.

[Audience question, inaudible]

Yes, we tried this. We developed a [PL/pgSQL] routine for doing the same thing as we do in ArcGIS, but the problem is that it is not able to overlay more than two data sets at one time, which ArcGIS is able to do, and it is also much slower, I have to admit. Probably with the new topology in PostGIS this will change in the future, but I can only speak for [inaudible].

[Audience question about the statistical software, partly inaudible]

I don't know, because the statistical analyses are not done by the people who do the GIS part, and we have to deliver a product that our colleagues are able to work with. [SAS] is the software most used at our institute.