Tips for parallelization in GRASS GIS in the context of land change modeling
Formal Metadata

Title: Tips for parallelization in GRASS GIS in the context of land change modeling
Title of Series: FOSS4G Firenze 2022 (Part 179 of 351)
Author: Anna Petrasova
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI: 10.5446/69065
Production Year: 2022
Transcript: English (auto-generated)
00:03
Hello, my name is Anna and I will be talking today about parallelization techniques in GRASS GIS, and specifically how I applied them for projecting an urban growth model
00:20
for the entire United States. Okay, I have a lot of slides, so let's get to it. Just a brief introduction: my name is Anna Petrasova, I develop research software at North Carolina State University, and I'm also a GRASS GIS user and developer.
00:46
So in case you don't know (although you should know after all the talks today), GRASS GIS is an open-source geoprocessing engine: it has a lot of processing tools, and it also has a lot of interfaces.
01:00
I will be mostly showing examples using Bash and Python here. So when we talk about parallelization in GIS, I think it's useful to distinguish between tool-level parallelization and workflow-level parallelization. What I mean by that is that when you run a tool, the tool itself may be parallelized, and you can specify, for example, how many cores you want it to use; see the sketch below.
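For example, a hedged sketch of such a tool-level call, assuming a running GRASS session (the raster names are illustrative):

```python
import grass.script as gs

# tool-level parallelization: r.neighbors itself distributes the
# moving-window computation over 4 cores via its nprocs parameter
gs.run_command(
    "r.neighbors",
    input="elevation",
    output="elevation_smoothed",
    method="average",
    size=9,
    nprocs=4,
)
```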
01:30
But sometimes the tools are not actually parallelized and can use only one core,
01:42
but you may still be able to speed up your GIS workflow by grouping the processing in a certain way and parallelizing it on the workflow level. So in this case you are computing the distance to water, roads, and forest
02:07
features in parallel. So in GRASS GIS we have basically two types of parallelization going on: multi-threading with OpenMP and multi-processing with Python.
02:26
So OpenMP is used in GRASS for the parallelization of the actual geospatial algorithms, which are typically in C or C++. And I would say OpenMP is relatively easy compared to other parallelization techniques
02:45
such as MPI and so on, which are often used on HPCs. That doesn't mean it's super easy compared to Python. The advantage of OpenMP is that it is reasonable to just integrate it into the existing non-parallel
03:04
code, typically just by annotating loops with OpenMP pragmas, as the example shows. Another advantage, for distributing the algorithms, is that you don't have to maintain separate code bases for the parallel and non-parallel code.
03:21
Also, the code compiles even if you don't have the OpenMP library installed. These are the currently OpenMP-enabled tools in GRASS GIS 8.2, the released version. The top three are doing moving-window analysis.
03:42
And then there is r.series for aggregation, r.patch for merging data or filling nulls, r.sun for solar radiation, v.surf.rst for interpolation from vector points, and r.sim.water and r.sim.sediment for hydrologic and erosion analysis.
04:07
There will be more coming soon, for example r.univar, which now has faster univariate statistics for large data. These were developed by Jaroslav Hofierka and also by Aaron, who was a Google Summer of
04:26
Code student last year. Okay, so the other type of parallelization I mentioned was multiprocessing in Python. Typically you would use the multiprocessing package, although there are other packages now.
04:42
And the difference here is that the multiprocessing package spawns separate operating-system-level processes, not threads. And this is a really simple example of how multiprocessing can be done, sketched below.
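A hedged reconstruction of such a simple example, assuming a running GRASS session (it parallelizes the distance computations mentioned earlier; the raster names are illustrative):

```python
from multiprocessing import Pool

import grass.script as gs

def distance_to(features):
    """Compute distance to the given features; runs in its own process."""
    gs.run_command(
        "r.grow.distance",
        input=features,
        distance=f"distance_to_{features}",  # unique output name per task
    )

if __name__ == "__main__":
    # compute distance to water, roads, and forest in parallel
    with Pool(processes=3) as pool:
        pool.map(distance_to, ["water", "roads", "forest"])
```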
05:01
So it's fairly straightforward, and it's used in a couple of different tools in GRASS and also in tools in add-ons. So this was the tool-level parallelization. Now let's say you are scripting a workflow and thinking about how you can speed it up.
05:24
So again, Python is your friend. You can fairly simply parallelize your workflow if you have multiple independent tasks. This is an example where you can simply compute increasing levels of inundation using
05:45
the module r.lake. And the only thing you really have to care about is that your output rasters need to have unique names.
06:03
Similarly, in Bash you can do it perhaps even more simply. As you probably know, on a Unix system you can just use the ampersand to send a process into the background, and you can do that for a couple of processes. If you have really a lot of these calls, you can, for example, generate them or write
06:24
them into a file and then just execute the file using, for example, GNU Parallel or other alternatives. You can also combine this with the tool-level parallelization. Here, for example, we can call r.neighbors as separate background processes, but each
06:43
of them is actually running four threads, so then eight of your cores should be busy. You just have to be careful that you don't oversubscribe, although it's typically not a huge issue. A sketch of this combination follows.
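A hedged sketch of that combination in Python, assuming a running GRASS session (two concurrent r.neighbors calls, each using four threads; the raster names are illustrative):

```python
from multiprocessing import Pool

import grass.script as gs

def smooth(raster):
    # workflow level: each call runs as a separate process;
    # tool level: each r.neighbors call uses 4 threads internally
    gs.run_command(
        "r.neighbors",
        input=raster,
        output=f"{raster}_smoothed",
        size=15,
        nprocs=4,
    )

if __name__ == "__main__":
    # 2 processes x 4 threads = 8 busy cores
    with Pool(processes=2) as pool:
        pool.map(smooth, ["elevation_2020", "elevation_2021"])
```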
07:00
Another approach is tiling, where you can use the GridModule class from GRASS's Python API, which conveniently wraps the whole process: tiling the data, computing per tile in parallel, and then merging the data back. This is, of course, not suitable for some types of computations, like watershed modeling,
07:24
but it works very well for other types. You can also specify overlap so that your edges are correctly computed. And there is a wrapper if you want to use the tiling for raster algebra, and that's
07:41
the r.mapcalc.tiled add-on, which has exactly the same syntax as r.mapcalc but runs it in parallel.
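A hedged sketch of the tiling approach with GridModule from the PyGRASS API (tile dimensions and map names are illustrative, and the constructor arguments may differ slightly between versions):

```python
from grass.pygrass.modules.grid import GridModule

# tile the current region, run r.neighbors on each tile in 4 parallel
# processes, then patch the per-tile results back into one raster
grd = GridModule(
    "r.neighbors",
    width=2000,   # tile width in cells
    height=2000,  # tile height in cells
    overlap=7,    # half the window size, so tile edges are computed correctly
    processes=4,
    input="elevation",
    output="elevation_smoothed",
    size=15,
)
grd.run()
```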
08:03
Okay, so this was an overview, but now I would like to mention several random, or less random, tips, tricks, and benchmarks I ran into during my computations. So let me walk you through this. This is an r.neighbors benchmark.
08:21
It's around 400 million cells, so it's not huge, but it's a decent size. The first plot on the left is showing the computation time; the y-axis is
08:42
logarithmic, and on the x-axis you have the number of cores. Okay, let me first say what r.neighbors does: it's a moving-window analysis, so it can compute the average, standard
09:01
deviation, and so on. And you can see that with increasing size of the moving window, the time jumps up really fast. But what may sometimes be more interesting than the absolute time is how
09:22
efficiently the algorithm can use the cores, because you want to know how many cores you can throw at it and still get efficient usage of the cores. And that's what the second plot on the right shows.
09:45
And I think what's interesting here is the parallel efficiency. Parallel efficiency means: if you use 10 cores, you might expect a 10-times speedup, but that's usually not the case; a 10-times speedup on 10 cores would be a parallel efficiency of one, or 100%.
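In other words, with $T_1$ the runtime on one core and $T_n$ the runtime on $n$ cores:

$$
\text{speedup} = \frac{T_1}{T_n},
\qquad
\text{parallel efficiency} = \frac{\text{speedup}}{n} = \frac{T_1}{n\,T_n}.
$$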
10:02
Usually it's lower, for various reasons. Here you can see the parallel efficiency is really influenced by the size of the moving window. For the small 7x7-cell window size,
10:28
the parallel efficiency drops off really fast, around four cores. So that means that if, in this case, you tell r.neighbors to use more cores,
10:42
you won't necessarily get there faster. But if you have a really large window size, you can actually use many more cores and you will still get a speedup with
11:00
each additional core. So I guess the takeaway message here is: if you use these OpenMP-enabled moving-window tools, feel free to use as many cores as possible if you have really large window sizes.
11:23
Otherwise, it really won't help you. There are even more benchmarks in the manual pages of the tools I showed earlier, so feel free to look at those. They can be helpful when you are trying to decide how to distribute your cores
11:41
among different parallel computations. These were derived with the GRASS benchmarking library, so you can even use that library if you want to experiment yourself; a sketch follows. Overall, because all these modules use different algorithms, how they scale really depends on the particular implementation.
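A minimal sketch of how the grass.benchmark library can be used, assuming GRASS 8.x and an existing elevation raster (names are illustrative and the exact function signatures may differ slightly):

```python
from grass.benchmark import benchmark_nprocs, nprocs_plot
from grass.pygrass.modules import Module

# time r.neighbors with an increasing number of cores, repeating each
# run to average out noise, then plot runtime against core count
result = benchmark_nprocs(
    module=Module(
        "r.neighbors",
        input="elevation",
        output="benchmark_tmp",
        size=15,
        run_=False,  # build the module object without running it yet
    ),
    label="r.neighbors 15x15",
    max_nprocs=8,
    repeat=3,
)
nprocs_plot([result], filename="r_neighbors_benchmark.png")
```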
12:03
But what I found is that if you are just doing computations on your laptop, with data that is not huge, and you just want to get the result faster, usually four cores is what will really help you get the most out of the parallelization.
12:23
If you are doing large-data computations and you have many more cores, then feel free to use more as needed. Then, an often-asked question about GRASS GIS is:
12:44
how do I parallelize computations if I need to compute them in different geographic regions? GRASS uses the computational region for raster computations, but it is limited to a single mapset.
13:06
So you can't use multiple computational regions; if you try to do that in parallel, you will run into issues. But there is actually a trick to get there.
13:23
In this case, we are computing viewsheds for different points along a road, and here you actually want a different computational region for each of the viewshed computations. So the trick here is really the line where you define the region
13:46
and save it into the GRASS_REGION environment variable. Even better, if you want even finer control over which tool
14:00
is using which computational region, you can get a copy of the environment, set the environment variable there, and then pass that environment into the call of the module. Then you have absolute control over which computational region is used with which tool; see the sketch below.
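A hedged sketch of this trick, assuming a running GRASS session (r.viewshed stands in for the tool from the slide; the coordinates, window size, and raster names are illustrative):

```python
import os
from multiprocessing import Pool

import grass.script as gs

def viewshed(point):
    """Compute one viewshed in its own computational region."""
    x, y, i = point
    # copy the process environment and give this call its own region,
    # saved in the GRASS_REGION variable, without touching the mapset's region
    env = os.environ.copy()
    env["GRASS_REGION"] = gs.region_env(
        align="elevation", e=x + 500, w=x - 500, n=y + 500, s=y - 500
    )
    gs.run_command(
        "r.viewshed",
        input="elevation",
        output=f"viewshed_{i}",
        coordinates=(x, y),
        env=env,
    )

if __name__ == "__main__":
    points = [(635000, 220000, 1), (636000, 221000, 2), (637000, 222000, 3)]
    with Pool(processes=3) as pool:
        pool.map(viewshed, points)
```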
14:27
Okay, then the tiling approach. What I was having trouble with there was related to large overhead: part of the tiling approach is that you typically need to merge the data back.
14:42
And that part takes quite a bit of time. Also, there can be problems with I/O. So I didn't really get a lot of parallel efficiency out of the tiling approach unless I actually used it with large data.
15:07
Also, it's better for long-running, in-memory computations, because then you can avoid the problem with I/O.
15:21
And if you are doing raster algebra, it makes more sense for really complicated raster algebra expressions. Also, I found out that it really matters how you tile your data.
15:41
You would think of tiling the data as squares or something like that, but if you actually do slices, it will run faster. I think the merging part is currently written in a way that prefers row-wise computations,
16:02
and this will impact the speed. Now, the previous examples expected that you are running the tools from a GRASS session, but you don't always want to do that. That's what the grass --exec interface can do.
16:27
And these are just different examples: you can create a new mapset (a new subproject), you can execute a script this way, and you can also use a temporary mapset; see the sketch below.
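A hedged sketch of launching such calls in parallel from Python (the project path and script name are made up; each grass --exec run gets its own session, here in a temporary mapset via the --tmp-mapset flag):

```python
import subprocess
from multiprocessing import Pool

def run_state(state):
    # run a processing script in its own temporary mapset of an
    # existing project; "/data/project" and process_state.py are
    # illustrative names
    subprocess.run(
        [
            "grass",
            "--tmp-mapset", "/data/project",
            "--exec",
            "python", "process_state.py", state,
        ],
        check=True,
    )

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        pool.map(run_state, ["NC", "SC", "VA", "GA"])
```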
16:42
You can combine this: again, you can put all these calls in a file and then execute them in parallel. There are two problems I was running into, and hopefully they will be fixed soon. So this is a warning.
17:01
Currently, do not use a mask in parallel within the same mapset. This is going to be addressed in 8.3; there are open PRs for this. Another one is with r.reclass, which reclassifies rasters.
17:21
The problem is specifically that it's not safe for parallel processing, because it actually writes back-links into the same base raster file, so you can actually get corrupted data. Hopefully we will address this as well.
17:42
This is just a random thing I ran into, and you might too if you work with Jupyter Notebooks: multiprocessing in Python might not work there. You have to use the if __name__ == "__main__" guard, shown below, and call it differently; otherwise you get into some weird problems with the Jupyter kernel.
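A minimal sketch of the guard (the worker function is illustrative; in notebooks, the worker may additionally need to live in a separate importable module):

```python
from multiprocessing import Pool

def work(x):
    return x * x

# without this guard, spawned child processes re-import the script or
# notebook and try to start their own pools, which fails or hangs
if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(pool.map(work, [1, 2, 3]))
```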
18:05
Now, just really quickly, about the application I was using these techniques on: we are computing an urban growth model for the contiguous United States.
18:24
The model we use is FUTURES. It's implemented in GRASS GIS as a set of tools, so it's a whole GIS workflow with data pre-processing. You can look at the link if you are interested in that.
18:41
On this link you can find a notebook with a parallelized workflow I did for a smaller part of the US; it's still two billion cells. It will show you how you can, for example, parallelize your workflow. The actual case study is 16 billion cells,
19:05
and we ran this on our institutional HPC. Here are just some takeaways from that. For some parts of the workflow, it really matters that you have parallelization.
19:22
This is one of them, where we use the r.futures.devpressure tool, which internally uses r.mfilter, which is OpenMP-enabled. Here you can see we use quite a large window size, and if we ran it as a serial tool,
19:43
it would take about five days. With the OpenMP parallelization, it's done in four and a half hours, so that's a big difference. On the other hand, for some of the computations, if you are not repeating them that often, it doesn't matter that much whether you wait
20:02
half an hour or 10 minutes. Sometimes it matters, sometimes it doesn't; that's up to you to decide. Because the US is big, I had to split things up:
20:20
the simulation itself is not parallelized, for different reasons, so what I ended up doing is splitting the computation by state. Also, the model itself is stochastic, which means you have to run it multiple times, so that's a lot of runs you need to do. What I ended up doing is using
20:44
the grass --exec interface, and we used a tool available on our HPC which distributes these individual calls using MPI to different cores on different nodes of the HPC.
21:04
One problem with this is that Texas is just too big: it slowed everything down, because all the processes had to wait for Texas to finish. So what we ended up doing is splitting Texas in half.
21:24
Okay, and this is just so that you have an idea of what I was actually computing there. This is the urban growth model; I will briefly show you some pictures. This is the probability layer of where development would happen by 2100.
21:51
And this is the talk. You can find the talk over there, and all the rest are links you can then get to.
22:02
If you are interested in the FUTURES tutorial, you can try FUTURES in JupyterLab online. And in the GRASS GIS workshop, some of those parallelization tricks are mentioned as well, so you can try them there.
22:23
Okay, thank you.