Analysing access to UK public rights of way with the QGIS Graphical Modeler
Formal Metadata
Number of Parts | 351
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/69175 (DOI)
Production Year | 2022
Transcript: English (auto-generated)
00:01
Good. Thanks very much, Chris. And welcome, everybody. If you were here earlier, you would have had my colleague Andy talking about Astun Technology, so I'm not going to talk too much about that. You can look us up if you're interested. But I'm going to be talking about something different to what most of the people in the room today have been
00:21
talking about, and that's the QGIS graphical modeler. And in particular, I'm going to talk about a project that Astun's been doing, which is a bit out of the ordinary in terms of the kind of work that we do, although we're doing a bit more of this kind of work. And it's working with an organization in the UK called
00:40
Ramblers. So for those of you who are not familiar with it, basically it's the walking organization in the UK. It's people that go out and walk. It's a very large, active membership organization. It's a nonprofit. And as well as
01:01
having an interest in the state of paths and the people that use them, it's also very active in lobbying for improved access to the countryside and basically for getting people outdoors. So it's got a great kind of mission
01:20
and, as I say, very active membership. So they came to us and asked us to do some work on an analysis of access to the rights-of-way network. So there's a rights-of-way network in the UK where anybody can go and walk. It's data which is maintained and published by local authorities.
01:42
And they wanted to know who was able to access that, how easy it was for people to get access to it, because their suspicion was, their hypothesis was that it wasn't enough people and it wasn't necessarily the right people. So they wanted some evidence to back this up. So they commissioned a research
02:01
project involving us and a research organization, New Economics Foundation, and got us to have a look at it. We did the kind of data crunching and the research organization did the analysis and wrote a 43-page report.
02:21
So my talk's about the graphical modeler, and I was in bed with COVID a few weeks ago, lying there thinking about various things, and I thought, wouldn't it be a good idea, because I'm talking about the graphical modeler, maybe I could do my talk using the graphical modeler to model
02:42
the process of my talk. So that's what I'm doing today. It may or may not work. You'll be able to tell me at the end whether it worked. I did a presentation a few years ago using QGIS Atlas and that was vaguely interesting. So we'll see how this one goes.
03:01
Maybe next time I'll use the expression builder or something like that. So here's my talk. And what we're going to be focusing on is the data that we were looking at. Paths, shape, and essentially where it is in relation to the population of the UK. That was
03:23
the analysis. So a relatively limited number of data sets, but quite a lot of different questions that we had to answer. And these were the kind of questions that we were asked. I'm not going to go through all of them in detail, but it was about
03:42
within a buffer of each postcode (the UK's zip codes, which are areas where I guess on average about 40 or 50 people live), what was there in terms of rights of way within 800 meters, 600 meters, 3,000 meters, and so on? And then what was the quality of those rights of way? What was
04:04
wild? What was green? What was forest? What was near rivers? And so on and so forth. So there was quite a bit of digging in terms of those sort of questions, and also quite a lot of interaction to try and tie down not just the questions they wanted answers to, but
04:23
what questions it was going to be possible to answer. Because they're dependent on data availability, and they had a limited budget, and we needed to focus on the quick wins that we could do stuff with in the time allocated. So this is the list of questions we came up with
04:42
in the end. This is New Economics Foundation, as I say. This is the organization who did all the sort of clever stuff at the end involving looking at all these stats and boiling them down into something that could be made into policy recommendations. So that report is still in draft actually. It's not been published yet, but I think it's going to be published
05:02
pretty soon. So the timing is good. Keep an eye out for that if that's something you're interested in. So we came up with a list of models, and we essentially went through and developed a model for each of those questions. Just a quick
05:22
show of hands, by the way. How many of you have worked with graphical modeling in a reasonably serious way? Okay, so that's interesting, because that's about half. It's interesting, I mean, I don't see that much about it, but it's, so I think it's a sort of slightly under, I don't know if it's
05:42
underused, but it's not something that tends to get a lot of publicity, but it's, you know, my conclusion jumping to the end is that it's a really solid tool, and it was actually essential for this work. So why did we use it for this work? Well, I mean, I work with QGIS a lot, so if I'm
06:02
asked to do some work, my first port of call tends to be to look for whether there's something in QGIS I can use. I can write SQL, I can hack a bit of Python, but actually I'm much more comfortable sitting in front of QGIS and pressing buttons. So if we can find something that allows me to do that, then that's a win as far as I'm concerned.
06:22
And the other reason it's a win is because Ramblers are keen to ramp up their own GIS capability. You know, they've got a lot of smart people, they're very capable, they've got limited experience of the kind of tools that we're using, but they're very keen to learn. So the other strong
06:42
factor in terms of using the QGIS modeler was that at the end of the day we wouldn't just be producing a bunch of statistics and walking away, we'd have something that we could hand over to Ramblers in terms of something they could use in the future so that they could feed different data sets into it so that they could look at it in a year's time
07:01
and see what had changed in the period since we produced our figures. So this transferability factor was very important to us and important to them. So we looked at these models, I'm not going to go into all of them in detail, but I thought I'd focus in on one of them in particular.
07:22
And this was about, it was a category of land in the UK called open access land, which is what it sounds like, you know, you can go onto it and you can have your picnic and wander around and so on. And they were keen to know first of all, what was the length of rights of way within open access land,
07:42
that's fairly straightforward you would probably think, but also the extent to which open access land actually connected with the rights of way network, because open access land is all very well and good, but if you can't actually get to it easily, then it's not so useful.
08:01
And they had a suspicion that there was quite a lot of open access land which actually wasn't connected or was too small to be very useful. So that was the other question they asked us. And then the final question, so it's a sort of three part question, this one really, was from every postcode in the
08:21
country, or at least in England and Wales, how far is it to your nearest bit of open access land? How far do you have to travel along using the rights of way network in order to get to a bit of open access land? So there were three parts to that question. And for each of these questions, you know, we had to go through a process of saying, okay
08:41
this is what we think we want to know, we'd say okay, well we think in order to find that out this is what we're going to have to work through, and we'd reach a point where we'd agree, and quite often we'd produce some data and we'd say no, I don't think that's quite right. So it's very much an iterative process, but this was what it boiled down to. So
09:02
just to give you a bit of context on the data, about 50,000 lines is our rights of way network data. It was actually quite hard to get hold of, but that's another story we can, if anybody's interested I can talk to them about that, even though
09:21
it's maintained by local authorities in the UK. And it represents I think something like 200,000 kilometers of rights of way, the rights of way network in the UK. We have these open access areas of about 14,000 polygons, and then one and a half million postcodes
09:41
covering, again, England and Wales. So there's some quite chunky data in there, and as we will see, you know, processing and performance is one of the issues we came up against and had to grapple with. So I'm going to ask for a bit more audience participation at this point.
10:01
So having given you, which I'm sure you all remember in detail, having told you about that question, I'd just be interested to know how many processes, how many individual processes in a QGIS workflow people think that that might involve. So hands up for less than five. Okay. Hands up
10:23
for six to ten. Hands up for eleven to fifteen. And over fifteen. Okay, so the answer's fifteen. So you're pretty good.
10:41
You're a smart audience, I mean, as if I didn't know. But yeah, it was more than I thought, to be honest, because you end up having to do things that are not really, you wouldn't think of as being processes when you design a process. You know, there are processes, and then there are processes you have to do in order to make the processes work or work better or to tidy things up.
11:04
So yeah, there were a lot of processes, and it turned out to be a slightly more complex workflow than I thought. So we've done that, and I was unable to catch anybody out.
11:20
So what I wanted to do, though, was not to take you through the whole workflow, because that would take probably an hour or so, but you know, here's a screenshot of the workflow overall. Most of the fifteen processes are in there. If you've used graphical modeling, you'll be familiar with that kind of graphic. Just to focus in on a few of the things that maybe were unanticipated or we needed to tweak. So things
11:44
like simplifying data. We found performance throughout was an issue. We had some processes which were taking twenty-four hours to run, which was a bit of a pain. It wasn't a tight deadline project, but when something runs for twenty-four
12:00
hours, you sort of forget about it and start doing something else, and it takes you another half hour to work out what you were trying to do in the first place. So we needed to speed things up. Simplifying our polygons made a big difference, because we realized it wasn't really important. We're not interested in the detail of the boundary of each polygon. It's not going to make any difference to our answers. So we could simplify to quite a reasonable extent, not lose any
12:24
quality in our output and gain a lot on performance. So things like that were useful. We found we had to do quite a lot of sort of pulling things apart and putting them back together again. So turning things from polygons into lines and then linking them back up with our polygons, because we needed the area of the polygon to do our
12:44
final analysis. So we needed to add IDs for example, because we were producing data sets as part of the process which didn't have IDs, so we needed to give them IDs so that we could get back to them and find out what their area was.
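The boundary simplification mentioned a moment ago can be illustrated outside QGIS. The following is a minimal sketch in plain Python of the Douglas-Peucker method, which removes vertices that lie within a tolerance of the simplified line. The talk does not say which simplification method or tolerance was used, so the function names, the sample coordinates, and the tolerance of 1.0 are all assumptions made for illustration.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep only vertices farther than `tolerance` from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord between the end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [points[0], points[-1]]
    # Otherwise keep that vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical stretch of polygon boundary (illustrative coordinates only).
boundary = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = simplify(boundary, tolerance=1.0)
print(len(boundary), "vertices reduced to", len(simplified))
```

Fewer vertices means fewer segment comparisons in every later overlay step, which is why a coarse boundary can pay off when fine boundary detail does not affect the answer.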
13:01
Spatial indices, you know, any of us who've worked with spatial data know that spatial indices make a vast difference to how quickly our data work. QGIS helpfully usually tells you if you're running a process which is not going to be very fast. So I found that we were adding spatial indices maybe two or three times during
13:24
a process, because every time you produce a new output, you think oh, I've already got a spatial index on that, but you don't, because you're creating a new output which needs a new spatial index. So you need to drop in a spatial index. I mean it's very quick, it doesn't really take much in the way of processing time.
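To show why an index helps, and why each new output needs its own, here is a toy grid index in plain Python. It is a much simpler structure than the index QGIS builds, and the class name, cell size, and bounding boxes are assumptions made for this sketch. The index stores feature ids per grid cell, so a query only inspects features in nearby cells instead of scanning the whole layer.

```python
from collections import defaultdict


class GridIndex:
    """Toy spatial index: buckets bounding boxes into square grid cells."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)
        self.boxes = {}

    def _cells_for(self, box):
        # Yield every grid cell touched by the box (xmin, ymin, xmax, ymax).
        xmin, ymin, xmax, ymax = box
        size = self.cell_size
        for cx in range(int(xmin // size), int(xmax // size) + 1):
            for cy in range(int(ymin // size), int(ymax // size) + 1):
                yield cx, cy

    def insert(self, feature_id, box):
        self.boxes[feature_id] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(feature_id)

    def query(self, box):
        """Return the sorted ids of features whose bounding box overlaps `box`."""
        candidates = set()
        for cell in self._cells_for(box):
            candidates |= self.cells.get(cell, set())
        xmin, ymin, xmax, ymax = box
        hits = []
        for fid in candidates:
            fxmin, fymin, fxmax, fymax = self.boxes[fid]
            if not (fxmax < xmin or fxmin > xmax or fymax < ymin or fymin > ymax):
                hits.append(fid)
        return sorted(hits)


# Hypothetical layer: feature id -> bounding box.
layer = {1: (0, 0, 50, 50), 2: (120, 120, 180, 160), 3: (40, 40, 130, 130)}

index = GridIndex(cell_size=100)
for feature_id, box in layer.items():
    index.insert(feature_id, box)

near_origin = index.query((10, 10, 20, 20))
near_centre = index.query((125, 125, 126, 126))
print(near_origin, near_centre)
```

The index belongs to the layer it was built from. A derived layer holds new features, so it starts without an index until one is built for it, which matches the experience of adding an index step after each new output.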
13:42
And then we did this thing: one of the things I mentioned that they were interested in was the intersections, you know, the degree to which these areas intersected with the rights of way. So how many points did you have for getting into these areas from the rights of way?
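One way to count such access points can be sketched in plain Python: turn each polygon boundary into line segments that keep the polygon's id, count where path segments cross those boundary segments, and report the count next to the polygon's area. This is an illustration of the idea only, not the QGIS model from the talk; the function names and sample geometries are assumptions, and segments that merely touch or overlap are deliberately not counted.

```python
def ring_area(ring):
    """Shoelace area of a ring given as (x, y) vertices, first vertex not repeated."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def segments(vertices, closed=False):
    """Split a vertex list into consecutive segments; close the ring if asked."""
    pts = list(vertices)
    if closed:
        pts = pts + pts[:1]
    return list(zip(pts, pts[1:]))


def orientation(a, b, c):
    """Sign of the turn a -> b -> c: 1 left, -1 right, 0 collinear."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def crosses(s1, s2):
    """True when two segments properly cross (touching end points are ignored)."""
    (a, b), (c, d) = s1, s2
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


def count_access_points(polygons, paths):
    """polygons: {id: ring}; paths: list of polylines. Returns {id: (area, crossings)}."""
    result = {}
    for poly_id, ring in polygons.items():
        boundary = segments(ring, closed=True)  # polygon -> lines, id kept as the key
        crossings = sum(
            1
            for path in paths
            for path_segment in segments(path)
            for boundary_segment in boundary
            if crosses(path_segment, boundary_segment)
        )
        result[poly_id] = (ring_area(ring), crossings)
    return result


# Hypothetical open access areas and rights of way (illustrative coordinates).
areas = {
    1: [(0, 0), (10, 0), (10, 10), (0, 10)],
    2: [(20, 0), (30, 0), (30, 10), (20, 10)],
}
rights_of_way = [
    [(-5, 5), (15, 5)],   # passes straight through area 1
    [(5, -5), (5, 5)],    # ends inside area 1
]
access = count_access_points(areas, rights_of_way)
print(access)
```

Keying the result by polygon id is what lets the crossing count be joined back to the polygon's own attributes, such as its area, after the geometry has been broken into lines.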
14:03
And unless somebody can tell me different, the only way I could find of doing that was to turn the polygons into lines in order to get the intersections between lines and lines. So I turned the polygons into lines and then ran a QGIS process which identifies those intersections, and then
14:22
I had to link back up to the polygons, you know, because it's the polygons that I was interested in. So there's a bit of going around the houses to get there. This specific counting-intersections bit of the process was probably the one that needed the most thinking about and needed the most sort of
14:42
experimentation. And the rest of it, as you can see, is running these parallel processes and then putting things back together again at the end. And all that works pretty well, you know, once you've kind of worked out your process and the things that you need to do,
15:02
it all runs. So, what did we learn from that and what are my conclusions? Well, you know, I mean I have to start off by saying that the graphical modeler is, to me, a very useful and very solid bit of kit. You know, it didn't fail.
15:22
It didn't fall over. You know, I absolutely can't fault its robustness. There is quite a learning curve in some parts of it, and there are things that I kind of discovered at the end and I wish I'd discovered at the beginning, but that was a failing on my part rather than anything else. So, you know,
15:41
it is a very, as a workflow tool, it was absolutely right for the job. Specifically, it's got useful things like you can deactivate parts of the model so if you don't want to run the whole thing, you can turn bits on and off. That's something I found out quite early on happily so I was able to sort of run subsets of the model.
16:04
I learnt that it's good to generate the intermediate outputs. Don't just rely on getting your final output and then working with that because you'll probably find along the way that something's gone wrong. If you've got all your intermediate outputs in memory, you can go back and kind of trace through and see where it's gone wrong. I'm not
16:21
talking about errors in the processing, I'm just talking about where you didn't do something right so you don't get the answer you wanted. There's some general comments on documentation. You can document processes, you can document your inputs, you can put comments on stuff, you can see them in the map here. Really useful. Do that as you go
16:42
because tomorrow or next week you won't remember quite why you put in that process so just put a few notes in. And you can also document at the model level. The model help is really useful, you can put the model metadata, you can put an overall description, you can describe your inputs and outputs. Again, we all know this about documentation,
17:01
most of us don't do it but document as you go along because it's much harder to do it at the end when you've forgotten why you did something. And the graphic is useful just as a communication tool to explain to somebody else what you're doing, why you're doing it, particularly
17:21
people that aren't used to working with this sort of data. There is a whole load of issues around processing time and working with big data sets, that's inevitable. Spatial indices and there was one model where I kept producing these outputs
17:42
until I realized that they were outputs we were never going to change, you know, it's like pre-processing. So when I finally twigged onto that, the process ran a lot quicker. I just produced them once and then they became my inputs, so that was worth reminding myself about really.
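The same "produce it once, then treat it as an input" idea can be sketched in plain Python. This is an analogy, not a feature of the graphical modeler: the helper name, the JSON file format, and the sample data are assumptions. The expensive step runs only when its stored result is missing, and later runs read the stored file instead.

```python
import json
import tempfile
from pathlib import Path


def cached_output(path, build):
    """Return the result stored at `path`, running `build()` only if it is missing."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    result = build()
    path.write_text(json.dumps(result))
    return result


build_calls = []


def build_static_layer():
    # Stand-in for a slow pre-processing step whose result never changes.
    build_calls.append(1)
    return {"polygon_ids": [1, 2, 3]}


with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "static_layer.json"
    first = cached_output(target, build_static_layer)   # builds and stores
    second = cached_output(target, build_static_layer)  # reads the stored copy

print("builds run:", len(build_calls))
```

In a model, the equivalent is to save the unchanging intermediate layer to disk once and wire it in as a model input, so repeated runs skip that branch entirely.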
18:02
You can do things like clean up with the Retain fields process, which I came across in QGIS, so you can clean up stuff that you don't want, because if you join stuff together you end up with loads and loads of spurious attributes, probably three different versions of area or something, so you can scrub all those and only retain
18:21
the ones that you want. And you've got SQL as a backup if you need it. The progress bar is probably one of the things that is sort of least reliable in the graphical modeler because the progress bar works out progress on basically the number of processes. So if 80% of your
18:41
processing is in the last 20% of your processes, it's not going to give you an accurate representation of where you are. Progress bars are progress bars, we know all about those. What else did I wish we'd done? Apart from all those things,
19:02
I had the idea, I kind of like the idea of packaging everything so I saved my model in the project and I saved the project in a geo-package and everything sort of Russian doll-like and then I realized that actually it's not so great in terms of transferability because it's better to keep them separate as separate model entities because then
19:21
it's easy enough to fix but I got a bit carried away with trying to keep everything nicely packaged. It's actually, you know, make a model, it's a stand-alone artifact, keep it stand-alone it's much easier then to find it, get at it, don't risk losing it if something terrible happens to your project, keep it as a stand-alone file.
19:43
There was a few things, you know, with the processing toolbox, everybody who's used it knows there's loads of stuff in there that you just keep coming across so I found right at the end of the project a whole set of tools called modeler tools. I thought, okay, that's maybe something I should have looked at earlier.
20:01
Anyway, so I know next time I'm looking forward to using the modeler tools next time I use the modeler, maybe other people have used them, I don't know. So I've nearly finished, what else, the logs there's a lot in the logs, they tell you everything that's going on, they tell you the timing so again, don't just ignore the logs and move on, it's worth having a look at them.
20:24
So just to throw out a few of the things that came out of this draft report, you know, what they say is, based on all this evidence, public rights of way provision is deeply unequal. Probably not surprising, but it's evidence-backed now. We've got this evidence, we've done this analysis.
20:41
Residents of the least deprived areas have got 80% more rights of way provision than people in the most deprived areas, so there's this massive inequality. You know, largely it's about where people live, I guess, urban and rural and so on, but there are health implications, there are, you know, all sorts of other implications. This is just a few
21:01
sort of random quotes I pulled out from the report. So it's really nice to be involved in, you know, what our operations director likes to call a real GIS project, apologies to everybody else who does the real GIS, but to actually work through and take publicly available data, it's all publicly available
21:21
data, and produce this kind of evidence, that's quite satisfying. And the graphical modeler, it turned out, was the tool for the job, so we were very happy with it. Thank you.