
Bringing Validation to Users: Integrating Quality Assurance Checks into Map Editors


Formal Metadata

Title
Bringing Validation to Users: Integrating Quality Assurance Checks into Map Editors
Title of Series
Number of Parts
70
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
There is a need for validation in crowdsourced mapping to ensure that the quality of created data meets the community’s agreed upon standards and best practices. The OpenStreetMap community has created many Quality Control (QC) tools (Osmose, KeepRight, OSMLint, etc.) to identify existing errors within OpenStreetMap data, but there has not been as much emphasis on Quality Assurance (QA) tools to prevent issues from being created during the editing process. We are developing methods to introduce these data quality checks in both the iD and JOSM editors to educate mappers and provide immediate feedback while they are mapping.

In order to introduce these new tools, we first need to recognize the methods a user typically uses to learn how to contribute to the map properly:
- Perusing the many pages on OSM Wiki;
- Reaching out to community members on mailing lists, forums, or Slack groups; or,
- Following detailed instructions and receiving feedback (sometimes untimely) on tasks completed in a focused project (via Tasking Manager or MapRoulette)

This process is accepted and works for those who are resourceful and careful, but we wanted to reduce the barrier to entry for new mappers by creating MapRules. MapRules is an interface to create instructions which generate custom presets and validation rules that are then integrated into the existing validation frameworks in JOSM and iD. Contributors are directed on how to map features according to the generated rules and are provided instant feedback if their changeset does not meet the set specification.

In addition to using MapRules to create specifications for collecting features, we are creating validation checks specifically for JOSM by extending the Validation Tests and using MapCSS based on rules found in QC tools like Osmose, KeepRight, and OSMLint. This approach follows the paradigm of checking data before it is committed to the map. This approach is truly for all contributors and users of OSM. It clearly shows its worth to the mappers who are creating new edits, but it also aids validators to quickly identify problems where they may have had to visually and manually inspect each feature or rely on numerous QC tools outside of the editing environment.

To further assist in validation and clean up, we have created a series of corresponding Overpass queries that download only the features with identified data issues. Within JOSM, these issues can potentially be resolved by applying automatic fixes on a feature or in bulk. By bringing more validation into the regular mapping workflow, we can help create a better map.
Transcript: English (auto-generated)
Hey, everybody. We're going to get started, because I think we are the last talk of the day, so no pressure to get between people and their delicious Minneapolis food. So this is Clarisse Abalos. I'm Matt Gibb.
We both work at Maxar, and we're based out of Washington, DC. We're excited to talk to you today about some of the work that we have been doing building validation and quality assurance checks into different map editors. So there's a few different problems that we face as the OSM community. First is just the amount of data quality issues
in the OSM database. There are a great many topological and attribution issues just sort of waiting to be fixed. Second is that as OSM grows with huge numbers of new mappers, there just don't seem to be as many people willing to fix
and validate those contributions. So that's a good and a bad problem. OSM is growing, right? That's good. But we want to make sure that there's quality data within the database. And third is more of a responsibility for us as OSM contributors: we shouldn't criticize
new mappers for making mistakes if they don't have all the resources in front of them to make the best decisions when editing. So the validation and quality control tools available in the OSM ecosystem are wonderful. We just learned a little bit about Atlas, which is great.
But there are also a lot of tools out there. And they're all over the place. Some are for very specific reasons. KeepRight is very focused on highways, while tools like Osmose are a bit more comprehensive. However, not all of these tools are front and center in the eyes of new or experienced contributors.
We should be able to take these community-accepted validation checks and put them in front of the user to catch these issues before they ever make it into the database, instead of running validation checks after the fact. So our team's approach has been
to stop bad or incomplete data from being created or uploaded in the first place. This can help give immediate feedback to new mappers and give them the confidence that they may need to ensure that they're contributing quality data. And we're going to share with you some of the checks that we've created for both the iD and JOSM editors.
I'll give you a minute to read good old Dilbert. All accurate data, all of it. So in order to bring more validation to users in the editors, we have added to the validation rules in JOSM and iD, provided a way to create custom presets and validation rules,
and created validation-centric Overpass queries to pull in data that has issues. So JOSM's data validator allows you to validate your current data set as you are editing and also warns you before you upload the data, if there are any issues that you may need to fix. There are two types of data validators in JOSM.
Tag checker rules are written in MapCSS, which is great for basic topology and tag checks. Then there are tests, which are written in Java to handle the more complex geometry checks. So what is MapCSS? It's a CSS-like language for map style sheets that allows you to specify how a given feature should
be displayed. As in the picture on the left, you could have a selector for water features and specify them to be displayed in blue. It's used in JOSM for map styling and highlighting data validation issues. To the right, you'll see we created a MapCSS style that would show buildings that have the same name in the same color.
This helped us to identify potential relations that needed to be made around buildings with the same name, maybe like a campus or something. Here are some basics for creating a MapCSS validation check. In the selector, you would specify the geometry of the feature you're looking for, so node, way, relation, or star for all the geometries.
Then the attributes to look for, so in this case, features with the amenity=hospital tag that don't have a name tag. Then in the body, you would write instructions for what to do with the feature. So you could throw a warning in the JOSM validator window that says hospital without a name tag. When the validation is run, if you
have any features that match your rule criteria, the warning will show in the validation results window. So if you were to create your own MapCSS checks and save them as a .mapcss file, or more specifically, a .validator.mapcss file, you could go to your JOSM preferences and add them as an active rule,
and they would be incorporated into your local JOSM validations. What we've done is begun creating MapCSS checks based off of those common issues detected by the QA tools that Matt mentioned earlier. Here's a more advanced example based off of a KeepRight check that uses variables and pseudo-classes like node connection to find junctions where motorways are
connected directly to a highway and might be missing something like a motorway link or a motorway junction. You'll find these Osmose, KeepRight, and OSMLint rules available in your tag checker rules, and you'll just need to make them active in your preferences to use them.
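Editor's illustration: the basic hospital check described above (not the more advanced motorway example) can be mimicked in a few lines of Python. This is only a sketch of the selector-plus-warning logic, not how JOSM evaluates MapCSS; the names `Feature`, `matches`, and `validate` are hypothetical. In JOSM itself the rule would be written in MapCSS, roughly as shown in the comment.

```python
from dataclasses import dataclass, field

# Roughly equivalent JOSM MapCSS rule (saved in a .validator.mapcss file):
#   *[amenity=hospital][!name] {
#     throwWarning: tr("Hospital without a name tag");
#   }


@dataclass
class Feature:
    geometry: str                      # "node", "way", or "relation"
    tags: dict = field(default_factory=dict)


def matches(feature, geometry, required, absent):
    """Mimic a selector: geometry type ("*" for any), tags that must
    match exactly, and keys that must be missing."""
    if geometry != "*" and feature.geometry != geometry:
        return False
    if any(feature.tags.get(k) != v for k, v in required.items()):
        return False
    return all(k not in feature.tags for k in absent)


def validate(features):
    """Return (index, message) pairs for every feature that trips the rule."""
    warnings = []
    for i, f in enumerate(features):
        if matches(f, "*", {"amenity": "hospital"}, ["name"]):
            warnings.append((i, "Hospital without a name tag"))
    return warnings


features = [
    Feature("way", {"amenity": "hospital", "name": "General Hospital"}),
    Feature("node", {"amenity": "hospital"}),
    Feature("way", {"amenity": "school"}),
]
warnings = validate(features)
print(warnings)
```

Only the second feature is reported, since it is the only hospital lacking a name key.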
They are maintained on the JOSM Wiki rules page, where you too can add rules you'd like to share and help add to or improve the ones already out there. We really appreciate getting feedback, especially when it comes to refining these checks to reduce false positives.
Like Clarisse mentioned, there are two types of validators in JOSM: the MapCSS ones, which she explained how to create, and the tests that are written in Java. So where MapCSS is limited mainly to shared nodes and attribution checks, the Java tests allow for a bit more advanced analysis and geographic type
validation, for instance, looking at distances and relations between existing data. So here's an example of an Osmose analyzer called far from water, which takes a list of identified water-related man-made features, like a pier, and flags them when they are beyond a specific distance threshold from a natural water
feature. So you can see in the GIF that when the man_made=pier is moved inside the 30-meter threshold, it's no longer flagged for validation. I won't let it go through again. But if you move it outside, then it gets flagged. It's moved inside. It's no longer flagged.
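Editor's illustration: the distance idea behind the far-from-water check can be sketched in Python as follows. This is a simplified stand-in, not the Osmose analyzer or the JOSM Java test: water is reduced to a list of sample points, distance is a plain haversine, and the names `haversine_m` and `far_from_water` as well as the coordinates are made up for the example. Only the 30-meter threshold comes from the talk.

```python
import math

EARTH_RADIUS_M = 6_371_000.0
THRESHOLD_M = 30.0  # threshold mentioned in the talk


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def far_from_water(feature_pt, water_pts, threshold_m=THRESHOLD_M):
    """True if the feature is farther than the threshold from every
    water sample point (assumption: water is given as points)."""
    nearest = min(haversine_m(*feature_pt, *w) for w in water_pts)
    return nearest > threshold_m


water = [(44.9800, -93.2652)]            # hypothetical shoreline sample
pier_near = (44.9800, -93.2650)          # roughly 16 m away
pier_far = (44.9800, -93.2640)           # roughly 94 m away

print(far_from_water(pier_near, water))  # not flagged
print(far_from_water(pier_far, water))   # flagged
```

A real check would measure to the water geometry itself rather than to sample points, which is part of why this kind of test is written in Java rather than MapCSS.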
Here's another example, just looking at waterways needing to start and end inside a water body, improving the water network that exists. And if it does intersect, it needs to go the length of the water body. And again, all sorts of relationships are able to be identified here, where MapCSS wasn't able to flag these because there
wasn't a shared node. So just a bit more advanced geometry checks. And then just a third example, looking at route relations for public transport, public transport related features. I'm not going to spend too much time here, just
an example using non-natural features. If you have a specific check, you can write your own. So what have we done? We've added 58 new validators to JOSM, all using MapCSS. We didn't want to reinvent the wheel at all.
So if JOSM already had a check similar to one in Osmose or the other tools, we didn't add it. We didn't want to have duplicate checks going on. We just wanted to help create a comprehensive collection of validation tools available directly to the mappers at their point of entry.
41 of our checks have come from Osmose, seven from KeepRight, OSMLint has a few, and even one from Atlas, which we just learned about a little bit. And again, a lot of those are mainly looking at highways. And that lives in the OSMLab GitHub repo,
if you want to learn more about that, or talk to the critical folks. Like I just said, most of the checks we've looked at are related to highway issues, which makes sense, given how many organizations and just companies in general are using OSM data for routing or general road information, as well as general tagging
issues and attribution issues. That's not to mention there may be very specific one-off checks, like validating trees. If you happen to like mapping trees, make sure you put the tree type on there, or it's not as valid as it could be. And that's OK.
Because if you have a more niche area of mapping that you enjoy mapping in OSM, you can create validation checks for yourself and then share them with others so that everyone can benefit or improve upon the checks that you've made and how specific features should be mapped. Especially if you're an expert in tree mapping,
share your knowledge with everyone else so they do it right and you don't have to clean up someone else's issues. Thanks. And I want to first apologize for this. If you wanted good data visualization, you should have gone to Jennings and Seth's talk earlier. This isn't it. It's just another way to show what I was just saying.
Most of what we've converted to MapCSS is highway and tagging related, but just, again, showing there's plenty of space for improvement from other mappers. People who are great at mapping railways or whatever, go for it and please share.
So all those validations are great for identifying issues which the general public agree upon. But we've seen it time and time again where people request new presets and validations or changes to existing ones for their specific needs. Oftentimes it turns into a debate about whether or not it should be included since there are use
cases in certain parts of the world where it's actually valid. So we made an application called MapRules that can be incorporated into Tasking Manager to create campaign-specific attribution guidelines, which generate custom presets and validation rules that can tailor the editors to your campaign- or organization-specific needs.
This sort of end-to-end communication helps enforce the requirements for mapping features in a simplified, standardized, and streamlined way. Here you'll see the simple UI we created to specify what types of features you'd like the mappers in your campaign to focus on mapping and how. This can be a standalone app or integrated into Tasking Manager with an iframe.
Let's say you wanted them to map hospitals, which should all have the building=hospital tag on them and be drawn as an area or a point. And all the features with those identifying tags must have a name and may have an indication of whether or not they handle emergencies. All the dropdowns are populated with the assistance of taginfo popular combinations.
And then this turns into a clean list of guidelines for mappers to see what is required of features they add in your campaign. When they choose to edit in the iD editor from a MapRules campaign, the list of presets will be limited to the features indicated in the MapRule that can be applied to the geometry drawn.
This helps guide your users by reducing the otherwise large number of presets in iD to sift through that aren't relevant to the focus of the campaign. When a preset is selected, it automatically adds the identifying tags from the MapRule, and it only shows the fields and values relevant to the campaign and throws validation errors or suggestions for further information while editing.
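Editor's illustration: the hospital campaign described above (identifying tag, allowed geometries, a required name, an optional emergency indication) can be modeled as a small rule plus a checker. The dictionary layout, the function name `check_feature`, and the use of the `emergency` key are assumptions made for this sketch; they are not the actual MapRules schema or API.

```python
# Hypothetical rule layout, loosely modeled on the hospital example.
HOSPITAL_RULE = {
    "identifying_tags": {"building": "hospital"},
    "geometries": {"area", "point"},
    "must_have": ["name"],            # missing key -> error
    "may_have": ["emergency"],        # missing key -> suggestion (assumed key)
}


def check_feature(tags, geometry, rule):
    """Return (errors, suggestions) for one feature against one rule.
    Features that do not carry the identifying tags are ignored."""
    identifying = rule["identifying_tags"]
    if any(tags.get(k) != v for k, v in identifying.items()):
        return [], []
    errors, suggestions = [], []
    if geometry not in rule["geometries"]:
        errors.append(f"geometry '{geometry}' is not allowed")
    for key in rule["must_have"]:
        if key not in tags:
            errors.append(f"missing required tag '{key}'")
    for key in rule["may_have"]:
        if key not in tags:
            suggestions.append(f"consider adding '{key}'")
    return errors, suggestions


errors, suggestions = check_feature({"building": "hospital"}, "area", HOSPITAL_RULE)
print(errors)
print(suggestions)
```

Separating hard errors from suggestions mirrors the must-have versus may-have distinction in the talk.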
So here, you can see they forgot to add a name and might want to add a sport that the field is used for. And we're specifically looking for soccer or football fields. We do have a GitHub repo if you're interested. The great Max Grossman has recently implemented user authentication, which
will allow us to implement the concept of user presets so you can have your favorite presets at the ready and then share and reuse existing presets. As you may know, the Tasking Manager is currently undergoing a redesign. So we will work on integrating MapRules into Tasking Manager 4 when it is ready,
as our initial development efforts began when we were still on Tasking Manager 2. The last thing we wanted to touch on is validation-based Overpass queries. A couple of reasons why this becomes important: as we mentioned, there are a lot of error-detecting QA tools, each with their own databases of issues that may not get updated very frequently because it takes
a decent amount of time to run all those validations on the entire world. Not to mention, issues may have been fixed through various applications and never marked as resolved in one QA tool or another. So if you can query the data with issues yourself, you'll only be looking at outstanding issues. Also, by downloading data in JOSM with a validation-targeted
Overpass query, you'll be able to pull data with issues over a larger area without hitting the data limit you would normally get if downloading all the data in the bounding box. It also enables you to perform bulk operations to fix multiple issues at once, like maybe deleting orphan nodes, or nodes that have no tags on them and aren't on a way.
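Editor's illustration: a validation-targeted Overpass query can be assembled as plain text. The sketch below builds an Overpass QL query for the hospital-without-a-name check used earlier in the talk; it is not one of the speakers' published queries, and the function name `build_unnamed_hospital_query` and the bounding box are made up. The bounding box order is south, west, north, east, and `out meta` is requested because an editor such as JOSM needs version metadata to upload changes.

```python
def build_unnamed_hospital_query(south, west, north, east, timeout=60):
    """Build an Overpass QL query that returns only hospitals lacking
    a name tag inside the given bounding box, plus the nodes of any
    matching ways and relations so the geometry is complete."""
    bbox = f"{south},{west},{north},{east}"
    return "\n".join([
        f"[out:xml][timeout:{timeout}];",
        f'nwr["amenity"="hospital"][!"name"]({bbox});',
        "(._;>;);",      # add member nodes and ways of the matches
        "out meta;",
    ])


query = build_unnamed_hospital_query(44.88, -93.33, 45.05, -93.19)
print(query)
```

The resulting text can be pasted into a tool that accepts Overpass QL; because only offending features are returned, a much larger area fits under the download limit than a full bounding-box download would allow.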
In the interest of sharing these queries and reducing redundancy, we added an OSM wiki page that maps each query to references for the QA tools it is based on. That way, you can easily find more documentation with examples and possibly ways to fix them. Speaking of redundancy, as I was finding a place
to put these queries, I found some more wiki pages that also share quality assurance Overpass queries. So at least you know that we weren't alone in trying to fill this need. But there are lots of resources out there, and we'd love to see what else people have. So now that you can pull a bunch of data with quality
assurance issues directly into JOSM, how do you go about fixing them? With the JOSM validator, you can also indicate ways to automatically fix the issue. In the MapCSS, it finds the deprecated tag amenity=hotel, which should be changed to tourism=hotel. So we can use fixRemove and fixAdd
to automatically update the feature's tags when Fix in the validation window is clicked. Adding more automatic fixes will make it easier for people to get through all of these validation issues more quickly. So why does all of this matter? Like I said at the beginning, we as OSM community members
have a responsibility to new mappers to give them the tools necessary to create good data. By creating custom validations, users or teams or groups can have the directed focus that they need for what they're working on. For example, if you're interested in mapping railways, why not share your validation written in MapCSS
for detecting where there should be a railway=switch where railways intersect? Thematic validations can also assist with directed or organized editing and cleanup afterwards. So if there is a way to standardize and share validation checks, everyone's going to benefit.
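Editor's illustration: the automatic fix described a little earlier, replacing the deprecated amenity=hotel with tourism=hotel, amounts to removing one key and adding one tag. The Python below is only an analogue of that behavior on a plain tag dictionary; the function name `fix_deprecated_hotel` is hypothetical, and the MapCSS in the comment is an approximation of what such a JOSM rule could look like, not the speakers' exact rule.

```python
# Approximate JOSM MapCSS rule with an automatic fix:
#   *[amenity=hotel] {
#     throwWarning: tr("amenity=hotel is deprecated");
#     fixRemove: "amenity";
#     fixAdd: "tourism=hotel";
#   }


def fix_deprecated_hotel(tags):
    """Return (new_tags, changed). Leaves the input dict untouched and
    only rewrites features that carry amenity=hotel."""
    if tags.get("amenity") != "hotel":
        return dict(tags), False
    fixed = {k: v for k, v in tags.items() if k != "amenity"}  # like fixRemove
    fixed["tourism"] = "hotel"                                  # like fixAdd
    return fixed, True


features = [
    {"amenity": "hotel", "name": "Lakeside Inn"},
    {"tourism": "hotel", "name": "Downtown Hotel"},
    {"amenity": "cafe"},
]
# Applying the same fix to every feature is the bulk case.
results = [fix_deprecated_hotel(tags) for tags in features]
print(results)
```

Running the fix over a whole list is the bulk operation; only the first feature is changed, and the others pass through as copies.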
So here's our challenge to you. Please create MapCSS validation rules. Add them to the JOSM validators. Create MapRules. We'd like to host a validation birds of a feather tomorrow. I know the board is filling up pretty quickly, but we'd love to have people there. And we'll be able to talk about MapRules,
or anything else you want to talk about validation-wise. And yeah, just join us tomorrow if you'd like to collaborate further. We're here all weekend. Thanks. We're not sure yet. We've got to put it on the board.
Happy to answer any questions if we have time. Stunned silence.
So mainly it's been a lot of validation issues. The question was, how did we determine priority for what checks we were putting in JOSM? A lot of them are just common issues that we were seeing, starting with highways, which it seems like everybody else is, because they're
more high priority and bigger issues to fix. And then just basing it off of the existing code for Osmose or KeepRight, whatever is there, using those criteria to determine that.
Thanks, everybody.