Tools for linking Wikidata and OpenStreetMap

Formal Metadata

Title
Tools for linking Wikidata and OpenStreetMap
Title of Series
Number of Parts
266
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Editors of OpenStreetMap can use my software to search for a place or region, generating a list of candidate matches from Wikidata, which can then be checked and saved to OpenStreetMap. Linking the two projects isn't without controversy. They use different licenses which raises questions about what information from one project can be copied to the other. This presentation will give details of a new version of the editing tool. The benefits of linking, the process of finding matches, the community response - including the controversy - and how people can get involved will be discussed.
Transcript: English (auto-generated)
Okay, yep, I'm going to talk about some tools I've been building for linking OpenStreetMap and Wikidata. So I'm going to go straight in. This is my tool I've built. It's working,
it's web-based. It's for adding links to OpenStreetMap that go to Wikidata. So you can go to this website and you type in a place name and it'll search. It takes a while to run the process and then it'll find things that it thinks are the same thing on Wikidata and OpenStreetMap. So I'm using as an example Prizren, where we are.
This is what you get. You get a map and then a list of the matches it's found. So in this case, it's found 25 candidate matches, 25 things that it thinks it could link up. And to be able to save these things to OpenStreetMap, we have to log in to OpenStreetMap. So we click
this button. We go through the OpenStreetMap login page and then come back here. Just scroll down so you can see some of the features. We've got the type filter, so you can filter the matches. So I've got things like historic site and river,
the six river matches. I've got the option to change the language that the content is displayed in. I haven't translated the tool into other languages, but I get text in different languages from both OpenStreetMap and Wikidata. So the software has chosen English for Prizren,
and it's got Albanian as the second option. But if I want to, I can change them. I can move Albanian to the top, and then it changes into Albanian. So you can see the type filter is partly in Albanian and partly in English. Wikidata doesn't have the Albanian name
for watercourse, so you get it in English as a fallback. So if I scroll down, you can see some of the candidates it's found. It's found a hostel that it matched between the
systems, and it's found a bus station. If I make that a bit bigger, you can see it better. Then I've got a "show on map" button. So you click "show on map", and it'll zoom in on that match. So the red pin is where the Wikidata coordinates are, and then you get the polygon
from OpenStreetMap. So this is a pretty good match. I think it's the same thing. And then I show you some data that's come from various sources. So this is from Wikidata and Wikipedia. You get the name in English from Wikidata, and I'm showing you the first
paragraph from Wikipedia. There isn't any Wikipedia article in English, so you get the paragraph in Albanian. And I show you the picture that's come from Wikidata. And then underneath that, I've got the matching information from OpenStreetMap. It's found one candidate. It's nearby. I've got this button, "toggle OSM tags", that will show you some
more information from OpenStreetMap. So you can see all of the OpenStreetMap tags, and the things that match are highlighted in green. So the various names on OpenStreetMap match Wikidata, and the tags are matching.
So because I'm confident that this is a good match, I've got a tick box. I can tick it and say, yeah, let's save this. And then I can click the "add Wikidata tags to OpenStreetMap" button, and it'll take me to this screen where it just confirms everything, shows me again
in a slightly different format the edits that I'm going to make. And then I can check those again if I want. And once I'm happy, I can adjust the change comment. It generates the change comment automatically for adding the Wikidata tags. And then I can hit save, and it'll save my edits to OpenStreetMap.
So why do I want to do this? Why do I want to link up these two systems? It makes OpenStreetMap more useful if it's got the links into Wikidata. We can get the labels in more languages, because people have been translating and annotating Wikidata with
more labels. We get the images from Wikimedia Commons, and we can get a lot of identifiers from Wikidata. Wikidata is a good catalogue of identifiers. It's almost like the Rosetta Stone of linking together different systems. So this is why it's useful to link up between
Wikidata and OpenStreetMap. So I'll give you some statistics about this tool. Or rather, first, there are almost 3 million objects in OpenStreetMap with Wikidata tags. And you can see I've got a graph there. It's been increasing over time. And people are using my
tool. These are some updated stats. They updated them yesterday. So I've got almost 450 users and 25,000 changesets. And 26% of the Wikidata tags that are in OpenStreetMap were added using this tool. So I'll talk a bit about how I do the matching.
So back to my example of the bus station. In Wikidata, it's tagged as a bus station. So this is the Wikidata page that shows you this bus station. And you can see here, one of the statements is instance of bus station. So if I click on bus station,
then I end up on the bus station page. This is all about what a bus station is. And I can scroll down here. There is a property on the Wikidata bus station page for "OpenStreetMap tag or key". So it says tag amenity=bus_station.
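A minimal sketch of the lookup just described, assuming the "OpenStreetMap tag or key" value on Wikidata is a string of the form `Tag:amenity=bus_station` or `Key:historic`. The function names are hypothetical and are not taken from the tool's code.

```python
def parse_osm_tag_or_key(value):
    """Split an 'OpenStreetMap tag or key' value into (key, value).

    'Tag:amenity=bus_station' -> ('amenity', 'bus_station')
    'Key:historic'            -> ('historic', None), meaning any value matches
    """
    kind, _, rest = value.partition(":")
    if kind == "Tag":
        key, _, tag_value = rest.partition("=")
        return key, tag_value
    if kind == "Key":
        return rest, None
    raise ValueError(f"unrecognised value: {value!r}")


def osm_tags_match(osm_tags, wikidata_value):
    """Check whether an OSM object's tags satisfy the Wikidata tag or key."""
    key, expected = parse_osm_tag_or_key(wikidata_value)
    if key not in osm_tags:
        return False
    return expected is None or osm_tags[key] == expected
```

With this, an OpenStreetMap object tagged `amenity=bus_station` would count as the same entity type as a Wikidata item whose class carries `Tag:amenity=bus_station`.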
So if I go back to my tool, you can see that's what I'm matching on. So I can be confident that these two things represent the same thing. So when it comes to matching, I'm matching based on, it has to be the same entity type
and the same coordinates or close, the two things in the two systems have to be close together. And then I need either a name or street address or identifier match. So I do lots of normalization on the name. I lowercase the name, I remove stop words and compare them.
So I try and get more matches that way. So another thing that I do is I match based on identifier. So things like railway stations will have a code. And this is a historic building in Prizren that has an identifier that appears both in Wikidata and in OpenStreetMap.
So I can be confident that it's the same thing. And here's an example of the airport in Pristina, which has got a standard identifier that's in both systems, and I can match on that. And there's a whole bunch of different identifiers that I match on,
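The name and proximity checks could be sketched roughly as follows. This is an illustration under assumptions, not the tool's code: the stop-word list, the 500 m threshold and all function names are hypothetical, and the entity-type check is left out.

```python
import math
import re

# Hypothetical stop-word list; the talk does not say which words are removed.
STOP_WORDS = {"the", "of", "and"}


def normalize_name(name):
    """Lowercase the name, strip punctuation and drop stop words."""
    words = re.findall(r"\w+", name.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_candidate(wd, osm, max_distance_m=500):
    """wd and osm are plain dicts with 'name', 'lat' and 'lon' keys."""
    close = distance_m(wd["lat"], wd["lon"], osm["lat"], osm["lon"]) <= max_distance_m
    return close and normalize_name(wd["name"]) == normalize_name(osm["name"])
```

Comparing normalized names rather than raw strings is what lets "The Museum of Gothenburg" and "Museum Of Gothenburg" count as the same name.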
that I can be confident that objects are the same thing in the two systems. So there's existing Wikipedia tags already in OpenStreetMap. And I'm not using the Wikipedia tags. They're a bit all over the place. They're a bit inconsistent.
And when I started building this tool, I thought somebody else might use the existing Wikipedia tags. There hasn't been much of that yet. So I might go back and see if I can use the Wikipedia tags to get better matches. So I'm just going to talk about the licensing for
the two projects. So Wikidata is licensed CC0 or public domain. So you can do anything you want with it. And OpenStreetMap has got its own license, which is the open database license, which is more restrictive. So you can't copy any data from OpenStreetMap to Wikidata because
that would violate the open database license. But it turns out it's more complicated than that because OpenStreetMap and Wikidata use different intellectual property jurisdictions. So OpenStreetMap asserts database rights, which is a kind of European concept. Whereas Wikimedia is a US organization. And so they tend to use US intellectual property rules,
which don't have database rights. So in US law, facts are not copyrightable. But they've got database rights in Europe. So the two systems don't mesh together very well. There's also a feeling within the OpenStreetMap community that Wikidata is a derived work of
Google Maps: the thinking is that people are looking up the locations of things on Google Maps, finding the coordinates, and putting them in Wikidata. And so even though the license suggests it should be possible to copy data from Wikidata into OpenStreetMap, the OpenStreetMap
people are not keen on that because they think it's a derived work. But luckily, my tool is not copying information from either system. I just add links between the systems. So I think I'm fine with this. So one aspect of being able to link things together is you need
stable identifiers. And OpenStreetMap doesn't have stable identifiers. There's a feeling that the IDs for things can change. But in reality, they don't tend to change. Like the only example would be if someone maps like a railway station as a single point,
as a node, and then later on, they change it to be the outline of the building, then they'll change it to a way or a relation, and then the identifier will change. But that happens infrequently. So it's almost as if we can treat OpenStreetMap identifiers as stable.
And then the reverse is the case with Wikidata. So Wikidata tries very hard to have stable identifiers. That was part of the design: Wikipedia came first, and Wikipedia articles get renamed, so Wikipedia articles don't have stable identifiers. With Wikidata they were keen to fix that, so everything has a Q number, which is supposed to stay the same all the time.
But it turns out that there are duplicates in Wikidata. I've been using my tool, and I find lots of duplicates where people have imported data into Wikidata from two different sources. And let's say the same church or hotel ends up having two items on Wikidata.
And so then I merge those, and then it's as if the identifier changed for one of those items and you end up with a redirect. So at the moment, of the Wikidata tags in OpenStreetMap, there's 10,000 of them that point to redirects on Wikidata. So somebody needs to go in and update those tags. I'll probably do that at some point.
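The clean-up of tags that point at redirects might look something like this. It is only a sketch: it assumes a redirect table has already been fetched from Wikidata as a plain dict, the function names are hypothetical, and the Q numbers in the usage note are made up.

```python
def resolve_redirect(qid, redirects):
    """Follow redirects until reaching an item that is not itself redirected."""
    seen = set()
    while qid in redirects and qid not in seen:
        seen.add(qid)  # guard against a redirect loop in the input table
        qid = redirects[qid]
    return qid


def update_wikidata_tag(osm_tags, redirects):
    """Return a copy of the tags with a redirected wikidata value replaced."""
    updated = dict(osm_tags)
    current = updated.get("wikidata")
    if current is not None:
        updated["wikidata"] = resolve_redirect(current, redirects)
    return updated
```

For example, with `redirects = {"Q100": "Q200"}`, an object tagged `wikidata=Q100` would come back tagged `wikidata=Q200`, and objects whose item was never merged are returned unchanged.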
So how about adding links in the other direction? Like for a long time, Wikidata was not keen on adding links into OpenStreetMap because of the problem of stable identifiers. But now the community has come to the consensus that the identifiers are stable enough,
and so, and this is relatively new, there are now properties in Wikidata for linking to OpenStreetMap. There are three properties, one each for node, way, and relation. So I need to change my tool to save to both systems. Like when you say, yes, these are a match,
it should edit Wikidata and add links to OpenStreetMap so that you can go each way. So the automatic matching doesn't always work. I've got some problems. This is an example not from Prizren; this is from Gothenburg in Sweden. The tunnels are represented as single
items in Wikidata. But in OpenStreetMap, when there's two bores in the tunnel or two traffic directions, it will be represented as two different ways. So there isn't a one-to-one mapping. Like the way I designed my software, I kind of assumed there was a one-to-one mapping,
but I need to change it so that you can add the Wikidata tag to both ways of a tunnel. And you get this problem with other entity types as well. I've got this difficulty with distinguishing between a building and an institution. So this, again, is another
example from Gothenburg. This is a museum in Gothenburg. And you can see here it's got a Wikidata tag. We can click on that Wikidata tag and we can see the museum on Wikidata. So this actually is the item page that represents the institution. And if we scroll down, there's
a location property on here that links to the building. And then this is the item that represents the building. So we've got two different items that kind of represent the same thing, institution and building. Whereas on OpenStreetMap, there's only one thing. So it's like which of those two do we link to? So at the moment, the Museum of Gothenburg
links to the institution. But then right next door to that is the German church, which also has items for both the building and the congregation. And in that case, the German church on OpenStreetMap links to the building. So it's kind of ambiguous. And even as a human, I can't figure
this out. So how I'm going to have the machine make the right choices is complicated. And on OpenStreetMap, there's a pile of secondary Wikidata tags for representing
things like brand and network. And so maybe the answer is the operator:wikidata tag. Maybe I should be storing the institution in that tag instead of just putting it in the standard Wikidata tag. So the code for this is on GitHub. It's GPL. So anyone can send me
contributions. And then I'm working on a new version. So the existing version is kind of slow. It's a few years old. I mean, it works. People are using it, but I want to make something better. So this is my new version, more about the map.
This should open instantly. You don't have to type in a place name. It'll try and figure out where you are and show you a map. And this is showing existing items that are already linked and things that aren't linked. And there's a filter that you can filter on different
types of things. And this is an example in Prizren, the stone bridge. So it's already linked. You can see the green pin is where the Wikidata coordinates are. And the yellow pin is the OpenStreetMap location. And then I have a list of what it could link to. And it's highlighting in green
the existing link. So you can use this tool to go through clicking on the red pins and find OpenStreetMap things that should be linked. And this will upload the tags to OpenStreetMap.
Here's another example, similar kind of thing. I'm showing the polygons of the buildings on OpenStreetMap.