"Sliding" datasets together for more automated map tracing

Video in TIB AV-Portal: "Sliding" datasets together for more automated map tracing

Formal Metadata

"Sliding" datasets together for more automated map tracing
Title of Series
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Open Source Geospatial Foundation (OSGeo)
Production Year
Production Place
Portland, Oregon, United States of America

Content Metadata

Subject Area
Importing new or updated geometry into a large dataset like OpenStreetMap is tricky business. Features represented in both need to be detected and merged. Oftentimes editors are asked to completely "retrace" over updated maps because automated methods are unreliable.

While a 100% accurate merge is impossible, it is possible to automatically create a best guess and let the user refine from there, eliminating as many manual, tedious steps as possible.

Slide is a tool designed to solve this problem. It works by iteratively refining roads, trails and other complex geometries to match another dataset in which the features are correctly mapped. In a single click one geometry is "slided" to the other, eliminating hundreds of tedious clicks.

The form of the new dataset is flexible. It could be an updated representation of roads such as the new TIGER database, a scanned historical paper map, or a large collection of GPS data points like the 250+ billion made available by Strava, a fitness tracking website.

Overall, Slide is designed to leverage what we already know, collected in various datasets, to speed up map tracing. Map editors should be focusing on higher-level challenges and not just retracing over another dataset.
Keywords: Open Street Map, geometry, mathematical optimization
Welcome to my talk. My name's Paul Mach, and I'm going to talk about merging datasets: taking large sensor datasets and merging them with vector datasets. It's kind of a dry topic, you know, it's not vector tiles or anything like that, but I have some demos and some pictures towards the end to hopefully make it interesting.

I'll start off with the big picture of the problem I'm trying to solve with the tools I'm building. The goal is basically to merge data into OpenStreetMap and make it easier. That's kind of a vague goal, so let me define some terms to really specify what I'm talking about.

When I talk about data, I'm talking about sensor data. So not "I have this shapefile, this geometry, import it into OpenStreetMap". It's more like: I have these billions of data points, there's information in there, and I want to transfer that into another dataset, specifically OpenStreetMap in this case. That's the data coming in. The data I want to fix is the geometry. This isn't about importing addresses or buildings or land use or something like that; it's about fixing the roads and trails, primarily. So that's the kind of data I'm talking about merging.

And what I mean by easier: it's not a fully automated command-line tool, that's not what I'm trying to build, and I don't want it to be tons of pointing and clicking, manually tracing on top of a map. So something semi-automated that helps you go from the information in one dataset to merging it in with the OpenStreetMap data.

And why OpenStreetMap? Well, why not. We could talk about the philosophical side of it, but it's also used at my work, Strava, for routing. So there's a cross benefit: improving the dataset helps everybody.

So what kind of dataset am I talking about? There are many different examples, but specifically, since I work at Strava, we have this large global GPS dataset, with hundreds of billions of GPS points from millions of rides and runs. For those of you not familiar with Strava, it's a fitness tracking website, an online network for athletes. Basically the flow is: you turn on the app, put it in your pocket, go for a bike ride, and when you're done you upload it and we shower you with beautiful experiences. That's it in a nutshell. But what's relevant here is that we end up with billions and billions of GPS data points, and you start to wonder: what can we do with this? What kind of information is hidden in these numbers, in what are basically lat/long points?
The first thing I did, about six months ago, was just take all these billions of data points and put them on a map. This is the basic heat map. Here's one example, and here's another, of Europe. It's not just a population density map; it does go down to zoom level 15, and you do see that there is some information here about where people go and where they don't go.

Just some quick facts about the heat map. It has 22 billion points as of March, so I have to update it in the next few months with more data. There's a run version and a ride version. These are just screenshots here, but there's a slippy map version online where you can zoom, pan, all that good stuff. It's technically not an open dataset, but it is available for browsing and for tracing. I've tried to balance the needs of what the business people at Strava want with what we can do to open up this data for mutual benefit on both sides.
So how can we use this data to improve the map? As you can see here, there is definitely information: there are trails, there are roads, some are more popular, some are less popular, and there are parts of the map where people just don't go, or where cyclists aren't going.

One thing we're doing, as kind of an aside, is mapping all this data to road networks for cities. Cities have come to us and said: look, you have a lot of cycling data, we want to improve our infrastructure based on data about commuters, help us out. So there are two guys at work who are working with cities. Every city has different requirements, but they provide their own road network from their GIS department, we map all the cycling data to that and tell them time of day, number of users, number of rides, all that great stuff, and it comes out as a GIS product. It's called Strava Metro. That's an aside, a paid advertisement that may be relevant to some of this audience, but it's not what I'm here to talk about.
What I'm here to talk about is this tool I built, called Slide; you'll understand in a few moments why it's called Slide. The idea is to take the information that's in the heat map and bring it into OpenStreetMap, and to have a meaningful, fast way of doing more automated map tracing, the kind of tracing that's going on all over OpenStreetMap.

I'll show an example here. This is a page I built to show off Slide. What we're looking at is the standard OpenStreetMap base layer in all its glory, covered up with the blue, purple and red heat map data for this one area. If you look closely, you'll see there's no trail that corresponds to the heat map. The way Slide works is you draw a coarse outline of the trail and then click the Slide button, and it'll match the heat map data. So the idea is five clicks versus a hundred clicks to get a line that matches up with the trail. If you've ever been mountain biking or running, trails are nice and windy, and that takes a long time to sample properly.

So as the input you give it this coarse black line, and then it iteratively improves that black line, sliding it into place along the heat map data; that's where the name comes from. Just off the bat: it's a server-side tool, it's not running in JavaScript, so it does do a round trip, but it is pretty fast. It takes about a quarter of a second to run this; the animation takes a little bit longer. So it's quasi real-time. It does sometimes depend on the input line being close. Watch how it does this kind of thing.

Here's another animation, just playing it back. So how does it work?
At a high level, you can think of all the GPS data as a density distribution of where people are. There are places people go all the time, like on the trail, and there are places people never go, which is everywhere else. With that data you can build a density distribution surface like this one, where the high-density corridors are lower and the other places are higher. Then you can take your input polyline, the black line from the previous example, consider it like a string of beads, lay it on the surface, and just let gravity do its thing and slide it down into the valleys. That's the model I had in my head when I developed the tool. It's not so much the physics as the concept that I wanted to model there.
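As a rough illustration of the surface idea described above, here is a minimal Python sketch. It is not Slide's actual code (which, per the talk, is written in Go). The function name `cost_surface`, the grid parameters and the normalisation to the range 0.0 to 1.0 are all assumptions made for this example: points are binned into cells, and busy cells become valleys.

```python
from collections import Counter


def cost_surface(points, cell_size, cols, rows):
    """Bin (x, y) points into a grid and invert the counts into a surface.

    Cells with many points get values near 0.0 (valleys); empty cells get 1.0.
    """
    counts = Counter()
    for x, y in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < cols and 0 <= row < rows:
            counts[(col, row)] += 1

    peak = max(counts.values(), default=0)
    surface = [[1.0] * cols for _ in range(rows)]
    for (col, row), n in counts.items():
        surface[row][col] = 1.0 - n / peak
    return surface


# Three points fall in cell (col 1, row 0), one in cell (col 0, row 1).
points = [(1.2, 0.1), (1.5, 0.4), (1.9, 0.9), (0.3, 1.7)]
surface = cost_surface(points, cell_size=1.0, cols=2, rows=2)
```

Here the busiest cell ends up at height 0.0 and untouched cells stay at 1.0, so a line "rolling downhill" would be drawn towards the busiest cell.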
There's a lot of overlap between all sorts of things in science, but this is based on mathematical optimization, where you have a cost function and you iterate to improve the cost, which in most cases means lowering it. In Slide there are three cost functions right now, or rather three components to the cost function. One is obviously the depth on the surface: you want to go lower. Then I make sure that the points are equidistant, and that the angle doesn't get super sharp anywhere in the line. Those two are there to maintain the rigidity of the line and to keep it from collapsing on itself. Those three costs are computed every time.
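The three components can be sketched as follows. This is only an illustration under stated assumptions: the talk does not give the actual formulas or weights, so the variance-of-lengths spacing term, the `1 - cos(turn)` angle term, the weights `w_depth`, `w_spacing`, `w_angle` and the toy `valley` surface are all hypothetical choices.

```python
import math


def spacing_cost(line):
    """Variance of segment lengths: zero when vertices are equidistant."""
    lengths = [math.dist(a, b) for a, b in zip(line, line[1:])]
    mean = sum(lengths) / len(lengths)
    return sum((d - mean) ** 2 for d in lengths) / len(lengths)


def angle_cost(line):
    """Sum of (1 - cos(turn)) at interior vertices: zero for a straight line."""
    total = 0.0
    for a, b, c in zip(line, line[1:], line[2:]):
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - b[0], c[1] - b[1]
        norm = math.hypot(ux, uy) * math.hypot(vx, vy)
        if norm > 0:
            total += 1.0 - (ux * vx + uy * vy) / norm
    return total


def depth_cost(line, surface_at):
    """Sum of surface heights under the vertices (lower is better)."""
    return sum(surface_at(x, y) for x, y in line)


def total_cost(line, surface_at, w_depth=1.0, w_spacing=1.0, w_angle=1.0):
    """Weighted sum of the three components (weights are assumptions)."""
    return (w_depth * depth_cost(line, surface_at)
            + w_spacing * spacing_cost(line)
            + w_angle * angle_cost(line))


def valley(x, y):
    """Toy surface: a valley running along y = 0."""
    return y * y


straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
```

With these definitions, `total_cost(straight, valley)` is zero, while the zigzag is penalised both for leaving the valley and for its sharp turns.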
This is probably too complicated, too detailed a slide, but I put it in there so it's in the online version. The basic concept is: you input the line, you input the heat map data, and then you go into this loop where you iteratively try to improve the cost. Once you've improved it to where you can't anymore, you simplify it down and output the result. So it's this iterative refinement process of taking what you put in, this coarse sample, and making it better, or in some sense transferring the information that's in the Strava data into your line.
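The loop described above can be sketched as a simple greedy refinement. This is an assumption-laden illustration, not the optimizer Slide uses: the talk does not say how each improvement step is chosen, so the axis-aligned nudging, the step halving, the function names `slide` and `toy_cost`, and the toy valley at y = 2 are all made up for the example.

```python
def slide(line, cost, step=1.0, min_step=0.01, max_rounds=1000):
    """Greedy refinement: nudge interior vertices while the cost keeps dropping.

    Endpoints stay fixed. When no nudge of size `step` helps, the step is
    halved; the loop stops once the step falls below `min_step`.
    """
    line = [tuple(p) for p in line]
    best = cost(line)
    for _ in range(max_rounds):
        improved = False
        for i in range(1, len(line) - 1):
            for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
                x, y = line[i]
                trial = line[:i] + [(x + dx, y + dy)] + line[i + 1:]
                trial_cost = cost(trial)
                if trial_cost < best:
                    line, best, improved = trial, trial_cost, True
        if not improved:
            step /= 2
            if step < min_step:
                break
    return line


def toy_cost(line):
    """Stand-in cost: squared distance from the valley y = 2, plus a
    spring term that pulls neighbouring vertices towards even spacing."""
    depth = sum((y - 2.0) ** 2 for _, y in line)
    spring = sum((bx - ax) ** 2 + (by - ay) ** 2
                 for (ax, ay), (bx, by) in zip(line, line[1:]))
    return depth + 0.1 * spring


coarse = [(0.0, 2.0), (3.0, 5.0), (6.0, 0.0), (10.0, 2.0)]
refined = slide(coarse, toy_cost)
```

The interior vertices of `refined` end up close to the valley line while the endpoints are untouched, which mirrors the "improve until you can't anymore" loop; the simplification step at the end is left out here.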
Here's a quick summary. It's server-side, written in Go. It can leverage any dataset; currently the one I'm using is the Strava one, which I think is the most interesting, but I have some other examples. It's an iterative refinement process, which I think is kind of cool. And it's reasonably fast, so it can work as a web-type service.

I first presented this at a conference a few months ago, and since then I've incorporated this code into the iD editor, which is the default OSM editor. So instead of just that demo I showed you, you can actually add data to OpenStreetMap using it. I haven't done the best marketing on it, but a few people have used it: there have been 6,000 changesets using this editor, which I think is pretty significant.

So if you're on OpenStreetMap, in iD, the flow is the same. For those of you who have used it before, you draw your coarse overview line, tag it as a general path or whatever you like, and then you click the little Slide button and it'll do the same thing.
There are two ways to interact with it in the iD editor. One is to select a subset of points. Here I have three nodes selected on that way, and it'll slide just the portion in between those nodes. In practice you can have a really, really long bike path, a super long way, and it's best to just slide portions of it and work your way along. So that's one way to do it. Or, the way I showed you, you slide the whole thing, which works in this case because it's relatively short.

So that's sliding to Strava data, which I've found very useful. From a company standpoint, we're trying to do routing based off of OpenStreetMap. Every mountain biker wants to route on the trails, but if those trails aren't in OpenStreetMap, we can't really provide a solution for them. So we want to improve OpenStreetMap to improve our routing and have a win-win, and as a bonus we want to leverage our data to make that easier. That's kind of the birth of what Slide is.
But I try to think of this concept at a higher level than just how I get Strava data into OpenStreetMap: how do we get other data in? So here are some of the other datasets I've been playing with, trying to merge this data in a semi-automated way. Part of what I want to show is that it's not just the algorithm, it's the incorporation with the editor that makes it so you're still doing the same thing, just faster. There's still a person looking at it and verifying that something dumb didn't happen, but it's way faster than before. What I don't want Slide to be is a command-line prompt where you press go, it commits a thousand things, and you don't know whether it was right or wrong. So this approach tries to take that middle ground of semi-automated.

The other place I've incorporated this algorithm is TIGER. TIGER is the US Census data that they put out. What was merged into OpenStreetMap a few years ago, in 2007 or 2008 or whatever, is the old data. Since then counties have improved the TIGER data, but it's unclear how to merge that in with what's already in OpenStreetMap.

Here's one example. This is a screenshot from the iD editor, where the white and the green are the OSM ways and the yellow underneath is the new TIGER data. You can zoom in and look at it; nothing's perfect, but it's a hell of a lot better than what's there. Can we just have OpenStreetMap snap to the TIGER data? That's basically the concept. The ultimate thing would be to just say "yes, this looks right, do your thing and fix it". Right now this is kind of the first version of that.

Here are some other examples where the new TIGER data is totally great but what's in OpenStreetMap isn't. Can we merge that in, in a semi-automated way? Doing it as a countrywide automated thing is a bad idea, but doing it semi-automated, where at least someone is looking at every change, but quickly, is the approach I'm going for.

This is kind of my favorite: the topology is there, it's all there, but it's totally off, and the TIGER data is just perfect. So why can't we just have that match up?
In other places you have smaller stuff, small changes like this. You can basically apply the same Slide algorithm, but instead of sliding to Strava density you just slide to the yellow line. Let me just do a quick demo.
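One way to picture "sliding to the yellow line" is to use the distance to the reference polyline as the surface, so that the valley runs exactly along the reference road. The sketch below assumes this formulation; the talk does not say how the TIGER surface is actually built, and the names `distance_to_segment`, `line_surface` and the `reference` coordinates are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.dist(p, a)
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def line_surface(target):
    """Return a surface function whose valley follows the target polyline."""
    def surface_at(x, y):
        return min(distance_to_segment((x, y), a, b)
                   for a, b in zip(target, target[1:]))
    return surface_at


# Hypothetical reference road (the "yellow line") in planar coordinates.
reference = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
surface_at = line_surface(reference)
```

A point on the reference road has height 0.0, and the height grows with distance from it, so the same downhill refinement can be reused with a different surface.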
I'll demo fixing this area. As I was playing with this I thought, well, as a first step, can't I just snap all the nodes to the nearest node? In this case that should also help automate it. But the idea is the same: you have the new data in the background and the coarse lines that match it, but not completely, and then you can select these ways and slide them.

Select this one and slide it over. You see, the idea is that the TIGER data is smooth and properly sampled. It looks pretty great, and I want that in OSM, but I don't want to point and click a thousand times to get it in there. It's validated too, here, in this subset. (I'm testing the Internet here, panning around.)

You can see at that intersection up there, it's not totally perfect, and that's part of the process: it doesn't change the end points, so if the end point isn't right, that's part of the semi-automated approach. I go in and make that small edit, and this area is fixed. Where it works well is for windy roads, those rural roads, neighborhoods where there are a lot of not-often-used roads that are windy, that take forever to sample, and no one has taken the time to correct them in OpenStreetMap yet.

Here's another example, just merging it like that. Again, it's semi-automatic. The idea is that someone who is an expert at editing OpenStreetMap can clean up this area super fast. That's how it works at a low level, but the higher-level concept is transferring the data from TIGER, the information there, and bringing it into OpenStreetMap in an easy way.
That's what I have working so far, and I have links on the next slide. Here are some thoughts on ways of improving this.

One is how to get more input from the dataset. The astute conference listener would realize that the Strava data is polylines, 1-D polylines, and the heat map data is just a density function, so you're losing the direction and the order of your data; you're losing that information. So can we pull more in? What if we knew the direction at every point on the map, had a direction distribution or something, and incorporated direction into it? Right now it's just density. The motivation is that sharp turns and switchbacks don't do so well with the Slide algorithm, because it tries to minimize sharp turns. So what I want to work on next is incorporating the direction information to help with that.

Next is better smoothing: the way the optimization happens, you sometimes get a little jaggedy minimum, so smoothing would help with that. And then more complex geometries.

This is the kind of case where, if you've edited OpenStreetMap, you see something like this: the TIGER data is perfect but OpenStreetMap isn't, and to you it's completely obvious what should happen to OpenStreetMap. It should just shift a little bit and clean itself up. So that's the goal, the vision of this thing: to semi-automate that. The info's there. Someone spent a lot of time cleaning up the TIGER data, it's there for us; can we just merge it right in there? That's my presentation.
On this slide, here are all the links, if you want to copy them down. I'll also post the presentation on Twitter. So that's Slide and how it applies to Strava data and the TIGER data. At a high level, I wanted to explore techniques for merging these non-vector datasets into your vector dataset. As big data gets bigger and there's access to it in some open ways, how can we merge that into something like OpenStreetMap, or even a city network? How can you contract with a cell phone provider that gives you zillions of GPS points and make that useful? Yes, you can trace over it as an underlay, but that gets really boring fast. So we should study tools to make that information transfer, that merging of data, automated. Thank you. Any questions?

Question from the audience: In the optimization, how are you deciding how many points to put in the polyline?

I resample at every 5 meters, so it's a fixed interval. So the question is how many points to add to my polyline in the optimization. The first step is I resample at 5 meter intervals, so I can mimic the flexibility of a string. I take those and minimize, and then I simplify again after the fact. We can talk simplification algorithms; I have this hacky way of doing it which works pretty well, but the goal is more points on the curvy parts and fewer on the flat parts. So that's the final pass.

Question from the audience: Have you thought about using something like this with small user groups who may not use Strava for their activities, to map out the areas that they go to? I'm thinking of people like rock climbers or hunters, who wouldn't be tracking their activity while they're doing it, but could benefit from something like this to actually map the trails or areas they go to.

Yeah, I think the way to do that would be to use the OSM traces. They have their layer underneath. Being able to slide to that is a little bit questionable, though: if you have multiple ones right next to each other, which one do you slide to? But yeah, the more we can make of this thing the better.

Follow-up from the audience: What about actually having people go walk around with their devices, using Strava just walking around in the woods, and using the input from 5 or 10 people to slide the trail? Have you had something like a Strava mapping party?

No, not really. Honestly, I presented this a few months ago and this is the next version of that. I've just been working on my own, and part of why I present is to get out there and get people's ideas and feedback on what could happen. Part of the next step is to market it in some sense, to test it with the OSM community and find people who would be willing to use it and who know how to use it.

Question from the audience: Instead of using the heat map, what about using an aerial photo? It seems awfully tempting to try to snap to high-contrast edges in a photo. Have you tried that?

I have not tried that, but conceptually, anything where you can build this surface concept, of things that have a higher value and a lower value, you can apply it to. I did try something along those lines, and the data was a little noisy and it didn't work so well. I haven't quite given up on it, but it was a little bit harder to do.

Question from the audience: Is this going to be in iD, or is there a way it can be incorporated back into the OSM iD?

The beauty of iD is that you can fork it and do whatever you want, so I forked it twice: one is the version where you slide to Strava data, and one is for the TIGER data. Maybe it's best to combine those two. It's a fork of iD, so I try to keep it up to date with the development that's going on there, but no, it's not officially on the website. Thank you.
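The resample-then-simplify steps mentioned in the first answer above can be sketched like this. Two caveats: the coordinates are assumed to be planar and in meters, and because the speaker only describes his simplification as a "hacky way of doing it", the sketch swaps in the standard Douglas-Peucker algorithm, which also keeps more points on curvy parts than on flat parts. The function names and the tolerance value are assumptions.

```python
import math


def resample(line, spacing):
    """Return points every `spacing` units along the line, plus the last vertex."""
    out = [line[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for a, b in zip(line, line[1:]):
        seg = math.dist(a, b)
        if seg == 0:
            continue
        d = spacing - carried
        while d <= seg:
            t = d / seg
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
            d += spacing
        carried = seg - (d - spacing)
    if out[-1] != line[-1]:
        out.append(line[-1])
    return out


def perpendicular_distance(p, a, b):
    """Distance from p to the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return math.dist(p, a)
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / length


def simplify(line, tolerance):
    """Douglas-Peucker: keep a vertex only if it deviates more than `tolerance`."""
    if len(line) < 3:
        return list(line)
    distances = [perpendicular_distance(p, line[0], line[-1]) for p in line[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [line[0], line[-1]]
    split = worst + 1
    left = simplify(line[:split + 1], tolerance)
    right = simplify(line[split:], tolerance)
    return left[:-1] + right


# An L-shaped path, 10 m along x and then 10 m along y.
track = resample([(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)], spacing=5.0)
simplified = simplify(track, tolerance=0.5)
```

Resampling gives the optimizer evenly spaced "beads" to move, and the simplification afterwards drops the ones that ended up on straight stretches, leaving only the corner in this example.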

