
Use of FOSS4G at Gojek to automate map error detection at scale


Formal Metadata

Title
Use of FOSS4G at Gojek to automate map error detection at scale
Title of Series
Number of Parts
351
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
Our digital maps are not always up to date with the real world. New road construction and road blockages can reduce the accuracy of the map data. In a logistics company like Gojek, which serves millions of users per day in Southeast Asia, the core undertaking revolves around routing and ETAs. Any inaccurate local map data can have a direct negative impact on business metrics. So how do we ensure that map inconsistencies are detected and fixed promptly to minimise disruption to our services? When manual detection is labor intensive and does not scale to millions of roads across vast regions, how can we effectively automate this at scale? This talk is the story of how we, at Gojek, built a pipeline that uses bad customer experience as the trigger to identify potentially faulty data in OpenStreetMap. Our solution makes use of noisy GPS traces and Overpass, an open source tool, to automate this detection. It has enabled us to identify hundreds of potential issues per day, categorise them, associate a business impact with each map issue, and allow our map analysts to fix them seamlessly.
Keywords
Scale (map), Scaling (geometry), Error detection, Texture mapping, Computer animation
Square number, Discounts and allowances, Computer configuration, Drop (liquid), Texture mapping, Office suite, Computing platform, Logistic distribution, Computer animation
Computing platform, Computer network, Software, Texture mapping, Device driver, Computer animation
Digital filter, Heuristic, Routing, Order (biology), Connected space, Data structure, Computer network, Error message, Physical system, Feedback, Rule of inference, Directed graph, Discrepancy theory, Cone penetration test, Interior (topology), Device driver, Radius, Flag, Vapor barrier, Building, Area, Virtual machine, Connectivity (graph theory), Type theory, Information, Graph (mathematics), Wiki, Texture mapping, Line (geometry), Object-oriented programming, Position operator, Planning, Addition, Uniform resource locator, Inference, Direction (geometry), Email, Point (geometry), Projective plane, Right angle, Local ring, Set (mathematics), 1 (number), Thresholding (image processing), Software, Function (mathematics), Presentation of a group, Mapping, Shared memory, CASE (computer science), Computer animation
Transcript: English (auto-generated)
So, yeah, hi everyone, I'm Li Chen, I'm from Gojek. Today I'll be talking about how we automate map error detection at scale at my company. Gojek is a ride-hailing company based in Indonesia. We have offices in Singapore, where I'm from, and in Vietnam as well. We do ride-hailing, food delivery, and also logistics, which is parcel delivery.
I'm on the cartography team, and what we do is provide a map platform that's tailored for the Gojek ecosystem. We work on OSM road networks, and we look at traffic problems, routing problems, POI accuracy, and the ETAs of our drivers.
So, to the meat of the presentation: we have this project that we call Maps Error Inference. What we're trying to do is detect the types of map errors you see here on the left: non-routable roads, missing turn restrictions, and wrong two-ways.
To see what this means, look at the animation. The blue dotted lines are the routes that were actually taken by our drivers. We get them by map matching the driver pings, that is, snapping them to the road.
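As a rough illustration of snapping pings to the road, here is a minimal sketch that projects each ping onto the nearest road segment. It assumes coordinates have already been converted to a local planar frame in metres, and the names `snap_to_segment` and `snap_ping` are hypothetical. The talk does not say which map-matching method the pipeline uses; production map matchers typically also use the sequence of pings, so treat this as a simplified stand-in.

```python
import math


def snap_to_segment(p, a, b):
    """Project point p onto segment a-b.

    All points are (x, y) tuples in a local planar frame (metres).
    Returns the snapped point and its distance from p.
    """
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return a, math.hypot(px - ax, py - ay)
    # Parameter t of the projection, clamped so we stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    sx, sy = ax + t * dx, ay + t * dy
    return (sx, sy), math.hypot(px - sx, py - sy)


def snap_ping(ping, segments):
    """Return (way_id, snapped_point, distance) for the closest segment.

    segments is an iterable of (way_id, start_point, end_point).
    """
    best = None
    for way_id, a, b in segments:
        point, dist = snap_to_segment(ping, a, b)
        if best is None or dist < best[2]:
            best = (way_id, point, dist)
    return best


# Two parallel roads five metres apart; the ping sits one metre off the first.
roads = [("w1", (0.0, 0.0), (10.0, 0.0)), ("w2", (0.0, 5.0), (10.0, 5.0))]
nearest = snap_ping((5.0, 1.0), roads)
```

Snapping each ping independently is the simplest possible choice; with noisy GPS near parallel roads it can jump between ways, which is one reason real matchers consider neighbouring pings as well.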
Then we also have the green line, which is the suggested route, the one suggested by our routing engine. One of the heuristics we use is the ratio of orders that actually traverse the suggested route. We also try to incorporate road network structure, such as whether two OSM way IDs are connected and, if they are, what the chance or ratio is of orders actually traversing through them. And we also look at the GPS ping direction that we can extract from the driver pings.
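The heuristics described here can be sketched roughly as follows. This is only an illustration under stated assumptions: each order is represented as a pair of way-ID lists (suggested route, map-matched driven route), and the function names and the thresholds `min_orders` and `max_ratio` are hypothetical, not values from the talk.

```python
import math
from collections import Counter


def consecutive_pairs(way_ids):
    """Set of (from_way, to_way) transitions along a route."""
    return set(zip(way_ids, way_ids[1:]))


def traversal_stats(orders):
    """For each transition on a suggested route, return (ratio, n_suggested).

    ratio is the fraction of orders whose driver actually took the transition.
    orders is a list of (suggested_way_ids, driven_way_ids).
    """
    suggested = Counter()
    traversed = Counter()
    for suggested_ways, driven_ways in orders:
        driven_pairs = consecutive_pairs(driven_ways)
        for pair in consecutive_pairs(suggested_ways):
            suggested[pair] += 1
            if pair in driven_pairs:
                traversed[pair] += 1
    return {pair: (traversed[pair] / n, n) for pair, n in suggested.items()}


def flag_candidates(orders, min_orders=20, max_ratio=0.1):
    """Transitions that are suggested often but rarely driven (assumed thresholds)."""
    stats = traversal_stats(orders)
    return sorted(
        pair for pair, (ratio, n) in stats.items()
        if n >= min_orders and ratio <= max_ratio
    )


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from ping 1 to ping 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


# Three drivers avoid the suggested a -> b -> c and detour via d; one follows it.
orders = [(["a", "b", "c"], ["a", "d", "c"])] * 3 + [(["a", "b", "c"], ["a", "b", "c"])]
stats = traversal_stats(orders)
candidates = flag_candidates(orders, min_orders=4, max_ratio=0.3)
```

A transition that the routing engine keeps suggesting but drivers almost never take is a candidate for a missing turn restriction or a non-routable road. Comparing `bearing_deg` between consecutive pings with the digitised direction of a way is one possible way to spot direction problems.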
In addition to that, we also try to use local knowledge to reduce false positives in our outputs. These are usually handcrafted rules from our local map operations teams. There's also another project where we look at the discrepancy between the driver location and the starting point of the pickup route.
You can see in the radius over there that there's a small gap, right? Normally the red marker should be at the point where the blue marker is. So when this happens, it could be that there are some issues around that area, and that's why we are not able to route starting from the driver location.
In this case, we extract all these routes and then use Overpass to extract the OSM tags of the roads within the radius. And we have a set of rules, agreed with our map operations teams, to flag only those that we deem to be truly problematic.
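A minimal sketch of this second check, under several assumptions: the gap is measured with the haversine formula, the 30 m gap threshold and the tag rules in `SUSPICIOUS_TAGS` are hypothetical placeholders (the team's actual rules are not given in the talk), and the query is only built as an Overpass QL string, not sent. The `elements` structure mirrors the JSON that the Overpass API returns for `out tags;`.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_earth_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_earth_m * math.asin(math.sqrt(a))


def has_gap(driver, route_start, max_gap_m=30.0):
    """True when the route starts too far from the driver (assumed threshold)."""
    return haversine_m(*driver, *route_start) > max_gap_m


def overpass_query(lat, lon, radius_m):
    """Overpass QL: tags of all highway ways within radius_m of a point."""
    return (
        "[out:json][timeout:25];"
        f'way["highway"](around:{radius_m},{lat},{lon});'
        "out tags;"
    )


# Hypothetical rules; real ones would come from the map operations team.
SUSPICIOUS_TAGS = {("access", "private"), ("access", "no"), ("highway", "construction")}


def flag_ways(elements):
    """IDs of ways whose tags match any hypothetical rule."""
    return [
        e["id"] for e in elements
        if any((k, v) in SUSPICIOUS_TAGS for k, v in e.get("tags", {}).items())
    ]


query = overpass_query(-6.2, 106.8, 50)
sample = [
    {"type": "way", "id": 1, "tags": {"highway": "residential"}},
    {"type": "way", "id": 2, "tags": {"highway": "service", "access": "private"}},
]
flagged = flag_ways(sample)
```

The query string would be posted to an Overpass API endpoint, and the `elements` list of the JSON response passed to `flag_ways`.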
We then fix these problems and also provide further feedback to the system so that it can improve. The impact of this project on the business side is that we can serve more accurate routes, and that translates to better ETAs and a better driver and customer experience.
For the OSM community, we plan to move these edits into the public OSM map. We also share the lessons we have learned and the best practices we have developed from validating the problems flagged by the pipeline, via our wiki and also through OSM Indonesia.
We also have further plans to incorporate machine learning approaches to improve precision and recall, and to cover other types of map errors. Yeah.
Oops. Okay. So, yeah. We also want to reduce some arbitrarily defined thresholds in our pipelines. And the last point, which is a pretty interesting one for me, is that we want to incorporate graph-based features that will provide more information about road network connectivity.
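To make the idea of graph-based connectivity features concrete, here is a small sketch that treats the road network as a directed graph of ways and derives a few features for a pair of ways. The talk does not specify which features are planned; the degree and hop-count features and the function names below are purely illustrative assumptions.

```python
from collections import deque


def build_graph(edges):
    """Directed adjacency sets from (from_way, to_way) pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def hops(graph, start, goal):
    """Breadth-first hop count from start to goal, or None if unreachable."""
    if start not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def connectivity_features(graph, a, b):
    """Illustrative features describing how ways a and b are connected."""
    return {
        "out_degree_a": len(graph.get(a, ())),
        "in_degree_b": sum(1 for nbrs in graph.values() if b in nbrs),
        "hops_a_to_b": hops(graph, a, b),
        "hops_b_to_a": hops(graph, b, a),
    }


# A one-way link: a reaches b, but b cannot reach a.
one_way = build_graph([("a", "b")])
features = connectivity_features(one_way, "a", "b")
```

An asymmetry such as `hops_a_to_b` being small while `hops_b_to_a` is missing is the kind of signal that could be combined with driver traces, for example when drivers are regularly observed travelling from b to a.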
Yeah. Right. So, yeah, you can check out our OSM wiki page, or you can drop me an email if you're interested. And, yeah, feel free to approach me as well. Thank you. Thank you.