CoffeOSM: improve OpenStreetMap a receipt at a time
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61502 (DOI)
Transcript: English (auto-generated)
00:05
Okay, the next speaker is Michele Tameni, who has a talk about CoffeOSM, improve
00:37
CoffeOSM. So, the stage is yours.
00:41
Hello, everyone, and thank you for being here. It's really nice to be here after attending, like, 10 editions of FOSDEM. This is my first talk, so it's really great to be here. So, I want to talk to you today about a little project that I started, like, five months ago, I think, and it basically wants to be a new, different, and hopefully easier
01:04
way to add places, in particular business places, to OpenStreetMap. And to give you a little bit of perspective, the usual process to insert a new place or a business into OpenStreetMap usually involves checking if the place is already on OpenStreetMap,
01:26
so you open the map and look for the location where you want to add the place. Then you gather all the information needed, so the name, the position, the address, maybe the phone number, the website, and then open your preferred editor, the website,
01:44
or, you know, OsmAnd or something like that, insert all the information, check if it's correct, and then save it. It has become easier over time, but it's still a time-consuming task, and especially
02:02
I found myself sometimes having problems finding updated information about a business, like the phone number or the website. Sometimes you find incorrect information online, so I think it's quite hard sometimes to insert a new place.
02:21
And so I got this idea, and I'll tell you later how it came about. But to validate the idea, before coming to FOSDEM, we took a little bit longer way: we went to Zagreb with my friends to drink some beer, for a serious purpose, obviously. And when you
02:46
are in a new city that you don't know, you usually open the map and have a look at where the majority of pubs, restaurants, or something like that are, and when you
03:02
see something like that, you know that on that street maybe it's a good place to be, or on the other side of the city there is something else. But what you usually do is look for restaurants, pubs, bars, etc., so it's quite important for me to have this kind of information on the map. OpenStreetMap, I think, has improved a lot over the
03:24
last years, especially in Europe maybe, but sometimes it's lacking the business information. So we went to Zagreb, and we tried to find as many bars as possible for this research,
03:42
and what we found out is that there are so many more pubs and drinking places than what we spotted on the map the first time. So our question is how we can improve. Obviously we can do more travel like this one, and insert all the information that we
04:03
gather travelling. Fortunately there are so many volunteers around the world that do this kind of stuff and insert all the kinds of places that nowadays we can find on OpenStreetMap. But as I told you before, it's a time-consuming task and you have
04:23
always to find the correct information about the business you want to add. So what can we do? We can do the things that I already said, or do something like Bob does, who is sitting over there. I think he does quite a smart thing.
04:44
He collects the receipts that we get over our drinking night, and afterwards he checks if the place where we went is already on the map or not, and if it's not, he inserts it in OpenStreetMap
05:00
with all the information already there, no need to look everywhere, because usually the receipts have the business name, the address, the location, the numbers, sometimes the website. So I think it's quite a smart thing to do something like this. Maybe it's not that smart that Bob usually does this after too many beers, but that's
05:23
another problem. And to avoid his mistakes, like typos or something like that, I think it could be interesting to try to automate the process and extract the information from the receipt. And basically the idea is
05:41
to snap a picture of your receipt, and you get all the information that you need to insert the place, and eventually you insert the place right away, if the place is not on OpenStreetMap. So CoffeOSM basically does these things: extract the text from the receipt, try to tokenize and label all the data
06:02
that you can find, check the existence of the place, that is, if the place is already on OpenStreetMap, and if not, maybe, because it's not actually possible yet, you can insert it directly on OpenStreetMap, or at least copy and paste
06:21
all the information you need. And actually the project is quite small, it's just a proof of concept. I thought a little bit about the architecture of this project, and at first I started to try to do a stand-alone app, but I think that maybe
06:43
it's better to have something that can be easily integrated in other applications. There is something like StreetComplete or other projects that do a great job to improve and make it easy for people to contribute to OpenStreetMap, so I think it can be
07:00
really, really interesting to maybe integrate a function like this in those apps. So I just mocked up a small Python API that exposes an endpoint where you can just upload an image, and then the software tries to understand what is on the receipt,
07:23
label all the data and just return it. Actually the front-end is just a really, really small application that shows a small form and visualizes the information that the backend could extract.
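The upload endpoint described here could be sketched as follows. This is only a minimal illustration using the Python standard library (a plain WSGI callable); the talk does not say which framework the project uses, and the `/receipt` path, the `extract_labels` helper and the response shape are all hypothetical names chosen for the example.

```python
import io
import json
from typing import Callable, Dict, Iterable


def extract_labels(image_bytes: bytes) -> Dict[str, str]:
    """Hypothetical placeholder for the OCR and labelling steps.

    A real backend would return fields such as name, address or phone;
    this stub only reports how many bytes were uploaded.
    """
    return {"bytes_received": str(len(image_bytes))}


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI app: POST raw image bytes to /receipt, get JSON labels back."""
    is_upload = (
        environ.get("REQUEST_METHOD") == "POST"
        and environ.get("PATH_INFO") == "/receipt"
    )
    if not is_upload:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "POST an image to /receipt"}']

    # Read exactly the number of bytes announced by the client.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    image_bytes = environ["wsgi.input"].read(length)

    body = json.dumps(extract_labels(image_bytes)).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For local experiments, the callable can be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, app)`; a front end like the one mentioned in the talk would then only need to post the photo and render the returned JSON.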
07:40
As I said, future integration with all the editors, I think, will be the way to use this kind of function, if it proves to be interesting and useful and reliable. Or maybe it can be just a stand-alone app or a pivot app. I actually don't know, it's something open.
08:01
I'm here to discuss it, actually. So, how does it work? The receipt is uploaded to a server. I remove the EXIF data just to have a little bit of privacy, because it can contain things like the location and the time when you went to a place, and I think that's not something that you want to share.
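As an illustration of the EXIF removal step, here is a minimal sketch that drops Exif segments from a JPEG byte stream using only the standard library. It is not the project's code: the talk does not say how the metadata is removed, the `strip_exif` name is made up for the example, and the parser is deliberately simplified (it assumes every segment before the scan data carries a length field).

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return the JPEG with APP1 'Exif' segments removed.

    Simplified sketch: walks the marker segments that precede the image
    scan and copies everything except Exif metadata.
    """
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")

    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("corrupt segment header")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: the rest is compressed image data, copy as-is.
            out += jpeg[pos:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length

    # No scan marker found: keep whatever remains untouched.
    out += jpeg[pos:]
    return bytes(out)
```

Doing this before any other processing means the location and timestamp embedded by the camera never reach the later steps, which matches the privacy motivation given in the talk.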
08:23
The image is pre-processed a little bit before the OCR. Actually, it's something really basic; I think it can be improved a lot. And then there is OCR with Tesseract OCR. That works quite well, but I think it can be a little bit better,
08:44
maybe by processing the image a little bit more beforehand. Then I tried several ways to parse the data. I found that libpostal is actually what I found most reliable for this task, and maybe it could be interesting to train a custom model.
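To make the labelling step concrete, here is a small rule-based sketch that tags lines of OCR output. It is a stand-in for illustration only, not the parser used in the project: the regular expressions, the postcode heuristic, the `label_receipt_lines` name and the sample receipt text are all assumptions made for this example.

```python
import re
from typing import Dict

# Hypothetical heuristics; real receipts vary a lot from country to country.
WEBSITE_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\b(?:tel\.?|phone)[:\s]*([+\d][\d\s()/-]{5,}\d)", re.IGNORECASE)
POSTCODE_RE = re.compile(r"\b\d{4,5}\b")


def label_receipt_lines(text: str) -> Dict[str, str]:
    """Assign each OCR line to at most one label, first match wins."""
    labels: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        website = WEBSITE_RE.search(line)
        phone = PHONE_RE.search(line)
        if website:
            labels.setdefault("website", website.group(0))
        elif phone:
            labels.setdefault("phone", phone.group(1).strip())
        elif POSTCODE_RE.search(line):
            # A line with something that looks like a postcode is
            # treated as the address.
            labels.setdefault("address", line)
        elif "name" not in labels:
            # The first line that matches nothing else is assumed
            # to be the business name.
            labels["name"] = line
    return labels


# Made-up sample text, as it might come out of the OCR step.
sample = """Caffe Bar Primjer
Ilica 1, 10000 Zagreb
Tel: +385 1 2345 678
www.example.com
TOTAL 12,50 EUR"""
print(label_receipt_lines(sample))
```

Rules like these break quickly when layouts change, which is one reason a dedicated address parser or a custom-trained model, as discussed in the talk, may be preferable.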
09:05
I found some open source models that can understand what a receipt or an invoice says, but they are usually trained on the products that you buy, to find the price and the total,
09:21
and not the business name. So, that doesn't work really well, actually. Then we can just look with Nominatim if the place is already on OpenStreetMap or not. If we can't find the exact name, I try a location search with the Overpass API.
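The two lookups mentioned here can be sketched as request builders. This is only an illustration under stated assumptions: the function names, the search radius, the list of amenity values and the sample coordinates are hypothetical, and the code only builds the request strings rather than calling the public services.

```python
from typing import Sequence
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def nominatim_search_url(name: str, city: str, limit: int = 5) -> str:
    """Build a Nominatim free-text search URL for a business name."""
    params = {"q": f"{name}, {city}", "format": "jsonv2", "limit": limit}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def overpass_nearby_query(
    lat: float,
    lon: float,
    radius_m: int = 150,
    amenities: Sequence[str] = ("bar", "pub", "cafe", "restaurant"),
) -> str:
    """Build an Overpass QL query for similar businesses around a point."""
    pattern = "|".join(amenities)
    return (
        "[out:json][timeout:25];\n"
        f'nwr["amenity"~"^({pattern})$"](around:{radius_m},{lat},{lon});\n'
        "out tags center;"
    )


# Made-up example values.
print(nominatim_search_url("Caffe Bar Primjer", "Zagreb"))
print(overpass_nearby_query(45.8131, 15.9775))
```

The Overpass query would typically be sent as the `data` form field of a POST to an Overpass API instance, and the names in the response compared with the name read from the receipt. Both public services have usage policies, so a real client should identify itself with a proper User-Agent and throttle its requests.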
09:41
So, I look at the address, have a look at all the businesses of that type that are around, and show a list to the user, just to make sure that the place is not there with a slightly different name. So, what can be done differently or better?
10:01
I have in my to-do list: improve the text extraction. As I said before, the OCR actually works well. I tried the Google Vision API, which worked much better, but I don't want to use it, actually. So, I think that maybe with a little bit more pre-processing on the image
10:20
all the data extracted could be a little bit more accurate. As I said, the front end is actually a really, really small application that just shows some information. It's maybe better to do all the stuff on the client side for privacy reasons, so I can just read the text, keep it on my device, and just upload
10:41
or save the new place to OpenStreetMap, choosing what information I want to share. Integration: so, as I said, inserting the place directly from the app would be great, or integrating it with some other editor. More receipts to test and improve,
11:01
because what we spotted in Zagreb, so our trip was not useless, actually, is that there is obviously not a clear standard for a receipt, and it can change a little bit from place to place. In Italy they are almost all the same. We didn't find this true for, like, Zagreb.
11:24
So, maybe collecting more information about how receipts look all around the world could be great. What can be done differently? Because this is the step that I thought was the easiest. But maybe, like I said,
11:42
having a custom model that can label the information a little bit better could be interesting. At least because sometimes you find not the name of the place on the receipt, but the name of the business, which is sometimes different. Or sometimes you find both, and actually libpostal
12:01
gets confused by that. So, you don't have a reliable result over time. So, why? To find more pubs, more beer, more fun, to help Bob drink some more good beer, and obviously to improve OpenStreetMap together,
12:22
and just to have an easier way to add places, so we can collect more information. So, thank you. This is the website where you can find the source code. I also have a temporary playground where you can upload your receipt to test.
12:41
And that's it. I just want to know if you find this idea interesting, if it could be good to go on with this project, if you have some suggestions. So, ask questions. I don't expect you to have questions. So, thank you.
13:24
Because maybe you haven't snapped the photo right when you went out from the place; that's the first reason. And the second reason is privacy, as I said before. But I think it would be great to offer the users the
13:41
possibility to choose if they want to share the information. It's much easier, because if you have the exact coordinates you can just have a look in a really small area around them and easily find out if the place is already on OpenStreetMap or not, and even be more accurate when you insert it. So, it's obviously a nice idea.
14:02
I haven't done it just because sometimes, like Bob, he doesn't do the insert right away, but the day after, with the hangover. So, like... Yep? I have a question. Normally the receipt always shows the time when it was printed
14:22
or given to the customer. Can I repeat the question? Ok, I repeat the question. Maybe the picture was taken... That's a nice idea. Yes.