Image Geocoding as a Service

Speech transcript
Thank you very much, and thanks for sharing this session. I'm coming from Portugal. I'm really enjoying the conference and have already seen some interesting presentations. I will talk about image geocoding, and I'll try to explain the overall process behind this service, as far as it is possible to explain it.
Why are we so interested in images? Because people use a lot of images. People are taking pictures here, which is good, but they are using the camera as a sensor, and I think the most used sensor that we carry with us today is the camera. If you look at the numbers, about 1.8 billion pictures are uploaded every day, so we have lots of pictures. What we would like to do is use the images to estimate the precise location of the user. For example, this lady took a picture, and I would like to use it to tell her where she is in this room: her precise position and orientation. Other methods like GPS are not able to provide the orientation. GPS can say "I'm here", but you have to walk and move around to find your orientation. We also use our own vision to find where we are, to estimate our position.
In this project we tried to do this geocoding in less than a second, so that it is not a bad user experience: you upload the photo and get the position estimate in less than a second. We also tried to use the camera in continuous streaming: every second we extract a frame and do the processing on that frame, one frame per second, and we are able to do that. It was a requirement.
I will explain it, but first I have a small video here, a really small one, which shows this geocoding being used like any other geocoding. In this example we changed OpenTripPlanner: in addition to the possibility of entering coordinates or a street name, we can use our camera. We take a picture with a mobile phone or tablet and upload it. In this case this is inside our university campus, where we had already prepared the information. As you can see, the user immediately gets their position inside the building and their precise orientation. The video is a bit fast, but the first instruction is to turn 180 degrees to move to the opposite side.
The user only sees the front end, and the front end uses our geocoding service: it sends a request and gets a response. From the user's or the developer's point of view it is just a geocoding service, and it provides an API. Behind the scenes, on the back end, we divide our software into two modules. The first one is related to the position and orientation estimation, and to be able to compute the position we use the other module, the models. Everything was built previously: we took images to create these models. I will talk about these different parts of the software.
The geocoding API is very similar to the Google API, Mapbox, Yahoo, you name it. There are lots of geocoding APIs, and all these APIs use strings to estimate the position. Our API is almost the same. The idea is that, from a developer's point of view, you don't need to change your code; you just use this API like the other APIs. For the request we use an additional parameter, which is the image. It is called photograph, so it is easy for users to upload the image. The image may or may not have EXIF tags; probably, if you are using this service, you don't have the location, but you might. We use an optional parameter called region, like the one used, for example, in the Google Maps API. For this region restriction we also submit the IP address of the device uploading the image, because if you know the IP you can geocode the IP and limit the scope of the answer. So the request might include the region and the IP address, but these are optional parameters.
The answer has a structure with a field called geometry 3D, and this geometry field provides the location as a 3D point. We also provide the orientation with respect to the ground, and also the pitch, to know whether the camera is pointing up, down, or forward. An additional field is related to the confidence level of our geocoding. Our visual geocoding service may return several positions, ranked by score. So far I have only described the visible part of this API; everything else runs in the background. Before talking about the pose estimation, I will talk about the point cloud models that we generate, so it is easier, I think, to understand.
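The request and response described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (`build_request`, `parse_response`), the parameter spellings (`photograph`, `region`, `ip`) and the response field names (`results`, `geometry3d`, `heading`, `pitch`, `confidence`) are guesses based on the talk, not the documented interface of the service.

```python
import base64
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PoseCandidate:
    """One candidate answer: a 3D point plus camera orientation (hypothetical layout)."""
    x: float
    y: float
    z: float
    heading_deg: float   # orientation relative to the ground plane
    pitch_deg: float     # negative = camera pointing down, positive = pointing up
    confidence: float    # score attached to the estimate


def build_request(image_bytes: bytes,
                  region: Optional[str] = None,
                  ip: Optional[str] = None) -> dict:
    """Build request parameters: the image is mandatory, region and IP are optional."""
    params = {"photograph": base64.b64encode(image_bytes).decode("ascii")}
    if region is not None:
        params["region"] = region
    if ip is not None:
        params["ip"] = ip
    return params


def parse_response(body: str) -> List[PoseCandidate]:
    """Parse a JSON response that may hold several positions, best score first."""
    data = json.loads(body)
    candidates = []
    for item in data["results"]:
        x, y, z = item["geometry3d"]["coordinates"]
        candidates.append(PoseCandidate(x, y, z, item["heading"],
                                        item["pitch"], item["confidence"]))
    candidates.sort(key=lambda c: c.confidence, reverse=True)
    return candidates
```

A client would pass the result of `build_request` to whatever HTTP library it already uses for text geocoding, which is the point made in the talk: only the query parameter changes.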
Before we are able to estimate the position, what we do for a new site, for example for this conference, is take pictures, lots of pictures, and use a structure from motion algorithm to create a point cloud. You may not see it well here, but there is the point cloud and also the pictures that were taken, with the computed position of those images. This creates a huge point cloud, and we use this class of algorithms, structure from motion.
Afterwards, our work is about trying to reduce this point cloud. As you can imagine, creating point clouds for the whole world is quite complicated and you get a huge point cloud, so we try to reduce it in several steps. The first step is just to filter out some redundancy. After that we use a technique called synthetic views. For example, in this room I can take 16 pictures, and with these 16 pictures I have all the details of this room, but I could have all the details with fewer pictures. A synthetic view is an artificial picture, and what we compute is the minimum set of views that I can use afterwards to georeference this room. For example, with one image for each of these walls I could do the georeferencing with just four pictures. So instead of having a point cloud based on 16 or 20 images, I have a point cloud based on just four images. The same applies to the outside of a building: I take many pictures of the building and create synthetic views of it, sometimes two, sometimes four, and so on. It is a way to compute the minimum set of views that I need to estimate the position afterwards, and this is the only information that we want to keep.
To be able to have a very fast search algorithm, we use a vocabulary tree. All the features of the synthetic views are indexed in this vocabulary tree, so the search mechanism is quite fast. This previous step, the offline step of creating the synthetic views and the vocabulary tree, has to be done beforehand in order to be able to estimate the image position.
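The idea of keeping only a minimum set of views that still covers the scene can be illustrated with a greedy set-cover sketch. This is a simplified stand-in, not the synthetic views technique from the talk: it picks among existing pictures instead of generating artificial ones, and the function name `select_minimum_views` and the view and point identifiers are hypothetical.

```python
from typing import Dict, List, Set


def select_minimum_views(view_points: Dict[str, Set[int]]) -> List[str]:
    """Greedily pick views until every 3D point is seen by at least one chosen view.

    view_points maps a view id to the ids of the point-cloud points visible in it.
    Greedy set cover is an approximation; it does not guarantee the true minimum.
    """
    uncovered: Set[int] = set().union(*view_points.values())
    chosen: List[str] = []
    while uncovered:
        # Take the view that sees the most still-uncovered points
        # (ties are broken alphabetically because the ids are sorted first).
        best = max(sorted(view_points), key=lambda v: len(view_points[v] & uncovered))
        chosen.append(best)
        uncovered -= view_points[best]
    return chosen


if __name__ == "__main__":
    room = {
        "north": {1, 2, 3, 4},
        "east": {4, 5, 6},
        "south": {6, 7, 8, 9},
        "west": {9, 1},
        "corner": {3, 4, 5},
    }
    print(select_minimum_views(room))
```

In this toy room, five pictures are reduced to three that together still see every point, which mirrors the reduction from 16 pictures to four described above.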
Regarding the pose estimation: as I said, we have a group of synthetic views and a new image, and we need to find out how the location of the new image is related to the images that we already have. For example, I took two images yesterday from different positions, with people walking around and so on, but even so I was able to find features that match in those two images. So if I have the parameters of one image, I can use a transformation matrix to compute, or estimate, the parameters of the other image. In our system we use the synthetic views to estimate the position of a new image.
To estimate the position, we start with a new image that someone uploads. We extract the features from that image, using well-known techniques for feature extraction from images. With these features we go into the vocabulary tree to find similar features already present in our synthetic views. The output of this is a list of synthetic views that have some features matching the features extracted from the image. In the next step we try to match our image against those synthetic views, to see which synthetic views best match the input image. For those where we have a match, we compute the transformation matrix, and with this matrix we are able to estimate the position of the image. The whole process is quite complicated: it involves many computer vision concepts and a lot of steps, but today we are able to do it.
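The lookup step, going from the features of an uploaded image to a short list of candidate synthetic views, can be sketched with a flat inverted index. This is a simplification under stated assumptions: a real vocabulary tree quantizes descriptors hierarchically, whereas here the features are assumed to be already quantized into integer "visual word" ids, and the function names and view ids are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_inverted_index(view_words: Dict[str, Iterable[int]]) -> Dict[int, Set[str]]:
    """Offline step: map each visual word to the synthetic views that contain it."""
    index: Dict[int, Set[str]] = defaultdict(set)
    for view_id, words in view_words.items():
        for word in words:
            index[word].add(view_id)
    return index


def rank_candidate_views(index: Dict[int, Set[str]],
                         query_words: Iterable[int],
                         top_k: int = 3) -> List[Tuple[str, int]]:
    """Online step: vote for views sharing visual words with the query image."""
    votes: Counter = Counter()
    for word in set(query_words):
        for view_id in index.get(word, ()):
            votes[view_id] += 1
    # Highest vote count first; ties broken by view id for a stable order.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    index = build_inverted_index({
        "hall_a": [1, 2, 3, 4],
        "hall_b": [3, 4, 5],
        "lab": [7, 8],
    })
    print(rank_candidate_views(index, [2, 3, 4, 9]))
```

Only the views returned here would go on to the more expensive steps described in the talk: detailed feature matching and computing the transformation matrix.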
We were able to do it for several spaces. We have a repository online, and it is free software. I would say it is not completely open, because to be completely open the development should be open as well, but there is some documentation around. My purpose with this talk is to find out whether other people are interested in this kind of algorithm and this kind of processing, which would also push us to write better documentation and so on.
We are still working on this project and trying other techniques instead of synthetic views to reduce the size of the point clouds. The results were not so good; we have some benchmarks comparing the two, but we are open to other ideas to make this geocoding work. We are also tagging all the images with semantic descriptions: instead of just trying to extract features, we are trying to extract objects, so I can say this picture has some sky, some buildings, some persons, cars, signs, and so on, to enrich the database. We are also working on improving the scalability of the overall process, because we would like to do this at world scale, and there are some problems in trying to do that. We are also trying to move everything to PostgreSQL, with all the data in the database, where you are already able to cluster the point features and so on. We also need to develop a point cloud algebra to manipulate these point clouds; I would like to see whether part of this already exists.
I would also like to work with historical data. For example, a city centre is changing: there is a new building, there is a new shop; even inside a shopping mall the shops can change. If I am trying to georeference or geocode an old image, I would like to use an old point cloud related to that space, so we probably need to keep several versions, several temporal point clouds, in our system.
As I said, the idea of this presentation is to see whether someone else is interested in these algorithms, so that we can work together. I am running a GIS lab at my university, but two years ago we created a group just to work on computer vision algorithms, because there are so many images, and so many projects using images, that we feel we need to improve our computer vision skills. That is what we are doing. Thank you very much.
Question: What kind of programming language are you using for this?
Answer: C++, and that was related to the performance requirement, to be able to do this in less than a second. We use the OpenCV library. OpenCV is written in C++; you can use it from Java or other languages, even Python and so on, but it is faster in C++, and we were looking for speed. If you were able to wait 5 or 10 seconds for the position, you could start in another language, and once it is working you could rewrite it in another language. I decomposed the process into steps because the steps correspond to software modules, so you can replace these modules. Image feature extraction, for example, can be done in any language; afterwards you have a set of descriptors of the features, so any language works.
Question: OpenCV has many algorithms, such as the Canny edge detector, line transformations, and many others. Which algorithm do you use for image feature extraction?
Answer: That's a good question. We use several algorithms. For example, image feature extraction is different in outdoor spaces and in indoor spaces. In indoor spaces the walls look almost the same everywhere, while outdoor spaces offer a lot more. If I send you our full paper, you can see that different feature extraction algorithms give different results, so basically, depending on the area, we use different ones.
Question: I was just wondering how it reacts to a dynamic environment, I mean an environment that is changing.
Answer: There are several changes in the environment that do not affect the algorithm, but in the end it depends.
Question: So it is dynamic. Could you use the images that users send in order to be located, and extract new models from them? Users would contribute images to build up the database. Within a year you would have many images, and if you have enough images of an area you could recreate the model.
Answer: Yes, yes.
Question: And the last one, about the users: do you have any policy about rights? Do you show a licence to the users, so that when they upload images they give you the right to use them, and it says how you use them, what the restrictions are, and all these things? Did you think about this?
Answer: We did think about it, but the project is at a testing and development stage. When we open it up to more people, we will have to write something in more legal terms. I think the major decision is whether we use the images or not, because we can do this geocoding without keeping the images, even without storing the image from the user, so there is no problem regarding user rights: we just use the image to extract the features, without storing it. We also tried to extract the features on the mobile phone: instead of uploading the image, we do the feature extraction on the phone and only upload the feature descriptors to our service. In this way we don't see the user's image; we just have the features, and our answer is based on the features. So in this case there is no issue related to personal rights, since only features are exchanged. People can send features from everywhere and we don't see the image. Thank you very much.
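The privacy option mentioned in the last answer, extracting features on the phone and uploading only the descriptors, could look roughly like this on the client side. It is a minimal sketch under assumptions: the payload layout, the field names and the `extractor` label are hypothetical, and the descriptors are assumed to be byte-valued vectors produced by some feature extractor that is not shown.

```python
import json
from typing import Sequence


def build_feature_payload(descriptors: Sequence[Sequence[int]], extractor: str) -> str:
    """Serialize feature descriptors for upload; the image itself is never included.

    Each descriptor is a sequence of integers in 0..255 and is sent as a hex string.
    bytes() raises ValueError if a value falls outside that range.
    """
    payload = {
        "extractor": extractor,  # tells the service which descriptor type it receives
        "descriptors": [bytes(d).hex() for d in descriptors],
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Two tiny made-up descriptors, only to show the resulting payload shape.
    print(build_feature_payload([[0, 255, 16], [1, 2, 3]], extractor="orb"))
```

Because the payload carries no pixel data, the service can answer from the features alone, which is the argument made in the talk for why user image rights are not an issue in this mode.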

Metadata

Formal metadata

Title Image Geocoding as a Service
Series title FOSS4G Seoul 2015
Author Rocha, Jorge Gustavo
License CC Attribution - NonCommercial - ShareAlike 3.0 Germany:
You may use and modify the work or its content for any legal and non-commercial purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and pass on the work or its content, including in modified form, only under the terms of this license.
DOI 10.5446/32158
Publisher FOSS4G
Publication year 2015
Language English
Producer FOSS4G KOREA
Production year 2015
Production place Seoul, South Korea

Content metadata

Subject area Computer Science
Abstract Driven by the ambition of a global geocoding solution, in this paper we present the architecture of an image geocoding service. It takes advantage of the ubiquity of cameras, which are present in almost all smartphones. The camera is an inexpensive yet powerful sensor that can be used to provide precise location and orientation. This geocoding service provides an API similar to existing ones for place names and addresses, like the Google Geocoding API. Instead of a text-based query, images can be submitted to estimate the location and orientation of the user. Developers can use this new API while keeping almost all the existing code already used for other geocoding APIs. Behind the scenes, image features are extracted from the submitted photograph and compared against a huge database of georeferenced models. These models were constructed using structure from motion (SFM) techniques and heavily reduced to a representative set of all information using Synthetic Views. Our preliminary results show that the pose estimation of the majority of the images submitted to our geocoding service was successfully computed (more than 60%), with a mean positional error of around 2 meters. With this service, an inexpensive outdoor/indoor location service can be provided, for example, for urban environments where GPS fails.
