Towards a more readable Openstreetmap based world map for westerners

Video in TIB AV-Portal: Towards a more readable Openstreetmap based world map for westerners

Formal Metadata

Towards a more readable Openstreetmap based world map for westerners
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
The standard rendering style used in OpenStreetMap today produces maps that are hard to read in countries where Latin script is not the norm, at least from an average Westerner's point of view. Our map style uses a renderer-independent approach to solve this. We use localization (l10n) functions that create readable names. They are implemented as stored procedures in the PostgreSQL database which contains the OpenStreetMap data. The target Latin-script language (German, English, …) can be selected easily. The talk will show how these functions currently work and will give an outlook on potential future extensions. In contrast to almost all legacy geographic data, OpenStreetMap already contains a lot of localized data acquired by mappers from all around the world, which should be used whenever possible (example: Japan instead of 日本). Automatic transliteration can then be used as an alternative if no Latin names are available in the database. Especially when using transliteration, there are many pitfalls which have to be addressed depending on language and country. Some of them have already been dealt with by the current implementation and are presented in the talk. Others, which appear difficult or impossible to solve, are also shown. Another challenge in the localization of maps is political problems. I will briefly describe some of these issues at the end of my talk. Sven Geggus (Fraunhofer IOSB)
Keywords Fraunhofer IOSB
OK, hello and welcome to the last talk of this session. We now have Sven Geggus, who is working on improvements for label placement, especially for Westerners.
So, my name is Sven Geggus and I am working at the Fraunhofer research institute IOSB in Karlsruhe, which is one of about 60 Fraunhofer institutes all around Germany. They are publicly funded, and I am something like the all-things-Linux guy there. I maintain the German Mapnik style as a hobby, and fortunately I was able to do some work for this project as part of my day job, so part of what I am presenting here is something we were able to do as part of my day job. So, what I am talking about: the motivation for doing
this is that if you look at the openstreetmap.org map and go to countries where Latin script is not the norm, you usually won't understand anything. The reason for this is the rule that the project uses local languages for acquiring the names of geographical objects in the map. So we cannot just render names the way the OpenStreetMap standard style does if we want to understand them from a Westerner's perspective. But fortunately, in contrast to conventional geodata, OpenStreetMap does contain at least some localized data, so we should use it when rendering maps to make them more understandable. So, what do
localized objects in OpenStreetMap data look like? I have two examples, both country objects. On the left-hand side you can see the object for Germany, and on the right-hand side you can see a second country object. What we want to use here is actually something romanized, so not the name tag. We have these colon-separated name tags in the OpenStreetMap key-value system, where you can use arbitrary keys. If our target language is German, we want to use the name:de tag, and we probably want to have the original name in parentheses. So that is our target. So let's have a quick
look at the writing systems of the world. As you can see, they are mostly dominated by Latin, with two exceptions, or rather three: the Arabic world, the Russian Federation, and all around Asia. So the objective is to use localization for all these countries. India is an exception, because India has English as an official language, so in India we will always have a Latin-script alternative. So here is what the
OpenStreetMap Carto style looks like in the area of Moscow, for example, and here is what it looks like if we add localization. So our
main objective is making the map readable for Westerners by using Latin script. We use localized data from OpenStreetMap itself whenever this is possible. This is not possible for every object and never will be, because many objects just don't have a Latin name. So we use other localization methods if OpenStreetMap does not contain the data we want: transcription or transliteration.
So how do we do it? My approach has been to use PostgreSQL stored procedures, which is an advantage but can also be a disadvantage. It is an advantage because it is renderer-independent: it doesn't matter whether you are using Mapnik, MapServer or whatever for rendering, because all you have to change is the SQL. The disadvantage is that if your data source is not PostgreSQL, for example if you are using raw OpenStreetMap files, shapefiles or something similar, you can't use this. So
here is what the implementation looks like. We have three PostgreSQL functions which can be used, and they are placed in their own extension. This is actually usable for any Western language using Latin script, or really any language using Latin script; it doesn't need to be a Western language. It was developed with German in mind, but also with other Latin-script languages in mind. A convenient way to use this is to add database views which look like the original tables, so in the best case you don't even need to change anything in your style. One thing I need to explain is why I have separate functions for place names and street names. The reason is that if you add a name in parentheses, you have both the local name and the localized name, and the label gets very long. So what I do is abbreviate the word for street, like "St." in English or "Str." in German. This is actually a place where I would like to get native speakers of other languages to contribute this abbreviation code for as many languages as possible. Here is what my state machine for deciding which name to use looks like. First, have a look at the name in the target language; in my example that is name:de. If we have it, just use it. If we don't have it, have a look at name. If name is written in Latin script, we are in a country which uses Latin script, so just use it. If name is not in Latin script, have a look at int_name, which is an international name. If it exists, I expect it to be in Latin script, so just use it. If this is not the case, look whether there is an English name. If there is not, then as a last resort we have to use transcription.
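The decision cascade described above can be sketched in a few lines. This is a minimal Python illustration, not the speaker's PostgreSQL code: the function names `is_latin`, `choose_name` and the placeholder `transcribe` are hypothetical, and the Latin-script test (checking Unicode character names) is an assumption about how such a check might be done.

```python
import unicodedata


def is_latin(text: str) -> bool:
    """Return True if text has letters and all of them are Latin-script letters."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        "LATIN" in unicodedata.name(ch, "") for ch in letters
    )


def transcribe(text: str) -> str:
    """Placeholder for a real transcription backend (hypothetical)."""
    return f"<transcribed:{text}>"


def choose_name(tags: dict, target_lang: str = "de", transcribe=transcribe):
    """Pick a readable name following the cascade from the talk."""
    # 1. A name in the target language wins.
    localized = tags.get(f"name:{target_lang}")
    if localized:
        return localized
    name = tags.get("name")
    if not name:
        return None
    # 2. A local name already written in Latin script is used as-is.
    if is_latin(name):
        return name
    # 3. int_name is expected to be Latin; 4. then the English name.
    for key in ("int_name", "name:en"):
        if tags.get(key):
            return tags[key]
    # 5. Last resort: automatic transcription.
    return transcribe(name)
```

For example, `choose_name({"name": "日本", "name:de": "Japan"})` takes the first branch and returns the German name, while an object that only carries a non-Latin `name` falls through to the transcription step.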
As a brief aside on this slide, the difference between transcription and transliteration: transliteration means something which is reversible. If something transliterated is transliterated back, you really get your original script. This is not the case for transcription. Transcription is done with the Western reader in mind, trying to get the pronunciation as close as possible to the original pronunciation. Transcription is what we need, because we don't need the result to be reversible. Another thing you need to know if you think about stuff like this is that there are three, or four if you consider hybrid forms, classes of writing systems. First there are alphabets: the well-known Latin, Greek and Cyrillic. Transcription is usually easy with them. This is also true of syllabaries; the Japanese kana are one example, so transcription is relatively easy. Then you have logographic writing systems like Chinese. Chinese is actually the only logographic writing system which is still in use. There have been others in the history of humankind, but Chinese is the only one left. The problem with logographic writing systems is that transcription is only possible per language: if you transcribe the same characters as Chinese and as Japanese, you get something completely different. This has to be resolved in some way. Last but not least there are hybrid forms, like the scripts of the Thai or Korean language; these are mostly easy to transcribe. So, a few
known problems of transcription that I have encountered. There may be more; I would be glad to hear about them, and best of all about how to resolve them. First of all, logographic scripts: the transcription of Chinese characters has to be based on the place of the geographical object. OK, we have a geographic database, so this is not a problem. We can just determine which country we are in and decide, based on the country, which transcription to use. The current implementation does this for Japanese, and this will probably need to be extended to some other scripts. Another thing is that Thailand uses the Royal Thai General System of Transcription (RTGS), which you get on local road signs and so on. Unfortunately the ICU library, which is publicly available free software, uses ISO 11940, which is something completely different and not widely used. I am not aware of a free and open source library implementing RTGS; if there were one, it could easily be adopted. Another thing is that Arabic and Hebrew, for example, do not write all vowels in the written word. Instead, readers add them while reading, so they are actually not in the word and cannot be added automatically without word lists or something similar. A transliteration will therefore often be incomplete. As an example I have Tehran: ICU gives you a result which is not what you would likely expect. In practice this isn't that problematic for such places, because they usually have localized OpenStreetMap tags, so you don't need to go for transliteration. A few
words on the current implementation: it uses the International Components for Unicode (ICU) library as the default. If there are better ones, better ones can be used. A PostgreSQL stored procedure has been implemented which is actually just a thin layer for calling the Any-to-Latin transliteration function of this library, which is then available as a stored procedure. And we have the place-dependent use of transcription libraries. Currently we do this for the Chinese characters which are the Japanese kanji. This is performed by the Kakasi library, which is also free software, and it is only done if the object is located in Japan; it is not used otherwise. This scheme is extendable to other writing systems and countries, and we could also add something like the use of different transcription libraries selected by writing system and not by place, which could also be added easily.
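The place-dependent choice of a transcription backend can be illustrated with a small dispatch table. This is a hedged Python sketch rather than the actual stored procedure: `country_at`, `transcribe_at` and both backend functions are hypothetical stand-ins, and the bounding box for Japan is a crude assumption used only to keep the example self-contained (it overlaps neighbouring countries; a real setup would test against country polygons in the database).

```python
from typing import Callable, Dict, Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat); deliberately rough, illustration only.
BBox = Tuple[float, float, float, float]
COUNTRY_BOXES: Dict[str, BBox] = {"jp": (122.0, 24.0, 154.0, 46.0)}


def country_at(lon: float, lat: float) -> Optional[str]:
    """Return a country code for the point, or None if no box matches."""
    for code, (min_lon, min_lat, max_lon, max_lat) in COUNTRY_BOXES.items():
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            return code
    return None


def generic_to_latin(text: str) -> str:
    """Stand-in for a generic any-script-to-Latin backend."""
    return f"generic({text})"


def japanese_to_latin(text: str) -> str:
    """Stand-in for a Japanese-specific backend."""
    return f"japanese({text})"


# Country-specific backends; everything else falls back to the generic one.
BACKENDS: Dict[str, Callable[[str], str]] = {"jp": japanese_to_latin}


def transcribe_at(text: str, lon: float, lat: float) -> str:
    """Transcribe text with the backend registered for the object's location."""
    backend = BACKENDS.get(country_at(lon, lat), generic_to_latin)
    return backend(text)
```

Extending the scheme to another country would mean adding one entry to `BACKENDS`; selecting by writing system instead of by place would replace `country_at` with a script detector.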
Other problems encountered when using international labels: there is no single font which contains all Unicode characters, so you have to use different ones. The compromise in the current code is to render the original names only if they use Latin, Greek or Cyrillic script, but this should be extended. A possible resolution would be that the renderer uses different fonts based on the character set. I learned yesterday that MapServer 7 actually implements this, so I have to have a look at it. MapServer 7 does this; I am not exactly sure about the others. It might also be possible to produce a "best of" font from free sources. I didn't try it.
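The font compromise mentioned above (show the original name only when it is in Latin, Greek or Cyrillic script) can be sketched as a simple script check. This is an illustrative Python version under stated assumptions: `uses_only` and `make_label` are hypothetical names, and detecting the script from the Unicode character name is one possible approach, not necessarily the one used in the actual code.

```python
import unicodedata

# Scripts the label font is assumed to cover.
RENDERABLE_SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def uses_only(text: str, scripts=RENDERABLE_SCRIPTS) -> bool:
    """True if every letter in text belongs to one of the given scripts."""
    letters = [ch for ch in text if ch.isalpha()]
    return all(unicodedata.name(ch, "").startswith(scripts) for ch in letters)


def make_label(localized: str, original: str) -> str:
    """Build 'localized (original)', dropping the original if it is
    identical or written in a script the font may not cover."""
    if localized == original or not uses_only(original):
        return localized
    return f"{localized} ({original})"
```

With these definitions, a Cyrillic original is kept in parentheses while a kanji original is dropped, which mirrors the behaviour described in the talk.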
The last slide covers political problems in localization. Political problems usually cannot be solved by technology. Many regions of the world have been part of other countries in the past, for example the German settlement areas in Eastern Europe. Even the smallest villages in Poland or Alsace-Lorraine still have German names, and nobody knows whether they are still in widespread use or not. The worst-case scenario is that using them will offend people. So the only thing we can do is hope that mappers will only supply names which are still used and use old_name otherwise. Hopefully they do. What we do as a compromise is always render the current local name in parentheses. So, prospects and
enhancements of the technical solution: rendering different writing systems in a single label, which, as I already said, is probably already working in MapServer 7. The addition of more and better-suited libraries: if somebody, not only in the audience, knows of a library which might be suitable, I will be happy to know about it and integrate it. A more fine-grained selection of transcription algorithms by place might be needed. And street abbreviation code for all common languages, or actually for all languages in which the word for street is abbreviated at all. And suggestions from the audience. I will do a two-minute live demo and then I will take questions. OK, so
I will now call the PostgreSQL command line. OK, first of all I need to create my extension, and then I will show you how to transliterate. Probably somebody knows these characters: they are the characters for Tokyo, and what you get is the Chinese reading. This is not very good, so we need to use something else. OK, now I get Tokyo, which I would say is much better. Still not perfect, but well. OpenStreetMap has already acquired Tokyo in the proper script for us, but if you had to use transcription, you should go for the second one. Last, there is the place-aware transliteration function. I just add a point here which is located somewhere in Japan. If I use a point somewhere in Germany instead, I get the Chinese reading again. So this is aware of the location: whenever the object is located in Japan, it will use the Japanese transliteration. And it is also easy to drop the extension again. OK, one more demo. I take a city which has German names all over the place because of its history. In the real world you can only see the local street signs, so we should put both names on the map, with one of them in parentheses. So this
is what it looks like, and this is actually what we are using for our map. As you can see here, if you use the street name function, the word for street will be abbreviated. I currently have these abbreviations for Russian, German, English and Ukrainian, and I would be glad to get them for other languages. If I don't use the street name function but the place name function, this stuff will not get abbreviated. OK, so far. Any questions?
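The street abbreviation step can be pictured as a small per-language table of search-and-replace rules. The Python sketch below is an assumption-laden illustration: `ABBREVIATIONS` and `abbreviate_street` are hypothetical names, and the two rule sets shown (English "Street" to "St.", German "straße" to "str.") are simplified examples, not the rules shipped with the speaker's extension.

```python
import re

# Per-language (pattern, replacement) rules; simplified examples only.
ABBREVIATIONS = {
    "en": [(r"\bStreet\b", "St.")],
    # Matches both the separate word "Straße" and the suffix "-straße".
    "de": [(r"([Ss])traße\b", r"\1tr.")],
}


def abbreviate_street(name: str, lang: str) -> str:
    """Apply the abbreviation rules for lang; unknown languages pass through."""
    for pattern, replacement in ABBREVIATIONS.get(lang, []):
        name = re.sub(pattern, replacement, name)
    return name
```

A place-name function would simply skip this step, which matches the behaviour shown in the demo: street labels get shortened, place labels do not.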
Thank you very much.
Could this be turned around? Obviously a lot of tourists, for example from Japan, come to Europe, and they want to see the Japanese name for German streets or German places.
This can likely be done, but you need to change the code, because the code was mostly written with Latin script in mind. There are a few corner cases where it won't work. First of all you have to extend the ICU wrapper functions, which currently use a hard-coded Any-to-Latin; this would have to become Any-to-whatever. I think it is not a problem for Japanese people, because they also use Latin script; it is probably more of an issue for Chinese people. And I am actually not aware whether there even is a transliteration function in the other direction. So it might be possible, and you can configure the code to do it, but it will probably not give what you are expecting.
Hi, great talk, thanks. Just to answer your question from earlier in the talk: I know Mapnik has a fallback mechanism. If it isn't able to render a place name with the given fonts, Mapnik can be provided with Unifont, for example, and it will use another font set to render the place name.
Do you know if this is also the case inside one label? Because I have Japanese characters and European characters inside one label.
I am not quite sure about that, but I think so, because we are using both names wherever possible, the local name and the English name, in the same label, and that should work. The only problem I discovered was that Unifont seems to be a little bit smaller, so if you apply a text size of 10, for example, then it works fine for
the default font, but with Unifont it is almost unreadable.
This is something which has to be addressed, because I would like to have the parentheses all over the world. Currently the original name is only put in parentheses for certain scripts; the is-Latin function in the code contains this check, and changing it would probably be just two lines. Another thing about OpenStreetMap is that you cannot fully trust the tags: for example in Northern Africa you will get a lot of Arabic names, because for some reason people think that Arabic is pretty international.
Are there any other questions? OK, a short one.
Are your code changes online on openstreetmap.de?
They are in use on openstreetmap.de. Currently there is an older version of the code, but the current version will go online on openstreetmap.de in a couple of weeks, I would expect.
OK, no more questions, then thanks again.