
Standing up a OSM clone with GeoServer and CSS


Formal Metadata

Title
Standing up a OSM clone with GeoServer and CSS
Title of Series
Number of Parts
295
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
See that little map on the geoserver.org web site? While looking a lot like plain OpenStreetMap tiles, it’s actually rendered by a GeoServer, using CSS styles, off a PostGIS database. The map aims to be a very close clone of the official OSM one, meaning it has a lot of little details often removed from lookalikes to reduce the data to be loaded and rendered. This presentation will provide a little history of its development, the performance improvements added to speed up its rendering, a good look at the styles used, and the overall setup of the GeoServer, in terms of physical deployment, configuration and tuning. Finally, we’ll show an alternative setup and map, still working on GeoServer, but based on styles and data from the OpenMapTiles project.
Transcript: English(auto-generated)
Okay, good morning, or more precisely good afternoon, as it's 12 p.m. I'd like to introduce our last speaker, Andrea Aime, who's going to talk about standing up an OSM clone with GeoServer and CSS.
Thank you. So my name is Andrea Aime, I work for GeoSolutions. I'm a GeoServer core developer and PSC member, also involved in GeoTools and a few other projects. Our company provides services around GeoServer, GeoNetwork, MapStore, GeoNode, and so on.
So, what are we talking about today? We are talking about this, and you may say, what, the GeoServer homepage? No, wait, let's zoom in a bit more. OpenStreetMap. Well, yeah, but if you look at the bottom, it says: rendering GeoSolutions, data from OpenStreetMap.
So, it seems like you're looking at OpenStreetMap.org, but instead it's being rendered by GeoServer. So, all the tiles that come from that page are not coming from OpenStreetMap.org, but from this WMTS, which is public, of course, at that capabilities URL.
So, what's behind it? How did we get an OSM clone going with GeoServer? Well, we get the data, the raw data from OpenStreetMap, we use Imposm 3 to turn it into a PostGIS database, then we use GeoServer with CSS styles to render it.
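As a minimal sketch of what "that capabilities URL" looks like in general: GeoServer exposes WMTS through its embedded GeoWebCache endpoint, so a GetCapabilities request can be built as below. The host name is a placeholder, not the server shown in the talk, and the helper name is made up for illustration.

```python
from urllib.parse import urlencode


def wmts_capabilities_url(base_url: str) -> str:
    """Build a WMTS GetCapabilities URL for a GeoServer instance."""
    # GeoServer serves WMTS through the GeoWebCache path gwc/service/wmts.
    query = urlencode({"service": "WMTS", "request": "GetCapabilities"})
    return f"{base_url.rstrip('/')}/gwc/service/wmts?{query}"


# Hypothetical host, used only for illustration.
print(wmts_capabilities_url("https://example.org/geoserver/"))
```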
So, motivation and history for it. There were several attempts online to render OpenStreetMap-like maps with GeoServer.
One successful one, which has been abandoned, is OSM-in-a-Box, then there's OSM-GeoServer styles from MELP server, there was the Boundless OSM repository, you can all go and have a look at them. Why did I start a new attempt? Well, the previous attempts either went unmaintained,
were not really a match to OpenStreetMap.org cartographic rendering, or were not using GeoCSS, and I'm the maintainer of the GeoCSS language, I wanted something that would push the language forward. But more importantly, as the man said, we choose to clone OSM not because it's easy, but because it's hard.
Going beyond the comfort zone is always a good teaching experience. OSM is interesting in a variety of ways. The data set is huge, there are a lot of details, the style is complicated, no, no, it's not complicated, it's a veritable monster, it's a dragon. And working on it brought us several improvements to GeoServer which are not specific to rendering OSM, but are useful for everybody else.
So timeline, I started in this repository, which is public, you can go to it, in 2017 in my own spare time, and went on for a number of months making improvements. They are in the slides,
but I'm not going to list them now because we are going to see them one by one in a minute. Then GeoSolutions picked it up for a job, and we made a bunch of other performance improvements, which we are also going to see. This repository you can look up, but it's private, so no joy there.
And then this year, we made a bunch of other improvements, and yeah, the repository is still that private one. So first of all, choosing the right style. When you look at OSM, the style that you look at, the map rendering that you look at, is called OSM Bright.
There are many OSM Bright sort of clones or things called OSM Bright on the network, but there is only one that actually renders like OpenStreetMap.org, and it's at that URL, gravitystorm/openstreetmap-carto. It's a CartoCSS style sheet, which gets rendered by Mapnik.
There are other options around, like the Mapbox OSM Bright and the OpenMapTiles OSM Bright GL style. They are loosely based on the official style, but they have different rendering choices and much simpler maps. Let's have a look. So this is a side-by-side comparison. Official OSM in the middle, the Mapbox rendering is at the top, the OpenMapTiles
rendering is at the bottom. What you can see is that, well, there are obvious color differences, but what really stands out is the amount of detail that the other two maps lack. If you want to have more confirmation, let's have a look at this area that you by now know very well.
One is rendered by OpenMapTiles and the other by OSM. As you can see, there is much less information in the OpenMapTiles rendering than in the official OSM. So what I see personally is different objectives. OpenStreetMap.org tries to represent as many details as possible.
It tries to make an atlas of sorts. Mapbox and OpenMapTiles are sort of trying to go towards a base map instead, on which you can add overlays. And of course, they cannot clutter them up with all possible information because you have to add your own on top.
The only note that I have is that it would be less confusing if they were not all called OSM Bright. The consequences of doing a simpler map are that you have to load so much less data, and you have a simpler style that renders so much faster. And I could have gone that way, but the objective was to do it because it's hard, so I kept the official style instead.
Challenges. Doing it was really challenging, but it brought a number of benefits to GeoServer. The approach was to take the default Imposm 3 export, adapt it a bit, and then manually translate from CartoCSS to GeoCSS,
and fix any hiccup found along the road. And I found many, and I'm going to talk about them. First, zoom versus scale. CartoCSS is based on the zoom levels of Web Mercator. So if you look at the style sheet, it says stuff like, if zoom is greater than or equal to 12, to 13, to 15, and blah, blah, blah,
then use a different line width. They are basically changing the line width based on the zoom level. It's nice and useful if you only care about Web Mercator, but what if you are targeting a different projection? And one of the reasons to do this work was actually doing OSM in whatever projection I want.
Also, GeoCSS and GeoServer use scale denominators, not zoom levels. So I could have just translated the zoom level to the exact scale denominator, which is that bizarre number, but it would have meant that any slight change in the scale denominator would have resulted in a different map,
the one meant for the next zoom level. And I didn't want that. I wanted the scale denominators close to a zoom level to render like that zoom level. So I actually took the midpoint between the scale denominators of adjacent zoom levels and then turned it into a round number that was easy to enter. And I ended up with this set of scale denominators, which are more or less the same.
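A minimal sketch of that midpoint idea, assuming 256-pixel Web Mercator tiles and the OGC standard pixel size of 0.28 mm (which gives the usual zoom 0 scale denominator of about 559 million). The function names are made up for illustration, and the manual rounding to easy-to-type numbers is not reproduced here.

```python
# Scale denominator of Web Mercator zoom level 0, assuming 256-pixel tiles
# and the OGC standard pixel size of 0.28 mm.
Z0_SCALE_DENOMINATOR = 559_082_264.0287178


def zoom_scale_denominator(zoom: int) -> float:
    """Exact scale denominator of a zoom level: halves at every level."""
    return Z0_SCALE_DENOMINATOR / (2 ** zoom)


def cut_between(zoom: int) -> float:
    """Midpoint between the scale denominators of zoom and zoom + 1."""
    return (zoom_scale_denominator(zoom) + zoom_scale_denominator(zoom + 1)) / 2


# Print the unrounded cut points for the first few zoom levels.
for zoom in range(5):
    print(f"z{zoom} / z{zoom + 1}: {cut_between(zoom):,.0f}")
```

The first cut points land near 419 million, 210 million and 105 million, which is in the neighbourhood of the rounded values quoted in the talk.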
They are actually not that bad. So 400 million, 200 million, 100 million, 50 million, 25 million, and so on and so on until the end. So, moving on, I was typing the styles, and GeoCSS is more compact than CartoCSS,
as you can see by the comparison of the road style in CartoCSS, which is the .mss file, and the CSS file, which is the GeoCSS: we are at 38 kilobytes versus 123 kilobytes, so it's significantly smaller. But still, the GeoServer style editor was, like, too tiny. I could not work in it.
So, we added to GeoServer a side-by-side full-screen editor that would allow me and everybody else, actually, to work with a larger view of the style sheet. And we also added a code completion into it while we were at it.
So, one benefit for everybody. Another thing was the scale dependency verbosity. In OSM Bright, there are loads of zoom level to road width associations. There are a ton of them, and you have seen them in the table that I shared before. I could have done the same with GeoCSS.
So, SD less than 400,000, SD less than 200,000, where SD is the scale denominator variable. But it would have been pretty verbose. So, instead, I used the categorize function, which is already part of SLD and CSS, and made SD the variable on which I'm classifying stuff, so that I can build a little table.
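To make the lookup-table idea concrete, here is a small Python sketch that mimics the argument order of the SLD Categorize function (lookup value, first value, then threshold/value pairs). It is an illustration of the concept, not GeoServer code; the widths 9 and 7 and the 50K and 100K thresholds come from the talk, while the width 5 above 100K is an assumed placeholder.

```python
from bisect import bisect_right


def categorize(lookup, first_value, *threshold_value_pairs):
    """Return the value of the interval that `lookup` falls into.

    Arguments after `first_value` alternate threshold, value, threshold, value.
    A lookup equal to a threshold belongs to the following interval.
    """
    thresholds = threshold_value_pairs[0::2]
    values = (first_value,) + threshold_value_pairs[1::2]
    return values[bisect_right(thresholds, lookup)]


def road_width(scale_denominator: float) -> int:
    # 9 below 1:50,000, 7 below 1:100,000 (from the talk); 5 is hypothetical.
    return categorize(scale_denominator, 9, 50_000, 7, 100_000, 5)


print(road_width(40_000), road_width(75_000), road_width(200_000))
```

One table replaces a separate scale-dependent rule per width, which is where the saving in typing comes from.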
If it's less than 50K, use 9. If it's less than 100K, use 7, and so on and so on. So, a pretty compact association between scale denominators and widths, which was a lot less typing. Another issue that I found was that CartoCSS for OSM Bright uses a lot of little SVGs,
and GeoServer supports SVGs, but it renders them as they are, with their colors, with their strokes, with their appearance. In CartoCSS, they are used as shapes, as scalable shapes that you can color however you want.
That wasn't supported. It got added. So, right now, in GeoServer, you can use an SVG as a well-known mark, and then decide what color to use to fill it, stroke it, and whatnot. Then, we started fighting against dragons. These CSS style sheets are really, really big, and they have to be translated down to SLD.
They get translated to an in-memory object model that GeoTools uses, which is basically one-to-one to SLD. I can actually save it as SLD. The problem is that CSS has cascading. Cascading is this notion that the rules interact with each other, and a more selective rule overrides some properties of the more general rule.
When you have thousands of rules in your CSS file, creating the combination of all and deciding which combinations are actually matching features takes a lot of time. Too much time. So, step one was to run a profiler and make it faster.
I made the translation ten times faster, and for many styles, that was enough. Unfortunately, for some, it was still too much time. It took a minute to translate the roads down to SLD. So, realization. Carto CSS does not really have cascading.
It has rule nesting, but no notion of cascading like in the web. All the rules are already designed to be exclusive with each other, so why bother to try to mix them? I added a new translation mode, which is called flat, which also helps a bunch of people who don't really like the notion of cascading. They would like a light syntax for an SLD-like behavior.
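The cost difference between the two translation modes can be sketched with a toy model. This is not the GeoServer translator: the rules below are hypothetical, and the cascading function takes the worst case of emitting one output rule per non-empty combination of input rules, without pruning combinations that could never match.

```python
from itertools import combinations

# Hypothetical rules: (selector, properties). Later entries are treated as
# more specific, so their properties win when rules are combined.
RULES = [
    ("[highway]", {"stroke": "gray", "stroke-width": 1}),
    ("[highway = 'primary']", {"stroke": "orange"}),
    ("[tunnel = 'yes']", {"stroke-dasharray": "4 2"}),
]


def translate_cascading(rules):
    """Worst case: one output rule per non-empty combination of input rules."""
    output = []
    for size in range(len(rules), 0, -1):
        for combo in combinations(rules, size):
            selector = " and ".join(sel for sel, _ in combo)
            merged = {}
            for _, props in combo:
                merged.update(props)  # more specific rules override
            output.append((selector, merged))
    return output


def translate_flat(rules):
    """Flat mode: every input rule becomes exactly one output rule."""
    return [(sel, dict(props)) for sel, props in rules]


print(len(translate_cascading(RULES)), len(translate_flat(RULES)))
```

With n rules the cascading sketch can produce up to 2^n - 1 output rules while the flat one produces n, which is consistent with both the faster translation and the smaller SLD reported in the talk.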
So, we went with this new translation mode for the larger styles, amenities, and roads, which were well above a thousand lines. And that made the translation much faster, and also the generated SLD got smaller. So, the 38 kilobytes road CSS translates to just one megabyte of SLD.
It was like 36 megabytes before. So, quite an improvement. Then, we found, oh yeah, the map is nice, but it's still really slow to render. And you might say, well, dude, this was meant to be tile-cached, not to render it on the fly. So, what did you expect?
I expected more, so I started working on it. One thing is that Imposm generates some pre-generalized tables in which you have an overview of the roads with fewer points in the geometries. I started leveraging them with the pre-generalized data store. In that store, you can basically say, between this zoom level or scale denominator and that scale denominator, use this table.
In that other range, use another table, and so on, so you can go and use more and more simplified geometries. That was a win, but that wasn't enough, so I needed eyes on the target. What is slowing down my map rendering?
I used Java Mission Control, started a profiling session, loading GeoServer with map generation requests. I found out that there was no single place taking most of the time. It was a bunch of things. One thing that was killing performance was using SSL when communicating with the database.
Encryption and decryption of the data was killing performance, so I added the ability to turn off SSL, which is kind of useful if your database is in a local network which is well protected and blah, blah, blah. Then I found out that WKB decoding and transfer was slow, so we switched to Tiny WKB (TWKB) instead, which is a well-known specification.
Then we found out that there was Base64 encoding and decoding of binary data from the database and backwards. We enabled full binary transfer. Then the evaluation of rules in memory was slow because there were so many. It's unusual to have so many rules, so I added a way to write filters in memory to make the resolution faster.
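As a rough illustration of why Base64 transfer costs more than full binary transfer, the sketch below packs a single 2D point as WKB and compares its size before and after Base64 encoding. It only shows the size overhead of the encoding itself; it makes no claim about how GeoServer or the PostgreSQL driver implement the transfer, and the coordinates are arbitrary.

```python
import base64
import struct


def wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB."""
    # Byte order flag (1 = little endian), geometry type (1 = Point), x, y.
    return struct.pack("<BIdd", 1, 1, x, y)


raw = wkb_point(11.25, 43.77)
encoded = base64.b64encode(raw)

# Base64 turns every 3 bytes into 4 characters, so the payload grows by
# about a third, on top of the CPU time spent encoding and decoding.
print(len(raw), len(encoded))
```

Here 21 bytes of WKB become 28 bytes of Base64 text, and both sides additionally pay for the encode and decode steps on every geometry.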
Finally, PNG encoding of the tiles was a bottleneck. We also improved that one. And here we are. Now the map is fast enough and at high and low zoom levels.
We still need to improve mid-levels. They are much better than before, but some maps at 1 to 100,000 and 1 to 200,000 might still take 20-30 seconds to render, which is way too much in my opinion, and I need to go and fix it. However, I basically already optimized whatever I could in GeoServer.
I have to move my attention to the database. There are attributes in the database that we are not using. I have to get rid of them, so basically I have to modify the Imposm mapping file to get less data. The overview tables contain roads that we are actually not rendering, so I have to get rid of them.
I have to study PostgreSQL a bit more to tune it for very large datasets. PostgreSQL typically comes with a very simple configuration that would make it run on a Pentium 4. Well, I probably need to tune it to use a more modern execution environment.
And that's it. Or is it? Didn't I forget anything? Can anybody tell me what I forgot? That we didn't manage to do because...
So I have this contract to read the Mapbox vector tiles and render them in WMS, yes, but I'm going to start working on it as I come back to work in a couple of weeks. So no, that's not a thing that I forgot. That's another thing that I forgot.
Ah, the styles. I spent all this time talking to you about the OSM styles. Didn't anybody wonder where they are? So this is an open repository where I dumped all the styles and the Imposm mapping file. It's not a complete GeoServer data directory yet.
I have to work a bit on it because the closed repository that we have has lots of other layers that I need to clean out. But for the time being, you can go there, it's already open. It has all the CSS styles, the Imposm mapping file, and the pre-generalized data store configuration. And remember to use GeoServer 2.16 with them because of all of these improvements that we made,
well, some of them started landing a couple of years ago, but many of the improvements that we did in 2019 are available only in GeoServer 2.16, which is to be released in a couple of weeks. And that's actually all.
So that went quickly, very interesting. So we have a little extra time for questions, so please bring them.
Nice presentation. I'm wondering, the official style is updated quite often. How will you handle the updating of the style? I didn't. The objective of the exercise was to make one version of OSM up and running,
and there is no automated translation from one to the other. So they would have to be manually updated. Yep. And the repository is open, we accept pull requests and the like, so if people want to start working on it and help us keep it up to date,
that would be appreciated. Hi Andrea, great presentation. Just one question, when you were doing the performance analysis, you were probably using WMS to generate the map, but what CRS, what coordinate reference system?
I was using Web Mercator, but reprojection of vector data is normally not a significant overhead. So it's not like if I used another CRS it would have been different. Because I'm putting it in conjunction with the presentation yesterday
that you had with the APH, and if you're asking for it in, like, polar stereographic or something, wouldn't that bring even higher load times for the map? It will have some overhead, but as I said, the APH overhead should not be that visible,
especially with maps that take still a few seconds to render. I would expect it to become an important part of the response time when we go sub-second, but we are not there yet.
Thank you. How big is the database that you are rendering? At the moment we are rendering Europe, the entire Europe. We are not yet ready to go full planet,
mostly because of the reason that I said before about having yet to do work on the database. So before I try to import the whole planet, I want to make sure that I imported the minimum amount of information possible to get it going.
There is another question there. Just how big is the GeoServer instance that you're using to test? I think that right now it's an 8 core with 64 gigabytes of memory
split into three Docker images, running concurrently all on the same machine. But as I said, right now most of the time is not spent on the CPU. It's actually the Postgres struggling to find the data for rendering.
So if we go to whatever place, this map is rendering off the WMTS, so you might say, well, it's cached. True, but we didn't pre-seed anything. So if I go to an area that was likely not visited by anyone else,
you can actually see the true rendering speed, which is this one. So this zoom level rendering speed, even if stuff is not cached, is quite acceptable in my opinion, also considering there is just one machine behind it.
It's at around this zoom level that sometimes it takes a lot of time, but it's really database driven because I go to a place that I haven't visited before using the WMTS directly, and it takes maybe 40 seconds to load,
and then I pan a bit, and the map comes back in one second. So it's really the database struggling to find data, and I need to find better ways to organize the data in the database. I don't think that there is any significant further work to be done in GeoServer until I fix up the database.
Any other questions? If not, please go on the website that I pointed you at, download the styles and play with them and have fun.