CartoDB Basemaps: a tale of data, tiles, and dark matter sandwiches
Formal metadata
Number of parts | 183
License | CC Attribution - NonCommercial - ShareAlike 3.0 Germany: You may use and modify the work or its content for any legal and non-commercial purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and pass on the work or its content, including in modified form, only under the terms of this license.
Identifiers | 10.5446/32140 (DOI)
Production year | 2015
Production place | Seoul, South Korea
FOSS4G Seoul 2015, 123 / 183
Transcript: English (automatically generated)
00:04
Okay, let's see if this works. Okay, sorry. So, hey everyone. I'm Alejandro Martinez, a systems engineer at CartoDB, and I wanted to give this talk about the CartoDB base maps: a tale of data, tiles, and dark matter sandwiches. And
00:27
after all, it's a tale, a story of how we ended up serving base maps starting from just an evening hack by our co-founder Sergio,
00:41
who tried to do something. On Friday evenings we have something that we call Leapfrog Fridays, which is basically about spending the evening hacking on top of the CartoDB stack on experimental stuff, or on things that we want to improve or
01:04
fit a little bit more into the stack. We do this a lot because we like to push our own limits, and it's a way of development: we build a lot of the CartoDB stack, a lot of the new pieces, on top of the existing pieces of the CartoDB stack.
01:20
For example, the geocoding is just SQL functions on top of a CartoDB account which has all the data ready and uses PostgreSQL's own full-text search capabilities to search for geocoding names and polygons. Or take the data library data sets:
01:43
if you log into your CartoDB account and go to create a new visualization, you get a lot of open data that you can use out of the box. That data is actually fetched from a different CartoDB account which holds the data, and it's copied to your own account. So there are a lot of internal APIs and
02:04
things that we use both for development on the systems team and for everything that is built on top of CartoDB, because we think it's a way to improve the experience both for us and for everyone who wants to build things on top of our platform. So, back to the base maps. A base map is simple and yet complex.
02:25
I mean, a simple base map is just a layer of data, in our case we want it to be open data, with a matching style, which most of the time is the most difficult thing. So it made sense to spend an evening trying to
02:40
create some base maps using CartoDB, even though CartoDB, since it began, wasn't envisioned as a platform for making base maps, but for putting layers of information on top of base maps: overlays of quite a small amount of data compared to OSM or to any other
03:04
data that might be worth calling a base map to share information on top of. But most of our stack was already based on PostgreSQL and PostGIS, which happen to be the most common stack for serving base maps,
03:20
even though we've focused a lot on serving dynamic data that changes frequently and is not as big as the OSM data set. We believed it could be worth a shot. So we went to work, and we obviously went with a less detailed data set, which is Natural Earth, and
03:44
got all the polygons and related things to make a bare base map, which is not even province level, just country level, and tried to study it a bit. We made three or four base maps using CartoDB's own editor, with a big account and data we uploaded using the
04:05
CartoDB UI, and we used this to explore how far we had got in our purpose of being data visualization overlay specialists, and how far we were from being a base map editor. And
04:25
we were almost there. I mean, you could make a basic base map using CartoDB by just uploading the data sets and doing the styling, which can get really tricky and difficult as you deal with different zooms and
04:41
things. But we found that the CartoDB editor was not the best-suited tool for this, starting with the UI. I mean, the UI wasn't designed to handle such a large number of layers one on top of another, and there was a point when they overlapped each other and it broke. And there were some other things, like the data set size.
05:01
I mean, you can upload data sets of two gigabytes, three gigabytes tops, but that's not enough: if we wanted to make a worldwide data set, we could not import it using the CartoDB UI. And that was fine, because you usually don't want to upload and display a 100-gigabyte table at once. Base maps are the exception for us, not the rule.
05:24
But yeah, despite all these hurdles in the editor, making a simple base map was quite easy, and it was simple to make it work because the CartoDB tiler already serves XYZ tiles,
05:42
but it does this with a code we call the layergroup ID, which is something that depends on both the time the data sets last changed and the style. We didn't need that, because we wanted to get a fixed URL, so we took the quick route, which is to just make a
06:04
rule in nginx to map the fixed URL to the real one for the visualization. That was the easy way: we already had a base map that you could access simply from Leaflet, without even using CartoDB.js, or from any tool that accepted
06:22
XYZ formats for serving base maps, without very much work. So, achievement unlocked: we got the first base map. We launched those simple base maps about a year and a half ago, I think, maybe a little bit more, and they have been available in the CartoDB editor for a long time.
06:43
But then, almost a year ago, we wanted to go a bit further, because we wanted to remove the map views limit. If you want to make social maps, maps that get shared by the community, and
07:02
get people to make maps without being afraid of how many times they will be viewed, when they make them and actually want them to be popular, we had to remove that restriction and make all the maps in CartoDB have unlimited map views. But then
07:25
we had the data that is overlaid, but we also needed something unlimited to put behind it. Of course, there are a lot of people serving base maps who do it much better than us, but we wanted to give it a try and
07:41
make an OSM base map that we could host ourselves, that we could be responsible for and pay for the usage of, and that was also designed for data visualization, in the sense that this base map is going to be the default base map in the CartoDB editor, so
08:02
a hundred percent, or up to ninety percent, of the usage is going to come from visualizations that are made on top of it with data that is overlaid, so we wanted it to be as suited to data visualization as we could. That's why we decided to cross the OSM limits, and we got the help of Omniscale to use Imposm, make a
08:22
definition that makes sense, and have the whole of OSM inside a table, which happened to be inside a CartoDB database and inside a CartoDB account. So we got a select all from planet with all the geometries in OSM. Well, it doesn't make sense to query
08:40
150 gigabytes of OSM data with only PostgreSQL for rendering, but we have it there and we could do stuff on top of it. And then, to cross the matter limits (yes, this is a bad pun on the name of the base map, Dark Matter), we got the help of Stamen, who helped us make two awesome
09:03
open source base maps, Positron and Dark Matter, the white one and the dark one, which were designed with data visualization in mind, as they're the ones that are going to be used in CartoDB by default, so they had better be. And
09:20
that's how we got a bunch of interesting stuff on top of the already imported OSM data to handle the zooming and visualization inside the CartoDB platform, our base map infrastructure. For example, we used materialized views to filter the data, the sections of OSM that
09:43
were going to be relevant at each zoom, to avoid transferring too much data to the tiler server. We used materialized views because they're very handy and, since PostgreSQL 9.4, they can be refreshed concurrently, so it made sense to use them as
10:01
some kind of mirror of the OSM data, which we keep updating with Imposm and Osmosis, holding the filtered data for the base map, plus a lot of SQL magic to make sure that the data matches its zoom, etc. And this was
10:20
done by the awesome guys at Stamen, who helped us with all these things. Then we also got to the development process. We started developing the base maps in TileMill, which also uses CartoCSS, so it pretty much matched what we wanted to display,
10:42
but then we found some issues, because its CartoCSS handling is slightly different from the one in our CartoDB Maps API, so we decided to go look for something else. While the initial development was done in TileMill,
11:02
we kept iterating and made another draft editor in HTML with just CartoDB.js, but we ended up creating another cool way to create base maps on top of CartoDB, which is the Atom base map editor. It is not really a base map editor per se,
11:22
but it's a plugin you can put on top of Atom which will connect to your CartoDB account and allow you to easily edit some CartoCSS, automatically push it to CartoDB,
11:43
and have a preview window to display the data set you've shown and the visualization you're creating. So this is it, and it's cool because you can just change anything. For example, I'm going to change the water color, because water is like blue, don't you think?
12:01
So, with a bunch of cool plugins for editing and linting and CartoCSS editing on top of Atom, we felt it was the right ecosystem to fit in. You just save, and it automatically gets pushed to CartoDB and generates a new base map with the style you've sent. And you can not only change the
12:25
style, which has its own hierarchy and variables and all the common CartoCSS things, but you can also change the queries that are applied to the map and to each layer. So you can, for example (I just opened a new one),
12:43
apply any kind of query. This is a cool example: you can just apply an ST_Transform and have the same map dynamically rendered and reprojected into another projection. This is Robinson. Another advantage of this is that all the base maps are rendered using our existing infrastructure,
13:06
which is focused on dynamic mapping and making sure that things get updated quickly. So between the layers of that base map you could also mix in your own data set or your own
13:24
information that you can keep updating using the CartoDB editor, the SQL API, or whatever way of accessing CartoDB you want to use. You can use them inside the base map as masks, overviews, filtering using SQL, or any kind of combination you want to achieve, as you're pretty much setting the SQL and CartoCSS and we're just rendering. And
13:43
we also have all the caching and invalidation mechanisms, both with our local Varnish and a CDN, which is Fastly, to make sure the assets are kept updated in almost real time. So we kept experimenting on these base maps and
14:02
developing new features that I'm going to go through briefly. The first one is sandwiches. Sandwiches in the sense that our base maps, as I already said a couple of times, were designed for data visualization, and with data visualization you often get things like this. This is a map
14:23
whose data doesn't have very much transparency, so it's covering all the layers. So what do you do? Well, it sounds quite simple, but it has a little slice of complexity: you can just put the labels on top. So what we did is, we're releasing another
14:43
as the base map project was quite structured in layers, we just released another, different layer which only had the labels, so using Leaflet and CartoDB.js you can put the base map, which is the
15:01
thing you see behind the blue mass; on top of it you put whatever you want to visualize; and on top of that you put the labels, and it makes for a nice visual change. It's a small detail, but this affected a lot of pieces of our stack. We had a few styling issues, like, well, you have to alter the style and adjust the labels, but
15:26
it was pretty simple to make. But we also went and implemented some other things all over CartoDB, which ended up exploding and affecting a lot of pieces of our stack, because right now the CartoDB editor is using the sandwich labels by default, without telling you, and
15:46
most people didn't even notice, which I think is pretty cool because it feels natural. We implemented quite a few things in our tiling server to be able to cope with all of this, and
16:01
with the preview visualizations of your base maps, of your maps, I mean, sorry. The previews in the first iteration of the editor were actually using Leaflet and loaded all three layers, but we wanted to go a step further and be able to serve an actual image for your whole visualization, including the base map,
16:27
the labels and, of course, your data. We went about this by using the Maps API that we already have, which is based on Windshaft, which is based on Mapnik,
16:41
adding what we call the sandwich mode, which basically means that you are no longer limited to requesting Mapnik layers with a style, CartoCSS, and queries, but you can also request an HTTP layer, any HTTP layer that serves XYZ tiles. In this case it's our own base map, but you can put in pretty much any other base map, and
17:01
it will request all the layers, compose them together, and serve them as another map, which is just a combined version of that map in JPEG or PNG. Another thing we added for the dashboard is that you're no longer confined to requesting XYZ tiles: we also made a static map API, which means that you can request: give me this image for a given zoom and
17:26
centered on these coordinates and of a given size, and you can just alter the parameters in the URL, the ones that you see here at the end; you can change and tweak the map the way you want to display it.
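To make the center, zoom, and size idea concrete, here is a minimal Python sketch of how such a request could be turned into the list of XYZ tiles that need to be rendered. It assumes standard 256-pixel Web Mercator tiles; the function names are hypothetical and this is not CartoDB's actual static map implementation.

```python
import math

TILE_SIZE = 256  # assumed tile edge in pixels (standard XYZ tiles)


def lonlat_to_pixel(lon, lat, zoom):
    """Project lon/lat in degrees to global Web Mercator pixel coordinates."""
    world = TILE_SIZE * 2 ** zoom  # width and height of the whole world in pixels
    x = (lon + 180.0) / 360.0 * world
    siny = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + siny) / (1 - siny)) / (4 * math.pi)) * world
    return x, y


def tiles_for_static_map(lon, lat, zoom, width, height):
    """Hypothetical helper: (zoom, x, y) tiles covering a width x height image.

    The image is centered on lon/lat. Tiles outside the world are clamped
    away; wrapping around the antimeridian is not handled in this sketch.
    """
    cx, cy = lonlat_to_pixel(lon, lat, zoom)
    last = 2 ** zoom - 1  # highest valid tile index at this zoom
    x0 = max(0, math.floor((cx - width / 2) / TILE_SIZE))
    x1 = min(last, math.ceil((cx + width / 2) / TILE_SIZE) - 1)
    y0 = max(0, math.floor((cy - height / 2) / TILE_SIZE))
    y1 = min(last, math.ceil((cy + height / 2) / TILE_SIZE) - 1)
    return [(zoom, x, y) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
```

A server built along these lines would render or fetch those tiles, stitch them, and crop the result to the requested size around the center.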
17:41
The last part is the systems part, which is the one I'm most involved with, because during the development of the base map we ended up improving our infrastructure by load testing, comparing new settings, and trying new things that we ended up
18:01
testing with the base maps but expanding to the whole of CartoDB. The first of them is metatiling. Metatiling is a simple concept: when you ask the tiling server, which is the one that serves the tiles, for a tile, instead of rendering just that tile, for example if you request this tile,
18:23
it will paint not only this tile but also a whole bunch of adjacent tiles and keep them saved, because, you know, most of the time it's going to be people viewing a large map and requesting a lot of tiles, so it's kind of intelligent, to reduce SQL queries,
18:43
to just paint a bit more, assuming that the user will request it. In Windshaft, metatiling is done by tilelive-mapnik, in fact with an internal cache, so when you request a tile, it automatically generates the adjacent tiles.
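The metatiling behaviour described here can be sketched in a few lines of Python. This is only an illustration under assumptions: a 2x2 metatile, an in-memory set as the cache, and hypothetical class and function names. It is not how Windshaft or Mapnik implement it.

```python
def metatile_members(zoom, x, y, metatile=2):
    """Return every (x, y) tile sharing a metatile with tile (x, y) at this zoom."""
    ox = (x // metatile) * metatile  # left column of the metatile
    oy = (y // metatile) * metatile  # top row of the metatile
    limit = 2 ** zoom  # number of tiles per axis at this zoom
    return [
        (tx, ty)
        for ty in range(oy, min(oy + metatile, limit))
        for tx in range(ox, min(ox + metatile, limit))
    ]


class MetatilingTiler:
    """Toy tiler: one render call paints a whole metatile and caches its tiles."""

    def __init__(self, metatile=2):
        self.metatile = metatile
        self.cache = set()  # (zoom, x, y) keys of tiles already painted
        self.renders = 0  # how many times the expensive render step ran

    def get_tile(self, zoom, x, y):
        if (zoom, x, y) not in self.cache:
            self.renders += 1  # stands in for the SQL query plus rendering
            for tx, ty in metatile_members(zoom, x, y, self.metatile):
                self.cache.add((zoom, tx, ty))
        return (zoom, x, y)
```

With this toy tiler, requesting the four tiles of one metatile triggers a single render, which is the saving metatiling is after.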
19:00
The problem with our existing stack is that it is more or less like this, the overall stack of the CartoDB system, our software-as-a-service system: we don't have only one tiler, we have more than one, and we balance among them using another upper layer, which is not shown here, which is nginx, which by default
19:24
uses a routing that is quite stupid, in the sense that it is just round-robin and spreads the requests across all tilers. The problem with that is that when you request a tile from a tiler, the tiler will paint all the adjacent tiles,
19:41
but that tiler will not serve the adjacent tiles, because round-robin will probably send those requests to another one, so we just end up painting like four times the number of tiles that we wanted to paint, for nothing. So we went exploring and hacking around and we found a very interesting way to do this, which is consistent hashing,
20:03
which just means that each request is assigned a hash that determines which server will serve it. And with nginx and OpenResty, which is a Lua environment on top of nginx that you can use to hook into requests, you can decide what to use to
20:22
calculate the hash. So after some exploring we came up with this simple piece of code, which just does some math operations, given how the quadtree works, to make sure that all the tiles that are contained in the same metatile for the same zoom are served by the same tiler server.
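The code shown on the slide is not in the transcript, so the following Python sketch is only a guess at the idea: derive the routing key from the metatile a tile belongs to rather than from the tile itself. It assumes a 2x2 metatile, uses a plain CRC32 modulo as a stand-in for the consistent hashing that nginx provides, and all names are hypothetical.

```python
import zlib


def metatile_key(zoom, x, y, metatile=2):
    """Routing key shared by every tile of the same metatile at the same zoom."""
    # Integer division drops the tile's position inside its metatile.
    return f"{zoom}/{x // metatile}/{y // metatile}"


def pick_tiler(zoom, x, y, tilers, metatile=2):
    """Choose a tiler deterministically from the metatile key.

    A plain modulo is used here for brevity; a real consistent hash would
    also keep most keys on the same server when tilers are added or removed.
    """
    key = metatile_key(zoom, x, y, metatile)
    return tilers[zlib.crc32(key.encode("ascii")) % len(tilers)]
```

In a real deployment the key would presumably also include the map or layergroup being requested, so that different maps spread across the tilers; that part is left out here.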
20:45
So it's a more optimal kind of routing for distributed tile-serving environments. And then the last thing we've done to play with this, to squeeze all the performance out of the serving, was
21:02
ditching WKB. WKB is the default transport format for PostGIS, and it uses an 8-byte float for each coordinate of a position. For example, imagine that you have a polygon with
21:20
a hundred points or vertices: it will transfer each of those vertices with that precision, and you usually never have that precision. I mean, you don't usually have subatomic precision in your visualizations; if you do, then you're the coolest man I've ever met.
21:43
We decided to explore how we could change this, and we ended up with something called Tiny Well-Known Binary (TWKB), which is a specification that we open sourced and want to build upon, and we work with some other people on it. It's equivalent to
22:02
Well-Known Binary, but it uses delta encoding and variable precision to make sure it fits, more or less, the precision that you want to display in a tile, because you don't want subatomic precision for displaying a 256 by 256 tile; usually a precision at which a pixel is a point is enough.
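The size difference between the two approaches can be illustrated with a small Python sketch. It compares raw 8-byte doubles, as in WKB coordinates, with coordinates rounded to a chosen precision, delta-encoded, and written as zigzag varints. This is a simplified illustration of the idea, not the TWKB wire format (no headers, geometry types, or metadata), and the function names are hypothetical.

```python
import struct


def zigzag(n):
    """Map signed integers to unsigned ones so small negatives stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)
            return bytes(out)


def encode_doubles(points):
    """WKB-style payload: two little-endian 8-byte floats per point."""
    return b"".join(struct.pack("<dd", x, y) for x, y in points)


def encode_deltas(points, precision=0):
    """Round to `precision` decimals, then store each point as a delta."""
    scale = 10 ** precision
    out = bytearray()
    prev_x = prev_y = 0
    for x, y in points:
        ix, iy = round(x * scale), round(y * scale)
        out += varint(zigzag(ix - prev_x)) + varint(zigzag(iy - prev_y))
        prev_x, prev_y = ix, iy
    return bytes(out)
```

Because neighbouring vertices are usually close together, most deltas fit in a single byte, while the double encoding always spends 16 bytes per point regardless of how much precision the tile can actually show.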
22:25
And, well, I have another talk about this in a later session, but Tiny Well-Known Binary basically helped us squeeze out a lot of performance and get a huge improvement, because the network, in this case of base maps, was one of our main bottlenecks, and
22:44
just by moving to Tiny Well-Known Binary (I think you can guess from this graph when we moved) we reduced our overall traffic between the database server and the tile server to 10% of what it was.
23:00
So yeah, this is how an evening hack ended up destroying and causing improvements all over the stack. And that's all. I think I'm here buried, so if you have any questions... Thank you. Any questions and comments?
23:26
No questions? No? Okay, thank you so much.