
Dynamic Styling For Thematic Mapping

Formal Metadata

Title
Dynamic Styling For Thematic Mapping
Title of Series
Number of Parts
183
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language
Producer
Production Year
2015
Production Place
Seoul, South Korea

Content Metadata

Subject Area
Genre
Abstract
Current web standards have facilitated the online production and publication of thematic maps as a useful aid to interpretation of spatial data and decision making. Patterns within the raw data can be highlighted with careful styling choices, which can be defined for online maps using tools such as Styled Layer Descriptor (SLD) XML schema. Dynamic generation of maps and map styles extends their use beyond static publication and into exploration of data which may require multiple styles and visualisations for the same set of data. This paper explores the application of thematic styling options to online data, including mapping services such as Open Geospatial Consortium (OGC)-compliant Web Mapping and Web Feature Services. In order to be relevant for both user-specified and automated styling, a prototype online service was developed to explore the generation of styling schema when given data records plus the required output data type and styling parameters. Style choices were applied on-the-fly and to inform the styling characteristics of non-spatial visualisations. A stand-alone web service to produce styling definitions requires a mechanism, such as a RESTful interface, to specify its own capabilities, accept style parameters, and produce schema. The experiments in this paper are an investigation into the requirements and possibilities for such a system. Styles were applied using point and polygon feature data as well as spatially-contextual records (for example, data that includes postal codes or suburb names but no geographical feature definitions). Functionality was demonstrated by accessing it from an online geovisualisation and analysis system. This exploration was carried out as a proof of concept for generation of a map styling web service that could be used to implement automated or manual design choices.
Transcript: English (auto-generated)
So, my name is Simon Moncrieff. I'm going to give a presentation based on a paper on dynamic styling for thematic mapping, which E.-K. Gulland and I wrote together for this.
So it's clear that things are moving more towards data exploration now. Data science and GIS are both the same; they're moving that way. This has been enabled by automated data processing, and that in turn enables hypothesis testing.
For example, you can interact with derived outputs of a dataset, explore it visually, and do both hypothesis generation, by exploring an aspect, and then hypothesis testing, by visualising it in a different way. So what I'm interested in is how we enable this: how do we enable dynamic styling in this sort of construct?
It's also clear that we need to adopt a user-driven approach to web GIS. That is, the user inputs a query, that query is essentially everything, and that gives a flexible approach.
Rather than a supply push, where you publish a result, this is more of a user pull: you let the user derive the result on the fly for themselves, based on the question they're interested in. The eventual aim is to develop a web service for this thematic visualisation. The idea is that you have a piece of data, it goes to the web service with some metadata tags or parameters on how to style it, and the service generates the visualisation.
So this paper is really just an exploration of methods, functionality and parameters: different ways to answer different questions visually from some data.
As I said, this is about data exploration, and in particular I'm interested in the presentation of data. We have data access through WFS and other means such as REST queries, and with WPS we're also introducing data interpretation. This is very query driven: you input a complex dataset and process it to produce an output.
The final step is data presentation: how do you present this output? That is the thematic styling. There are a couple of phases, or different types, of thematic styling. One is to present a result, so in this case you're publishing results derived from data: population density, that sort of thing.
This is a static result, so you can apply a known styling technique. It can still be dynamic styling, because you can tie it to the Z value and that sort of thing, but the result is already known.
The next one is what I'm referring to as interactive data. Here you're really trying to present data to a user, rather than a derived result that someone else has produced, so you let the user answer the question for themselves. This is far more dynamic, because the data is processed on the fly, using a virtual layer.
What happens there is that you don't know the final layer a priori, so you have to give the user flexible methods to style this data presentation. That flexibility is crucial because, as I said, we don't know what questions the user is going to ask of the data, but we want to enable them to interact with it.
A bit of a sidebar: I guess part of what is driving this is information visualisation, which is about interactively presenting data visually and enabling a user to explore and identify patterns within the data.
Part of this is to make it very dynamic and to try to visually encapsulate underlying trends within the data, which can sometimes become very visible. Humans are good at pattern recognition, so given the right visualisation some of these trends become very evident.
Take outlier detection: when you look at a graph, you can see the outlier. In spatial data, the equivalent is thematic maps, for example a choropleth. That is essentially polygons where each polygon is coloured according to the value within it. To derive this, you can use a map classification technique.
That is a method to partition the feature space of the polygons, followed by a choice of colour to represent the value within each polygon. There are a number of methods we can use, but which is the right one? Should it be a data-specific method that determines the map classification, the colouring and other styling factors, or is it question specific, depending on what question the user is asking?
I think it's actually both: it depends on the data and on the situation. The other thing I want to introduce is the visualisation process: you extract data and then you render it.
When you're doing analysis and presenting the data, you can actually derive multiple results from one data set, and for each result you can then produce multiple visualisations. This is a form of flexibility, because, as I said, you don't know the pattern in the data beforehand.
You want the user to be able to tease that out, so you have to give them the flexibility to view it in multiple ways to try to find the information they're looking for. And again, we can produce multiple views simultaneously; that's normal. Now, just a quick introduction to map styling: map classification and the Styled Layer Descriptor.
With a Styled Layer Descriptor in GeoServer, you have an XML document which defines how a value in a polygon or a point should be drawn: the colour, that sort of thing. To do this, we classify the feature space into n discrete categories using different criteria: equal intervals, natural breaks, those sorts of map classification algorithms.
PySAL is really good for this, and you can just make it available through a web service. It becomes a nice map classification service that returns the bins, the counts per bin, the upper and lower bounds and that sort of thing. For colour schemes, I adopted ColorBrewer schemes, which are sort of derived from Matplotlib.
If you like, you can derive n values in a colour scheme automatically, or you can store 256 values in a database and just pick the ones you want. There are advantages to both. The other option to consider is where this style descriptor is generated.
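The map classification step described above can be sketched roughly as follows. This is a minimal pure-Python stand-in, not the PySAL implementation used in the talk: the function names `equal_interval_bounds`, `quantile_bounds` and `summarise` and the sample `rates` list are hypothetical, chosen only to show the kind of bins, counts and bounds such a service might return.

```python
from bisect import bisect_left

def equal_interval_bounds(values, k):
    """Upper bounds of k equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Use hi itself as the last bound so rounding never excludes the maximum.
    return [lo + width * (i + 1) for i in range(k - 1)] + [hi]

def quantile_bounds(values, k):
    """Upper bounds of k bins holding (as nearly as possible) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    # The last member of bin i has 1-based rank ceil(n * (i + 1) / k).
    return [ordered[(n * (i + 1) + k - 1) // k - 1] for i in range(k)]

def summarise(values, upper_bounds):
    """Return one dict per bin with its lower bound, upper bound and count."""
    counts = [0] * len(upper_bounds)
    for v in values:
        # A value goes into the first bin whose upper bound is >= the value.
        counts[bisect_left(upper_bounds, v)] += 1
    lowers = [min(values)] + list(upper_bounds[:-1])
    return [
        {"lower": lo, "upper": up, "count": c}
        for lo, up, c in zip(lowers, upper_bounds, counts)
    ]

# Hypothetical attribute values with one large outlier.
rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
equal_bins = summarise(rates, equal_interval_bounds(rates, 3))
quintile_bins = summarise(rates, quantile_bounds(rates, 5))
```

With this sample, the equal-interval bins put nine of the ten values in the first bin, while the quintile bins hold two values each, which is the kind of difference the talk returns to later when comparing the two methods.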
It can be generated on the server. With a GeoServer WPS, the server generates the Styled Layer Descriptor. With WMS, you can specify the style descriptor, and there is also the option of a Styled Layer Descriptor server: you give it some data and it gives you back the styling. And then it can also be generated on the client.
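As an illustration of what generating a style descriptor from classified bins might look like, here is a hedged sketch that builds a small SLD 1.0 document with Python's standard `xml.etree.ElementTree`. The function `build_sld`, the layer name `health_areas`, the attribute `rate` and the example bins are all hypothetical; the talk does not show the actual service code.

```python
import xml.etree.ElementTree as ET

SLD = "http://www.opengis.net/sld"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("sld", SLD)
ET.register_namespace("ogc", OGC)

def _el(parent, ns, tag, text=None, **attrs):
    """Append a namespaced child element, optionally with text and attributes."""
    el = ET.SubElement(parent, f"{{{ns}}}{tag}", attrs)
    if text is not None:
        el.text = str(text)
    return el

def build_sld(layer, attribute, bins):
    """Build one polygon rule per bin; each bin needs lower, upper and colour."""
    root = ET.Element(f"{{{SLD}}}StyledLayerDescriptor", {"version": "1.0.0"})
    named = _el(root, SLD, "NamedLayer")
    _el(named, SLD, "Name", layer)
    fts = _el(_el(named, SLD, "UserStyle"), SLD, "FeatureTypeStyle")
    for b in bins:
        rule = _el(fts, SLD, "Rule")
        _el(rule, SLD, "Title", f"{b['lower']} to {b['upper']}")
        # PropertyIsBetween is inclusive at both ends, so a value exactly on a
        # shared boundary matches two rules; the later rule is drawn on top.
        between = _el(_el(rule, OGC, "Filter"), OGC, "PropertyIsBetween")
        _el(between, OGC, "PropertyName", attribute)
        _el(_el(between, OGC, "LowerBoundary"), OGC, "Literal", b["lower"])
        _el(_el(between, OGC, "UpperBoundary"), OGC, "Literal", b["upper"])
        fill = _el(_el(rule, SLD, "PolygonSymbolizer"), SLD, "Fill")
        _el(fill, SLD, "CssParameter", b["colour"], name="fill")
    return ET.tostring(root, encoding="unicode")

# Hypothetical two-class example.
example_bins = [
    {"lower": 0.0, "upper": 0.5, "colour": "#d73027"},
    {"lower": 0.5, "upper": 1.0, "colour": "#1a9850"},
]
sld_xml = build_sld("health_areas", "rate", example_bins)
```

The same function could sit behind a server endpoint or be ported to the client; the sketch only covers the descriptor itself, not how it is handed to a WMS.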
In that case the user selects a classification and a colour scheme, and then the colour of the map changes. So this is, at a very broad level, the base architecture required to do this sort of thing. You have a view setting, which is essentially: do you style based on the local extent, the extent of the map that the user is looking at, or do you style based on all the data in the data set?
Then there is the analysis method, which is a map classification and a colour scheme. The classification is determined, the colours are determined based on the number of bins, and then the data vectors are input as the feature space.
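Two of the choices in this architecture, which records feed the classification and how n colours follow from the number of bins, might be sketched like this. It is an assumption-laden illustration: `values_for_view` and `sample_ramp` are hypothetical helpers, records are simplified to points with `x` and `y` keys, and colours are derived by plain linear interpolation between anchor colours rather than by any particular published scheme.

```python
def values_for_view(points, attribute, bbox=None):
    """Collect the attribute values to classify.

    bbox is (min_x, min_y, max_x, max_y) for the visible map extent;
    None means classify on every record in the data set.
    """
    selected = []
    for p in points:
        inside = bbox is None or (
            bbox[0] <= p["x"] <= bbox[2] and bbox[1] <= p["y"] <= bbox[3]
        )
        if inside:
            selected.append(p[attribute])
    return selected

def _hex_to_rgb(colour):
    colour = colour.lstrip("#")
    return tuple(int(colour[i:i + 2], 16) for i in (0, 2, 4))

def sample_ramp(anchors, n):
    """Derive n hex colours by linear interpolation along the anchor colours."""
    stops = [_hex_to_rgb(a) for a in anchors]
    colours = []
    for i in range(n):
        # Position along the ramp, from 0 to len(stops) - 1.
        pos = 0.0 if n == 1 else i / (n - 1) * (len(stops) - 1)
        j = min(int(pos), len(stops) - 2)
        t = pos - j
        rgb = tuple(round(a + (b - a) * t) for a, b in zip(stops[j], stops[j + 1]))
        colours.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colours

# Hypothetical records: style on the visible extent or on everything.
points = [
    {"x": 115.8, "y": -31.9, "rate": 0.2},
    {"x": 116.0, "y": -32.1, "rate": 0.7},
    {"x": 121.5, "y": -30.7, "rate": 0.9},
]
local_values = values_for_view(points, "rate", bbox=(115.0, -33.0, 117.0, -31.0))
all_values = values_for_view(points, "rate")
grey_ramp = sample_ramp(["#000000", "#ffffff"], 3)
```

Storing a fixed 256-colour table, the alternative mentioned in the talk, would replace `sample_ramp` with a lookup that picks evenly spaced entries.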
Now that data vector can be anything. It can come from a WFS service, a WPS service, that sort of thing. The idea is to make it very RESTful-ish, and then you can output vectors. But on top of that, what are the parameters? There are a lot, and this list isn't exhaustive.
First there is the styling attribute or attributes: which attributes in this virtual layer that's created on the fly should be used to create the theme. For polygons, you can have a boundary with colour, thickness, opacity and that sort of thing, but this can be done on the client, so it's less crucial. And do you provide a label? Is that going to interfere with the visualisation?
Point options are a bit more fun. You can have an X and Y radius, so you can have an ellipse if you like: X and Y can differ and can be linked to different variables within the data set. The radius can also be determined relative to the map. Do you include a label or not? A label will provide context, but colour is, I guess, more intuitive.
And then there are the opacity, border colours and that sort of thing. What I'm going to do now is present a whole bunch of different visualisations of some data that I've worked with in the last year or so. One is a health data set, which was 11 million hospitalisation records.
I calculate summary statistics using a WPS, and the thematic map is the output. Then there is sensor data, so gauge data, for example rainfall. The health data is spatially contextual: a spatial context is applied to the records for counting, but it's not technically a spatial data set.
So the way to view it is very much as polygons, summarising the values within each polygon. The sensor data was point data: rainfall gauges, stream flow gauges and bore gauges for aquifer data. And the final one I looked at is service data.
I actually looked at bus stops providing public transport services, but this can be applied to health services, hospitals, that sort of thing as well. The basic interaction that I'm envisioning is that the data is input and then you can specify the style and the parameters; there's extra metadata I use at the moment.
The system injects the metadata to determine the styling, and some of it can also come from the user. But my real question is: what do I need to supply to a user for them to theme the data in a way that makes sense? The style service will then determine the style. If it gets a GeoJSON file as input, say, it can inject the colour into each polygon, point or other feature.
That can then be viewed as a chart, a table or a map on the client side. You can also produce, essentially, a WMS styled thematically, which then just goes into a slippy map client.
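Injecting a colour into each GeoJSON feature, as described above, could look something like the following sketch. The function `inject_colours`, the `fill` property name, the grey fallback for missing values and the sample features are all assumptions made for illustration; the talk does not specify how the prototype stores the colour.

```python
import copy
from bisect import bisect_left

def inject_colours(collection, attribute, upper_bounds, colours, missing="#cccccc"):
    """Return a copy of a GeoJSON FeatureCollection with a colour per feature."""
    styled = copy.deepcopy(collection)  # leave the caller's data untouched
    for feature in styled["features"]:
        value = feature["properties"].get(attribute)
        if value is None:
            colour = missing
        else:
            # First bin whose upper bound is >= value; clamp values above the top.
            idx = min(bisect_left(upper_bounds, value), len(colours) - 1)
            colour = colours[idx]
        # "fill" is an assumed property name that a client renderer would read.
        feature["properties"]["fill"] = colour
    return styled

# Hypothetical input: two points with a value and one without.
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [115.86, -31.95]},
         "properties": {"name": "A", "coverage": 0.2}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [115.90, -32.00]},
         "properties": {"name": "B", "coverage": 0.9}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [116.00, -32.05]},
         "properties": {"name": "C", "coverage": None}},
    ],
}
styled = inject_colours(collection, "coverage", [0.5, 1.0], ["#d73027", "#1a9850"])
```

Because the colour travels with each feature, the same styled collection can drive a map, a chart or a table on the client without repeating the classification.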
And the final output is the style descriptor itself: the n bins, with the counts per bin, can also be used as a sort of summary of the data set. So the first example is essentially a calculation of the probability of access to a service. It's frequentist, so it's not hugely accurate, but it's essentially showing, within a region, the probability of people having access to a service.
The one on the left is styled using equal intervals with 10 intervals, so probability between 0 and 0.1, and so on. It very nicely partitions the probability space into intuitive numbers.
The one on the right is more geared towards answering a specific question. In this case, if you're aiming to provide at least 75% coverage to a region, so a probability of 0.75, you can use that as a pivot point and have a divergent colour scheme around it. Very quickly you can see red: OK, those areas need work.
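A divergent scheme around a pivot, as in the 0.75 coverage example, might be computed along these lines. This is only a sketch: `diverging_colour` is a hypothetical function, and the particular red, white and green RGB values are assumptions, not the colours used on the slide.

```python
def diverging_colour(value, pivot, lo, hi,
                     low_rgb=(215, 48, 39),     # red: below target
                     mid_rgb=(255, 255, 255),   # white: at the pivot
                     high_rgb=(26, 152, 80)):   # green: above target
    """Blend from white at the pivot towards red (below) or green (above)."""
    if value < pivot:
        span = pivot - lo
        t = (pivot - value) / span if span > 0 else 0.0
        end = low_rgb
    else:
        span = hi - pivot
        t = (value - pivot) / span if span > 0 else 0.0
        end = high_rgb
    t = min(max(t, 0.0), 1.0)  # clamp values outside lo..hi
    rgb = tuple(round(m + (e - m) * t) for m, e in zip(mid_rgb, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Probability of access, with a 75% coverage target as the pivot.
at_target = diverging_colour(0.75, pivot=0.75, lo=0.0, hi=1.0)
no_access = diverging_colour(0.0, pivot=0.75, lo=0.0, hi=1.0)
full_access = diverging_colour(1.0, pivot=0.75, lo=0.0, hi=1.0)
```

Note that the two arms are scaled separately, so the short range above the pivot (0.75 to 1) uses the full green ramp just as the longer range below it uses the full red ramp. Whether that asymmetry is desirable is a design choice the sketch does not settle.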
White is about right and green is good. So that one answers a specific question: my aim is to have 75% of the population covered, so where isn't it? That visualisation is very much geared to that question, whereas the equal intervals one is more: OK, this is my distribution between 0 and 1.
This next one is a rate ratio, which is a disease prevalence rate compared to another rate. In this case, for each area, essentially a census area, the rate is calculated for the area and the rate is calculated for the entire state of Western Australia, which is where I live. The two are compared, so red means the rate for the region is higher than the normal rate in the state, green is lower and white is about right.
Because of the way this one is calculated, it really only makes sense to visualise it with a divergent colour scheme, because it's essentially three classes: 0, 1 and 2. The scheme divides the three classes up in a way that makes sense. So giving a user the ability to change the colour scheme doesn't make sense here, because this is a result-driven visualisation of the data. You don't want to change it too much.
On the other hand, this one is a disease prevalence rate, so the rate of disease per person, smoothed.
The one on the left, again, is done using equal intervals. Because of outliers, equal intervals has a tendency to blanket everything within a few bins, then leave large gaps, and then you might have one value at the end. The one on the right uses quintiles, which divide the data into five bins with equal numbers of features.
So the one on the right shows a far better spread of where disease is occurring. The one on the left is very good for outlier detection, so if your question is "show me the outliers", that will work. But if it's "show me the distribution of the disease prevalence", the one on the right is a better way of showing that distribution, apart from the colour, because that's completely wrong. Green is the highest prevalence of the disease, which intuitively makes no sense, because you want red there. So this is one of the parameters: you have to be able to reverse a colour scheme.
Is high bad or is high good? In terms of coverage for a service, high is good. In terms of disease, it's not. The other part is that this really should be red, yellow, green or green, yellow, red, with red showing bad, because that's just how we interpret things.
Green is good, so a low rate is good; red is bad, so a high rate is bad. So while the map shows a better spread, the colour scheme just doesn't work. If you glance at it, you say, "Oh, green, that's fine." But then if you actually look at the legend you think, "Oh, well, not so much."
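The reverse parameter discussed here can be as small as a flag that flips the ramp. In this sketch the name `colours_low_to_high`, the `high_is_good` flag and the three hex colours are hypothetical, used only to make the point concrete.

```python
# A ramp ordered from "bad" to "good": red, yellow, green (assumed colours).
RED_YELLOW_GREEN = ["#d73027", "#fee08b", "#1a9850"]

def colours_low_to_high(bad_to_good_ramp, high_is_good):
    """Order the ramp so it runs from the lowest bin to the highest bin."""
    if high_is_good:
        return list(bad_to_good_ramp)        # low values are bad (red)
    return list(reversed(bad_to_good_ramp))  # low values are good (green)

# Service coverage: high is good. Disease prevalence: high is bad.
coverage_colours = colours_low_to_high(RED_YELLOW_GREEN, high_is_good=True)
disease_colours = colours_low_to_high(RED_YELLOW_GREEN, high_is_good=False)
```

Whether the flag comes from the user or from metadata attached to the data set is left open, as it is in the talk.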
So there are things to account for in this sort of styling, and giving a user a lot of leeway sometimes doesn't help. These next ones are based on gauge data, which is 40 years of daily readings. The way it's generally summarised is short term over long term: is it greater or less than, that sort of thing.
With the one on the left you can compare between points quite easily. The one on the right, though, is again a divergent scheme, based on the fact that a value of one means the short-term average is equal to the long-term average.
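The short-term over long-term summary might be computed roughly as below. This is a guess at the simplest form of the calculation: the function names, the `window` parameter, the 5% tolerance and the sample readings are all assumptions, and the real gauge processing may well differ.

```python
from statistics import mean

def short_over_long(readings, window):
    """Ratio of the mean of the last `window` readings to the mean of all readings."""
    long_term = mean(readings)
    if long_term == 0:
        return None  # ratio undefined, e.g. a gauge that has only recorded zeros
    return mean(readings[-window:]) / long_term

def describe(ratio, tolerance=0.05):
    """Classify a ratio around the pivot of 1 (short term equals long term)."""
    if ratio is None:
        return "no data"
    if ratio > 1 + tolerance:
        return "above long-term average"
    if ratio < 1 - tolerance:
        return "below long-term average"
    return "about average"

# Hypothetical gauge: four ordinary readings followed by two wetter ones.
readings = [2, 2, 2, 2, 4, 4]
ratio = short_over_long(readings, window=2)
label = describe(ratio)
```

A ratio like this has a natural pivot at 1, which is why a divergent scheme suits comparisons within a single sensor.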
So that one is actually better for comparing within a sensor's readings. It really depends on whether the user wants to compare sensors against each other or compare within one. You can also show multiple attributes (I've got to speed up). Essentially, in this one the radius is based on one attribute, the colour is based on another and the label is based on another, so we can put three attributes on there.
This is more for presentation. I call this a contextual visualisation, because you're maybe putting too much information in there for someone to interpret. If you want to publish this, it may be a good one.
But if you want to intuit it, maybe less so. This one is a radius relative to the ground, and it shows service coverage. The colour represents the number of units or people who have access to that bus stop. What I did here is make the radius the actual 500 m buffer, because that's how they calculate the coverage.
So you can visualise on the map the radius that they use for the buffer, draw the houses behind it, and see where the coverage is. Another way to view it is with the size representing the coverage.
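Drawing a radius relative to the ground means converting metres to screen pixels at the current zoom. A minimal sketch, assuming a Web Mercator slippy map with 256-pixel tiles (the talk does not state the projection or client), with hypothetical function names:

```python
import math

# Ground resolution at the equator for zoom 0 with 256-pixel tiles:
# 2 * pi * 6378137 m / 256 px.
EQUATOR_METRES_PER_PIXEL = 156543.03392804097

def metres_per_pixel(latitude_deg, zoom):
    """Approximate Web Mercator ground resolution at a latitude and zoom level."""
    return EQUATOR_METRES_PER_PIXEL * math.cos(math.radians(latitude_deg)) / (2 ** zoom)

def buffer_radius_px(radius_m, latitude_deg, zoom):
    """Pixel radius that represents a ground distance of radius_m metres."""
    return radius_m / metres_per_pixel(latitude_deg, zoom)

# A 500 m bus stop buffer near Perth's latitude at zoom level 15.
radius_px = buffer_radius_px(500, latitude_deg=-31.95, zoom=15)
```

The pixel radius doubles with each zoom level, so a client would need to recompute it whenever the user zooms.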
So size and colour in this case are matched. The third way to view it is as an ellipse, in which the Y radius is the number of houses covered, the X radius is the number of services per bus stop and the colour is again the same as the Y.
This one is quite useful. You can't see an example here, which is actually quite good. But if you had an elongated X, which is a large number of services, so an ellipse basically like that, you'd have a lot of services for not many people covered. So this is a very nice way to intuit that: OK, hang on, we've got to change that one, or why are we covering so much?
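The ellipse symbol described above, with each radius tied to a different attribute, could be derived as in this sketch. The attribute names `services` and `houses_covered`, the pixel range of 4 to 30 and the "twice as wide as tall" rule for flagging an elongated X are all assumptions for illustration.

```python
def scale_linear(value, v_min, v_max, out_min=4.0, out_max=30.0):
    """Map a value from its data range onto a pixel radius range."""
    if v_max == v_min:
        return (out_min + out_max) / 2
    return out_min + (value - v_min) / (v_max - v_min) * (out_max - out_min)

def ellipse_symbols(stops, x_attr="services", y_attr="houses_covered"):
    """Give each stop an X radius from one attribute and a Y radius from another."""
    xs = [s[x_attr] for s in stops]
    ys = [s[y_attr] for s in stops]
    symbols = []
    for s in stops:
        rx = scale_linear(s[x_attr], min(xs), max(xs))
        ry = scale_linear(s[y_attr], min(ys), max(ys))
        symbols.append({
            "id": s["id"],
            "rx": rx,
            "ry": ry,
            # Assumed rule: many services for few houses shows as a wide ellipse.
            "elongated_x": rx > 2 * ry,
        })
    return symbols

# Hypothetical bus stops.
stops = [
    {"id": "A", "services": 40, "houses_covered": 20},
    {"id": "B", "services": 5, "houses_covered": 300},
    {"id": "C", "services": 20, "houses_covered": 150},
]
symbols = ellipse_symbols(stops)
```

Here stop A comes out wide and flat, the shape the talk describes as a lot of services for not many people covered.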
With this one, you can essentially inject a Styled Layer Descriptor in the metadata. This is a fishnet grid applied to summarise the point sensor data within each grid cell. They have a very particular way of interpreting that, so that style can be injected.
What I also do with this one is something I'm calling a meta layer. In the metadata I put the points, so I can draw them in the same layer: that's one layer with the grid and the points used to derive the grid. I'm not sure if meta layer is going to work as a name; I'm still kicking that one around. You can also have linked visualisations, so again colour is linked.
In this one the left map is the male disease rate and the right map is the female disease rate. In the graph, the X axis is the left map and the Y axis is the right map. They're linked by colour, and you can link them by hover and all that sort of thing. It's just a quick way to do a spatial comparison, but you also have the graph to do the comparison as a scatter plot.
Which leads me to this. Some of the data I'm looking at comes from fields that aren't technically or traditionally GIS fields. Epidemiologists like tables, but that doesn't mean we can't put the cartographic styling inside the table and maybe start them intuiting that this colour scheme means this, so that when they see the map they can make that link.
And again you can do it with multiple dimensions. This one is a disease rate with the cartographic styling applied, and then a bunch of different attributes.
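Carrying the cartographic styling into a table could be as simple as reusing the map's bins and colours for cell backgrounds. The sketch below is hypothetical throughout: `styled_table`, the row keys `name` and `rate`, and the bounds and colours are invented for the example.

```python
from bisect import bisect_left
from html import escape

def styled_table(rows, attribute, upper_bounds, colours):
    """Render rows as an HTML table whose value cells reuse the map colours."""
    lines = ["<table>"]
    for row in rows:
        # Same bin lookup as the map: first bin whose upper bound is >= value.
        idx = min(bisect_left(upper_bounds, row[attribute]), len(colours) - 1)
        lines.append(
            '<tr><td>{}</td><td style="background-color:{}">{}</td></tr>'.format(
                escape(str(row["name"])), colours[idx], row[attribute]
            )
        )
    lines.append("</table>")
    return "\n".join(lines)

# Hypothetical areas with a disease rate, low (green) to high (red).
rows = [
    {"name": "Area 1", "rate": 0.8},
    {"name": "Area 2 & surrounds", "rate": 2.4},
]
table_html = styled_table(rows, "rate", [1.0, 2.0, 3.0],
                          ["#1a9850", "#fee08b", "#d73027"])
```

Because the table and the map share one classification, a reader who learns the colours in one view can carry them over to the other.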
So you can do a multi-dimensional analysis based on that. And again you can show multiple visualisations. This is the same map as the one we saw before, using quantiles, and this is showing the feature space, which is a way of showing the outliers.
There are basically two outliers there. So we can show the outliers while using a sensible styling method as well, and you get those two pieces of information at once. You don't have to do equal intervals and then do something else to get a better spread of the disease. And the last one is glyphs. Essentially, with the sad emoticon, there's no result here.
I'm thinking that, just as in normal maps a tree represents a forest, we should have iconography which represents something within the data. So if you can't generate a result, everyone recognises that sad face.
If the result is hidden due to privacy, there's actually an emoticon with a metal plate over the mouth, that sort of thing. And basically that's it. Any questions? Oh, I thought I had too much.
Okay, is there any question?
No question? Okay, I may add something. I really enjoyed your presentation, because many GIS tools give us many choices, but there is no philosophy, or agonising, over how to express the data, what is a lie, or what is right or wrong. I think you showed us what should be considered.
I don't have that answer either. This is more me trying to figure it out, gauging by users' reactions. You can tell sometimes when eyes light up: ah, okay, that's the one to show in that case. Other times people like both, and then other people like one and not the other. So the idea is to be flexible, but then how flexible is another question. And also, how do you get a user to say, "Okay, I'm really asking this question"?
You can't put that into a form. Essentially I'm after it in a general sentence.
Yeah, flexibility is a really important aspect of your presentation. You showed that one data set can be expressed as three different results, or go to many visualisations. But sometimes you can take two data sets and mix them together, right?
So yeah, from place.
Yeah, so I think all your presentation slides have one data set and some results and visualisations.
This one is one data set, but then I filter by male and female. So it's two results, and then there's a spatial visualisation and then the scatter plot visualisation. But again, the thematic styling is adopted in all visualisations.
Okay, good. Thank you very much, everybody, for your presentations and for listening. Thank you. Thank you.