
Crunching Data In GeoServer : Mastering Rendering Transformations, WPS Processes And SQL Views.


Formal Metadata

Number of Parts
295
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
This presentation will provide the attendee with an introduction to data processing in GeoServer by means of WPS, rendering transformations and SQL views, describing real applications and how these facilities were used in them. We'll start with the basic WPS capabilities, showing how to build processing requests based on existing processes and how to build new processes leveraging scripting languages, and introducing unique GeoServer integration features, showing how processing can seamlessly integrate directly in the GeoServer data sources and complement existing services. Moreover, we will show how to integrate on-the-fly processing in WMS requests, achieving high performance data displays without having to pre-process the data in advance, and allowing the caller to interactively choose processing parameters. While the above shows how to make GeoServer perform the work, the processing abilities of spatial databases should not be forgotten, so we will show how certain classes of processing can be achieved directly in the database. At the end the attendee will be able to easily issue WPS requests both for vectors and rasters to GeoServer through the WPS Demo Builder, and enrich SLDs with on-the-fly rendering transformations.
Transcript: English (auto-generated)
Welcome to our next session. Please come in, come forward. Mauro isn't that scary. So I'd like to introduce Mauro from GeoSolutions.
He's got a presentation here on one of the more exciting aspects of GeoServer, using it to crunch data with WPS and SQL views. And I'm going to get out of the way and pass it over to you. Thank you. Thank you, Jody. So we are going to talk a little bit about something exciting in GeoServer.
That is, the ability to process data in real time as you render your own maps. And we will see which tools GeoServer offers to do these kinds of things. I work for a company named GeoSolutions, like my colleague Andrea, who spoke in the last two talks.
So I'm basically here so that Andrea can rest a little bit, since he's a little bit busy today, with something like ten presentations. No, just five today. Just a little bit. So, as a colleague, I try to help him rest a little bit. Okay, so the first tool that GeoServer offers to crunch or process your data in real time
is probably one of the least known OGC protocols, WPS. WPS stands for Web Processing Service, so it's all about processing your data, either in real time or in batch.
So we will try to introduce it a little bit. It's an OGC protocol, and it's a way to standardize one of the most difficult things when we process data: the ability to do any kind of calculation and processing when dealing with our data.
Well, it's a little bit different from WMS, WFS or WCS, which are very specific protocols that do very specific things, while WPS tries to cover all the rest of the world. That is, the ability to do anything with your spatial data that is not covered by the other protocols.
This is an example of what you can do with WPS. So transforming some sort of geometry, doing some sort of geometric operation like buffering. And this is the way you can call WPS with an XML payload.
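The XML payload on the slide is not reproduced in the transcript. As a rough sketch of what such a request can look like, the Python below builds a WPS 1.0.0 Execute document for a buffer process. The process identifier `JTS:buffer` and the input names `geom` and `distance` are assumptions here (identifiers differ between GeoServer versions), so check the DescribeProcess output of your own server before relying on them.

```python
import xml.etree.ElementTree as ET

WPS_NS = "http://www.opengis.net/wps/1.0.0"
OWS_NS = "http://www.opengis.net/ows/1.1"
ET.register_namespace("wps", WPS_NS)
ET.register_namespace("ows", OWS_NS)


def _wps(tag):
    return f"{{{WPS_NS}}}{tag}"


def _ows(tag):
    return f"{{{OWS_NS}}}{tag}"


def build_buffer_request(wkt, distance, process_id="JTS:buffer"):
    """Return a WPS 1.0.0 Execute document (as text) for a buffer process.

    process_id and the input names "geom"/"distance" are assumptions.
    """
    execute = ET.Element(_wps("Execute"), {"version": "1.0.0", "service": "WPS"})
    ET.SubElement(execute, _ows("Identifier")).text = process_id
    inputs = ET.SubElement(execute, _wps("DataInputs"))

    # Geometry input, sent inline as WKT.
    geom = ET.SubElement(inputs, _wps("Input"))
    ET.SubElement(geom, _ows("Identifier")).text = "geom"
    geom_data = ET.SubElement(geom, _wps("Data"))
    complex_data = ET.SubElement(
        geom_data, _wps("ComplexData"), {"mimeType": "application/wkt"}
    )
    complex_data.text = wkt

    # Buffer distance, sent as a literal value.
    dist = ET.SubElement(inputs, _wps("Input"))
    ET.SubElement(dist, _ows("Identifier")).text = "distance"
    dist_data = ET.SubElement(dist, _wps("Data"))
    ET.SubElement(dist_data, _wps("LiteralData")).text = str(distance)

    # Ask for the raw result instead of a wrapping response document.
    form = ET.SubElement(execute, _wps("ResponseForm"))
    raw = ET.SubElement(form, _wps("RawDataOutput"), {"mimeType": "application/wkt"})
    ET.SubElement(raw, _ows("Identifier")).text = "result"

    return ET.tostring(execute, encoding="unicode")


if __name__ == "__main__":
    print(build_buffer_request("POINT(0 0)", 10))
```

The resulting text would be sent as the body of an HTTP POST to the WPS endpoint of the server.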
WPS has the ability to be used either synchronously, if you need to do real-time processing of data, or asynchronously for long-running processes. A common WPS setup is using WPS to interact with some remote service,
like WFS, WCS, or any other kind of HTTP-reachable service to fetch the data you need to process, and then interact through a WPS client to do the processing when needed.
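In WPS 1.0.0 the choice between synchronous and asynchronous execution is made in the `ResponseForm` part of the Execute request. The sketch below, using hypothetical helper names, shows the two variants: a raw output for a synchronous call, and a stored response document with status updates for a long-running one. In the asynchronous case the server answers immediately with a status document that the client polls until the process finishes.

```python
import xml.etree.ElementTree as ET

WPS_NS = "http://www.opengis.net/wps/1.0.0"
OWS_NS = "http://www.opengis.net/ows/1.1"


def response_form(output_id, asynchronous):
    """Build the ResponseForm element of a WPS 1.0.0 Execute request."""
    form = ET.Element(f"{{{WPS_NS}}}ResponseForm")
    if asynchronous:
        # Store the response on the server and publish status updates,
        # so the client can poll instead of holding the connection open.
        doc = ET.SubElement(
            form,
            f"{{{WPS_NS}}}ResponseDocument",
            {"storeExecuteResponse": "true", "status": "true"},
        )
        output = ET.SubElement(doc, f"{{{WPS_NS}}}Output")
    else:
        # Return the result directly in the HTTP response.
        output = ET.SubElement(form, f"{{{WPS_NS}}}RawDataOutput")
    ET.SubElement(output, f"{{{OWS_NS}}}Identifier").text = output_id
    return form


if __name__ == "__main__":
    print(ET.tostring(response_form("result", asynchronous=True), encoding="unicode"))
```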
When we talk about remote in this case, we can extend the meaning to also cover local data. It can be either remote or local data, especially when we talk about GeoServer, where WPS is very tightly integrated with the rest of the services.
As Andrea also said, other WPS implementations are usually standalone services, while in GeoServer WPS is an extension; it's not in the core itself. I would like to see it in the core sometime in the future. I don't know if it's in the plan, but I think it would fit well there.
Even though it is an extension, it is very tightly integrated with the rest of the system. For example, it can be used from other protocols like WMS to do just the processing part of rendering a map. This is the reason why we talk about WPS when we want to do real-time processing of data in our rendering workflow.
In GeoServer, you also have a way to easily create a WPS request, because as you have seen from the buffer example, writing the XML for even a simple WPS request by hand is not easy at all.
We have a little request builder integrated directly in the GeoServer UI that you can use to generate your request on the fly with a visual interface. The real power of WPS when integrated with WMS is the ability to create so-called rendering transformations.
It's the ability to transform your data as you do your rendering, and it can be done just by writing your process description in SLD or other kinds of styling languages
to process your maps through WMS. But WPS is not the only tool that you can use to do crunching or processing of your data. There are other tools that are quite useful for these kinds of things in GeoServer,
especially if your data source is a spatial DBMS, not a generic source like shapefiles or other kinds of sources. When you deal with a spatial DBMS, which is probably the preferred source in general, for vector data in particular, you have another tool that we call SQL views, and in particular parametric SQL views.
We will look at them in a little detail. Parametric views in particular are quite powerful because they allow you to take some input from the user for the processing, so the processing can be not only in real time but also dynamic and depend on what the user is asking,
and can make your maps very interactive and very dynamic. How does a parametric SQL view work? You basically can use them through the usual WMS or WFS clients.
You can interact in a more direct way with the underlying database. When you configure a layer in GeoServer to be used either by WMS or WFS, you usually only specify the table from the database that you are going to use,
while if you want to have a more detailed query of what you can extract from the database, with SQL views you can also specify a SQL query instead of a simple table, so that you can interact with your spatial database more tightly.
This is an example of a SQL query that you can specify in the SQL view configuration UI. The point is that you can use the real power of the underlying database in your query.
For example, if you use PostGIS, which is probably the most used DBMS with GeoServer, you can also use PostGIS functions like ST_MakeLine. In this example we are just transforming point data into lines, connecting the points in a particular way.
This is an example we have in our training where we are using storm data so that we can basically connect all the positions of a particular storm by a line. So we filter the points by the storm (the hurricane) they belong to, and we create a line that shows the evolution of the storm itself.
You can probably see, because it is a little bit bold, in this query we also have something that is like a variable, the one between the percent signs, and that makes the SQL view parametric, as we said.
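The query on the slide is not in the transcript, so the following is only a sketch of the mechanism. GeoServer replaces each `%name%` placeholder by plain text substitution, and each parameter can be given a default value and a validation regular expression. The Python below imitates that behaviour; the table and column names (`storm_obs`, `storm_name`, `obs_time`, `geom`) and the parameter names `start` and `end` are made up for illustration.

```python
import re

# Hypothetical SQL view definition in the spirit of the storm-track example.
STORM_TRACK_SQL = (
    "SELECT storm_name, ST_MakeLine(geom ORDER BY obs_time) AS track "
    "FROM storm_obs "
    "WHERE obs_time BETWEEN '%start%' AND '%end%' "
    "GROUP BY storm_name"
)

# Each parameter has a default value and a validation regular expression.
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"
PARAMETERS = {
    "start": ("2005-01-01", DATE_PATTERN),
    "end": ("2005-12-31", DATE_PATTERN),
}


def expand_view(sql, parameters, values):
    """Substitute %name% placeholders, rejecting values that fail validation."""

    def substitute(match):
        name = match.group(1)
        default, pattern = parameters[name]
        value = values.get(name, default)
        if not re.fullmatch(pattern, value):
            raise ValueError(f"invalid value for parameter {name!r}: {value!r}")
        return value

    return re.sub(r"%(\w+)%", substitute, sql)


if __name__ == "__main__":
    print(expand_view(STORM_TRACK_SQL, PARAMETERS, {"start": "2005-08-01"}))
```

Because the substitution is purely textual, the validation pattern is the only thing standing between user input and the SQL text, which is why a strict pattern matters.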
These parameters can be specified when we do the final WMS request, so we can change them in real time. In this case we are just specifying a time interval that the user can choose, for example through a visual time slider or something like that.
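In GeoServer the values are passed with the `viewparams` vendor parameter, as `name:value` pairs separated by semicolons, with commas and semicolons inside values escaped by a backslash. A minimal sketch of building such a GetMap URL follows; the base URL, layer name and parameter names are placeholders, not taken from the talk.

```python
from urllib.parse import urlencode


def escape_viewparam(value):
    """Escape the characters that separate viewparams entries."""
    return str(value).replace(",", "\\,").replace(";", "\\;")


def viewparams(params):
    """Format a dict as the value of the viewparams vendor parameter."""
    return ";".join(f"{name}:{escape_viewparam(value)}" for name, value in params.items())


def getmap_url(base_url, layer, bbox, params):
    """Build a WMS 1.1.1 GetMap URL for a layer backed by a parametric SQL view."""
    query = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "srs": "EPSG:4326",
        "bbox": ",".join(str(c) for c in bbox),
        "width": 768,
        "height": 384,
        "format": "image/png",
        "viewparams": viewparams(params),
    }
    return base_url + "?" + urlencode(query)


if __name__ == "__main__":
    # Hypothetical endpoint and layer; a time slider would change start/end.
    print(
        getmap_url(
            "http://localhost:8080/geoserver/wms",
            "training:storm_tracks",
            (-180, -90, 180, 90),
            {"start": "2005-08-01", "end": "2005-09-01"},
        )
    )
```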
Some examples of an application we built using this capability are these ones. We have a time slider in this case, very similar to what we talked about. We can also do some weird things like changing not only the data that comes to the query,
but also pieces of the query itself, like the operator we want to use for some sort of aggregation in this example. This uses a different query. In this case you can see we also have an OP variable that is not data,
but in this case the operation that we want to apply to our data. So we can make it parametric in a very complex way. Another thing that Andrea mentioned is that since we are accepting input from the user,
we need to validate that input, so we need to specify the valid values for the input. If not, we can have SQL injection problems. Another thing that WPS and other tools in GeoServer allow us to do
is to overcome the limitations that we have in the simpler protocols like WFS and WCS. For example, if we want to build a clip-and-ship service, it's very complicated to do that only using the basic protocols.
We need some sort of orchestration to do the complete workflow. And so WPS is also very good as an orchestrator of the operations that we do with the basic protocols like WMS and WFS and also WCS.
This is something we built using our front end, MapStore, to download large amounts of data asynchronously. And this is the UI we built for it. Basically, the user is able to build the buffer from a given geometry,
and the buffer can be calculated using the WPS buffer process. And the user is able to change the radius of the buffer in real time, and the buffer is calculated through a rendering transformation each time and shown on the map directly.
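The talk does not say how the radius reaches the style. One mechanism GeoServer offers for this is the `env` vendor parameter: the style reads a variable through the `env` function, and the client sets it per request as `name:value` pairs. The sketch below assumes that approach; the style name, layer name and the variable name `radius` are hypothetical.

```python
from urllib.parse import urlencode


def env_param(variables):
    """Format a dict as the value of the env vendor parameter."""
    return ";".join(f"{name}:{value}" for name, value in variables.items())


def buffered_getmap_url(base_url, layer, style, bbox, radius):
    """Build a GetMap URL whose style reads the buffer radius from env."""
    query = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": style,  # assumed to contain the buffer rendering transformation
        "srs": "EPSG:4326",
        "bbox": ",".join(str(c) for c in bbox),
        "width": 512,
        "height": 512,
        "format": "image/png",
        "env": env_param({"radius": radius}),
    }
    return base_url + "?" + urlencode(query)


if __name__ == "__main__":
    print(
        buffered_getmap_url(
            "http://localhost:8080/geoserver/wms",
            "demo:selection",
            "buffer_preview",
            (11.0, 43.0, 12.0, 44.0),
            250,
        )
    )
```

Each time the user moves the radius control, the client only has to request the map again with a different `env` value; nothing is stored on the server between requests.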
Then the download can be run on the given buffer using another WPS process, a custom WPS process for downloading data. And another process can be used to follow the status of the request because downloading a big amount of data can require time,
so we need some asynchronous process to be involved. And we can also monitor how that is going. This is a generic diagram of how these kinds of things can work. So we have different WPS processes that are orchestrated by the MapStore UI,
in this case, to get the final workflow. And WPS interacts with the other services and with the whole GeoServer configuration catalog to extract the data. Another thing that can be done with WPS is creating dashboards and widgets that aggregate your data
instead of showing it in a simple way. This one has also been done using MapStore. We use the aggregation processes that WPS includes to do charts.
These are some examples. And also the other kind of widgets that aggregate data in some way. There is a very simple wizard-like interface for creating these kinds of things. And then you get the final results. If you want to have something more complex like a dashboard,
this can be done too, leveraging WPS processes. Another example of rendering transformation. This time, this is working on raster data instead of using vector data. We can use the bands from our raster data to extract meaningful information.
For example, we want to identify in this example the vegetation status from a multiband image. We can use some of the raster algebra that is included in GeoServer and was mentioned in the previous presentations,
like the Jiffle algebra language, so that we can do calculations on the bands of our raster data. And we can build a style that uses this transformation together with a color map to get something like this.
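The transcript does not say which index was computed. A common choice for vegetation status is NDVI, which combines the near-infrared and red bands cell by cell, and the sketch below shows that arithmetic in plain Python under that assumption. In GeoServer the same expression would be written in the raster algebra script and evaluated per pixel at rendering time.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one cell."""
    total = nir + red
    if total == 0:
        # Avoid dividing by zero on empty cells.
        return 0.0
    return (nir - red) / total


def ndvi_band(nir_band, red_band):
    """Apply the index cell by cell to two bands given as lists of rows."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


if __name__ == "__main__":
    nir = [[3, 1], [0, 4]]
    red = [[1, 1], [0, 1]]
    for row in ndvi_band(nir, red):
        print(row)
```

The color map in the style then turns the resulting values, which range from -1 to 1, into the colors shown on the map.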
This is a different view of the raster that gives us the status of vegetation in the given area. Another capability of GeoServer WPS is the possibility to also run processes that are not implemented inside GeoServer itself as Java processes, but as external tools.
In this case, WPS becomes a real orchestrator of something that is processed in some other place. We call this support WPS Remote. Basically, GeoServer is a sort of broker that can run and monitor external processes running on other servers,
and can support different languages for writing the processes like Python, for example. Or it can simply run some external scripts, so you can do basically whatever you want on the external process. We use the XMPP protocol for communication between the nodes of this sort of a clustered executor.
It supports all the basic operations of WPS, including the dismiss operation and also more cluster-based things like load balancing.
And then the results get automatically or magically, as we say, ingested in GeoServer using proper converters. And this diagram shows a little bit how this basically works.
Okay, one last example of what you can do with all these tools put together, basically. So we will try to understand if we can use SQL views, parametric SQL views, and WPS all together to accomplish something that looks quite complex at first analysis.
This is a project I was involved in. It was basically the first work I did for GeoSolutions. I was probably hired to do this, and I did it for a couple of years at least. So it's a very complex project. The main feature of this project was to compute the risk of road accidents involving dangerous goods.
So chemicals, petrol, gases, and so on. We had a lot of objects to take care of, like the roads themselves, and everything that was around the road that could be involved in case of an accident.
So we needed to calculate the damage that a possible accident could cause. The objects that we had to handle were both human and non-human.
So that can be the resident population, or there can be hospitals, and stuff like that. Also vegetation. We had to use a very big road network to do our calculations. As we usually do to manage complexity, we tried to split the complex problem into simpler ones.
That was our first idea. So we divided all the road network in different aggregation sizes so that we can handle different kinds of information at different levels of aggregation.
We also had to calculate areas around the roads that could be affected by the accidents in different ways. So we had to calculate buffers around the roads of different sizes. In particular, we had around 50 different buffers that we had to calculate.
And we had different kinds of possible targets for our accidents. We were given a formula that, as you can see, is quite complex. It gives us the risk of an accident in a given road segment.
We had to calculate this kind of formula in real time. You can see that this is something that sums another sum, another sum, another sum. It's quite complex if you need to calculate it by hand, or even by using a computer.
It takes a lot of time if you do it completely in real time. So we had to find some tricks that allowed us to do this kind of calculation in a sort of real-time way. Finally, we also had to render all this data in a map.
This was our final purpose. And we basically had to color the road segments depending on the level of risk that we had calculated on that particular road segment. And this was happening at different levels on the road segments, but as we moved away from the roads,
we were doing area calculations instead of working directly on the network of the roads. One possibility was using SQL views, but that was not so efficient because the variants of parameters that we would need to enter in the SQL view itself were too many.
There were too many combinations. We could have decided to use a WPS process or a pure Java process to do the calculation,
that formula that you have seen, in real time, starting from the data coming from a database. But it was too much data to be calculated using our CPUs in real time. So we had to transfer some load into the database. We decided to do some pre-calculation that helped us make the real-time calculation faster.
So we split the work, with some pre-calculation done by WPS processes that were called asynchronously, so they can take all the time they need. Basically, calculating everything we needed for a region of Italy required three or four days of calculation.
So you can imagine that you cannot do these kinds of things completely in real time. But thanks to these pre-calculated buffers and pre-calculated things, we could simulate a real time calculation based on the input given from the user.
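The project's formula is not reproduced in the transcript, so the following is not that formula, only a toy illustration of the split described here. The expensive spatial part (how many targets of each kind fall inside each buffer of each road segment) is computed once and stored; at request time only a cheap weighted sum over those stored numbers remains. All names, data and weights are invented.

```python
# Offline step (hypothetical data): exposed targets per road segment and
# buffer distance in metres, computed once and stored.
PRECOMPUTED = {
    ("seg-1", 100): {"residents": 120, "hospitals": 0},
    ("seg-1", 500): {"residents": 400, "hospitals": 1},
}


def segment_risk(segment_id, accident_rate, scenarios, precomputed=PRECOMPUTED):
    """Combine stored exposure counts with parameters chosen by the user.

    scenarios is a list of (buffer_m, probability, weights) tuples, where
    weights maps a target kind to the importance the user assigns to it.
    """
    total = 0.0
    for buffer_m, probability, weights in scenarios:
        exposed = precomputed[(segment_id, buffer_m)]
        # Inner sum: weighted count of the targets inside this buffer.
        damage = sum(weights.get(kind, 0.0) * count for kind, count in exposed.items())
        # Outer sum: each scenario contributes according to its probability.
        total += probability * damage
    return accident_rate * total


if __name__ == "__main__":
    user_scenarios = [
        (100, 0.5, {"residents": 1.0}),
        (500, 0.25, {"residents": 1.0, "hospitals": 100.0}),
    ]
    print(segment_risk("seg-1", 2.0, user_scenarios))
```

The nested sums are still there, but they now run over a handful of stored numbers per segment instead of over the raw geometries.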
Because the real challenge of all this is that some of the parameters for the calculation were given by the user in real time through the web application.
So splitting the work, by pre-calculating everything that could be stored on disk, allowed us to do these kinds of things. So in real time we could calculate the color for risk on the road networks, the buffers, and also the involved targets of different types.
That was an example of a SQL view that we used for these kinds of things. And I think that's all. Any questions?
Hi, my name is Octavian. My question is regarding the SQL views. What's the limitation of unique points, the maximum number that you use for SQL views
in order to calculate statistics or aggregations? So how many millions or hundreds of millions of points are used? Well, in this latest example we were basically aggregating, I think, tens of millions of records.
So it really depends on the underlying database, since when we deal with SQL views the most difficult job, the aggregation, is done by the database itself. This is why we use the database to do it, because databases are usually quite good at doing that.
Especially with PostGIS, there are far fewer limitations if we use the database directly instead of doing the calculation inside our Java processes. The real power of SQL views is that they let the database do most of the work for you.
Yeah, that makes a lot of sense. Just to follow up on this question, what's the use case of having risk aggregated in real time to a road network? Because we have also done this, but we've done it on aggregated values worldwide and for a lot of data that we had to ingest. I don't understand the use case.
Well, the use case: the data is not real-time in the sense of traffic data arriving in real time. What we do in real time is create a thematic map of the network depending on the user inputs.
The user could enter something like 10 or 20 parameters. The scenario is: what happens if I have this kind of traffic at a certain time of the day? What is the risk of an accident? And that would be for car insurance companies who insure cars, in order to verify the risk?
No, in this case it was a public project commissioned by the European Community. Thank you. One more question, if anyone has one.
There are no more questions, so one observation on your question instead. In this project there was a lot of simulation, like: what if we consider only petrol transport, for example?
Or what if we add another hospital in this position? Or what if we build a school there? Or what if we remove an activity here or there? What if we wanted to look at only the weekend? Or what if we wanted to only look at the workdays? So lots of what-ifs that were driven by the user interface. So that's why it was fully interactive.
Second note: sooner or later you will stumble into a limit of the database's computation capabilities. At that point you have to switch to something that can compute on a grid. And then you can use GeoMesa or GeoWave to offload the computation to a distributed grid.
And still do it in real time with acceptable response times. Mauro, Andrea, do you have a booth that they can come see you at? Is this microphone actually on? Okay, so thank you for coming to our session on GeoServer. You've got five minutes before lunch.
Given the lineups we saw at the coffee break, I encourage everyone to head to lunch immediately. If you would like to ask me any questions, you can find me at the GeoCat booth today. And there's also a GeoSolutions booth if you'd like to follow up with these two lovely gentlemen. Thank you very much.