
Publishing INSPIRE datasets in GeoServer made easy with Smart Data Loader and Features Templating


Formal Metadata

Title
Publishing INSPIRE datasets in GeoServer made easy with Smart Data Loader and Features Templating
Title of Series
Number of Parts
351
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
GeoServer is a well-established multiplatform, open-source geospatial server providing a variety of OGC services, including WMS (view services), WFS and WCS (download services) as well as WPS (spatial data processing services). Among the open-source GIS web servers, GeoServer is well known for its ease of setup, the web console helping the administrator configure data and services, the variety of OGC services available out of the box, and the rich set of data sources it can connect to (open source, such as PostGIS, as well as proprietary, such as ArcSDE, Oracle or ECW rasters). GeoServer also provides several OGC APIs, including OGC API - Features, which recently attracted the interest of the INSPIRE community. As far as the INSPIRE scenario is concerned, GeoServer has extensive support for implementing view and download services thanks to its core capabilities but also to a number of free and open-source extensions; undoubtedly the most well-known (and dreaded) extension is App-Schema, which can be used to publish complex data models (with nested properties and multiple-cardinality relationships) and implement sophisticated download services for vector data. Based on the feedback of App-Schema users collected over the years, a new generation of open-source mapping extensions has been implemented in GeoServer: Smart Data Loader and Features Templating. These extensions are built on top of App-Schema and ease the mapping of data models by allowing us to act directly on the domain model and target output schema using a "what you see is what you get" approach. This presentation will introduce the new GeoServer Smart Data Loader and Features Templating extensions, covering in detail ongoing and planned work on GeoServer.
We will also provide an overview of how those extensions are serving as a foundation for new approaches to publishing complex data: publishing data models directly from MongoDB, embracing its NoSQL nature, and supporting new output formats like JSON-LD, which allows us to embed well-known semantics in our data. Finally, real-world use cases from organizations that have selected GeoServer and GeoSolutions to support their use cases will be introduced to provide attendees with references and lessons learned that could put them on the right path when adopting GeoServer.
Keywords
Transcript: English (auto-generated)
Hello everyone, good afternoon. I'm Nuno Oliveira, a software engineer at GeoSolutions, mostly working on the GeoServer ecosystem. GeoSolutions, as you may know, is a company with open source at its core: GeoServer, MapStore, GeoNetwork, these are some examples of the open source products we deal with.
In terms of open source, not only do we embrace it, we also participate in innovating it and moving it forward. This talk in particular is about INSPIRE support in GeoServer, as well as how we can map complex data to different output formats and integrate with different kinds of data storage.
In the past years this work was very focused on INSPIRE, and now we have developed a set of new tools that allow us to support the INSPIRE use cases but also go beyond them. So, in terms of INSPIRE support in GeoServer:
GeoServer is free and open source, so you can just download and use it without any licensing, and we are compliant with a bunch of standards. In terms of download services, which are usually the most complicated ones to comply with, we support WFS, the Web Coverage Service, and now also OGC API - Features. The recommendation to use OGC API - Features to publish INSPIRE download services was approved; that's where GeoJSON becomes central instead of GML, and I would say that's one of the main differences. In terms of view services, we support WMS and WMTS, and transformation services can be obtained with WPS and its transformation processes. In this presentation we are going to focus on the download services, which means that we have our data stored somewhere,
we have a target schema, and somehow we need to make both of them correspond to each other. So, this is how we used to do things: basically we have the App-Schema module, which allows us to read our data from a database storage, and we have to define a mapping in App-Schema (and sorry, I need to start the counter). So we basically have the data source, we have App-Schema in the middle, we have a target schema, we define a mapping between both, and then GeoServer takes care of doing everything. Over time several improvements were implemented and several data sources were supported: Postgres, Oracle, Solr, MongoDB, you name it.
Basically, the way to go was: I have my data, I have a target GML schema, and I define my mapping. This was working great until the last couple of years. Why? Because, you know, GML kind of lost, let's say, the first place; now we are discussing GeoJSON. So now we have all our data sets mapped, and we actually need to produce GeoJSON: not GML-flavored GeoJSON, but pure GeoJSON with properly structured data. And it not only needs to be efficient to produce, but also to query. It doesn't belong only to WFS anymore, it also belongs now to OGC API - Features, and we don't use only relational databases anymore, now we also use MongoDB. When we were using App-Schema, we had a tool to help us define those mappings: that was HALE.
So the current plugin available for App-Schema in HALE only supports HALE 3.5; let's see if before the end of this year we can actually upgrade it to support HALE 4, because when HALE migrated to version 4 it kind of broke all the plugins (it was quite a significant change in the way things were tied together).
So, long story short: right now we have our data source, we want to go from our data to a target schema, we need App-Schema in the middle, and we can use HALE to help us out. But still, one way or another, we have to define mappings, mappings that go from the data to the target GML schema, and we started to notice that this has
several issues. For all the examples I will show, I will use this meteo stations use case. There is a real use case with this name; this is not the same one, this is a simplified version. It's basically three tables, stations, observations and parameters, and they have relationships between them: a station has several observations, and each observation has a parameter. So the first station, from Alexandria, has two observations: one is for temperature, one is for wind speed. A very simple relational model.
It's time to go one step back and look a bit at what we are doing. Initially we have two needs: we have our data storage and we have our data modeling. Someone stored the data, because someone is producing it; we don't know, maybe it comes from sensor data, whatever the reason, we have our data storage. And then we have our data modeling: someone sat down and said, look, this is the way we should publish this data. There is of course the INSPIRE one, which is very well known, but there are other organizations that do exactly the same thing. Then we need, of course, to publish the data, and publishing the data is not only making it available: we also need to make it efficient to query, because if we have a production application with 1,000 users querying the data set, the system needs to be able to handle it and to efficiently translate the requests, that kind of stuff. And we kind of need to map everything together, right? The way it is done now is that basically the person doing the mappings needs to take everything into consideration: he needs to understand the modeling, he needs to understand the data storage (because he needs to understand the relationships in the data), he needs to understand how the data will be queried, and he needs to understand how the output will be built. This makes it quite a complex task to achieve.
And of course, I keep forgetting about it, GML was still leading the dance, because everything we did was with GML in mind: WFS was mostly GML, so that was the way things were done. And we see now, with the new OGC API - Features and similar ones, that GML is definitely not, let's say, the main format anymore; GeoJSON is slowly taking over that place. This is what the GML mapping examples look like: basically, from the data side, I have my stations table with my station name column, and I want it to become the name attribute on my target data model. Okay, this is great.
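For comparison, the classic App-Schema way of expressing that single column-to-attribute mapping is a fragment of an XML mapping file. The data store, table, namespace and element names below are illustrative, not the exact ones from the talk:

```xml
<!-- Illustrative App-Schema fragment: the station_name column of the
     stations table becomes gml:name on the target st:Station type. -->
<FeatureTypeMapping>
  <sourceDataStore>meteo_db</sourceDataStore>
  <sourceType>stations</sourceType>
  <targetElement>st:Station</targetElement>
  <attributeMappings>
    <AttributeMapping>
      <targetAttribute>gml:name</targetAttribute>
      <sourceExpression>
        <OCQL>station_name</OCQL>
      </sourceExpression>
    </AttributeMapping>
  </attributeMappings>
</FeatureTypeMapping>
```

A real mapping file repeats this for every attribute and relationship, which is exactly the per-column effort the speaker is describing.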
But if you want to output it as GeoJSON, well, this is kind of an extra step that is useless. And of course this forces you to know about the structure of the database, which is definitely something you may not know about, because who knows, maybe it's a completely different organization that tells you: look, there is the relational database, there is the modeling, and you just have to do the mappings between them. So basically we sat down together and thought about how we can lower the entry barrier for this, and use computers for what they are good at, which is
dealing with abstract numbers, with abstract stuff. So we thought: look, the storage, the data modeling and the publishing, that belongs to humans, because that's the stuff they are good at: understanding how I should model my data to store it, what my data model to publish should be, and the way I want to publish it, with WMS, WFS or OGC API, whatever. And we built a component that takes care of the rest: that component is the Smart Data Loader, which is able to look at a relational database. And I keep mentioning relational databases because they are the most well-known ones and the ones we actually support for now; they are very well designed and very mature systems, so when you have a relationship it is usually very well defined: I know my stations table depends on the observations one, the observations one depends on the parameters one. The computer can perfectly understand that structure and build the domain model for us, right? So let's automate all of that, and that's basically what we did.
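Conceptually, what the Smart Data Loader automates is this foreign-key walk: start from a root entity, follow the relationships, and derive a nested domain model. A minimal sketch of the idea, in plain Python over SQLite metadata (this is not GeoServer code, just an illustration of the technique):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parameters (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE stations (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE observations (
    id INTEGER PRIMARY KEY,
    station_id INTEGER REFERENCES stations(id),
    parameter_id INTEGER REFERENCES parameters(id)
);
""")

def referencing_tables(conn, table):
    """Find tables whose foreign keys point at `table`."""
    refs = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for t in tables:
        for fk in conn.execute(f"PRAGMA foreign_key_list({t})"):
            if fk[2] == table:  # fk[2] is the referenced table name
                refs.append(t)
    return refs

def build_domain_model(conn, root, seen=None):
    """Walk the relationships from the root entity into a nested model."""
    seen = (seen or set()) | {root}
    children = {}
    # one-to-many: tables whose foreign keys reference the root
    for t in referencing_tables(conn, root):
        if t not in seen:
            children[t] = build_domain_model(conn, t, seen)
    # many-to-one: tables the root references
    for fk in conn.execute(f"PRAGMA foreign_key_list({root})"):
        if fk[2] not in seen:
            children[fk[2]] = build_domain_model(conn, fk[2], seen)
    return children

model = build_domain_model(conn, "stations")
print(model)  # {'observations': {'parameters': {}}}
```

Starting from `stations`, the walk discovers that observations hang off stations and parameters hang off observations, which is the nested structure the Smart Data Loader proposes back to the administrator.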
But now we need a second step, because with the Smart Data Loader everything is automated: it reads the data and it gives us the domain model. But how do we make whatever we obtained in this automatic way correspond to the target output we are expecting? This is where the second tool comes in: the Features Templating mechanism.
In terms of workflow, it means that right now we have our data and we have our target modeling. The Smart Data Loader basically looks at the data model, walks through it, and automatically builds in memory an efficient stream of features; they will be complex, they will have relationships, they will have everything that's in the domain model. And then, for each service, we have Features Templating, which allows us, with content negotiation, to say: look, if the user is requesting this in this particular context, then we want to give him this GML, or this GeoJSON, or this JSON-LD (which is now also supported), or even this HTML. And the nice thing about this is that it now goes beyond INSPIRE: all these tools that were very specific to INSPIRE, and that in practice were only getting investment for INSPIRE, are now being used in a lot of other systems; I don't know, for publishing maritime data, because you want to have a nice HTML output for very old clients that want to integrate.
The Smart Data Loader is now a free and open source GeoServer extension. It can be downloaded, and what it does is quite simple to describe: it looks at your database,
it walks your structure, and it says: look, this is the domain model I was able to extract from your database, are you happy with it? If we say yes, then it builds in memory a stream of features that way, and it takes care of making sure they are efficiently represented, and that when we try to access the attributes to retrieve the data, it transparently does all of that for us: in a style, in a query we send, in anything that may enter the system. Of course, this is a video; it should play normally.
There we go. So basically, our first step is to add the store (I cannot see very well from here): we have to provide the database where we have our data storage, and we provide the access credentials to the database, of course.
Okay, so now we created our connection to our PostgreSQL database where our main data is stored. Now we go to the Smart Data Loader: we have to define a name for it, we select our database, we select our root entity, and it builds the domain model for us: okay, look, these are the relationships I understood. We can do some pruning, and we publish the data. It detects the geometry, the spatial attributes, and that's it, it's done. Before reaching this state, we had to find the target GML schema, we had to define an App-Schema mapping, we had to know very well how all the data was related together,
we had to define the mappings, test them, publish them, and so on. In less than 30 seconds we just published that data set, right? The only piece we put aside for the moment, which is the game changer, is that we got rid of the target schema. At this stage, and we'll see it now, it doesn't matter how many relationships our data has. As you will see now in the GML output, there we go, we have a complex structure; of course, the schema that is here matches the relational schema on the database. And as we can see, we can do the same for GeoJSON. There we go: in a couple of clicks we were able to publish our complex data model in different output formats. And of course, we are still tied up to that target, to that,
sorry, to that domain model that we defined on the database. So now it's time to take care of the second part: we are in INSPIRE and we are mandated to actually respect that very schema, so how do we do it? Well, that was a very tricky part, so we came up with the Features Templating mechanism, where basically we wanted two things. Look, we don't want to be tied too much to the target modeling; what we know very well is our domain model: we know we have stations, we know we have observations, we know parameters, we know their values. So we want to be able to use those attributes, to take our output format, just put them in whatever place, and start naming the attributes there. And it's up to the computer, because that's what computers do well, to understand that and make sure it efficiently retrieves the information it needs. Of course it's not, let's say, a straightforward thing to implement, but in terms of functionality it's quite powerful.
So that's basically what we did. A features template is a "what you see is what you get" approach; we'll see how it works in a couple of seconds, and it is integrated at this moment with OGC API - Features, WFS and WMS. We'll see later how we are going to deal with content negotiation. And it's super efficient. Why? Because it's dealing with an in-memory stream of features built by the already very efficient App-Schema machinery. And this is another very important point: if you already modeled all your data in App-Schema, if you already did a lot of investment there, guess what, you can just use Features Templating on top of it, right?
Okay. This is an example of what the JSON template looks like for our meteo stations use case. So basically I just say: look, this is my JSON,
I start writing it as if on a notepad, and I say: look, here I want my source to be the stations table, my identifier is the id attribute, my geometry is the position, my name is the common name. Wait, here I want to use a SQL function, I want to do a string concatenation; here I want the location, but you know what, I want it in WKT format, so I apply that function. What we see is what we get, exactly like that, right? We can see it here in more detail: basically, the templating engine interprets our template and just produces the output we want. And of course it is also able to understand the requests: if, when we get this JSON in our OpenLayers application or whatever, we say, look, give me all the stations where the value is over 30 degrees, it is able to translate that back to the original data source and return the result back to you, okay?
So this is a video; I will not show all of it because I'm running out of time. We developed a UI to deal with the feature templates themselves: basically we are able to write them down, very similar to the style editor of GeoServer. Okay, I can accelerate a bit. We have our first template, and this is the important bit: we also wanted to have a sandbox where we could actually try out our template, to be able to say, look, go to my data source, select that feature, and let me edit the template and see the changes in real time, that kind of stuff. And we have some types of formats that are tricky, for example GML, which needs to be validated against the schema, so we can perform a validation here; and even for formats like JSON-LD we are also able to perform a validation there. Okay.
For creating templates we have a UI, for validating the templates we have a sandbox, and then we have the last missing bit, which is content negotiation. So we have quite a powerful setup now: we have an efficient tool to read our data structure and build an efficient stream of data, we have the features template that allows us to map that data, and now we are missing content negotiation; maybe because we are experimenting with a new type of service, maybe because we want to differentiate between users. So there you go: we can define, let's say, content-negotiation rules based on, I don't know, whatever may be available in the request. For example, here we say that if the request is a WFS one and my layer is the GeoJSON one, then I want to use the GeoJSON output format.
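The rule just described (if the request is WFS and the layer is the GeoJSON one, use the GeoJSON output) can be pictured as the first matching predicate over the incoming request. A minimal sketch of that rule-matching logic in plain Python, not GeoServer's actual rule engine:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    """A content-negotiation rule: when the predicate matches the
    incoming request, serve the response with the given format."""
    predicate: Callable[[dict], bool]
    output_format: str

def negotiate(rules, request, default="GML"):
    """Return the output format of the first rule matching the request."""
    for rule in rules:
        if rule.predicate(request):
            return rule.output_format
    return default

# Example rule: WFS requests against the 'stations' layer get GeoJSON.
rules = [
    Rule(lambda r: r.get("service") == "WFS" and r.get("layer") == "stations",
         "GeoJSON"),
]

assert negotiate(rules, {"service": "WFS", "layer": "stations"}) == "GeoJSON"
assert negotiate(rules, {"service": "WMS", "layer": "stations"}) == "GML"
```

The real extension evaluates its conditions against request parameters, headers and the layer, but the first-match-wins shape is the same.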
Now, use case highlights: I will show some of the use cases that use this. In one of them we published around 1 million features (I don't remember exactly what the test case we did here was), and basically we were very easily able to publish that complex data set in an efficient way over WMS. We actually experimented with JSON-LD, embedding the context in the GeoJSON, and of course we were able to query it. This was a system that was actually published, it got real users querying it, and it worked quite well. The second one is a bit more advanced, with the Norwegian Public Roads Administration, where they have a huge amount of data stored in MongoDB, and here we had to make GeoServer go, let's say, schema-less. What does that mean? You do a WFS request, we don't know the schema, we have to build it on the fly with whatever data we are receiving, which means that, of course, you cannot obtain GML yet.
But it works quite well: you can get a bunch of complex data stored in your MongoDB, you can do queries (if an attribute doesn't exist, it will not complain), and you can use it for styling, where the styling will just skip if the property does not exist. Anyway, if you have MongoDB, you can give it a try. And that's all I have to say.