Publishing INSPIRE datasets in GeoServer made easy with Smart Data Loader and Features Templating
Formal Metadata

Title: Publishing INSPIRE datasets in GeoServer made easy with Smart Data Loader and Features Templating
Title of Series: FOSS4G Firenze 2022 (talk 180 of 351)
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifier: 10.5446/69064 (DOI)
Production Year: 2022
Transcript: English (auto-generated)
00:00
Hello everyone, good afternoon. I'm Nuno Oliveira, a software engineer at GeoSolutions, mostly working on the GeoServer ecosystem. GeoSolutions, as you may know, is a company with open source at its core: GeoServer, MapStore, GeoNetwork are some examples of the open source products we deal with
00:20
in terms of, let's say, open source: not only do we embrace it, we also participate in innovating it and moving it forward. This talk in particular is about INSPIRE support in GeoServer, and as well how we can map complex data to different output formats and integrate with
00:43
different kinds of data stores. In the past years this was very focused on INSPIRE, and now we have developed a set of new tools that still support the INSPIRE use cases but go beyond them. So, in terms of INSPIRE support in GeoServer:
01:03
GeoServer is free and open source, so you can just download and use it without any licensing, and it is compliant with a bunch of standards. If we draw the parallel, we can actually see it on the next slide. In terms of download services, which are usually the most complicated ones to comply with,
01:24
we support WFS, WCS, and now as well OGC API Features. The recommendation to use OGC API Features to publish download services was approved; this is where GeoJSON becomes central instead of GML. I would say that's one of the main differences in terms of these services.
01:44
We support WMS and WMTS, and transformation services can be obtained with WPS and the transformation process. In this presentation we are going to focus on the download services, which means that we have our data stored somewhere,
02:00
we have a target schema, and somehow we need to make them correspond to each other. This is how we used to do things: basically we have the application schema module, which allows us to read our data from a database. We have to define an application schema mapping, and sorry,
02:23
I need to start the counter. So basically we have the data source, we have the application schema mapping in the middle, we have a target schema, we define a mapping between them, and then GeoServer will take care of doing everything. Over time several improvements were implemented and several data sources were supported: Postgres, Oracle, Solr, MongoDB, you name it.
02:47
Basically, the way to go was: I have my data, I have a target GML schema, and well, I will define my mapping. This was working great until the last couple of years. Why? Because, you know, GML
03:02
kind of lost, let's say, the first place; now we are discussing GeoJSON. So now we have all our data sets mapped and we actually need to produce GeoJSON, and not GML-flavored GeoJSON but pure GeoJSON with properly structured data, and it not only needs to be efficient to produce
03:21
but also efficient to query. And it no longer belongs only to WFS; it now also belongs to OGC API Features. And we don't use relational databases any more, now we use MongoDB. So, when we are using application schema, where is the tool to help us define those mappings? That was HALE.
03:41
The current app-schema plugin available for HALE only supports HALE 3.5; let's see if before the end of this year we can actually upgrade it to support HALE 4, because when HALE migrated to 4 it kind of broke all the plugins; it was quite a significant change in the way things were tied together.
04:05
Long story short: right now we have our data source, we want to bring our data to a target schema, we need an application schema mapping in the middle, and we can use HALE to help us out. But still, one way or another, we have to define mappings; those mappings go from the data to the target GML schema, and we started to notice that this has
04:27
several issues. For all the examples I will show, I will use this meteo stations use case. There is a real use case with this name; this is not the same one, it is a simplified version. It is basically three tables, stations, observations and parameters, and they have relationships between them:
04:45
a station has several observations, and each observation has a parameter. So the first station, from Alexandria, has two observations: one is for temperature, one is for wind speed. A very simple relational model.
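As a rough illustration (table and column names here are assumptions; the talk only names the three entities and their relationships), the simplified relational model could look like this:

    -- Illustrative sketch of the simplified meteo stations model, not the real schema
    CREATE TABLE stations (
        id       SERIAL PRIMARY KEY,
        name     VARCHAR NOT NULL,             -- e.g. 'Alexandria'
        position GEOMETRY(Point, 4326)         -- station location (PostGIS)
    );

    CREATE TABLE parameters (
        id   SERIAL PRIMARY KEY,
        name VARCHAR NOT NULL                  -- e.g. 'temperature', 'wind speed'
    );

    CREATE TABLE observations (
        id           SERIAL PRIMARY KEY,
        station_id   INTEGER REFERENCES stations(id),    -- a station has several observations
        parameter_id INTEGER REFERENCES parameters(id),  -- each observation has a parameter
        value        DOUBLE PRECISION
    );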
05:01
It's time to go one step back and look a bit at what we are doing. Initially we have two needs: we have our data storage and we have our data modeling. Someone stored the data because someone is producing it; maybe it comes from sensor data, whatever the reason, we have our data storage. And then we have our data modeling: someone sat down and said, look, this is the way you should publish your data.
05:24
There is of course the INSPIRE one, which is very well known, but there are other organizations that do exactly the same thing. Then we need, of course, to publish the data, and publishing the data is not only making it available; we also need to make it efficient to query, because if we have a production application
05:43
with a thousand users querying the data set, the system will need to be able to handle it and to efficiently translate the requests, that kind of stuff. And we need to map everything together, right? The way it is done now is that basically the person doing the mappings needs to take everything into consideration:
06:05
they need to understand the data modeling, they need to understand the data storage, because you need to understand the relationships between the data; they need to understand how the data will be queried, and they need to understand how the data will be built. This makes it quite a complex task to achieve.
06:22
And of course, I keep forgetting about it: GML was still leading the dance, because everything we did was with GML in mind, because WFS was mostly GML. That was the way things were done, and we see now, with the new OGC API Features and similar ones, that GML is definitely no longer,
06:44
let's say, the main format; GeoJSON is slowly taking over that place. Exactly, this is what the GML-oriented examples look like: basically, starting from the data, I have my stations table with my station name column, and I want it to become the common name attribute on my target data model. Okay, this is great.
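For reference, a classic app-schema mapping fragment expressing roughly that idea might look like the sketch below; the data store, source table, target element and property names are illustrative, not the actual schema used in the talk.

    <!-- Illustrative app-schema mapping fragment: maps the stations "name" column
         onto a "commonName" property of a hypothetical target GML feature type -->
    <FeatureTypeMapping>
      <sourceDataStore>meteo_db</sourceDataStore>
      <sourceType>stations</sourceType>
      <targetElement>st:Station</targetElement>
      <attributeMappings>
        <AttributeMapping>
          <targetAttribute>st:commonName</targetAttribute>
          <sourceExpression>
            <OCQL>name</OCQL>
          </sourceExpression>
        </AttributeMapping>
      </attributeMappings>
    </FeatureTypeMapping>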
07:05
But if you want to output it as GeoJSON, well, this becomes an extra step that is useless. And of course this forces you to know about the structure of the database, which is definitely something you may not know about, because who knows, maybe it is a completely different
07:20
organization that tells you: look, there is the relational database, there is the data model, and you just have to do the mappings between them. So basically we sat down together and thought about how we can lower the entry barrier for this, and use computers for what they are good at, which is
07:42
dealing with abstract stuff. So we thought: look, the data storage, the data modeling and the publishing belong to humans, because that's the stuff they are good at, understanding how I should model my data to store it, what my data model to publish it should be, and the way I want to publish it, with WMS, WFS or OGC API,
08:02
whatever. And we built a component that takes care of the rest, and that component is the Smart Data Loader, which is able to look at a relational database; and I keep mentioning relational databases because they are the most well-known ones and the ones we actually support for now. They are very well
08:20
designed and very mature systems, so when you have a relationship it is usually very well defined: look, my stations table depends on the observations one, the observations one depends on the parameters one. The computer can perfectly understand that structure and build the domain model for us, right? So let's automate all of that step, and that's basically what we did.
08:42
But now we need a second step, because with the Smart Data Loader, great, everything is automated: it reads the database and gives us the domain model. But how do we make whatever we obtain in that automatic way correspond to the target output we are expecting? This is where the second tool comes in: the Features Templating mechanism.
09:06
In terms of workflow this means that right now we have our data and we have our target modeling. The Smart Data Loader will basically look at our data, walk through it, and automatically build in memory an efficient stream of features. They will be complex, they will have relationships,
09:26
they will have everything that's in the domain model. And then, for each service, we will have Features Templating, which will allow us, with content negotiation, to say: look, if the user is requesting this in this particular context, then we want to give them this
09:42
GML, or this GeoJSON, or this JSON-LD, which is now also supported, or even this HTML. The nice thing is that this now goes beyond INSPIRE. All these tools that were very specific to INSPIRE, and that in practice
10:02
were only getting investment for INSPIRE, are now used in a lot of other systems, for example for publishing maritime data, or because you want a nice HTML output for very old clients that want to integrate. So the Smart Data Loader is now a free and open source GeoServer extension. It can be downloaded, and what it does is quite simple to describe: it looks at your database,
10:26
it walks your structure and says: look, this is the domain model I was able to extract from your database, are you happy with it? If we say yes, then it builds an in-memory stream of features that way, and it takes care of
10:41
making sure they are efficiently represented, and that when we access the attributes to retrieve the data it transparently does all of that for us, on a style, on a query we send, on anything that may enter the system. This is a video, it should play normally.
11:04
There we go. So basically our first step is to add the data store (I cannot see very well from here): basically we have to provide the database where we have our data stored; we provide the access to the database, of course.
11:37
Okay, so now we have created the connection to our PostgreSQL database where our main data is stored. Now
11:44
we go to the Smart Data Loader. We have to define a name for it, we select our database, we select our root entity, and it builds the domain model for us: okay, look, these are the relationships I understood. We can do some pruning if we like, and we publish the data.
12:08
It detects the geometry, the spatial attributes, and that's it, it's done. Before, to reach this state, we had to find the target GML schema, we had to define an application schema mapping, we had to know very well how all the data was related together,
12:24
we had to define the mappings, test them, publish them, and so on. In less than 30 seconds we just published that data set, right? The only piece we put aside for the moment, and this is the game changer, is that we got rid of the target schema. So at this stage, and we will see it now,
12:44
it doesn't matter how many relationships our data has. As you will see now in the GML output, there we go, we have a complex structure. Of course, the schema that is here matches the relational schema in the database.
13:01
And as you can see, we can do the same for GeoJSON. So there we go: in a couple of clicks we were able to publish our complex data model in different output formats. Of course, we are still tied to the domain model that we defined on the database.
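To give an idea of what that looks like, the nested GeoJSON produced for one station might resemble the sketch below; attribute names, identifiers and values are illustrative, following the simplified model rather than the real output shown in the demo.

    {
      "type": "Feature",
      "id": "stations.1",
      "geometry": { "type": "Point", "coordinates": [8.61, 44.91] },
      "properties": {
        "name": "Alexandria",
        "observations": [
          { "parameter": "temperature", "value": 23.5 },
          { "parameter": "wind speed",  "value": 12.0 }
        ]
      }
    }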
13:21
Okay, so now it's time to take care of the second part. Now we are in INSPIRE, and we are mandated to actually respect that very specific schema. So how do we do it? Well, that was a very tricky part, so we came up with the
13:43
Features Templating mechanism, where basically we wanted two things. Look, we don't want to be tied too much to the target modeling; what we know very well is our domain model: we know we have stations, we know we have observations, we know parameters, we know their values. So we want to be able to use those attributes, we want to be able to take our output format,
14:05
just put them in whatever place they belong and start naming the attributes there, and it's up to the computer, because that's what they do well, to understand that and make sure it efficiently retrieves the information it needs. Of course it is not, let's say, a straightforward thing to implement, but in terms of functionality it is quite powerful.
14:24
That's basically what we did. A features template is a what-you-see-is-what-you-get approach, we'll see how it works in a couple of seconds, and it is integrated at this moment with OGC API Features, WFS and WMS. We'll see later how we are going to deal with
14:42
content negotiation. And it's super efficient. Why? Because it's dealing with an in-memory stream of features built by the already very efficient application schema machinery. And this is another very important point: if you already modeled all your data in application schema,
15:02
if you already did a lot of investment there, guess what, you can just use Features Templating on top of it. Okay. This is an example of what the JSON template looks like for our meteo stations use case. Basically I just say: look, this is my JSON,
15:23
I start writing it as in a notebook and I say: look, here I want my source to be the stations table, my identifier is the ID attribute, my geometry is the position, my name is the common name. Wait, here I want to use a SQL function, I want to do a string concatenation; here
15:40
I want the location, but you know what, I want it in WKT format, so I apply that function. What we see is what we get, exactly like that, right? We can see it here in more detail: basically the templating engine will interpret our template and will just produce the output we want.
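A feature template along the lines described might look roughly like the sketch below. The $source, ${property} and $${expression} directives follow the GeoServer Features Templating documentation, but the property paths and function names here are assumptions made for illustration, not copied from the talk.

    {
      "$source": "stations",
      "type": "Feature",
      "id": "${id}",
      "geometry": "${position}",
      "properties": {
        "name": "${name}",
        "label": "$${strConcat('Station: ', name)}",
        "locationWkt": "$${toWKT(position)}",
        "observations": [
          {
            "$source": "observations",
            "parameter": "${parameter/name}",
            "value": "${value}"
          }
        ]
      }
    }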
16:02
And of course it will be able to understand the request. So when we get this JSON in an OpenLayers application, whatever, if you say, look, give me all the stations where the value is, say, 30 degrees, it will be able to translate that back to the original data source and return the result to you.
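As an illustration of that round trip, a filtered WFS request against such a layer could look like the following; the workspace, layer and attribute path are hypothetical, cql_filter is the standard GeoServer vendor parameter, and the nested-property syntax shown is an assumption to be checked against the documentation.

    # Hypothetical request: GeoJSON output, filtering on a nested observation value
    curl "http://localhost:8080/geoserver/ows" --get \
      --data-urlencode "service=WFS" \
      --data-urlencode "version=2.0.0" \
      --data-urlencode "request=GetFeature" \
      --data-urlencode "typeNames=meteo:stations" \
      --data-urlencode "outputFormat=application/json" \
      --data-urlencode "cql_filter=observations.value > 30"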
16:22
This is a video; I will not show all of it because I'm running out of time. We developed a UI to deal with the feature templates themselves, so basically we are able, of course, to write them down; it is very similar to the styles editor of GeoServer,
16:42
if you have used that before. Okay, I can accelerate a bit. We have our first template, and this is the important bit: we also wanted to have a sandbox where we could actually try out our template, to be able to say: look, go to my data source, select that feature,
17:03
and let me edit the template and see the changes in real time, that kind of stuff. And we have some types of formats that are tricky, for example GML, which needs to be validated against the schema, so we can perform a validation here; and even for formats like JSON-LD we are also able to perform a validation. Okay, so
17:29
for creating templates we have a UI, for validating the templates we have a sandbox, and we have the last missing bit, which is content negotiation. So we have quite a powerful tool now: we have an efficient tool to read our data structure
17:43
and build an efficient stream of data, we have the feature templates that allow us to map that data, and now we are missing content negotiation, maybe because we are experimenting with a new type of service, maybe because we want to differentiate between users. So there you go, we can define content negotiation rules based on, I don't know, whatever may be available in the request:
18:06
for example, here we say that if the request is a WFS one and my layer is the GeoJSON one, then I want to use the GeoJSON output format.
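Conceptually, such a rule pairs a template with a condition on the incoming request. As a loose illustration only (the exact rule fields and function names should be checked against the Features Templating documentation; this is not verified syntax), a rule of that kind might be expressed as:

    Template:       meteo_stations_geojson            -- hypothetical template name
    Output format:  application/json                  -- serve the GeoJSON template
    CQL condition:  requestParam('service') = 'WFS'   -- only when the request is a WFS one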
18:21
Some use case highlights. I will show some of the use cases that use this. One of them is a use case where basically we published around one million records (I don't remember exactly what the test case was here), and basically we were very easily able to publish that complex data set in an efficient way over WMS, and
18:42
we actually experimented with JSON-LD, so embedding the context in the GeoJSON, and of course we were able to query it. This was a system that was actually published, got real users querying it, and it worked quite well. The second one is a bit more advanced, with the Norwegian Public Roads Administration, where they have a huge amount of data stored in MongoDB, and here we made GeoServer,
19:08
let's say, schema-less. What does that mean? You do a WFS request, we don't know the schema, we have to build it on the fly with whatever data we are receiving, which means that, of course, you cannot obtain GML yet.
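To make the schema-less point concrete, the kind of heterogeneous MongoDB documents described could look like the sketch below (the collection, field names and values are purely illustrative): two documents in the same collection, where only the first one carries an optional attribute.

    // Illustrative documents in one collection: "surfaceType" exists only on the first one
    { "_id": 1, "geometry": { "type": "Point", "coordinates": [10.74, 59.91] },
      "roadRef": "E6", "surfaceType": "asphalt" }
    { "_id": 2, "geometry": { "type": "Point", "coordinates": [10.40, 63.42] },
      "roadRef": "E39" }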
19:22
Still, it works quite well: you can have a bunch of complex data stored in your MongoDB, you can do queries, and if an attribute does not exist it will not complain; you can use it for styling, where the styling will simply skip features where the property does not exist. Anyway, if you have MongoDB, you can give it a try. And that's all I have to say.