
FAIR-ization of INSPIRE datasets


Formal Metadata

Title
FAIR-ization of INSPIRE datasets
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
Belgian federal authorities are working on a PSI/INSPIRE conversion tool. We have produced an enhanced DCAT-AP 2.0 profile. We have proposed using Atom feeds to instantiate dcat:Distribution classes because of their semantic completeness. We have tried to keep most of the INSPIRE metadata elements in order to preserve the work that has been done over the years. We are now working on its implementation through GeoNetwork 4.x microservices that would provide consistent, multilingual DCAT-AP RDF/XML. By doing this, we expect our datasets to become more accessible through many platforms and open data portals, and to become truly FAIR data. This work is the result of strong collaboration between Belgian federal authorities (e.g. Cadaster, National Mapping Agency, Office of federal statistics, ...). Moreover, we are involving regional authorities in order to reach a certain degree of harmonization. We would now like to share these developments with the open-source community.
Keywords
Transcript: English (auto-generated)
Okay. Good morning. I am Céline Villain, I'm a bioengineer and I work at the National Geographic Institute of Belgium. Today I will present the results of a team effort in which we try to make the most of the INSPIRE metadata by describing them and making them more findable, accessible, interoperable and reusable, which we also call the FAIR-ization process. Concretely, we made a federal DCAT-AP profile and we also suggest an INSPIRE mapping to that DCAT-AP profile.
I will first tell you more about the context in which we work, so that you better understand the goals of the project. Then we will dive into the development and the specification. You will also see that this project is not finished yet, so there is ongoing and future work.
But first, the context. Belgium is a federal state with three official languages and three regions. For geospatial metadata management we have a bottom-up approach, where local metadata portals are harvested by a higher-level portal. In Belgium there are four portals on the same level: the three regional portals of Flanders, Wallonia and Brussels, and the portal on the federal level. I work on that federal level, and the National Geographic Institute is responsible for maintaining geo.be.
The four portals are harvested directly by the European INSPIRE geoportal. In parallel there is the European open data portal, but it does not directly harvest the regional and federal portals: there is a national portal in between, data.gov.be.
Looking at the two European portals in more detail: the INSPIRE geoportal only contains geospatial themes, while data.europe.eu, which is based on the idea of the semantic web, brings many more ecosystems together, such as statistical, institutional and also geographical data.
There, they describe their metadata using the de facto standard DCAT-AP. And the INSPIRE world has actually been disconnected from the semantic web.
The INSPIRE world is based on ISO norms, so to be compliant with INSPIRE you need to be compliant with the ISO norms. There is also an INSPIRE validator, created to check for compliance. So a lot of work is being done under the INSPIRE directive, but the richness of the metadata elements that you find in INSPIRE is not reflected in the DCAT-AP standard. Yet by describing geographical data in DCAT-AP, we become more findable, accessible, interoperable and reusable among all the other datasets that exist.
That is where the concept of FAIR-ization starts. The idea is to extend the DCAT-AP standard with additional attributes so that we can make a semantic mapping from ISO to DCAT-AP. And we want to go further: we turn that semantic mapping into a technical mapping, incorporated into a tool with which every metadata publisher can directly convert ISO to DCAT-AP. Our metadata is in GeoNetwork, so the tool has to be related to GeoNetwork.
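To make the step from a semantic mapping to a technical mapping more concrete, here is a minimal Python sketch. It is not the project's tool (which, per the talk, is an XSLT-based GeoNetwork microservice). The three mapping rows, the `iso_to_dcat` helper and the sample record are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata records.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Illustrative semantic mapping: DCAT-AP property -> path in the ISO record.
# These rows are examples, not the federal profile's actual mapping table.
DATA_ID = "gmd:identificationInfo/gmd:MD_DataIdentification/"
MAPPING = {
    "dct:identifier": "gmd:fileIdentifier/gco:CharacterString",
    "dct:title": DATA_ID + "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
    "dct:description": DATA_ID + "gmd:abstract/gco:CharacterString",
}


def iso_to_dcat(iso_xml: str) -> dict:
    """Apply the mapping to one ISO record (hypothetical helper)."""
    root = ET.fromstring(iso_xml)
    result = {}
    for prop, path in MAPPING.items():
        node = root.find(path, NS)
        if node is not None and node.text:
            result[prop] = node.text.strip()
    return result


# Invented sample record, reduced to the three mapped elements.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>example-id-001</gco:CharacterString></gmd:fileIdentifier>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation><gmd:CI_Citation>
        <gmd:title><gco:CharacterString>Example dataset</gco:CharacterString></gmd:title>
      </gmd:CI_Citation></gmd:citation>
      <gmd:abstract><gco:CharacterString>An invented abstract.</gco:CharacterString></gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

record = iso_to_dcat(SAMPLE)
```

Keeping the mapping in a table separate from the conversion code is what gives the publisher control over which ISO elements survive in the target standard.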
So the goal of the project was to create a profile and also to suggest a mapping. For the profile, the idea was to reach a consensus at federal level on the RDF representation. We also wanted a relevant and generic profile, so we based it on existing norms. And because we have some expertise in metadata management in the INSPIRE world, we wanted to bring that expertise into the profile so that the metadata can be managed efficiently.
Then for the mapping, the idea was to enable sharing INSPIRE metadata across sectors. Also, when a publisher wants to publish metadata, they only need to do it once and it is then automatically converted to the other standard. By doing that, we also have control over which metadata is kept in the new standard.
So we avoid information loss from the INSPIRE metadata. As for the development, the project was carried out by four federal administrations.
We presented our reflections and discussed them in different working groups that already exist in Belgium, like the federal working group on geo metadata, and we also discussed with the regional authorities, who are thinking about the same idea of making a mapping and so on.
We based the profile on norms: we looked into DCAT-AP version 2, and also GeoDCAT-AP and StatDCAT-AP. Everything has been documented: the profile and the mapping are on GitHub, and we try to improve the profile through the remarks and suggestions in the issue tab. So this is the whole UML metadata model, but that's a lot of information, so let's start simple. In a DCAT-AP profile, you have a DCAT catalog that contains datasets and data services.
In the perspective of INSPIRE, a data service always serves datasets, so you can instantiate the dataset class from the data service class.
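The catalog, dataset and data service relationship described here can be sketched as a small Python data model. The class names, the `catalog_triples` helper and the example.org URIs are assumptions for illustration. The property names (`dcat:dataset`, `dcat:service`, `dcat:endpointURL`, `dcat:servesDataset`) come from DCAT version 2, not from the federal profile's documentation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Dataset:
    uri: str
    title: str


@dataclass
class DataService:
    uri: str
    endpoint_url: str
    # In the INSPIRE perspective a service always serves datasets.
    serves: List[Dataset] = field(default_factory=list)


@dataclass
class Catalog:
    uri: str
    datasets: List[Dataset] = field(default_factory=list)
    services: List[DataService] = field(default_factory=list)


def catalog_triples(catalog: Catalog) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) tuples linking the three classes."""
    for ds in catalog.datasets:
        yield (catalog.uri, "dcat:dataset", ds.uri)
    for svc in catalog.services:
        yield (catalog.uri, "dcat:service", svc.uri)
        yield (svc.uri, "dcat:endpointURL", svc.endpoint_url)
        for ds in svc.serves:
            yield (svc.uri, "dcat:servesDataset", ds.uri)


# Invented example: one dataset exposed through one view service.
roads = Dataset("https://example.org/dataset/roads", "Roads")
wms = DataService("https://example.org/service/wms", "https://example.org/wms", [roads])
catalog = Catalog("https://example.org/catalog", [roads], [wms])
triples = list(catalog_triples(catalog))
```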
A dataset is also accessible, so you instantiate the distribution class to make your data accessible. In our profile we also defined two different instantiations of the same class. For example, when a catalog is part of a bigger catalog, you don't need to instantiate all the same metadata elements again, which avoids infinite instantiation of a class. We also defined the catalog record class as the class that contains information about the reference metadata: when we map from ISO to DCAT-AP, we put the information about the ISO reference file in that class. Then, we think it would be great to handle versioning directly in the distribution.
When you want to publish a new version of your dataset, you instantiate a new distribution, so that the dataset itself always stays the same.
So if you want to describe a new version, you only have to fill in the dct:temporal attribute. In our profile we also added some attributes, like the representation type and the license, that are intrinsically related to the dataset in the INSPIRE perspective.
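The versioning idea (one stable dataset, one new distribution per version, distinguished by its temporal coverage) might look roughly like this in Python. The class and method names are hypothetical, and modelling dct:temporal as a start and end date on the distribution is an assumption based on the talk, not a reading of the published profile.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class Distribution:
    access_url: str
    temporal_start: date  # start of the period covered (dct:temporal)
    temporal_end: date    # end of the period covered (dct:temporal)


@dataclass
class Dataset:
    uri: str
    title: str
    distributions: List[Distribution] = field(default_factory=list)

    def add_version(self, distribution: Distribution) -> None:
        """A new version is a new distribution; the dataset is untouched."""
        self.distributions.append(distribution)

    def latest(self) -> Distribution:
        """Return the distribution whose covered period starts last."""
        return max(self.distributions, key=lambda d: d.temporal_start)


# Invented example with two yearly versions of the same dataset.
parcels = Dataset("https://example.org/dataset/parcels", "Parcels")
parcels.add_version(Distribution(
    "https://example.org/download/parcels-2021.zip", date(2021, 1, 1), date(2021, 12, 31)))
parcels.add_version(Distribution(
    "https://example.org/download/parcels-2022.zip", date(2022, 1, 1), date(2022, 12, 31)))
```

The dataset URI and title never change when a version is added, which is the property the talk is after.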
We also have to deal with multilingualism, because of the three official languages, and we also want to publish in English. When you have a multilingual string it's easy: you just have to specify the language of each translation. But when you have a multilingual URL, we instantiate the technical elements of the description so that you can fill in dct:language to specify the language of your URL.
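The difference between a multilingual string and a multilingual URL can be illustrated with a short Python sketch that prints Turtle-like statements. The helper names are hypothetical, the example.org URLs are invented, and the choice of `foaf:page` as the linking property is an assumption. The language URIs follow the EU Publications Office language authority table.

```python
# EU Publications Office language authority (used for dct:language values).
LANG_AUTHORITY = "http://publications.europa.eu/resource/authority/language/"
LANG_CODES = {"nl": "NLD", "fr": "FRA", "de": "DEU", "en": "ENG"}


def turtle_string(text: str) -> str:
    """Quote a string for Turtle, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def multilingual_literal(subject: str, prop: str, translations: dict) -> list:
    """A multilingual string: one language-tagged literal per translation."""
    return [
        f"{subject} {prop} {turtle_string(text)}@{lang} ."
        for lang, text in sorted(translations.items())
    ]


def multilingual_url(subject: str, prop: str, urls: dict) -> list:
    """A multilingual URL: a URL cannot carry a language tag, so each URL
    becomes a described resource that states its own dct:language."""
    lines = []
    for lang, url in sorted(urls.items()):
        lines.append(f"{subject} {prop} <{url}> .")
        lines.append(f"<{url}> dct:language <{LANG_AUTHORITY}{LANG_CODES[lang]}> .")
    return lines


DATASET = "<https://example.org/dataset/roads>"
titles = multilingual_literal(DATASET, "dct:title", {"nl": "Wegen", "fr": "Routes"})
pages = multilingual_url(DATASET, "foaf:page", {"fr": "https://example.org/fr/routes"})
```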
Also, in our profile we specified which thesaurus to use when you instantiate a SKOS concept. For dcat:theme, we used a specific Belgian thesaurus because it has one or two more themes than the European thesaurus. And then, to fill in the dcat:Distribution class, we use the Atom feed. The Atom feed is the download service, and some elements are well structured in the Atom feed that you cannot find in the reference metadata file of the dataset or the service. For example, the projection system, in blue, is described right in that tag.
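As a rough sketch of filling distributions from an Atom feed, the following Python reads each entry's download link and coordinate reference system. The sample feed is invented, the helper name is hypothetical, and the target properties (including `dct:conformsTo` for the projection, borrowed from GeoDCAT-AP practice) are assumptions, not the project's published mapping.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def distributions_from_atom(feed_xml: str) -> list:
    """Build one distribution description per download link in the feed."""
    root = ET.fromstring(feed_xml)
    distributions = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        # The projection is carried by <category term="...CRS URI...">.
        crs = [c.get("term") for c in entry.findall("atom:category", ATOM)]
        for link in entry.findall("atom:link", ATOM):
            if link.get("rel") != "alternate":
                continue  # keep only links to the downloadable file
            dist = {
                "dct:title": title,
                "dcat:downloadURL": link.get("href"),
                "dcat:mediaType": link.get("type"),
                "dct:conformsTo": crs,
            }
            if link.get("length"):
                dist["dcat:byteSize"] = int(link.get("length"))
            distributions.append(dist)
    return distributions


# Invented sample feed with a single entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example download service</title>
  <entry>
    <title>Example dataset in Belgian Lambert 72</title>
    <link rel="alternate" href="https://example.org/data/example.gml"
          type="application/gml+xml" length="1024"/>
    <category term="http://www.opengis.net/def/crs/EPSG/0/31370" label="EPSG:31370"/>
    <updated>2022-01-01T00:00:00Z</updated>
  </entry>
</feed>"""

dists = distributions_from_atom(SAMPLE_FEED)
```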
We also want a tool to convert INSPIRE to DCAT in GeoNetwork version 4. So we forked a microservice that is available on GitHub. It runs standalone, and we made some upgrades: the XSLT file is now in version 2 and fits our mapping, and we can convert one or more INSPIRE files or the complete catalog. This is maybe not so clear, it's just an example: on geo.be you have the metadata file, and the microservice extracts information from the XML file that you see on the right.
In the URL you see the metadata identifier; you use that same identifier to run the microservice and obtain a filled-in DCAT-AP record.
About ongoing and future work: this project is not finished. On the profile side, some federal authorities are writing Python scripts that create DCAT-AP directly. Some produce DCAT XML as output, others DCAT RDF. We also want to improve our profile based on the issues reported on GitHub. As for the mapping and the conversion tool, we need to document them and make them available through GitHub. We also know that the microservice can convert to formats other than DCAT, so it would be very nice to look at those as well.
These are all the links that I used in the PowerPoint: the geoportal of the Belgian federal institutions, the federal profile and the mapping on GitHub, and the microservice that we forked from GitHub. The microservice conforming to our mapping will be available soon on GitHub.
This project has been made possible thanks to the great collaboration of four federal authorities. The persons actively involved in this project are listed here: Benoit from Finance, Bart from BOSA, Marianne and Yuri from Economy, Angel and I from the National Geographic Institute, and Mathieu from GYM, who updated our microservice. So thanks.