
CROSS-NATURE project: development of an Iberian Spatial Data Infrastructure using Linked Open Data


Formal Metadata

Title
CROSS-NATURE project: development of an Iberian Spatial Data Infrastructure using Linked Open Data
Title of Series
Number of Parts
53
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Transcript: English(auto-generated)
Good morning, I'm Ana Luisa Gomes. I am an environmental researcher.
I'm going to present the CROSS-NATURE project. I'm going to read the presentation; I'm sorry, but I think it's better for all of us. So before presenting the CROSS-NATURE project, I would like to refer to the DGT mission.
DGT is the Portuguese public agency responsible for different tasks in the land use planning and geographic information domains, as well as for the development of research supported by national or European funding. The DGT is responsible for the production of national topographic reference cartography,
management of the national geodetic networks, and coordination of the national system of territorial information, SNIT, and the national system of cadastral information, SNIC. But the two aspects of DGT I would like to highlight are the coordination of the Portuguese
national spatial data infrastructure, the SNIG, and the coordination of the INSPIRE implementation in Portugal.
So CROSS-NATURE is a project co-financed by the European Union through the Connecting Europe Facility, the CEF programme, and the focus of my presentation will be on the current state of this project, which aims to develop an Iberian spatial data infrastructure using linked open data.
The project consortium involves DGT and two Spanish partners: the Tragsa Group, the coordinator of this project, and the University Carlos III of Madrid. MAPAMA and ICNF are
institutional data providers that are not part of the consortium, but support the project development. The DGT team involves several people with different academic backgrounds and professional experience.
However, the core team is composed of these four people. Ricardo and Paul are the IT people, who could not come, and I am the coordinator, so I am here to present the work that is being developed. It's a two-year project that started in May 2017.
Here we have the main objectives of the project. They are to develop a digital service infrastructure using data sets on biodiversity from Portuguese and Spanish public entities, to adopt the linked open data approach,
to identify new data of interest, to add value to this information, to improve access to new sources of knowledge, and to provide better services to citizens.
Before proceeding to the presentation of the project, I will make a brief reference to the LOD approach, because it's one of the core issues of this project. So, linked open data can be defined in different ways. We can see some of them here, but the following key ideas can be highlighted.
First, LOD is for publishing structured data on the web. The data are linked with different sources and in different formats, incorporating URIs, ontologies, and relations.
So it's a machine-readable format. LOD contributes to identifying new data of interest on the web, to adding value to this information by combining multiple data sets, and to improving access to new sources of knowledge.
In this slide, we can see that there are different levels of linked open data. At the first level, the data is available on the web, for instance as a PDF. At the second level, the data is available as machine-readable
structured data, such as an Excel spreadsheet. At the third level, the data is available in a non-proprietary, open format, such as CSV. At the fourth level, the data is published using open data standards, using URIs,
such as the RDF format. The five-star level includes all of the other properties, plus linked data,
namely linked open data, such as linked RDF. In Portugal, public administrations are usually at the three-star level. They have a lot of data, but the information is not prepared to be linked with other data, as required by the LOD approach.
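The five levels just described are cumulative, which a short Python sketch can make concrete. The class and field names below are illustrative assumptions, not part of any standard library or of the project's software; the sketch only counts how many consecutive levels a publication satisfies.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical description of how a data set is published."""
    on_web: bool                  # level 1: available on the web (e.g. a PDF)
    machine_readable: bool        # level 2: structured data (e.g. an Excel file)
    non_proprietary_format: bool  # level 3: open format (e.g. CSV)
    uses_uris_and_rdf: bool       # level 4: open standards, URIs, RDF
    linked_to_other_data: bool    # level 5: linked to other data sets


def star_level(pub: Publication) -> int:
    """Count consecutive satisfied levels; a missing level stops the count."""
    checks = [
        pub.on_web,
        pub.machine_readable,
        pub.non_proprietary_format,
        pub.uses_uris_and_rdf,
        pub.linked_to_other_data,
    ]
    stars = 0
    for satisfied in checks:
        if not satisfied:
            break
        stars += 1
    return stars


# A CSV file on the web: open and structured, but without URIs or links.
csv_on_web = Publication(True, True, True, False, False)
print(star_level(csv_on_web))
```

Under these assumptions, a CSV file published on the web stops at the third level, which matches the situation described for most public administrations.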
So, the CROSS-NATURE project emerges as an innovative approach in the Portuguese public administration, because it aims to reach the five-star level.
RDF. The RDF format is a key issue in this whole process. So, RDF stands for Resource, that is, everything that can have a unique identifier, a URI; Description, which means the attributes, features, and relations of the resources; and
Framework, as the model, language, and syntax for this description. In RDF, every piece of information is expressed as a triple: a subject, a predicate, and an object.
The subject is the resource, which may be identified with a URI. The predicate is a URI identifying the relationship between subject and object, and the object is a resource to which the subject is related.
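The subject, predicate, object structure can be sketched in a few lines of Python. This is only an illustration: the `http://example.org/crossnature/` namespace and the `Species` class are hypothetical, and the serializer covers just the basic N-Triples line shape (URIs in angle brackets, plain literals in quotes).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI naming the relationship
    obj: str        # URI of the related resource, or a literal value
    obj_is_literal: bool = False


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if triple.obj_is_literal:
        escaped = (
            triple.obj.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        obj = f'"{escaped}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


BASE = "http://example.org/crossnature/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

wolf = BASE + "species/canis-lupus"
triples = [
    Triple(wolf, RDF_TYPE, BASE + "ontology/Species"),
    Triple(wolf, RDFS_LABEL, "Canis lupus", obj_is_literal=True),
]
lines = [to_ntriples(t) for t in triples]
print("\n".join(lines))
```

Each printed line is one statement about the wolf resource; linking data sets amounts to reusing the same URIs in triples published by different sources.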
This is the project workflow, and it includes these phases. In the first one, we have the geographic and alphanumeric information
that comes from different sources in Portugal and Spain, and in different formats. I'm sorry, because the numbers changed. It was made in PowerPoint, not in open source, I'm very sorry. So there are five
levels in the methodology. So, the first one is the information, and it's necessary to harmonize and transform the data to be integrated in an interoperable way. Second, we have to define the ontologies and the vocabularies to be used, considering best practice.
So, we are using existing ontologies and vocabularies, making it easier to find data complementary to ours, and increasing the chance that our information will be reused by others.
Then, we are going to implement the current RDF syntax and semantics, and decide where we are going to store the triples. Then, we are going to create web interfaces to facilitate
access, with the implementation of endpoints, which will guarantee free access to all data for reuse. This includes a SPARQL endpoint interface to query the data. Finally, we are going to develop two use cases to present the technology in use.
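To give an idea of what querying such an endpoint involves, here is a minimal Python sketch that builds a request URL in the style of the SPARQL protocol, where the query text is passed in a `query` parameter of a GET request. The endpoint address is a placeholder, not the project's real endpoint, and the sketch deliberately stops before sending anything over the network.

```python
from urllib.parse import urlencode

# Placeholder address; the real CROSS-NATURE endpoint URL is not given here.
ENDPOINT = "http://example.org/sparql"

# A generic query that lists ten arbitrary triples from the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the URL-encoded SPARQL query."""
    params = urlencode({"query": query.strip()})
    return f"{endpoint}?{params}"


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

A client would then fetch this URL (for example with `urllib.request.urlopen`) and ask for a result format such as JSON through the `Accept` header.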
This includes the definition and implementation of interfaces and functionalities to be included in websites and applications for mobile devices. Now, I am going to present the use cases that demonstrate the concept and the potential of the LOD approach. The first one,
the first scenario, we can call the protected species use case, and it reuses open data reported by the European
Member States under the Habitats and Birds Directives. For this use case, we established the creation of an Iberian CROSS-NATURE endpoint and an app based on the famous Who is Who game that will read
from the CROSS-NATURE endpoints and from other external endpoints. This is the methodology. There are several steps that have to be undertaken following the work and data flow.
In general, first it is necessary to prepare the original data and perform the INSPIRE harmonization to produce an INSPIRE-compliant GML file; then to map an ontology, to transform the data into RDF format, and to store this data in the
RDF triple database. These and other data will be made available by the CROSS-NATURE endpoint, which will be produced using the Virtuoso server. In the end, an app is being developed to use the data shared by CROSS-NATURE,
showing the potential of the LOD approach. Now, going step by step, first it is necessary to know the data we are using. So we are using the ICNF
report under the Habitats and Birds Directives, which is stored in two different formats, shapefile and Access database. They are on the EIONET website and everyone can go and get them there. The shapefile is the source of geographic information and the Access
database is the source of alphanumeric data. Some other auxiliary data are also stored in CSV files and were used to complement the original data.
Currently, we are evaluating the possibility of using more data from external sources such as Flora-On and others. Under the INSPIRE directive, and I think all of you know what the INSPIRE directive is, the main
official geographical data sets have to be reported in compliance with INSPIRE regulations, including metadata production, data set harmonization, data sharing, and web services. INSPIRE has 34 environmental data themes, and our data are included in the Species Distribution theme.
So this information must respect the data specification for this INSPIRE theme. To do that, the Species Distribution data specification document presents a UML model that establishes the elements to be filled, their relations, the multiplicity, and the
obligations of those fields. For the harmonization process, we use HALE, an open source software. With the HALE software, it is possible to do a mapping between the target INSPIRE schema and
our source schema, including the shapefile and the data in CSV file format. As the result of this process, we produce a harmonized INSPIRE GML. Next step,
to pass from GML to RDF format, it is necessary to create an ontology to give meaning to our data, as we are working with the semantic web. To do that, we use the open source Protégé ontology editor,
which allows us to create classes of plants, in this case, plant characteristics, their own relations, and classification classes. So this slide shows an example of the ontologies created for the mobile app. The next step
is the transformation of the GML into RDF. INSPIRE already has some guidelines for this process. We are going to use the Jena software, where the GML file is processed using the ontology we created, and is transformed to RDF.
In this case, the RDF is also INSPIRE compliant. Jena is a free, open source Java framework for building semantic web and linked data applications.
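The general idea of turning XML records into triples can be sketched as follows. This is not the Jena-based process used in the project: it is a plain Python illustration using only the standard library, and the element names, the `http://example.org/schema#` namespace, and the `http://example.org/data/` base URI are all hypothetical rather than taken from the INSPIRE GML schema.

```python
import xml.etree.ElementTree as ET

SCHEMA_NS = "http://example.org/schema#"  # hypothetical vocabulary namespace
DATA_BASE = "http://example.org/data/"    # hypothetical base URI for records

# A tiny stand-in for a harmonized file; real INSPIRE GML is far richer.
SAMPLE_XML = """<records xmlns:ex="http://example.org/schema#">
  <ex:SpeciesRecord id="sp1">
    <ex:scientificName>Canis lupus</ex:scientificName>
    <ex:country>PT</ex:country>
  </ex:SpeciesRecord>
</records>"""


def xml_to_triples(xml_text: str) -> list:
    """Map each child element of a record to one (subject, predicate, object)."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root.findall(f"{{{SCHEMA_NS}}}SpeciesRecord"):
        subject = DATA_BASE + record.get("id")
        for child in record:
            # ElementTree exposes qualified tags as "{namespace}localname".
            namespace, local_name = child.tag[1:].split("}", 1)
            predicate = namespace + local_name
            triples.append((subject, predicate, (child.text or "").strip()))
    return triples


triples = xml_to_triples(SAMPLE_XML)
for triple in triples:
    print(triple)
```

Each record becomes a subject URI and each child element becomes a predicate taken from the vocabulary namespace, which is the role the ontology plays in the real transformation.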
So here we can see some outputs already. The project will develop two CROSS-NATURE endpoints, one in Portugal, at DGT, and another for Spain. The CROSS-NATURE endpoints will provide data for an app that also consumes information from other external endpoints, like EUNIS,
which is a classification database, and UniProt. You can see an example for the wolf. This geographic information is already coming from both countries, so we can see that, across
the border, there are overlaps. As I mentioned, a mobile app based on the Who is Who game to identify endangered species is under development.
That will allow the user to search for species characteristics and to identify species according to their characteristics. I'm going to show how.
So, in the first case, we select the species from the list of species, according to the characteristics, in mock-up one; then mock-ups two, three and four present the data for the species
with detailed information and photos. The last mock-up displays the species' spatial distribution in the Iberian Peninsula. This mock-up shows what the mobile app will be, which consumes information from the CROSS-NATURE endpoints.
The second is made with the Who is Who game, so we get a list of multiple-choice questions about the species, then we provide the answers. We can also take a photo and,
using artificial intelligence components, get help to identify the species that we see. So we are in the field, we see a species and we want to identify it. When completed, this lists the characteristics and the
spatial distribution of the species in the Iberian Peninsula. The second scenario is related to invasive alien species, and it consists of the creation of
an organized data model oriented toward invasive alien species, using data sets from Portugal. In Portugal, we are going to use Vespa velutina, the Asian hornet. In Portugal, it's an invasive species, and this will allow us to identify the spatial patterns related to its expansion
in both countries. So in this case, we have these phases. First, we have created an invasion risk map based on geographical information modeling in a
GIS desktop environment. We are going to use the SOS Vespa data, which will provide access to the distribution of Vespa velutina. In this case, we are still studying the most appropriate
process to carry out the data transformation. We are going to host the data in the CROSS-NATURE endpoint, and the final output will be a map viewer using OpenLayers.
The web portal of SOS Vespa is managed by ICNF. It's available on the web and includes information about the current distribution, so it's always
updated, of Vespa velutina in Portugal. And we are still considering the possibility of incorporating other data, such as the services from the Invasoras.pt portal and from the European Alien Species Information
Network. In this use case, a map viewer will be developed to identify the presence of invasive species, whose functions will allow finding information about species, namely their detailed characteristics and photos,
viewing the current spatial distribution and the potential distribution with the risk map, and performing spatial analysis to identify priority areas for intervention.
So I would like to conclude and show the main CROSS-NATURE outcomes.
This project will produce species distribution data sets, harmonized according to INSPIRE; two CROSS-NATURE endpoints, one in Portugal and another in Spain, with open access to all data; a map viewer for
invasive species control; and a mobile app for citizens to identify species. It is expected that the CROSS-NATURE project will help DGT to enrich the SNIG development,
considering its update and modernization; to achieve better cooperation between public agencies in Portugal and at international level; to apply the LOD approach to INSPIRE-harmonized data, which is very innovative in Portugal and probably in Europe;
and to disseminate the LOD approach across other partners and other public entities. By improving the use of the linked open data approach in Portugal, we will be contributing to making the data
more searchable, accessible, connected and used. Lastly, here are the people from the other CROSS-NATURE partners of the consortium and from the supporting data-providing entities, MAPAMA and ICNF.
Their work is crucial to achieve the results foreseen for the project. You can follow the development and the results of the CROSS-NATURE project through social media, the project portal, Facebook, and Twitter, and
here are the links to do that. So thank you for your attention. Thank you very much, Ana. Questions?
We have the microphone. Thank you for your presentation.
I wanted to ask about the reason why you chose RDF as your data format. What's behind the decision? The RDF format? Well, as I said, the main technology partner here is the University Carlos III of Madrid, and
we are learning those technologies from them. It is, I think, an XML file, and it's the best way
to organize the triples. I hope I'm not saying anything that is not right, but as I see things, it's the best way to organize the subject-predicate-object triples that are needed to link the data.
So that's the format we are using to do the linked open data. I'm not an expert, and the two guys couldn't come, but that's how I think this works.
Thank you for the presentation. I'm not really going to ask something; it is just a comment or a thought, and forgive me if it's a
wrong thing to say, but when I saw the sightings for the Vespa study, one thing that occurred to me is the density of population in that territory. We have
85% of the population living on the coastal side of our country. So in Porto you see lots of sightings, but inland, in the interior, in Trás-os-Montes and Minho,
there are some sightings. So my comment, or observation, is that if you are going to use this in a statistical way or something like that, be careful, because we know that lots of people are in the centers of Porto and Lisbon. Of course, there are more sightings there,
but we know that there may be more of this invasive alien in the interior, in Trás-os-Montes. My comment is just to be very careful using this information, because
the technology is concentrated where the people are, and maybe some weights are needed for pondering where the species is living. This is just an observation. No, no, you are right. There is a relation between the number of sightings and the population, of course, but in this case
that's why there is a risk map. This risk map is built from a lot of variables, climatic ones and ones about the species, and
we use this risk map to model the distribution and the way that Vespa will be expanding. The experts said that Vespa started near Porto, perhaps
coming by ship, and it is expanding very fast. In Spain, they have fewer Vespa, and
so ICNF chose this species because it's very important to ICNF, and for us it's okay to do the use case with that, but you are right. That's why we have the risk map.
More questions? No. Thank you, Ana. Finished, yeah, and I call João, please. Bequack, is that the name? Bequack, I'm sorry.