A knowledge graph prototype for national topographic data

Formal Metadata

Title
A knowledge graph prototype for national topographic data
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
Spatial data infrastructures prioritize data interoperability to serve their diverse communities. Geospatial knowledge graphs (GKG) are a form of database representation and handling that aim to meet the challenges of data interoperability, reasoning for information storage and knowledge creation, and user access that provide coherent spatial context to a domain of information. This paper discusses the development of a prototype GKG based on national topographic databases. Geospatial data are used to test interoperability aspects of ontology creation, faceted search and retrieval using GeoSPARQL (Open Geospatial Consortium, 2022), and user interface for data visualization and evaluation. The challenges are to capture and represent geographic semantics inherent in the source data, to integrate data from outside sources through SPARQL Protocol and RDF Query Language (SPARQL) queries and to visualize the data using a cartographic user interface. Poore (2003) identified four levels of data interoperability: articulation, sharing, integration, and alignment. These concepts are carried into the semantic technology design and application. Called the Map as Knowledge Base (MapKB), the approach uses software components to build a system architecture that is aligned with available standardized vocabularies and composed entirely of free and open-source software for geospatial data. The application was created in the context of The National Map of the U.S. Geological Survey (USGS). For purposes of data interoperability, the GKG ontology, queries, and visualization were studied for the system. Data pre-processing involved creating a GKG ontology. The ontology was semi-automatically transformed from source databases through the application of rules on schema attribute, domain, and metadata files to create classes, properties, and other triple resources of Resource Description Framework (RDF) and Web Ontology Language (OWL) (Hayes and Patel-Schneider, 2014; Hitzler and others, 2012).
An R2RML file was created using Web-Karma for transforming the feature-level instance data using the ontology and confirmed using standards specifications (University of Southern California, 2016; Das and others, 2012). The converted data and ontology are imported into a triplestore for data handling. A cartographic user interface (UI) was created as a foundation for the visualization and interaction of users with the triplestore graphs. The general guidelines given by the information search process model serve to guide UI functionality (Kuhlthau, 2004). The user interface offers menu search options by namespace for typically retrieving initial results. Multiple graphs can be visualized at once. Other queries can be performed on the initial results appearing on a map or table by faceted search and by query builder interfaces for SPARQL. An advanced feature description function retrieves related properties to support browsable graph searches. Linked Open Data were retrieved using SPARQL endpoints to test linking triples. Some GeoSPARQL support was created for geospatial queries on feature geometries of the GKG use cases. The automated transformation ontology revealed aspects of data silos that were known to exist. However, the ontology model created a new perspective of data resources across the enterprise, where resource semantics could be streamlined for reuse. This was demonstrated in the post-processing stage of the ontology creation. The system and ontology design were validated through reasoning of semantically related data and pre-determined competency questions relevant to reasoning results. An ontology pattern of aligning feature classes represented as codes and geometries of The National Map matched to the GeoSPARQL ontology feature and geometry classes was validated using reasoners.
The ontology for feature interoperability provided inferred information for competency questions such as “What type of feature is classified as FCode 73002?” or “How are streams represented geometrically?” The GKG alignment with Linked Open Data used some specific widely used vocabularies to be reused between graphs, and problems encountered could be resolved by designing a better metadata annotation approach for structural alignment in addition to syntax matching. Multiple GeoSPARQL queries executing topological relations on features were successfully demonstrated with a pre-built query to find specified buildings on a road section between two cross streets. Such a query can depend on the shape of the road, building distance from the roadway, and other factors. The queries required a change in viewpoint from machine computation to landscape cognition creating related semantic factors, and then were followed by GeoSPARQL function computation. This project tested some key challenges for GKG applications for spatial data infrastructure interoperability including data transformation, ontology design, information search and retrieval, and multi-modality cartographic visualization. Completing the resulting ontology from automated data transformation for knowledge representation is still a cognitive activity. RDF and OWL vocabularies were sufficiently expressive to demonstrate linking and reasoning successes. Improved metadata annotation systems are needed for on-the-fly entity resolution. Although initial tests of GeoSPARQL techniques were successful, the full capabilities of SPARQL as a rule-based reasoning tool would need further research for queries that leverage the full semantic capabilities of knowledge graphs and for their portrayal.
Disclaimer: Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Transcript: English (auto-generated)
Hello, I'm Dalia Varanka. The prototype geospatial knowledge graph, called MapKB, aims to capture semantic specifications of feature instances, classes, attributes, and properties. The approaches all employ free and open-source software components and free data.
The key objective of the concept is user access to logically interrelated data and semantic information from multiple sources using a core ontology. Key components include a Resource Description Framework (RDF) triplestore and SPARQL endpoint, linked open data from outside sources, and a map-based interface based on data semantics. Questions about the storage and retrieval of large databases are outside the scope of this study.
MapKB is publicly available as a set of Docker containers of executable software on the U.S. Geological Survey GitHub repository. We implemented a bottom-up approach for the capture of schema and feature instance data from existing sources.
The workflow to populate the triplestore begins with Esri shapefiles converted to Geography Markup Language (GML) and then to Web Ontology Language (OWL), using Karma to design the RDB to RDF Mapping Language (R2RML) schemas. Python functions were created for namespace management and to convert GML to well-known text (WKT).
The ontology resulting from the transformation required manual editing. Some redundant attributes were removed and replaced with object properties to reduce data storage requirements and to reuse set members for inference.
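The GML to WKT step mentioned in the talk can be illustrated with a minimal sketch. This is not the project's own function: the name `gml_to_wkt` is hypothetical, only `gml:Point` and `gml:LineString` with `pos`/`posList` encoding are handled, and the older `http://www.opengis.net/gml` namespace is assumed. Coordinate values are passed through unchanged, so axis order is not checked.

```python
import xml.etree.ElementTree as ET

# Assumption: GML 2/3.1 namespace. GML 3.2 documents use a different URI.
GML_NS = "http://www.opengis.net/gml"


def _pairs(coords):
    """Group a flat list of coordinate strings into 'x y' pairs."""
    return [f"{coords[i]} {coords[i + 1]}" for i in range(0, len(coords) - 1, 2)]


def gml_to_wkt(gml_text):
    """Convert a gml:Point or gml:LineString fragment to a WKT string.

    Hypothetical helper for illustration; other geometry types raise ValueError.
    """
    root = ET.fromstring(gml_text)
    tag = root.tag.split("}")[-1]  # strip the '{namespace}' part of the tag
    if tag == "Point":
        pos = root.find(f"{{{GML_NS}}}pos")
        return f"POINT ({' '.join(pos.text.split())})"
    if tag == "LineString":
        pos_list = root.find(f"{{{GML_NS}}}posList")
        return f"LINESTRING ({', '.join(_pairs(pos_list.text.split()))})"
    raise ValueError(f"Unsupported geometry: {tag}")


point = (
    '<gml:Point xmlns:gml="http://www.opengis.net/gml">'
    "<gml:pos>-90.2 38.6</gml:pos></gml:Point>"
)
print(gml_to_wkt(point))
```

A production converter would also need polygons, multi-geometries, and coordinate reference system handling, which this sketch leaves out.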
OWL properties were edited to improve logical reasoning. The transformation ontology was then aligned with the Open Geospatial Consortium GeoSPARQL ontology. The Virtuoso triplestore and SPARQL endpoint were then ready to be queried using an interface.
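One common way to express such an alignment is to declare source classes as subclasses of `geo:Feature` and `geo:Geometry`. The sketch below generates those statements as Turtle text. The `ex:` namespace, the class names, and the function `alignment_turtle` are placeholders for illustration, not the MapKB ontology itself.

```python
# Prefixes for the generated Turtle. The ex: namespace is a placeholder.
TURTLE_PREFIXES = """@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix ex: <http://example.org/mapkb#> .
"""


def alignment_turtle(feature_classes, geometry_classes):
    """Return Turtle that places local classes under the GeoSPARQL classes."""
    lines = [TURTLE_PREFIXES]
    for name in feature_classes:
        lines.append(f"ex:{name} rdfs:subClassOf geo:Feature .")
    for name in geometry_classes:
        lines.append(f"ex:{name} rdfs:subClassOf geo:Geometry .")
    return "\n".join(lines)


# Hypothetical class names chosen only for the example.
print(alignment_turtle(["Stream"], ["Polyline"]))
```

Once statements like these are loaded, a GeoSPARQL-aware query against `geo:Feature` can also reach the aligned local classes, provided the triplestore applies subclass inference.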
The MapKB user interface is based on Leaflet for retrieving, visualizing, and evaluating integrated data. Initial feature representations can be selected using a menu of namespaces. More than one graph can be viewed simultaneously.
By clicking on geospatial features, their associated properties, objects, and literals are available for browsable graph searching. A SPARQL viewer and custom query builder allow users to generate faceted searches based on different parameters for the graphs currently in the triplestore.
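A query builder of this kind can be thought of as a function that turns selected facets into a SPARQL string. The following is a rough sketch under assumed names: `build_faceted_query`, the graph IRI, and the property IRIs are all hypothetical, and only exact-match literal facets are supported.

```python
def build_faceted_query(graph_iri, feature_class, facets, limit=100):
    """Build a SPARQL SELECT query from facet selections.

    facets maps a property IRI to the literal value the user picked.
    """
    patterns = [f"?feature a <{feature_class}> ."]
    for prop, value in facets.items():
        # Escape backslashes and quotes so the literal stays well formed.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        patterns.append(f'?feature <{prop}> "{escaped}" .')
    body = "\n    ".join(patterns)
    return (
        "SELECT ?feature\n"
        f"FROM <{graph_iri}>\n"
        "WHERE {\n"
        f"    {body}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


# All IRIs below are placeholders for illustration.
query = build_faceted_query(
    "http://example.org/graph/structures",
    "http://example.org/mapkb#Structure",
    {"http://example.org/mapkb#state": "Missouri"},
)
print(query)
```

Each added facet contributes one more triple pattern, so the result set narrows as the user selects more parameters.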
Though lacking full geospatial support, pre-built queries support certain topological spatial relation functions that allow user access across different datasets without the need for background knowledge.
Ontology design was validated through inference and competency questions. An ontology pattern infers interchangeable feature type references consisting of index codes, natural language annotation, and GIS geometries.
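The idea of interchangeable feature type references can be illustrated with a toy reasoner. This sketch is an assumption-laden stand-in for a real OWL reasoner: it handles only `rdfs:subClassOf` and `owl:equivalentClass`, and the identifiers `ex:FCode_73002`, `ex:ExampleFeatureType`, and `ex:feature1` are hypothetical.

```python
def infer_types(triples):
    """Return every (instance, class) pair entailed by the class axioms."""
    subclass = set()
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            subclass.add((s, o))
        elif p == "owl:equivalentClass":
            # Equivalence is treated as subclassing in both directions.
            subclass.update({(s, o), (o, s)})
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    changed = True
    while changed:  # repeat until no new type assertion appears
        new = {(i, sup) for i, c in types for sub, sup in subclass if sub == c}
        changed = not new <= types
        types |= new
    return types


# Hypothetical data: a code-based class equated with a named feature type.
EXAMPLE_TRIPLES = [
    ("ex:FCode_73002", "owl:equivalentClass", "ex:ExampleFeatureType"),
    ("ex:ExampleFeatureType", "rdfs:subClassOf", "geo:Feature"),
    ("ex:feature1", "rdf:type", "ex:FCode_73002"),
]

for instance, cls in sorted(infer_types(EXAMPLE_TRIPLES)):
    print(instance, "is a", cls)
```

In this toy setup, an instance typed only by its code is also inferred to belong to the named feature type and to `geo:Feature`, which is the kind of answer a competency question about a feature code asks for.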
Data integration was tested through SPARQL and GeoSPARQL queries such as nearby points and entities within. We tested geospatial properties to join subgraphs from The National Map with linked open data from DBpedia and GeoNames.
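Queries of the "nearby points" and "entities within" kind might look like the strings built below. This is a sketch, not the pre-built MapKB queries: the function names and the feature IRI are hypothetical, while `geof:distance`, `geof:sfWithin`, `geo:hasGeometry`, and `geo:asWKT` come from the GeoSPARQL standard. Support for these functions varies by triplestore.

```python
SPARQL_PREFIXES = (
    "PREFIX geo: <http://www.opengis.net/ont/geosparql#>\n"
    "PREFIX geof: <http://www.opengis.net/def/function/geosparql/>\n"
    "PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>\n"
)


def nearby_points_query(feature_iri, max_metres):
    """Find features whose geometry lies within max_metres of feature_iri."""
    return SPARQL_PREFIXES + f"""SELECT ?other ?dist
WHERE {{
  <{feature_iri}> geo:hasGeometry/geo:asWKT ?wkt .
  ?other geo:hasGeometry/geo:asWKT ?otherWkt .
  BIND (geof:distance(?wkt, ?otherWkt, uom:metre) AS ?dist)
  FILTER (?other != <{feature_iri}> && ?dist <= {float(max_metres)})
}}
ORDER BY ?dist"""


def entities_within_query(polygon_wkt):
    """Find features whose geometry falls inside the given WKT polygon."""
    return SPARQL_PREFIXES + f"""SELECT ?feature
WHERE {{
  ?feature geo:hasGeometry/geo:asWKT ?wkt .
  FILTER (geof:sfWithin(?wkt, "{polygon_wkt}"^^geo:wktLiteral))
}}"""


# The feature IRI is a placeholder for illustration.
print(nearby_points_query("http://example.org/mapkb#feature1", 500))
```

The resulting strings would be sent to a SPARQL endpoint by whatever client the interface uses; that step is omitted here.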
Our results indicate that technical aspects of MapKB were successful, but effective knowledge-based design requires further consideration. To effectively represent landscapes, GeoSPARQL operations required an infusion of cognitive spatial thinking beyond logic.
Relating data from The National Map with linked open data drew on W3C and OGC properties intended to support graph reuse, but collectively they were difficult to apply. A better metadata annotation approach for structural alignment, in addition to syntax matching, may be the solution.
For more information, please see the academic track paper published for this conference. Thank you for your attention.