VIVO ETL using open source tools

Technische Informationsbibliothek (TIB)

Conlon, Mike

Formale Metadaten

Titel

Serientitel

12th International VIVO Conference, June 23 - 25, 2021

Anzahl der Teile

Autor

Conlon, Mike

Mitwirkende

Technische Informationsbibliothek (TIB)

Lizenz

CC-Namensnennung 4.0 International:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.

Identifikatoren

10.5446/51872 (DOI)

Herausgeber

Technische Informationsbibliothek (TIB)

Erscheinungsjahr

2021

Sprache

Englisch

Inhaltliche Metadaten

Fachgebiet

Information und Dokumentation

Genre

Konferenz/Talk

Abstract

Loading data to VIVO requires the creation of triples using the VIVO ontologies. Data may come from a variety of sources and in a variety of formats. vivo-etl (https://github.com/mconlon17/vivo-etl) is a simple open source command-line pipeline using available open source tools for extracting data from a source, transforming it to VIVO triples, and loading the triples to a VIVO TDB data store. The method extracts data from an API using wget, transforms CSV or JSON data to "raw" RDF and then transforms the "raw" RDF to VIVO RDF using a SPARQL CONSTRUCT query executed from the command line using robot, an open source tool (http://robot.obolibrary.org/). VIVO triples can then be loaded using tdbloader. The method can be used to transform data from any source (CERFIF, PubMed, Dimensions, local repositories) to the current VIVO ontologies, or to ontologies under development by the VIVO Ontology Interest Group. A demonstration gathering data from ROR (Research Organization Registry) and providing the data as VIVO triples is included in the presentation.