VIVO ETL using open source tools

Formal Metadata

Title
VIVO ETL using open source tools
Title of Series
Number of Parts
18
Author
Contributors
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Loading data into VIVO requires creating triples that conform to the VIVO ontologies. Data may come from a variety of sources and in a variety of formats. vivo-etl (https://github.com/mconlon17/vivo-etl) is a simple open source command-line pipeline that uses existing open source tools to extract data from a source, transform it to VIVO triples, and load the triples into a VIVO TDB data store. The method extracts data from an API using wget, transforms the CSV or JSON data to "raw" RDF, and then transforms the "raw" RDF to VIVO RDF using a SPARQL CONSTRUCT query executed from the command line with robot, an open source tool (http://robot.obolibrary.org/). The VIVO triples can then be loaded using tdbloader. The method can be used to transform data from any source (CERIF, PubMed, Dimensions, local repositories) to the current VIVO ontologies, or to ontologies under development by the VIVO Ontology Interest Group. The presentation includes a demonstration that gathers data from ROR (Research Organization Registry) and provides the data as VIVO triples.
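
The JSON-to-"raw"-RDF step described in the abstract can be illustrated with a minimal Python sketch. This is not code from the vivo-etl repository, and the repository may perform this step differently. The namespace `http://example.org/raw/`, the function names, and the sample record (including the placeholder identifier `https://ror.org/example`) are all assumptions made for illustration. The sketch assumes a flat record whose keys are safe to use in an IRI, and it skips nested objects.

```python
import json

# Hypothetical namespace for "raw" predicates; not taken from vivo-etl.
RAW_NS = "http://example.org/raw/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_raw_triples(record: dict, id_key: str = "id") -> list:
    """Turn one flat JSON record into "raw" N-Triples lines.

    Each top-level key becomes a predicate in RAW_NS and each scalar value
    becomes a plain string literal. List values yield one triple per scalar
    item. Nested objects and None values are skipped in this sketch.
    """
    subject = f"<{record[id_key]}>"
    triples = []
    for key, value in record.items():
        if key == id_key or value is None:
            continue
        items = value if isinstance(value, list) else [value]
        for item in items:
            if item is None or isinstance(item, (dict, list)):
                continue  # nested structures are out of scope here
            literal = escape_literal(str(item))
            triples.append(f'{subject} <{RAW_NS}{key}> "{literal}" .')
    return triples


# Made-up sample record; the identifier is a placeholder, not a real ROR ID.
sample = json.loads(
    '{"id": "https://ror.org/example", "name": "Example University", '
    '"aliases": ["EU", "Ex U"], "established": 1900}'
)

for line in record_to_raw_triples(sample):
    print(line)
```

Keeping this step purely mechanical, with no ontology decisions, leaves all mapping logic to the later SPARQL CONSTRUCT query, which is where the abstract places the transformation to VIVO RDF.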