RDF export of TIB AV-Portal metadata

The German National Library of Science and Technology (TIB) aims to promote the use and distribution of its collections. In this context, TIB publishes the authoritative and time-based, automatically generated metadata of videos of the TIB AV-Portal as Linked Open Data. Only metadata and thumbnails of videos which allow usage of their respective metadata and thumbnails under the Creative Commons License CC0 1.0 Universal are made available. Please note that the data was partially generated by an automatic process and may therefore contain errors or might be incomplete.

In addition, TIB offers the metadata of the TIB AV-Portal via an OAI interface in the formats OAI Dublin Core, MARC XML or RDF/XML.

Datasets

Total

Filename Format Size Date created: Version:
tib-av-portal-export-2024-11-28.ttl (zipped) text/turtle ~1.6GiB (unzipped ~17.4GiB) 29.11.2024 2024-11-28

Dumps of publisher IWF Wissen und Medien gGmbH i.L.

These dumps are a subset of the total stock and contain only the videos of the publisher IWF Wissen und Medien gGmbH i.L.

Filename Format Size Date created: Version:
tib-av-portal-export-iwf-2024-11-28.ttl (zipped) text/turtle ~27.6MiB (unzipped ~244.9MiB) 29.11.2024 2024-11-28

Additional Data and Mappings

Mapping of TIB AV-Portal Subjects to DBpedia and GND

Filename Format Size Date created: Version:
tib-av-portal-subjects-1.0.0.ttl application/turtle 11kB 18.03.2016 1.0.0

Mapping of TIB AV-Portal VCD Classes to DBpedia, Wikidata, and GND

Filename Format Size Date created: Version:
tib-av-portal-classes_vcd-1.0.1.ttl application/turtle 11kB 26.06.2018 1.0.1
tib-av-portal-classes_vcd-1.0.1.n3 application/turtle 48kB 26.06.2018 1.0.1

License

For the use of the metadata and provided thumbnails, the conditions of the Creative Commons License CC0 1.0 Universal (CC0 1.0) Public Domain Dedication shall apply.
(Click here to view summary and legally binding version of the license.)

Acknowledgement

When using the data of TIB, please link to the page https://av.tib.eu/opendata in order to promote the use and distribution of this data.

Documentation of the Data Dumps

This documentation gives a brief overview of the structure of the dump data and shows how to import it into an RDF store and query it with SPARQL.

Structure of the data

This section will introduce the structure of the TIB AV-Portal RDF data.

The following table shows the RDF prefixes used in the dumps.

Prefix Namespace Vocabulary
bibframe http://bibframe.org/vocab/ Bibframe Vocabulary
dbp http://dbpedia.org/resource/ DBpedia Resources
dcterms http://purl.org/dc/terms/ DCMI Metadata Terms
dctypes http://purl.org/dc/dcmitype/ DCMI Type Vocabulary
foaf http://xmlns.com/foaf/0.1/ Friend of a Friend Vocabulary
gnd http://d-nb.info/gnd/ Integrated Authority File (GND)
schema http://schema.org/ Schema.org Vocabulary
tib http://av.tib.eu/resource/ TIB AV-Portal Resources
cnt http://www.w3.org/2011/content# Representing Content in RDF
itsrdf http://www.w3.org/2005/11/its/rdf# Internationalization Tag Set (ITS)
nif http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core# NLP Interchange Format
oa http://www.w3.org/ns/oa# Open Annotation Data Model
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns# Resource Description Framework

Note: In Turtle syntax, slashes are not allowed unescaped in the local part of a prefixed name; they have to be escaped with '\'.
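The escaping rule in the note above can be sketched in Python as follows. This is a minimal illustration, not part of the TIB tooling: the function name `to_prefixed_name` and the constant names are hypothetical, and the set of escaped characters follows the reserved-character escapes of the Turtle grammar as far as the examples on this page need them.

```python
# Characters that may appear in a Turtle local name only when preceded by a
# backslash. '_', '-', '.' and '%' are deliberately left alone: the first
# three are normally legal unescaped (a trailing '.' is not handled here),
# and '%' normally starts a percent-encoded triplet such as %28.
ESCAPED_CHARS = set("~!$&'()*+,;=/?#@")

# Namespace of the 'tib' prefix, taken from the prefix table above.
TIB_NAMESPACE = "http://av.tib.eu/resource/"


def to_prefixed_name(iri: str, prefix: str = "tib",
                     namespace: str = TIB_NAMESPACE) -> str:
    """Turn a full IRI into a prefixed name with an escaped local part."""
    if not iri.startswith(namespace):
        raise ValueError(f"{iri!r} is not in namespace {namespace!r}")
    local = iri[len(namespace):]
    escaped = "".join("\\" + ch if ch in ESCAPED_CHARS else ch for ch in local)
    return f"{prefix}:{escaped}"


print(to_prefixed_name("http://av.tib.eu/resource/video/16453"))
# prints tib:video\/16453
```

Writing the full IRI in angle brackets, e.g. <http://av.tib.eu/resource/video/16453>, avoids the escaping altogether.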

Example 1: Video Standard Metadata (datatype properties / literals)
tib:video\/16453 schema:name           "Wall-crossing and geometry at infinity of Betti moduli spaces"@en ;
schema:description    "Linear algebraic differential equation (in one variable) depending on a small ..."@en ;
schema:keywords       "Betti moduli"@en , "chaos theory"@en , "singularity"@en ;
schema:dateCreated    "1973-01-01T00:00:00+01:00"^^<http://www.w3.org/2001/XMLSchema#gYear> ;
schema:duration       "1:16:48" .
Example 2: Video Standard Metadata (object properties)
tib:video\/16453 rdf:type              schema:Movie ;
schema:url            <https://av.tib.eu/media/16453> ;
schema:producer       gnd:4028361-6 ;
schema:publisher      tib:Institut_des_Hautes__tudes_Scientifiques_%28IH_S%29 ;
schema:license        <http://creativecommons.org/licenses/by/3.0/deed.en> ;
schema:availability   schema:OnlineOnly ;
bibframe:doi          <http://dx.doi.org/10.5446/16453> ;
schema:thumbnailUrl   <https://av.tib.eu/images/avpimg1fdaede78b338bba137140fd805cd382> .

tib:Institut_des_Hautes__tudes_Scientifiques_%28IH_S%29  foaf:name  "Institut des Hautes Études Scientifiques (IHÉS)" .

Note: Wherever possible, we mapped publishers, producers, creators, etc. to existing knowledge bases or authority files (e.g. GND). In some cases, a mapping has not been made yet or is simply impossible. In those cases, the resource is represented by an IRI with the ‘tib:’ prefix and its corresponding information, e.g. foaf:name. In future versions of the dumps, these IRIs may be replaced by the corresponding knowledge base or authority file resources, if possible.

Example 3: OCR result

Image: Example 3

tib:video\/16453?t=smpte-25:0:28:17:11&xywh=368,316,292,15 dcterms:isPartOf tib:video\/16453 .

tib:ocr\/16453_42436_42436_x368y316h15w292   oa:hasTarget    tib:video\/16453?t=smpte-25:0:28:17:11&xywh=368,316,292,15 ;
oa:hasBody      tib:ocr\/16453_42436_42436_x368y316h15w292?char=0,7 ;
oa:annotatedBy  tib:annotator\/OCR-1.0.0 ;
rdf:type        oa:Annotation .

tib:ocr\/16453_42436_42436_x368y316h15w292?char=0,7 rdf:type nif:Context ;
rdf:type nif:RFC5147String ;
nif:isString "optimal" .
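The annotation targets in these examples carry a temporal part (t=smpte-25:...) and, for OCR, a spatial part (xywh=...). The following Python sketch shows one way to decode them, assuming the timecode is hours:minutes:seconds:frames at the frame rate named after "smpte-" and that xywh lists x, y, width and height in that order. The names `VideoFragment`, `timecode_to_frames` and `parse_fragment` are hypothetical, and only the "smpte-25" form seen on this page is handled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import parse_qs


@dataclass
class VideoFragment:
    fps: int                                  # frame rate taken from "smpte-25"
    start_frame: int                          # absolute frame index of the start
    end_frame: Optional[int]                  # None when only one timecode is given
    box: Optional[Tuple[int, int, int, int]]  # (x, y, width, height) or None


def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert 'h:mm:ss:ff' into an absolute frame index."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def parse_fragment(query: str) -> VideoFragment:
    """Parse the part after '?' of a target IRI, e.g. 't=smpte-25:0:28:17:11'."""
    params = parse_qs(query)
    scheme, _, timecodes = params["t"][0].partition(":")
    fps = int(scheme.split("-")[1])
    start, _, end = timecodes.partition(",")
    box = None
    if "xywh" in params:
        x, y, w, h = (int(value) for value in params["xywh"][0].split(","))
        box = (x, y, w, h)
    end_frame = timecode_to_frames(end, fps) if end else None
    return VideoFragment(fps, timecode_to_frames(start, fps), end_frame, box)


fragment = parse_fragment("t=smpte-25:0:28:17:11&xywh=368,316,292,15")
print(fragment.start_frame)                 # prints 42436
print(fragment.start_frame / fragment.fps)  # start position in seconds
print(fragment.box)                         # prints (368, 316, 292, 15)
```

In Examples 3 to 5 of this section, the start frame computed this way (42436, 1557 and 7522) coincides with the last number in the annotation identifier. This is an observation from the examples, not a documented guarantee.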
Example 4: VCD result

Image: Example 4

tib:video\/16453?t=smpte-25:0:01:02:07 dcterms:isPartOf tib:video\/16453 .

tib:vcd\/16453_1347007_1557  oa:hasTarget   tib:video\/16453?t=smpte-25:0:01:02:07 ;
oa:hasBody     tib:visualconcepts\/Lecture ;
oa:annotatedBy tib:annotator\/VCD-1.0.0 ;
oa:motivatedBy oa:tagging ;
rdf:type       oa:Annotation .

tib:visualconcepts\/Lecture  rdf:type oa:SemanticTag .
Example 5: Named Entity Linking of OCR/ASR

Image: Example 5

tib:video\/16453?t=smpte-25:0:05:00:22,0:05:03:00 dcterms:isPartOf tib:video\/16453 .

tib:asr\/16453_13753838_7522 oa:hasTarget   tib:video\/16453?t=smpte-25:0:05:00:22,0:05:03:00 ;
oa:annotatedBy tib:annotator\/ASR-1.0.0 ;
rdf:type       oa:Annotation ;
oa:hasBody     tib:asr\/16453_13753838_7522?char=0,5617 .

tib:asr\/16453_13753838_7522?char=0,5617 rdf:type nif:Context ;
rdf:type nif:RFC5147String .

tib:asr\/16453_13753838_7522?char=4743,4747 nif:referenceContext tib:asr\/16453_13753838_7522?char=0,5617 ;
itsrdf:taIdentRef gnd:4038613-2 ;
itsrdf:taAnnotatorsRef tib:annotator\/NEL-1.0.0 ;
rdf:type nif:Phrase ;
rdf:type nif:String ;
nif:beginIndex "4743" ;
nif:endIndex "4747" ;
nif:anchorOf "sets" .
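The character offsets in Example 5 can be read as follows. The Python sketch below assumes NIF's usual convention of zero-based offsets with an exclusive end, so that the difference between end and begin equals the length of the anchored text. The function names are hypothetical, and the context string "open sets" is a made-up stand-in, because the real ASR transcript is not reproduced on this page.

```python
from typing import Tuple


def char_range(iri: str) -> Tuple[int, int]:
    """Extract (begin, end) from an IRI ending in '?char=begin,end'."""
    _, _, spec = iri.partition("?char=")
    begin, end = spec.split(",")
    return int(begin), int(end)


def anchor_of(context_text: str, begin: int, end: int) -> str:
    """Return the substring a phrase refers to inside its reference context."""
    return context_text[begin:end]


begin, end = char_range(
    "http://av.tib.eu/resource/asr/16453_13753838_7522?char=4743,4747"
)
print(end - begin)  # prints 4, the length of "sets"

# Made-up context, only to show the slicing:
print(anchor_of("open sets", 5, 9))  # prints sets
```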

How to import the dumps to a triple store

The following table lists some popular RDF stores that can import the provided RDF dumps directly.

Virtuoso Opensource https://vos.openlinksw.com/owiki/wiki/VOS/
Sesame (now Eclipse RDF4J) http://rdf4j.org/
Apache Jena TDB https://jena.apache.org/documentation/tdb/
Blazegraph https://www.blazegraph.com/

For a quick start, you can use Blazegraph as follows:

Download the Blazegraph jar and follow the instructions to start Blazegraph from: https://github.com/blazegraph/database/wiki/Main_Page

Once you have started Blazegraph, you should be able to access it in your web browser at:
http://localhost:9999/blazegraph/

Now, download and unzip the TIB AV-Portal dump file from the tables above.
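Because the full dump is large once unzipped, it can be useful to look at the first lines before extracting everything. The following Python sketch assumes that the download is a standard ZIP archive containing a single .ttl file. The function name `peek_dump` is hypothetical.

```python
import io
import zipfile
from itertools import islice
from typing import List


def peek_dump(zip_path: str, max_lines: int = 20) -> List[str]:
    """Return the first lines of the Turtle file inside a zipped dump."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one Turtle file; take the first match.
        member = next(name for name in archive.namelist() if name.endswith(".ttl"))
        with archive.open(member) as raw:
            # Decode lazily so that only the requested lines are decompressed.
            text = io.TextIOWrapper(raw, encoding="utf-8")
            return [line.rstrip("\n") for line in islice(text, max_lines)]
```

Calling `peek_dump` with the path of the downloaded archive should show the @prefix declarations and the first triples, which is a quick check that the download is intact.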

To import the TIB AV-Portal dump file into Blazegraph (cf. the screenshot below):

  • switch to the “UPDATE” tab in Blazegraph
  • enter the complete absolute path or URL of the locally downloaded and extracted dump file into the text input field
  • select Type: “File Path or URL” from the dropdown menu
  • press the “Update” button below

The update should start now, indicated by “Running updates ...”. It will likely take some time (about 10 to 30 minutes, depending on your computer) to finish, indicated by a message such as “Modified: 10099269 Milliseconds: 1441798”.

Blazegraph Screenshot

How to query the data with SPARQL

In Blazegraph, switch to the “QUERY” tab and enter the following example queries.

Use the following prefixes with every query:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX gnd: <http://d-nb.info/gnd/>
PREFIX schema: <http://schema.org/>
PREFIX tib: <http://av.tib.eu/resource/>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX oa: <http://www.w3.org/ns/oa#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
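For scripted access, the prefix block can be prepended automatically. The Python sketch below is one possible approach: `with_prefixes` and `run_select` are hypothetical helper names, and the endpoint URL assumes Blazegraph's default namespace "kb" on the local installation described above. Adjust it if your setup differs.

```python
import json
import urllib.parse
import urllib.request

# Assumption: local Blazegraph with the default namespace "kb".
ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

PREFIXES = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX gnd: <http://d-nb.info/gnd/>
PREFIX schema: <http://schema.org/>
PREFIX tib: <http://av.tib.eu/resource/>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX oa: <http://www.w3.org/ns/oa#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
"""


def with_prefixes(body: str) -> str:
    """Prepend the shared prefix declarations to a query body."""
    return PREFIXES + "\n" + body.strip() + "\n"


def run_select(body: str, endpoint: str = ENDPOINT) -> list:
    """POST a SELECT query as a form parameter and return the result bindings."""
    data = urllib.parse.urlencode({"query": with_prefixes(body)}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


print(with_prefixes("SELECT * WHERE { ?s ?p ?o . } LIMIT 5"))
```

`run_select` needs a running Blazegraph instance with the dump loaded. `with_prefixes` works offline and can also be used to paste complete queries into the “QUERY” tab.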
Example 1: Show video with ID 16453 and all of its triples
SELECT *
WHERE {
  tib:video\/16453 ?p ?o .
}
Example 2: Show all videos of publisher 'IWF (Göttingen)'
SELECT DISTINCT ?movie
WHERE {
  ?movie rdf:type schema:Movie .
  ?movie schema:publisher <http://av.tib.eu/resource/IWF_%28G%C3%B6ttingen%29> .
}
Example 3: Show all videos having the term ‘big data’ in their title
SELECT DISTINCT ?movie ?name
WHERE {
  ?movie rdf:type schema:Movie .
  ?movie schema:name ?name .
  FILTER REGEX(STR(?name), 'big data', 'i') .
}
Example 4: How many videos are annotated with a visual concept?
SELECT (COUNT(DISTINCT ?video) AS ?count)
WHERE {
  ?annotation oa:annotatedBy tib:annotator\/VCD-1.0.0 .
  ?annotation oa:hasTarget ?videoFragment .
  ?annotation oa:hasBody ?concept .
  ?videoFragment dcterms:isPartOf ?video .
}
Example 5: Show videos which have GND entity ‘http://d-nb.info/gnd/4298379-4’ annotated
SELECT ?video
WHERE {
  ?phrase itsrdf:taIdentRef gnd:4298379-4 .
  ?phrase nif:referenceContext ?context .
  ?annotation oa:hasBody ?context .
  ?annotation oa:hasTarget ?videofragment .
  ?videofragment dcterms:isPartOf ?video .
}
Example 6: How many videos have OCR analysis results?
SELECT (COUNT(DISTINCT ?video) AS ?count)
WHERE {
  ?annotation oa:annotatedBy tib:annotator\/OCR-1.0.0 .
  ?annotation oa:hasTarget ?videofragment .
  ?videofragment dcterms:isPartOf ?video .
}
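When the counting queries (Examples 4 and 6) are sent to a SPARQL endpoint with the Accept header application/sparql-results+json, the answer arrives in the standard SPARQL JSON results format. The following Python sketch shows how to pull the number out of such a response. The function name `count_value` is hypothetical, and the sample document contains a made-up placeholder value, not a figure from the dump.

```python
import json


def count_value(results_json: str, variable: str = "count") -> int:
    """Read a single integer binding from a SPARQL JSON results document."""
    document = json.loads(results_json)
    bindings = document["results"]["bindings"]
    if not bindings:
        raise ValueError("the result set is empty")
    # Every binding maps a variable name to {"type": ..., "value": ...};
    # the value is always transported as a string.
    return int(bindings[0][variable]["value"])


# Placeholder response with an invented value, for illustration only.
SAMPLE = """
{
  "head": {"vars": ["count"]},
  "results": {"bindings": [
    {"count": {"type": "literal",
               "datatype": "http://www.w3.org/2001/XMLSchema#integer",
               "value": "123"}}
  ]}
}
"""

print(count_value(SAMPLE))  # prints 123
```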