Enhancing authority files through SPARQL federated queries

ZBW - Leibniz-Informationszentrum Wirtschaft

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Kerboul, Thomas

Formal Metadata

Title

Enhancing authority files through SPARQL federated queries

Title of Series

SWIB25 - Semantic Web in Libraries

Number of Parts

Author

Kerboul, Thomas

Contributors

Suominen, Osma (Moderation)

License

CC Attribution - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Identifiers

10.5446/72403 (DOI)

Publisher

ZBW - Leibniz-Informationszentrum Wirtschaft

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Release Date

2025

Language

English

Content Metadata

Subject Area

Computer Science, Information Science

Genre

Conference/Talk

Abstract

Federated queries merge data across databases, which makes it possible to identify errors and gaps where content overlaps. The Bibliothèque de Genève uses SPARQL federated queries between Wikidata and IdRef, a French authority file used for bibliographic cataloging, to enhance records about individuals related to Geneva. Using the IdRef identifier as the common link, several modular queries were designed to surface potential improvements. Correcting the mismatches, however, was predominantly manual, and this manual step proved crucial, especially in cases of homonymy. IdRef identifiers, often added to Wikidata through VIAF clusters, may be attached to the wrong individuals. Manual curation ensured that errors did not propagate further, particularly across members of a given VIAF cluster, thereby maintaining data integrity. The comparison also revealed that Wikidata tended to be more accurate and up to date than IdRef, showcasing the potential of community-curated databases. This presentation aims to demonstrate the reliability of community-curated databases and the power of federated queries, particularly in SPARQL, to improve data accuracy and integration across multiple sources. By sharing these insights, we hope to encourage other institutions to adopt similar methods to improve their data management practices.
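
To make the approach described in the abstract more concrete, here is a minimal Python sketch of one modular federated query and a mismatch check. It is an illustration under stated assumptions, not the queries used in the talk. The IdRef endpoint URL, the IdRef URI pattern, the use of `foaf:name` on the IdRef side, the choice of "place of birth is Geneva" (`wd:Q71`) as the selection criterion, and the helper names `build_query` and `birth_year_mismatches` are all assumptions made for illustration. Whether the Wikidata Query Service permits federation to the IdRef endpoint should be checked before running the query.

```python
from string import Template

# Assumed IdRef SPARQL endpoint; verify against the IdRef documentation.
IDREF_ENDPOINT = "https://data.idref.fr/sparql"

# One "module": Wikidata items with an IdRef ID (P269) and a chosen birthplace,
# joined to the IdRef record through a URI built from the identifier.
QUERY_TEMPLATE = Template("""\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?item ?idrefId ?wdBirth ?idrefName WHERE {
  ?item wdt:P269 ?idrefId ;   # IdRef ID
        wdt:P19 wd:$place .   # place of birth (illustrative criterion)
  OPTIONAL { ?item wdt:P569 ?wdBirth . }   # date of birth
  # Assumed IdRef URI pattern: http://www.idref.fr/<id>/id
  BIND(IRI(CONCAT("http://www.idref.fr/", ?idrefId, "/id")) AS ?idrefUri)
  SERVICE <$endpoint> {
    ?idrefUri foaf:name ?idrefName .   # assumed predicate on the IdRef side
  }
}
LIMIT $limit
""")


def build_query(place_qid: str = "Q71", limit: int = 100) -> str:
    """Return the federated query text for a given birthplace item."""
    return QUERY_TEMPLATE.substitute(
        place=place_qid, endpoint=IDREF_ENDPOINT, limit=limit
    )


def birth_year_mismatches(rows):
    """Return rows where both sources give a birth date and the years differ.

    Each row is a dict with hypothetical keys "wd_birth" and "idref_birth"
    holding ISO-style date strings (or None when a source has no value).
    Flagged rows are candidates for manual review, not automatic correction.
    """
    flagged = []
    for row in rows:
        wd, idref = row.get("wd_birth"), row.get("idref_birth")
        if wd and idref and wd[:4] != idref[:4]:
            flagged.append(row)
    return flagged
```

The query text returned by `build_query()` can be pasted into a SPARQL editor or sent with any HTTP client. The comparison step only flags candidates: as the abstract stresses, a differing birth year may signal a homonym wrongly linked through a VIAF cluster, so each flagged row still needs a human decision.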