DISCOVERY - BIOfid: Accessing legacy literature the semantic (search) way
Formal Metadata
Number of Parts: 14
License: CC Attribution 3.0 Germany. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/60267 (DOI)
Transcript: English (auto-generated)
00:00
Adrian is a software and data engineer and aims to provide easier access to historical biodiversity literature. The stage is yours, Adrian. Thank you. So, welcome. I'm Adrian Pachtheit. I'm working for the BIOfid project. We are especially interested in legacy biodiversity
00:27
literature and how we can approach it in a semantic way, so users can search documents semantically. A few words about the BIOfid project: we have been funded by the DFG since 2017. Our special focus
00:43
is text annotation and text mining, specifically of German texts. Our texts currently focus on vascular plants, butterflies, and birds. So if you are searching, for example, for raccoons in the portal, you won't find them, because the portal does not know them. We are providing several tools
01:07
for the community: text annotation, bio-ontologies, and the thing that I will talk about here, semantic search on documents. So, let's jump into it.
01:21
When you query the BIOfid portal, you get an output at the top that first repeats your query and also shows you what the portal has recognized. In this case, it recognized Taxus baccata, which is a plant, and it realized that Germany was mentioned.
01:48
Both were recognized correctly. What the portal does now is resolve the names. So it looks up whether, next to Taxus baccata, there are also subspecies of Taxus baccata available in the database.
02:04
The same thing is applied for Germany. So Germany is not only taken as a term; the portal will also look up, for example, Bavaria, Berlin, Brandenburg, any place that is in Germany and occurs in the texts. When you hit one of the page hit buttons, you get a short preview of all
02:29
the hits within the document. The terms in bold reference back to your search query. The terms surrounded by green bubbles are locations.
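To make the name resolution step above more concrete, here is a minimal sketch of such a query expansion in Python. The small hierarchy, the function, and the subspecies placeholder are invented for illustration; the portal resolves names against its ontologies and gazetteer data, not a hard-coded dictionary.

```python
# Minimal sketch of query expansion: a recognized term is resolved to itself
# plus all narrower concepts (subspecies, contained places) known to the system.
NARROWER = {
    # placeholder children; the real portal gets these from its ontologies
    "Taxus baccata": ["Taxus baccata (some subspecies)"],
    "Germany": ["Bavaria", "Berlin", "Brandenburg"],
}

def expand(term: str) -> list[str]:
    """Return the term together with every narrower concept, recursively."""
    result = [term]
    for child in NARROWER.get(term, []):
        result.extend(expand(child))
    return result

if __name__ == "__main__":
    for query_term in ("Taxus baccata", "Germany"):
        print(query_term, "->", expand(query_term))
```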
02:43
All the bold ones are in Germany. You also see Taxus baccata here at the bottom, surrounded in purple. So, how does the portal know what is said in the documents, and
03:04
how do we get the data out? First, I will go through the pre-processing, and then I will show you how we query the data. We currently rely mainly on our own BIOfid corpus,
03:25
which is quite small, but we are expanding it with texts from the Biodiversity Heritage Library and from ZOBODAT. At least from BHL, we get a lot of public domain texts, which
03:42
is awesome. We download these texts automatically and throw them into our text annotation pipeline, which does some amazing stuff with machine learning; I won't go into the details here. Most importantly, it will figure out the part of speech: is it a noun? Is it an adjective?
04:05
It will sort out the dependencies: how are the words related to each other within a sentence? I will come to this later. And of course, the most important thing for us biologists is the named entities: is this word a taxon? Is this word a location? Is this a person? Is it
04:24
something else? The named entity recognition hopefully will show us. And finally, from the text annotation I get a very crude XML format that I have to transform into appropriately annotated
04:41
texts. I enrich this with metadata for the documents and also extract data from all these annotations. I generate metadata ontologies; these ontologies help me to query
05:01
the data faster and more efficiently. All these data are stored within three databases that have to be perfectly in sync; otherwise, nothing will work. I will go through these three databases in detail: a key-value database, just termed index database here,
05:22
a triple store, and a document database. I will now show you how these databases are related to each other and how they can, more or less, talk to each other. The index database is basically just, as I said, a key-value database. You give it the
05:42
string plants, and it will return the ID 6. You give it Taxus baccata, and it will return the number 528-something. In the triple store, the ontology is stored.
06:03
The ontology knows, for example, that Taxus baccata is a plant. If I query plants, I also get Taxus baccata as part of the results. It also knows, of course, the ID of Taxus baccata. It knows that Taxus baccata has yellow flowers. It knows that the trivial
06:25
name of Taxus baccata is European yew. You see immediately that the IDs in the index and in the triple store are the same. Great, so we have already connected them.
06:41
Now, how do we connect this knowledge back to the documents? This is done quite easily, because we have annotated texts. Every word is surrounded by some XML tags. We now know here that Taxus baccata is a plant, and we know exactly which ID Taxus baccata has.
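As an illustration of how the three stores hang together, here is a minimal sketch with plain Python data structures. The IDs, predicates, and the XML snippet are simplified stand-ins for the real key-value index, triple store, and document database described here.

```python
# Index database: string -> ID (a plain key-value lookup).
index_db = {
    "plants": "6",
    "Taxus baccata": "528",   # simplified stand-in for the real identifier
}

# Triple store: facts about IDs as (subject, predicate, object).
triples = [
    ("528", "is_a", "6"),                        # Taxus baccata is a plant
    ("528", "has_flower_color", "color_yellow"),
    ("528", "trivial_name", "European yew"),
]

# Document database: annotated text where the token carries the same ID.
document = '<sentence>... <taxon id="528">Taxus baccata</taxon> grows ...</sentence>'

# Going from the string "plants" to documents: look up the ID, find all taxa
# that are plants in the triples, then check which documents carry those IDs.
plant_id = index_db["plants"]
plant_taxa = {s for s, p, o in triples if p == "is_a" and o == plant_id}
hits = [doc for doc in [document] if any(f'id="{t}"' in doc for t in plant_taxa)]
print(plant_taxa, hits)
```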
07:08
Guess what? It's the same ID that already exists in the triple store and in the index database. Every taxon carries at least one of these URIs. Most locations
07:24
only carry a Wikidata URI, which we simply reuse. Okay, now I have shown you how the data is connected within the databases. The question now is, how do we get the data out? When the user puts in a query,
07:48
for example, in this case, plants with yellow flowers, I would expect Taxus baccata to be among the results, because we just saw that Taxus baccata has yellow flowers and is a plant.
08:01
The user query is then run through natural language processing. The language processing will recognize plants as a plant, surprise, and it will also recognize that yellow is an adjective and flowers is a noun. Furthermore, it will work out how these words depend on each other.
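To illustrate the kind of analysis the query goes through, here is a small sketch using spaCy and its small English model (en_core_web_sm); this is just a generic stand-in, not the portal's own pipeline.

```python
# Sketch: part-of-speech tags and dependencies for the example query.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("plants with yellow flowers")

for token in doc:
    # e.g. "yellow" is typically an ADJ attached to "flowers" as a modifier
    print(f"{token.text:8} {token.pos_:6} --{token.dep_}--> {token.head.text}")
```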
08:23
Yellow is a dependency of flowers, and flowers is a dependency of plants, so these words are connected to each other. The index database, on the other side, will return the respective values for these strings. In this case, as we saw in the index database,
08:47
plants has the ID 6. Just for demonstration purposes here, yellow has the internal ID color_yellow and flowers has the ID flower_color. Then everything goes through a very complex
09:06
if-else rule set that forges some database and data objects, which are handed to a template engine. This template engine takes all the data and just
09:22
puts it in the right place, generating a SPARQL query that says: the taxon we are searching for has a flower color, and this flower color is specified as being color_yellow. The triple store takes the SPARQL query and returns, among others, the ID of Taxus baccata. Finally, the triple store
09:51
response is just thrown against the document database, and the document database will find it, not because it is looking for Taxus baccata, but because it has also indexed the ID of the annotation
10:07
and treats it like a word. It doesn't care whether it's a proper word or an ID. Then we get back our documents, which are presented on the portal.
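To sketch the template-engine step, here is a minimal example that fills a SPARQL template and runs it with rdflib over a toy in-memory graph; the namespace, property names, IDs, and the tiny inverted index at the end are invented for illustration and are not the portal's actual vocabulary.

```python
# Sketch: fill a SPARQL template from recognized query parts and run it
# against a toy in-memory graph. Requires: pip install rdflib
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")      # invented namespace for the sketch
g = Graph()
g.add((EX["taxon_528"], EX.isA, EX["concept_6"]))              # Taxus baccata is a plant
g.add((EX["taxon_528"], EX.hasFlowerColor, EX["color_yellow"]))

# Template with slots for the values that the NLP step and the index database produced.
SPARQL_TEMPLATE = """
SELECT ?taxon WHERE {{
    ?taxon <{is_a}> <{concept}> .
    ?taxon <{property}> <{value}> .
}}
"""
query = SPARQL_TEMPLATE.format(
    is_a=EX.isA, concept=EX["concept_6"],                   # "plants" -> ID 6
    property=EX.hasFlowerColor, value=EX["color_yellow"],   # "yellow flowers"
)

taxon_ids = [str(row.taxon) for row in g.query(query)]

# The document database then simply looks the returned IDs up like words.
inverted_index = {"http://example.org/taxon_528": ["document_42"]}
print([inverted_index.get(t, []) for t in taxon_ids])
```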
10:22
The HTML is already stored, so I mostly don't have to do much to get it presentation-ready. What are the lessons learned from all this data going back and forth? If you are building a semantic search on your own, make sure that all the NLP enrichment stages
10:47
of your user query are stored somewhere. Even if you don't use them now, you will probably use them later for an expansion. You should prefer templates over
11:00
rules. You can do natural language processing, that's totally fine. However, I would suggest using natural language processing in combination with templates, because rules are very, very hard to maintain. There is an example called Lango, a GitHub repo.
11:22
It is a very good example of how you can combine natural language processing and templates. When you are doing stuff like this, you should be ready to drown in data; you have to make sure that your application can handle large database responses.
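As a small illustration of the lesson above about storing every NLP enrichment stage of the user query, the intermediate results could be carried through the pipeline in one structure; this dataclass is only a sketch, not the actual data model.

```python
# Sketch: one object that accumulates every NLP enrichment stage of a user query,
# so later features (query expansion, debugging, relevance tuning) can reuse them.
from dataclasses import dataclass, field

@dataclass
class EnrichedQuery:
    raw: str                                                            # what the user typed
    entities: dict[str, str] = field(default_factory=dict)              # surface form -> type
    dependencies: list[tuple[str, str]] = field(default_factory=list)   # (head, dependent)
    resolved_ids: dict[str, str] = field(default_factory=dict)          # surface form -> ID

q = EnrichedQuery(raw="plants with yellow flowers")
q.entities["plants"] = "plant"
q.dependencies.append(("flowers", "yellow"))
q.resolved_ids["plants"] = "6"
print(q)
```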
11:45
And you should not underestimate the power of normalized data. I just learned that my document database knows a lot about my IDs. For example, Taxus baccata is very often associated with IDs or
12:05
URIs for terms that are within the Alps or reference the Alps themselves, or that reference southern France or some parts of eastern Europe. And guess what? That is exactly the
12:21
distribution of Taxus baccata. So the document database, without any machine learning, knows exactly which terms are related to which other terms. There are, of course, also alternative approaches. You can, for example, go with topic modeling, so you can
12:44
give a document some keywords that say: this is about plants, this is about a specific plant, or this is about a specific animal. However, I learned that these approaches, at least the ones mentioned,
13:02
do not scale that well, because they require large databases if you have a lot of documents. And some of these algorithms, specifically semantic analysis, are very hard to parameterize
13:20
for getting good results. And then, of course, the elephant in the room is machine learning. You can do this; however, you have to get training data. I didn't have training data, so I went with the approach that I showed you. Machine learning may be the better way at some point, specifically because it doesn't use as many resources, but I'm not too sure; I still have
13:51
to figure this out. One last thing I want to mention is that we are currently working on information extraction. Here, we want to make it
14:07
easier for the users to extract data from the portal. Specifically, we want to say: this taxon, Taxus baccata, was mentioned to occur at this place. We put this into
14:29
Darwin Core datasets and will provide it first in our portal and then in other biodiversity infrastructures. This has lately been discussed, at least
14:44
partly, under the term nanopublications; it goes in this direction, but it's not exactly the same. And that's it for my part. I just want to highlight our GitHub repo, where you can also find some of the tools that I showed you,
15:03
and specifically our UBLabs blog, where I go into some more detail about the semantic search.
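As a footnote to the information extraction plans mentioned above, here is a rough sketch of how such an extracted statement could be expressed with Darwin Core terms; the field names are standard Darwin Core, while the values are invented for illustration.

```python
# Sketch: an extracted "taxon X was mentioned at place Y" statement expressed
# with Darwin Core terms. Values are illustrative placeholders, not real output.
occurrence = {
    "dwc:scientificName": "Taxus baccata",
    "dwc:country": "Germany",
    "dwc:locality": "Bavaria",                  # placeholder place name
    "dwc:basisOfRecord": "MaterialCitation",    # record derived from literature
}
print(occurrence)
```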