Improving Named Entity Recognition in the Biodiversity Heritage Library with Machine Learning

Formal Metadata

Title
Improving Named Entity Recognition in the Biodiversity Heritage Library with Machine Learning
Series Title
Number of Parts
15
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You may use and modify the work or content for any legal and non-commercial purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers
Publisher
Release Year
Language

Content Metadata

Subject Area
Genre
Abstract
Scientific names are important access points to biodiversity literature and significant indicators of content coverage. The Biodiversity Heritage Library (BHL) mines its content with the open-source Global Names Recognition and Discovery (GNRD) tool, part of the Global Names Architecture (GNA) suite of machine learning and named entity recognition algorithms, to extract scientific names that are then indexed and attached to page records. The 2017 BHL National Digital Stewardship Residents (NDSR) are working collaboratively on a group of projects designed to deliver a set of best-practice recommendations for the next version of the BHL digital library portal. NDSR Residents Katie Mika and Alicia Esquivel will discuss (i) BHL and the significance of taxon names; (ii) the current workflow, proposed improvements, and example workflows for linking content across scientific names, including semantic linking to biodiversity aggregators such as the Encyclopedia of Life and the Global Biodiversity Information Facility; (iii) how to use scientific names for content analysis; and (iv) optimizing manuscript transcription of archival content, which exposes GNA tools to problems such as outdated and common names, misspellings, and antiquated taxonomies. The authors invite questions, comments, and discussion from audience members as the Residents prepare to submit their final recommendations at the end of the year.