A Description of the Work in Progress of the Transformation of the Handbuch der Keilschriftenliteratur

Boulanger, Christian

Wagner, Andreas

Max-Planck-Institut für Rechtsgeschichte und Rechtstheorie

Riedl, Johannes

Formal Metadata

Title

A Description of the Work in Progress of the Transformation of the Handbuch der Keilschriftenliteratur

Title of Series

New Approaches For Extracting Heterogenous Reference Data

Number of Parts

Author

Riedl, Johannes

Contributors

Riedl, Johannes

License

CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/68325 (DOI)

Publisher

Boulanger, Christian

Wagner, Andreas

Max-Planck-Institut für Rechtsgeschichte und Rechtstheorie

Release Date

2023

Language

English

Producer

Reinold, Fabian

Production Year

2023

Production Place

Frankfurt am Main

Content Metadata

Subject Area

Information Science

Genre

Conference/Talk

Abstract

The University Library of Tübingen has been faced with an increasing number of requests from different faculties to transform relevant scientific achievements from the past, such as typewritten and sometimes even partially handwritten bibliographies, registers and dictionaries, into a form suitable for import into a database-based retrieval system. In a considerable number of cases such works represent the life's work of a single person, and some have been an indispensable tool for generations of scientists. Despite their relevance, they still exist only in printed form of limited quality, since smaller subjects lack the resources to provide a structured digital edition. These works pose a considerable challenge for the transformation process: they are often highly idiosyncratic in structure (i.e. they follow individual rules for how the information is laid out), contain a plethora of abbreviations, and have high entropy (i.e. nearly every token is relevant), yet in many cases they were never written with the intent of providing the unambiguous syntactic structure needed for computer processing. Moreover, the text is normally too large to allow economical manual transformation, yet too small to supply the amount of data currently needed to train a dedicated model. This, together with the fact that the institution has only limited resources for these kinds of projects, raises the question of whether and how generally accessible methods can be used to achieve feasible results.