UNER: Universal Named-Entity Recognition Framework

CC Attribution - NonCommercial - NoDerivatives 3.0 Germany:
You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/50454 (DOI)

Publisher

Leibniz Universität Hannover (LUH)

Release Date

Language

Production Year

2020

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

We introduce the Universal Named-Entity Recognition (UNER) framework, a 4-level classification hierarchy, and the methodology being adopted to create the first multilingual UNER corpus: the SETimes parallel corpus annotated for named entities. First, the English SETimes corpus will be annotated using existing tools and knowledge bases. Once the resulting annotations have been evaluated through crowdsourcing campaigns, they will be propagated automatically to the other languages in the SETimes corpora. Finally, as an extrinsic evaluation, the UNER multilingual dataset will be used to train and test available NER tools. As future research directions, we aim to increase the number of languages in the UNER corpus and to investigate ways of integrating UNER with available knowledge graphs to improve named-entity recognition.
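
The propagation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how annotations are transferred, so this assumes the common approach of projecting entity spans through token-level word alignments of a parallel sentence pair. The function name `project_spans`, the span representation, and the labels `LOC` and `PER` are hypothetical and are not taken from the UNER hierarchy.

```python
from typing import Dict, List, Tuple

# A span is (start token index, end token index exclusive, entity label).
Span = Tuple[int, int, str]
# An alignment link is (source token index, target token index).
Link = Tuple[int, int]


def project_spans(source_spans: List[Span], alignment: List[Link]) -> List[Span]:
    """Project entity spans from a source sentence onto its aligned target sentence.

    Assumption: the alignment links come from an external word aligner;
    this sketch does not compute them.
    """
    # Index the alignment by source token for quick lookup.
    src_to_tgt: Dict[int, List[int]] = {}
    for src_idx, tgt_idx in alignment:
        src_to_tgt.setdefault(src_idx, []).append(tgt_idx)

    projected: List[Span] = []
    for start, end, label in source_spans:
        # Collect every target token linked to any token inside the source span.
        targets = [t for s in range(start, end) for t in src_to_tgt.get(s, [])]
        if not targets:
            # No aligned tokens: the entity cannot be projected, so it is skipped.
            continue
        # Use the smallest contiguous target range covering all linked tokens.
        projected.append((min(targets), max(targets) + 1, label))
    return projected


# Toy example with made-up alignment links (token 0 <-> 1, 1 <-> 0, 3 <-> 2).
english_spans = [(0, 1, "LOC"), (3, 4, "LOC")]
links = [(0, 1), (1, 0), (3, 2)]
print(project_spans(english_spans, links))
```

Taking the smallest covering range keeps each projected entity contiguous, at the cost of occasionally including unaligned target tokens; a real pipeline would likely filter or validate such cases.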

Keywords

named entity recognition

universal ner