Active Learning for Sense Annotation

Hector Martinez Alonso, Barbara Plank, Anders Trærup Johannsen, Anders Søgaard

Abstract

This article describes a real (non-synthetic) active-learning experiment to obtain supersense annotations for Danish. We compare two instance selection strategies: choosing the instances with the lowest prediction confidence (MAX), and sampling instances from the confidence distribution (SAMPLE). We evaluate their performance during the annotation process, the cross-domain performance of the resulting systems, and their accuracy against in-domain adjudicated data. The SAMPLE strategy yields competitive models that are more robust than the overly length-biased selection criterion of MAX.
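The two selection strategies contrasted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pool, confidence scores, and function names are assumptions, and SAMPLE is rendered here as a weighted draw proportional to model uncertainty (1 minus confidence).

```python
import random

def select_max(pool, confidences, k):
    """MAX strategy (sketch): pick the k instances the model is
    least confident about, i.e. lowest prediction confidence first."""
    ranked = sorted(range(len(pool)), key=lambda i: confidences[i])
    return [pool[i] for i in ranked[:k]]

def select_sample(pool, confidences, k, rng=None):
    """SAMPLE strategy (sketch): draw k instances without replacement,
    with probability proportional to uncertainty (1 - confidence),
    rather than always taking the least-confident ones."""
    rng = rng or random.Random()
    weights = [1.0 - c for c in confidences]
    indices = list(range(len(pool)))
    chosen = []
    for _ in range(k):
        # Weighted draw without replacement over the remaining indices.
        total = sum(weights[i] for i in indices)
        r = rng.random() * total
        acc = 0.0
        for j, i in enumerate(indices):
            acc += weights[i]
            if acc >= r:
                chosen.append(pool[i])
                indices.pop(j)
                break
    return chosen
```

Because SAMPLE only biases selection toward uncertain instances instead of deterministically taking the extremes, it avoids the failure mode the paper reports for MAX, where the lowest-confidence instances are systematically the longest sentences.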

Original language: English
Title of host publication: Proceedings of the 20th Nordic Conference of Computational Linguistics: NODALIDA 2015
Number of pages: 5
Place of publication: Linköping
Publisher: Linköping University Electronic Press
Publication date: 2015
Pages: 245-250
ISBN (electronic): 978-91-7519-098-3
Publication status: Published - 2015
Series: NEALT Proceedings Series
Volume: 23
ISSN: 1736-6305
