TY - GEN
T1 - Active Learning for Sense Annotation
AU - Martinez Alonso, Hector
AU - Plank, Barbara
AU - Johannsen, Anders Trærup
AU - Søgaard, Anders
PY - 2015
Y1 - 2015
AB - This article describes a real (nonsynthetic) active-learning experiment to obtain supersense annotations for Danish. We compare two instance selection strategies: lowest prediction confidence (MAX) and sampling from the confidence distribution (SAMPLE). We evaluate their performance during the annotation process, across domains for the final system, and against in-domain adjudicated data. The SAMPLE strategy yields competitive models that are more robust than those obtained with the overly length-biased selection criterion of MAX.
M3 - Article in proceedings
T3 - NEALT Proceedings Series
SP - 245
EP - 250
BT - Proceedings of the 20th Nordic Conference of Computational Linguistics
PB - Linköping University Electronic Press
CY - Linköping
ER -