Importance weighting and unsupervised domain adaptation of POS taggers: a negative result

Barbara Plank, Anders Trærup Johannsen, Anders Søgaard

5 Citations (Scopus)

Abstract

Importance weighting is a generalization of various statistical bias correction techniques. While our labeled data in NLP is heavily biased, importance weighting has seen only a few applications in NLP, most of them relying on a small amount of labeled target data. The publication bias toward reporting positive results makes it hard to say whether researchers have tried. This paper presents a negative result on unsupervised domain adaptation for POS tagging. In this setup, we only have unlabeled data and thus only indirect access to the bias in emission and transition probabilities. Moreover, most errors in POS tagging are due to unseen words, and there, importance weighting cannot help. We present experiments with a wide variety of weight functions, quantilizations, as well as with randomly generated weights, to support these claims.
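
As an illustration of the general idea only, the following minimal sketch weights each labeled source sentence by an estimate of how much more likely it is under the unlabeled target data than under the source data, then quantilizes the weights into bins. It is not the procedure used in the paper: the smoothed unigram density ratio, the function names (`importance_weights`, `quantilize`), and the toy sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def unigram_logprob(counts, total, vocab_size, token, alpha=1.0):
    """Add-alpha smoothed log probability of a token under a unigram model."""
    return math.log((counts[token] + alpha) / (total + alpha * vocab_size))


def importance_weights(source_sents, target_sents, alpha=1.0):
    """Weight each source sentence by an estimate of p_target(x) / p_source(x).

    Hypothetical weight function: the per-token log ratio is averaged over
    the sentence, so long sentences do not receive extreme weights.
    """
    src_counts = Counter(tok for sent in source_sents for tok in sent)
    tgt_counts = Counter(tok for sent in target_sents for tok in sent)
    vocab_size = len(set(src_counts) | set(tgt_counts))
    src_total = sum(src_counts.values())
    tgt_total = sum(tgt_counts.values())

    weights = []
    for sent in source_sents:
        log_ratio = sum(
            unigram_logprob(tgt_counts, tgt_total, vocab_size, tok, alpha)
            - unigram_logprob(src_counts, src_total, vocab_size, tok, alpha)
            for tok in sent
        ) / max(len(sent), 1)
        weights.append(math.exp(log_ratio))
    return weights


def quantilize(weights, n_bins=3):
    """Replace each weight by its 1-based quantile bin index."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i])
    bins = [0] * len(weights)
    for rank, i in enumerate(ranked):
        bins[i] = 1 + (rank * n_bins) // len(weights)
    return bins


# Toy data (made up): labeled source sentences and unlabeled target sentences.
source = [["the", "court", "ruled", "today"], ["stocks", "fell", "sharply"]]
target = [["lol", "the", "game", "today"], ["the", "game", "was", "great"]]

w = importance_weights(source, target)
print(w, quantilize(w, n_bins=2))
```

In this toy run the first source sentence shares words with the target data and therefore receives the larger weight. Target-only words such as "game" never occur in any source sentence, so no reweighting of source examples can supply training evidence for them, which mirrors the abstract's point about unseen words.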

Original language: English
Title: The 2014 Conference on Empirical Methods in Natural Language Processing: EMNLP 2014
Publisher: Association for Computational Linguistics
Publication date: 2014
Pages: 968-973
Status: Published - 2014
