Robust semi-supervised and ensemble-based methods in word sense disambiguation

Anders Østerskov Søgaard, Anders Trærup Johannsen

1 Citation (Scopus)

Abstract

Mihalcea [1] discusses self-training and co-training in the context of word sense disambiguation and shows that optimizing parameters for individual words is important for obtaining good results. Using smoothed co-training of a naive Bayes classifier, she obtains a 9.8% error reduction on Senseval-2 data with a fixed parameter setting. In this paper we test a semi-supervised learning algorithm with no parameters, namely tri-training [2]. We also test the random subspace method [3] for building committees out of stable learners. Both techniques lead to significant error reductions with different learning algorithms, but the improvements do not accumulate. Our best error reduction is 7.4%, and our best absolute average over Senseval-2 data, though not directly comparable, is 12% higher than the results reported in Mihalcea [1].
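The error-reduction figures above can be made concrete with a small calculation. The sketch below assumes that "error reduction" means the relative reduction in error rate, as is conventional in this literature; the abstract does not define the term. The function names and the 60% baseline accuracy are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    baseline_error = 1.0 - baseline_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error left to reduce")
    new_error = 1.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


def accuracy_after_reduction(baseline_accuracy: float, reduction: float) -> float:
    """Return the accuracy reached when the baseline error shrinks by `reduction`."""
    baseline_error = 1.0 - baseline_accuracy
    return 1.0 - baseline_error * (1.0 - reduction)


# Hypothetical baseline accuracy (an assumption, not a figure from the paper).
hypothetical_baseline = 0.60

# A 7.4% relative error reduction applied to that hypothetical baseline.
improved = accuracy_after_reduction(hypothetical_baseline, 0.074)
print(f"accuracy after reduction: {improved:.4f}")
print(f"recovered reduction: {relative_error_reduction(hypothetical_baseline, improved):.3f}")
```

Under this reading, a relative error reduction and an absolute accuracy difference are different quantities, which is one reason the abstract reports them separately.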

Original language: English
Title: Proceedings of the 7th International Conference on Advances in Natural Language Processing
Publisher: Springer
Publication date: 2010
ISBN (Print): 3-642-14769-0, 978-3-642-14769-2
Status: Published - 2010
