Abstract
Most attempts to train part-of-speech taggers on a mixture of labeled and unlabeled data have failed. In this work, stacked learning is used to reduce tagging to a classification task, which simplifies semi-supervised training considerably. Our preferred semi-supervised method combines tri-training (Li and Zhou, 2005) and disagreement-based co-training. On Wall Street Journal data, we obtain an error reduction of 4.2% with SVMTool (Gimenez and Marquez, 2004).
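
The combination of tri-training and disagreement-based co-training mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the base learner `MostFrequentTagClassifier`, the functions `tri_train_with_disagreement` and `vote`, and the `rounds` and `seed` parameters are all hypothetical names chosen for illustration, and a toy word-to-tag lookup stands in for the stacked classifier. The idea shown is that three learners start from bootstrap samples of the labeled data, and an unlabeled item is handed to a learner only when the other two agree on its label and that learner disagrees.

```python
import random
from collections import Counter, defaultdict


class MostFrequentTagClassifier:
    """Toy base learner (an assumption): predicts the tag seen most often with a word."""

    def fit(self, examples):
        counts = defaultdict(Counter)
        all_tags = Counter()
        for word, tag in examples:
            counts[word][tag] += 1
            all_tags[tag] += 1
        self.by_word = {w: c.most_common(1)[0][0] for w, c in counts.items()}
        # Unknown words fall back to the most frequent tag overall.
        self.default = all_tags.most_common(1)[0][0]
        return self

    def predict(self, word):
        return self.by_word.get(word, self.default)


def tri_train_with_disagreement(labeled, unlabeled, rounds=3, seed=0):
    """Sketch of tri-training where a learner only receives points it got 'wrong'."""
    rng = random.Random(seed)
    # Each learner starts from its own bootstrap sample of the labeled data.
    train_sets = [[rng.choice(labeled) for _ in labeled] for _ in range(3)]
    models = [MostFrequentTagClassifier().fit(s) for s in train_sets]
    for _ in range(rounds):
        new_points = [[] for _ in range(3)]
        for x in unlabeled:
            preds = [m.predict(x) for m in models]
            for i in range(3):
                j, k = [n for n in range(3) if n != i]
                # Disagreement condition: the two others agree, learner i does not.
                if preds[j] == preds[k] != preds[i]:
                    new_points[i].append((x, preds[j]))
        if not any(new_points):
            break  # no learner received new data, so nothing would change
        # Retrain each learner on its bootstrap sample plus this round's new points.
        models = [
            MostFrequentTagClassifier().fit(train_sets[i] + new_points[i])
            for i in range(3)
        ]
    return models


def vote(models, word):
    """Final prediction by majority vote over the three learners."""
    return Counter(m.predict(word) for m in models).most_common(1)[0][0]
```

A call such as `vote(tri_train_with_disagreement(labeled, unlabeled), "dog")` would return the majority tag for `"dog"`, where `labeled` is a list of `(word, tag)` pairs and `unlabeled` is a list of words. Restricting the added data to points of disagreement keeps each learner's training set small and focused on its own errors, which is the property the disagreement-based variant is meant to exploit.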
Original language | English |
---|---|
Title | Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics |
Publisher | Association for Computational Linguistics |
Publication date | 2010 |
ISBN (Electronic) | 978-1-932432-67-1 |
Status | Published - 2010 |