Abstract
Most attempts to train part-of-speech taggers on a mixture of labeled and unlabeled data have failed. In this work, stacked learning is used to reduce tagging to a classification task. This simplifies semi-supervised training considerably. Our preferred semi-supervised method combines tri-training (Li and Zhou, 2005) and disagreement-based co-training. On the Wall Street Journal, we obtain an error reduction of 4.2% with SVMTool (Gimenez and Marquez, 2004).
Original language | English |
---|---|
Title of host publication | Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics |
Publisher | Association for Computational Linguistics |
Publication date | 2010 |
ISBN (Electronic) | 978-1-932432-67-1 |
Publication status | Published - 2010 |