Abstract
Most previous work on annotation projection has been limited to a subset of Indo-European languages, has used only a single source language, and has projected annotation for only one task at a time. In contrast, we present an Integer Linear Programming (ILP) algorithm that simultaneously projects annotation for multiple tasks from multiple source languages, relying on parallel corpora available for hundreds of languages. When training POS taggers and dependency parsers on POS tags and syntactic dependencies jointly projected by our algorithm, we obtain better performance than a standard approach on 20 of 23 languages using one parallel corpus and on 18 of 27 languages using another.
Original language | English |
---|---|
Title | Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics |
Volume | 2 (Short papers) |
Publisher | Association for Computational Linguistics |
Publication date | 2016 |
ISBN (Print) | 978-1-945626-01-2 |
Status | Published - 2016 |
Event | 54th Annual Meeting of the Association for Computational Linguistics - Berlin, Germany. Duration: 7 Aug 2016 → 12 Aug 2016. Conference number: 54 |
Conference
Conference | 54th Annual Meeting of the Association for Computational Linguistics |
---|---|
Number | 54 |
Country/Territory | Germany |
City | Berlin |
Period | 7 Aug 2016 → 12 Aug 2016 |