Joint part-of-speech and dependency projection from multiple sources

Anders Trærup Johannsen, Zeljko Agic, Anders Søgaard

3 citations (Scopus)

Abstract

Most previous work on annotation projection has been limited to a subset of Indo-European languages, using only a single source language, and projecting annotation for one task at a time. In contrast, we present an Integer Linear Programming (ILP) algorithm that simultaneously projects annotation for multiple tasks from multiple source languages, relying on parallel corpora available for hundreds of languages. When training POS taggers and dependency parsers on jointly projected POS tags and syntactic dependencies using our algorithm, we obtain better performance than a standard approach on 20/23 languages using one parallel corpus; and 18/27 languages using another.

Original language: English
Title: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
Volume: 2 (Short papers)
Publisher: Association for Computational Linguistics
Publication date: 2016
ISBN (Print): 978-1-945626-01-2
Status: Published - 2016
Event: 54th Annual Meeting of the Association for Computational Linguistics - Berlin, Germany
Duration: 7 Aug 2016 to 12 Aug 2016
Conference number: 54

Conference

Conference: 54th Annual Meeting of the Association for Computational Linguistics
Number: 54
Country/Territory: Germany
City: Berlin
Period: 07/08/2016 to 12/08/2016
