Abstract
We establish quantitative methods for comparing and estimating the quality of dependency annotations or conversion schemes. We use generalized tree-edit distance to measure divergence between annotations, and we propose theoretical learnability, derivational perplexity, and downstream performance as evaluation criteria. We present systematic experiments with tree-to-dependency conversions of the Penn Treebank III, as well as observations from experiments using treebanks in multiple languages. Our most important observations are: (a) parser bias makes most parsers insensitive to non-local differences between annotations, but (b) the choice of annotation nevertheless has a significant impact on most downstream applications, and (c) while learnability does not correlate with downstream performance, learnable annotations lead to more robust performance across domains.
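As a minimal illustration of comparing two dependency annotations of the same sentence, the sketch below computes a simple token-level divergence: the fraction of tokens whose head or relation label differs between the two annotations. This is a simpler proxy measure, not the generalized tree-edit distance used in the paper; the function name and the sample annotations are illustrative assumptions.

```python
def annotation_divergence(heads_a, labels_a, heads_b, labels_b):
    """Fraction of tokens whose head index or relation label differs
    between two dependency annotations of the same sentence.

    Note: this is a simple proxy for annotation divergence, not the
    generalized tree-edit distance of the paper.
    """
    n = len(heads_a)
    assert len(labels_a) == len(heads_b) == len(labels_b) == n
    differing = sum(
        1 for i in range(n)
        if heads_a[i] != heads_b[i] or labels_a[i] != labels_b[i]
    )
    return differing / n

# "John saw Mary" under two hypothetical conversion schemes that agree on
# the tree structure but label the object relation differently:
div = annotation_divergence(
    [2, 0, 2], ["nsubj", "root", "dobj"],   # scheme A
    [2, 0, 2], ["nsubj", "root", "obj"],    # scheme B
)
print(div)  # one of three tokens differs
```

Two conversion schemes can thus produce identical unlabeled trees yet still diverge in labeled comparison, which is one reason label-sensitive and structure-sensitive measures can disagree.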
Original language | English |
---|---|
Title of host publication | DepLing 2013: Proceedings of the Second International Conference on Dependency Linguistics 2013 |
Number of pages | 9 |
Place of Publication | Prague |
Publisher | Association for Computational Linguistics |
Publication date | 2013 |
Pages | 298-306 |
ISBN (Electronic) | 978-80-7378-240-5 |
Publication status | Published - 2013 |
Event | International Conference on Dependency Linguistics: DepLing - Prague, Czech Republic Duration: 27 Aug 2013 → 30 Aug 2013 Conference number: 2 |
Conference
Conference | International Conference on Dependency Linguistics |
---|---|
Number | 2 |
Country/Territory | Czech Republic |
City | Prague |
Period | 27/08/2013 → 30/08/2013 |