An empirical study of differences between conversion schemes and annotation guidelines

Abstract

We establish quantitative methods for comparing and estimating the quality of dependency annotations or conversion schemes. We use generalized tree-edit distance to measure divergence between annotations and propose theoretical learnability, derivational perplexity and downstream performance for evaluation. We present systematic experiments with tree-to-dependency conversions of the Penn-III treebank, as well as observations from experiments using treebanks from multiple languages. Our most important observations are: (a) parser bias makes most parsers insensitive to non-local differences between annotations, but (b) choice of annotation nevertheless has significant impact on most downstream applications, and (c) while learnability does not correlate with downstream performance, learnable annotations will lead to more robust performance across domains.
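The abstract's divergence measure is generalized tree-edit distance between dependency annotations. As a loose illustration of comparing two conversion schemes, the sketch below computes a much simpler proxy: the fraction of tokens whose attachment differs between two annotations of the same sentence (the function name and the example head arrays are hypothetical, not taken from the paper):

```python
def annotation_divergence(heads_a, heads_b):
    """Fraction of tokens whose head attachment differs between two
    dependency annotations of the same sentence.

    heads_*: list of 1-indexed head positions, one per token (0 = root).
    Note: this is a per-token disagreement rate, not the generalized
    tree-edit distance used in the paper.
    """
    assert len(heads_a) == len(heads_b), "annotations must cover the same tokens"
    mismatches = sum(1 for ha, hb in zip(heads_a, heads_b) if ha != hb)
    return mismatches / len(heads_a)


# Hypothetical 5-token sentence annotated under two conversion schemes
# that disagree only on the attachment of token 4:
heads_a = [2, 3, 0, 5, 3]  # scheme A: token 4 attaches to token 5
heads_b = [2, 3, 0, 3, 3]  # scheme B: token 4 attaches to token 3
print(annotation_divergence(heads_a, heads_b))  # 1 of 5 tokens differ -> 0.2
```

A labeled variant would additionally compare dependency labels at tokens whose heads agree, analogous to the UAS/LAS distinction in parser evaluation.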

Original language: English
Title of host publication: DepLing 2013: Proceedings of the Second International Conference on Dependency Linguistics 2013
Number of pages: 9
Place of publication: Prague
Publisher: Association for Computational Linguistics
Publication date: 2013
Pages: 298-306
ISBN (electronic): 978-80-7378-240-5
Publication status: Published - 2013
Event: International Conference on Dependency Linguistics (DepLing), Prague, Czech Republic
Duration: 27 Aug 2013 - 30 Aug 2013
Conference number: 2
