Abstract
Unsupervised dependency parsing typically tries to optimize the probability of a corpus by modifying the dependency model that presumably generated the corpus. In this article we explore a different view, in which a dependency structure is, among other things, a partial order on the nodes in terms of centrality or saliency. Under this assumption we model the partial order directly and derive dependency trees from it. The result is an approach to unsupervised dependency parsing that differs markedly from standard ones in that it requires no training data: each sentence induces a model from which its parse is read off. Our approach is evaluated on data from 12 different languages. Two scenarios are considered: one in which part-of-speech information is available, and one in which parsing relies only on word forms and distributional clusters. Our approach is competitive with the state of the art in both scenarios.
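As a rough illustration of the training-free, per-sentence idea described above, the sketch below ranks the words of a single sentence by a graph centrality score and then reads a dependency tree off that ranking. The graph construction (window-based links) and the attachment rule (each word attaches to the nearest higher-ranked word) are illustrative assumptions for this sketch, not necessarily the exact method of the article.

```python
# Minimal sketch: derive a dependency tree from a centrality-based word ranking.
# Assumptions (not from the article): window-based word graph, PageRank as the
# centrality measure, and attachment to the nearest higher-ranked word.
import networkx as nx


def parse_sentence(words, window=2):
    """Return (head_index, dependent_index) arcs; head -1 marks the root."""
    n = len(words)

    # 1. Build a directed graph linking each word to its neighbors in a window.
    g = nx.DiGraph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if i != j:
                g.add_edge(i, j)

    # 2. Rank nodes by PageRank; the ranking plays the role of the partial
    #    order on nodes (higher score = more central / more salient).
    scores = nx.pagerank(g)
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {i: r for r, i in enumerate(order)}

    # 3. Read off a tree: the top-ranked word is the root; every other word
    #    attaches to the closest word that outranks it.
    arcs = []
    for i in range(n):
        higher = [j for j in range(n) if rank[j] < rank[i]]
        if not higher:
            arcs.append((-1, i))  # root
        else:
            head = min(higher, key=lambda j: abs(j - i))
            arcs.append((head, i))
    return arcs


if __name__ == "__main__":
    print(parse_sentence("the dog chased the cat".split()))
```

Because every non-root word attaches to a strictly higher-ranked word, the resulting arc set is acyclic and forms a single tree, and no corpus-level training is involved: the model is induced anew for each sentence.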
Original language | English |
---|---|
Title | TextGraphs-6: Graph-based Methods for Natural Language Processing, the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT) |
Publisher | Association for Computational Linguistics |
Publication date | 2011 |
Status | Published - 2011 |