Supersense tagging for Danish

Hector Martinez Alonso, Anders Trærup Johannsen, Sussi Olsen, Sanni Nimb, Nicolai Sørensen, Anna Braasch, Anders Søgaard, Bolette Sandford Pedersen

Abstract

We describe the creation of a new Danish resource for automated coarse-grained word sense disambiguation of running text (supersense tagging, SST). Based on corpus evidence, we expand the sense inventory to incorporate new lexical classes. We add tags for verbal satellites such as collocates, particles and reflexive pronouns to account for the satellite-framing properties of Danish. Finally, we evaluate the quality of our expanded sense inventory in terms of variation in F1 on a state-of-the-art SST system. The SST system uses type constraints and achieves performance just under the upper bound of inter-annotator agreement. The initial release is a 1,500-sentence corpus covering six genres, made available under an open-source license.

Original language: English
Title: Proceedings of the 20th Nordic Conference of Computational Linguistics NODALIDA 2015
Number of pages: 8
Volume: 109
Publisher: Linköping University Electronic Press
Publication date: 2015
ISBN (Print): 978-91-7519-098-3
Status: Published - 2015
Event: NODALIDA 2015: Nordic Conference on Computational Linguistics - Vilnius, Lithuania
Duration: 11 May 2015 to 13 May 2015
Conference number: 20

Conference

Conference: NODALIDA 2015
Number: 20
Country/Territory: Lithuania
City: Vilnius
Period: 11/05/2015 to 13/05/2015
Name: NEALT (Northern European Association of Language Technology) Proceedings Series
Volume: 23
ISSN: 1736-6305
