Supersense tagging for Danish

Hector Martinez Alonso, Anders Trærup Johannsen, Sussi Olsen, Sanni Nimb, Nicolai Sørensen, Anna Braasch, Anders Søgaard, Bolette Sandford Pedersen

Abstract

We describe the creation of a new Danish resource for automated coarse-grained word sense disambiguation of running text (supersense tagging, SST). Based on corpus evidence, we expand the sense inventory to incorporate new lexical classes. We add tags for verbal satellites such as collocates, particles and reflexive pronouns, to account for the satellite-framing properties of Danish. Finally, we evaluate the quality of our expanded sense inventory in terms of variation in F1 on a state-of-the-art SST system. The SST system uses type constraints and achieves performance just under the upper bound of interannotator agreement. The initial release is a 1,500-sentence corpus covering six genres, made available under an open-source license.

Original language: English
Title of host publication: Proceedings of the 20th Nordic Conference of Computational Linguistics NODALIDA 2015
Number of pages: 8
Volume: 109
Publisher: Linköping University Electronic Press
Publication date: 2015
ISBN (Print): 978-91-7519-098-3
Publication status: Published - 2015
Event: NODALIDA 2015: Nordic Conference on Computational Linguistics - Vilnius, Lithuania
Duration: 11 May 2015 to 13 May 2015
Conference number: 20

Conference

Conference: NODALIDA 2015
Number: 20
Country/Territory: Lithuania
City: Vilnius
Period: 11/05/2015 to 13/05/2015
Series: NEALT (Northern European Association of Language Technology) Proceedings Series
Volume: 23
ISSN: 1736-6305
