A test suite for evaluating POS taggers across varieties of English

Anna Jørgensen, Anders Søgaard

Abstract

We present a suite of 12 datasets for evaluating POS taggers across varieties of English to enable researchers to evaluate the robustness of their models. The suite includes three new datasets, sampled from lyrics from black American hip-hop artists, southeastern American Twitter, and the subtitles from the TV series The Wire. We present an example evaluation of an off-the-shelf POS tagger across these datasets.

Original language: English
Title: Proceedings of the 25th International Conference Companion on World Wide Web
Number of pages: 4
Publisher: International World Wide Web Conferences Steering Committee
Publication date: 11 Apr 2016
Pages: 615-618
ISBN (Print): 978-1-4503-4144-8
Status: Published - 11 Apr 2016
Event: 25th International World Wide Web Conference - Montreal, Canada
Duration: 11 Apr 2016 to 15 Apr 2016
Conference number: 25

Conference

Conference: 25th International World Wide Web Conference
Number: 25
Country/Territory: Canada
City: Montreal
Period: 11 Apr 2016 to 15 Apr 2016
