Abstract
We present a suite of 12 datasets for evaluating part-of-speech (POS) taggers across varieties of English, so that researchers can assess the robustness of their models. The suite includes three new datasets, sampled from lyrics by black American hip-hop artists, southeastern American Twitter, and the subtitles of the TV series The Wire. We present an example evaluation of an off-the-shelf POS tagger across these datasets.
Original language | English |
---|---|
Title of host publication | Proceedings of the 25th International Conference Companion on World Wide Web |
Number of pages | 4 |
Publisher | International World Wide Web Conferences Steering Committee |
Publication date | 11 Apr 2016 |
Pages | 615-618 |
ISBN (Print) | 978-1-4503-4144-8 |
Publication status | Published - 11 Apr 2016 |
Event | 25th International World Wide Web Conference, Montreal, Canada; duration: 11 Apr 2016 → 15 Apr 2016; conference number: 25 |
Conference
Conference | 25th International World Wide Web Conference |
---|---|
Number | 25 |
Country/Territory | Canada |
City | Montreal |
Period | 11/04/2016 → 15/04/2016 |