Using crowdsourcing to get representations based on regular expressions

Anders Søgaard, Hector Martinez Alonso, Jakob Elming, Anders Trærup Johannsen

2 Citations (Scopus)

Abstract

Often the bottleneck in document classification is finding good representations that zoom in on the most important aspects of the documents. Most research uses n-gram representations, but relevant features often occur discontinuously, e.g., not ⋯ good in sentiment analysis. In this paper we present experiments getting experts to provide regular expressions, as well as crowdsourced annotation tasks from which regular expressions can be derived. Somewhat surprisingly, it turns out that these crowdsourced feature combinations outperform automatic feature combination methods, as well as expert features, by a very large margin and reduce error by 24-41% over n-gram representations.
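
To illustrate the discontinuity problem mentioned in the abstract, the following is a minimal sketch, not the paper's method. It contrasts a bigram representation with a single regular-expression feature on one sentence. The pattern, the gap of up to three intervening words, the function names, and the example sentence are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper):
# "not" followed by "good" with at most three words in between.
NOT_GOOD = re.compile(r"\bnot\b(?:\W+\w+){0,3}?\W+good\b", re.IGNORECASE)


def bigrams(text):
    """Return the set of adjacent lowercase word pairs in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return set(zip(tokens, tokens[1:]))


def regex_feature(text):
    """Return 1 if the discontinuous 'not ... good' pattern occurs, else 0."""
    return int(NOT_GOOD.search(text) is not None)


review = "The plot was not all that good."

# The bigram view never pairs "not" with "good" in this sentence,
# while the regular expression fires on the discontinuous span.
has_bigram = ("not", "good") in bigrams(review)
has_regex = regex_feature(review)
```

In this sketch, the bigram set contains ("not", "all") and ("that", "good") but no pair linking the negation to the adjective, whereas the regular-expression feature captures the relation in a single binary value that could be appended to an n-gram feature vector.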

Original language: English
Title: EMNLP 2013
Publisher: Association for Computational Linguistics
Publication date: 2013
Pages: 1476-1480
ISBN (Electronic): 978-1-937284-97-8
Status: Published - 2013
