Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations.

Gavin Abercrombie; Dirk Hovy

Department of Nordic Studies and Linguistics

15 Citations (Scopus)

Abstract

Sarcasm can radically alter or invert a phrase's meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users' self-declarations in the form of hashtags to label data, when sarcasm can take many forms. To address these issues, we create an unbalanced corpus of manually annotated Twitter conversations. We compare human and machine ability to recognize sarcasm on this data under varying amounts of context. Our results indicate that both class imbalance and labelling method affect performance, and should both be considered when designing automatic sarcasm detection systems. We conclude that for progress to be made in real-world sarcasm detection, we will require a new class labelling scheme that is able to access the 'common ground' held between conversational parties.

Original language	English
Title of host publication	Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop
Number of pages	7
Place of Publication	Stroudsburg, PA
Publisher	Association for Computational Linguistics
Publication date	2016
Pages	107-113
ISBN (Print)	978-1-945626-02-9
Publication status	Published - 2016
Event	54th Annual Meeting of the Association for Computational Linguistics - Berlin, Germany; Duration: 7 Aug 2016 → 12 Aug 2016; Conference number: 54

Conference

Conference	54th Annual Meeting of the Association for Computational Linguistics
Number	54
Country/Territory	Germany
City	Berlin
Period	07/08/2016 → 12/08/2016

Access to Document

https://www.aclweb.org/anthology/P/P16/P16-3016.pdf

Cite this

Abercrombie, G., & Hovy, D. (2016). Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop (pp. 107-113). Association for Computational Linguistics. https://www.aclweb.org/anthology/P/P16/P16-3016.pdf

Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. / Abercrombie, Gavin; Hovy, Dirk.

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop. Stroudsburg, PA: Association for Computational Linguistics, 2016. p. 107-113.

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abercrombie, G & Hovy, D 2016, Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop. Association for Computational Linguistics, Stroudsburg, PA, pp. 107-113, 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 07/08/2016. <https://www.aclweb.org/anthology/P/P16/P16-3016.pdf>

Abercrombie G, Hovy D. Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop. Stroudsburg, PA: Association for Computational Linguistics. 2016. p. 107-113

Abercrombie, Gavin; Hovy, Dirk. / Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop. Stroudsburg, PA: Association for Computational Linguistics, 2016. pp. 107-113

@inproceedings{a54358c52eb545529117c1f5bfca189d,

title = "Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations",

abstract = "Sarcasm can radically alter or invert a phrase's meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users' self-declarations in the form of hashtags to label data, when sarcasm can take many forms. To address these issues, we create an unbalanced corpus of manually annotated Twitter conversations. We compare human and machine ability to recognize sarcasm on this data under varying amounts of context. Our results indicate that both class imbalance and labelling method affect performance, and should both be considered when designing automatic sarcasm detection systems. We conclude that for progress to be made in real-world sarcasm detection, we will require a new class labelling scheme that is able to access the 'common ground' held between conversational parties.",

author = "Gavin Abercrombie and Dirk Hovy",

year = "2016",

language = "English",

isbn = "978-1-945626-02-9",

pages = "107--113",

booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop",

publisher = "Association for Computational Linguistics",

note = "54th Annual Meeting of the Association for Computational Linguistics ; Conference date: 07-08-2016 Through 12-08-2016",

}

TY - GEN

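T1 - Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations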

T2 - 54th Annual Meeting of the Association for Computational Linguistics

AU - Abercrombie, Gavin

AU - Hovy, Dirk

N1 - Conference code: 54

PY - 2016

Y1 - 2016

AB - Sarcasm can radically alter or invert a phrase's meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users' self-declarations in the form of hashtags to label data, when sarcasm can take many forms. To address these issues, we create an unbalanced corpus of manually annotated Twitter conversations. We compare human and machine ability to recognize sarcasm on this data under varying amounts of context. Our results indicate that both class imbalance and labelling method affect performance, and should both be considered when designing automatic sarcasm detection systems. We conclude that for progress to be made in real-world sarcasm detection, we will require a new class labelling scheme that is able to access the 'common ground' held between conversational parties.

M3 - Article in proceedings

SN - 978-1-945626-02-9

SP - 107

EP - 113

BT - Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop

PB - Association for Computational Linguistics

CY - Stroudsburg, PA

Y2 - 7 August 2016 through 12 August 2016

ER -
