Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus

Giulia Donato; Patrizia Paggio

Institut for Nordiske Studier og Sprogvidenskab

Abstract

In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji - an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets all containing at least one emoji, which has been annotated using one of the three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected, describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus, considering also possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements.

Original language	English
Title	Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2017
Number of pages	9
Place of publication	Stroudsburg, PA
Publisher	Association for Computational Linguistics
Publication date	2017
Pages	118-126
ISBN (Print)	978-1-945626-95-1
Status	Published - 2017
Event	8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis - Copenhagen, Denmark. Duration: 8 Sep 2017 → 8 Sep 2017 http://WASSA 2017

Conference

Conference	8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis
Country/Territory	Denmark
City	Copenhagen
Period	08/09/2017 → 08/09/2017
Internet address	http://WASSA 2017

Other files and links

http://www.aclweb.org/anthology/W17-5200

Citation formats

Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus. / Donato, Giulia; Paggio, Patrizia.

Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2017. Stroudsburg, PA: Association for Computational Linguistics, 2017. pp. 118-126.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer review

Donato, G & Paggio, P 2017, Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus. in Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2017. Association for Computational Linguistics, Stroudsburg, PA, pp. 118-126, 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, Copenhagen, Denmark, 08/09/2017.

@inproceedings{9e6560cb7d504c0f817990f1af273447,

title = "Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus",

abstract = "In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji - an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets all containing at least one emoji, which has been annotated using one of the three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected, describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus, considering also possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements.",

author = "Giulia Donato and Patrizia Paggio",

year = "2017",

language = "English",

isbn = "978-1-945626-95-1",

pages = "118--126",

booktitle = "Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2017",

publisher = "Association for Computational Linguistics",

note = "8th Workshop on Computational Approaches to Subjectivity, Sentiment \& Social Media Analysis, WASSA 2017 ; Conference date: 08-09-2017 Through 08-09-2017",

url = "http://WASSA 2017",

}

TY - GEN

T1 - Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus

AU - Donato, Giulia

AU - Paggio, Patrizia

PY - 2017

Y1 - 2017

N2 - In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji - an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets all containing at least one emoji, which has been annotated using one of the three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected, describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus, considering also possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements.

UR - http://www.aclweb.org/anthology/W17-5200

M3 - Article in proceedings

SN - 978-1-945626-95-1

SP - 118

EP - 126

BT - Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2017

PB - Association for Computational Linguistics

CY - Stroudsburg, PA

T2 - 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

Y2 - 8 September 2017 through 8 September 2017

ER -
