Abstract
We predict which claim in a political debate should be prioritized
for fact-checking. A particular challenge is, given a debate, to
produce a ranked list of its sentences based on their worthiness for
fact-checking. We develop a Recurrent Neural Network (RNN) model that
learns a sentence embedding, which is then used to predict the check-worthiness
of a sentence. Our sentence embedding encodes both semantic
and syntactic dependencies using pretrained word2vec word embeddings
as well as part-of-speech tagging and syntactic dependency parsing. This
results in a multi-representation of each word, which we use as input to an
RNN with GRU memory units; the output from each word is aggregated
using attention, followed by a fully connected layer, from which the output
is predicted using a sigmoid function. Our techniques perform well overall,
achieving the second-best performing run
(MAP: 0.1152) in the competition, as well as the highest overall performance
(MAP: 0.1810) for our contrastive run, a 32% improvement
over the second highest MAP score in the English language category. In
our primary run we combined our sentence embedding with state-of-the-art
check-worthiness features, whereas in the contrastive run we used
our sentence embedding alone.
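The pipeline the abstract describes (multi-representation word inputs, a GRU run over the sentence, attention-weighted aggregation into a sentence embedding, then a fully connected layer with a sigmoid output) can be sketched in plain NumPy. All dimensions, random initialisations, and parameter names below are illustrative assumptions for a single forward pass, not the paper's actual configuration or trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed, not taken from the paper)
d_word, d_pos, d_hid = 8, 4, 6
d_in = d_word + d_pos          # multi-representation size per word
T = 5                          # sentence length in words

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialised GRU parameters (z: update gate, r: reset gate, h: candidate)
Wz, Uz, bz = rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid)
Wr, Ur, br = rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid)
Wh, Uh, bh = rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid)

def gru_step(x, h):
    """One GRU memory-unit update for input x and previous hidden state h."""
    z = sigmoid(Wz @ x + Uz @ h + bz)
    r = sigmoid(Wr @ x + Ur @ h + br)
    h_cand = np.tanh(Wh @ x + Uh @ (r * h) + bh)
    return (1 - z) * h + z * h_cand

# Each word: a word2vec-style vector concatenated with an encoding of its
# POS tag / dependency role (both random stand-ins here)
sentence = [np.concatenate([rng.normal(size=d_word), rng.normal(size=d_pos)])
            for _ in range(T)]

# Run the GRU over the sentence, keeping the output at every word
h = np.zeros(d_hid)
states = []
for x in sentence:
    h = gru_step(x, h)
    states.append(h)
H = np.stack(states)           # shape (T, d_hid)

# Attention: score each word's output, softmax-normalise, and take the
# weighted sum as the sentence embedding
w_att = rng.normal(size=d_hid)
scores = H @ w_att
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()
sent_emb = alpha @ H           # shape (d_hid,)

# Fully connected layer + sigmoid -> check-worthiness score in (0, 1)
W_fc, b_fc = rng.normal(size=d_hid), 0.0
score = sigmoid(W_fc @ sent_emb + b_fc)
```

In the full system this score would be computed for every sentence in a debate and used to rank the sentences; here only one sentence is scored to keep the sketch minimal.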
Original language | English |
---|---|
Title of host publication | CLEF 2018 Working Notes |
Editors | Linda Cappellato, Nicola Ferro, Jian-Yun Nie, Laure Soulier |
Number of pages | 8 |
Publisher | CEUR-WS.org |
Publication date | 2018 |
Edition | 10 |
Article number | 81 |
Publication status | Published - 2018 |
Event | 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France. Duration: 10 Sept 2018 → 14 Sept 2018 |
Conference
Conference | 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 |
---|---|
Country/Territory | France |
City | Avignon |
Period | 10/09/2018 → 14/09/2018 |
Series | CEUR Workshop Proceedings |
---|---|
Volume | 2125 |
ISSN | 1613-0073 |
Keywords
- CNN
- Fact checking
- Political debates
- RNN