Abstract
We predict which claims in a political debate should be prioritized
for fact-checking. A particular challenge is, given a debate, to
produce a ranked list of its sentences by their worthiness for fact
checking. We develop a Recurrent Neural Network (RNN) model that
learns a sentence embedding, which is then used to predict the
check-worthiness of a sentence. Our sentence embedding encodes both
semantic and syntactic dependencies using pretrained word2vec word
embeddings as well as part-of-speech tagging and syntactic dependency
parsing. This yields a multi-representation of each word, which we use
as input to an RNN with GRU memory units; the per-word outputs are
aggregated using attention, followed by a fully connected layer, from
whose output the check-worthiness score is predicted using a sigmoid
function. Our techniques perform well overall, achieving the second
best performing run (MAP: 0.1152) in the competition, as well as the
highest overall performance (MAP: 0.1810) with our contrastive run, a
32% improvement over the second highest MAP score in the English
language category. In our primary run we combined our sentence
embedding with state-of-the-art check-worthiness features, whereas in
the contrastive run we considered our sentence embedding alone.
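The pipeline described in the abstract can be sketched in a few lines of NumPy. Everything below is illustrative: the dimensions, parameter names, random initialization, and the from-scratch GRU/attention formulation are assumptions for exposition, not the authors' actual implementation or trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, U, b):
    """One GRU step. W: (3, hid, in), U: (3, hid, hid), b: (3, hid)."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])        # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])        # reset gate
    n = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1 - z) * h + z * n

def score_sentence(word_vecs, pos_onehots, params):
    """Multi-representation -> GRU -> attention -> dense -> sigmoid."""
    W, U, b, v_att, w_out, b_out = params
    h = np.zeros(U.shape[1])
    states = []
    for wv, pv in zip(word_vecs, pos_onehots):
        # multi-representation: word2vec vector concatenated with POS one-hot
        # (a dependency-parse feature would be concatenated the same way)
        x = np.concatenate([wv, pv])
        h = gru_step(x, h, W, U, b)
        states.append(h)
    H = np.stack(states)                 # (T, hid) per-word GRU outputs
    e = H @ v_att                        # attention scores, one per word
    a = np.exp(e - e.max()); a /= a.sum()
    s = a @ H                            # attended sentence embedding
    return sigmoid(w_out @ s + b_out)    # check-worthiness probability

# Toy dimensions and random (untrained) parameters, purely for illustration.
emb, pos, hid, T = 8, 4, 6, 5
W = rng.normal(size=(3, hid, emb + pos)) * 0.1
U = rng.normal(size=(3, hid, hid)) * 0.1
b = np.zeros((3, hid))
params = (W, U, b, rng.normal(size=hid), rng.normal(size=hid), 0.0)

words = rng.normal(size=(T, emb))                      # stand-in word2vec vectors
pos_tags = np.eye(pos)[rng.integers(0, pos, size=T)]   # stand-in one-hot POS tags
p = score_sentence(words, pos_tags, params)            # a probability in (0, 1)
```

In the actual system the sigmoid output would be used to rank all sentences of a debate by descending score; here the parameters are random, so the score only demonstrates the data flow.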
Original language | English |
---|---|
Title | CLEF 2018 Working Notes |
Editors | Linda Cappellato, Nicola Ferro, Jian-Yun Nie, Laure Soulier |
Number of pages | 8 |
Publisher | CEUR-WS.org |
Publication date | 2018 |
Edition | 10 |
Article number | 81 |
Status | Published - 2018 |
Event | 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France. Duration: 10 Sep 2018 → 14 Sep 2018 |

Conference

Conference | 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 |
---|---|
Country/Territory | France |
City | Avignon |
Period | 10/09/2018 → 14/09/2018 |

Name | CEUR Workshop Proceedings |
---|---|
Volume | 2125 |
ISSN | 1613-0073 |