The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab

Dongsheng Wang; Jakob Grue Simonsen; Birger Larsen; Christina Lioma

Department of Computer Science

2 Citations (Scopus)

65 Downloads (Pure)

Abstract

Given a set of political debate claims that have already been identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, or half-true, or whether the system is unsure of its factuality. We implement a variety of models, including Bayes, SVM, and RNN models, either to assist our model step-wise or to serve as potential baselines. We then develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true, or unsure, as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google, using as much of the original claim as possible. We strategically select the most relevant but sufficient documents with respect to the claims, and extract features such as the title, the total number of results returned, and the snippet to train the prediction model. We submitted results from the SVM and the CNNs, and our techniques performed well overall, achieving the best-performing run in the competition (with the lowest error rate, 0.7050, from our SVM and the highest accuracy, 46.76%, from our CNNs).
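The abstract mentions multi-scale CNNs with different kernel sizes but gives no architectural details. The following is a minimal sketch of the general idea only: convolution filters of several widths slide over a sequence of token embeddings, and each filter's strongest response is kept (max-over-time pooling). The function names, the toy embeddings, and the kernel weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Kernel = List[Vector]  # k rows (kernel width) x d columns (embedding size)


def conv1d_max(embeddings: Sequence[Vector], kernel: Kernel) -> float:
    """Slide one kernel over the token sequence, apply ReLU, and
    return the max-over-time response."""
    k = len(kernel)
    best = 0.0  # ReLU floor: negative responses are clipped to zero
    for start in range(len(embeddings) - k + 1):
        window = embeddings[start:start + k]
        response = sum(
            w * x
            for w_row, x_row in zip(kernel, window)
            for w, x in zip(w_row, x_row)
        )
        best = max(best, response)
    return best


def multi_scale_features(
    embeddings: Sequence[Vector],
    kernels_by_size: Dict[int, List[Kernel]],
) -> List[float]:
    """Concatenate pooled responses from kernels of several widths."""
    features: List[float] = []
    for size in sorted(kernels_by_size):
        for kernel in kernels_by_size[size]:
            features.append(conv1d_max(embeddings, kernel))
    return features


# Toy input: four tokens with 2-dimensional embeddings (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Hypothetical kernels: one of width 1 and two of width 2.
kernels = {
    1: [[[1.0, -1.0]]],
    2: [
        [[1.0, 0.0], [0.0, 1.0]],
        [[-1.0, -1.0], [-1.0, -1.0]],
    ],
}

features = multi_scale_features(tokens, kernels)
print(features)  # [1.0, 2.0, 0.0]
```

In a full model, the concatenated feature vector would typically feed a classification layer over the four labels (true, false, half-true, unsure); that layer and any training procedure are omitted here because the abstract does not describe them.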

Original language	English
Title	CLEF 2018 Working Notes
Editors	Linda Cappellato, Nicola Ferro, Jian-Yun Nie, Laure Soulier
Number of pages	10
Publisher	CEUR-WS.org
Publication date	2018
Edition	10
Article number	98
Status	Published - 2018
Event	19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France. Duration: 10 Sep 2018 → 14 Sep 2018

Conference

Conference	19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018
Country/Territory	France
City	Avignon
Period	10/09/2018 → 14/09/2018

Name	CEUR Workshop Proceedings
Volume	2125
ISSN	1613-0073

Access to document

paper_98: Publisher's published version, 556 KB

Other files and links

Link to publication in Scopus

Citation formats

The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab. / Wang, Dongsheng; Simonsen, Jakob Grue; Larsen, Birger et al.
CLEF 2018 Working Notes. ed. / Linda Cappellato; Nicola Ferro; Jian-Yun Nie; Laure Soulier. 10th ed. CEUR-WS.org, 2018. 98 (CEUR Workshop Proceedings, Vol. 2125).

Publication: Contribution to book/anthology/report › Article in proceedings › Research › peer-review

Wang, D, Simonsen, JG, Larsen, B & Lioma, C 2018, The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab. in L Cappellato, N Ferro, J-Y Nie & L Soulier (eds), CLEF 2018 Working Notes. 10 edn, 98, CEUR-WS.org, CEUR Workshop Proceedings, vol. 2125, 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018, Avignon, France, 10/09/2018.

Wang D, Simonsen JG, Larsen B, Lioma C. The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab. In Cappellato L, Ferro N, Nie JY, Soulier L, editors, CLEF 2018 Working Notes. 10 ed. CEUR-WS.org. 2018. 98. (CEUR Workshop Proceedings, Vol. 2125).

Wang, Dongsheng; Simonsen, Jakob Grue; Larsen, Birger et al. / The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab. CLEF 2018 Working Notes. ed. / Linda Cappellato; Nicola Ferro; Jian-Yun Nie; Laure Soulier. 10th ed. CEUR-WS.org, 2018. (CEUR Workshop Proceedings, Vol. 2125).

@inproceedings{bed59021cac0422aac254904dc8331f6,

title = "The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab",

abstract = "Given a set of political debate claims that have been already identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, half-true or that it is unsure of its factuality. We implement a variety of models, including Bayes, SVM, RNN, to either step-wise assist our model or work as potential baselines. Then, we develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true or unsure as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google with as much original claim as possible. We strategically select most relevant but sufficient documents with respect to the claims, and extract features, such as title, total number of results returned, and snippet to train the prediction model. We submitted results of SVM and CNNs, and the overall performance of our techniques is successful, achieving the overall best performing run (with lowest error rate 0.7050 from our SVM and highest accuracy 46.76% from our CNNs) in the competition.",

keywords = "CNN, Fact checking, Political debates, RNN",

author = "Dongsheng Wang and Simonsen, {Jakob Grue} and Birger Larsen and Christina Lioma",

year = "2018",

language = "English",

series = "CEUR Workshop Proceedings",

publisher = "CEUR-WS.org",

editor = "Linda Cappellato and Nicola Ferro and Jian-Yun Nie and Laure Soulier",

booktitle = "CLEF 2018 Working Notes",

edition = "10",

note = "19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 ; Conference date: 10-09-2018 Through 14-09-2018",

}

TY - GEN

T1 - The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab

AU - Wang, Dongsheng

AU - Simonsen, Jakob Grue

AU - Larsen, Birger

AU - Lioma, Christina

PY - 2018

Y1 - 2018

N2 - Given a set of political debate claims that have been already identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, half-true or that it is unsure of its factuality. We implement a variety of models, including Bayes, SVM, RNN, to either step-wise assist our model or work as potential baselines. Then, we develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true or unsure as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google with as much original claim as possible. We strategically select most relevant but sufficient documents with respect to the claims, and extract features, such as title, total number of results returned, and snippet to train the prediction model. We submitted results of SVM and CNNs, and the overall performance of our techniques is successful, achieving the overall best performing run (with lowest error rate 0.7050 from our SVM and highest accuracy 46.76% from our CNNs) in the competition.

AB - Given a set of political debate claims that have been already identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, half-true or that it is unsure of its factuality. We implement a variety of models, including Bayes, SVM, RNN, to either step-wise assist our model or work as potential baselines. Then, we develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true or unsure as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google with as much original claim as possible. We strategically select most relevant but sufficient documents with respect to the claims, and extract features, such as title, total number of results returned, and snippet to train the prediction model. We submitted results of SVM and CNNs, and the overall performance of our techniques is successful, achieving the overall best performing run (with lowest error rate 0.7050 from our SVM and highest accuracy 46.76% from our CNNs) in the competition.

KW - CNN

KW - Fact checking

KW - Political debates

KW - RNN

UR - http://www.scopus.com/inward/record.url?scp=85051059348&partnerID=8YFLogxK

M3 - Article in proceedings

AN - SCOPUS:85051059348

T3 - CEUR Workshop Proceedings

BT - CLEF 2018 Working Notes

A2 - Cappellato, Linda

A2 - Ferro, Nicola

A2 - Nie, Jian-Yun

A2 - Soulier, Laure

PB - CEUR-WS.org

T2 - 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018

Y2 - 10 September 2018 through 14 September 2018

ER -
