TY - GEN
T1 - The Copenhagen team participation in the factuality task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 Fact Checking Lab
AU - Wang, Dongsheng
AU - Simonsen, Jakob Grue
AU - Larsen, Birger
AU - Lioma, Christina
PY - 2018
Y1 - 2018
N2 - Given a set of political debate claims that have already been identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, or half-true, or whether it is unsure of the claim's factuality. We implement a variety of models, including Bayes, SVM, and RNN, either to assist our model step-wise or to serve as potential baselines. We then develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true, or unsure, as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google, using as much of the original claim as possible. We strategically select the most relevant yet sufficient documents with respect to the claims, and extract features such as the title, the total number of results returned, and the snippet to train the prediction model. We submitted results from the SVM and the CNNs, and our techniques achieved the overall best performing run in the competition (lowest error rate of 0.7050 from our SVM and highest accuracy of 46.76% from our CNNs).
AB - Given a set of political debate claims that have already been identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, or half-true, or whether it is unsure of the claim's factuality. We implement a variety of models, including Bayes, SVM, and RNN, either to assist our model step-wise or to serve as potential baselines. We then develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true, or unsure, as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google, using as much of the original claim as possible. We strategically select the most relevant yet sufficient documents with respect to the claims, and extract features such as the title, the total number of results returned, and the snippet to train the prediction model. We submitted results from the SVM and the CNNs, and our techniques achieved the overall best performing run in the competition (lowest error rate of 0.7050 from our SVM and highest accuracy of 46.76% from our CNNs).
KW - CNN
KW - Fact checking
KW - Political debates
KW - RNN
UR - http://www.scopus.com/inward/record.url?scp=85051059348&partnerID=8YFLogxK
M3 - Article in proceedings
AN - SCOPUS:85051059348
T3 - CEUR Workshop Proceedings
BT - CLEF 2018 Working Notes
A2 - Cappellato, Linda
A2 - Ferro, Nicola
A2 - Nie, Jian-Yun
A2 - Soulier, Laure
PB - CEUR-WS.org
T2 - 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018
Y2 - 10 September 2018 through 14 September 2018
ER -