Abstract
This paper describes the winning approach used by the Copenhagen team in the CLEF-2019 CheckThat! lab. Given a political debate or speech, the aim is to predict which sentences should be prioritized for fact-checking by producing a ranked list of sentences. While many approaches to check-worthiness detection exist, we are the first to directly optimize the sentence ranking; all previous work has relied solely on standard classification-based loss functions. We present a recurrent neural network model that learns a sentence encoding, from which a check-worthiness score is predicted. The model is trained by jointly optimizing a binary cross-entropy loss and a ranking-based pairwise hinge loss. We obtain sentence pairs for training through contrastive sampling: for each sentence we find the k most semantically similar sentences with the opposite label. To increase the generalizability of the model, we use weak supervision, applying an existing check-worthiness approach to weakly label a large unlabeled dataset. We experimentally show that weak supervision and the ranking component each improve the results individually (MAP increases of 25% and 9%, respectively) and improve them even further when combined (39% increase). In a comparison to existing state-of-the-art check-worthiness methods, our approach improves the MAP score by 11%.
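To make the training objective concrete, the sketch below (assuming PyTorch; the names `SentenceScorer`, `joint_loss`, and `contrastive_pairs` are illustrative and not the authors' released code) shows how a recurrent sentence encoder could produce check-worthiness scores that are trained with a combined binary cross-entropy and pairwise hinge loss over contrastively sampled sentence pairs, as described in the abstract.

```python
# Minimal sketch (assumed PyTorch; illustrative names, not the authors' implementation)
# of the joint objective: binary cross-entropy on per-sentence scores plus a pairwise
# hinge loss over contrastively sampled pairs of sentences with opposite labels.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SentenceScorer(nn.Module):
    """Recurrent sentence encoder that outputs a scalar check-worthiness logit."""

    def __init__(self, vocab_size, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, 1)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> one logit per sentence
        emb = self.embed(token_ids)
        _, (h, _) = self.rnn(emb)
        sent = torch.cat([h[-2], h[-1]], dim=-1)  # concat forward/backward final states
        return self.out(sent).squeeze(-1)


def contrastive_pairs(embeddings, labels, k=3):
    """Pair each sentence with its k most similar sentences of the opposite label."""
    normed = F.normalize(embeddings, dim=-1)
    sims = normed @ normed.T
    pairs = []
    for i in range(len(labels)):
        opposite = (labels != labels[i]).nonzero(as_tuple=True)[0]
        if len(opposite) == 0:
            continue
        top = opposite[sims[i, opposite].topk(min(k, len(opposite))).indices]
        for j in top.tolist():
            # Store as (check-worthy index, non-check-worthy index).
            pairs.append((i, j) if labels[i] == 1 else (j, i))
    return pairs


def joint_loss(pos_logits, neg_logits, margin=1.0, alpha=0.5):
    """BCE on both members of each pair plus a pairwise hinge ranking term.

    pos_logits / neg_logits: logits of the check-worthy and non-check-worthy
    sentence in each sampled pair. margin and alpha are assumed hyperparameters.
    """
    bce = (
        F.binary_cross_entropy_with_logits(pos_logits, torch.ones_like(pos_logits))
        + F.binary_cross_entropy_with_logits(neg_logits, torch.zeros_like(neg_logits))
    )
    # Hinge: the check-worthy sentence should outscore its paired negative by `margin`.
    hinge = F.relu(margin - (pos_logits - neg_logits)).mean()
    return alpha * bce + (1 - alpha) * hinge
```

At inference time, only the scorer is needed: sentences in a debate are ranked by their predicted logits to produce the check-worthiness ranking.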
Original language | English
---|---
Journal | CEUR Workshop Proceedings
Volume | 2380
Number of pages | 8
ISSN | 1613-0073
Publication status | Published - 2019
Event | 20th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2019, Lugano, Switzerland, 9 Sept 2019 → 12 Sept 2019
Conference
Conference | 20th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2019
---|---
Country/Territory | Switzerland
City | Lugano
Period | 09/09/2019 → 12/09/2019
Keywords
- Contrastive ranking
- Fact check-worthiness
- Neural networks