Predicting distresses using deep learning of text segments in annual reports

Rastin Matin; Casper Hansen; Christian Hansen; Pia Mølgaard

doi:10.1016/j.eswa.2019.04.071

Corresponding author: Rastin Matin

6 Citations (Scopus)

Abstract

Corporate distress models are central to regulators and financial institutions that need to evaluate the default risk of corporate firms. They are traditionally only based on the numerical financial variables in the firms’ annual reports. In this paper we develop a model that employs the unstructured textual data in the reports as well, namely the auditors’ reports and managements’ statements. Our model consists of a convolutional recurrent neural network which, when concatenated with the numerical financial variables, learns a descriptive representation of the text that is suited for corporate distress prediction. We find that the unstructured data provides a statistically significant enhancement of the distress prediction performance, in particular for large firms where accurate predictions are of the utmost importance. Furthermore, we find that auditors’ reports are more informative than managements’ statements and that a joint model including both managements’ statements and auditors’ reports displays no enhancement relative to a model including only auditors’ reports. Our model demonstrates a direct improvement over existing state-of-the-art models in the field of distress modelling.

Original language	English
Journal	Expert Systems with Applications
Volume	132
Pages (from-to)	199-208
Number of pages	10
ISSN	0957-4174
DOIs	https://doi.org/10.1016/j.eswa.2019.04.071
Publication status	Published - 2019

Keywords

Convolutional neural networks
Corporate default prediction
Natural language processing
Recurrent neural networks

Cite this

@article{82583c29edf8423dbc36be8dc57eaaa9,
  title = "Predicting distresses using deep learning of text segments in annual reports",
  abstract = "Corporate distress models are central to regulators and financial institutions that need to evaluate the default risk of corporate firms. They are traditionally only based on the numerical financial variables in the firms{\textquoteright} annual reports. In this paper we develop a model that employs the unstructured textual data in the reports as well, namely the auditors{\textquoteright} reports and managements{\textquoteright} statements. Our model consists of a convolutional recurrent neural network which, when concatenated with the numerical financial variables, learns a descriptive representation of the text that is suited for corporate distress prediction. We find that the unstructured data provides a statistically significant enhancement of the distress prediction performance, in particular for large firms where accurate predictions are of the utmost importance. Furthermore, we find that auditors{\textquoteright} reports are more informative than managements{\textquoteright} statements and that a joint model including both managements{\textquoteright} statements and auditors{\textquoteright} reports displays no enhancement relative to a model including only auditors{\textquoteright} reports. Our model demonstrates a direct improvement over existing state-of-the-art models in the field of distress modelling.",
  keywords = "Convolutional neural networks, Corporate default prediction, Natural language processing, Recurrent neural networks",
  author = "Rastin Matin and Casper Hansen and Christian Hansen and Pia M{\o}lgaard",
  year = "2019",
  doi = "10.1016/j.eswa.2019.04.071",
  language = "English",
  volume = "132",
  pages = "199--208",
  journal = "Expert Systems with Applications",
  issn = "0957-4174",
  publisher = "Pergamon Press",
}

TY - JOUR
T1 - Predicting distresses using deep learning of text segments in annual reports
AU - Matin, Rastin
AU - Hansen, Casper
AU - Hansen, Christian
AU - Mølgaard, Pia
PY - 2019
Y1 - 2019
N2 - Corporate distress models are central to regulators and financial institutions that need to evaluate the default risk of corporate firms. They are traditionally only based on the numerical financial variables in the firms’ annual reports. In this paper we develop a model that employs the unstructured textual data in the reports as well, namely the auditors’ reports and managements’ statements. Our model consists of a convolutional recurrent neural network which, when concatenated with the numerical financial variables, learns a descriptive representation of the text that is suited for corporate distress prediction. We find that the unstructured data provides a statistically significant enhancement of the distress prediction performance, in particular for large firms where accurate predictions are of the utmost importance. Furthermore, we find that auditors’ reports are more informative than managements’ statements and that a joint model including both managements’ statements and auditors’ reports displays no enhancement relative to a model including only auditors’ reports. Our model demonstrates a direct improvement over existing state-of-the-art models in the field of distress modelling.

KW - Convolutional neural networks
KW - Corporate default prediction
KW - Natural language processing
KW - Recurrent neural networks
UR - http://www.scopus.com/inward/record.url?scp=85065489352&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2019.04.071
DO - 10.1016/j.eswa.2019.04.071
M3 - Journal article
AN - SCOPUS:85065489352
SN - 0957-4174
VL - 132
SP - 199
EP - 208
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -
