Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge

Esther E. Bron; Marion Smits; Wiesje M. van der Flier; Hugo Vrenken; Frederik Barkhof; Philip Scheltens; Janne M. Papma; Rebecca M.E. Steketee; Carolina Méndez Orellana; Rozanna Meijboom; Madalena Pinto; Joana R. Meireles; Carolina Garrett; António J. Bastos-Leite; Ahmed Abdulkadir; Olaf Ronneberger; Nicola Amoroso; Roberto Bellotti; David Cárdenas-Peña; Andrés M. Álvarez-Meza; Chester V. Dolph; Khan M. Iftekharuddin; Simon Fristed Eskildsen; Pierrick Coupé; Vladimir S. Fonov; Katja Franke; Christian Gaser; Christian Ledig; Ricardo Guerrero; Tong Tong; Katherine R. Gray; Elaheh Moradi; Jussi Tohka; Alexandre Routier; Stanley Durrleman; Alessia Sarica; Giuseppe Di Fatta; Francesco Sensi; Andrea Chincarini; Garry M. Smith; Zhivko V. Stoyanov; Lauge Emil Borch Laurs Sørensen; Mads Nielsen; Sabina Tangaro; Paolo Inglese; Christian Wachinger; Martin Reuter; John C. van Swieten; Wiro J. Niessen; Stefan Klein

doi:10.1016/j.neuroimage.2015.01.048

Department of Computer Science

163 Citations (Scopus)

Abstract

Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.

Original language	English
Journal	NeuroImage
Volume	111
Pages (from-to)	562-579
Number of pages	18
ISSN	1053-8119
DOIs	https://doi.org/10.1016/j.neuroimage.2015.01.048
Publication status	Published - 1 May 2015

Cite this

Bron, E. E., Smits, M., van der Flier, W. M., Vrenken, H., Barkhof, F., Scheltens, P., Papma, J. M., Steketee, R. M. E., Méndez Orellana, C., Meijboom, R., Pinto, M., Meireles, J. R., Garrett, C., Bastos-Leite, A. J., Abdulkadir, A., Ronneberger, O., Amoroso, N., Bellotti, R., Cárdenas-Peña, D., ... Klein, S. (2015). Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge. NeuroImage, 111, 562-579. https://doi.org/10.1016/j.neuroimage.2015.01.048

Bron, EE, Smits, M, van der Flier, WM, Vrenken, H, Barkhof, F, Scheltens, P, Papma, JM, Steketee, RME, Méndez Orellana, C, Meijboom, R, Pinto, M, Meireles, JR, Garrett, C, Bastos-Leite, AJ, Abdulkadir, A, Ronneberger, O, Amoroso, N, Bellotti, R, Cárdenas-Peña, D, Álvarez-Meza, AM, Dolph, CV, Iftekharuddin, KM, Eskildsen, SF, Coupé, P, Fonov, VS, Franke, K, Gaser, C, Ledig, C, Guerrero, R, Tong, T, Gray, KR, Moradi, E, Tohka, J, Routier, A, Durrleman, S, Sarica, A, Di Fatta, G, Sensi, F, Chincarini, A, Smith, GM, Stoyanov, ZV, Sørensen, LEBL, Nielsen, M, Tangaro, S, Inglese, P, Wachinger, C, Reuter, M, van Swieten, JC, Niessen, WJ & Klein, S 2015, 'Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge', NeuroImage, vol. 111, pp. 562-579. https://doi.org/10.1016/j.neuroimage.2015.01.048

@article{81eec22821374c1e8cc5d43e61a023cd,

title = "Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge",

abstract = "Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.",

keywords = "Alzheimer's disease, Challenge, Classification, Computer-aided diagnosis, Mild cognitive impairment, Structural MRI",

author = "Bron, {Esther E.} and Marion Smits and {van der Flier}, {Wiesje M.} and Hugo Vrenken and Frederik Barkhof and Philip Scheltens and Papma, {Janne M.} and Steketee, {Rebecca M.E.} and {M{\'e}ndez Orellana}, Carolina and Rozanna Meijboom and Madalena Pinto and Meireles, {Joana R.} and Carolina Garrett and Bastos-Leite, {Ant{\'o}nio J.} and Ahmed Abdulkadir and Olaf Ronneberger and Nicola Amoroso and Roberto Bellotti and David C{\'a}rdenas-Pe{\~n}a and {\'A}lvarez-Meza, {Andr{\'e}s M.} and Dolph, {Chester V.} and Iftekharuddin, {Khan M.} and Eskildsen, {Simon Fristed} and Pierrick Coup{\'e} and Fonov, {Vladimir S.} and Katja Franke and Christian Gaser and Christian Ledig and Ricardo Guerrero and Tong Tong and Gray, {Katherine R.} and Elaheh Moradi and Jussi Tohka and Alexandre Routier and Stanley Durrleman and Alessia Sarica and {Di Fatta}, Giuseppe and Francesco Sensi and Andrea Chincarini and Smith, {Garry M.} and Stoyanov, {Zhivko V.} and S{\o}rensen, {Lauge Emil Borch Laurs} and Mads Nielsen and Sabina Tangaro and Paolo Inglese and Christian Wachinger and Martin Reuter and {van Swieten}, {John C.} and Niessen, {Wiro J.} and Stefan Klein",

year = "2015",

month = may,

day = "1",

doi = "10.1016/j.neuroimage.2015.01.048",

language = "English",

volume = "111",

pages = "562--579",

journal = "NeuroImage",

issn = "1053-8119",

publisher = "Elsevier",

}

TY - JOUR

T1 - Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI

T2 - The CADDementia challenge

AU - Bron, Esther E.

AU - Smits, Marion

AU - van der Flier, Wiesje M.

AU - Vrenken, Hugo

AU - Barkhof, Frederik

AU - Scheltens, Philip

AU - Papma, Janne M.

AU - Steketee, Rebecca M.E.

AU - Méndez Orellana, Carolina

AU - Meijboom, Rozanna

AU - Pinto, Madalena

AU - Meireles, Joana R.

AU - Garrett, Carolina

AU - Bastos-Leite, António J.

AU - Abdulkadir, Ahmed

AU - Ronneberger, Olaf

AU - Amoroso, Nicola

AU - Bellotti, Roberto

AU - Cárdenas-Peña, David

AU - Álvarez-Meza, Andrés M.

AU - Dolph, Chester V.

AU - Iftekharuddin, Khan M.

AU - Eskildsen, Simon Fristed

AU - Coupé, Pierrick

AU - Fonov, Vladimir S.

AU - Franke, Katja

AU - Gaser, Christian

AU - Ledig, Christian

AU - Guerrero, Ricardo

AU - Tong, Tong

AU - Gray, Katherine R.

AU - Moradi, Elaheh

AU - Tohka, Jussi

AU - Routier, Alexandre

AU - Durrleman, Stanley

AU - Sarica, Alessia

AU - Di Fatta, Giuseppe

AU - Sensi, Francesco

AU - Chincarini, Andrea

AU - Smith, Garry M.

AU - Stoyanov, Zhivko V.

AU - Sørensen, Lauge Emil Borch Laurs

AU - Nielsen, Mads

AU - Tangaro, Sabina

AU - Inglese, Paolo

AU - Wachinger, Christian

AU - Reuter, Martin

AU - van Swieten, John C.

AU - Niessen, Wiro J.

AU - Klein, Stefan

PY - 2015/5/1

Y1 - 2015/5/1

N2 - Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.

KW - Alzheimer's disease

KW - Challenge

KW - Classification

KW - Computer-aided diagnosis

KW - Mild cognitive impairment

KW - Structural MRI

U2 - 10.1016/j.neuroimage.2015.01.048

DO - 10.1016/j.neuroimage.2015.01.048

M3 - Journal article

C2 - 25652394

SN - 1053-8119

VL - 111

SP - 562

EP - 579

JO - NeuroImage

JF - NeuroImage

ER -
