A random forest approach for competing risks based on pseudo-values

UB Mogensen; TA Gerds

Section of Biostatistics

10 Citations (Scopus)

Abstract

Random forest is a supervised learning method that combines many classification or regression trees for prediction. Here we describe an extension of the random forest method for building event risk prediction models in survival analysis with competing risks. In case of right-censored data, the event status at the prediction horizon is unknown for some subjects. We propose to replace the censored event status by a jackknife pseudo-value, and then to apply an implementation of random forests for uncensored data. Because the pseudo-responses take on values on a continuous scale, the node variance is chosen as split criterion for growing regression trees. In a simulation study, the pseudo split criterion is compared with the Gini split criterion when the latter is applied to the uncensored event status. To investigate the resulting pseudo random forest method for building risk prediction models, we analyze it in a simulation study of predictive performance where we compare it to Cox regression and random survival forest. The method is further illustrated in two real data sets.
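The pseudo-value step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the cumulative incidence at the prediction horizon is estimated with the Aalen-Johansen estimator and that the jackknife pseudo-value for subject i is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. The function names, the `cause` argument, and the coding of `status` (0 = censored, 1, 2, ... = event type) are assumptions made for this example.

```python
import numpy as np


def cumulative_incidence(time, status, horizon, cause=1):
    """Aalen-Johansen estimate of P(T <= horizon, event type == cause).

    status: 0 = right-censored, positive integers = event type (assumed coding).
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    surv = 1.0  # all-cause event-free probability just before the current time
    cif = 0.0   # cumulative incidence of the cause of interest
    event_times = np.unique(time[(status != 0) & (time <= horizon)])
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status != 0))
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
    return cif


def jackknife_pseudo_values(time, status, horizon, cause=1):
    """Leave-one-out pseudo-values: n * F_hat - (n - 1) * F_hat without subject i."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    n = len(time)
    full = cumulative_incidence(time, status, horizon, cause)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # drop subject i
        loo = cumulative_incidence(time[keep], status[keep], horizon, cause)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True   # restore subject i
    return pseudo
```

Without censoring, these pseudo-values reduce to the observed event status at the horizon (0 or 1). With censoring they are continuous, which is why a regression forest with a variance-based split criterion is the natural next step. Under this sketch, the pseudo-values would serve as the response for any regression forest implementation written for uncensored data.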

Original language	Undefined/Unknown
Journal	Statistics in Medicine
Volume	32
Issue number	18
Pages (from-to)	3102-3114
Number of pages	13
ISSN	0277-6715
Publication status	Published - 15 Aug 2013

Cite this

@article{7f0f39d33aba4a0da3a4a8804040b0b7,

title = "A random forest approach for competing risks based on pseudo-values",

abstract = "Random forest is a supervised learning method that combines many classification or regression trees for prediction. Here we describe an extension of the random forest method for building event risk prediction models in survival analysis with competing risks. In case of right-censored data, the event status at the prediction horizon is unknown for some subjects. We propose to replace the censored event status by a jackknife pseudo-value, and then to apply an implementation of random forests for uncensored data. Because the pseudo-responses take on values on a continuous scale, the node variance is chosen as split criterion for growing regression trees. In a simulation study, the pseudo split criterion is compared with the Gini split criterion when the latter is applied to the uncensored event status. To investigate the resulting pseudo random forest method for building risk prediction models, we analyze it in a simulation study of predictive performance where we compare it to Cox regression and random survival forest. The method is further illustrated in two real data sets.",

author = "UB Mogensen and TA Gerds",

year = "2013",

month = aug,

day = "15",

language = "Undefined/Unknown",

volume = "32",

pages = "3102--3114",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "John Wiley \& Sons Ltd",

number = "18",

}

TY - JOUR

T1 - A random forest approach for competing risks based on pseudo-values

AU - Mogensen, UB

AU - Gerds, TA

PY - 2013/8/15

Y1 - 2013/8/15

AB - Random forest is a supervised learning method that combines many classification or regression trees for prediction. Here we describe an extension of the random forest method for building event risk prediction models in survival analysis with competing risks. In case of right-censored data, the event status at the prediction horizon is unknown for some subjects. We propose to replace the censored event status by a jackknife pseudo-value, and then to apply an implementation of random forests for uncensored data. Because the pseudo-responses take on values on a continuous scale, the node variance is chosen as split criterion for growing regression trees. In a simulation study, the pseudo split criterion is compared with the Gini split criterion when the latter is applied to the uncensored event status. To investigate the resulting pseudo random forest method for building risk prediction models, we analyze it in a simulation study of predictive performance where we compare it to Cox regression and random survival forest. The method is further illustrated in two real data sets.

M3 - Journal article

SN - 0277-6715

VL - 32

SP - 3102

EP - 3114

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 18

ER -