Abstract
Evidence-based information provided by statistical models can guide decision making in clinical and preventive medicine. For example, high blood pressure and current smoking are risk factors for cardiovascular diseases. To develop personalized treatment or to convince patients to participate in appropriate prevention programs, it is important to assess the individual risk with high accuracy. Generally, genetic information plays an important role in many diseases and will help to improve the accuracy of existing risk prediction models. However, conventional regression models have several limitations when the information is high-dimensional, e.g. when there are many thousands of genes or markers. In these situations, machine learning methods such as the random forest can still be applied and provide reasonable prediction accuracy.
The main focus of this talk is the performance of random forest, in particular when the response is three-dimensional. In a diagnostic study of inflammatory bowel disease, three classes of patients have to be diagnosed based on microarray gene-expression data. The performance of random forest is compared with that of elastic net on a probability scale and on a classification scale. For survival analysis with competing risks, I present an extension of random forest that uses time-dependent pseudo-values to build event risk prediction models. This approach is evaluated with data from the Copenhagen stroke study. Further, I will explain how to use the R package "pec" to evaluate random forests using prediction error curves in survival analysis.
Original language | English |
---|---|
Place of Publication | København |
Publisher | Københavns Universitet |
Volume | 1 |
Edition | 1 |
Publication status | Published - 2011 |