Covariate selection for the semiparametric additive risk model

Torben Martinussen; Thomas Scheike

29 Citations (Scopus)

Abstract

This paper considers covariate selection for the additive hazards
model. This model is particularly simple to study theoretically and its
practical implementation has several major advantages to the similar
methodology for the proportional hazards model. One complication
compared with the proportional model is, however, that there is no
simple likelihood to work with. We here study a least squares criterion
with desirable properties and show how this criterion can be
interpreted as a prediction error. Given this criterion, we define
ridge and Lasso estimators as well as an adaptive Lasso and study their
large sample properties for the situation where the number of
covariates p is smaller than the number of observations. We also show
that the adaptive Lasso has the oracle property. In many practical
situations, it is more relevant to tackle the situation with large p
compared with the number of observations. We do this by studying the
properties of the so-called Dantzig selector in the setting of the
additive risk model. Specifically, we establish a bound on how close
the solution is to a true sparse signal in the case where the number of
covariates is large. In a simulation study, we also compare the Dantzig
and adaptive Lasso for a moderate to small number of covariates. The
methods are applied to a breast cancer data set with gene expression
recordings and to the primary biliary cirrhosis clinical data.

Original language	English
Journal	Scandinavian Journal of Statistics
Volume	36
Issue number	4
Pages (from-to)	602-619
Number of pages	18
ISSN	0303-6898
Publication status	Published - 2009
Externally published	Yes

Cite this

@article{c08503d2edd64f6bb7e61249c4181e28,

title = "Covariate selection for the semiparametric additive risk model",

abstract = "This paper considers covariate selection for the additive hazards model. This model is particularly simple to study theoretically and its practical implementation has several major advantages to the similar methodology for the proportional hazards model. One complication compared with the proportional model is, however, that there is no simple likelihood to work with. We here study a least squares criterion with desirable properties and show how this criterion can be interpreted as a prediction error. Given this criterion, we define ridge and Lasso estimators as well as an adaptive Lasso and study their large sample properties for the situation where the number of covariates p is smaller than the number of observations. We also show that the adaptive Lasso has the oracle property. In many practical situations, it is more relevant to tackle the situation with large p compared with the number of observations. We do this by studying the properties of the so-called Dantzig selector in the setting of the additive risk model. Specifically, we establish a bound on how close the solution is to a true sparse signal in the case where the number of covariates is large. In a simulation study, we also compare the Dantzig and adaptive Lasso for a moderate to small number of covariates. The methods are applied to a breast cancer data set with gene expression recordings and to the primary biliary cirrhosis clinical data.",

author = "Torben Martinussen and Thomas Scheike",

year = "2009",

language = "English",

volume = "36",

pages = "602--619",

journal = "Scandinavian Journal of Statistics",

issn = "0303-6898",

publisher = "Wiley-Blackwell",

number = "4",

}

TY - JOUR

T1 - Covariate selection for the semiparametric additive risk model

AU - Martinussen, Torben

AU - Scheike, Thomas

PY - 2009

Y1 - 2009

N2 - This paper considers covariate selection for the additive hazards model. This model is particularly simple to study theoretically and its practical implementation has several major advantages to the similar methodology for the proportional hazards model. One complication compared with the proportional model is, however, that there is no simple likelihood to work with. We here study a least squares criterion with desirable properties and show how this criterion can be interpreted as a prediction error. Given this criterion, we define ridge and Lasso estimators as well as an adaptive Lasso and study their large sample properties for the situation where the number of covariates p is smaller than the number of observations. We also show that the adaptive Lasso has the oracle property. In many practical situations, it is more relevant to tackle the situation with large p compared with the number of observations. We do this by studying the properties of the so-called Dantzig selector in the setting of the additive risk model. Specifically, we establish a bound on how close the solution is to a true sparse signal in the case where the number of covariates is large. In a simulation study, we also compare the Dantzig and adaptive Lasso for a moderate to small number of covariates. The methods are applied to a breast cancer data set with gene expression recordings and to the primary biliary cirrhosis clinical data.

AB - This paper considers covariate selection for the additive hazards model. This model is particularly simple to study theoretically and its practical implementation has several major advantages to the similar methodology for the proportional hazards model. One complication compared with the proportional model is, however, that there is no simple likelihood to work with. We here study a least squares criterion with desirable properties and show how this criterion can be interpreted as a prediction error. Given this criterion, we define ridge and Lasso estimators as well as an adaptive Lasso and study their large sample properties for the situation where the number of covariates p is smaller than the number of observations. We also show that the adaptive Lasso has the oracle property. In many practical situations, it is more relevant to tackle the situation with large p compared with the number of observations. We do this by studying the properties of the so-called Dantzig selector in the setting of the additive risk model. Specifically, we establish a bound on how close the solution is to a true sparse signal in the case where the number of covariates is large. In a simulation study, we also compare the Dantzig and adaptive Lasso for a moderate to small number of covariates. The methods are applied to a breast cancer data set with gene expression recordings and to the primary biliary cirrhosis clinical data.

M3 - Journal article

SN - 0303-6898

VL - 36

SP - 602

EP - 619

JO - Scandinavian Journal of Statistics

JF - Scandinavian Journal of Statistics

IS - 4

ER -