Model Selection and Risk Estimation with Applications to Nonlinear Ordinary Differential Equation Systems

Frederik Vissing Mikkelsen

Abstract

Broadly speaking, this thesis is devoted to model selection applied to ordinary differential
equations and risk estimation under model selection. A model selection framework was developed
for modelling time course data by ordinary differential equations. The framework
is accompanied by the R software package, episode. This package incorporates a collection
of sparsity inducing penalties into two types of loss functions: a squared loss function relying
on numerically solving the equations and an approximate loss function based on inverse
collocation methods. The goal of this framework is to provide effective computational tools
for estimating unknown structures in dynamical systems, such as gene regulatory networks,
which may be used to predict downstream effects of interventions in the system. A recommended
algorithm based on the computational tools is presented and thoroughly tested in
various simulation studies and applications.
The second part of the thesis also concerns model selection, but focuses on risk estimation,
i.e., estimating the error of mean estimators involving model selection. An extension of Stein's
unbiased risk estimate (SURE), which applies to a class of estimators with model selection,
is developed. The extension relies on studying the degrees of freedom of the estimator, which
for a broad class of estimators decomposes into two terms: one ignoring the selection step
and one correcting for it. The classic SURE assumes that the estimator in question is almost
differentiable and it therefore only accounts for the first term of the decomposition. In order to
account for the second term, the continuum of models arising when the selection procedure has
a tuning parameter is studied. By exploiting the duality between varying the tuning parameter
for fixed observations and perturbing the observations for a fixed tuning parameter, an identity
is derived for a class of estimators which support the extension of SURE. The resulting
corrected version of SURE is generally fast to compute and for the lasso-OLS estimator it
shows promising results when compared to risk estimation via cross validation.
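
As a point of reference for the second part, the classical SURE that the thesis extends can be written as below. This is only a sketch under the standard Gaussian assumption $y \sim N(\mu, \sigma^2 I_n)$ with known $\sigma^2$; the symbols $y$, $\mu$, $\hat\mu$, $\sigma^2$, $n$ and $\mathrm{df}$ are notation assumed here rather than taken from the abstract, and the selection correction developed in the thesis is not reproduced.

```latex
% Classical SURE and degrees of freedom (assumed notation, Gaussian model).
\[
\widehat{R}(y) = \|y - \hat\mu(y)\|_2^2 - n\sigma^2
  + 2\sigma^2 \sum_{i=1}^{n} \frac{\partial \hat\mu_i}{\partial y_i}(y),
\qquad
\mathrm{df}(\hat\mu) = \frac{1}{\sigma^2} \sum_{i=1}^{n}
  \operatorname{Cov}\bigl(\hat\mu_i(y),\, y_i\bigr).
\]
```

For an almost differentiable $\hat\mu$, Stein's lemma gives $\mathrm{df}(\hat\mu) = E\bigl[\sum_i \partial \hat\mu_i / \partial y_i\bigr]$, so $\widehat{R}$ is unbiased for $E\|\hat\mu(y) - \mu\|_2^2$. When a selection step makes $\hat\mu$ discontinuous, the divergence term captures only the first term of the decomposition described above.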
Original language: English
Publisher: Department of Mathematical Sciences, Faculty of Science, University of Copenhagen
Number of pages: 139
Publication status: Published - 2017
