Abstract
Poisson regression is an important tool in register-based epidemiology where it is used to study the association between exposure variables and event rates. In this paper, we will discuss the situation with 'large n and small p', where n is the sample size and p is the number of available covariates. Specifically, we are concerned with modeling options when there are time-varying covariates that can have time-varying effects. One problem is that tests of the proportional hazards assumption, of no interactions between exposure and other observed variables, or of other modeling assumptions have large power due to the large sample size and will often indicate statistical significance even for numerically small deviations that are unimportant for the subject matter. Another problem is that information on important confounders may be unavailable. In practice, this situation may lead to simple working models that are then likely misspecified. To support and improve conclusions drawn from such models, we discuss methods for sensitivity analysis, for estimation of average exposure effects using aggregated data, and a semi-parametric bootstrap method to obtain robust standard errors. The methods are illustrated using data from the Danish national registries investigating the diabetes incidence for individuals treated with antipsychotics compared with the general unexposed population.
Original language | English |
---|---|
Journal | Statistics in Medicine |
Volume | 35 |
Issue number | 7 |
Pages (from-to) | 1117-1129 |
Number of pages | 13 |
ISSN | 0277-6715 |
DOIs | |
Publication status | Published - 30 Mar 2016 |