Abstract
Principal component analysis (PCA) has been used extensively in the field of nutritional epidemiology to derive patterns that summarize food and nutrient intake, but the resulting patterns can be difficult to interpret. The authors propose the use of
a new statistical technique, the treelet transform (TT), as an alternative to PCA. TT combines the quantitative pattern extraction capabilities of PCA with the interpretational advantages of cluster analysis and produces patterns
involving only naturally grouped subsets of the original variables. The authors compared patterns derived using TT with those derived using PCA in a study of dietary patterns and risk of myocardial infarction among 26,155
male participants in a prospective Danish cohort. Over a median of 11.9 years of follow-up, 1,523 incident cases of myocardial infarction were ascertained. The 7 patterns derived with TT described almost as much variation as the
first 7 patterns derived with PCA, for which interpretation was less clear. When the authors used multivariate Cox regression models to estimate relative risk of myocardial infarction, the significant risk factors were comparable whether the model was based on PCA or TT factors. The present study shows that TT may be a useful alternative to PCA in epidemiologic studies, leading to patterns that possess comparable explanatory power and are simple to interpret.
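
To make the idea of combining PCA with clustering concrete, the following is a minimal sketch of a treelet-style construction, not the authors' implementation. It assumes the commonly described procedure: repeatedly pick the two most correlated variables, apply a local two-variable PCA (a Jacobi rotation) to them, keep the higher-variance "sum" variable for further merging, and set the "difference" variable aside. The function names `covariance_matrix` and `treelet_sketch`, and the choice to work from the covariance matrix, are illustrative assumptions.

```python
import math

def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (a list of equal-length lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]

def treelet_sketch(cov, levels):
    """Hypothetical helper: merge the most correlated active pair `levels` times.

    `levels` must be at most p - 1, and every variable needs nonzero variance.
    Returns (basis, cov, merges): the columns of `basis` are orthonormal
    loading vectors, `cov` is the covariance expressed in that basis, and
    `merges` lists the (sum_index, difference_index) pair of each level.
    """
    p = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    basis = [[1.0 if a == b else 0.0 for b in range(p)] for a in range(p)]
    active = list(range(p))        # variables still eligible for merging
    merges = []

    def abs_corr(pair):
        a, b = pair
        return abs(cov[a][b]) / math.sqrt(cov[a][a] * cov[b][b])

    for _ in range(levels):
        # Clustering step: the active pair with the largest absolute correlation.
        i, j = max(((a, b) for k, a in enumerate(active) for b in active[k + 1:]),
                   key=abs_corr)
        # Local PCA step: the Jacobi angle that zeroes cov[i][j].
        # With atan2, the larger variance ends up on index i.
        theta = 0.5 * math.atan2(2.0 * cov[i][j], cov[i][i] - cov[j][j])
        c, s = math.cos(theta), math.sin(theta)
        # Rotate columns i and j of both matrices (M <- M R).
        for m in (cov, basis):
            for row in m:
                row[i], row[j] = c * row[i] + s * row[j], -s * row[i] + c * row[j]
        # Rotate rows i and j of the covariance (cov <- R^T cov).
        ri, rj = cov[i], cov[j]
        cov[i] = [c * x + s * y for x, y in zip(ri, rj)]
        cov[j] = [-s * x + c * y for x, y in zip(ri, rj)]
        active.remove(j)           # the difference variable is not merged again
        merges.append((i, j))
    return basis, cov, merges
```

Because each level rotates only two coordinates, a loading vector is nonzero only on the variables merged into its cluster. This is the property the abstract describes as patterns involving only naturally grouped subsets of the original variables. How many levels to run and which components to retain are separate choices that this sketch does not address.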
Original language | English |
---|---|
Journal | American Journal of Epidemiology |
Volume | 173 |
Issue number | 10 |
Pages (from-to) | 1097-1104 |
Number of pages | 8 |
ISSN | 0002-9262 |
DOIs | |
Publication status | Published - 15 May 2011 |