TY - JOUR
T1 - Bootstrap based confidence limits in principal component analysis
T2 - a case study
AU - Babamoradi, Hamid
AU - van den Berg, Franciscus Winfried J
AU - Rinnan, Åsmund
PY - 2013/1/15
Y1 - 2013/1/15
N2 - Principal component analysis (PCA) is widely used as a tool in (exploratory) data investigations in many different research areas, such as analytical chemistry, food and pharmaceutical research, and multivariate statistical process control. Despite its popularity, few results have been reported thus far on how to calculate reliable confidence limits for PCA estimates. As with all other data analysis tasks, the results of PCA are not complete without reasonable expectations for the parameter uncertainties, especially in the case of predictive model objectives. In this paper we present a case study on how to calculate confidence limits based on bootstrap re-sampling. Two NIR datasets are used to build bootstrap confidence limits. The first dataset shows the effect of outliers on bootstrap confidence limits, while the second shows the bootstrap confidence limits when the data has a bimodal distribution. The different steps and choices that have to be made for the algorithm to perform correctly are presented. The bootstrap-based confidence limits are compared with the corresponding asymptotic confidence limits. We conclude that the confidence limits based on the bootstrap method give more meaningful answers and are to be preferred over their asymptotic counterparts.
AB - Principal component analysis (PCA) is widely used as a tool in (exploratory) data investigations in many different research areas, such as analytical chemistry, food and pharmaceutical research, and multivariate statistical process control. Despite its popularity, few results have been reported thus far on how to calculate reliable confidence limits for PCA estimates. As with all other data analysis tasks, the results of PCA are not complete without reasonable expectations for the parameter uncertainties, especially in the case of predictive model objectives. In this paper we present a case study on how to calculate confidence limits based on bootstrap re-sampling. Two NIR datasets are used to build bootstrap confidence limits. The first dataset shows the effect of outliers on bootstrap confidence limits, while the second shows the bootstrap confidence limits when the data has a bimodal distribution. The different steps and choices that have to be made for the algorithm to perform correctly are presented. The bootstrap-based confidence limits are compared with the corresponding asymptotic confidence limits. We conclude that the confidence limits based on the bootstrap method give more meaningful answers and are to be preferred over their asymptotic counterparts.
U2 - 10.1016/j.chemolab.2012.10.007
DO - 10.1016/j.chemolab.2012.10.007
M3 - Journal article
SN - 0169-7439
VL - 120
SP - 97
EP - 105
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
ER -