Bootstrap based confidence limits in principal component analysis: a case study

Hamid Babamoradi; Franciscus Winfried J van der Berg; Åsmund Rinnan

doi:10.1016/j.chemolab.2012.10.007

Food Analytics and Biotechnology

45 citations (Scopus)

Abstract

Principal component analysis (PCA) is widely used as a tool in (exploratory) data investigations for many different research areas such as analytical chemistry, food and pharmaceutical research, and multivariate statistical process control. Despite its popularity, not many results have been reported thus far on how to calculate reliable confidence interval limits in PCA estimates. And, like all other data analysis tasks, results of PCA are not complete without reasonable expectations for the parameter uncertainties, especially in the case of predictive model objectives. In this paper we will present a case study on how to calculate confidence limits based on bootstrap re-sampling. Two NIR datasets are used to build bootstrap confidence limits. The first dataset shows the effect of outliers on bootstrap confidence limits, while the second shows the bootstrap confidence limits when the data has a bimodal distribution. The different steps and choices which have to be made for the algorithm to perform correctly will be presented. The bootstrap based confidence limits will be compared with the corresponding asymptotic confidence limits. We will thereby conclude that the confidence limits based on the bootstrap method give more meaningful answers and are to be preferred over their asymptotic counterparts.
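The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a percentile bootstrap over objects (rows), a power-iteration estimate of the first loading vector, and a simple sign alignment against the full-data loading. The function names `first_loading` and `bootstrap_loading_limits`, the synthetic data, and all parameter values are hypothetical.

```python
import random


def first_loading(rows, iters=100):
    """First PCA loading of mean-centred rows, estimated by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0 / p ** 0.5] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate resample with no variance
            break
        v = [x / norm for x in w]
    return v


def bootstrap_loading_limits(rows, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap limits for each element of the first loading."""
    rng = random.Random(seed)
    ref = first_loading(rows)
    draws = []
    for _ in range(n_boot):
        # Resample objects (rows) with replacement.
        sample = [rng.choice(rows) for _ in rows]
        v = first_loading(sample)
        # Loadings are only defined up to sign: align with the reference.
        if sum(a * b for a, b in zip(v, ref)) < 0:
            v = [-x for x in v]
        draws.append(v)
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    limits = []
    for j in range(len(ref)):
        col = sorted(d[j] for d in draws)
        limits.append((col[lo_idx], col[hi_idx]))
    return ref, limits


# Synthetic stand-in data (assumption): one latent factor plus small noise.
data_rng = random.Random(1)
data = []
for _ in range(40):
    t = data_rng.gauss(0.0, 1.0)
    data.append([t + data_rng.gauss(0.0, 0.1),
                 0.5 * t + data_rng.gauss(0.0, 0.1),
                 data_rng.gauss(0.0, 0.1)])

ref, limits = bootstrap_loading_limits(data)
for j, (lower, upper) in enumerate(limits):
    print(f"loading {j}: estimate {ref[j]:+.3f}, 95% limits [{lower:+.3f}, {upper:+.3f}]")
```

With outliers or a bimodal sample distribution, the two cases the abstract highlights, the resampled loadings can shift considerably from one draw to the next, which is one reason the choices around resampling and alignment matter.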

Original language	English
Journal	Chemometrics and Intelligent Laboratory Systems
Volume	120
Pages (from-to)	97-105
Number of pages	9
ISSN	0169-7439
DOI	https://doi.org/10.1016/j.chemolab.2012.10.007
Status	Published - 15 Jan 2013

Access to document

10.1016/j.chemolab.2012.10.007

Bootstrap based confidence limits in principal component analysis: A case study. Publisher's published version, 932 KB

Citation formats

@article{bad0545a29c94d48bfe2a7904b27ae36,

title = "Bootstrap based confidence limits in principal component analysis: a case study",

abstract = "Principal component analysis (PCA) is widely used as a tool in (exploratory) data investigations for many different research areas such as analytical chemistry, food- and pharmaceutical-research, and multivariate statistical process control. Despite its popularity, not many results have been reported thus far on how to calculate reliable confidence interval limits in PCA estimates. And, like all other data analysis tasks, results of PCA are not complete without reasonable expectations for the parameter uncertainties, especially in the case of predictive model objectives. In this paper we will present a case study on how to calculate confidence limits based on bootstrap re-sampling. Two NIR datasets are used to build bootstrap confidence limits. The first dataset shows the effect of outliers on bootstrap confidence limits, while the second shows the bootstrap confidence limits when the data has a bimodal distribution. The different steps and choices which have to be made for the algorithm to perform correctly will be presented. The bootstrap based confidence limits will be compared with the corresponding asymptotic confidence limits. We will thereby conclude that the confidence limits based on the bootstrap method give more meaningful answers and are to be preferred over its asymptotic counterparts.",

author = "Hamid Babamoradi and {van der Berg}, {Franciscus Winfried J} and {\AA}smund Rinnan",

year = "2013",

month = jan,

day = "15",

doi = "10.1016/j.chemolab.2012.10.007",

language = "English",

volume = "120",

pages = "97--105",

journal = "Chemometrics and Intelligent Laboratory Systems",

issn = "0169-7439",

publisher = "Elsevier",

}

TY - JOUR

T1 - Bootstrap based confidence limits in principal component analysis

T2 - a case study

AU - Babamoradi, Hamid

AU - van der Berg, Franciscus Winfried J

AU - Rinnan, Åsmund

PY - 2013/1/15

Y1 - 2013/1/15

N2 - Principal component analysis (PCA) is widely used as a tool in (exploratory) data investigations for many different research areas such as analytical chemistry, food- and pharmaceutical-research, and multivariate statistical process control. Despite its popularity, not many results have been reported thus far on how to calculate reliable confidence interval limits in PCA estimates. And, like all other data analysis tasks, results of PCA are not complete without reasonable expectations for the parameter uncertainties, especially in the case of predictive model objectives. In this paper we will present a case study on how to calculate confidence limits based on bootstrap re-sampling. Two NIR datasets are used to build bootstrap confidence limits. The first dataset shows the effect of outliers on bootstrap confidence limits, while the second shows the bootstrap confidence limits when the data has a bimodal distribution. The different steps and choices which have to be made for the algorithm to perform correctly will be presented. The bootstrap based confidence limits will be compared with the corresponding asymptotic confidence limits. We will thereby conclude that the confidence limits based on the bootstrap method give more meaningful answers and are to be preferred over its asymptotic counterparts.

U2 - 10.1016/j.chemolab.2012.10.007

DO - 10.1016/j.chemolab.2012.10.007

M3 - Journal article

SN - 0169-7439

VL - 120

SP - 97

EP - 105

JO - Chemometrics and Intelligent Laboratory Systems

JF - Chemometrics and Intelligent Laboratory Systems

ER -
