To test or not to test: Preliminary assessment of normality when comparing two independent samples

Justine Rochon; Matthias Gondan; Meinhard Kieser

doi:10.1186/1471-2288-12-81

84 Citations (Scopus)

892 Downloads (Pure)

Abstract

Background: Student's two-sample t test is generally used for comparing the means of two independent samples, for example, two treatment arms. Under the null hypothesis, the t test assumes that the two samples arise from the same normally distributed population with unknown variance. Adequate control of the Type I error requires that the normality assumption holds, which is often examined by means of a preliminary Shapiro-Wilk test. The following two-stage procedure is widely accepted: If the preliminary test for normality is not significant, the t test is used; if the preliminary test rejects the null hypothesis of normality, a nonparametric test is applied in the main analysis.
Methods: Equally sized samples were drawn from exponential, uniform, and normal distributions. The two-sample t test was conducted if either both samples (Strategy I) or the collapsed set of residuals from both samples (Strategy II) had passed the preliminary Shapiro-Wilk test for normality; otherwise, Mann-Whitney’s U test was conducted. By simulation, we separately estimated the conditional Type I error probabilities for the parametric and nonparametric part of the two-stage procedure. Finally, we assessed the overall Type I error rate and the power of the two-stage procedure as a whole.
Results: Preliminary testing for normality seriously altered the conditional Type I error rates of the subsequent main analysis for both parametric and nonparametric tests. We discuss possible explanations for the observed results, the most important one being the selection mechanism due to the preliminary test. Interestingly, the overall Type I error rate and power of the entire two-stage procedure remained within acceptable limits.
Conclusion: The two-stage procedure might be considered incorrect from a formal perspective; nevertheless, in the investigated examples, this procedure seemed to satisfactorily maintain the nominal significance level and had acceptable power properties.
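
The two-stage procedure and the simulation described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes NumPy and SciPy, and the function names (`two_stage_test`, `simulate`), the sample size, the number of replications, and the significance levels are hypothetical choices made for the example.

```python
import numpy as np
from scipy import stats


def two_stage_test(x, y, alpha_pre=0.05, strategy="I"):
    """Pick the main test from a preliminary Shapiro-Wilk test.

    Returns (name, p_value), where name is "t" or "U".
    alpha_pre is an assumed level for the preliminary test.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if strategy == "I":
        # Strategy I: each sample must pass the normality test on its own.
        passed = (stats.shapiro(x).pvalue > alpha_pre
                  and stats.shapiro(y).pvalue > alpha_pre)
    else:
        # Strategy II: test the collapsed residuals from both samples.
        residuals = np.concatenate([x - x.mean(), y - y.mean()])
        passed = stats.shapiro(residuals).pvalue > alpha_pre
    if passed:
        # Student's t test with pooled variance.
        return "t", float(stats.ttest_ind(x, y, equal_var=True).pvalue)
    # Otherwise fall back to Mann-Whitney's U test.
    return "U", float(stats.mannwhitneyu(x, y, alternative="two-sided").pvalue)


def simulate(sampler, n=20, reps=2000, alpha=0.05, alpha_pre=0.05,
             strategy="I", seed=0):
    """Estimate conditional and overall Type I error rates under H0.

    sampler(rng, n) draws one sample; both samples come from the same
    distribution, so every rejection is a Type I error.
    """
    rng = np.random.default_rng(seed)
    chosen = {"t": 0, "U": 0}    # how often each main test was selected
    rejected = {"t": 0, "U": 0}  # how often it rejected at level alpha
    for _ in range(reps):
        x, y = sampler(rng, n), sampler(rng, n)
        name, p = two_stage_test(x, y, alpha_pre, strategy)
        chosen[name] += 1
        rejected[name] += int(p < alpha)
    conditional = {k: (rejected[k] / chosen[k] if chosen[k] else float("nan"))
                   for k in chosen}
    overall = (rejected["t"] + rejected["U"]) / reps
    return conditional, overall
```

For example, `simulate(lambda rng, n: rng.exponential(1.0, n))` returns the estimated conditional Type I error rate of each branch together with the overall rate of the two-stage procedure for exponentially distributed data; swapping in `rng.uniform` or `rng.normal` covers the other two distributions named in the abstract.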

Original language	English
Article number	81
Journal	BMC Medical Research Methodology
Volume	12
Number of pages	11
ISSN	1471-2288
DOIs	https://doi.org/10.1186/1471-2288-12-81
Publication status	Published - 2012

Keywords

Computer Simulation
Data Interpretation, Statistical
Humans
Models, Statistical
Normal Distribution
Selection Bias
Statistics, Nonparametric

Cite this

@article{bf937655c1e04f70b9d58d957355daf6,

title = "To test or not to test: Preliminary assessment of normality when comparing two independent samples",

abstract = "Background: Student's two-sample t test is generally used for comparing the means of two independent samples, for example, two treatment arms. Under the null hypothesis, the t test assumes that the two samples arise from the same normally distributed population with unknown variance. Adequate control of the Type I error requires that the normality assumption holds, which is often examined by means of a preliminary Shapiro-Wilk test. The following two-stage procedure is widely accepted: If the preliminary test for normality is not significant, the t test is used; if the preliminary test rejects the null hypothesis of normality, a nonparametric test is applied in the main analysis. Methods: Equally sized samples were drawn from exponential, uniform, and normal distributions. The two-sample t test was conducted if either both samples (Strategy I) or the collapsed set of residuals from both samples (Strategy II) had passed the preliminary Shapiro-Wilk test for normality; otherwise, Mann-Whitney{\textquoteright}s U test was conducted. By simulation, we separately estimated the conditional Type I error probabilities for the parametric and nonparametric part of the two-stage procedure. Finally, we assessed the overall Type I error rate and the power of the two-stage procedure as a whole. Results: Preliminary testing for normality seriously altered the conditional Type I error rates of the subsequent main analysis for both parametric and nonparametric tests. We discuss possible explanations for the observed results, the most important one being the selection mechanism due to the preliminary test. Interestingly, the overall Type I error rate and power of the entire two-stage procedure remained within acceptable limits. Conclusion: The two-stage procedure might be considered incorrect from a formal perspective; nevertheless, in the investigated examples, this procedure seemed to satisfactorily maintain the nominal significance level and had acceptable power properties.",

keywords = "Computer Simulation, Data Interpretation, Statistical, Humans, Models, Statistical, Normal Distribution, Selection Bias, Statistics, Nonparametric",

author = "Justine Rochon and Matthias Gondan and Meinhard Kieser",

year = "2012",

doi = "10.1186/1471-2288-12-81",

language = "English",

volume = "12",

journal = "BMC Medical Research Methodology",

issn = "1471-2288",

publisher = "BioMed Central Ltd.",

}

TY - JOUR

T1 - To test or not to test

T2 - Preliminary assessment of normality when comparing two independent samples

AU - Rochon, Justine

AU - Gondan, Matthias

AU - Kieser, Meinhard

PY - 2012

Y1 - 2012

AB - Background: Student's two-sample t test is generally used for comparing the means of two independent samples, for example, two treatment arms. Under the null hypothesis, the t test assumes that the two samples arise from the same normally distributed population with unknown variance. Adequate control of the Type I error requires that the normality assumption holds, which is often examined by means of a preliminary Shapiro-Wilk test. The following two-stage procedure is widely accepted: If the preliminary test for normality is not significant, the t test is used; if the preliminary test rejects the null hypothesis of normality, a nonparametric test is applied in the main analysis. Methods: Equally sized samples were drawn from exponential, uniform, and normal distributions. The two-sample t test was conducted if either both samples (Strategy I) or the collapsed set of residuals from both samples (Strategy II) had passed the preliminary Shapiro-Wilk test for normality; otherwise, Mann-Whitney’s U test was conducted. By simulation, we separately estimated the conditional Type I error probabilities for the parametric and nonparametric part of the two-stage procedure. Finally, we assessed the overall Type I error rate and the power of the two-stage procedure as a whole. Results: Preliminary testing for normality seriously altered the conditional Type I error rates of the subsequent main analysis for both parametric and nonparametric tests. We discuss possible explanations for the observed results, the most important one being the selection mechanism due to the preliminary test. Interestingly, the overall Type I error rate and power of the entire two-stage procedure remained within acceptable limits. Conclusion: The two-stage procedure might be considered incorrect from a formal perspective; nevertheless, in the investigated examples, this procedure seemed to satisfactorily maintain the nominal significance level and had acceptable power properties.

KW - Computer Simulation

KW - Data Interpretation, Statistical

KW - Humans

KW - Models, Statistical

KW - Normal Distribution

KW - Selection Bias

KW - Statistics, Nonparametric

UR - http://www.scopus.com/inward/record.url?scp=84862276509&partnerID=8YFLogxK

U2 - 10.1186/1471-2288-12-81

DO - 10.1186/1471-2288-12-81

M3 - Journal article

C2 - 22712852

AN - SCOPUS:84862276509

SN - 1471-2288

VL - 12

JO - BMC Medical Research Methodology

JF - BMC Medical Research Methodology

M1 - 81

ER -
