TY - JOUR
T1 - Testing for Hardy-Weinberg equilibrium in structured populations using genotype or low-depth next generation sequencing data
AU - Meisner, Jonas
AU - Albrechtsen, Anders
PY - 2019/9/1
Y1 - 2019/9/1
AB - Testing for deviations from Hardy–Weinberg equilibrium (HWE) is a common practice for quality control in genetic studies. Variable sites violating HWE may be identified as technical errors in the sequencing or genotyping process, or they may be of particular evolutionary interest. Large-scale genetic studies based on next-generation sequencing (NGS) methods have become more prevalent as cost is decreasing, but these methods are still associated with statistical uncertainty. The large-scale studies usually consist of samples from diverse ancestries that make the existence of some degree of population structure almost inevitable. Precautions are therefore needed when analysing these data sets, as population structure causes deviations from HWE. Here we propose a method that takes population structure into account in the testing for HWE, such that other factors causing deviations from HWE can be detected. We show the effectiveness of PCAngsd in low-depth NGS data, as well as in genotype data, for both simulated and real data sets, where the use of genotype likelihoods enables us to model the uncertainty.
U2 - 10.1111/1755-0998.13019
DO - 10.1111/1755-0998.13019
M3 - Journal article
C2 - 30977299
SN - 1755-098X
VL - 19
SP - 1144
EP - 1152
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
IS - 5
ER -