TY - JOUR
T1 - Modelling allelic drop-outs in STR sequencing data generated by MPS
AU - Vilsen, Søren B.
AU - Tvedebrink, Torben
AU - Eriksen, Poul S.
AU - Hussing, Christian
AU - Børsting, Claus
AU - Morling, Niels
PY - 2018/11/1
Y1 - 2018/11/1
AB - We used a Poisson-gamma model to analyse the allele coverage of autosomal short tandem repeat (STR) systems obtained by massively parallel sequencing (MPS). The Poisson-gamma coverage model was created using the peak height models from capillary electrophoresis (CE) based detection of PCR products as a starting point. The CE models were modified to account for the differences between CE and MPS signals by accounting for the large marker imbalances seen for MPS data and by using the Poisson-gamma distribution instead of the normal, log-normal, or gamma distributions that were applied for CE data. We took two approaches to estimate the marker imbalance parameters by (1) using a work-flow data base, and (2) using the results of replicate investigations of the samples. The Poisson-gamma model was used to estimate the rate of drop-outs of (1) single contributor dilution series experiments and (2) the minor contributor in two-person mixture samples. We examined the predictive capabilities of the model by comparing the observed and expected Brier scores of each sample. We derived the expected Brier scores and their variances to create asymptotic confidence intervals of the Brier scores. We found that the Poisson-gamma model performed well when using the work-flow data base, but that the replicate approach is not necessarily a viable option.
KW - Forensic genetics
KW - Massively parallel sequencing
KW - Modelling allele coverage
KW - Poisson-gamma distribution
KW - Probability of drop-out
KW - Short tandem repeat
UR - http://www.scopus.com/inward/record.url?scp=85050620147&partnerID=8YFLogxK
U2 - 10.1016/j.fsigen.2018.07.017
DO - 10.1016/j.fsigen.2018.07.017
M3 - Journal article
C2 - 30071494
AN - SCOPUS:85050620147
SN - 1872-4973
VL - 37
SP - 6
EP - 12
JO - Forensic Science International: Genetics
JF - Forensic Science International: Genetics
ER -