TY - JOUR
T1 - Statistical modelling of Ion PGM HID STR 10-plex MPS Data
AU - Vilsen, Søren B
AU - Tvedebrink, Torben
AU - Mogensen, Helle Smidt
AU - Morling, Niels
N1 - Copyright © 2017 Elsevier B.V. All rights reserved.
PY - 2017/5/1
Y1 - 2017/5/1
N2 - We investigated the results of short tandem repeat (STR) markers of dilution series experiments and reference profiles generated using the Ion PGM massively parallel sequencing platform utilising the HID STR 10-plex panel. The STR markers were identified by the marker-specific flanking regions of the STR region. We investigated the following: (1) the usage of quality measures for identifying substitution errors, (2) the heterozygote balance, compared to that of capillary electrophoresis (CE), (3) the stability of the coverage and the consequence of IonExpress Barcode adapter (IBA) sampling with decreasing amounts of template DNA, (4) the hypothesis that the parental longest uninterrupted stretch (LUS) is a better linear predictor of stutter ratio than the parent allele length, (5) the use of parental allele length as a predictor of shoulder ratio, and (6) the removal of non-systematic erroneous sequences using dynamic thresholds created by fitting the distribution of the non-systematic erroneous sequences. We found that, due to MID sampling, the average coverage on a marker could not be used as an apt predictor of the amount of template DNA. The parental LUS was shown to be a better predictor of stutter ratio than the parental allele repeat length when markers with compound and complex repeat patterns, or markers containing micro-variants, were considered; for example, marker TH01 showed R² of 0.02 and 0.78 for parent allele repeat length and LUS, respectively. The one-inflated negative binomial (OINB) model and the geometric model, which can be used to remove non-systematic noise, left on average 1.8 and 1.2 systematic errors per STR system, respectively.
AB - We investigated the results of short tandem repeat (STR) markers of dilution series experiments and reference profiles generated using the Ion PGM massively parallel sequencing platform utilising the HID STR 10-plex panel. The STR markers were identified by the marker-specific flanking regions of the STR region. We investigated the following: (1) the usage of quality measures for identifying substitution errors, (2) the heterozygote balance, compared to that of capillary electrophoresis (CE), (3) the stability of the coverage and the consequence of IonExpress Barcode adapter (IBA) sampling with decreasing amounts of template DNA, (4) the hypothesis that the parental longest uninterrupted stretch (LUS) is a better linear predictor of stutter ratio than the parent allele length, (5) the use of parental allele length as a predictor of shoulder ratio, and (6) the removal of non-systematic erroneous sequences using dynamic thresholds created by fitting the distribution of the non-systematic erroneous sequences. We found that, due to MID sampling, the average coverage on a marker could not be used as an apt predictor of the amount of template DNA. The parental LUS was shown to be a better predictor of stutter ratio than the parental allele repeat length when markers with compound and complex repeat patterns, or markers containing micro-variants, were considered; for example, marker TH01 showed R² of 0.02 and 0.78 for parent allele repeat length and LUS, respectively. The one-inflated negative binomial (OINB) model and the geometric model, which can be used to remove non-systematic noise, left on average 1.8 and 1.2 systematic errors per STR system, respectively.
U2 - 10.1016/j.fsigen.2017.01.017
DO - 10.1016/j.fsigen.2017.01.017
M3 - Journal article
C2 - 28193505
SN - 1872-4973
VL - 28
SP - 82
EP - 89
JO - Forensic Science International: Genetics
JF - Forensic Science International: Genetics
ER -