TY - JOUR

T1 - Modelling allelic drop-outs in STR sequencing data generated by MPS

AU - Vilsen, Søren B.

AU - Tvedebrink, Torben

AU - Eriksen, Poul S.

AU - Hussing, Christian

AU - Børsting, Claus

AU - Morling, Niels

PY - 2018/11/1

Y1 - 2018/11/1

AB - We used a Poisson-gamma model to analyse the allele coverage of autosomal short tandem repeat (STR) systems obtained by massively parallel sequencing (MPS). The Poisson-gamma coverage model was created using the peak height models from capillary electrophoresis (CE) based detection of PCR products as a starting point. The CE models were modified to account for the differences between CE and MPS signals by accounting for the large marker imbalances seen for MPS data and by using the Poisson-gamma distribution instead of the normal, log-normal, or gamma distributions that were applied for CE data. We took two approaches to estimate the marker imbalance parameters by (1) using a work-flow data base, and (2) using the results of replicate investigations of the samples. The Poisson-gamma model was used to estimate the rate of drop-outs of (1) single contributor dilution series experiments and (2) the minor contributor in two-person mixture samples. We examined the predictive capabilities of the model by comparing the observed and expected Brier scores of each sample. We derived the expected Brier scores and their variances to create asymptotic confidence intervals of the Brier scores. We found that the Poisson-gamma model performed well when using the work-flow data base, but that the replicate approach is not necessarily a viable option.

KW - Forensic genetics

KW - Massively parallel sequencing

KW - Modelling allele coverage

KW - Poisson-gamma distribution

KW - Probability of drop-out

KW - Short tandem repeat

UR - http://www.scopus.com/inward/record.url?scp=85050620147&partnerID=8YFLogxK

U2 - 10.1016/j.fsigen.2018.07.017

DO - 10.1016/j.fsigen.2018.07.017

M3 - Journal article

AN - SCOPUS:85050620147

VL - 37

SP - 6

EP - 12

JO - Forensic Science International: Genetics

JF - Forensic Science International: Genetics

SN - 1872-4973

ER -