Modelling allelic drop-outs in STR sequencing data generated by MPS

Søren B. Vilsen*, Torben Tvedebrink, Poul S. Eriksen, Christian Hussing, Claus Børsting, Niels Morling

*Corresponding author

Publication: Contribution to journal; Journal article; Research; peer-reviewed

2 Citations (Scopus)

Abstract

We used a Poisson-gamma model to analyse the allele coverage of autosomal short tandem repeat (STR) systems obtained by massively parallel sequencing (MPS). The Poisson-gamma coverage model was created using the peak height models from capillary electrophoresis (CE) based detection of PCR products as a starting point. The CE models were modified to account for the differences between CE and MPS signals by accounting for the large marker imbalances seen for MPS data and by using the Poisson-gamma distribution instead of the normal, log-normal, or gamma distributions that were applied for CE data. We took two approaches to estimate the marker imbalance parameters by (1) using a work-flow data base, and (2) using the results of replicate investigations of the samples. The Poisson-gamma model was used to estimate the rate of drop-outs of (1) single contributor dilution series experiments and (2) the minor contributor in two-person mixture samples. We examined the predictive capabilities of the model by comparing the observed and expected Brier scores of each sample. We derived the expected Brier scores and their variances to create asymptotic confidence intervals of the Brier scores. We found that the Poisson-gamma model performed well when using the work-flow data base, but that the replicate approach is not necessarily a viable option.
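The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: a drop-out is taken to mean zero coverage, the Poisson-gamma distribution is parameterised by a mean coverage and a gamma shape, and each drop-out is treated as an independent Bernoulli event when deriving the expected Brier score. All function and parameter names are hypothetical.

```python
import math


def dropout_probability(mean_coverage, shape):
    """P(X = 0) when X | lam ~ Poisson(lam) and lam ~ Gamma(shape, scale = mean/shape).

    Marginally X is negative binomial, so P(X = 0) = (shape / (shape + mean)) ** shape.
    Assumption: a drop-out is an allele with zero coverage.
    """
    return (shape / (shape + mean_coverage)) ** shape


def brier_score(probs, outcomes):
    """Observed Brier score: mean squared gap between forecast and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_brier(probs):
    """Mean and variance of the Brier score if outcome i ~ Bernoulli(probs[i]).

    For one forecast p: E[(p - O)^2] = p(1 - p) and
    Var[(p - O)^2] = p(1 - p)(1 - 2p)^2; outcomes are assumed independent.
    """
    n = len(probs)
    mean = sum(p * (1 - p) for p in probs) / n
    var = sum(p * (1 - p) * (1 - 2 * p) ** 2 for p in probs) / n ** 2
    return mean, var


def brier_interval(probs, z=1.96):
    """Asymptotic (normal-approximation) interval around the expected Brier score."""
    mean, var = expected_brier(probs)
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width
```

Under these assumptions, an observed Brier score that falls inside the interval returned by `brier_interval` is consistent with the predicted drop-out probabilities, which mirrors the comparison of observed and expected scores described above.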
Original language: English
Journal: Forensic Science International: Genetics
Volume: 37
Pages (from-to): 6-12
Number of pages: 7
ISSN: 1872-4973
DOI: 10.1016/j.fsigen.2018.07.017
Status: Published - 1 Nov 2018

Fingerprint

High-Throughput Nucleotide Sequencing
Capillary Electrophoresis
Microsatellite Repeats
Workflow
Databases
Poisson Distribution
Alleles
Confidence Intervals
Polymerase Chain Reaction

Cite this

@article{6b441455bd034d42a95297ca8276e5d9,
title = "Modelling allelic drop-outs in STR sequencing data generated by MPS",
abstract = "We used a Poisson-gamma model to analyse the allele coverage of autosomal short tandem repeat (STR) systems obtained by massively parallel sequencing (MPS). The Poisson-gamma coverage model was created using the peak height models from capillary electrophoresis (CE) based detection of PCR products as a starting point. The CE models were modified to account for the differences between CE and MPS signals by accounting for the large marker imbalances seen for MPS data and by using the Poisson-gamma distribution instead of the normal, log-normal, or gamma distributions that were applied for CE data. We took two approaches to estimate the marker imbalance parameters by (1) using a work-flow data base, and (2) using the results of replicate investigations of the samples. The Poisson-gamma model was used to estimate the rate of drop-outs of (1) single contributor dilution series experiments and (2) the minor contributor in two-person mixture samples. We examined the predictive capabilities of the model by comparing the observed and expected Brier scores of each sample. We derived the expected Brier scores and their variances to create asymptotic confidence intervals of the Brier scores. We found that the Poisson-gamma model performed well when using the work-flow data base, but that the replicate approach is not necessarily a viable option.",
keywords = "Forensic genetics, Massively parallel sequencing, Modelling allele coverage, Poisson-gamma distribution, Probability of drop-out, Short tandem repeat",
author = "Vilsen, {S{\o}ren B.} and Torben Tvedebrink and Eriksen, {Poul S.} and Christian Hussing and Claus B{\o}rsting and Niels Morling",
year = "2018",
month = "11",
day = "1",
doi = "10.1016/j.fsigen.2018.07.017",
language = "English",
volume = "37",
pages = "6--12",
journal = "Forensic Science International: Genetics",
issn = "1872-4973",
publisher = "Elsevier",

}

Modelling allelic drop-outs in STR sequencing data generated by MPS. / Vilsen, Søren B.; Tvedebrink, Torben; Eriksen, Poul S.; Hussing, Christian; Børsting, Claus; Morling, Niels.

In: Forensic Science International: Genetics, Vol. 37, 01.11.2018, p. 6-12.

TY - JOUR

T1 - Modelling allelic drop-outs in STR sequencing data generated by MPS

AU - Vilsen, Søren B.

AU - Tvedebrink, Torben

AU - Eriksen, Poul S.

AU - Hussing, Christian

AU - Børsting, Claus

AU - Morling, Niels

PY - 2018/11/1

Y1 - 2018/11/1

N2 - We used a Poisson-gamma model to analyse the allele coverage of autosomal short tandem repeat (STR) systems obtained by massively parallel sequencing (MPS). The Poisson-gamma coverage model was created using the peak height models from capillary electrophoresis (CE) based detection of PCR products as a starting point. The CE models were modified to account for the differences between CE and MPS signals by accounting for the large marker imbalances seen for MPS data and by using the Poisson-gamma distribution instead of the normal, log-normal, or gamma distributions that were applied for CE data. We took two approaches to estimate the marker imbalance parameters by (1) using a work-flow data base, and (2) using the results of replicate investigations of the samples. The Poisson-gamma model was used to estimate the rate of drop-outs of (1) single contributor dilution series experiments and (2) the minor contributor in two-person mixture samples. We examined the predictive capabilities of the model by comparing the observed and expected Brier scores of each sample. We derived the expected Brier scores and their variances to create asymptotic confidence intervals of the Brier scores. We found that the Poisson-gamma model performed well when using the work-flow data base, but that the replicate approach is not necessarily a viable option.

KW - Forensic genetics

KW - Massively parallel sequencing

KW - Modelling allele coverage

KW - Poisson-gamma distribution

KW - Probability of drop-out

KW - Short tandem repeat

UR - http://www.scopus.com/inward/record.url?scp=85050620147&partnerID=8YFLogxK

U2 - 10.1016/j.fsigen.2018.07.017

DO - 10.1016/j.fsigen.2018.07.017

M3 - Journal article

AN - SCOPUS:85050620147

VL - 37

SP - 6

EP - 12

JO - Forensic Science International: Genetics

JF - Forensic Science International: Genetics

SN - 1872-4973

ER -