Statistical aspects of forensic genetics: Models for qualitative and quantitative STR data

Publication: Research, PhD thesis

Abstract

This PhD thesis deals with statistical models for forensic genetics, the part of forensic medicine concerned with the analysis of DNA evidence in criminal cases and with the assessment of alleged paternity and kinship in family reunification cases. The thesis focuses on criminal cases, which differ from the other case types in that the biological material is typically used to identify a person rather than to establish kinship. Common to all cases, however, is that the DNA is used as evidence by assessing the probability of observing the biological material under different hypotheses. Most countries use commercially manufactured DNA kits to type a person’s DNA profile. With these kits, the DNA profile consists of the states of 10-15 DNA loci that vary greatly from person to person in the population. Thus only a small fraction of the genome is typed, but because of the large variability, individuals can be identified with very high probability. These probabilities are used when calculating the weight of evidence, which in some cases corresponds to the probability of observing a given suspect’s DNA profile in the population. By assessing the probability of the DNA evidence under competing hypotheses, the biological evidence may be used in the court’s deliberations on an equal footing with other evidence and expert statements. These probabilities are based on population genetic models whose assumptions must be validated.

The first two articles of the thesis describe the “θ-correction”, which compensates for possible population structure and remote coancestry that could affect the accuracy of the models. The Danish reference database, with nearly 52,000 DNA profiles, is analysed, and the number of near-matches is compared with the number expected under the model. A frequent occurrence in criminal cases is the detection of DNA from more than one person in a sample from the crime scene. In such cases the DNA profile is called a DNA mixture, since the biological traces cannot be separated mechanically or chemically into their contributing parts. To ascribe an evidential weight to a DNA mixture, the quantitative part of the result of the biotechnological analysis (the signal intensities in a so-called electropherogram, EPG) is used. Two models for handling DNA mixtures are presented, together with an efficient algorithm for separating the DNA mixture into the most probable contributing profiles. Furthermore, the thesis discusses how the quantitative part of the evidence is included in calculating the evidential weight.
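To make the θ-correction mentioned above concrete, the following is a minimal sketch of the widely used Balding-Nichols form of the correction for a single-source profile. That the thesis uses exactly this form is an assumption, and the function names, the allele frequencies and the value of θ are all made up for illustration.

```python
def genotype_match_probability(p_a, p_b=None, theta=0.0):
    """Probability that a second person has the same genotype, given that
    one person with this genotype has been observed (Balding-Nichols form).

    p_a, p_b: population allele frequencies; leave p_b as None for a homozygote.
    theta:    coancestry coefficient; theta = 0 gives Hardy-Weinberg proportions.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    denominator = (1 + theta) * (1 + 2 * theta)
    if p_b is None:  # homozygote, genotype AA
        return ((2 * theta + (1 - theta) * p_a)
                * (3 * theta + (1 - theta) * p_a) / denominator)
    # heterozygote, genotype AB
    return (2 * (theta + (1 - theta) * p_a)
            * (theta + (1 - theta) * p_b) / denominator)


def profile_match_probability(genotypes, theta=0.0):
    """Multiply per-locus probabilities, assuming independence between loci."""
    probability = 1.0
    for p_a, p_b in genotypes:
        probability *= genotype_match_probability(p_a, p_b, theta)
    return probability


if __name__ == "__main__":
    # Hypothetical three-locus profile: (frequency of allele 1, frequency of
    # allele 2), with None marking a homozygous locus.
    profile = [(0.10, 0.20), (0.15, None), (0.05, 0.30)]
    for theta in (0.0, 0.01, 0.03):
        p = profile_match_probability(profile, theta)
        print(f"theta={theta:.2f}  match probability={p:.3e}  LR={1 / p:.3e}")
```

In this simple single-source setting the likelihood ratio is the reciprocal of the match probability. For these frequencies, raising θ increases the match probability and therefore lowers the likelihood ratio, which is the conservative effect the correction is meant to have.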

In criminal cases, biological traces are often found at crime scenes under conditions that can degrade and contaminate the DNA, which complicates the subsequent biochemical analysis. Furthermore, the amount of DNA may be limited, which challenges the sensitivity of the biotechnology applied in the analysis. Models for evaluating the degree of degradation and estimating the probability of allelic drop-out are discussed in the thesis, and examples show how to incorporate the probabilities of degradation and drop-out when calculating the weight of evidence.

Finally, the thesis contains an article on post-processing of the data after the sample has been processed in a PCR thermal cycler and the signal has been detected by electrophoresis equipment. Central to this is the determination of a signal-to-noise limit, which is currently a fixed limit recommended by the manufacturer of the typing kit. The article discusses how this threshold can be determined from the noise, so that it may be specific to each case and locus. In addition, two filters are presented that handle specific types of artifacts in the data generation process which manifest as increased signals in the EPG.
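As a rough illustration of a noise-based, locus-specific threshold, the sketch below sets the limit at the mean of the observed noise plus a multiple of its standard deviation. This is a generic rule chosen for illustration and not necessarily the method developed in the thesis. The function names, the multiplier `k` and all peak heights are hypothetical.

```python
from statistics import mean, stdev


def detection_threshold(noise_heights, k=3.0):
    """Threshold = mean(noise) + k * sd(noise) for one locus in one sample.

    noise_heights: heights of peaks regarded as baseline noise
                   (hypothetical input).
    k:             multiplier controlling how conservative the limit is
                   (assumed value).
    """
    if len(noise_heights) < 2:
        raise ValueError("need at least two noise observations")
    return mean(noise_heights) + k * stdev(noise_heights)


def call_peaks(peaks, noise_heights, k=3.0):
    """Keep only the (allele, height) pairs that exceed the threshold."""
    limit = detection_threshold(noise_heights, k)
    return [(allele, height) for allele, height in peaks if height > limit]


if __name__ == "__main__":
    # Made-up noise and candidate peaks for a single locus.
    noise = [10, 12, 11, 9, 13]
    candidate_peaks = [("14", 250), ("15", 14), ("16", 180)]
    print(detection_threshold(noise))
    print(call_peaks(candidate_peaks, noise))
```

Because the limit is computed from the noise actually observed, it adapts to each case and locus instead of relying on a single fixed value.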

Details

Original language: English
Publisher: Department of Mathematical Sciences, Aalborg University
Number of pages: 183
Status: Published - 2010
Publication type: Research
Series: Ph.D. Report Series
Number: 19
ISSN: 1601-8346

ID: 48415083