Literature DB >> 19029128

Impact of allelic dropout on evidential value of forensic DNA profiles using RMNE.

F Van Nieuwerburgh¹, E Goetghebeur, M Vandewoestyne, D Deforce.

Abstract

MOTIVATION: Two methods are commonly used to report on evidence carried by forensic DNA profiles: the 'Random Man Not Excluded' (RMNE) approach and the likelihood ratio (LR) approach. It is often claimed a major advantage of the LR method that dropout can be assessed probabilistically.
RESULTS: In this article, a new RMNE measure is proposed that like-wise accounts for allelic dropout in an observed forensic DNA profile. We discuss the necessary calculations, underline their simplicity and provide a tool for performing the calculations.

Entities: Chemical Gene Species

Mesh：

Substances：
DNA

Year: 2008 PMID： 19029128 PMCID： PMC2639002 DOI： 10.1093/bioinformatics/btn608

Source DB: PubMed Journal: Bioinformatics ISSN： 1367-4803 Impact factor: 6.937

1 INTRODUCTION

There are two common methods to report on the probability of occurrence of forensic DNA profiles: the ‘Random Man Not Ex-cluded’ (RMNE) approach and the likelihood ratio (LR) approach. Each approach has its advantages and disadvantages. It is often claimed that the major advantage of the LR method rests in the ability to probabilistically assess the impact of dropout (Gill et al., 2006). For the RMNE, we develop a method that is simple and allows for allelic dropout while avoiding detailed probabilistic assumptions on where these dropouts might have occurred. We discuss the necessary calculations and use them in some examples.

1.1 DNA profile and allelic dropout

A DNA profile is a list of observed alleles from each of the analyzed loci. For each locus all possible alleles and their frequency of occurrence in the population are known. The alleles can be observed as an analog signal after PCR amplification of samples containing DNA from one or more individuals. It is possible that not all the alleles of the contributing individuals are observed. This allelic dropout can result from various reasons: DNA degradation in the sample can lead to allelic dropout, typically of the alleles with a longer product size. When very small amounts of DNA are present in the sample, stochastic effects can be the cause that the template DNA of some of the alleles is not sufficiently present in the PCR reaction. Due to technical imperfections of the used analysis methods, the signal of an amplified allele could fall below the signal-to-noise ratio threshold.

1.2 RMNE approach

In the forensic context, the profile probability represents the probability Pr(E|H0) of the evidentiary DNA profile (E) under the hypothesis (H0) that the DNA profile is from a person unrelated to the suspect. For a single-contributor DNA profile, under the approximation that profiles from unrelated people are independent, this probability is the frequency of occurrence of the profile in the population (Gill et al., 2006). This is the probability that a random person has the same DNA profile as the evidence profile; in other words that a random person is not excluded by the evidence. The RMNE method does not make use of the quantitative data (peak height/area) in the DNA profile. Alleles are considered present only when observed and absent otherwise, ignoring the fact that there might be allelic dropout. When there are loci that require dropped out alleles to allow for a match with the suspect sample, one practice is to omit the inconvenient locus from the RMNE calculation. Such a calculation is suspect-centric and prejudicial against the suspect as it implies that in the population considered by the calculation, only the same loci would be used for inculpation/exculpation as those being considered for the present suspect (Gill et al., 2006). In this article, however, we present a method using nonsuspect-driven RMNE calculations taking allelic dropout into consideration.

1.3 The LR approach

The LR approach considers the probability of the evidence under two or more alternative hypotheses about the source(s) of the profile. A typical analysis of a crime sample has the prosecution hypothesis (H) and the defense hypothesis (H). When the LR <1, the evidence favors Hd. When the LR >1, the evidence shows more support for Hp. In a single-contributor case, the LR coincides with the RMNE probability: the probability of the evidence profile under Hp (the suspect is the contributor) is 1, assuming that no errors are made in the chain of evidence. The probability of the evidence profile under Hd (the contributor is an unrelated person) is the population frequency of the profile as would have been given by the profile probability approach (Evett et al., 1991; Gill et al., 2006; Weir et al., 1997). The LR method also agrees on a binary view of alleles (see RMNE method). Published calculations (Gill et al., 2000) and computer programs (Curran et al., 2005) can be used to deal with allelic dropout in a probabilistic way. The complexity of the LR calculation when mixed profiles and dropout have to be considered is the reason why they are generally not used (Gill et al., 2006). The biased practice of omitting the inconvenient locus from the LR calculation (as with the RMNE calculation, see above) is more commonly used.

1.4 Advantages and disadvantages of the RMNE and LR approach

The RMNE and LR approach are based on different statistical models which are elaborately discussed by Buckleton (2005a). The two methods are bound to give different answers in one and the same case (Gill et al., 2006). In contrast with the LR approach, the RMNE approach does not make full use of the evidence. It does not make use of the information on the DNA profile of the suspect and the number of possible contributors (Buckleton, 2005b), which may lead to an underestimation of the strength of the evidence. The RMNE approach is much more straightforward to implement, requires less interpretation (which is subject to nonobjectivity) and returns a value which is valid for the mixed profile, independent of the knowledge of possible contributors. It can be reported before possible contributors have been identified and can be used as a quality ‘value’ of the profile (Buckleton, 2005b): when the RMNE frequency of an evidence profile is too high; this profile will show too many false positive matches with profiles of innocent suspects. The RMNE calculation is particularly useful in complex mixtures, because it involves no assumptions about the identity or number of contributors to a mixture (Buckleton, 2005b). In the LR approach, the number of possible contributors has to be estimated (Gill et al., 2006), which is subject to error. RMNE results are easier to explain in court: when a suspect matches the mixture, the main question is: ‘What is the probability that someone else in the population would also match the mixture?’ (Buckleton, 2005b). The RMNE calculation brings the exact answer to this question. LR results are more difficult to explain in court: usually the defense hypothesis is not known, resulting in many LR options that might have to be discussed (Buckleton, 2005b). Studies have demonstrated that there are serious problems with understanding evidence presented as a LR and that it takes more skill to correctly interpret the same evidence presented as a LR compared to presentation as a population frequency (Taroni and Aitken, 1998). Using the LR approach, the impact of dropout can be assessed probabilistically (Gill et al., 2006), but requires detailed assumptions on the dropout mechanism and complex calculations. In this study, however, we present an RMNE calculation which allows for allelic dropout without being suspect-driven. The LR calculations for mixed contributor profiles assume a certain probability of the dropout pattern P(D) (Gill et al., 2006). This P(D) probability has to be estimated and is prone to error. One practice is to calculate LR for several P(D) (Gill et al., 2006). To use the presented RMNE calculations, no assumptions have to be made about the probability of dropout. The calculations simply allow for one, two or more dropouts when comparing profiles of suspects and ‘random men’ with the evidence profile.

2 METHODS

2.1 Single contributor, unambiguous profile

P(EL) = P(A)2 in case of homozygosity at locus L. P(EL) = 2P(A)P(A) in case of heterozygosity at locus L. With: P(EL) is the profile probability at locus L. P(A), P(A, P(A are the probabilities of the observed alleles A, A and A at locus L. With: P(E) is the profile probability (or RMNE probability) considering all analyzed loci. P(ELk) is the profile probability at locus L. λ is the number of analyzed loci.

2.2 Mixed contributor, unambiguous profile

With: With: P(EL) is the profile probability at locus L. P(A is the probability of the observed allele A at locus L. n is the number of observed alleles at locus L. P(E) is the RMNE probability considering all analyzed loci. P(ELk) is the RMNE probability at locus L. λ is the number of analyzed loci.

2.3 Mixed contributor, considering the possibility of allelic dropout

The calculation of the RMNE probability considering the possibility of allelic dropout is based on the above described ‘mixed contributor’ RMNE probability. We define the RMNE probability allowing for x dropouts, as the probability that a random man would not be excluded as a donor to the mixture, where we will exclude someone if and only if more than x alleles in his DNA profile are not observed in the mixture. Specifically, we calculate the RMNE probability conditional on the fact that 0, 1 or 2 allelic dropouts may have occurred on any of the loci. To this end, we start by calculating the probability of a match at a given locus, assuming 0, 1 or 2 allelic dropouts, respectively, have occurred at that locus.

2.3.1 Probabilities at locus L

With: P(EL0) is the RMNE probability at locus L, allowing no allelic dropouts at locus L. This is the same formula used for unambiguous mixed contributor calculation. P(EL1) is the RMNE probability at locus L, assuming that exactly one allelic dropout has occurred at locus L. Only ‘random men’ with a combination of an observed and a nonobserved allele are not excluded. P(EL2) is the RMNE probability at locus L, assuming that exactly two allelic dropouts have occurred at locus L. Only ‘random men’ with a combination of nonobserved alleles are not excluded. P(A is the probability of the observed allele A at locus L. n is the number of observed alleles at locus L. P(A) is the probability of the nonobserved allele A at locus L. m is the number of nonobserved alleles at locus L.

2.3.2 Probabilities combining the results from all loci

The RMNE probability of a DNA profile allowing for no allelic dropout P(E0), is the product of the P(EL0) from all individual loci (Buckleton, 2005b). To calculate the RMNE probability of a DNA profile assuming exactly one allelic dropout P(E1), one has to consider that this dropout can occur at each analyzed locus. When this dropout occurs at a certain locus (locus i), no dropouts can occur at the other loci. The RMNE probability of a DNA profile assuming exactly one dropout at locus i, is the product of the P(EL1i) and the P(EL0) of the other loci. P(E1) is the sum of all possible P(EL) products with one dropout at one of the loci. Reasoning by means of analogies, for calculation of the RMNE probability of a DNA profile assuming exactly two allelic dropouts P(E2), one has to consider the possibility of two dropouts at one locus or one dropout at two loci. When two dropouts occur at a certain locus (locus i), no dropout can occur at the other analyzed loci. The RMNE probability of a DNA profile assuming exactly two dropouts at locus i, is the product of the P(EL2i) and the P(EL0) of the other loci. The RMNE probability of a DNA profile assuming one dropout at locus i and one dropout at locus j, is the product of the P(EL1i), P(EL1j) and the P(EL0) of the other loci. Accepting a random man as not being excluded when he matches any of these dropout patterns, P(E2) is the sum of all possible P(EL) products with two dropouts at one locus or one dropout at two loci. With: P(E0) is the RMNE probability considering all analyzed loci, allowing no allelic dropouts. P(E1) is the RMNE probability considering all analyzed loci, assuming exactly one allelic dropout at one of the analyzed loci. P(E2) is the RMNE probability considering all analyzed loci, assuming exactly two allelic dropouts at one of the analyzed loci or two times one allelic dropout at two of the analyzed loci. P(E0 to 1) is the RMNE probability considering all analyzed loci, allowing up to one allelic dropout at one of the analyzed loci. P(E0 to 2) is the RMNE probability considering all analyzed loci, allowing up to two allelic dropouts at one or two of the analyzed loci. P(EL0k) is the profile probability at locus L, allowing no allelic dropouts at that locus. P(EL1i), P(EL1j) is the profile probability at locus L and L, assuming exactly one allelic dropout at locus L, and one at locus L P(EL2i) is the profile probability at locus L, assuming exactly two allelic dropout at locus L. λ is the number of analyzed loci.

3 IMPLEMENTATION

3.1 RMNE calculation per locus based on the data in Table 1

Example of a hypothetical DNA profile with three loci

3.2 RMNE calculation combining the results from all loci based on the data in Table 1

3.3 Validation of the formulae on simulated populations

Three populations were simulated using two arbitrary chosen allele frequency tables. One population contains the given frequencies of all 256 allele combinations using two loci with four alleles each. A second population contains the given frequencies of all 6400 possible allele combinations using one locus with five alleles and two loci with four alleles. A third population contains the given frequencies of all 6561 allele combinations using four loci with three alleles each. These populations represent the ‘random men’ who are included or excluded by a given arbitrary chosen evidence profile. For each individual is checked whether the individual would be included/excluded by the evidence, assuming 0, 1 or 2 dropouts or allowing up to 1 or 2 dropouts. The frequency of included individuals (RMNE) in the simulated population was compared to the calculated P(E0), P(E1), P(E2), P(E0 to 1) and P(E0 to 2), respectively, and found to be equal. An Excel file with these validation tests on two simulated populations is available at http://www.labfbt.UGent.be/RMNE.php.

4 DISCUSSION

In 2006, the DNA commission of the International Society of Forensic Genetics made recommendations on the interpretation of mixed DNA profiles (Gill et al., 2006). These recommendations are generally endorsed by many forensic genetic laboratories. One important recommendation which is still debated regards the use of the LR method rather than the RMNE method (Morling et al., 2007). To our knowledge, this is the first report on forensic RMNE calculations allowing for allelic dropout. The reported calculations present an alternative to the poor practice of omitting an inconvenient locus from the standard RMNE calculation when the DNA profile of a suspect does not completely fit into the DNA profile of the evidence. This insight could tip the pro/con balance in favor of the RMNE method: the biggest advantage of the LR method is the fact that more data is used for the calculation, sometimes resulting in a less conservative result. The LR approach is a general and coherent framework for interpreting evidence, which allows combination with other evidence. In return, more assumptions have to be made (number and origin of possible contributors), several alternative hypotheses must be formulated (as the prosecution and the defense both seek to maximize their respective probabilities) and more complex calculations have to be performed.

4.1 RMNE allowing for dropout

One of the main strengths of the RMNE approach is the possibility to report the RMNE probability before possible contributors have been identified. This probability is a measure for the usefulness of the profile for comparison with a suspect's profile or comparison with profiles in a database. The P(E0 to 1) and P(E0 to 2) calculations can be used to calculate the RMNE probability allowing for one or two dropouts, without using the profile of a suspect. In contrast with the LR approach, the presented RMNE approach does not need assumptions on the probability P(D) that an allelic dropout has occurred in the evidence profile. Because it is impossible to estimate P(D) perfectly, one practice is to calculate the LR for a range of different P(D). This makes explanation in court of the LR even more confusing: Gill et al. (2006) show that using the same mixed contributor profile and the same suspect, the LR result can shift from ‘evidence in favor of the prosecutor hypothesis’ to ‘evidence in favor of the defense hypothesis’ when using a different P(D). Using only one P(D) in court gives the false impression that the expert can actually make an estimation of the P(D). The presented RMNE calculation is a simpler measure with only one simple answer to the question: ‘How many random men would match the evidence when we allow for up to x number of allelic dropouts.’ Logically, the probability that a random person is not excluded by the evidence increases with the number of allowed allelic dropouts (Table 2). Allowing for one dropout using P(E0 to 1) is in this example (Tables 1 and 2) more conservative than omitting one of either loci from the standard RMNE calculation, but this can not be generalized. Allowing for up to two allelic dropouts using the P(E0 to 2) formula is always more conservative than omitting one of either loci from the standard RMNE calculation (Table 2).

Table 2.

RMNE probabilities for the hypothetical profile (see Table 1)

	Probability
P(E₀) no dropout	0.0168
P(E₁) assuming exactly 1 dropout	0.1007
P(E₂) assuming exactly 2 dropouts	0.2474
P(E_{0 to 1}) allowing 1 dropout	0.1175
PP(E_{0 to 2}) allowing 2 dropouts	0.3649
P(E_{0, omitting locus 1})	0.0459
P(E_{0, omitting locus 2})	0.0635
P(E_{0, omitting locus 3})	0.0936

Table 1.

Example of a hypothetical DNA profile with three loci

Three loci with all known alleles	Population frequency of the alleles P(A)	Is the allele observed in the evidence profile?
Locus 1
Allele 1	0.18
Allele 2	0.19	Yes
Allele 3	0.20	Yes
Allele 4	0.21	Yes
Allele 5	0.22
Locus 2
Allele 1	0.15
Allele 2	0.15
Allele 3	0.16	Yes
Allele 4	0.17	Yes
Allele 5	0.18	Yes
Allele 6	0.19
Locus 3
Allele 1	0.12
Allele 2	0.12
Allele 3	0.13	Yes
Allele 4	0.14	Yes
Allele 5	0.15	Yes
Allele 6	0.16
Allele 7	0.18

RMNE probabilities for the hypothetical profile (see Table 1) When considering allelic dropout, the presented RMNE calculation is conceptually simpler than the LR calculation: instead of estimating the probability of dropout, the RMNE calculation makes the simpler assumption that a given number of alleles may have dropped out and then accepts any profile in the population that matches that constraint as ‘not excluded’. Therefore, no further assumptions need to be made about the independence or (relative) probability of dropout at several loci. The main advantage is that in court, the RMNE measure gives only one simple answer to the question: ‘How many random men would match the evidence when we allow x number of allelic drop-outs.’ No alternative hypotheses and alternative probabilities of dropout P(D) have to be discussed. On the other hand this simplification could be seen as a disadvantage: the probability that a dropout has occurred in the evidence profile is of crucial importance for the evidential value against the suspect. If it is for example zero, the suspect should be excluded as the donor. Because it is however impossible to estimate the P(D) accurately, this leads to many alternative hypotheses, possibly confusing a jury of laymen (see previous paragraph).

4.2 RMNE, number of dropouts and evidential value

The presented calculations can be adapted to find the RMNE probability, allowing for more than two dropouts. The formula P(E3) (see below) calculates the RMNE probability assuming exactly three allelic dropouts at two or three of the analyzed loci. Analogous formulae can calculate the RMNE probability allowing for an unlimited number of dropouts. For each given mixed DNA profile, the P(E0 to ) can be calculated. The question arises as to which x to use in the P(E0 to ) calculation and which P(E0 to ) to report in court. Several ways of working can be suggested. An expert could set a maximum number of allowed dropouts based on his expertise with the used analysis method and based on the quality (e.g. signal intensity) of the evidence profile. This assessment of the number of allowed dropouts is less prone to error compared to the assessment of the probability that dropout has occurred (needed for an LR calculation): while assessing if it is sufficiently probable that dropout has occurred and how many dropouts can be allowed, the expert indirectly makes an assessment of the P(D), in spite of the fact that a correct P(D) cannot be calculated. However, the expert does not need to put down a distinct figure of the P(D) because the P(D) itself is not used in the RMNE calculation. With 16 analyzed loci, the Powerplex16 (Promega, Madison, USA) and the AmpFlSTR® Identifiler (Applied Biosystems, Foster City, USA) are currently the commercially available forensic DNA profiling kits with the highest number of analyzed loci. Using this kind of kits, an expert would probably allow only up to two, maybe three dropouts. When more loci are analyzed, the chance that the analysis fails at one or more loci increases. In this case, the limit in number of allowed dropouts could be adjusted dependent on the number of analyzed loci. Another scenario, where the presented RMNE formulas could be used is in the case that the same evidence sample is analyzed more than once. When a low-quality profile is obtained from an evidence sample and when there is still enough DNA of the sample available, the expert could decide to perform a second PCR and subsequent analysis on the same sample. When mutually comparing the two obtained DNA profiles of the same sample, e.g. two dropouts could be detected at locus A in profile 1 and one dropout could be detected at locus B in profile 2. Based on this information, the expert can conclude with high certainty (not absolute certainty, because unlikely artifacts like dropin could have occurred in this example) that dropout has occurred in both these analyses. When dropout is detected at two loci, there is a great chance that dropout also has occurred at another locus. Using the presented P(E0 to 1) and P(E0 to 2) formulas, the expert could allow up to two dropouts. Note that in this scenario, the expert does not have to estimate the P(D), as he can allow dropout based on the fact that dropout has occurred at two loci in the performed analyses. The presented method should only be used if an expert decides that dropout has to be taken into account. This decision should be based on the quality of the evidence profile and on an assessment if it is sufficiently probable that dropout has occurred. Allowing for dropout where P(D)≈ 0, could lead to the false inculpation of a suspect. The number of allowed dropouts (x) in the RMNE calculation should not be determined by the number of lacking alleles in the evidence profile compared with the suspect's profile. This could be considered a suspect-driven way of working as the number of allowed dropouts is decided upon based on the profile of the suspect. Conflict of Interest: none declared.

7 in total

1. An investigation of the rigor of interpretation rules for STRs derived from less than 100 pg of DNA.

Authors: P Gill; J Whitaker; C Flaxman; N Brown; J Buckleton
Journal: Forensic Sci Int Date: 2000-07-24 Impact factor: 2.395

2. Interpretation of repeat measurement DNA evidence allowing for multiple contributors and population substructure.

Authors: J M Curran; P Gill; M R Bill
Journal: Forensic Sci Int Date: 2005-02-10 Impact factor: 2.395

3. DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures.

Authors: P Gill; C H Brenner; J S Buckleton; A Carracedo; M Krawczak; W R Mayr; N Morling; M Prinz; P M Schneider; B S Weir
Journal: Forensic Sci Int Date: 2006-06-05 Impact factor: 2.395

4. Interpretation of DNA mixtures--European consensus on principles.

Authors: Niels Morling; Ingo Bastisch; Peter Gill; Peter M Schneider
Journal: Forensic Sci Int Genet Date: 2007-09-06 Impact factor: 4.882

5. A guide to interpreting single locus profiles of DNA mixtures in forensic cases.

Authors: I W Evett; C Buffery; G Willott; D Stoney
Journal: J Forensic Sci Soc Date: 1991 Jan-Mar

6. Probabilistic reasoning in the law. Part 1: Assessment of probabilities and explanation of the value of DNA evidence.

Authors: F Taroni; C G Aitken
Journal: Sci Justice Date: 1998 Jul-Sep Impact factor: 2.124

Review 7. Interpreting DNA mixtures.

Authors: B S Weir; C M Triggs; L Starling; L I Stowell; K A Walsh; J Buckleton
Journal: J Forensic Sci Date: 1997-03 Impact factor: 1.832

7 in total

2 in total

Review 1. Separation/extraction, detection, and interpretation of DNA mixtures in forensic science (review).

Authors: Ruiyang Tao; Shouyu Wang; Jiashuo Zhang; Jingyi Zhang; Zihao Yang; Xiang Sheng; Yiping Hou; Suhua Zhang; Chengtao Li
Journal: Int J Legal Med Date: 2018-05-25 Impact factor: 2.686

2. Lab Retriever: a software tool for calculating likelihood ratios incorporating a probability of drop-out for forensic DNA profiles.

Authors: Keith Inman; Norah Rudin; Ken Cheng; Chris Robinson; Adam Kirschner; Luke Inman-Semerau; Kirk E Lohmueller
Journal: BMC Bioinformatics Date: 2015-09-18 Impact factor: 3.169

2 in total