
Must the random man be unrelated? A lingering misconception in forensic genetics.

Emmanuel Milot^1,2, Simon Baechler^1,3,4, Frank Crispino^1,2.

Abstract

A nearly universal practice among forensic DNA scientists includes mentioning an unrelated person as the possible alternative source of a DNA stain, when one in fact refers to an unknown person. Hence, experts typically express their conclusions with statements like: "The probability of the DNA evidence is X times higher if the suspect is the source of the trace than if another person unrelated to the suspect is the source of the trace." Published forensic guidelines encourage such allusions to the unrelated person. However, as the authors show here, rational reasoning and population genetic principles do not require the conditioning of the evidential value on the unrelatedness between the unknown individual and the person of interest (e.g., a suspect). Surprisingly, this important semantic issue has been overlooked for decades, despite its potential to mislead the interpretation of DNA evidence by criminal justice system stakeholders.


Keywords: DNA evidence; Fact-finder; Match probability; Relatedness; Semantics

Year: 2019 PMID: 32411996 PMCID: PMC7219187 DOI: 10.1016/j.fsisyn.2019.11.003

Source DB: PubMed Journal: Forensic Sci Int ISSN: 2589-871X Impact factor: 2.395

Introduction

Forensic science has been the target of severe critiques, in particular through the reports of the National Research Council in 2009 [1] and the President’s Council of Advisors on Science and Technology in 2016 in the USA [2]. DNA typing was relatively spared by that storm, largely due to its strong grounding in probabilistic models to assess the weight of evidence. Nevertheless, the rendering of the weight of DNA evidence may mask fundamental interpretation issues for fact-finders, where semantics and communication are of prime importance. As highlighted by a growing body of research [3-10], communication between scientists and non-scientists is far from straightforward and may cause unconscious misunderstandings. Each word is important and the burden is on forensic scientists to convey their message in an accurate, transparent, readable and efficient way. Many debates between law and forensic science experts^1 underline the semantic issues and call for solutions that make communication clear and unambiguous, a sort of common language between science and justice.

One semantic issue that has lingered ever since the introduction of trace DNA analyses in criminal investigations pertains to a very widespread practice: the concept of the ‘unrelated person’. Experts typically express their conclusions about the weight of DNA evidence with statements like: “The probability of the evidence is X times higher under the hypothesis that the suspect is the source of the trace than under the hypothesis that another person unrelated to the suspect is the source of the trace.” The word ‘unrelated’ has spread across the forensic literature since its tentative appearance in Jeffreys et al.'s initial paper on DNA fingerprints [11]. Nowadays, the word is almost always present in expert reports, scientific papers, textbooks and, importantly, forensic guidelines and recommendations.
For instance, in the ENFSI Guideline for Evaluative Reporting in Forensic Science, DNA case examples mention alternative propositions considering “an (unknown) unrelated person” [12, pp. 34, 40]. Likewise, in its latest recommendations the DNA commission of the International Society of Forensic Genetics mentions “it is standard to apply the ‘unrelated’ caveat” (see footnote 6 in Ref. [13]). While there is an abundant literature about the problem of how to deal with relatives in forensic genetics, curiously we found no published reference that fundamentally addresses the interpretation of the concept of ‘unrelatedness’. This issue is semantic in nature and does not challenge the validity of the mathematical models that are applied to assign the probability of DNA evidence in everyday casework. However, we are concerned about the confusion that the routine and default usage of the word ‘unrelated’ can cause among an audience of investigators, lawyers, prosecutors or fact-finders over the correct meaning of calculations pertaining to DNA evidence.

Confusion over the ‘unrelated’

All individuals have relatives. This is a consequence of the finite size (N < ∞) of populations. Thus, suspects have relatives too. The more genes they share with them, the more challenging it may be to make conclusive inferences about the source of DNA traces. This explains why forensic experts tend to specify that the reported weight of evidence holds only if the source of the trace is unrelated to the suspect or, equivalently, that the suspect’s relatives are excluded from the pool of individuals that may be randomly drawn from the population of interest. However, since an individual is always related to any other member of the population (whether their most recent common ancestor lived one generation or thousands of years ago), conditioning on unrelatedness implies that the weight of evidence strictly applies to a non-existent fraction of the population. No doubt forensic scientists have a more practical definition in mind when they use the word ‘unrelated’, such as “not closely related to the suspect” or “not related to a degree close enough to bias substantially the calculation of the weight of evidence”. Yet, such fuzzy definitions can be misleading. First, referring to a person unrelated to the suspect may be perceived as if the population of interest excluded (close) relatives, in a sense a form of covert exoneration.^2 This is because, in such a case, the set of people encompassed by the prosecution and the defence hypotheses excludes relatives, which may give the impression that neither side considers them relevant. Second, one may think that relatives compromise the value of evidence. For instance, as suggested by a reporting scientist with whom we discussed the issue, one may wonder if the use of the word ‘unrelated’ in the alternative proposition means that if the suspect has a brother, the weight of evidence is meaningless and the DNA evidence useless.
Third, non-geneticists may think that two persons who do not fall under a usual “close relationship” category are necessarily more genetically distant than close relatives. Take the example of first cousins. Their kinship coefficient^3 (φ) is 0.0625. However, there is a plethora of pedigree relationships that can lead to the exact same kinship level when two persons share several but more remote ancestors, especially in endogamous populations. Moreover, forensic biologists themselves do not seem to agree on the correct interpretation of ‘unrelated’. The issue arose independently for the authors of this paper in different contexts in Europe and North America, demonstrating similar concerns about the word ‘unrelated’ shared by practitioners and researchers in various countries. For example, in a 2012 international workshop on forensic DNA, one of us suggested that the word ‘unrelated’ should not be used anymore in expert reports. The discussion that followed among reporting scientists showed that they diverge over the interpretation and implications of this term. The issue was also brought forward in 2017 within a Swiss working group dedicated to interpreting forensic evidence and expressing conclusions. Despite admitting discomfort when asked to justify the default use of the word ‘unrelated’, the members decided to keep using it until the scientific literature addresses the question because, if questioned, they must refer to “the scientific state of knowledge”. Furthermore, as applied in forensic science, the concept of unrelatedness appears to be an incorrect interpretation of population genetic principles. Essentially, the problem arises when the absence of knowledge about the relatives of a person of interest leads the scientist to transform the ‘unknown person’ (the classical ‘random man’) into an ‘unrelated person’ upon reporting a random match probability, a likelihood ratio, or any other quantitative assessment of the DNA evidence.
However, the key point for the correct interpretation of the weight of DNA evidence is not the existence of relatives per se but rather the information that one has, or does not have, about them and about their potential involvement in the case at hand. As we show in the next section, when no information about relatives is available, one should not interpret Hardy-Weinberg (HW) equations, or their derivations (e.g., those incorporating some form of coancestry), as conditioning the weight of evidence on the unrelatedness between the person of interest (e.g., suspect) and the unknown source of the trace.
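The first-cousin example above can be checked with the standard path-counting rule for kinship coefficients. The sketch below is illustrative only: the function name `kinship` and the "quadruple second cousins" pedigree are assumptions introduced here, not taken from the paper, and the rule as coded assumes non-inbred common ancestors and non-overlapping paths.

```python
from fractions import Fraction


def kinship(path_lengths):
    """Sum (1/2)**(n + 1) over independent paths of n meioses.

    Assumes every common ancestor is non-inbred and no path is counted twice.
    """
    return sum(Fraction(1, 2) ** (n + 1) for n in path_lengths)


# First cousins: two shared grandparents, each linked by a 4-meiosis path.
first_cousins = kinship([4, 4])

# Hypothetical "quadruple second cousins": eight shared great-grandparents,
# each linked by a 6-meiosis path, with no shared grandparent.
quadruple_second_cousins = kinship([6] * 8)

print(first_cousins, quadruple_second_cousins)  # both print as 1/16
print(float(first_cousins))
```

Under these assumptions, two persons with no shared grandparent reach the same φ = 1/16 = 0.0625 as first cousins, which is the kind of equivalence the text alludes to for endogamous populations.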

All is relative

Consider two competing hypotheses about the source of a trace, H_p and H_d, respectively proposed by the prosecution and the defence [14]. In a Bayesian framework, the strength of our belief in favour of one hypothesis over the other before observing the DNA evidence (i.e. the prior odds) is given by Pr(H_p|I)/Pr(H_d|I), where I is any other relevant (e.g., circumstantial) information available about the case. After observing the DNA evidence (E), the posterior odds become

$$\frac{\Pr(H_p \mid E, I)}{\Pr(H_d \mid E, I)} = \frac{\Pr(E \mid H_p, I)}{\Pr(E \mid H_d, I)} \times \frac{\Pr(H_p \mid I)}{\Pr(H_d \mid I)} \qquad (1)$$

where Pr(E|H_p, I)/Pr(E|H_d, I) is the likelihood ratio (LR). In equation (1), case information available about relatives is a component of I and we will designate it by I_R. A classical example is when the suspect has a brother who is assumed to belong to the population of interest. In such a case, H_p usually remains unchanged (e.g. “the suspect is the source of the trace”) while H_d could be that “his brother is the source of the trace”, or that “another person than the suspect, not excluding his brother, is the source”. In either case the calculation of the LR denominator must be adjusted appropriately [15]. Therefore, changing I_R can modify or refine both the set of hypotheses to be evaluated and the calculation of the LR, in agreement with these hypotheses [16].^4 Now, since this is true for any defence hypothesis admitting any specified relatives as the potential donor of the DNA stain [15,18], we will not limit our consideration to the sole brother case and refer more generally to the kinship coefficient φ, which has a value for every degree of genetic relationship (see footnote 3). When the reporting scientist has no knowledge about the existence of relatives, then I_R = ∅ (empty set). In this case, it is generally assumed that the calculation of Pr(E|H_d, I) in equation (1), which is based on Hardy-Weinberg law in the simplest model, holds only when the donor is unrelated to the suspect, that is φ = 0.
However, this has no resonance for stakeholders of the justice system who have to deal with the real world, where crimes occur in populations composed of many kinds of relatives. Actually, the only thing that the denominator should entail is that the reporting scientist incorporates no relevant information about the kinship of the suspect to other persons in the population. That is, I_R = ∅ does not imply that potential donors are totally unrelated to the suspect (i.e. that Pr(φ > 0) = 0). Strictly speaking, an absence of kinship between individuals is expected only in infinite size populations since Pr(φ > 0) → 0 when N → ∞ under random mating [19] (see also Appendix A). Consequently, the absence of information about relatives should not be equated to an absence of kinship. The use of the word ‘unrelated’ is even more problematic under the Balding-Nichols (BN) model [20], which is routinely applied by forensic labs in place of the HW model. This model postulates that relatedness does exist between the suspect and the source of the trace due to population subdivision, such that individuals from the same subpopulation share a common ancestry (and assuming that the suspect and the donor belong to the same subpopulation). The theta (θ) parameter of this model corrects for the non-independence of their genetic profiles by incorporating information from studies on the genetic structuring of human populations. Obviously, this means that the kinship between the suspect and other persons from the same subpopulation is greater than zero. Consequently, it is incoherent to use the word ‘unrelated’ in the formulation of the weight of evidence based on this model.
Moreover, contrary to a widespread idea, HW or BN equations do provide correct values for the probability of a genetic profile when one admits the inclusion of the suspect’s relatives in the population of interest, as long as no information about these relatives is available, as demonstrated for the HW case in Appendix A. Hence, the standard random match probability must be understood as the match probability in the absence of knowledge about relatives, rather than in its commonly accepted sense as the “match probability when relatives are excluded from the population of interest”, which is equivalent to making the unreasonable assumption that the suspect has no relatives! A similar reasoning applies to other weight-of-evidence metrics that tend to refer to unrelated persons in their verbal formulations, including likelihood ratios of various degrees of sophistication.
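To make the role of θ and of equation (1) concrete, here is a minimal single-locus sketch for a heterozygous profile a/b. It uses the textbook Hardy-Weinberg and Balding-Nichols match probabilities; the function names, the value θ = 0.03, the allele frequencies and the prior odds are all hypothetical choices made for illustration, and the LR is taken as 1 / (match probability), which assumes a single-source trace that matches with certainty under the prosecution hypothesis.

```python
def hw_het(p_a, p_b):
    """Hardy-Weinberg probability of a heterozygous genotype a/b."""
    return 2.0 * p_a * p_b


def bn_het(p_a, p_b, theta):
    """Balding-Nichols conditional match probability for a heterozygote a/b.

    theta is the coancestry coefficient; theta = 0 recovers Hardy-Weinberg.
    """
    num = 2.0 * (theta + (1.0 - theta) * p_a) * (theta + (1.0 - theta) * p_b)
    return num / ((1.0 + theta) * (1.0 + 2.0 * theta))


def posterior_odds(prior_odds, match_probability):
    """Posterior odds = LR x prior odds, with LR = 1 / match probability."""
    return (1.0 / match_probability) * prior_odds


p_a, p_b = 0.1, 0.1            # hypothetical allele frequencies
print(hw_het(p_a, p_b))        # HW match probability
print(bn_het(p_a, p_b, 0.03))  # larger, because coancestry is assumed > 0
print(posterior_odds(0.001, hw_het(p_a, p_b)))
```

With θ > 0 the match probability exceeds the HW value, which reflects the point made above: the BN model already assumes some shared ancestry between the suspect and the alternative donor.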

The unrelated man: an unnecessary burden

To circumvent the lack of realism conveyed by references to unrelated individuals, some authors proposed to change the calculation and presentation of the weight of evidence. In their “call for a re-examination of reporting practice”, Buckleton and Triggs [16] concluded that “it is time that the match probabilities for a sibling are reported in all casework involving many loci where the suspect has a non-excluded sibling”, a call that however appears to have had little effect on current common DNA reporting practice (see Ref. [21] for a similar argument). Likewise, Taylor et al. [22] proposed a unified LR that accounts for potential relatives and “removes the need to stipulate that the alternative donor is unrelated when forming the propositions” [22, p. 57]. Basically, LRs considering different types of relatives are calculated, weighted by the postulated frequency of each type of relative, and then summed [22]. The STRmix™ software implements a different approach by letting the user specify the average number of children per family, to better reflect the composition of the population of interest (see http://strmix.esr.cri.nz/#home for a list of publications relative to the methods implemented in STRmix™). As long as the assumptions about the relatedness structure are made explicit, the above approaches have the advantage of considering populations that are more representative of human mating systems than the classic ‘random mating’ scheme. However, while they address the problem of how to best quantify the weight of DNA evidence, they do not fully address the semantic issue of its verbal formulation, because an ‘unrelated’ category may still remain among the several types of relatives considered. What should reporting scientists do then?
We suggest that referring simply to an ‘unknown person’ or to the ‘random individual’ is sufficient because one should not, and does not need to, discard the possibility that the source is related to the suspect to an unknown degree. Alternatively, a more explicit wording would be ‘an unknown person, without regard to his relatedness to the suspect’. Again, the important point here is not unrelatedness but the absence of relevant knowledge about relatives (I_R = ∅), which prevails in most real life casework. From this perspective, one can simply consider that if the unknown individual who left a DNA trace happened to be the brother or the cousin of the suspect, this would be a sort of ancillary consequence, a way by which we categorize and name one among many possible genetic outcomes of a random draw in a finite population. This way of expressing the relatedness avoids the pitfalls associated with the choice of an arbitrary definition of ‘unrelated’ within the forensic context. Critically, in assessing the weight of the DNA evidence with standard metrics, one must nevertheless bear in mind the assumption that the suspect has no more or less chance to have relatives of a given degree than the average person in the population of interest. Therefore, it is still important to specify that potential relatives are included in the list of possible donors, especially when the set of possible suspects is small.^5
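The weighting idea described above for approaches that account for relatives can be sketched for a single locus as follows. This is not the method of Taylor et al. [22] or of STRmix™; it is only a simplified illustration in which the match probability is averaged over relationship classes. The sibling formula is the textbook full-sibling match probability for a heterozygote a/b, the non-sibling class is simply given the HW value, and the class weights are hypothetical placeholders rather than estimates.

```python
def sib_het(p_a, p_b):
    """Probability that a full sibling shares a heterozygous genotype a/b."""
    return (1.0 + p_a + p_b + 2.0 * p_a * p_b) / 4.0


def weighted_match_probability(classes):
    """Average match probabilities over (weight, match_probability) pairs.

    The weights are the assumed probabilities that the unknown donor falls
    in each relationship class; they must sum to 1.
    """
    total_weight = sum(weight for weight, _ in classes)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("class weights must sum to 1")
    return sum(weight * prob for weight, prob in classes)


p_a, p_b = 0.1, 0.1  # hypothetical allele frequencies
classes = [
    (0.01, sib_het(p_a, p_b)),  # hypothetical weight for "full sibling"
    (0.99, 2.0 * p_a * p_b),    # everyone else, HW value as a simplification
]
print(weighted_match_probability(classes))
```

Note that the wording issue discussed in the text remains: the second class still has to be named, and calling it ‘unrelated’ would reintroduce the ambiguity.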

Prospective

The arguments presented in this paper call for a change in reporting practices to prevent semantic confusion and potential misinterpretation of DNA evidence by fact-finders and other criminal justice system participants. We suggest avoiding the routine and default use of the word ‘unrelated’, not only in oral communications and expert reports, but also in the forensic literature in general, including guidelines and recommendations. Some might believe that this issue is unlikely to have a big influence on the interpretation of forensic DNA expert evidence, but the confusion that exists even among reporting scientists (see section 2) casts doubt on such an assumption. For clarity and robust reporting practices in forensic genetics, there is a vacuum in the literature about this question that needs to be filled. Thus, we hope this paper will spark discussion, and will be glad to hear what other people think, including scientists, investigators, prosecutors, lawyers and fact-finders, whether they support or temper our concern (a web page [www.uqtr.ca/lrc/unrelated] has been opened to gather comments from the readers). In all cases, future studies in criminology, psychology and law will be essential to better document the variation in the perception, both by scientists and non-scientists, of the unrelatedness concept, and the impact of this variation on the justice system. The perception of alternative formulations should be compared, such as the one proposed here (‘an unknown person, without regard to his relatedness to the suspect’). This calls for an active collaboration between scientists and stakeholders of the criminal justice system to reduce the gap “that exists between questions lawyers are actually interested in, and the answers that scientists deliver to Courts” [23].
Finally, while this paper focuses on evaluative reporting, it will also be important to assess if and how various interpretations of the unrelatedness concept could impact decisions and actions in the course of criminal investigations.

Conflict of Interest

The authors declare no conflict of interest.

Declaration of competing interest

We declare that this research involves no competing interests.

Table A.1

Values obtained for the standard random match probability (RMP_std) and the random match probability accounting for the possibility that suspect’s siblings may exist in the population (RMP_sib), for a heterozygote a/b and various settings of N, p_a and p_b (assuming no coancestry due to population subdivision, i.e. θ = 0). RMP_std for model 2 integrates the expected difference in the genotype frequencies in a finite population (2 p_a p_b - p_a p_b/N) that is a random draw from an infinite population (2 p_a p_b) [25].

Model 1: fixed allele frequencies. Model 2: random allele frequencies.

p_a	p_b	N	Model 1 RMP_std	Model 1 RMP_sib	Model 2 RMP_std	Model 2 RMP_sib
0.1	0.1	∞	0.02000000	0.02000000	0.02000000	0.02000000
0.1	0.1	1,000,000	0.02000001	0.02000001	0.01999999	0.01999999
0.1	0.1	10,000	0.02000100	0.02000100	0.01999900	0.01999900
0.1	0.1	1000	0.02001001	0.02001001	0.01999000	0.01999020
0.1	0.1	100	0.02010050	0.02010071	0.01990000	0.01992015
0.5	0.1	∞	0.10000000	0.10000000	0.10000000	0.10000000
0.5	0.1	1,000,000	0.10000010	0.10000010	0.09999995	0.09999995
0.5	0.1	10,000	0.10000500	0.10000500	0.09999500	0.09999501
0.5	0.1	1000	0.10005000	0.10005000	0.09995000	0.09995006
0.5	0.1	100	0.10050250	0.10050260	0.09950000	0.09956099
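The RMP_std columns of Table A.1 can be approximately reproduced with two short formulas. The model 2 expression is the one given in the table caption. The model 1 expression is an assumption made here, not a formula stated in this text: it treats the genotype as two gene copies drawn without replacement from 2N copies with fixed allele counts, and it agrees with the printed values to within about 1e-7. The RMP_sib columns depend on Appendix A and are not reproduced. Function names are illustrative.

```python
def rmp_std_fixed(p_a, p_b, n):
    """Model 1 (assumed form): heterozygote a/b probability when two gene
    copies are drawn without replacement from 2N copies with fixed counts."""
    return 2.0 * p_a * p_b * (2.0 * n) / (2.0 * n - 1.0)


def rmp_std_random(p_a, p_b, n):
    """Model 2 (from the caption): 2 p_a p_b - p_a p_b / N."""
    return 2.0 * p_a * p_b - p_a * p_b / n


# Finite population sizes used in the table (the N = infinity rows are
# simply 2 * p_a * p_b in both models).
for p_a, p_b in [(0.1, 0.1), (0.5, 0.1)]:
    for n in [1_000_000, 10_000, 1000, 100]:
        print(p_a, p_b, n,
              f"{rmp_std_fixed(p_a, p_b, n):.8f}",
              f"{rmp_std_random(p_a, p_b, n):.8f}")
```

Both corrections shrink as N grows, so the finite-population values converge to the familiar 2 p_a p_b.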


1. An extended likelihood ratio framework for interpreting evidence.

Authors: J S Buckleton; C M Triggs; C Champod
Journal: Sci Justice Date: 2006 Apr-Jun Impact factor: 2.124

2. Forensic scientists' conclusions: how readable are they for non-scientist report-users?

Authors: Loene M Howes; K Paul Kirkbride; Sally F Kelty; Roberta Julian; Nenagh Kemp
Journal: Forensic Sci Int Date: 2013-05-16 Impact factor: 2.395

3. Analysis of matches and partial-matches in a Danish STR data set.

Authors: Torben Tvedebrink; Poul Svante Eriksen; James Michael Curran; Helle Smidt Mogensen; Niels Morling
Journal: Forensic Sci Int Genet Date: 2011-09-06 Impact factor: 4.882

4. Individual-specific 'fingerprints' of human DNA.

Authors: A J Jeffreys; V Wilson; S L Thein
Journal: Nature Date: 1985 Jul 4-10 Impact factor: 49.962

5. Relatedness calculations for linked loci incorporating subpopulation effects.

Authors: Jo-Anne Bright; James M Curran; John S Buckleton
Journal: Forensic Sci Int Genet Date: 2013-03-26 Impact factor: 4.882

6. Perception problems of the verbal scale.

Authors: Carrie Mullen; Danielle Spence; Linda Moxey; Allan Jamieson
Journal: Sci Justice Date: 2013-11-20 Impact factor: 2.124

7. An illustration of the effect of various sources of uncertainty on DNA likelihood ratio calculations.

Authors: D Taylor; J-A Bright; J Buckleton; J Curran
Journal: Forensic Sci Int Genet Date: 2014-02-08 Impact factor: 4.882

8. On the interpretation of likelihood ratios in forensic science evidence: Presentation formats and the weak evidence effect.

Authors: K A Martire; R I Kemp; M Sayle; B R Newell
Journal: Forensic Sci Int Date: 2014-04-14 Impact factor: 2.395

9. DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands.

Authors: D J Balding; R A Nichols
Journal: Forensic Sci Int Date: 1994-02 Impact factor: 2.395

10. Understanding forensic expert evaluative evidence: A study of the perception of verbal expressions of the strength of evidence.

Authors: Eleanor Arscott; Ruth Morgan; Georgina Meakin; James French
Journal: Sci Justice Date: 2017-02-09 Impact factor: 2.124