Stephen D Turner, V P Nagraj, Matthew Scholz, Shakeel Jessa, Carlos Acevedo, Jianye Ge, August E Woerner, Bruce Budowle.
Abstract
Technological advances in sequencing and single nucleotide polymorphism (SNP) genotyping microarray technology have facilitated advances in forensic analysis beyond short tandem repeat (STR) profiling, enabling the identification of unknown DNA samples and distant relationships. Forensic genetic genealogy (FGG) has facilitated the identification of distant relatives of both unidentified remains and unknown donors of crime scene DNA, invigorating the use of biological samples to resolve open cases. Forensic samples are often degraded or contain only trace amounts of DNA. In this study, the accuracy of genome-wide relatedness methods and identity by descent (IBD) segment approaches was evaluated in the presence of challenges commonly encountered with forensic data: missing data and genotyping error. Pedigree whole-genome simulations were used to estimate the genotypes of thousands of individuals with known relationships using multiple populations with different biogeographic ancestral origins. Simulations were also performed with varying error rates and types. Using these data, the performance of different methods for quantifying relatedness was benchmarked across these scenarios. When the genotyping error was low (<1%), IBD segment methods outperformed genome-wide relatedness methods for close relationships and were more accurate at distant relationship inference. However, as the genotyping error increased (1-5%), methods that do not rely on IBD segment detection were more robust and outperformed IBD segment methods. A reduced call rate had little impact on either class of methods. These results have implications for the use of dense SNP data in forensic genomics for distant kinship analysis and FGG, especially when the sample quality is low.
Keywords: SNP; forensic genetic genealogy; forensics; genealogy; kinship; relatedness
Year: 2022 PMID: 35846115 PMCID: PMC9282869 DOI: 10.3389/fgene.2022.882268
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Overall classification accuracy using default parameters. Genotyping error increases across panels from left to right, and the missing data rate increases from top to bottom. Individual bars within each panel show the classification accuracy within each simulated population. Accuracy is roughly equivalent across methods with zero error, but with a higher genotyping error both IBD segment methods are less accurate than KING.
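To make the notion of classification accuracy concrete, the following is a minimal sketch, assuming that an inferred kinship coefficient is binned into a relationship degree using the commonly used power-of-two cutoffs (degree d when the coefficient exceeds 2^-(d+1.5)), and that accuracy is the fraction of pairs whose inferred degree matches the simulated one. The function names, the cutoff scheme, and the example values are illustrative assumptions; the excerpt above does not state the exact classification rule used in the study.

```python
def kinship_to_degree(phi, max_degree=3):
    """Bin a kinship coefficient into a relationship degree.

    Assumed cutoffs (not taken from the source text): a pair is assigned
    degree d when phi exceeds 2 ** -(d + 1.5), checking d = 0, 1, 2, ...
    Degree 0 stands for duplicate samples or identical twins.
    Returns -1 when phi falls below the cutoff for max_degree.
    """
    for degree in range(max_degree + 1):
        if phi > 2 ** -(degree + 1.5):
            return degree
    return -1


def classification_accuracy(inferred_phi, true_degrees, max_degree=3):
    """Fraction of pairs whose inferred degree equals the simulated degree."""
    if len(inferred_phi) != len(true_degrees) or len(inferred_phi) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        kinship_to_degree(phi, max_degree) == truth
        for phi, truth in zip(inferred_phi, true_degrees)
    )
    return hits / len(true_degrees)


# Hypothetical example: three simulated pairs (1st, 2nd, 3rd degree).
# The third pair's inferred kinship is too low to be called 3rd degree.
example_accuracy = classification_accuracy([0.25, 0.12, 0.03], [1, 2, 3])
```

Under these assumptions, a coefficient deflated by genotyping error can fall below a cutoff and move the pair into a more distant degree or out of the related classes entirely, which counts as a misclassification.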
FIGURE 2. Root mean square error (RMSE) comparing the inferred versus simulated kinship. Genotyping error increases across panels from left to right, and the missing data rate increases from top to bottom. Individual bars within each panel show the classification accuracy within each simulated population (ASW, GBR, and MXL).
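As an illustration of the RMSE metric named in this caption, the sketch below computes the root mean square error between inferred and simulated kinship coefficients over a set of pairs. The function name `kinship_rmse` and the example values are hypothetical and are not taken from the study's data or code.

```python
import math


def kinship_rmse(inferred, simulated):
    """Root mean square error between inferred and simulated kinship values.

    Both arguments are equal-length sequences with one value per pair
    of individuals.
    """
    if len(inferred) != len(simulated) or len(inferred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(i - s) ** 2 for i, s in zip(inferred, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: one pair overestimated, one pair underestimated.
example_rmse = kinship_rmse([0.30, 0.10], [0.25, 0.125])
```

Because the errors are squared before averaging, a few badly estimated pairs raise the RMSE more than many small deviations do.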
FIGURE 3. Difference between the inferred and true simulated kinship coefficients for three methods using default parameters at different error and missingness levels, for simulated relationships from GBR founders. Error increases across panels from left to right, and missing data increase from top to bottom. Each point represents a pair of simulated individuals. Red = hap-IBD; green = KING; blue = IBIS.