María de la Puente, Jorge Ruiz-Ramírez, Adrián Ambroa-Conde, Catarina Xavier, Jorge Amigo, María Ángeles Casares de Cal, Antonio Gómez-Tato, Ángel Carracedo, Walther Parson, Christopher Phillips, María Victoria Lareu.
Abstract
The development of microhaplotype (MH) panels for massively parallel sequencing (MPS) platforms is gaining increasing relevance for forensic analysis. Here, we expand the applicability of a panel of 102 autosomal and 11 X-chromosome MHs, designed for identification purposes and previously validated on both the MiSeq and Ion S5 MPS platforms. We have broadened the reference population data for identification purposes by adding data from 240 HGDP-CEPH individuals from native populations of North Africa, the Middle East, Oceania and America. Using the enhanced population data, we evaluated the panel as a marker set for bio-geographical ancestry (BGA) inference; it provided a clear differentiation of the five main continental groups of Africa, Europe, East Asia, Native America, and Oceania. An informative degree of differentiation was also achieved for the population variation encompassing North Africa, the Middle East, Europe, South Asia, and East Asia. In addition, we explored the potential for individual BGA inference from simple mixed DNA by simulating mixed profiles and then deconvoluting the mixture components.
Keywords: bio-geographical ancestry; human identification; massively parallel sequencing; microhaplotypes; mixed DNA
Year: 2020; PMID: 33193704; PMCID: PMC7606911; DOI: 10.3389/fgene.2020.581041
Source DB: PubMed; Journal: Front Genet; ISSN: 1664-8021; Impact factor: 4.599
FIGURE 1. Pairwise FST (blue) and number of pairwise genotype differences between (green) and within (orange) populations for the autosomal MHs. Populations are named and grouped into eight major populations according to Supplementary Table S5.
FIGURE 2. Bar chart of log10 cumulative random match probability values (i.e., the probability that two individuals share the same profile) for the 30 populations considered, based on the autosomal MH data only. Populations are named and grouped into eight major populations according to Supplementary Table S5. For comparison, dashed lines represent, from bottom to top, the theoretical values for a panel composed of 102 perfectly balanced bi-, tri-, and tetra-allelic SNPs: 3.56E-44, 1.98E-75, and 9.32E-99, respectively.
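The theoretical reference values quoted in the Figure 2 caption can be reproduced with a short calculation. The sketch below is not taken from the article: it assumes Hardy-Weinberg genotype proportions, independence between loci, and equal allele frequencies at each locus ("perfectly balanced"), and all function names are hypothetical. Under those assumptions, the per-locus match probability is the sum of squared genotype frequencies, and the cumulative value is the product over the 102 loci.

```python
import math
from itertools import combinations_with_replacement


def locus_match_probability(allele_freqs):
    """Sum of squared genotype frequencies at one locus.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch):
    homozygote frequency p*p, heterozygote frequency 2*p*q.
    """
    rmp = 0.0
    indices = range(len(allele_freqs))
    for i, j in combinations_with_replacement(indices, 2):
        p, q = allele_freqs[i], allele_freqs[j]
        genotype_freq = p * p if i == j else 2 * p * q
        rmp += genotype_freq ** 2
    return rmp


def log10_cumulative_rmp(per_locus_freqs):
    """log10 of the product of per-locus match probabilities.

    Assumes loci are independent; summing logs avoids tiny products.
    """
    return sum(math.log10(locus_match_probability(f)) for f in per_locus_freqs)


def balanced_panel_log10_rmp(n_alleles, n_loci=102):
    """log10 cumulative RMP for n_loci loci with n_alleles equifrequent alleles."""
    freqs = [1.0 / n_alleles] * n_alleles
    return log10_cumulative_rmp([freqs] * n_loci)


if __name__ == "__main__":
    for n_alleles in (2, 3, 4):
        value = 10 ** balanced_panel_log10_rmp(n_alleles)
        print(f"{n_alleles} alleles: {value:.2e}")
    # 2 alleles: 3.56e-44
    # 3 alleles: 1.98e-75
    # 4 alleles: 9.32e-99
```

With k equifrequent alleles the per-locus value reduces to (2k - 1)/k^3, that is 3/8, 5/27 and 7/64 for k = 2, 3 and 4, which is why the three dashed-line values match the figures in the caption.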
FIGURE 3. Bio-geographical ancestry analysis of the five continental reference populations. (A) STRUCTURE results of ancestry proportions at K = 5. Each bar represents an individual and is colored in segments whose lengths correspond to their genetic cluster membership coefficients in up to five inferred population groups. (B) Three-dimensional MDS analysis showing coordinates 1 and 2 (left) and 2 and 3 (right). (C) Neighbor Joining (NJ) tree analysis. For the MDS and NJ-tree plots, populations are colored according to the five different clusters, which correspond to the five major populations identified in the STRUCTURE plot.
FIGURE 4. Bio-geographical ancestry analysis of the five NAF-Eurasia reference population sets. (A) STRUCTURE results of ancestry proportions at K = 5. Each bar represents an individual and is colored in segments whose lengths correspond to their genetic cluster membership coefficients in up to five inferred population groups. (B) Three-dimensional MDS analysis showing coordinates 1 and 2 (left) and 2 and 3 (right). (C) Neighbor Joining (NJ) tree analysis. For the MDS and NJ-tree plots, populations are colored according to the five different clusters, which correspond to the five major populations identified in the STRUCTURE plot.
FIGURE 5. Bio-geographical ancestry inference for the major and minor mixture components in mixtures 1, 2, and 3, classified using the continental reference set presented in Figure 3. (A) STRUCTURE results of ancestry proportions at K = 5. Each bar represents an individual and is colored in segments whose lengths correspond to their genetic cluster membership coefficients in up to five inferred population groups. (B) Three-dimensional MDS analysis showing coordinates 1 and 2 (for mixture 1) or 1 and 3 (for mixtures 2 and 3). Populations and major and minor components are colored according to the legend. (C) Table showing, for each mixture ratio, the expected ancestry of the known components and the % completeness (compl.) of the minor and major deconvoluted MH profiles. Details of the simulated profiles and deconvolution results can be found in Supplementary File S2 and Supplementary Table S8.