Nicolas Duforet-Frebourg, Lucie M Gattepaille, Michael G B Blum, Mattias Jakobsson.
Abstract
BACKGROUND: In ecology and forensics, some population assignment techniques use molecular markers to assign individuals to known groups. However, assigning individuals to known populations can be difficult if the level of genetic differentiation among populations is small. Most assignment studies handle independent markers, often by pruning markers in linkage disequilibrium (LD), thereby ignoring the information contained in the correlation among markers due to LD.
Year: 2015 PMID: 26227424 PMCID: PMC4521458 DOI: 10.1186/s12859-015-0661-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. Mean percentage of Incorrect Assignment (MIA) for simulated data from a divergence model with 3 populations (see main text for details on simulations). Panel a: the x axis represents the window sizes. Note that window size = 1 corresponds to using SNP genotype data to assign individuals to populations. Panel b: the x axis represents the proportion of individuals in the samples that are used in the training set. The mean incorrect assignment of individuals is evaluated with individuals from the validation set that were not used to construct the haplotypes.
Fig. 2. Principal Component Analysis on 447,245 SNPs for Spanish and Portuguese samples from POPRES.
Fig. 3. Principal Component Analysis of the Spanish and Portuguese samples from POPRES using the haplotypes found with HaploPOP. The haplotypes were built from 447,245 SNPs using a window size of 150 kb. For constructing haplotypes, the training sets consist of the Portuguese and Spanish individuals (Panel a) or a mix of Portuguese and Spanish individuals in both sets ‘A’ and ‘B’ (Panel b).
Fig. 4. Mean percentage of Incorrect Assignment (MIA) when distinguishing the Spanish and Portuguese samples from POPRES. The error is evaluated with a split-validation approach.
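The MIA statistic named in the captions of Figs. 1 and 4 can be illustrated with a small sketch. The code below is not HaploPOP and does not reproduce the authors' procedure. It assumes that MIA is the percentage of validation individuals assigned to the wrong population, averaged over repeated random training/validation splits, and it swaps in a plain nearest-centroid classifier as a stand-in for the real assignment method. The function names `nearest_centroid_assign` and `mean_incorrect_assignment`, and the parameters `train_fraction`, `n_splits` and `seed`, are hypothetical.

```python
import random


def nearest_centroid_assign(train, labels, query):
    """Assign `query` to the population whose training centroid is closest.

    Stand-in classifier for illustration only (not the HaploPOP method).
    """
    sums, counts = {}, {}
    for vec, lab in zip(train, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for j, value in enumerate(vec):
            acc[j] += value
        counts[lab] = counts.get(lab, 0) + 1

    best_label, best_dist = None, float("inf")
    for lab, acc in sums.items():
        # Squared Euclidean distance between the query and the centroid.
        dist = sum((q - s / counts[lab]) ** 2 for q, s in zip(query, acc))
        if dist < best_dist:
            best_label, best_dist = lab, dist
    return best_label


def mean_incorrect_assignment(data, labels, train_fraction=0.5,
                              n_splits=20, seed=0):
    """Average percentage of misassigned validation individuals.

    Assumed definition: for each random split, individuals in the
    validation set (never used for training) are assigned to a population,
    and the percentage of wrong assignments is averaged over all splits.
    """
    n_train = int(round(train_fraction * len(data)))
    if n_train <= 0 or n_train >= len(data):
        raise ValueError("train_fraction must leave both sets non-empty")

    rng = random.Random(seed)
    indices = list(range(len(data)))
    percentages = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        train_idx, valid_idx = indices[:n_train], indices[n_train:]
        train = [data[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        wrong = sum(
            nearest_centroid_assign(train, train_labels, data[i]) != labels[i]
            for i in valid_idx
        )
        percentages.append(100.0 * wrong / len(valid_idx))
    return sum(percentages) / len(percentages)
```

Calling `mean_incorrect_assignment(features, populations, train_fraction=0.5)` on per-individual feature vectors (for example, haplotype cluster counts per window) returns a value between 0 and 100. Varying `train_fraction` corresponds to the kind of sweep shown on the x axis of Fig. 1, Panel b.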