Abstract
MOTIVATION: Correctly modeling population structure is important for understanding recent evolution and for association studies in humans. While pre-existing knowledge of population history can be used to specify expected levels of subdivision, objective metrics to detect population structure are important and may even be preferable for identifying groups in some situations. One such metric for genomic scale data is implemented in the cross-validation procedure of the program ADMIXTURE, but it has not been evaluated on recently diverged and potentially cryptic levels of population structure. Here, I develop a new method, AdmixKJump, and test both metrics under this scenario.
Keywords: 1000 Genomes project; Admixture; Fine scale population structure; Population genetics
Year: 2015 PMID: 25678934 PMCID: PMC4325960 DOI: 10.1186/s13029-014-0031-1
Source DB: PubMed Journal: Source Code Biol Med ISSN: 1751-0473
Figure 1. Split time vs. metric accuracy. The x-axis is a split time parameter added to the human demographic model, indicating the point at which two populations start diverging. The y-axis has two labels. The first, Ancestry Accuracy, indicates how accurately the model parameters cluster the two populations, where 50% accuracy corresponds to random assignment. The second indicates the % accuracy of AdmixKJump or cross-validation in correctly identifying K* = 2, i.e. two clusters. I report population sample sizes of 10 (blue), 30 (red), and 50 (purple).
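The caption's ancestry accuracy, with its 50% floor for random assignment, can be illustrated with a minimal Python sketch. The caption does not give the exact definition used, so the function name `ancestry_accuracy`, the 0.5 assignment threshold, and the handling of label switching are all assumptions made for illustration.

```python
def ancestry_accuracy(q_first, true_labels):
    """Fraction of individuals placed in the correct one of two clusters.

    q_first     -- per-individual estimated ancestry proportion in the first
                   cluster of a K = 2 model (hypothetical input format)
    true_labels -- per-individual true population, coded 0 or 1
    """
    if len(q_first) != len(true_labels) or not q_first:
        raise ValueError("inputs must be non-empty and of equal length")

    # Assumption: assign each individual to the cluster holding the
    # majority of its estimated ancestry.
    assigned = [0 if q >= 0.5 else 1 for q in q_first]
    matches = sum(a == t for a, t in zip(assigned, true_labels))
    accuracy = matches / len(true_labels)

    # Cluster labels are arbitrary, so score the better of the two
    # labelings. This is why the score cannot fall below 0.5.
    return max(accuracy, 1.0 - accuracy)


# Hypothetical ancestry proportions for four individuals.
print(ancestry_accuracy([0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1]))
print(ancestry_accuracy([0.9, 0.2, 0.8, 0.1], [0, 0, 1, 1]))
```

Taking the better of the two labelings means an uninformative clustering scores about 0.5 rather than 0, which matches the caption's reading of 50% as random assignment.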
European 1000 Genomes Project pairwise comparison for F_ST and K*

| Population pair | F_ST | K* (AdmixKJump) |
|---|---|---|
| CEU-FIN | 0.006 | 1 |
| CEU-GBR | 0.002 | 1 |
| CEU-TSI | 0.003 | 1 |
| FIN-GBR | 0.005 | 1 |
| FIN-TSI | 0.009 | 2 |
| GBR-TSI | 0.004 | 1 |
AdmixKJump shows two clusters for one of the pairwise comparisons (FIN-TSI), whereas cross-validation does not. This is consistent with the increased divergence of this pair compared to the others, which here is measured by F_ST.
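To make the divergence measure concrete, here is a minimal Python sketch of a pairwise F_ST computed from per-SNP allele frequencies. The text does not say which estimator produced the tabulated values. This sketch assumes a Hudson-style ratio of sums with no sample-size correction, and the function name and input frequencies are hypothetical.

```python
def pairwise_fst(p1, p2):
    """Hudson-style F_ST between two populations (ratio of sums over SNPs).

    p1, p2 -- allele frequencies of the same allele at the same SNPs in
              population 1 and population 2. Sample-size corrections are
              omitted, so this is a simplification.
    """
    if len(p1) != len(p2) or not p1:
        raise ValueError("inputs must be non-empty and of equal length")

    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        numerator += (a - b) ** 2                # between-population difference
        denominator += a * (1 - b) + b * (1 - a)  # between-population heterozygosity

    if denominator == 0.0:
        raise ValueError("no SNP is variable across the two populations")
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs.
print(pairwise_fst([0.60, 0.30, 0.10], [0.40, 0.35, 0.12]))
```

Identical frequencies give 0 and fixed differences give 1, so small values such as those in the table indicate closely related populations.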