Vikas Bansal, Ondrej Libiger.
Abstract
BACKGROUND: Estimation of individual ancestry from genetic data is useful for the analysis of disease association studies, understanding human population history and interpreting personal genomic variation. New, computationally efficient methods are needed for ancestry inference that can effectively utilize existing information about allele frequencies associated with different human populations and can work directly with DNA sequence reads.
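As a rough illustration of how known reference allele frequencies and raw read counts can be combined, the sketch below fits admixture coefficients with a simple EM procedure. It is a generic mixture-model sketch and is not taken from the paper: the function name `estimate_admixture`, its parameters, and the simplification that every read is an independent draw from the individual's admixed allele frequency (ignoring genotypes and sequencing errors) are all assumptions made for illustration.

```python
def estimate_admixture(alt_counts, ref_counts, freqs, n_iter=200, eps=1e-9):
    """Illustrative EM estimate of admixture coefficients (hypothetical helper).

    alt_counts[j], ref_counts[j]: reads showing the alternate / reference
        allele at site j.
    freqs[j][k]: alternate-allele frequency at site j in reference
        population k (held fixed, not re-estimated).
    Assumption: each read is an independent draw with alternate-allele
    probability sum_k q[k] * freqs[j][k].
    """
    n_pops = len(freqs[0])
    q = [1.0 / n_pops] * n_pops          # start from equal proportions
    total_reads = sum(alt_counts) + sum(ref_counts)

    for _ in range(n_iter):
        new_q = [0.0] * n_pops
        for alt, ref, site_freqs in zip(alt_counts, ref_counts, freqs):
            # Keep frequencies away from 0 and 1 to avoid division by zero.
            f = [min(max(x, eps), 1.0 - eps) for x in site_freqs]
            p_alt = sum(qk * fk for qk, fk in zip(q, f))
            for k in range(n_pops):
                # E-step: expected number of reads attributed to population k.
                new_q[k] += alt * q[k] * f[k] / p_alt
                new_q[k] += ref * q[k] * (1.0 - f[k]) / (1.0 - p_alt)
        # M-step: coefficients are the attributed read fractions.
        q = [x / total_reads for x in new_q]
    return q


# Toy example: one site, two populations with frequencies 0.9 and 0.1,
# and 7 of 10 reads carrying the alternate allele.
print(estimate_admixture([7], [3], [[0.9, 0.1]]))  # approximately [0.75, 0.25]
```

In the toy example the observed alternate-allele fraction of 0.7 is matched by weights of 0.75 and 0.25, since 0.75 × 0.9 + 0.25 × 0.1 = 0.7. A realistic analysis would use many sites and a likelihood that accounts for diploid genotypes and base-calling errors.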
Year: 2015 PMID: 25592880 PMCID: PMC4301802 DOI: 10.1186/s12859-014-0418-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Admixture proportions for 25 Mozabite individuals. The coefficients were estimated using allele frequencies from the HapMap reference populations and using two methods: iAdmix (a) and ADMIXTURE (b). The population labels are as follows: TSI (blue), CEU (light blue), MKK (red), YRI (green) and LWK (yellow).
Figure 2. Distribution of the number of reads covering the 249,075 polymorphic sites in the HapMap3 allele frequency panel using low-coverage whole-genome and exome sequence data from one individual (NA19704) sequenced in the 1000 Genomes Project.
Comparison of admixture estimates for individuals from the ASW population
| Individual | Data | | | |
|---|---|---|---|---|
| NA19625 | lowcov | 0.6657 | 0.1718 | 0.1626 |
| NA19625 | exome | 0.6672 | 0.1674 | 0.1654 |
| NA19700 | lowcov | 0.8308 | 0.1692 | 0 |
| NA19700 | exome | 0.8341 | 0.1656 | 0 |
| NA19703 | lowcov | 0.8554 | 0.1445 | 0 |
| NA19703 | exome | 0.8564 | 0.1437 | 0 |
| NA19704 | lowcov | 0.8622 | 0.138 | 0 |
| NA19704 | exome | 0.8577 | 0.1423 | 0 |
| NA19707 | lowcov | 0.7397 | 0.243 | 0.0173 |
| NA19707 | exome | 0.7354 | 0.2456 | 0.0189 |
| NA19701 | lowcov | 0.8447 | 0.1313 | 0.024 |
| NA19701 | exome | 0.8446 | 0.1286 | 0.0268 |
Admixture estimates were calculated using low-coverage whole-genome sequence data (lowcov) and exome sequence data for 6 individuals from the ASW (African-American) population in the 1000 Genomes Project.
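To make the agreement between the two data types concrete, the short sketch below computes, for each individual, the largest absolute difference between the lowcov and exome coefficients. The values are copied from the table above; the variable and function names are illustrative only.

```python
# (lowcov, exome) admixture coefficients per individual, copied from the table.
estimates = {
    "NA19625": ((0.6657, 0.1718, 0.1626), (0.6672, 0.1674, 0.1654)),
    "NA19700": ((0.8308, 0.1692, 0.0), (0.8341, 0.1656, 0.0)),
    "NA19703": ((0.8554, 0.1445, 0.0), (0.8564, 0.1437, 0.0)),
    "NA19704": ((0.8622, 0.138, 0.0), (0.8577, 0.1423, 0.0)),
    "NA19707": ((0.7397, 0.243, 0.0173), (0.7354, 0.2456, 0.0189)),
    "NA19701": ((0.8447, 0.1313, 0.024), (0.8446, 0.1286, 0.0268)),
}


def max_abs_difference(lowcov, exome):
    """Largest absolute difference between paired coefficients."""
    return max(abs(a - b) for a, b in zip(lowcov, exome))


for sample, (lowcov, exome) in estimates.items():
    print(f"{sample}: {max_abs_difference(lowcov, exome):.4f}")

# Largest discrepancy across all six individuals.
overall = max(max_abs_difference(l, e) for l, e in estimates.values())
print(f"overall: {overall:.4f}")
```

For every individual the two estimates differ by less than 0.005 in each coefficient.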
Admixture coefficients for simulated pools
| Pool | | | |
|---|---|---|---|
| 20 GBR | 1.0 | 0 | 0 |
| 19 GBR, 1 CHS | 0.9465 | 0.0535 | 0 |
| 19 GBR, 1 LWK | 0.9653 | 0 | 0.0347 |
| 18 GBR, 1 LWK, 1 CHS | 0.9116 | 0.0562 | 0.0323 |
| 39 GBR, 1 CHS | 0.9705 | 0.0295 | 0 |
| 59 GBR, 1 CHS | 0.9793 | 0.0207 | 0 |
Pools were constructed from 1000 Genomes Project exome sequence data, and the admixture coefficients were estimated using allele frequencies from 8 HapMap reference populations.
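One way to read the pool table is to compare each estimated coefficient with the fraction of the pool that the minority individual makes up. The sketch below does that arithmetic under two assumptions that the extracted table does not confirm: that every individual contributes equally to a pool, and that the second and third coefficient columns track the ancestry contributed by the CHS and LWK individuals respectively (as the pattern of zeros in the rows suggests). The names used are illustrative.

```python
# (pool composition, minority population, estimated coefficient from the table)
pools = [
    ({"GBR": 19, "CHS": 1}, "CHS", 0.0535),
    ({"GBR": 19, "LWK": 1}, "LWK", 0.0347),
    ({"GBR": 18, "LWK": 1, "CHS": 1}, "CHS", 0.0562),
    ({"GBR": 18, "LWK": 1, "CHS": 1}, "LWK", 0.0323),
    ({"GBR": 39, "CHS": 1}, "CHS", 0.0295),
    ({"GBR": 59, "CHS": 1}, "CHS", 0.0207),
]


def expected_fraction(composition, population):
    """Share of the pool from one population, assuming equal contributions."""
    return composition[population] / sum(composition.values())


for composition, population, estimated in pools:
    expected = expected_fraction(composition, population)
    label = ", ".join(f"{n} {pop}" for pop, n in composition.items())
    print(f"{label}: {population} expected {expected:.4f}, estimated {estimated:.4f}")
```

Under these assumptions, a single individual in pools of 20, 40 and 60 corresponds to expected fractions of 1/20, 1/40 and 1/60, which can be set against the estimated coefficients in the table.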