Jukka Corander, Pekka Marttinen, Jukka Sirén, Jing Tang.
Abstract
BACKGROUND: During the most recent decade many Bayesian statistical models and software for answering questions related to the genetic structure underlying population samples have appeared in the scientific literature. Most of these methods utilize molecular markers for the inferences, while some are also capable of handling DNA sequence data. In a number of earlier works, we have introduced an array of statistical methods for population genetic inference that are implemented in the software BAPS. However, the complexity of biological problems related to genetic structure analysis keeps increasing such that in many cases the current methods may provide either inappropriate or insufficient solutions.
Year: 2008 PMID: 19087322 PMCID: PMC2629778 DOI: 10.1186/1471-2105-9-539
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Posterior probabilities of the origins of alleles for an admixed individual from the population labelled C/D, who was assigned into the cluster with green label in the genetic mixture analysis. The posterior probabilities are only shown for the alleles where the log_e Bayes factor for an ancestry deviating from the origin labelled green exceeds the default threshold (2.30). For simplicity of the visualization, the genotype data are assumed ordered, such that the lower and upper panels correspond to chromosome 1 and 2, respectively.
Figure 2. Posterior probabilities of the origins of alleles for an individual with pure ancestry in the population labelled E, who was assigned into the cluster with magenta label in the genetic mixture analysis. The posterior probabilities are only shown for the alleles where the log_e Bayes factor for an ancestry deviating from the origin labelled green exceeds the default threshold (2.30). For simplicity of the visualization, the genotype data are assumed ordered, such that the lower and upper panels correspond to chromosome 1 and 2, respectively.
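The threshold quoted in the captions of Figures 1 and 2 can be read as roughly 10:1 evidence, since 2.30 is approximately ln(10). The following is a minimal sketch of that comparison using the generic definition of a Bayes factor as a ratio of marginal likelihoods. It is not the BAPS implementation; the function names and the likelihood inputs are hypothetical and chosen only for illustration.

```python
import math

# Default threshold quoted in the figure captions; 2.30 is approximately ln(10).
DEFAULT_LOG_BF_THRESHOLD = 2.30


def log_bayes_factor(lik_alternative, lik_reference):
    """Natural-log Bayes factor of an alternative origin against a reference origin.

    Both arguments are assumed (hypothetical) marginal likelihoods of the same
    allele under the two candidate origins; they must be strictly positive.
    """
    return math.log(lik_alternative) - math.log(lik_reference)


def exceeds_threshold(lik_alternative, lik_reference,
                      threshold=DEFAULT_LOG_BF_THRESHOLD):
    """True when the evidence for the alternative origin passes the threshold."""
    return log_bayes_factor(lik_alternative, lik_reference) > threshold
```

Under these assumptions, an allele whose likelihood under the alternative origin is 12.5 times that under the reference origin would be displayed, while a ratio of about 8 would not.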
Figure 3. Posterior estimates of the admixture coefficients for 700 individuals with 377 microsatellite loci simulated using five underlying populations indicated by the black vertical lines (A = Eurasia, B = Africa, C = Oceania, D = East Asia, E = America). The populations with two labels indicate that the individuals are admixed between the two origins (one parent from each population). The populations with four labels indicate that the individuals have ancestry in the corresponding populations (admixed parents). The allele frequencies used in the simulation are the posterior mode estimates under a Dirichlet prior from the human data reported in [32] using the same clusters as in [4].
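The simulation design described in the caption of Figure 3 can be sketched as follows. This is an illustrative sketch only, not the authors' simulation code: the Dirichlet hyperparameters, the function names, and the data layout (one list of allele frequencies per locus) are assumptions. The first function uses the standard closed form for the mode of a Dirichlet posterior; the second draws one allele per locus from each parental population, matching the "one parent from each population" case.

```python
import random


def dirichlet_posterior_mode(counts, alpha):
    """Mode of the Dirichlet(counts + alpha) posterior over allele frequencies.

    The closed form (c_i + a_i - 1) / (sum(c + a) - K) is valid only when
    every c_i + a_i is greater than 1; alpha is an assumed hyperparameter.
    """
    post = [c + a for c, a in zip(counts, alpha)]
    if any(p <= 1 for p in post):
        raise ValueError("mode formula requires every count + alpha > 1")
    total = sum(post) - len(post)
    return [(p - 1) / total for p in post]


def simulate_admixed_genotype(freqs_a, freqs_b, rng):
    """Simulate an individual with one parent from population A and one from B.

    freqs_a and freqs_b hold, for each locus, the allele frequencies in the
    respective population. Returns one (allele_from_a, allele_from_b) pair of
    allele indices per locus.
    """
    genotype = []
    for fa, fb in zip(freqs_a, freqs_b):
        allele_a = rng.choices(range(len(fa)), weights=fa)[0]
        allele_b = rng.choices(range(len(fb)), weights=fb)[0]
        genotype.append((allele_a, allele_b))
    return genotype
```

A call such as `simulate_admixed_genotype(freqs_a, freqs_b, random.Random(1))` yields a reproducible genotype; individuals with admixed parents could be built by applying the same idea one generation earlier.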
Posterior mode clustering of the human data from [34] using the genetic mixture analysis at the sample population level in BAPS.
| Cluster | Included sample populations |
| --- | --- |
| Cluster 1 | Han, Han-NChina, Dai, Daur, Hezhen, Lahu, Miao, Oroqen, She, Tujia, Tu, Xibo, Yi, Mongola, Naxi, Cambodian, Japanese, TundraNentsi, Yakut |
| Cluster 2 | Melanesian, Papuan |
| Cluster 3 | Orcadian, Adygei, Russian, Basque, French, Italian, Sardinian, Tuscan, Mozabite, Bedouin, Druze, Palestinian, Balochi, Brahui, Burusho, Hazara, Kalash, Makrani, Pathan, Sindhi, Uygur |
| Cluster 4 | Kogi, Arhuaco |
| Cluster 5 | TicunaArara, TicunaTarapaca |
| Cluster 6 | BantuSouthAfrica, BantuKenya, Mandenka, Yoruba, BiakaPygmy, MbutiPygmy, San |
| Cluster 7 | Karitiana |
| Cluster 8 | Piapoco, Maya, Chipewyan, Cree, Ojibwa, Kaqchikel, Mixtec, Mixe, Zapotec, Guaymi, Cabecar, Aymara, Huilliche, Guarani, Kaingang, Quechua, Zenu, Inga, Wayuu, Embera, Waunana |
| Cluster 9 | Pima |
| Cluster 10 | Surui |
| Cluster 11 | Ache |
Figure 4. Posterior admixture estimates for the human data reported in [34] based on the optimal genetic mixture estimate with 11 clusters under the BAPS uniform prior clustering model for individuals.