Javier Prado-Martinez, Peter H Sudmant, Jeffrey M Kidd, Heng Li, Joanna L Kelley, Belen Lorente-Galdos, Krishna R Veeramah, August E Woerner, Timothy D O'Connor, Gabriel Santpere, Alexander Cagan, Christoph Theunert, Ferran Casals, Hafid Laayouni, Kasper Munch, Asger Hobolth, Anders E Halager, Maika Malig, Jessica Hernandez-Rodriguez, Irene Hernando-Herraez, Kay Prüfer, Marc Pybus, Laurel Johnstone, Michael Lachmann, Can Alkan, Dorina Twigg, Natalia Petit, Carl Baker, Fereydoun Hormozdiari, Marcos Fernandez-Callejo, Marc Dabad, Michael L Wilson, Laurie Stevison, Cristina Camprubí, Tiago Carvalho, Aurora Ruiz-Herrera, Laura Vives, Marta Mele, Teresa Abello, Ivanela Kondova, Ronald E Bontrop, Anne Pusey, Felix Lankester, John A Kiyang, Richard A Bergl, Elizabeth Lonsdorf, Simon Myers, Mario Ventura, Pascal Gagneux, David Comas, Hans Siegismund, Julie Blanc, Lidia Agueda-Calpena, Marta Gut, Lucinda Fulton, Sarah A Tishkoff, James C Mullikin, Richard K Wilson, Ivo G Gut, Mary Katherine Gonder, Oliver A Ryder, Beatrice H Hahn, Arcadi Navarro, Joshua M Akey, Jaume Bertranpetit, David Reich, Thomas Mailund, Mikkel H Schierup, Christina Hvilsom, Aida M Andrés, Jeffrey D Wall, Carlos D Bustamante, Michael F Hammer, Evan E Eichler, Tomas Marques-Bonet.
Abstract
Most great ape genetic variation remains uncharacterized; however, its study is critical for understanding population history, recombination, selection and susceptibility to disease. Here we sequence to high coverage a total of 79 wild- and captive-born individuals representing all six great ape species and seven subspecies and report 88.8 million single nucleotide polymorphisms. Our analysis provides support for genetically distinct populations within each species, signals of gene flow, and the split of common chimpanzees into two distinct groups: Nigeria-Cameroon/western and central/eastern populations. We find extensive inbreeding in almost all wild populations, with eastern gorillas being the most extreme. Inferred effective population sizes have varied radically over time in different lineages and this appears to have a profound effect on the genetic diversity at, or close to, genes in almost all species. We discover and assign 1,982 loss-of-function variants throughout the human and great ape lineages, determining that the rate of gene loss has not been different in the human branch compared to other internal branches in the great ape phylogeny. This comprehensive catalogue of great ape genome diversity provides a framework for understanding evolution and a resource for more effective management of wild and captive great ape populations.
Year: 2013 PMID: 23823723 PMCID: PMC3822165 DOI: 10.1038/nature12228
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962
Genetic variation summary by species and subspecies
Summary statistics for each species and subspecies.
| Genus | Common name | N | Mean coverage | Fixed sites to human reference | No. of SNVs | Mean SNVs per individual | No. of singletons | Ancestry informative markers (AIMs) | Ne (×10⁻³) |
|---|---|---|---|---|---|---|---|---|---|
| Homo | Non-African | 6 | 18.3 | 386,974 | 5,887,443 | 2,639,546 | 1,379,448 | 12,316 | 9.7 – 19.5 |
| Homo | African | 3 | 20.9 | 632,253 | 6,309,453 | 3,203,178 | 2,448,454 | 12,316 | 13.9 – 27.9 |
| Homo | Humans | 9 | 19.2 | 224,660 | 9,172,573 | 3,061,604 | 3,827,902 | - | 13.1 – 16.2 |
| Pan | Nigeria-Cameroon | 10 | 16.7 | 25,017,403 | 12,605,585 | 4,816,435 | 2,695,109 | 2,213 | 18.5 – 37.0 |
| Pan | Eastern | 6 | 28.7 | 25,126,506 | 11,264,879 | 4,843,530 | 2,228,396 | 1,265 | 19.7 – 39.5 |
| Pan | Central | 4 | 23.8 | 25,080,750 | 11,820,858 | 4,983,933 | 3,948,347 | 619 | 24.4 – 48.7 |
| Pan | Western | 4 | 27.3 | 26,832,247 | 4,729,933 | 2,411,501 | 1,481,079 | 145,548 | 9.8 – 19.5 |
| Pan | Common Chimpanzees | 24 | 22.5 | 24,087,088 | 27,153,659 | 5,693,903 | 10,352,931 | 149,645 | 30.9 – 61.8 |
| Pan | Bonobos | 13 | 27.5 | 27,068,299 | 8,950,002 | 2,738,755 | 3,159,889 | - | 11.9 – 23.8 |
| Gorilla | Eastern lowland | 3 | 22.8 | 34,537,496 | 3,866,117 | 2,578,328 | 484,482 | 317,028 | 12.2 – 24.3 |
| Gorilla | Cross River | 1 | 17.6 | 35,553,861 | 2,585,360 | 2,585,360 | 165,482 | 35,693 | 14.9 – 29.8 |
| Gorilla | Western lowland | 23 | 17.8 | 31,602,620 | 17,314,403 | 6,410,662 | 2,797,388 | 19,902 | 26.8 – 53.5 |
| Gorilla | Gorillas | 27 | 18.3 | 31,376,203 | 19,177,989 | 6,492,831 | 3,447,352 | 372,623 | 28.4 – 56.9 |
| Pongo | Sumatran | 5 | 28.7 | 62,880,923 | 14,543,573 | 7,263,256 | 5,681,303 | 1,132,808 | 27.5 – 55.0 |
| Pongo | Bornean | 5 | 25.8 | 64,249,235 | 10,321,213 | 5,763,354 | 3,555,596 | 1,132,808 | 19.5 – 39.0 |
| Pongo | Orangutans | 10 | 27.3 | 60,661,869 | 24,309,920 | 9,338,148 | 6,409,648 | | 42.3 – 84.6 |
| | All | 83 | 23.0 | 83,954,672 | 83,580,213 | - | - | - | |
Polymorphic variants found in each species/subspecies after subtracting fixed sites.
Singletons and doubletons calculated combining all the samples within the species.
Variants only found in a single group within each species.
Calculated from Watterson's estimator θw, with μ = 0.5×10⁻⁹ to 1×10⁻⁹ mutations·bp⁻¹·yr⁻¹ and generation time g = 25 years for Homo and Pan, 19 for Gorilla and 26 for Pongo.
Hybrid sample Donald and 4 related gorillas were excluded.
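The Ne ranges in the table can be understood through the standard relationship θw = 4·Ne·μ·g, where μ·g is the per-generation mutation rate. The following is a minimal sketch of that calculation, not the study's pipeline: the function names, the callable-genome length argument and any input values are hypothetical and would have to come from the actual call set. Because the two mutation rates in the footnote differ by a factor of two, the upper bound of each range is about twice the lower bound (for example 9.7 – 19.5).

```python
def harmonic_number(k: int) -> float:
    """Return 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(segregating_sites: int, n_chromosomes: int, callable_bp: float) -> float:
    """Watterson's estimator per base pair: S / (a_n * L).

    n_chromosomes is the number of sampled haploid genomes
    (2 x the number of diploid individuals); callable_bp is the
    number of sites at which variants could be called (assumed input).
    """
    a_n = harmonic_number(n_chromosomes - 1)
    return segregating_sites / (a_n * callable_bp)


def effective_size(theta_w: float, mu_per_year: float, generation_years: float) -> float:
    """Solve theta_w = 4 * Ne * mu * g for Ne."""
    return theta_w / (4.0 * mu_per_year * generation_years)


def ne_range(theta_w: float, generation_years: float,
             mu_high: float = 1e-9, mu_low: float = 0.5e-9) -> tuple:
    """Return (lower, upper) Ne for the two mutation rates in the footnote."""
    return (effective_size(theta_w, mu_high, generation_years),
            effective_size(theta_w, mu_low, generation_years))
```

Dividing both bounds by 1,000 puts them on the scale of the last table column.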
Figure 1. Samples, heterozygosity and genetic diversity
a. Geographical distribution of the great ape populations sequenced in this study across Indonesia and Africa. The formation of the islands of Borneo and Sumatra resulted in the speciation of the two corresponding orangutan populations. The Sanaga River forms a natural boundary between Nigeria-Cameroon and Central chimpanzee populations, while the Congo River separates the bonobo population from the Central and Eastern chimpanzees. Eastern lowland and Western lowland gorillas are separated from each other by a large geographical distance.
b. Heterozygosity estimates for each species and subspecies are superimposed onto a neighbor-joining tree built from genome-wide genetic distance estimates. Arrows indicate heterozygosities previously reported[30] for Western and Central chimpanzee populations.
c. Runs of homozygosity among great apes. The relationship between the coefficient of inbreeding (FROH) and the number of autozygous segments longer than 1 Mbp is shown. Bonobos and Eastern lowland gorillas show an excess of inbreeding compared to the other great apes, suggesting small population sizes or a fragmented population.
d. Genetic structure based on clustering of great apes. All individuals (columns) are grouped into different clusters (K=2 to K=6, rows), colored by species and according to their common genetic structure. Most captive individuals, labeled on top, show a complex admixture from different wild populations. A signature of admixture is clearly observed, for example, in the known hybrid Donald, a second-generation captive individual for whom we predict 15% Central chimpanzee admixture on a Western background, consistent with its pedigree. A gray line at the bottom denotes new groups at K=6 in agreement with the location of origin or ancestral admixture.
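Panel c plots FROH against the number of autozygous segments longer than 1 Mbp. The caption does not spell out how FROH was computed, so the sketch below assumes the commonly used definition: the fraction of the autosomal genome covered by runs of homozygosity above the length threshold. The function name, the interval representation and the autosome length argument are hypothetical.

```python
MIN_ROH_BP = 1_000_000  # 1 Mbp threshold named in the caption


def froh(roh_segments, autosome_bp, min_len=MIN_ROH_BP):
    """Return (FROH, number of long runs) for one individual.

    roh_segments: iterable of (start, end) half-open intervals in bp,
    assumed non-overlapping, one per detected run of homozygosity.
    autosome_bp: total autosomal length considered (assumed input).
    Only runs strictly longer than min_len are counted.
    """
    long_runs = [end - start for start, end in roh_segments
                 if end - start > min_len]
    return sum(long_runs) / autosome_bp, len(long_runs)
```

For instance, two runs of 2 Mbp and 3 Mbp in a 100 Mbp toy genome give FROH = 0.05 with two long segments; shorter runs are ignored.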
Figure 2. Inferred population history
Population splits and effective population sizes (Ne) during great ape evolution. Split times (dark brown) and divergence times (light brown) are plotted as a function of divergence (d) on the bottom axis and time on the top axis. Time is estimated using a single mutation rate (μ) of 1×10⁻⁹ mutations/(bp·year). The ancestral and current effective population sizes are also estimated using this mutation rate. The results from the several methods used to estimate Ne (COALHMM, ILS COALHMM, PSMC and ABC) are colored in orange, purple, blue and green, respectively. The chimpanzee split times are estimated using the ABC method. The x-axis is rescaled for divergences larger than 2×10⁻³ to provide more resolution for recent splits. All the values used in this figure can be found in Table S5.
Figure 3. PSMC analysis
Inferred historical population sizes by PSMC. The lower x-axis gives time measured by pairwise sequence divergence and the y-axis gives the effective population size measured by the scaled mutation rate. The upper x-axis indicates scaling in years, assuming a mutation rate ranging from 10⁻⁹ to 5×10⁻¹⁰ per site per year. The top left panel shows the inference for modern human populations. In the other three panels, thin light lines of the same color correspond to PSMC inferences on 100 rounds of bootstrapped sequences.
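The two x-axes are related by the mutation rate. As a rough sketch, assuming the usual PSMC convention that pairwise sequence divergence is d = 2·μ·T (both lineages accumulate mutations since time T), a divergence value can be converted to years as follows. The function names are hypothetical and the defaults are simply the two rates quoted in the caption.

```python
def divergence_to_years(d: float, mu_per_year: float) -> float:
    """Convert pairwise divergence d = 2 * mu * T into T (years)."""
    return d / (2.0 * mu_per_year)


def year_range(d: float, mu_fast: float = 1e-9, mu_slow: float = 5e-10) -> tuple:
    """Return (youngest, oldest) age in years for the two mutation rates.

    The faster rate gives the younger date, the slower rate the older one.
    """
    return divergence_to_years(d, mu_fast), divergence_to_years(d, mu_slow)
```

Under these assumptions a divergence of 10⁻³ corresponds to roughly 0.5 to 1 million years, so halving the mutation rate doubles every date on the upper axis.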