Yuichiro Hara, Tadashi Imanishi, Yoko Satta.
Abstract
The demographic history of humans would provide helpful information for identifying the evolutionary events that shaped humanity, but it remains controversial even in the genomic era. To settle the controversies, we inferred the speciation times (T) and ancestral population sizes (N) in the lineage leading to humans and great apes based on whole-genome alignments. A coalescence simulation determined the sizes of alignment blocks and the intervals between them required to obtain recombination-free blocks at a high frequency. This simulation revealed that block size strongly affects parameter inference, indicating that recombination is an important factor in achieving optimal parameter inference. From the whole-genome alignments (1.9 gigabases) of human (H), chimpanzee (C), gorilla (G), and orangutan (O), 100-bp alignment blocks separated by ≥5-kb intervals were sampled and used to estimate τ = μT and θ = 4μgN with the Markov chain Monte Carlo method, where μ is the mutation rate and g is the generation time. Although the estimated τ(HC) differed across chromosomes, τ(HC) and τ(HCG) were strongly correlated across chromosomes, indicating that variation in τ reflects variation in μ rather than in T, and thus that all chromosomes share a single speciation time. Subsequently, we estimated the times at which the human lineage diverged from chimpanzee, gorilla, and orangutan to be 6.0-7.6, 7.6-9.7, and 15-19 Ma, respectively, assuming variable μ across lineages and chromosomes. These speciation times were consistent with the fossil record. We conclude that the speciation times from our recombination-free analysis are conclusive and that the speciation between human and chimpanzee was a single event.
Year: 2012 PMID: 22975719 PMCID: PMC3752010 DOI: 10.1093/gbe/evs075
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
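The sampling scheme described in the abstract (100-bp blocks separated by at least 5-kb intervals) can be illustrated with a minimal sketch. It assumes a single contiguous alignment indexed from 0 and uses the smallest allowed gap; the function name `sample_blocks` and its parameters are hypothetical and are not taken from the authors' pipeline, which may select blocks differently.

```python
def sample_blocks(alignment_length, block_size=100, min_gap=5000):
    """Return half-open (start, end) coordinates of sampled blocks.

    Consecutive blocks are separated by exactly `min_gap` columns,
    the smallest spacing that satisfies the ">= 5 kb" condition.
    Assumes one contiguous alignment (an illustrative simplification).
    """
    blocks = []
    start = 0
    while start + block_size <= alignment_length:
        blocks.append((start, start + block_size))
        # Skip the block itself plus the required interval.
        start += block_size + min_gap
    return blocks


if __name__ == "__main__":
    blocks = sample_blocks(1_000_000)
    print(len(blocks), blocks[:3])
```

Keeping the blocks short and widely spaced is what the abstract relies on to make each block effectively recombination-free and adjacent blocks close to independent.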
Estimated Parameters for Each Chromosomal Alignment Set
| Regions | Alignment Length (Mb) | τHC | τHCG | τHCGO | θHC | θHCG | θHCGO | Branch Length (c) | −ln L |
|---|---|---|---|---|---|---|---|---|---|
| Whole genome (d) | 40.8 | 0.00330 | 0.00423 | 0.00819 | 0.00264 | 0.00229 | 0.00709 | 0.0352 | −66,035,045 |
| Autosomes (d) | 38.7 | 0.00326 | 0.00423 | 0.00835 | 0.00286 | 0.00223 | 0.00659 | 0.0355 | −62,585,914 |
| Chr. 1 | 3.28 | 0.00313 | 0.00408 | 0.00778 | 0.00270 | 0.00216 | 0.00691 | 0.0335 | −5,289,418 |
| Chr. 2 | 3.49 | 0.00327 | 0.00426 | 0.00827 | 0.00284 | 0.00223 | 0.00671 | 0.0350 | −5,650,076 |
| Chr. 3 | 2.92 | 0.00324 | 0.00429 | 0.00837 | 0.00303 | 0.00225 | 0.00679 | 0.0352 | −4,736,827 |
| Chr. 4 | 2.79 | 0.00345 | 0.00434 | 0.00886 | 0.00288 | 0.00254 | 0.00676 | 0.0368 | −4,545,603 |
| Chr. 5 | 2.61 | 0.00327 | 0.00436 | 0.00855 | 0.00316 | 0.00221 | 0.00644 | 0.0354 | −4,243,851 |
| Chr. 6 | 2.43 | 0.00307 | 0.00423 | 0.00837 | 0.0036 | 0.00225 | 0.00629 | 0.0346 | −3,934,109 |
| Chr. 7 | 2.13 | 0.00345 | 0.00426 | 0.00826 | 0.00221 | 0.00227 | 0.00705 | 0.0352 | −3,458,675 |
| Chr. 8 | 2.09 | 0.00352 | 0.00453 | 0.00899 | 0.00311 | 0.00242 | 0.00650 | 0.0372 | −3,419,007 |
| Chr. 9 | 1.62 | 0.00332 | 0.00431 | 0.00786 | 0.00282 | 0.00215 | 0.00678 | 0.0340 | −2,620,109 |
| Chr. 10 | 1.86 | 0.00331 | 0.00422 | 0.00827 | 0.00263 | 0.00228 | 0.00723 | 0.0353 | −3,027,929 |
| Chr. 11 | 1.86 | 0.00324 | 0.00415 | 0.00827 | 0.00271 | 0.00233 | 0.00692 | 0.0348 | −3,022,784 |
| Chr. 12 | 1.93 | 0.00317 | 0.00408 | 0.00818 | 0.00264 | 0.00234 | 0.00671 | 0.0342 | −3,127,323 |
| Chr. 13 | 1.43 | 0.00321 | 0.00434 | 0.00878 | 0.00364 | 0.00239 | 0.00660 | 0.0361 | −2,336,632 |
| Chr. 14 | 1.29 | 0.00310 | 0.00413 | 0.00792 | 0.00296 | 0.00220 | 0.00721 | 0.0344 | −2,093,555 |
| Chr. 15 | 1.14 | 0.00317 | 0.00421 | 0.00786 | 0.00324 | 0.00216 | 0.00731 | 0.0343 | −1,850,007 |
| Chr. 16 | 1.07 | 0.00360 | 0.00459 | 0.00870 | 0.00277 | 0.00238 | 0.00730 | 0.0374 | −1,755,416 |
| Chr. 17 | 1.09 | 0.00294 | 0.00385 | 0.00726 | 0.00264 | 0.00195 | 0.00827 | 0.0330 | −1,752,968 |
| Chr. 18 | 1.11 | 0.00330 | 0.00433 | 0.00880 | 0.00342 | 0.00243 | 0.00616 | 0.0358 | −1,801,757 |
| Chr. 19 | 0.725 | 0.00315 | 0.00401 | 0.00751 | 0.00285 | 0.00237 | 0.00895 | 0.0350 | −1,173,725 |
| Chr. 20 | 0.878 | 0.00310 | 0.00415 | 0.00798 | 0.00325 | 0.00225 | 0.00706 | 0.0344 | −1,418,404 |
| Chr. 21 | 0.466 | 0.00337 | 0.00444 | 0.00910 | 0.00370 | 0.00278 | 0.00665 | 0.0376 | −761,073 |
| Chr. 22 | 0.454 | 0.00299 | 0.00401 | 0.00796 | 0.00328 | 0.00241 | 0.00744 | 0.0348 | −733,646 |
| Chr. X (e) | 2.07 | 0.00277 | 0.00371 | 0.00637 | 0.00153 | 0.00171 | 0.00627 | 0.0295 | −3,286,104 |
| Coding regions (d) | 2.37 | 0.00156 | 0.00249 | 0.00418 | 0.00367 | 0.00137 | 0.00552 | 0.0213 | −3,632,995 |
| FFD (f) 3rd positions | 0.351 | 0.00437 | 0.00548 | 0.0135 | 0.00401 | 0.00480 | 0.00708 | 0.05359 | −598,524 |
(a) The 95% CI of each estimated parameter and the estimates based on sample 2 are shown in supplementary table S2, Supplementary Material online.
(b) Analyzed with method (2), assuming heterogeneity of mutation rates across lineages (see Materials and Methods), except for the regions marked with footnote d.
(c) Average of the sum of the branch lengths at each locus.
(d) Analyzed with method (3), assuming heterogeneity of mutation rates across lineages and chromosomes (see Materials and Methods).
(e) θ = 3μgN for the X chromosome.
(f) Four-fold degenerate sites at third codon positions.
Estimated Speciation Times and Ancestral Population Sizes
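| μ (per site per year) | THC (Ma) | THCG (Ma) | THCGO (Ma) | NHC | NHCG | NHCGO | NHC (X chr.) | NHCG (X chr.) | NHCGO (X chr.) |
|---|---|---|---|---|---|---|---|---|---|
| 0.436 × 10⁻⁹ | 7.57 | 9.70 | 18.8 | 75,600 | 65,500 | 203,000 | 43,800 | 49,200 | 180,000 |
| 0.556 × 10⁻⁹ | 5.94 | 7.61 | 14.7 | 59,300 | 51,400 | 159,000 | 34,300 | 38,500 | 141,000 |
| 1.00 × 10⁻⁹ (a) | 3.30 | 4.23 | 8.19 | 33,000 | 28,600 | 88,600 | 19,100 | 21,400 | 78,400 |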
(a) The value traditionally used. This value was not used for the conclusive estimation.
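The relations τ = μT and θ = 4μgN from the abstract can be inverted to turn the whole-genome estimates into times and population sizes. The sketch below does this for the three mutation rates in the table above. Two things are assumptions rather than statements from the text: the generation time g = 20 years (chosen because it makes θ = 0.00264 with μ = 1.00 × 10⁻⁹ give N = 33,000, as tabulated), and the reading of the third to eighth columns of the first table as τHC, τHCG, τHCGO, θHC, θHCG, θHCGO. The function names are hypothetical.

```python
# Assumption: generation time in years (not stated in the text above).
G_YEARS = 20.0


def speciation_time_ma(tau, mu_per_year):
    """Invert tau = mu * T; return T in millions of years (Ma)."""
    return tau / mu_per_year / 1e6


def ancestral_size(theta, mu_per_year, g_years=G_YEARS):
    """Invert theta = 4 * mu * g * N; return N."""
    return theta / (4.0 * mu_per_year * g_years)


# Whole-genome estimates (assumed column assignment, see lead-in).
TAU = {"HC": 0.00330, "HCG": 0.00423, "HCGO": 0.00819}
THETA = {"HC": 0.00264, "HCG": 0.00229, "HCGO": 0.00709}

if __name__ == "__main__":
    for mu in (0.436e-9, 0.556e-9, 1.00e-9):
        times = {k: round(speciation_time_ma(v, mu), 2) for k, v in TAU.items()}
        sizes = {k: round(ancestral_size(v, mu)) for k, v in THETA.items()}
        print(f"mu={mu:.3e}  T (Ma)={times}  N={sizes}")
```

Because the inputs are the rounded values from the table, the computed population sizes agree with the tabulated ones only to within rounding (a fraction of a percent).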
Figure. (A) Number of genealogies in a block under each block-size condition, with the interval between blocks set at 5 kb. Recombination hot spots were assumed to cover 10% of the genome (see text). Results for different proportions of the two recombination rates are shown in supplementary figure S1, Supplementary Material online. (B) Number of blocks sharing a genealogy with an adjacent block under each interval-length condition, with the block size set at 100 bp. These values are averages over 1,000 replications of the coalescence simulation. (C–H) The estimated θs and τs from simulated sequences. Each boxplot shows, from bottom to top, the averages over 20 replications of the 2.5th percentile, lower quartile, median, upper quartile, and 97.5th percentile. Each X marks the median of one of the 20 replications. Dotted lines represent the true values. Under each condition, asterisks indicate that the true value lies outside the 95th-percentile range, and daggers indicate that the true value is smaller or larger than all of the medians in the 20 replications.
Figure. Plots and a regression line between τHC and τHCG for each chromosome: τHC and τHCG from the original sample (A), τHC from the original sample and τHCG from sample 2 (B), and τHC from sample 2 and τHCG from the original sample (C). Diamonds represent autosomes, and a cross, a triangle, and a square represent the X chromosome, coding regions, and 4-fold degenerate sites at third codon positions, respectively. The regression line was calculated for the autosomes and the X chromosome and is shown with its formula and the square of its correlation coefficient.
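The kind of regression described in this caption can be sketched from the per-chromosome values in the first table. The example fits an ordinary least-squares line of τHCG on τHC for the 22 autosomes and the X chromosome and reports the squared correlation coefficient. It assumes that the third and fourth columns of that table are τHC and τHCG, and it is not guaranteed to reproduce the formula printed in the figure, since the rounding and exact inputs used there are not given here. The function name `least_squares` is hypothetical.

```python
# Per-chromosome estimates, Chr. 1-22 followed by Chr. X
# (assumed to be the tau_HC and tau_HCG columns of the first table).
TAU_HC = [
    0.00313, 0.00327, 0.00324, 0.00345, 0.00327, 0.00307, 0.00345, 0.00352,
    0.00332, 0.00331, 0.00324, 0.00317, 0.00321, 0.00310, 0.00317, 0.00360,
    0.00294, 0.00330, 0.00315, 0.00310, 0.00337, 0.00299, 0.00277,
]
TAU_HCG = [
    0.00408, 0.00426, 0.00429, 0.00434, 0.00436, 0.00423, 0.00426, 0.00453,
    0.00431, 0.00422, 0.00415, 0.00408, 0.00434, 0.00413, 0.00421, 0.00459,
    0.00385, 0.00433, 0.00401, 0.00415, 0.00444, 0.00401, 0.00371,
]


def least_squares(x, y):
    """Return (slope, intercept, r_squared) of the OLS line y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    slope, intercept, r2 = least_squares(TAU_HC, TAU_HCG)
    print(f"tau_HCG = {slope:.3f} * tau_HC + {intercept:.5f}, r^2 = {r2:.3f}")
```

If τ varies across chromosomes only through μ while T is shared, τHCG should scale with τHC across chromosomes, which is the pattern the abstract cites as evidence for a single speciation time.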
Estimated Relative Ratios of the Mutation Rates to μH
| Regions | Relative Ratio to μH | | | | | | |
|---|---|---|---|---|---|---|---|
| Whole genome | 1 | 1.004 | 1.034 | 1.091 | 1.005 | 1.025 | 1.091 |
| X chromosome | 1 | 0.9965 | 1.073 | 1.159 | 1.001 | 1.070 | 1.159 |
Figure. Relationship between the estimated speciation times and the fossil records of ancestral great apes (Sawada et al. 1998; Ishida et al. 1999; Gabunia et al. 2001; Haile-Selassie 2001; Brunet et al. 2005; Kunimatsu et al. 2007; Suwa et al. 2007; Wood 2010); for details, see Discussion. Dotted lines represent the upper and lower bounds of the 95th percentiles of the estimated speciation times (orange for THC and purple for THCG) (table 3).