Feng-Hua Lv, Wei-Feng Peng, Ji Yang, Yong-Xin Zhao, Wen-Rong Li, Ming-Jun Liu, Yue-Hui Ma, Qian-Jun Zhao, Guang-Li Yang, Feng Wang, Jin-Quan Li, Yong-Gang Liu, Zhi-Qiang Shen, Sheng-Guo Zhao, Eer Hehua, Neena A Gorkhali, S M Farhad Vahidi, Muhammad Muladno, Arifa N Naqvi, Jonna Tabell, Terhi Iso-Touru, Michael W Bruford, Juha Kantanen, Jian-Lin Han, Meng-Hua Li.
Abstract
Despite much attention, the history of sheep (Ovis aries) evolution, including its dating, demographic trajectory, and geographic spread, remains controversial. To address these questions, we generated 45 complete and 875 partial mitogenomic sequences and performed a meta-analysis of these and published ovine mitochondrial DNA sequences (n = 3,229) across Eurasia. We inferred that O. orientalis and O. musimon share the most recent female ancestor with O. aries at approximately 0.790 Ma (95% CI: 0.637–0.934 Ma) during the Middle Pleistocene, substantially predating the domestication event (∼8–11 ka). By reconstructing historical variations in effective population size, we found evidence of a rapid population increase approximately 20–60 ka, immediately before the Last Glacial Maximum. Analyses of lineage expansions showed two sheep migratory waves at approximately 4.5–6.8 ka (lineages A and B: ∼6.4–6.8 ka; C: ∼4.5 ka) across eastern Eurasia, which could have been influenced by prehistoric West-East commercial trade and deliberate mating of domestic and wild sheep, respectively. A continent-scale examination of lineage diversity and approximate Bayesian computation analyses indicated that the Mongolian Plateau region was a secondary center of dispersal, acting as a "transportation hub" in eastern Eurasia: sheep from the Middle Eastern domestication center were inferred to have migrated through the Caucasus and Central Asia, and arrived in North and Southwest China (lineages A, B, and C) and the Indian subcontinent (lineages B and C) through this region. Our results provide new insights into sheep domestication, particularly with respect to origins and migrations to and from eastern Eurasia.
Keywords: Ovis aries; colonization simulation; domestication; gene flow; meta-analysis; mitogenome; wild ancestor
Year: 2015 PMID: 26085518 PMCID: PMC4576706 DOI: 10.1093/molbev/msv139
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure: Geographic distribution of the samples in this and earlier ovine mtDNA studies.
Figure: Geographic distribution of the five major maternal lineages across Eurasia based on sequences obtained in this study and retrieved from GenBank. (A) Phylogenetic tree inferred from partial control region sequences (left) and lineage composition of sheep in different geographic regions at different time points (right) based on ancient specimens (Cai et al. 2007, 2011; Demirci et al. 2013; Niemi et al. 2013); (B) lineage frequency distribution of partial control region sequences; previously reported lineage frequencies in 12 regions (I–XII) are detailed in supplementary table S17, Supplementary Material online; (C) lineage frequency distribution of partial Cyt-b sequences; (D) geographic distribution of fat-tailed native sheep breeds (regions with black lines) and lineage C (region colored in purple). Pie plots show the proportions of the five distinct lineages (A–E) of domestic sheep in the different geographic regions (for details of the geographic regions, see supplementary tables S5 and S6, Supplementary Material online). In the phylogenetic tree, diagnostic mutations are shown on the branches and are named according to their nucleotide positions relative to the reference sequence AF010406; amino acid replacements are underlined and synonymous replacements are marked in black. Control region mutations (15,437–16,616 bp) are shown in blue. Insertions are indicated by a “+” after the position number, followed by the type of inserted nucleotide(s). Mutations with the prefix “β” indicate identical variable sites found in Meadows et al. (2007), which are used to define the five major lineages.
Figure: Synthetic maps illustrating geographic variation of nucleotide variability for the total lineages and lineages A, B, and C. (A) The total lineages, (B) lineage A, (C) lineage B, and (D) lineage C.
Figure: Phylogeny of domestic and wild sheep inferred from a total of 95 complete mitogenomes (supplementary table S1, Supplementary Material online) using BI and ML methods, with posterior probabilities (the first value) and bootstrap values (the second value) on the nodes, respectively. Divergence times for the lineages (Ma) were estimated based only on the 61 complete mitogenomes of native domestic sheep breeds and wild sheep species (see supplementary table S1, Supplementary Material online).
Number of Parameters Fitted, dN/dS Ratios, Log-Likelihood Scores, and Their Differences under Different Models.
| Model | p | ln L | Models Compared | 2Δln L |
|---|---|---|---|---|
| A: One | 102 | −18,462.81 | | |
| B: Two | 103 | −18,454.79 | | |
| | | | A versus B | 16.04 |
| C: Three | 104 | −18,419.46 | A versus C | 86.70 |
| | | | A versus D | 99.08 |
| | | | B versus C | 70.66 |
| D: Four | 105 | −18,413.27 | B versus D | 83.04 |
| | | | C versus D | 12.38 |
Note.—p, number of parameters in the model; ln L, log-likelihood score; ω, the dN/dS ratio for the branches; ωA, ωB, and ωD are the dN/dS ratios for the branches of lineages A, B, and D, respectively (see supplementary fig. S14, Supplementary Material online); ω is the background dN/dS ratio for the remaining branch(es); 2Δln L, twice the log-likelihood difference between the models compared.
**Very significant (P < 0.01).
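
The 2Δln L values in the table are likelihood-ratio statistics for pairs of nested branch models. As a minimal sketch, assuming each statistic is compared against a chi-square distribution whose degrees of freedom equal the difference in the number of parameters (p), the comparisons can be recomputed from the ln L column. The function names are illustrative, and the chi-square tail is evaluated with a closed form for integer degrees of freedom rather than with any particular phylogenetics package.

```python
import math

# Parameter counts (p) and ln L values copied from the table above.
MODELS = {
    "A": (102, -18462.81),
    "B": (103, -18454.79),
    "C": (104, -18419.46),
    "D": (105, -18413.27),
}


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def lrt(simple, complex_):
    """Return (2*delta lnL, P value) for a simpler model nested in a richer one."""
    p0, lnl0 = MODELS[simple]
    p1, lnl1 = MODELS[complex_]
    stat = 2.0 * (lnl1 - lnl0)
    return stat, chi2_sf(stat, p1 - p0)


PAIRS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
for simple, complex_ in PAIRS:
    stat, p_value = lrt(simple, complex_)
    print(f"{simple} versus {complex_}: 2*dlnL = {stat:.2f}, P = {p_value:.2e}")
```

Under these assumptions, every comparison in the table falls below the P < 0.01 threshold given in the table footnote.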
Divergence Time Estimated by the Sequences of Complete Mitogenomes and the Protein-Coding Genes (synonymous mutation and the third-codon position) Using ML and BI Methods.
| Method | Data Set | Model | Statistic | Node 1 (TC/E), Ma | Node 2 (TA/B), Ma | Node 3 (TAB/D), Ma | Node 4 (TABD/CE), Ma | Node 5, Ma | Node 6, Ma | Node 7, Ma |
|---|---|---|---|---|---|---|---|---|---|---|
| ML | Mitogenome | Global | Time | 0.36 | 0.51 | 0.74 | 0.88 | 2.60 | 3.00 | 7.72 |
| | | | 95% CI | (0.278–0.439) | (0.402–0.616) | (0.613–0.867) | (0.743–1.013) | — | (2.673–3.323) | (6.567–8.883) |
| | | Local | Time | 0.34 | 0.53 | 0.78 | 0.93 | 2.60 | 3.06 | 8.15 |
| | | | 95% CI | (0.276–0.472) | (0.397–0.668) | (0.600–0.956) | (0.721–1.131) | — | (2.697–3.419) | (6.635–9.664) |
| | Synonymous | Global | Time | 0.31 | 0.52 | 0.68 | 0.79 | 2.60 | 2.92 | 8.36 |
| | | | 95% CI | (0.217–0.405) | (0.373–0.661) | (0.536–0.829) | (0.637–0.934) | — | (2.535–3.312) | (6.441–10.286) |
| | | Local | Time | 0.31 | 0.52 | 0.69 | 0.80 | 2.60 | 2.93 | 8.31 |
| | | | 95% CI | (0.200–0.418) | (0.346–0.694) | (0.494–0.887) | (0.583–1.018) | — | (2.453–3.413) | (6.182–10.436) |
| | Third codon | Global | Time | 0.29 | 0.50 | 0.64 | 0.73 | 2.60 | 2.81 | 7.47 |
| | | | 95% CI | (0.190–0.390) | (0.361–0.639) | (0.497–0.783) | (0.579–0.881) | — | (2.418–3.202) | (6.157–8.783) |
| | | Local | Time | 0.29 | 0.50 | 0.64 | 0.73 | 2.60 | 2.81 | 7.47 |
| | | | 95% CI | (0.190–0.390) | (0.359–0.636) | (0.498–0.783) | (0.581–0.882) | — | (2.414–3.200) | (6.158–8.786) |
| BI | Mitogenome | Relaxed-molecular clock | Median | 0.35 | 0.55 | 0.85 | 0.92 | 2.60 | 2.68 | 6.13 |
| | | | 95% HPD | (0.130–0.641) | (0.266–0.913) | (0.390–1.413) | (0.464–1.498) | — | (2.462–3.031) | (2.464–11.618) |
| | Synonymous | | Median | 0.41 | 0.61 | 0.96 | 1.06 | 2.60 | 2.62 | 5.89 |
| | | | 95% HPD | (0.147–0.772) | (0.291–1.013) | (0.478–1.604) | (0.541–1.716) | — | (2.458–3.338) | (5.456–11.598) |
| | Third codon | | Median | 0.36 | 0.57 | 0.87 | 0.94 | 2.60 | 2.62 | 6.49 |
| | | | 95% HPD | (0.142–0.656) | (0.27–0.912) | (0.421–1.428) | (0.472–1.496) | — | (2.461–3.083) | (2.478–12.647) |
Note.—“—,” not available.
Figure: Synthetic maps illustrating geographic variation of eigenvalues (λ) for the first two MDS dimensions (λ1 and λ2) and regression of λ versus geographic distance from the putative original site of the colonization process. (A) Synthetic map for λ1; (B) synthetic map for λ2; (C) regression of λ1 versus geographic distances from the domestication center of sheep (represented by the geographic distance from the Kilis province of Turkey, where ancient domestic sheep are located; Demirci et al. 2013) for Asian populations (r = 0.201; P < 0.05); (D) regression of λ1 (based on lineage A only) versus geographic distances from the domestication center of sheep (represented by the geographic distance from the Kilis province of Turkey, where ancient domestic sheep are located; Demirci et al. 2013) for sheep populations from the Indian subcontinent (r = 0.547; P < 0.01); and (E) regression of λ2 versus geographic distances from a putative “transportation hub” of the Mongolian Plateau region (represented by the geographic distance from the northernmost population [Transbaikal Finewool] sampled) for eastern Eurasian (including China, Mongolia, and India) populations (r = 0.372; P < 0.01).
Sudden Expansion Model Parameters Estimated from the Distribution of Pairwise Differences between Sequences within mtDNA Lineages and Neutrality Tests for Different Lineages.
| Lineage | Area | τ (90% CI) | θ0 | θ1 | SSD | PSSD | rH | PR | ka (90% CI) | Tajima’s D | PD | Fu’s FS | PF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | All | 2.62 (1.740–3.078) | 0.00 | 77.81 | 0.005 | 0.044 | 0.030 | 0.548 | 6.443 (4.279–7.569) | -2.06 | 0.000 | -25.09 | 0.000 |
| | India | 1.64 (0.617–9.394) | 4.95 | 99,999.00 | 0.008 | 0.498 | 0.011 | 0.471 | 4.033 (1.517–23.100) | -1.08 | 0.103 | -24.68 | 0.000 |
| | East Asia^a | 2.39 (2.121–2.717) | 0.01 | 99,999.00 | 0.000 | 0.437 | 0.039 | 0.246 | 5.877 (5.216–6.681) | -2.33 | 0.000 | -26.50 | 0.000 |
| | Middle East | 2.50 (1.969–3.004) | 0.00 | 99,999.00 | 0.006 | 0.072 | 0.053 | 0.103 | 6.148 (4.842–7.387) | -2.16 | 0.001 | -26.97 | 0.000 |
| | Europe | 3.66 (2.061–5.852) | 0.53 | 41.25 | 0.005 | 0.541 | 0.028 | 0.558 | 9.000 (5.068–14.390) | -1.04 | 0.129 | -4.89 | 0.011 |
| B | All | 2.77 (1.424–3.947) | 0.04 | 16.63 | 0.001 | 0.783 | 0.021 | 0.759 | 6.811 (3.502–9.706) | -2.24 | 0.000 | -25.48 | 0.000 |
| | India | 1.38 (0.391–6.633) | 2.54 | 99,999.00 | 0.006 | 0.474 | 0.020 | 0.594 | 3.393 (0.961–16.311) | -0.83 | 0.190 | -12.88 | 0.001 |
| | East Asia^a | 2.85 (1.232–6.648) | 1.03 | 10.45 | 0.002 | 0.743 | 0.012 | 0.893 | 7.008 (3.030–16.348) | -1.93 | 0.004 | -26.14 | 0.000 |
| | Middle East | 3.00 (1.621–3.949) | 0.00 | 32.66 | 0.001 | 0.673 | 0.023 | 0.685 | 7.377 (3.986–9.711) | -2.28 | 0.000 | -26.66 | 0.000 |
| | Europe | 2.85 (1.232–6.648) | 1.03 | 10.45 | 0.002 | 0.743 | 0.012 | 0.893 | 6.000 (5.383–6.711) | -1.93 | 0.004 | -26.14 | 0.000 |
| C | All | 1.85 (0.977–2.705) | 0.00 | 15.77 | 0.000 | 0.944 | 0.031 | 0.784 | 4.549 (2.402–6.652) | -2.41 | 0.000 | -27.22 | 0.000 |
| | East Asia^a | 2.02 (0.770–3.189) | 0.00 | 11.08 | 0.002 | 0.675 | 0.030 | 0.780 | 4.967 (1.893–7.842) | -2.02 | 0.004 | -24.70 | 0.000 |
| | Middle East | 1.59 (1.146–2.082) | 0.00 | 99,999.00 | 0.000 | 0.829 | 0.052 | 0.406 | 3.910 (2.818–5.120) | -2.24 | 0.001 | -27.33 | 0.000 |
Note.—τ = 2ut, where u is the mutation rate for the haplotypes and t is the time in generations; 90% CI, 90% confidence interval obtained by parametric bootstrapping with 1,000 resamplings; θ0 = 2uN0, where N0 is the initial effective population size; θ1 = 2uN1, where N1 is the final effective population size; SSD, sum of squared deviations between the observed and the expected mismatch distributions; PSSD, PR, PD, and PF are the corresponding significance values for SSD, rH, Tajima’s D (Tajima 1989), and Fu’s FS (Fu 1997); rH, Harpending’s raggedness index (Harpending 1994).
^a East Asia includes the regions of Northern China and the Mongolian Plateau (for details, see supplementary table S5, Supplementary Material online).
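
The expansion times in the ka column follow from τ = 2ut, that is, t = τ/(2u). The mutation rate and generation interval used for the conversion are not stated in this excerpt, so the sketch below instead infers a single scaling factor from one row of the table (lineage A, All: τ = 2.62 corresponds to 6.443 ka) and applies it to other τ values. The constant and function names are illustrative assumptions, not the authors' procedure.

```python
# Scaling factor (ka per unit of tau), inferred from the lineage A "All" row
# of the table above. It stands in for the generation interval divided by 2u,
# neither of which is stated in this excerpt.
KA_PER_TAU = 6.443 / 2.62


def tau_to_ka(tau):
    """Convert a mismatch-distribution tau into an expansion time in ka."""
    return tau * KA_PER_TAU


# (tau, reported ka) pairs copied from the table for comparison.
REPORTED = {
    "B, All": (2.77, 6.811),
    "C, All": (1.85, 4.549),
    "A, Europe": (3.66, 9.000),
}

for label, (tau, ka) in REPORTED.items():
    print(f"{label}: tau = {tau:.2f} -> {tau_to_ka(tau):.3f} ka (table: {ka:.3f} ka)")
```

A single factor is enough here because u and the generation interval are constants in τ = 2ut, so time scales linearly with τ.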
Figure: (A) The major migratory events and routes of the major sheep lineages across Eurasia (lineage A, in blue; lineage B, in red; and lineage C, in yellow). Migration routes reported in previous studies are also included: (1) the Mediterranean routes (Ryder 1984), (2) the Danubian route (Ryder 1984), (3) the northern Europe route (Tapio et al. 2006), (4) the ancient sea trade route to the Indian subcontinent (Singh et al. 2013), and (5) routes of introduction and spread of sheep pastoralism in Africa (Muigai and Hanotte 2013); (B) the optimal population colonization models for the A/B/C lineages in DIYABC v.2.0.4 (Cornuet et al. 2014) (supplementary information S5, Supplementary Material online). The colors used in the two panels are unrelated to each other.
Results of Model Choice for the Colonization Scenarios of Sheep Lineages A, B, and C Tested in the ABC Analyses.
| Scenarios | Posterior Probability | 95% Confidence Interval |
|---|---|---|
| The models of lineage A in the first step | | |
| A1-1 | 0.120 | 0.015–0.224 |
| | | |
| A1-3 | 0.038 | 0.020–0.056 |
| The models of lineage A in the second step | | |
| A2-1 | 0.215 | 0.168–0.263 |
| A2-2 | 0.060 | 0.016–0.105 |
| | | |
| The models of lineage B in the first step | | |
| B1-1 | 0.253 | 0.245–0.261 |
| B1-2 | 0.258 | 0.245–0.272 |
| | | |
| The models of lineage B in the second step | | |
| B2-1 | 0.043 | 0.038–0.048 |
| | | |
| B2-3 | 0.004 | 0.003–0.004 |
| B2-4 | 0.005 | 0.005–0.006 |
| B2-5 | 0.009 | 0.008–0.010 |
| The models of lineage C | | |
| C-1 | 0.158 | 0.149–0.167 |
| C-2 | 0.079 | 0.069–0.088 |
| C-3 | 0.207 | 0.200–0.215 |
| C-4 | 0.138 | 0.128–0.147 |
| | | |
Note.—The best-supported model with the highest posterior probability is indicated in italics.