Francesca Gandini, Alessandro Achilli, Maria Pala, Martin Bodner, Stefania Brandini, Gabriela Huber, Balazs Egyed, Luca Ferretti, Alberto Gómez-Carballa, Antonio Salas, Rosaria Scozzari, Fulvio Cruciani, Alfredo Coppa, Walther Parson, Ornella Semino, Pedro Soares, Antonio Torroni, Martin B Richards, Anna Olivieri.
Abstract
Rare mitochondrial lineages with relict distributions can sometimes be disproportionately informative about deep events in human prehistory. We have studied one such lineage, haplogroup R0a, which uniquely is most frequent in Arabia and the Horn of Africa, but is distributed much more widely, from Europe to India. We conclude that: (1) the lineage ancestral to R0a is more ancient than previously thought, with a relict distribution across the Mediterranean/Southwest Asia; (2) R0a has a much deeper presence in Arabia than previously thought, highlighting the role of at least one Pleistocene glacial refugium, perhaps on the Red Sea plains; (3) the main episode of dispersal into Eastern Africa, at least concerning maternal lineages, was at the end of the Late Glacial, due to major expansions from one or more refugia in Arabia; (4) there was likely a minor Late Glacial/early postglacial dispersal from Arabia through the Levant and into Europe, possibly alongside other lineages from a Levantine refugium; and (5) the presence of R0a in Southwest Arabia in the Holocene at the nexus of a trading network that developed after ~3 ka between Africa and the Indian Ocean led to some gene flow even further afield, into Iran, Pakistan and India.
Year: 2016 PMID: 27146119 PMCID: PMC4857117 DOI: 10.1038/srep25472
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Maximum-parsimony phylogenetic tree of 202 complete mtDNA sequences belonging to haplogroup R0a.
Three R0b sequences are also included. Each circle represents a mitogenome and numbers are the same as those in Table S1. Mutations are shown on the branches (relative to rCRS); they are transitions unless the base change is explicitly indicated. Suffixes indicate: transversions (to A, G, C, or T), deletions (d), heteroplasmies (R and Y) and reversions (@). Insertions are also suffixed with a dot followed by a number indicating how many bases were inserted and the inserted nucleotide/s (.1C). Recurrent mutations are underlined. The variation at np 16519, in the number of Cs at nps 309 and 315 as well as the AC indels at nps 515–522 were not included in the phylogeny. All the samples are coloured according to their geographic origin as shown in the legend. ML age estimates are reported in ka for nodes encompassing at least three mitogenomes, except for R0a5 (two mitogenomes), which is extremely rare.
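The branch-label notation described in this caption can be illustrated with a small parser. This is only a sketch: the label grammar is inferred from the caption, the function name `classify_mutation` and the example labels are hypothetical, and the reversion marker `@` is accepted on either side of the position because the caption does not say where it is written.

```python
import re

# Assumed label grammar, inferred from the Figure 1 caption (not from the authors' software):
#   optional "@", a nucleotide position, then either ".<count><bases>" for an insertion
#   or a single suffix letter (A/C/G/T = transversion, d = deletion, R/Y = heteroplasmy).
_LABEL = re.compile(
    r"^(?P<rev_pre>@)?(?P<pos>\d+)"
    r"(?:\.(?P<ins_count>\d+)(?P<ins_bases>[ACGT]+)|(?P<suffix>[ACGTdRY]))?"
    r"(?P<rev_post>@)?$"
)


def classify_mutation(label: str) -> dict:
    """Classify one branch label relative to rCRS under the assumed grammar."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")

    suffix = match["suffix"]
    if match["ins_bases"] is not None:
        kind = "insertion"
    elif suffix is None:
        kind = "transition"  # no suffix: transitions are the default
    elif suffix == "d":
        kind = "deletion"
    elif suffix in ("R", "Y"):
        kind = "heteroplasmy"
    else:
        kind = "transversion"  # explicit base change to A, C, G or T

    return {
        "position": int(match["pos"]),
        "kind": kind,
        "reversion": bool(match["rev_pre"] or match["rev_post"]),
        "inserted": match["ins_bases"],
    }


# Hypothetical labels, chosen only to exercise each branch of the parser.
for example in ["16362", "64T", "249d", "573.1C", "@16126"]:
    print(example, classify_mutation(example))
```

Underlining of recurrent mutations is a typographic cue in the figure and is not represented in this sketch.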
Molecular divergence and age estimates (maximum likelihood and ρ) for haplogroup R0a’b and its subclades.
| Haplogroup | Nᵃ | MLᵇ | SE | ML age (ka)ᶜ | 95% CI (ka) | ρ | σ | ρ age (ka)ᶜ | 95% CI (ka) |
|---|---|---|---|---|---|---|---|---|---|
| R0a’b | 205 | 13.77 | 1.20 | 39.20 | (32.01;46.57) | 10.85 | 2.02 | 30.30 | (23.97;36.79) |
| >R0a | 202 | 10.49 | 0.79 | 29.22 | (24.64;33.89) | 7.82 | 1.09 | 21.37 | (15.28;27.63) |
| >>R0a1 | 58 | 9.54 | 0.87 | 26.39 | (21.43;31.45) | 8.00 | 1.85 | 21.89 | (11.64;32.64) |
| >>>R0a1a | 52 | 4.81 | 0.36 | 12.85 | (10.92;14.80) | 4.13 | 0.46 | 10.97 | (8.51;13.46) |
| >>>>R0a1a1 | 12 | 3.89 | 0.56 | 10.31 | (7.31;13.36) | 2.67 | 0.93 | 7.00 | (2.19;11.97) |
| >>>>>R0a1a1a | 9 | 1.35 | 0.38 | 3.51 | (1.59;5.46) | 1.22 | 0.62 | 3.16 | (0.01;6.38) |
| >>>>>>R0a1a1a1 | 5 | 0.63 | 0.26 | 1.63 | (0.30;2.98) | 0.40 | 0.28 | 1.03 | (0;2.47) |
| >>>>R0a1a2 | 3 | 1.79 | 0.45 | 4.65 | (2.33;7.01) | 2.33 | 1.00 | 6.09 | (0.95;11.41) |
| >>>>R0a1a3 | 4 | 4.41 | 0.41 | 11.73 | (9.51;13.98) | 6.25 | 1.75 | 16.88 | (7.41;26.81) |
| >>>>>R0a1a3a | 3 | 2.16 | 0.64 | 5.64 | (2.34;9.01) | 2.67 | 1.25 | 7.00 | (0.56;13.70) |
| >>>>R0a1a4 | 3 | 2.85 | 0.64 | 7.49 | (4.16;10.90) | 3.33 | 1.25 | 8.78 | (2.27;15.55) |
| >>>>R0a1a5 | 4 | 3.49 | 0.60 | 9.21 | (6.03;12.45) | 2.75 | 0.83 | 7.22 | (2.91;11.65) |
| >>>R0a1b | 3 | 0.69 | 3.67 | 1.78 | (0;21.55) | 0.67 | 1.54 | 1.73 | (0;9.76) |
| >>R0a2’3 | 123 | 7.66 | 0.83 | 20.92 | (16.29;25.65) | 6.06 | 1.06 | 16.34 | (10.56;22.29) |
| >>>R0a2 | 117 | 6.14 | 0.41 | 16.56 | (14.28;18.87) | 5.10 | 0.49 | 13.65 | (10.99;16.34) |
| >>>>R0a2a | 7 | 3.34 | 0.88 | 8.82 | (4.18;13.58) | 2.29 | 0.70 | 5.99 | (2.37;9.69) |
| >>>>>R0a2a1 | 3 | 0.98 | 0.43 | 2.53 | (0.34;4.76) | 0.67 | 0.47 | 1.73 | (0;4.14) |
| >>>>R0a2b | 24 | 4.75 | 0.64 | 12.69 | (9.25;16.19) | 4.29 | 1.16 | 11.41 | (5.26;17.77) |
| >>>>>R0a2b1 | 14 | 4.01 | 0.66 | 10.62 | (7.12;14.20) | 4.21 | 1.41 | 11.19 | (3.75;18.94) |
| >>>>>>R0a2b1a | 9 | 0.26 | 0.11 | 0.67 | (0.10;1.24) | 0.22 | 0.16 | 0.56 | (0;1.37) |
| >>>>>>R0a2b1b | 5 | 3.05 | 0.66 | 8.04 | (4.59;11.56) | 3.20 | 1.26 | 8.43 | (1.88;15.24) |
| >>>>>>>R0a2b1b1 | 4 | 1.76 | 0.43 | 4.57 | (2.35;6.83) | 1.75 | 0.66 | 4.55 | (1.17;8.01) |
| >>>>>R0a2b2 | 9 | 1.50 | 0.49 | 3.89 | (1.39;6.43) | 1.11 | 0.47 | 2.87 | (0.48;5.30) |
| >>>>R0a2c | 4 | 4.64 | 1.05 | 12.37 | (6.75;18.17) | 3.50 | 1.41 | 9.25 | (1.90;16.92) |
| >>>>>R0a2c1 | 3 | 0.84 | 0.41 | 2.16 | (0.06;4.28) | 0.67 | 0.47 | 1.73 | (0;4.14) |
| >>>>R0a2d | 7 | 4.67 | 0.81 | 12.45 | (8.11;16.89) | 4.43 | 1.17 | 11.79 | (5.58;18.22) |
| >>>>R0a2f | 15 | 5.56 | 0.53 | 14.94 | (12.06;17.85) | 6.27 | 1.58 | 16.93 | (8.36;25.89) |
| >>>>>R0a2f1 | 9 | 2.07 | 0.53 | 5.41 | (2.70;8.18) | 2.22 | 0.93 | 5.80 | (1.02;10.73) |
| >>>>>>R0a2f1a | 6 | 1.07 | 0.45 | 2.76 | (0.47;5.09) | 1.17 | 0.73 | 3.03 | (0;6.82) |
| >>>>>>R0a2f1b | 3 | 1.27 | 0.49 | 3.29 | (0.79;5.82) | 1.33 | 0.82 | 3.45 | (0;7.72) |
| >>>>R0a2g | 7 | 4.15 | 0.94 | 11.02 | (6.03;16.15) | 3.29 | 0.96 | 8.68 | (3.65;13.85) |
| >>>>>R0a2g1 | 4 | 3.14 | 0.71 | 8.27 | (4.53;12.10) | 3.00 | 1.12 | 7.89 | (2.08;13.91) |
| >>>>>>R0a2g1a | 3 | 2.48 | 0.64 | 6.49 | (3.17;9.88) | 2.67 | 1.05 | 7.00 | (1.58;12.62) |
| >>>>R0a2h | 3 | 3.49 | 0.92 | 9.21 | (4.37;14.19) | 2.67 | 1.15 | 7.00 | (1.07;13.16) |
| >>>>R0a2i | 3 | 1.64 | 0.64 | 4.27 | (1.00;7.61) | 1.67 | 0.88 | 4.34 | (0;8.96) |
| >>>>R0a2j | 3 | 5.16 | 0.68 | 13.81 | (10.15;17.55) | 6.67 | 1.56 | 18.07 | (9.55;26.95) |
| >>>>R0a2k | 3 | 5.50 | 0.53 | 14.78 | (11.91;17.69) | 7.00 | 1.97 | 19.01 | (8.27;30.33) |
| >>>>R0a2m | 3 | 0.55 | 0.36 | 1.41 | (0;3.23) | 0.50 | 0.50 | 1.29 | (0;3.84) |
| >>>>R0a2n | 7 | 4.93 | 0.83 | 13.17 | (8.72;17.73) | 3.86 | 1.19 | 10.23 | (3.97;16.72) |
| >>>>>R0a2n1 | 4 | 2.45 | 1.22 | 6.41 | (0.13;12.94) | 1.75 | 1.09 | 4.55 | (0;10.30) |
| >>>>>R0a2n2 | 3 | 2.45 | 0.70 | 6.41 | (2.81;10.10) | 2.33 | 0.88 | 6.09 | (1.56;10.76) |
| >>>>R0a2o | 5 | 3.83 | 0.90 | 10.15 | (5.38;15.05) | 2.40 | 1.02 | 6.28 | (1.03;11.71) |
| >>>>>R0a2o1 | 4 | 1.67 | 0.62 | 4.35 | (1.17;7.59) | 1.25 | 0.66 | 3.24 | (0;6.67) |
| >>>>R0a2q | 4 | 1.47 | 0.64 | 3.82 | (0.55;7.15) | 1.00 | 0.61 | 2.59 | (0;5.74) |
| >>>>R0a2r | 11 | 4.64 | 0.85 | 12.37 | (7.84;17.02) | 3.82 | 1.20 | 10.12 | (3.81;16.66) |
| >>>R0a3 | 5 | 5.19 | 0.87 | 13.89 | (9.22;18.68) | 3.80 | 1.08 | 10.06 | (4.38;15.94) |
| >>>>R0a3a | 3 | 4.24 | 0.85 | 11.26 | (6.75;15.88) | 3.67 | 1.29 | 9.71 | (2.96;16.73) |
| >>R0a4 | 6 | 0.46 | 0.21 | 1.19 | (0.14;2.24) | 0.33 | 0.24 | 0.85 | (0;2.07) |
| >>R0a5 | 2 | 6.92 | 1.19 | 18.77 | (12.24;25.51) | 5.50 | 1.66 | 14.77 | (5.87;24.09) |
| >>R0a6 | 11 | 1.15 | 0.38 | 2.98 | (1.07;4.92) | 1.00 | 0.35 | 2.59 | (0.81;4.39) |
| >R0b | 3 | 5.71 | 1.00 | 15.34 | (9.93;20.91) | 5.00 | 1.37 | 13.37 | (6.05;20.98) |
a Number of mitogenomes (column N).
b Maximum likelihood molecular divergence (column ML).
c Age estimates (both age columns) use the corrected molecular clock proposed by Soares et al. (ref. 81). Except for R0a5, we calculated age estimates only for subclades encompassing at least three mitogenomes.
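As a rough illustration of how a divergence value in the table maps to an age, the sketch below applies a selection-corrected whole-mitogenome clock. The constants (3,624 years per substitution, 0.0263 and 40.2789) are the values commonly quoted for the Soares et al. clock and are an assumption here, since this section does not state them; the function names are hypothetical. The interval is taken as the divergence plus or minus 1.96 standard errors, truncated at zero, which is also an assumption.

```python
import math

# Assumed constants for the corrected complete-mitogenome clock
# (commonly quoted for Soares et al.; not given in this section).
YEARS_PER_SUBSTITUTION = 3624.0
CORRECTION_RATE = 0.0263
CORRECTION_OFFSET = 40.2789


def corrected_age_ka(divergence: float) -> float:
    """Convert a divergence (rho or ML, in substitutions) to an age in ka."""
    if divergence <= 0:
        return 0.0
    # Correction factor below 1 that discounts young, not yet purged mutations.
    factor = math.exp(-math.exp(-CORRECTION_RATE * (divergence + CORRECTION_OFFSET)))
    return factor * divergence * YEARS_PER_SUBSTITUTION / 1000.0


def age_with_interval_ka(divergence: float, std_error: float, z: float = 1.96):
    """Return (age, lower, upper) in ka, with the lower bound truncated at zero."""
    lower = max(divergence - z * std_error, 0.0)
    upper = divergence + z * std_error
    return (
        corrected_age_ka(divergence),
        corrected_age_ka(lower),
        corrected_age_ka(upper),
    )


# R0a row of the table: rho = 7.82, sigma = 1.09.
age, low, high = age_with_interval_ka(7.82, 1.09)
print(f"R0a (rho): {age:.2f} ka ({low:.2f}; {high:.2f})")
```

Under these assumptions the R0a row comes out close to the tabulated 21.37 ka (15.28; 27.63), and the R0a’b ML divergence of 13.77 comes out close to 39.20 ka. Not every row need be reproduced exactly by this simplified interval.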
Figure 2. Spatial frequency distribution maps of haplogroups R0a, R0a1a, R0a2b1 and R0a2b2.
Dots indicate the geographical locations of the surveyed populations. Population frequencies (%) correspond to those listed in Table S2. The extremely high frequencies of R0a and R0a1a in the Socotra sample (38.5% and 24.6%, respectively) were not included in order to provide a correct representation of the much lower frequencies in the regions surrounding the island. We constructed spatial frequency distribution plots with the program Surfer 9 (Golden Software, http://www.goldensoftware.com/products/surfer).
Figure 3. Bayesian skyline plots (BSPs) of haplogroups R0a, R0a1a and R0a2.
The thick solid line is the median estimate and the shading shows the 95% highest posterior density limits. The time axis is limited to 25 ka, beyond which the curves remain flat. Hypothetical effective population sizes through time are based on the mitogenomes listed in Table S1.
Figure 4. Founder analysis of R0a.
Probabilistic distribution of founder clusters across migration times, with time scanned at 200-year intervals from 0 to 50 ka, using the f1 (blue lines) and f2 (red lines) criteria. Panels show migration: (A) from the Fertile Crescent, Caucasus, Iran and the Arabian Peninsula to Eastern Africa; (B) from the Fertile Crescent and Caucasus to the Arabian Peninsula and Eastern Africa; (C) from the Fertile Crescent and Caucasus to the Arabian Peninsula; (D) from the Arabian Peninsula to the Fertile Crescent, Iran and Caucasus; (E) from the Arabian Peninsula to the Fertile Crescent and Caucasus; (F) from the Fertile Crescent, Iran, North Africa, the Arabian Peninsula and Caucasus to India and Pakistan; and (G) from the Fertile Crescent, Caucasus, Iran, North Africa and the Arabian Peninsula to Europe.