Chiara Barbieri, Alexander Hübner, Enrico Macholdt, Shengyu Ni, Sebastian Lippold, Roland Schröder, Sununguko Wata Mpoloka, Josephine Purps, Lutz Roewer, Mark Stoneking, Brigitte Pakendorf.
Abstract
The recent availability of large-scale sequence data for the human Y chromosome has revolutionized analyses of and insights gained from this non-recombining, paternally inherited chromosome. However, the studies to date focus on Eurasian variation, and hence the diversity of early-diverging branches found in Africa has not been adequately documented. Here, we analyze over 900 kb of Y chromosome sequence obtained from 547 individuals from southern African Khoisan- and Bantu-speaking populations, identifying 232 new sequences from basal haplogroups A and B. We identify new clades in the phylogeny, an older age for the root, and substantially older ages for some individual haplogroups. Furthermore, while haplogroup B2a is traditionally associated with the spread of Bantu speakers, we find that it probably also existed in Khoisan groups before the arrival of Bantu speakers. Finally, there is pronounced variation in branch length between major haplogroups; in particular, haplogroups associated with Bantu speakers have significantly longer branches. Technical artifacts cannot explain this branch length variation, which instead likely reflects aspects of the demographic history of Bantu speakers, such as recent population expansion and an older average paternal age. The influence of demographic factors on branch length variation has broader implications both for the human Y phylogeny and for similar analyses of other species.
Year: 2016 PMID: 27043341 PMCID: PMC4835522 DOI: 10.1007/s00439-016-1651-0
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Fig. 1 Maximum parsimony (MP) tree for the southern African dataset, rooted with A00. The width of the triangles is proportional to the number of individuals included. Previously unreported lineages are highlighted. Branches are numbered to identify them in Table S1 (Online Resource 2), where information on the defining mutations and comparison with other nomenclature systems are reported. Branch number 1 indicates the branch shared by A2 and A3b1, which is not visible as a separate branch in the MP reconstruction
Diversity and other statistics for the major haplogroup branches
| Major haplogroup branch | Sample size | Frequency % | Nucleotide diversity | Variance | No. of haplotypes | Haplotype diversity | SD | Frequency in Khoisan % | Frequency in Bantu % | |
|---|---|---|---|---|---|---|---|---|---|---|
| A2 | 49 | 9.0 | 0.009 | 0.00002 | 44 | 0.991 | 0.002 | 13.2 | 0.0 | 0.000 |
| A3b1 | 83 | 15.2 | 0.018 | 0.00007 | 72 | 0.992 | 0.001 | 20.2 | 3.6 | 0.000 |
| B2a | 53 | 9.7 | 0.005 | 0.00001 | 38 | 0.913 | 0.015 | 9.2 | 13.6 | 0.195 |
| B2b | 47 | 8.6 | 0.013 | 0.00004 | 40 | 0.992 | 0.001 | 11.6 | 2.1 | 0.002 |
| G, I, O, T, R1 | 20 | 3.7 | 0.042 | 0.00044 | 20 | 1 | 0 | 3.2 | 2.1 | 0.720 |
| E2 | 12 | 2.2 | 0.005 | 0.00001 | 10 | 0.955 | 0.019 | 1.1 | 5.0 | 0.017 |
| E1b1b | 59 | 10.8 | 0.005 | 0.00001 | 47 | 0.978 | 0.003 | 15.1 | 2.1 | 0.000 |
| E1b1a + L485 | 101 | 18.5 | 0.006 | 0.00001 | 91 | 0.996 | 0.0004 | 8.9 | 35.7 | 0.000 |
| E1b1a1 | 11 | 2.0 | 0.006 | 0.00001 | 11 | 1 | 0 | 1.1 | 3.6 | 0.125 |
| E1b1a8a | 112 | 20.5 | 0.004 | 0.000005 | 91 | 0.966 | 0.004 | 16.4 | 32.1 | 0.000 |
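
The "Frequency %" column can be re-derived from the sample sizes, and the "Haplotype diversity" column is the kind of value a Nei-style estimator produces. The source does not say which estimator was used, so the formula below is an assumption, and the function names are hypothetical. A minimal Python sketch:

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei-style unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2)).

    Assumption: the source does not state the estimator behind its
    'Haplotype diversity' column; this is a commonly used one.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

def frequency_percent(branch_size, total):
    """Share of the dataset carried by one branch, rounded as in the table."""
    return round(100 * branch_size / total, 1)

# Sample sizes copied from the table above.
sample_sizes = {
    "A2": 49, "A3b1": 83, "B2a": 53, "B2b": 47, "G, I, O, T, R1": 20,
    "E2": 12, "E1b1b": 59, "E1b1a + L485": 101, "E1b1a1": 11, "E1b1a8a": 112,
}
total = sum(sample_sizes.values())  # matches the 547 individuals in the abstract
frequencies = {hg: frequency_percent(n, total) for hg, n in sample_sizes.items()}
```

When every sequence in a branch is a distinct haplotype, as for the 20 sequences in the G, I, O, T, R1 row, this estimator gives 1, in line with the table.
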
Fig. 2 Diversity and distribution of haplogroup B2a. a Network of B2a sequences color coded by linguistic affiliation (Khoisan- vs. Bantu-speaking individuals). The dashed line indicates the position of branch 21 from Fig. 1, which leads to the root of B2a. b Schematic distribution of haplogroup B2a in Africa: the more intense the color, the higher the frequency in the population. Small crosses mark the locations of the 146 African populations included in the analysis (see Supplemental Table S3)
Fig. 3 Values of TMRCA for the A2-T node from the present study. The dates are obtained by direct count and by BEAST analysis, for four different mutation rates (indicated with different colors); both median and mean estimates are indicated. The dates are compared with estimates from other studies (indicated by the name of the first author), which variously dated the same A2-T node (not explicitly labeled in the figure) or the A00-T or A0-T nodes (identified above the bars) (color figure online)
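
A direct-count date of the kind mentioned in the Fig. 3 caption is typically the average number of mutations separating a node from its tips, divided by the number of mutations expected per year across the sequenced region. The sketch below assumes that definition. The function name and every number in the example call are hypothetical, since the page does not list the four mutation rates or the observed counts.

```python
def tmrca_direct_count(mean_mutations, rate_per_bp_per_year, sequence_length_bp):
    """Estimate node age in years from a mean node-to-tip mutation count.

    Assumption: age = mean mutations / (rate per bp per year * callable bp).
    """
    if rate_per_bp_per_year <= 0 or sequence_length_bp <= 0:
        raise ValueError("rate and sequence length must be positive")
    expected_mutations_per_year = rate_per_bp_per_year * sequence_length_bp
    return mean_mutations / expected_mutations_per_year

# Purely illustrative inputs (not taken from the study):
# 100 mutations, 1e-9 mutations per bp per year, 1,000,000 bp.
example_age_years = tmrca_direct_count(100, 1e-9, 1_000_000)
```

Because the age scales inversely with the mutation rate, each rate yields a different date for the same tree, which is why the figure shows a separate estimate for each of the four rates.
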
Fig. 4 Distances to the A2-T node in number of mutations. a Distribution of distances from each tip to the A2-T node. b Density distribution of distances to the A2-T node for each major haplogroup. Haplogroups are color-coded as in Fig. 1 (color figure online)