Laure Ségurel, Begoña Martínez-Cruz, Lluis Quintana-Murci, Patricia Balaresque, Myriam Georges, Tatiana Hegay, Almaz Aldashev, Firuza Nasyrova, Mark A Jobling, Evelyne Heyer, Renaud Vitalis.
Abstract
In the last two decades, mitochondrial DNA (mtDNA) and the non-recombining portion of the Y chromosome (NRY) have been extensively used in order to measure the maternally and paternally inherited genetic structure of human populations, and to infer sex-specific demography and history. Most studies converge towards the notion that among populations, women are genetically less structured than men. This has been mainly explained by a higher migration rate of women, due to patrilocality, a tendency for men to stay in their birthplace while women move to their husband's house. Yet, since population differentiation depends upon the product of the effective number of individuals within each deme and the migration rate among demes, differences in male and female effective numbers and sex-biased dispersal have confounding effects on the comparison of genetic structure as measured by uniparentally inherited markers. In this study, we develop a new multi-locus approach to analyze jointly autosomal and X-linked markers in order to aid the understanding of sex-specific contributions to population differentiation. We show that in patrilineal herder groups of Central Asia, in contrast to bilineal agriculturalists, the effective number of women is higher than that of men. We interpret this result, which could not be obtained by the analysis of mtDNA and NRY alone, as the consequence of the social organization of patrilineal populations, in which genetically related men (but not women) tend to cluster together. This study suggests that differences in sex-specific migration rates may not be the only cause of contrasting male and female differentiation in humans, and that differences in effective numbers do matter.
Year: 2008 PMID: 18818760 PMCID: PMC2535577 DOI: 10.1371/journal.pgen.1000200
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Human sex-specific demography inferred from genetic data.
| Region | Markers | Method | Social organization | Differences in demographic parameters between males and females | | References |
| | | | | Sex-biased migration | Skewed effective population size | |
| GLOBAL | mtDNA, NRY SNPs | Genetic structure (AMOVA) | NA | None | | |
| GLOBAL | Autosomal STRs | Genetic structure (AMOVA) | NA | None | | |
| GLOBAL | mtDNA, NRY SNPs | Coalescent-based (TMRCA) | NA | … | and/or … | |
| GLOBAL | mtDNA, NRY STRs+SNPs, Autosomal STRs+SNPs | Genetic structure (…) | NA | … | Considered as negligible | |
| GLOBAL | NRY SNPs | Coalescent-based (mismatch distributions) | NA | Not considered | … | |
| India | mtDNA | Genetic structure (…) | Endogamy, patrilocality | None | | |
| | NRY STRs | | Endogamy, matrilocality | None | | |
| Sinai peninsula | mtDNA, NRY | Genetic diversity | Endogamy and rare patrilocal exogamy, polygyny | … | and/or … | |
| West New Guinea | mtDNA, NRY STRs+SNPs | Genetic structure and diversity (…) | Exogamy, patrilocality, patrilineality, polygyny | … | and/or … | |
| Sub-Saharan Africa | mtDNA, NRY STRs+SNPs | Genetic structure (AMOVA) | FPP | … | and/or … | |
| | | | HGP | … | and/or … | |
| Thailand | mtDNA, NRY STRs | Coalescent-based (Approximate Bayesian Computation) | Patrilocality | … | and/or … | |
| | | | Matrilocality | … | and/or … | |
| Eastern North America | mtDNA, NRY STRs+SNPs | Genetic structure (AMOVA), coalescent-based (MIGRATE) | Patrilocality, patrilineality | … | and/or … | |
| | | | Matrilocality, matriliny | … | and/or … | |
| Central Asia (pastoral populations) | mtDNA, NRY STRs | Genetic structure and diversity (AMOVA, …) | Exogamy, patrilineality | … | and/or … | |
| New Britain | mtDNA, NRY SNPs, X-linked loci | Coalescent-based (θ) | No strong endogamy, ambilocality, polygyny | … | and … | |
| Central Asia | mtDNA, NRY STRs | Genetic structure (AMOVA) | Exogamy, patrilocality, polygyny | … | Considered as negligible | |
| Thailand | mtDNA, NRY STRs | Genetic structure and diversity (haplotype diversity, …) | Patrilocality | … | Considered as negligible | |
| | | | Matrilocality | … | Considered as negligible | |
| Sub-Saharan Africa | mtDNA, NRY SNPs | Genetic structure and diversity (haplotype diversity, AMOVA) | NA | … | Not considered | |
| Continental Asia | mtDNA, NRY SNPs | Genetic structure (…) | NA | … | Not considered | |
| Russia | mtDNA, NRY SNPs | Genetic structure (…) | Patrilocality, patrilineality | … | Not considered | |
| Caucasus | mtDNA, NRY SNPs | Genetic structure (AMOVA) | NA | … | Not considered | |
| Turkey | mtDNA, NRY STRs+SNPs | Genetic structure (AMOVA) | NA | … | Not considered | |
This table summarizes the observed patterns of sex-specific differences in demographic parameters reported in a number of recent studies. The first column lists the location of the sampled populations, or indicates whether the study is conducted at a global scale. The second column gives the markers used, and the third column indicates the statistical methods employed. The fourth column provides indications on social organization, available a priori for the populations under study. In the fifth and sixth columns, the authors' interpretations of sex-specific differences in demographic parameters are given, with respect to skewed gene flow and/or effective numbers.
Social organization: indications on social organization, marriage rules, etc., as provided by the authors.
Differences in demographic parameters between males and females: as inferred by the authors, given in terms of sex-biased gene flow and skewed effective numbers; the authors' interpretation of the observed pattern is given in parentheses, when available.
SNPs: single nucleotide polymorphisms.
AMOVA: analysis of molecular variance [69].
NA: not available (no detailed information given by the authors concerning social organization, marriage rules, etc.).
STRs: short tandem repeats.
TMRCA: time to the most recent common ancestor.
mtDNA and NRY were not sampled in the same individuals or populations.
Considered as negligible: the authors discussed a possible difference in demographic parameters between males and females, but considered it as negligible.
Not considered: the authors did not consider this pattern.
FPP: food-producer populations.
HGP: hunter-gatherer populations.
MIGRATE: Monte Carlo Markov chain method to estimate population sizes and migration rates [70].
Variance in reproductive success.
θ: population-mutation parameter.
Figure 1. Geographic map of the sampled area, with the 21 populations studied.
Bilineal agriculturalist populations are in blue (Tajiks); patrilineal herders with a semi-nomadic lifestyle are in red (Kazaks, Karakalpaks, Kyrgyz and Turkmen).
Sample description.
| Sampled populations (area) | Acronym | Location | Long. | Lat. | n_X | n_A | n_Y | n_mt |
| Agriculturalists |
| Tajiks (Samarkand) | TJA | Uzbekistan/Tajikistan border | … | … | 26 | 31 | 32 | 32 |
| Tajiks (Samarkand) | TJU | Uzbekistan/Tajikistan border | … | … | 27 | 29 | 29 | 29 |
| Tajiks (Ferghana) | TJR | Tajikistan/Kyrgyzstan border | … | … | 30 | 29 | 29 | 29 |
| Tajiks (Ferghana) | TJK | Tajikistan/Kyrgyzstan border | … | … | 26 | 26 | 35 | 40 |
| Tajiks (Gharm) | TJE | Northern Tajikistan | … | … | 29 | 25 | 27 | 31 |
| Tajiks (Gharm) | TJN | Western Tajikistan | … | … | 33 | 24 | 30 | 35 |
| Tajiks (Gharm) | TJT | Northern Tajikistan | … | … | 31 | 25 | 30 | 32 |
| Tajiks (Penjinkent) | TDS | Uzbekistan/Tajikistan border | … | … | 30 | 25 | 31 | 31 |
| Tajiks (Penjinkent) | TDU | Uzbekistan/Tajikistan border | … | … | 40 | 25 | 31 | 40 |
| Tajiks (Yagnobs from Douchambe) | TJY | Western Tajikistan | … | … | 39 | 25 | 36 | 40 |
| Herders |
| Karakalpaks (Qongrat from Karakalpakia) | KKK | Western Uzbekistan | … | … | 56 | 45 | 54 | 55 |
| Karakalpaks (On Tört Uruw from Karakalpakia) | OTU | Western Uzbekistan | … | … | 49 | 45 | 54 | 53 |
| Kazaks (Karakalpakia) | KAZ | Western Uzbekistan | … | … | 47 | 49 | 50 | 50 |
| Kazaks (Bukara) | LKZ | Southern Uzbekistan | … | … | 20 | 25 | 20 | 31 |
| Kyrgyz (Andijan) | KRA | Tajikistan/Kyrgyzstan border | … | … | 31 | 45 | 46 | 48 |
| Kyrgyz (Narin) | KRG | Middle Kyrgyzstan | … | … | 20 | 18 | 20 | 20 |
| Kyrgyz (Narin) | KRM | Middle Kyrgyzstan | … | … | 21 | 21 | 22 | 26 |
| Kyrgyz (Narin) | KRL | Middle Kyrgyzstan | … | … | 36 | 22 | - | - |
| Kyrgyz (Narin) | KRB | Middle Kyrgyzstan | … | … | 31 | 24 | - | - |
| Kyrgyz (Issyk Kul) | KRT | Eastern Kyrgyzstan | … | … | 33 | 37 | - | - |
| Turkmen (Karakalpakia) | TUR | Western Uzbekistan | … | … | 42 | 47 | 51 | 51 |
Long., longitude; Lat., latitude. n_X, n_A, n_Y and n_mt: sample size for X-linked, autosomal, Y-linked and mitochondrial markers, respectively.
Level of diversity and differentiation for NRY markers and mtDNA.
| NRY markers |
| Locus name | Allelic richness (AR) | H_e | F_ST (Herders) | F_ST (Agriculturalists) |
| DYS426 | 4 | 0.500 | 0.3326 | 0.0068 |
| DYS393 | 8 | 0.492 | 0.1095 | 0.0517 |
| DYS390 | 8 | 0.739 | 0.1229 | 0.1253 |
| DYS385 a/b | 15 | 0.858 | 0.1414 | 0.0278 |
| DYS388 | 9 | 0.531 | 0.3003 | 0.0736 |
| DYS19 | 7 | 0.743 | 0.1081 | 0.1310 |
| DYS392 | 10 | 0.516 | 0.1345 | 0.0701 |
| DYS391 | 7 | 0.495 | 0.2533 | 0.0686 |
| DYS389I | 6 | 0.541 | 0.1537 | 0.1395 |
| DYS439 | 7 | 0.725 | 0.1638 | 0.0291 |
| DYS389II | 8 | 0.763 | 0.1556 | 0.0395 |
For NRY markers, we calculated the total allelic richness (AR) over all populations and the expected heterozygosity H_e [55] using Arlequin version 3.1 [56]. Genetic differentiation among populations was measured both per locus and over all loci, using Weir and Cockerham's F_ST estimator [57], as calculated in Genepop 4.0 [58]. For mtDNA, we calculated the total number of polymorphic sites, the unbiased estimate of expected heterozygosity H_e [55], and F_ST using Arlequin version 3.1 [56].
Figure 2. Diagram representing the relative values of expected genetic differentiation for autosomal markers and for X-linked markers.
In the red upper-right triangle, the F_ST estimates for autosomal markers are higher than for X-linked markers. In this case, N_f/N is necessarily larger than 0.5. In the blue region of the figure, the F_ST estimates for autosomal markers are lower than for X-linked markers. The solid white line represents the set of (N_f/N, m_f/m) values where the autosomal and X-linked F_ST estimates are equal. Along this line, if N_f = N_m, then the lower effective size of X-linked markers (which would be three-quarters that of autosomal markers) can only be balanced by a complete female bias in dispersal (m_f/m = 1). Conversely, if m_f = m_m, the large female fraction of effective numbers compensates exactly for the low effective size of X-linked markers only for N_f = 7N_m. Last, if m_f = m_m/2, then the autosomal and X-linked F_ST estimates can only be equal as the number of males tends towards zero.
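The three limiting cases quoted in this caption can be checked with a short sketch. It assumes the island-model approximation F_ST ≈ 1/(1 + 4·N_e·m) together with the textbook sex-specific effective sizes and migration rates for autosomal and X-linked loci. These formulas are not given in this excerpt and are an assumption here, as are the function names. Under them, with N = N_f + N_m and m = m_f + m_m, the equality line reduces to 4·(N_f/N) + 3·(m_f/m) = 5.

```python
from fractions import Fraction

def fst_autosomal(n_f, n_m, m_f, m_m):
    """Expected autosomal F_ST (assumed island-model approximation)."""
    n_a = Fraction(4 * n_f * n_m, n_f + n_m)   # autosomal effective size
    m_a = (m_f + m_m) / 2                      # autosomal migration rate
    return 1 / (1 + 4 * n_a * m_a)

def fst_x_linked(n_f, n_m, m_f, m_m):
    """Expected X-linked F_ST (assumed island-model approximation)."""
    n_x = Fraction(9 * n_f * n_m, 4 * n_m + 2 * n_f)  # X-linked effective size
    m_x = (2 * m_f + m_m) / 3                         # X-linked migration rate
    return 1 / (1 + 4 * n_x * m_x)

def equal_fst_line(female_size_fraction):
    """m_f/m on the line where both F_ST values are equal: 4x + 3y = 5."""
    return (5 - 4 * female_size_fraction) / 3

rate = Fraction(1, 100)  # arbitrary illustrative migration rate

# Case 1: N_f = N_m requires migration by females only (m_f/m = 1).
case_equal_sizes = (fst_autosomal(100, 100, rate, Fraction(0)),
                    fst_x_linked(100, 100, rate, Fraction(0)))

# Case 2: m_f = m_m requires N_f = 7 N_m.
case_equal_rates = (fst_autosomal(700, 100, rate, rate),
                    fst_x_linked(700, 100, rate, rate))

# Case 3: m_f = m_m / 2 (m_f/m = 1/3) is reached only as N_f/N tends to 1.
case_no_males = equal_fst_line(Fraction(1))
```

Exact fractions are used so that the equalities hold without rounding; the population sizes and the migration rate are arbitrary illustrative numbers, not estimates from the study.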
Level of diversity and differentiation for X-linked and autosomal markers.
| Locus name | Allelic richness (AR) | H_e | F_ST (Herders) | F_ST (Agriculturalists) |
| X-linked markers |
| CTAT014 | 19 | 0.746 | 0.0018 | 0.0225 |
| GATA124E07 | 15 | 0.847 | 0.0024 | 0.0136 |
| GATA31D10 | 8 | 0.697 | 0.0069 | 0.0007 |
| ATA28C05 | 7 | 0.722 | 0.0086 | 0.0179 |
| AFM150xf10 | 14 | 0.832 | −0.0021 | 0.0152 |
| GATA100G03 | 14 | 0.734 | −0.0019 | 0.0084 |
| AGAT121P | 15 | 0.593 | −0.0016 | 0.0048 |
| ATCT003 | 10 | 0.797 | 0.0095 | 0.0261 |
| GATA31F01 | 11 | 0.804 | 0.0069 | 0.0053 |
| Autosomal markers |
| AFM249XC5 | 19 | 0.848 | 0.0080 | 0.0081 |
| ATA10H11 | 13 | 0.680 | 0.0128 | 0.0193 |
| AFM254VE1 | 14 | 0.837 | 0.0105 | 0.0086 |
| AFMA218YB5 | 14 | 0.852 | 0.0030 | 0.0151 |
| GGAA7G08 | 22 | 0.896 | 0.0096 | 0.0138 |
| GATA11H10 | 16 | 0.776 | 0.0017 | 0.0056 |
| GATA12A07 | 16 | 0.857 | 0.0001 | 0.0163 |
| GATA193A07 | 15 | 0.825 | 0.0064 | 0.0087 |
| AFMB002ZF1 | 11 | 0.820 | 0.0028 | 0.0169 |
| AFMB303ZG9 | 16 | 0.858 | 0.0090 | 0.0148 |
| ATA34G06 | 12 | 0.675 | 0.0088 | 0.0132 |
| GATA72G09 | 18 | 0.884 | −0.0023 | 0.0131 |
| GATA22F11 | 21 | 0.897 | 0.0152 | 0.0144 |
| GGAA6D03 | 13 | 0.831 | 0.0048 | 0.0176 |
| GATA88H02 | 17 | 0.892 | 0.0063 | 0.0056 |
| SE30 | 15 | 0.762 | 0.0084 | 0.0103 |
| GATA43C11 | 16 | 0.870 | 0.0028 | 0.0093 |
| AFM203YG9 | 14 | 0.753 | 0.0105 | 0.0084 |
| AFM157XG3 | 13 | 0.753 | 0.0147 | 0.0196 |
| UT2095 | 16 | 0.738 | 0.0032 | 0.0112 |
| GATA28D01 | 25 | 0.896 | 0.0156 | 0.0139 |
| GGAA4B09 | 19 | 0.707 | 0.0034 | 0.0208 |
| ATA3A07 | 12 | 0.746 | 0.0078 | 0.0070 |
| AFM193XH4 | 11 | 0.716 | 0.0164 | 0.0129 |
| GATA11B12 | 26 | 0.896 | 0.0104 | 0.0265 |
| AFM165XC11 | 13 | 0.785 | 0.0058 | 0.0185 |
| AFM248VC5 | 20 | 0.620 | 0.0246 | 0.0145 |
We calculated the allelic richness (AR) and unbiased estimates of expected heterozygosity H_e [55], both per locus and on average, with Arlequin version 3.1 [56]. Genetic differentiation among populations was measured both per locus and over all loci, using Weir and Cockerham's F_ST estimator [57], as calculated in Genepop 4.0 [58].
Figure 3. p-values of Wilcoxon tests plotted in the (N_f/N, m_f/m) parameter space.
For each set of (N_f/N, m_f/m) values, we applied the transformation in eq. (4), and tested whether our data on autosomal and X-linked markers were consistent with the hypothesis defined by that set of values. (A) Surface plot of the p-values, as a function of the female fraction of effective number and the female fraction of migration rate, for the herders (11 populations). The arrow indicates the line that separates the region where p ≤ 0.05 from that where p > 0.05. Non-significant p-values (p > 0.05) correspond to the values of (N_f/N, m_f/m) that could not be rejected, given our data. (B) Contour plot for the same data. The dashed line indicates the range of (N_f/N, m_f/m) values inferred from the ratio of NRY and mtDNA population structure. The dotted lines correspond to the cases where N_f = N_m (vertical line) and m_f = m_m (horizontal line). (C) and (D): as (A) and (B), respectively, for the agriculturalists (10 populations).
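As an illustration of how a ratio of NRY and mtDNA differentiation can be mapped onto this parameter space, here is a minimal sketch. It assumes the island-model approximation F_ST ≈ 1/(1 + c·N·m) with the same constant c for both uniparental markers, so that c cancels in the ratio. The exact relationship used by the authors is not recoverable from this excerpt; the function names and the two F_ST inputs below are hypothetical placeholders, not values from the study.

```python
from fractions import Fraction

def female_to_male_ratio(fst_mt, fst_y):
    """N_f*m_f / (N_m*m_m), assuming F_ST = 1/(1 + c*N*m) for both markers."""
    return (1 / fst_mt - 1) / (1 / fst_y - 1)

def migration_fraction(size_fraction, ratio):
    """m_f/m for a given N_f/N on the curve x*y / ((1 - x)*(1 - y)) = ratio."""
    x = size_fraction
    return ratio * (1 - x) / (x + ratio * (1 - x))

# Hypothetical differentiation values, chosen only to give round numbers.
fst_mt_example = Fraction(1, 100)
fst_y_example = Fraction(1, 10)

ratio_example = female_to_male_ratio(fst_mt_example, fst_y_example)

# Points (N_f/N, m_f/m) that are all compatible with that single ratio.
curve_example = [(x, migration_fraction(x, ratio_example))
                 for x in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4))]
```

The point of the sketch is that uniparental markers alone constrain only the product ratio, so every point on the curve fits them equally well, whereas the test on autosomal and X-linked markers can reject parts of that curve.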
Figure 4. Percentage of significant tests in the (N_f/N, m_f/m) parameter space, for simulated data.
We chose a range of 49 values of the ratio N_f m_f/(N_m m_m), varying from 0.0004 to 2401, and for each of these ratios we chose 29 sets of (N_f/N, m_f/m) values. We thus obtained 1421 sets of (N_f/N, m_f/m) values, represented as white dots in the right-hand panel B, covering the whole parameter space. For each set, we simulated 100 independent datasets using a coalescent-based algorithm, taking the same number of individuals and the same number of loci for each genetic system as in the observed data. For each dataset, we calculated the p-value of a one-sided Wilcoxon rank-sum test, and for each set of (N_f/N, m_f/m) values we calculated the percentage of significant p-values (at the α = 0.05 level). A. Surface plot of the proportion of significant p-values (at the α = 0.05 level), as a function of the female fraction of effective number and the female fraction of migration rate. B. Contour plot for the same data. The dotted line represents the set of (N_f/N, m_f/m) values where the autosomal and X-linked F_ST values are equal. The theory predicts that autosomal F_ST should exceed X-linked F_ST only in the upper-right triangle defined by the dotted line. Hence, the proportion of significant p-values for any set of (N_f/N, m_f/m) values in this upper-right triangle gives an indication of the power of the method.
Autosomal and X-linked differentiation on jackknifed samples.
| Sample removed | F_ST (autosomal) | F_ST (X-linked) | p-value | F_ST (autosomal) / F_ST (X-linked) |
| Herders |
| KAZ | 0.0084 | 0.0050 | 0.068 | 1.7 |
| KKK | 0.0085 | 0.0050 | 0.078 | 1.7 |
| KRA | 0.0078 | 0.0027 | 0.022 | 2.9 |
| KRB | 0.0080 | 0.0030 | 0.028 | 2.7 |
| KRG | 0.0078 | 0.0035 | 0.037 | 2.2 |
| KRL | 0.0086 | 0.0038 | 0.018 | 2.3 |
| KRM | 0.0069 | 0.0023 | 0.018 | 3.0 |
| KRT | 0.0081 | 0.0044 | 0.047 | 1.8 |
| LKZ | 0.0088 | 0.0025 | 0.002 | 3.5 |
| OTU | 0.0089 | 0.0038 | 0.022 | 2.3 |
| TUR | 0.0054 | 0.0025 | 0.073 | 2.2 |
| Agriculturalists |
| TDS | 0.0125 | 0.0109 | 0.443 | 1.1 |
| TDU | 0.0132 | 0.0153 | 0.705 | 0.9 |
| TJA | 0.0144 | 0.0123 | 0.109 | 1.2 |
| TJE | 0.0140 | 0.0133 | 0.148 | 1.1 |
| TJK | 0.0134 | 0.0131 | 0.457 | 1.0 |
| TJN | 0.0148 | 0.0144 | 0.387 | 1.0 |
| TJR | 0.0140 | 0.0141 | 0.401 | 1.0 |
| TJT | 0.0139 | 0.0121 | 0.225 | 1.1 |
| TJU | 0.0139 | 0.0127 | 0.283 | 1.1 |
| TJY | 0.0139 | 0.0116 | 0.259 | 1.2 |
For each group, we removed one sample in turn and calculated the differentiation on autosomal and X-linked markers. The p-value gives the result of a one-sided Wilcoxon rank-sum test, performed in the same way as on the full dataset.
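To make the kind of test behind these p-values concrete, the sketch below runs a one-sided rank-sum comparison on the per-locus herder F_ST values listed in the X-linked and autosomal marker table. Several things are assumptions rather than the authors' procedure: the first nine loci of that table are taken to be the X-linked ones (following the order in its title), the raw values are compared without the transformation of eq. (4), and the p-value uses a normal approximation with tie and continuity corrections instead of an exact computation. The function and variable names are illustrative.

```python
import math
from collections import Counter

def rank_sum_one_sided(sample_a, sample_b):
    """Return (U, p) for H1: values in sample_a tend to exceed sample_b.

    U counts pairs (a, b) with a > b, ties counting one half. The p-value
    comes from a normal approximation with tie and continuity corrections.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    u = sum((a > b) + 0.5 * (a == b) for a in sample_a for b in sample_b)
    # Tie correction: sum of t^3 - t over groups of equal pooled values.
    tie_term = sum(t**3 - t for t in Counter(sample_a + sample_b).values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n_a * n_b / 2 - 0.5) / math.sqrt(variance)
    return u, 0.5 * math.erfc(z / math.sqrt(2))

# Per-locus F_ST among herder populations, copied from the table.
herders_x_linked = [0.0018, 0.0024, 0.0069, 0.0086, -0.0021,
                    -0.0019, -0.0016, 0.0095, 0.0069]
herders_autosomal = [0.0080, 0.0128, 0.0105, 0.0030, 0.0096, 0.0017, 0.0001,
                     0.0064, 0.0028, 0.0090, 0.0088, -0.0023, 0.0152, 0.0048,
                     0.0063, 0.0084, 0.0028, 0.0105, 0.0147, 0.0032, 0.0156,
                     0.0034, 0.0078, 0.0164, 0.0104, 0.0058, 0.0246]

u_stat, p_value = rank_sum_one_sided(herders_autosomal, herders_x_linked)
```

A jackknife over samples would repeat this comparison after recomputing the per-locus F_ST values with one population left out, which requires the genotype data and is therefore not reproduced here.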