Michael C Turchin, Charleston W K Chiang, Cameron D Palmer, Sriram Sankararaman, David Reich, Joel N Hirschhorn.
Abstract
Strong signatures of positive selection at newly arising genetic variants are well documented in humans (1-8), but this form of selection may not be widespread in recent human evolution (9). Because many human traits are highly polygenic and partly determined by common, ancient genetic variation, an alternative model for rapid genetic adaptation has been proposed: weak selection acting on many pre-existing (standing) genetic variants, or polygenic adaptation (10-12). By studying height, a classic polygenic trait, we demonstrate the first human signature of widespread selection on standing variation. We show that frequencies of alleles associated with increased height, both at known loci and genome wide, are systematically elevated in Northern Europeans compared with Southern Europeans (P < 4.3 × 10⁻⁴). This pattern mirrors intra-European height differences and is not confounded by ancestry or other ascertainment biases. The systematic frequency differences are consistent with the presence of widespread weak selection (selection coefficients ∼10⁻³ to 10⁻⁵ per allele) rather than genetic drift alone (P < 10⁻¹⁵).
Year: 2012 PMID: 22902787 PMCID: PMC3480734 DOI: 10.1038/ng.2368
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Comparisons of the mean AF difference and the maximum likelihood estimate of s in pairwise combinations of populations across Europe
| Populations | Comparison | Sample size (N) | Mean AF difference | t-test p-value | s (w=sβ), T = 20 | LRT p-value (w=sβ vs. drift), T = 20 | s (w=sβ), T = 500 | LRT p-value (w=sβ vs. drift), T = 500 |
|---|---|---|---|---|---|---|---|---|
| U.S. vs. Spain (MIGen) | N vs S | 257, 254 | 0.0079 | 9.67E–16 | 0.138 | 9.57E–16 | 0.0055 | 9.65E–16 |
| Sweden vs. Spain (MIGen) | N vs S | 58, 58 | 0.0094 | 1.47E–07 | 0.183 | 5.48E–08 | 0.00728 | 5.44E–08 |
| UK vs. Italy (POPRES) | N vs S | 208, 208 | 0.016 | 1.06E–33 | 0.264 | 2.99E–35 | 0.0105 | 3.24E–35 |
| UK vs. Portugal (POPRES) | N vs S | 125, 125 | 0.012 | 1.72E–20 | 0.207 | 4.91E–18 | 0.00824 | 4.98E–18 |
| UK vs. Switzerland (French) (POPRES) | N vs C | 208, 208 | 0.0044 | 5.18E–07 | 0.0757 | 1.52E–05 | 0.00302 | 1.52E–05 |
| Switzerland (French) vs. Italy (POPRES) | C vs S | 208, 208 | 0.011 | 1.73E–25 | 0.188 | 1.32E–22 | 0.00746 | 1.36E–22 |
| Switzerland (French) vs. Portugal (POPRES) | C vs S | 125, 125 | 0.0081 | 1.86E–12 | 0.139 | 1.07E–09 | 0.00554 | 1.08E–09 |
Each population is categorized as Northern (N), Central (C), or Southern (S) European. Results shown are for the set of ~1,400 independent SNPs (see main text and Supplemental Methods for exact numbers in each comparison), comparing the mean allele frequency difference between the more Northern population and the more Southern population, as well as the maximum likelihood estimate of the selection coefficients under a model in which the coefficients are proportional to the estimated effects on height (w = sβ, where β is the estimated increase in height per allele, in standard deviations). The p-values shown for the mean allele frequency difference are assessed by t-test. The p-values for the estimates of s are assessed by likelihood ratio test (LRT), comparing a model of drift alone vs. a model of drift plus selection. Although T = 20 and T = 500 are too recent to be realistic times of historical divergence (T, in generations) between the Northern- and Southern-European populations, results for these values were included to account for the likely bi-directional migration between European populations, which would decrease the apparent time of divergence between the two populations. Note that our analysis is actually estimating the product of T and s. Because our estimates of T and s cannot be decoupled, the LRT statistics and p-values are nearly identical across ranges of T (see Supplemental Tables 4–10 for more detailed results across a full range of T). Accordingly, we are not estimating T but are instead estimating s under a range of values for T that are likely to span the actual (unknown) value of T.
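The statement that the analysis estimates the product of T and s can be checked against the tabulated estimates. The short sketch below is not part of the original analysis: it multiplies each reported s by its assumed T and compares the two products. The names `ESTIMATES` and `relative_gap` are illustrative, and the values are copied from the table above.

```python
# Maximum likelihood estimates of s from the table, as (s at T = 20, s at T = 500).
ESTIMATES = {
    "U.S. vs. Spain (MIGen)": (0.138, 0.0055),
    "Sweden vs. Spain (MIGen)": (0.183, 0.00728),
    "UK vs. Italy (POPRES)": (0.264, 0.0105),
    "UK vs. Portugal (POPRES)": (0.207, 0.00824),
    "UK vs. Switzerland (French) (POPRES)": (0.0757, 0.00302),
    "Switzerland (French) vs. Italy (POPRES)": (0.188, 0.00746),
    "Switzerland (French) vs. Portugal (POPRES)": (0.139, 0.00554),
}


def relative_gap(s_at_20, s_at_500):
    """Relative difference between the products s*T at T = 20 and T = 500."""
    product_20 = s_at_20 * 20
    product_500 = s_at_500 * 500
    return abs(product_20 - product_500) / product_20


for name, (s20, s500) in ESTIMATES.items():
    print(f"{name}: s*T = {s20 * 20:.3f} (T=20), {s500 * 500:.3f} (T=500), "
          f"gap = {relative_gap(s20, s500):.2%}")
```

In every comparison the two products agree to within about 1%, which is what one would expect if only the product of T and s is identifiable from the data.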
Figure 1. Mean allele frequency difference of height SNPs, matched SNPs and genome-wide SNPs between Northern- and Southern-European populations
a, Mean frequency difference of the height-increasing alleles from 139 known height SNPs in MIGen (solid red line) is compared against that of 10,000 sets of randomly drawn SNPs, with each set matched by average Northern- and Southern-European allele frequencies to the known height SNPs on a per-SNP basis. Shown in purple is the mean value across the 10,000 sets of matched SNPs, and in blue is the expected mean difference for the sets of matched SNPs (x=0). b, Mean frequency difference of the height-increasing allele for sets of 500 independent (r² < 0.1) SNPs across the genome. SNPs were sorted by GIANT height association p-value. Shown in red is the curve of best fit, in purple the genome-wide mean frequency difference, and in blue the expected mean difference (y=0). U.S. individuals of Northern-European ancestry and Spanish individuals from the MIGen dataset were used. NEur, Northern European. SEur, Southern European. AF, allele frequency.
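The matched-SNP comparison in panel a can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the data are synthetic, matching is done by binning the average Northern/Southern frequency in bins of width 0.05 (the exact matching rule is not given here), 1,000 matched sets are drawn instead of 10,000 to keep it quick, and all function names are hypothetical.

```python
import random
from statistics import mean

BIN_WIDTH = 0.05  # assumed bin width for frequency matching


def freq_bin(p_north, p_south, width=BIN_WIDTH):
    """Bin a SNP by its average Northern/Southern allele frequency."""
    return int(((p_north + p_south) / 2) / width)


def matched_null_means(target_snps, background_snps, n_sets, rng):
    """Mean North-minus-South difference for n_sets of frequency-matched SNP sets."""
    pools = {}
    for p_n, p_s in background_snps:
        pools.setdefault(freq_bin(p_n, p_s), []).append(p_n - p_s)
    return [
        mean([rng.choice(pools[freq_bin(p_n, p_s)]) for p_n, p_s in target_snps])
        for _ in range(n_sets)
    ]


def empirical_p(observed, null_means):
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(m >= observed for m in null_means)
    return (exceed + 1) / (len(null_means) + 1)


def simulate(n, shift, lo, hi, rng):
    """Synthetic (north, south) frequency pairs with a mean difference of `shift`."""
    snps = []
    for _ in range(n):
        p = rng.uniform(lo, hi)
        d = rng.gauss(shift, 0.02)
        snps.append((p + d / 2, p - d / 2))
    return snps


rng = random.Random(0)
background = simulate(20000, 0.0, 0.05, 0.95, rng)   # no systematic difference
height_snps = simulate(139, 0.008, 0.10, 0.90, rng)  # small systematic shift

observed = mean([p_n - p_s for p_n, p_s in height_snps])
null_means = matched_null_means(height_snps, background, 1000, rng)
p_value = empirical_p(observed, null_means)
print(f"observed = {observed:.4f}, null mean = {mean(null_means):.4f}, p = {p_value:.4f}")
```

Because the tracked allele at a background SNP is arbitrary with respect to height, the matched sets center on zero, which corresponds to the blue expectation line in the figure.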
Figure 2. Within-family analyses of height and the Northern-predominant alleles across the genome
With SNPs ordered by GIANT height association p-value, height was regressed against the number of Northern-predominant alleles at each SNP, using data from a total of 4,819 individuals in 1,761 sibships. Height and allele counts were both normalized within sibships. a, The average regression coefficients in groups of 500 SNPs are plotted on the y-axis. The SNP ranks are plotted on the x-axis. The red line is the curve of best fit; purple dashed line is the directly comparable curve of best fit for the GIANT effect sizes; blue dashed line is y=0. b, The running averages of the regression coefficients are plotted on the y-axis (red and black filled circles). The running averages of regression coefficients from 1,000 analyses where phenotypes were permuted within sibships are also shown (grey open circles). Observed data points are colored black if they are less extreme than 0.01% of the permuted values. The blue dashed line is y=0.
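The within-family design can be sketched for a single SNP as follows. This is an illustration only: "normalized within sibships" is taken here to mean centering on the sibship mean, which is an assumption, and the function names and toy data are hypothetical. Centering removes between-family differences, so any remaining association cannot be driven by ancestry differences between families.

```python
import random
from collections import defaultdict


def center_within(values, sibship_ids):
    """Subtract each sibship's mean from its members' values."""
    groups = defaultdict(list)
    for value, sib in zip(values, sibship_ids):
        groups[sib].append(value)
    means = {sib: sum(vals) / len(vals) for sib, vals in groups.items()}
    return [value - means[sib] for value, sib in zip(values, sibship_ids)]


def within_family_slope(height, allele_count, sibship_ids):
    """Least-squares slope of sibship-centered height on sibship-centered allele count."""
    y = center_within(height, sibship_ids)
    x = center_within(allele_count, sibship_ids)
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        return 0.0  # no within-family variation at this SNP
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def permute_within(height, sibship_ids, rng):
    """Shuffle phenotypes among siblings, keeping each value inside its sibship."""
    members = defaultdict(list)
    for i, sib in enumerate(sibship_ids):
        members[sib].append(i)
    permuted = list(height)
    for idx in members.values():
        shuffled = [height[i] for i in idx]
        rng.shuffle(shuffled)
        for i, h in zip(idx, shuffled):
            permuted[i] = h
    return permuted


# Toy data: two sibships with very different mean heights.
height = [1.0, 3.0, 10.0, 14.0]
alleles = [0, 1, 1, 2]
sibships = ["a", "a", "b", "b"]

slope = within_family_slope(height, alleles, sibships)
print(slope)  # 3.0

permuted = permute_within(height, sibships, random.Random(1))
print(within_family_slope(permuted, alleles, sibships))
```

Repeating the permutation step many times yields a null distribution of slopes of the kind shown by the grey open circles in panel b.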