James J Cai, J Michael Macpherson, Guy Sella, Dmitri A Petrov.
Abstract
Much effort and interest have focused on assessing the importance of natural selection, particularly positive natural selection, in shaping the human genome. Although scans for positive selection have identified candidate loci that may be associated with positive selection in humans, such scans do not indicate whether adaptation is frequent in general in humans. Studies based on the reasoning of the McDonald-Kreitman test, which, in principle, can be used to evaluate the extent of positive selection, suggested that adaptation is detectable in the human genome but that it is less common than in Drosophila or Escherichia coli. Both positive and purifying natural selection at functional sites should affect levels and patterns of polymorphism at linked nonfunctional sites. Here, we search for these effects by analyzing patterns of neutral polymorphism in humans in relation to the rates of recombination, functional density, and functional divergence with chimpanzees. We find that the levels of neutral polymorphism are lower in the regions of lower recombination and in the regions of higher functional density or divergence. These correlations persist after controlling for the variation in GC content, density of simple repeats, selective constraint, mutation rate, and depth of sequencing coverage. We argue that these results are most plausibly explained by the effects of natural selection at functional sites, either recurrent selective sweeps or background selection, on the levels of linked neutral polymorphism. Natural selection at both coding and regulatory sites appears to affect linked neutral polymorphism, reducing neutral polymorphism by 6% genome-wide and by 11% in the gene-rich half of the human genome. These findings suggest that the effects of natural selection at linked sites cannot be ignored in the study of neutral human polymorphism.
Year: 2009 PMID: 19148272 PMCID: PMC2613029 DOI: 10.1371/journal.pgen.1000336
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Correlation coefficients among the studied variables: the level of neutral polymorphism (θ), the level of normalized neutral polymorphism (P = θ/d), recombination rate (RR), GC content (GC), the density of simple repeats (RD), the divergence at coding sites (Dn), the divergence at conserved noncoding regions (Dx), the number of codons (FDn), the number of conserved noncoding sites (FDx), and the level of neutral divergence (d, labelled dneu in the table).
| | θ | P | RR | GC | RD | Dn | Dx | FDn | FDx | dneu | P-values given for row |
|---|---|---|---|---|---|---|---|---|---|---|---|
| θ | — | 0.9364 | 0.2187 | −0.2747 | −0.1046 | −0.2939 | −0.1655 | −0.3210 | −0.3094 | 0.2868 | |
| P | 0.7880 | — | 0.1309 | −0.2460 | −0.1306 | −0.2467 | −0.1552 | −0.2363 | −0.2161 | −0.0166 | (1.27e-2) |
| RR | 0.1486 | 0.0886 | — | 0.3535 | −0.2769 | 0.0480 | −0.0454 | 0.0267 | −0.0243 | 0.2934 | (5.36e-13) (8.58e-12) (6.04e-5) (2.57e-4) |
| GC | −0.1837 | −0.1630 | 0.2421 | — | −0.0617 | 0.5694 | 0.1899 | 0.6100 | 0.5096 | −0.1322 | |
| RD | −0.0703 | −0.0878 | −0.1876 | −0.0412 | — | 0.0226 | 0.0617 | −0.0248 | −0.0356 | 0.0539 | (1.63e-20) (6.81e-4) (1.93e-4) (8.86e-8) (5.55e-16) |
| Dn | −0.2079 | −0.1733 | 0.0337 | 0.4141 | 0.0166 | — | 0.3027 | 0.8941 | 0.6772 | −0.1727 | (2.81e-13) (3.17e-4) |
| Dx | −0.1150 | −0.1080 | −0.0313 | 0.1296 | 0.0425 | 0.2204 | — | 0.3008 | 0.4965 | −0.0444 | (8.27e-12) (1.81e-20) (2.53e-11) |
| FDn | −0.2213 | −0.1606 | 0.0188 | 0.4397 | −0.0163 | 0.7379 | 0.2119 | — | 0.8260 | −0.3022 | (3.03e-5) (3.08e-4) |
| FDx | −0.2096 | −0.1446 | −0.0157 | 0.3535 | −0.0238 | 0.5045 | 0.3524 | 0.6493 | — | −0.3242 | (4.00e-4) (7.89e-8) |
| dneu | 0.2011 | −0.0109 | 0.2011 | −0.0917 | 0.0365 | −0.1264 | −0.0320 | −0.2115 | −0.2226 | — | (1.38e-2) (2.12e-16) (2.96e-12) |
**: P<1e-20.
*: 1e-20≤P<1e-3.
NS: P≥1e-3.
Spearman's ρ is given above the diagonal and Kendall's τ below it. P-values are given in parentheses for marginally significant (1e-20≤P<1e-3) and nonsignificant (NS, P≥1e-3) values.
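The layout described in the note above, with Spearman's ρ on one side of the diagonal and Kendall's τ on the other, can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the function names, the toy `windows` data, and the use of the tau-b variant for ties are assumptions, and no P-values are computed.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b by direct pair counting (O(n^2), adequate for a sketch)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + tied_x_only) * (untied + tied_y_only)
    )


def correlation_table(columns):
    """Spearman's rho above the diagonal, Kendall's tau below, None on it."""
    names = list(columns)
    table = {a: {b: None for b in names} for a in names}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                table[a][b] = spearman_rho(columns[a], columns[b])
            elif i > j:
                table[a][b] = kendall_tau_b(columns[a], columns[b])
    return table


# Hypothetical per-window values, invented purely for illustration.
windows = {
    "theta": [1.0, 2.0, 3.0, 4.0, 5.0],
    "RR": [0.5, 1.5, 1.0, 2.5, 3.0],
    "FD": [9.0, 7.0, 8.0, 3.0, 1.0],
}
table = correlation_table(windows)
```

Reading `table[row][column]` with the row listed before the column gives ρ, and the reverse order gives τ, mirroring how the published table is read.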
Figure 1. Correlations of recombination rate with neutral divergence and neutral polymorphism.
Scatter plots (orange dots) show (A) recombination rate versus the level of neutral divergence (d), (B) recombination rate versus the level of neutral polymorphism (θ), and (C) recombination rate versus the level of normalized neutral polymorphism (P = θ/d). Black circles are averages of the orange dots pooled into 100 bins, each containing 1% of the data points.
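The black circles in the figure summarize the scatter by pooling points into bins of equal count. A minimal sketch of that step follows, assuming the points are ordered by the x-axis variable before being split into 100 bins (the caption does not state the ordering); `binned_means` and the toy data are hypothetical.

```python
def binned_means(x, y, n_bins=100):
    """Sort points by x, split them into n_bins groups of near-equal size,
    and return the (mean x, mean y) of each non-empty group."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    means = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if not chunk:
            continue  # fewer points than bins: skip empty groups
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        means.append((mean_x, mean_y))
    return means


# Toy data: 1000 points, so each of the 100 bins holds 1% of them.
xs = list(range(1000))
ys = [2 * v for v in xs]
circles = binned_means(xs, ys)
```

With 1000 points and the default of 100 bins, each bin holds 10 points, so `circles` contains 100 averaged points that could be overlaid on the raw scatter.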
Spearman rank correlation and partial correlation coefficients between the number of codons (FDn) and the levels of neutral polymorphism (θ) or normalized neutral polymorphism (P = θ/d), and between the number of conserved noncoding sites (FDx) and the levels of neutral polymorphism (θ) or normalized neutral polymorphism (P = θ/d).
| | | | | RR, GC, RD | | | |
|---|---|---|---|---|---|---|---|
| | | | | ○ | ○ | ○ | ○ |
| | | | | • | ○ | ○ | ○ |
| | — | | — | • | ○ | ○ | • |
| −0.036 (6.67e-8) | −0.025 (1.94e-4) | −0.042 (2.34e-10) | −0.021 (1.35e-3) | • | ○ | • | •(○) |
| 0.010 (1.51e-1) | 0.027 (4.80e-5) | −0.025 (1.42e-4) | 0.007 (3.16e-1) | • | • | • | •(○) |
§: Correlation coefficients for FDn versus θ or P were calculated controlling for FDx, and the correlation coefficients for FDx versus θ or P were calculated controlling for FDn.
†: Correlation coefficients for FDn or FDx versus P were not calculated or were calculated without controlling for d, as P (= θ/d) is not independent of d.
**: P<1e-10.
*: 1e-10≤P<1e-3.
NS: P≥1e-3.
Closed circles (•) indicate the controlled variables. Highly significant values (P<1e-10) are in bold. P-values are given in parentheses for marginally significant (1e-10≤P<1e-3) and nonsignificant (NS, P≥1e-3) values.
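A partial rank correlation of the kind reported in this table asks how two variables covary once the controlled variables are held fixed. One common way to compute it, sketched below, is to rank-transform every variable and apply the recursive partial-correlation formula. Whether the authors used this procedure is an assumption; the function names and toy data are hypothetical, and the recursion is only practical for a handful of controls.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Partial correlation of x and y given a list of control vectors.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied recursively, removing one control variable per level.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, controls):
    """Partial Spearman correlation: the formula above applied to ranks."""
    return partial_corr(ranks(x), ranks(y), [ranks(c) for c in controls])


# Toy data, invented for illustration: x and y both track z closely.
z = [1, 2, 3, 4, 5, 6]
x = [2, 1, 4, 3, 6, 5]
y = [1, 3, 2, 5, 4, 6]
raw = partial_spearman(x, y, [])        # plain Spearman's rho, no controls
adjusted = partial_spearman(x, y, [z])  # rho between x and y with z held fixed
```

In this toy data the raw ρ between `x` and `y` is positive, yet it turns negative once `z` is held fixed, which is the kind of shift a partial correlation is meant to expose.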
Figure 2. Relationships between the levels of functional density and neutral polymorphism.
Scatter plots (orange dots) show (A) the number of codons (FDn) versus the level of neutral polymorphism (θ), (B) the number of conserved noncoding sites (FDx) versus the level of neutral polymorphism (θ), (C) the number of codons (FDn) versus the level of normalized neutral polymorphism (P = θ/d), and (D) the number of conserved noncoding sites (FDx) versus the level of normalized neutral polymorphism (P = θ/d). Black circles are averages of the orange dots pooled into 100 bins, each containing 1% of the data points.
Spearman rank correlation and partial correlation coefficients between the divergence at coding sites (Dn) and the levels of neutral polymorphism (θ) or normalized neutral polymorphism (P = θ/d), and between the divergence at conserved noncoding regions (Dx) and the levels of neutral polymorphism (θ) or normalized neutral polymorphism (P = θ/d).
| | | | | RR, GC, RD | | | | P-value given for row |
|---|---|---|---|---|---|---|---|---|
| | | | | ○ | ○ | ○ | ○ | |
| | | | | • | ○ | ○ | ○ | |
| | — | | — | • | ○ | ○ | • | |
| | | | | • | ○ | • | •(○) | |
| | | | | • | • | • | •(○) | (2.06e-10) |
§: Correlation coefficients for Dn versus θ or P were calculated controlling for Dx, and the correlation coefficients for Dx versus θ or P were calculated controlling for Dn.
†: Correlation coefficients for Dn or Dx versus P were not calculated or were calculated without controlling for d, as P (= θ/d) is not independent of d.
**: P<1e-10.
*: 1e-10≤P<1e-3.
NS: P≥1e-3.
Closed circles (•) indicate the controlled variables. Highly significant values (P<1e-10) are in bold. P-values are given in parentheses for marginally significant (1e-10≤P<1e-3) values.
Figure 3. Relationships between the levels of functional divergence and neutral polymorphism.
Scatter plots (orange dots) show (A) the divergence at coding sites (Dn) versus the level of neutral polymorphism (θ), (B) the divergence at conserved noncoding regions (Dx) versus the level of neutral polymorphism (θ), (C) the divergence at coding sites (Dn) versus the level of normalized neutral polymorphism (P = θ/d), and (D) the divergence at conserved noncoding regions (Dx) versus the level of normalized neutral polymorphism (P = θ/d). Black circles are averages of the orange dots pooled into 100 bins, each containing 1% of the data points.