Sridhar Kudaravalli, Jean-Baptiste Veyrieras, Barbara E Stranger, Emmanouil T Dermitzakis, Jonathan K Pritchard.
Abstract
Changes in gene expression may represent an important mode of human adaptation. However, to date, there are relatively few known examples in which selection has been shown to act directly on levels or patterns of gene expression. In order to test whether single nucleotide polymorphisms (SNPs) that affect gene expression in cis are frequently targets of positive natural selection in humans, we analyzed genome-wide SNP and expression data from cell lines associated with the International HapMap Project. Using a haplotype-based test for selection that was designed to detect incomplete selective sweeps, we found that SNPs showing signals of selection are more likely than random SNPs to be associated with gene expression levels in cis. This signal is significant in the Yoruba (which is the population that shows the strongest signals of selection overall) and shows a trend in the same direction in the other HapMap populations. Our results argue that selection on gene expression levels is an important type of human adaptation. Finally, our work provides an analytical framework for tackling a more general problem that will become increasingly important: namely, testing whether selection signals overlap significantly with SNPs that are associated with phenotypes of interest.
Year: 2008 PMID: 19091723 PMCID: PMC2767089 DOI: 10.1093/molbev/msn289
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure: The abundance of eQTL signals in SNPs with and without evidence for selection (YRI data). In each plot, the red data correspond to SNPs with strong evidence for selection (|iHS| > 2 and surrounded by an unusual cluster of other high |iHS| SNPs); blue data are for SNPs with |iHS| > 2; and black data are for all SNPs. (A) Quantile–quantile plots of the distributions of −log10(P values) obtained from testing the expression levels at each gene for association with nearby SNPs. The dashed line indicates the expected distribution of P values if there were no true associations between SNPs and gene expression levels. Notice that SNPs with high |iHS| (red and blue data) show a higher rate of significant P values compared with SNPs without a signal of selection. (B) SNPs with high |iHS| show an enrichment for eQTLs at various distances from the transcription start site. (C) SNPs with high iHS tend to be enriched for eQTLs after controlling for allele frequency. The enrichment may be highest in the frequency ranges where iHS has the greatest power (roughly 50–80%; Voight et al. 2006). (D) SNPs with high iHS show generally higher rates of eQTLs after controlling for LD levels, as measured by the number of SNPs in high LD with the SNP in question (r² > 0.8). For analogous plots of the other two populations, see the Supplementary Material online.
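The quantile–quantile comparison in panel (A) can be illustrated with a minimal sketch. It assumes the null expectation is a uniform distribution of P values, so that the i-th smallest of n P values is expected to be about i/(n + 1). The function name `qq_points` and the input values are hypothetical, and this is not necessarily how the plotted quantiles were computed.

```python
import math

def qq_points(p_values):
    """Pair expected and observed -log10(P) values for a QQ plot.

    Assumes every P value is in (0, 1]; a P value of exactly 0 would
    make -log10 undefined and is not handled here.
    """
    n = len(p_values)
    # Observed values, sorted from least to most significant.
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null, the i-th smallest P value is expected near i / (n + 1).
    expected = sorted(-math.log10(i / (n + 1)) for i in range(1, n + 1))
    return list(zip(expected, observed))

# Hypothetical P values, for illustration only.
points = qq_points([0.5, 0.01, 0.2])
for exp_val, obs_val in points:
    print(f"expected={exp_val:.3f} observed={obs_val:.3f}")
```

Points lying above the diagonal (observed greater than expected) correspond to an excess of small P values relative to the null.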
Figure: Two examples in which an eQTL is centered on a strong signal of selection. The upper half of each plot (green and red points) shows the strength of association between SNPs and gene expression levels (plotted as −log10(P values) of the indicated gene). The lower half of each plot (blue and red points) indicates −|iHS| scores for the same set of SNPs. Red points indicate SNPs that are both strongly associated with expression (P < 10⁻⁴) and have |iHS| > 2. The positions of the genes of interest are indicated by the red bars at the center of each plot. (A) Data from SLC25A16 (YRI). (B) Data from SPATA20 (ASN). According to the sliding-window analysis, the clusters of high |iHS| signals are in the 2.5% and 1% tails of the empirical Yoruba (A) and Asian (B) distributions, respectively. The favored haplotypes are at 60% and 89% frequency, respectively.
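As a small illustration of the red-point criterion in this caption, the sketch below flags SNPs that have both an association P value below 10⁻⁴ and |iHS| above 2. The record layout (`id`, `p`, `ihs`), the function name `flag_overlap`, and the example values are hypothetical and are not taken from the study data.

```python
def flag_overlap(snps, p_cutoff=1e-4, ihs_cutoff=2.0):
    """Return IDs of SNPs that pass both the expression and selection cutoffs."""
    return [
        snp["id"]
        for snp in snps
        # Both conditions must hold: strong association and high |iHS|.
        if snp["p"] < p_cutoff and abs(snp["ihs"]) > ihs_cutoff
    ]

# Hypothetical records, for illustration only.
example = [
    {"id": "snpA", "p": 3e-6, "ihs": -2.7},  # passes both cutoffs
    {"id": "snpB", "p": 3e-6, "ihs": 1.1},   # associated, but low |iHS|
    {"id": "snpC", "p": 0.2, "ihs": 2.9},    # high |iHS|, but not associated
]
print(flag_overlap(example))
```

The absolute value matters because iHS is signed, and the cutoff applies to its magnitude.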
Enrichment of eQTLs among SNPs with Signals of Selection, as Estimated by Two Different Methods
| Population | Analysis | Odds Ratio | 95% CI | Number of Genes |
| YRI | LR | 2.41 | 1.23–4.27 | 35 |
| YRI | HM | 5.43 | 2.17–13.87 | 35 |
| CEU | LR | 2.29 | 0.83–4.35 | 16 |
| CEU | HM | 2.26 | 0.45–8.03 | 16 |
| ASN | LR | 1.41 | 0.82–2.38 | 47 |
| ASN | HM | 1.52 | 0.58–3.43 | 47 |
NOTE.—For each population separately, we used logistic regression (LR) and a hierarchical model (HM) to estimate the odds ratio that an SNP with a selection signal (|iHS| > 2 and a cluster-based signal in the top 5%) is an eQTL, compared with a comparable SNP without a selection signal. The 95% CIs were estimated as described in the Methods. “Number of genes” indicates the number of genes for which at least one SNP is both associated with gene expression and has a selection signal.
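To make the odds-ratio quantity concrete, here is a deliberately simplified sketch: an unadjusted odds ratio from a 2×2 count table with a Wald-type 95% interval. This is a swapped-in textbook calculation, not the logistic regression or hierarchical model reported in the table, and not the CI procedure described in the Methods. The function name `odds_ratio_wald` and all counts are hypothetical.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: SNPs with a selection signal that are eQTLs
    b: SNPs with a selection signal that are not eQTLs
    c: SNPs without a selection signal that are eQTLs
    d: SNPs without a selection signal that are not eQTLs
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_wald(20, 80, 10, 90)
print(f"OR={estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that includes 1, as in the CEU and ASN rows of the table, means the enrichment cannot be distinguished from no enrichment at that confidence level.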