Karen Kapur, Thierry Schüpbach, Ioannis Xenarios, Zoltán Kutalik, Sven Bergmann.
Abstract
Genome-wide association studies have been instrumental in identifying genetic variants associated with complex traits such as human disease or gene expression phenotypes. It has been proposed that extending existing analysis methods by considering interactions between pairs of loci may uncover additional genetic effects. However, the large number of possible two-marker tests presents significant computational and statistical challenges. Although several strategies to detect epistatic effects have been proposed and tested for specific phenotypes, so far there has been no systematic attempt to compare their performance using real data. We made use of thousands of gene expression traits from linkage and eQTL studies to compare the performance of different strategies. We found that using information from marginal associations between markers and phenotypes to detect epistatic effects yielded a lower false discovery rate (FDR) than a strategy solely using biological annotation in yeast, whereas results from human data were inconclusive. For future studies whose aim is to discover epistatic effects, we recommend incorporating information about marginal associations between SNPs and phenotypes instead of relying solely on biological annotation. Improved methods to discover epistatic effects will result in a more complete understanding of complex genetic effects.
Year: 2011 PMID: 22205949 PMCID: PMC3242756 DOI: 10.1371/journal.pone.0028415
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Comparison of the FDR (determined at cutoffs corresponding to the 0.1% quantile of permutation p-values) for detecting interactions in yeast gene expression data among the different subset strategies.
(A) The FDR is plotted against the number of SNP pairs for MM, MG and ST in red, green and blue, respectively. (B–D) The FDR is shown for MM, MG and ST strategies compared to 500 MM0, MG0 and ST0 control strategies, respectively. Significance values are computed as the proportion of control strategies with FDR as low or lower. (B) Significance values 0.052, 0.072, 0.088, 0.05, 0.16, 0.54, 1.0, 1.0 and 0.15. (C) Significance values 0.17, 0.17, 0.048, 0.13, 0.32, 1.0, 1.0 and 0.16. (D) Significance values 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0.
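The significance values quoted in this caption are described as the proportion of control strategies with an FDR as low or lower than that of the tested strategy. A minimal sketch of that calculation follows; the function name `empirical_significance` and its arguments are hypothetical and not taken from the paper, and the FDR values in the usage note are made up for illustration.

```python
def empirical_significance(strategy_fdr, control_fdrs):
    """Proportion of control strategies with FDR as low as or lower than the strategy's.

    strategy_fdr: FDR of the subset strategy under test (e.g. MM).
    control_fdrs: FDRs of the matched control strategies (e.g. 500 MM0 runs).
    Both names are illustrative assumptions, not identifiers from the paper.
    """
    if not control_fdrs:
        raise ValueError("at least one control strategy is required")
    # Count controls that do at least as well as the tested strategy.
    as_low_or_lower = sum(fdr <= strategy_fdr for fdr in control_fdrs)
    return as_low_or_lower / len(control_fdrs)
```

For example, `empirical_significance(0.2, [0.1, 0.2, 0.3, 0.4])` gives 0.5. Under this definition, with 500 controls, a reported value of 0.052 would correspond to 26 controls with an FDR as low or lower.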
Figure 2. Illustration of comparisons between subset strategies and control strategies for the MM strategy.
A subset strategy is applied to both gene expression measurements and randomly permuted measurements. The gene expression measurements define the number of hits at a given p-value threshold while the permutations are used to estimate the expected number of false positives, giving rise to an estimate of the FDR.
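The FDR estimate described in this caption can be sketched as follows: the cutoff is taken as a quantile of the permutation p-values, hits are counted in the real data at that cutoff, and the expected number of false positives is the average count per permutation. This is a sketch under stated assumptions, not the authors' implementation: the function name `estimate_fdr`, the pooling of p-values across permutations, and the simple order-statistic quantile are all assumptions made for illustration.

```python
def estimate_fdr(observed_pvalues, permuted_pvalues, quantile=0.001):
    """Estimate the FDR at a cutoff set by a quantile of permutation p-values.

    observed_pvalues: p-values from tests on the real expression measurements.
    permuted_pvalues: list of lists, one list of p-values per permutation.
    quantile: fraction of permutation p-values at or below the cutoff
              (0.001 mirrors the 0.1% quantile mentioned in the captions).
    Returns (cutoff, hits, fdr); fdr is None when there are no hits.
    """
    # Assumption: pool p-values over all permutations to define the cutoff.
    pooled = sorted(p for perm in permuted_pvalues for p in perm)
    index = max(int(quantile * len(pooled)) - 1, 0)
    cutoff = pooled[index]

    # Hits: tests on the real data that pass the cutoff.
    hits = sum(p <= cutoff for p in observed_pvalues)

    # Expected false positives: mean number of permuted tests passing the cutoff.
    expected_fp = sum(p <= cutoff for p in pooled) / len(permuted_pvalues)

    if hits == 0:
        return cutoff, hits, None
    # Cap at 1 because a proportion of false discoveries cannot exceed 100%.
    return cutoff, hits, min(expected_fp / hits, 1.0)
```

With toy inputs `observed = [0.001, 0.002, 0.5, 0.9]`, `permuted = [[0.01, 0.3, 0.6, 0.8], [0.2, 0.4, 0.7, 0.95]]` and `quantile=0.25`, the cutoff is 0.2, two observed tests pass it, one false positive is expected per permutation, and the estimated FDR is 0.5.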
Figure 3. Data supporting the most significant interaction from the MM strategy.
Capital-letter markers refer to RM11; lowercase-letter markers refer to BY4716 (S288c). Blue bars mark the model-predicted expression levels at each combination of genetic markers, green dots show the observed mean expression levels, and grey bars show the standard deviation.
Figure 4. Comparison of the FDR (determined at cutoffs corresponding to the 0.1% quantile of permutation p-values) for detecting interactions in human gene expression data among the different subset strategies.
The FDR is plotted against the number of SNP pairs for MM, MG and ST in red, green and blue, respectively.
Details of the two hits discovered by the MM strategy in a human CEU eQTL dataset.
| Gene | Probe | SNP1 | SNP2 | P-value |
|---|---|---|---|---|
| HLA-DRB1 | ILMN_20550_7330093 | rs3763313 | rs3129883 | 1.79e-14 |
| HLA-DRB5 | ILMN_3178_4390692 | rs984778 | rs206017 | 1.19e-13 |
| IFIT3 | ILMN_1944_2690452 | rs2197025 | rs2031339 | 9.13e-12 |
The FDR is estimated at 49%.