Futao Zhang, Eric Boerwinkle, Momiao Xiong.
Abstract
The critical barrier in interaction analysis for rare variants is that most traditional statistical methods for testing interactions were originally designed for common variants and are difficult to apply to rare variants because of their prohibitive computational time and low power. The great challenges for successful detection of interactions with next-generation sequencing (NGS) data are (1) lack of methods for interaction analysis with rare variants, (2) severe multiple testing, and (3) time-consuming computations. To meet these challenges, we shift the paradigm of interaction analysis from two loci to two sets of loci or genomic regions. That is, we take a genomic region as the basic unit of interaction analysis and use high-dimensional data reduction and functional data analysis techniques to develop a novel functional regression model that collectively tests interactions between all possible pairs of single nucleotide polymorphisms (SNPs) within two genomic regions. Through intensive simulations, we demonstrate that the functional regression models for interaction analysis of a quantitative trait have correct type 1 error rates and much higher power to detect interactions than the current pairwise interaction analysis. The proposed method was applied to exome sequence data from the NHLBI's Exome Sequencing Project (ESP) and the CHARGE-S study. We discovered 27 pairs of genes showing significant interactions after applying the Bonferroni correction (P-values < 4.58 × 10⁻¹⁰) in the ESP, and 11 were replicated in the CHARGE-S study.
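To give a sense of the multiple-testing burden behind the reported Bonferroni threshold, the arithmetic can be sketched as follows. This is only an illustration under two assumptions that the abstract does not state: a family-wise level of α = 0.05, and one test per unordered pair of genes. The number of genes is inferred from the threshold and is not a figure reported here; the function names are hypothetical.

```python
import math

ALPHA = 0.05          # ASSUMPTION: family-wise level, not stated in the abstract
THRESHOLD = 4.58e-10  # Bonferroni-corrected threshold reported in the abstract


def implied_number_of_tests(alpha, threshold):
    """Bonferroni sets threshold = alpha / m, so m = alpha / threshold."""
    return alpha / threshold


def implied_number_of_units(n_tests):
    """Solve g * (g - 1) / 2 = n_tests for g (all unordered pairs of g units)."""
    return (1 + math.sqrt(1 + 8 * n_tests)) / 2


def bonferroni_threshold(alpha, n_units):
    """Per-test threshold when every unordered pair of n_units is tested."""
    n_pairs = n_units * (n_units - 1) // 2
    return alpha / n_pairs


n_tests = implied_number_of_tests(ALPHA, THRESHOLD)
n_genes = implied_number_of_units(n_tests)

print(f"implied number of gene-pair tests: {n_tests:.3e}")
print(f"implied number of genes (ASSUMING all pairs tested): {n_genes:.0f}")
print(f"threshold for 14777 genes: {bonferroni_threshold(ALPHA, 14777):.3e}")
```

Under these assumptions the threshold corresponds to roughly 1.09 × 10⁸ region-level tests. A pairwise analysis of individual SNPs would involve more tests still, since each pair of genes contributes one test per pair of SNPs between them.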
Year: 2014 PMID: 24803592 PMCID: PMC4032862 DOI: 10.1101/gr.161760.113
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Average type 1 error rates of the statistics for testing interaction between two genes with rare variants
Figure 1. Power curves of three statistics (the FRG, the regression on PCA, and pairwise interaction tests) for testing the interaction between two genomic regions at the significance level α = 0.05. Permutations were used to adjust for multiple testing. Panels A to E consider a quantitative trait and genomic regions that consist of rare variants; panel F considers genomic regions with both common and rare variants, where 10% of the common variants and 10% of the rare variants were chosen as causal variants.
(A) Power as a function of the relative risk parameter r under the Dominant OR Dominant model, assuming a sample size of 2000.
(B) Power as a function of the relative risk parameter r under the Dominant AND Dominant model, assuming a sample size of 2000.
(C) Power as a function of the relative risk parameter r under the Recessive OR Recessive model, assuming a sample size of 2000.
(D) Power as a function of the relative risk parameter r under the Threshold model, assuming a sample size of 2000.
(E) Power as a function of the sample size under the Dominant OR Dominant model, assuming the relative risk parameter r = 0.1.
(F) Power as a function of the relative risk parameter r under the Dominant OR Dominant model, assuming a sample size of 2000.
Figure 2. (A) QQ plot for the ESP data set. (B) QQ plot for the CHARGE-S data set.
P-values of 11 pairs of genes that showed significant interactions in both the ESP and CHARGE-S studies
P-values of 27 pairs of genes with significant interactions identified by FRG
P-values of 35 pairs of SNPs between genes KCNK5 and PRDM13 for testing interaction
Figure 3. Networks of 27 pairs of genes showing significant evidence of interactions and genes showing mild interactions in Supplemental Table S5.
Figure 4. Nine interactions (pink) between genes (green), which form a subnetwork, were replicated in the NHLBI’s ESP and CHARGE-S studies.