Molly A Hall, Shefali S Verma, John Wallace, Anastasia Lucas, Richard L Berg, John Connolly, Dana C Crawford, David R Crosslin, Mariza de Andrade, Kimberly F Doheny, Jonathan L Haines, John B Harley, Gail P Jarvik, Terrie Kitchner, Helena Kuivaniemi, Eric B Larson, David S Carrell, Gerard Tromp, Tamara R Vrabec, Sarah A Pendergrass, Catherine A McCarty, Marylyn D Ritchie.
Abstract
Bioinformatics approaches to examine gene-gene models provide a means to discover interactions between multiple genes that underlie complex disease. Extensive computational demands and adjusting for multiple testing make uncovering genetic interactions a challenge. Here, we address these issues using our knowledge-driven filtering method, Biofilter, to identify putative single nucleotide polymorphism (SNP) interaction models for cataract susceptibility, thereby reducing the number of models for analysis. Models were evaluated in 3,377 European Americans (1,185 controls, 2,192 cases) from the Marshfield Clinic, a study site of the Electronic Medical Records and Genomics (eMERGE) Network, using logistic regression. All statistically significant models from the Marshfield Clinic were then evaluated in an independent dataset of 4,311 individuals (742 controls, 3,569 cases), using independent samples from additional study sites in the eMERGE Network: Mayo Clinic, Group Health/University of Washington, Vanderbilt University Medical Center, and Geisinger Health System. Eighty-three SNP-SNP models replicated in the independent dataset at likelihood ratio test P < 0.05. Among the most significant replicating models was rs12597188 (intron of CDH1)-rs11564445 (intron of CTNNB1). These genes are known to be involved in processes that include cell-to-cell adhesion signaling, cell-cell junction organization, and cell-cell communication. Further Biofilter analysis of all replicating models revealed a number of common functions among the genes harboring the 83 replicating SNP-SNP models, which included signal transduction and the PI3K-Akt signaling pathway. These findings demonstrate the utility of Biofilter as a biology-driven method, applicable to any genome-wide association study dataset.
Keywords: association; complex disease; genetic interaction
Year: 2015 PMID: 25982363 PMCID: PMC4550090 DOI: 10.1002/gepi.21902
Source DB: PubMed Journal: Genet Epidemiol ISSN: 0741-0395 Impact factor: 2.135
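The abstract reports that models were fit with logistic regression and called replicated at likelihood ratio test (LRT) P < 0.05. The sketch below shows one common way such a P-value is computed. It assumes the test compares a reduced model (SNP main effects only) with a full model that adds a single SNP-SNP interaction term, giving 1 degree of freedom; that model structure, the function names, and the log-likelihood values are illustrative assumptions, not details taken from this record.

```python
import math


def lrt_p_value(loglik_reduced: float, loglik_full: float) -> float:
    """P-value of a 1-degree-of-freedom likelihood ratio test.

    Assumes the full model nests the reduced model and adds exactly one
    parameter (here, a hypothetical SNP-SNP interaction term).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def replicates(p_discovery: float, p_replication: float, alpha: float = 0.05) -> bool:
    """True if a model is significant in both discovery and replication."""
    return p_discovery < alpha and p_replication < alpha


# Hypothetical log-likelihoods for one SNP-SNP model in each dataset.
p_disc = lrt_p_value(loglik_reduced=-2100.0, loglik_full=-2095.0)
p_rep = lrt_p_value(loglik_reduced=-1900.0, loglik_full=-1897.5)
print(p_disc, p_rep, replicates(p_disc, p_rep))
```

The closed form with `math.erfc` holds only for 1 degree of freedom; a test with more added parameters would need a general chi-square survival function.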
Study population characteristics
| Dataset | eMERGE study site | Cases: male | Cases: female | Cases: all | Controls: male | Controls: female | Controls: all | Total: male | Total: female | Total: all |
|---|---|---|---|---|---|---|---|---|---|---|
| Discovery | Marshfield | 934 | 1,258 | 2,192 | 474 | 711 | 1,185 | 1,408 | 1,969 | 3,377 |
| Replication | Mayo, Group Health/University of Washington, Vanderbilt, Geisinger | 1,726 | 1,843 | 3,569 | 400 | 342 | 742 | 2,126 | 2,185 | 4,311 |
| Total | All | 2,660 | 3,101 | 5,761 | 874 | 1,053 | 1,927 | 3,534 | 4,154 | 7,688 |
Sample sizes are given for cataract cases, controls, and the total population in the discovery and replication datasets, after quality control.
Figure 1. Steps involved in generating Biofilter SNP-SNP models. (A) Biofilter accessed LOKI-compiled databases with information about connections between genes (for the example shown here: connections within a pathway). (B) Biofilter generated gene-gene models based on connections between genes that were validated by five or more databases. (C) For each gene-gene model, pairwise SNP-SNP models were created for each unique combination of loci across a gene pair.
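Steps (B) and (C) of Figure 1 can be sketched in a few lines. This is a minimal illustration and not Biofilter's actual code or API: the function name, the input dictionaries, the source counts, and the placeholder SNP identifiers `rs_hypothetical_1` and `rs_hypothetical_2` are all assumptions. Only the two rs IDs and gene names from the abstract, and the five-database threshold, come from the text.

```python
from itertools import product


def snp_snp_models(gene_pairs, gene_to_snps, source_counts, min_sources=5):
    """Build pairwise SNP-SNP models from knowledge-supported gene pairs.

    gene_pairs:    iterable of (gene_a, gene_b) tuples
    gene_to_snps:  dict mapping gene name -> list of SNP IDs in that gene
    source_counts: dict mapping (gene_a, gene_b) -> number of databases
                   supporting the connection (hypothetical input format)
    """
    models = []
    seen = set()
    for gene_a, gene_b in gene_pairs:
        # Step (B): keep gene-gene models supported by enough databases.
        if source_counts.get((gene_a, gene_b), 0) < min_sources:
            continue
        # Step (C): every combination of one SNP from each gene.
        for snp_a, snp_b in product(gene_to_snps.get(gene_a, []),
                                    gene_to_snps.get(gene_b, [])):
            key = frozenset((snp_a, snp_b))
            if snp_a == snp_b or key in seen:
                continue  # skip self-pairs and duplicate combinations
            seen.add(key)
            models.append((snp_a, snp_b))
    return models


# Illustrative inputs; counts and placeholder SNP IDs are made up.
gene_to_snps = {
    "CDH1": ["rs12597188", "rs_hypothetical_1"],
    "CTNNB1": ["rs11564445"],
    "GENE_X": ["rs_hypothetical_2"],
}
gene_pairs = [("CDH1", "CTNNB1"), ("CDH1", "GENE_X")]
source_counts = {("CDH1", "CTNNB1"): 6, ("CDH1", "GENE_X"): 2}

models = snp_snp_models(gene_pairs, gene_to_snps, source_counts)
print(models)
```

Filtering on database support before expanding to SNP pairs is what reduces the number of models to test, since the pairwise expansion grows with the product of the SNP counts in each gene.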
Figure 2. Flow chart of steps in the discovery and replication analyses.
Figure 3. All replicating SNP-SNP models with LRT P < 0.01 in both the replication and discovery datasets. SNP-SNP models are shown above with the –log10 of the P-value in the track directly beneath (discovery values are in blue and replication values are in red). Visualization was performed using Synthesis View software [Pendergrass et al., 2010].
Figure 4. Ten most significant replicating SNP-SNP models, ranked by significance level in the discovery (A) and replication (B) samples. In both panels, the SNP-SNP models and their nearest genes are listed to the left. The track to the right of each displays the –log10 of the P-value for the discovery (blue) and replication (red) groups. Figures were made using Synthesis View.
Figure 5. Common groups relating to genes in replicating SNP-SNP models. Panels display two of the most common groups (yellow) and the genes that Biofilter annotated with that group (blue): (A) signal transduction and (B) PI3K-Akt signaling pathway. Solid lines indicate group-gene connections; dotted lines indicate gene-gene connections from the interaction analysis. Plots were generated using Cytoscape software [Saito et al., 2012].