Nirmala Akula, Ancha Baranova, Donald Seto, Jeffrey Solka, Michael A Nalls, Andrew Singleton, Luigi Ferrucci, Toshiko Tanaka, Stefania Bandinelli, Yoon Shin Cho, Young Jin Kim, Jong-Young Lee, Bok-Ghee Han, Francis J McMahon.
Abstract
Genome-wide association studies (GWAS) are a valuable approach to understanding the genetic basis of complex traits. One of the challenges of GWAS is the translation of genetic association results into biological hypotheses suitable for further investigation in the laboratory. To address this challenge, we introduce Network Interface Miner for Multigenic Interactions (NIMMI), a network-based method that combines GWAS data with human protein-protein interaction (PPI) data. NIMMI builds biological networks weighted by connectivity, which is estimated with a modification of the Google PageRank algorithm. These weights are then combined with genetic association p-values derived from GWAS, producing what we call 'trait prioritized sub-networks.' As a proof of principle, NIMMI was tested on three GWAS datasets previously analyzed for height, a classical polygenic trait. Despite differences in sample size and ancestry, NIMMI captured 95% of the known height-associated genes within the top 20% of ranked sub-networks, far better than what could be achieved by a single-locus approach. The top 2% of NIMMI height-prioritized sub-networks were significantly enriched for genes involved in transcription, signal transduction, transport, and gene expression, as well as nucleic acid, phosphate, protein, and zinc metabolism. All of these sub-networks were ranked near the top across all three height GWAS datasets we tested. We also tested NIMMI on a categorical phenotype, Crohn's disease. NIMMI prioritized sub-networks involved in B- and T-cell receptor, chemokine, interleukin, and other pathways consistent with the known autoimmune nature of Crohn's disease. NIMMI is a simple, user-friendly, open-source software tool that efficiently combines genetic association data with biological networks, translating GWAS findings into biological hypotheses.
Year: 2011 PMID: 21915301 PMCID: PMC3168369 DOI: 10.1371/journal.pone.0024220
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. NIMMI flowchart.
An overview of the dataflow in NIMMI is shown in Figure 1. The data shown here are drawn from the InCHIANTI height GWAS dataset. Approximately 2.5 million SNPs were analyzed in PLINK with the parameters specified under the GWAS data module (see Design and Implementation). This resulted in ∼2.4 million SNPs with association p-values, which were then assigned to 17,783 genes. Gene assignment and gene-based p-values were calculated using VEGAS. These gene-based p-values were converted to z-scores and combined, using the Liptak-Stouffer method, with the gene weights calculated in the network by the modified Google PageRank algorithm. The resulting ‘trait prioritized sub-networks’ were then evaluated in DAVID.
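The combination step described in this caption can be sketched as follows. This is a minimal illustration of a weighted Liptak-Stouffer (weighted Z) combination, assuming one-sided p-values and the standard form Z = Σ wᵢzᵢ / √(Σ wᵢ²); the function names, the toy p-values, and the toy weights are hypothetical, and the exact weighting NIMMI applies may differ.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def p_to_z(p):
    """Convert a one-sided p-value (0 < p < 1) to a z-score.

    Small p-values map to large positive z-scores.
    """
    return -_STD_NORMAL.inv_cdf(p)


def z_to_p(z):
    """Convert a z-score back to a one-sided p-value."""
    return 1.0 - _STD_NORMAL.cdf(z)


def liptak_stouffer(p_values, weights):
    """Weighted Stouffer combination: sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    z_scores = [p_to_z(p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Hypothetical gene-based p-values and network weights for one sub-network.
gene_p = [0.001, 0.04, 0.30]
gene_w = [0.5, 0.3, 0.2]
combined_z = liptak_stouffer(gene_p, gene_w)
combined_p = z_to_p(combined_z)
```

Dividing by the root of the summed squared weights keeps the combined statistic on the standard normal scale under the null, so sub-networks of different sizes remain comparable.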
Figure 2. Comparison of gene-based percentile ranks with NIMMI’s network percentile ranks.
The x-axis shows the candidate genes for height and the y-axis shows the percentile rank. Blue triangles represent the InCHIANTI GWAS dataset, red squares represent the Korean height GWAS dataset and green circles represent the GAIN Controls height GWAS dataset. Figure 2a shows the single-locus ranking and Figure 2b shows NIMMI network-based ranking for 34 height candidate genes.
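One way to compute the kind of percentile rank plotted on the y-axis is sketched below. The definition used here is an assumption, not taken from the paper: a gene's percentile is the share of all genes whose p-value is less than or equal to its own, so a lower value means a better rank. The gene names and p-values are hypothetical.

```python
from bisect import bisect_right


def top_percentile_ranks(gene_p):
    """Map each gene to the percentage of genes with p-value <= its own."""
    sorted_p = sorted(gene_p.values())
    n = len(sorted_p)
    return {gene: 100.0 * bisect_right(sorted_p, p) / n
            for gene, p in gene_p.items()}


def fraction_within(ranks, candidates, cutoff):
    """Fraction of candidate genes whose percentile rank is <= cutoff."""
    hits = sum(1 for gene in candidates if ranks[gene] <= cutoff)
    return hits / len(candidates)


# Hypothetical gene-based p-values.
toy_p = {"GENE_A": 0.001, "GENE_B": 0.2, "GENE_C": 0.05, "GENE_D": 0.9}
ranks = top_percentile_ranks(toy_p)
# GENE_A -> 25.0, GENE_C -> 50.0, GENE_B -> 75.0, GENE_D -> 100.0
captured = fraction_within(ranks, ["GENE_A", "GENE_C"], cutoff=25.0)  # 0.5
```

The same helper applies to network-level scores: replace the gene p-values with one score per sub-network and assign each candidate gene the rank of the sub-network that contains it.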
Figure 3. Network overlap.
The overlap among the top 2% of NIMMI-prioritized networks in the InCHIANTI, Korean, and GAIN Controls datasets comprises 38 networks common to all three datasets. Five networks are common to the InCHIANTI and GAIN Controls datasets only, seven to the Korean and GAIN Controls datasets only, and four to the InCHIANTI and Korean datasets only. Ten networks are specific to the InCHIANTI dataset, whereas the Korean and GAIN Controls datasets have 8 and 7 dataset-specific networks, respectively.
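The region counts in this caption can be checked with a few lines of arithmetic. The sketch below assumes the seven counts are mutually exclusive Venn regions; the dictionary layout and function name are illustrative only.

```python
# Region counts transcribed from the Figure 3 caption, keyed by the set of
# datasets that share the networks in that region (assumed exclusive regions).
regions = {
    frozenset({"InCHIANTI", "Korean", "GAIN Controls"}): 38,
    frozenset({"InCHIANTI", "GAIN Controls"}): 5,
    frozenset({"Korean", "GAIN Controls"}): 7,
    frozenset({"InCHIANTI", "Korean"}): 4,
    frozenset({"InCHIANTI"}): 10,
    frozenset({"Korean"}): 8,
    frozenset({"GAIN Controls"}): 7,
}


def dataset_total(name):
    """Number of top-2% networks contributed by one dataset."""
    return sum(count for members, count in regions.items() if name in members)


totals = {name: dataset_total(name)
          for name in ("InCHIANTI", "Korean", "GAIN Controls")}
union_size = sum(regions.values())

print(totals)       # {'InCHIANTI': 57, 'Korean': 57, 'GAIN Controls': 57}
print(union_size)   # 79
```

Under that assumption each dataset contributes 57 networks to its top 2%, so the 38 shared networks make up two thirds of each dataset's list.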
NIMMI 'height prioritized sub-networks' vs. Cytoscape.
| GO-terms | NIMMI sub-networks | Cytoscape sub-networks |
| E | x | x |
| H | x | |
| M | x | x |
| N | x | x |
| P | x | x |
| R | x | x |
| S | x | x |
| T | x | x |
| Z | x | |
E-Gene Expression; H-Steroid Hormone receptor signaling; M-Protein metabolic process/protein modification process; N-Nucleic acid metabolism/Nucleic acid binding/DNA-Replication; P-Phosphate/phosphorus metabolic process; R-RNA processing/RNA binding/RNA metabolic process/RNA splicing/Transcription/Transcription Regulation; S-Signal transduction/Intracellular signaling/Cell communication; T-Transport/localization; Z-metal ion binding/zinc ion binding.
NIMMI prioritized sub-networks for Crohn's Disease.
| Crohn's Disease Bonferroni-corrected p-value | DAVID GO Set 1* | Genes in Set 1 | Enrichment p-value | DAVID GO Set 2* | Genes in Set 2 | Enrichment p-value | DAVID GO Set 3* | Genes in Set 3 | Enrichment p-value |
| 9.51E-14 | A | 29/96 | 3.50E-08 | N | 31/96 | 7.30E-05 | M | 26/96 | 1.90E-10 |
| 2.12E-13 | | 27/107 | 1.90E-03 | | 33/107 | 1.70E-03 | R | 27/107 | 4.40E-11 |
| 3.30E-13 | | 28/110 | 2.60E-03 | O | 27/110 | 1.00E-06 | S | 38/110 | 1.50E-07 |
| 3.53E-13 | | 30/91 | 7.30E-07 | | 23/91 | 4.90E-04 | | | |
| 3.23E-16 | R | 65/115 | 1.90E-20 | O | 28/115 | 4.90E-04 | E | 35/115 | 3.80E-15 |
| 4.10E-13 | | 74/152 | 2.50E-18 | N | 41/152 | 9.00E-15 | S | 38/152 | 1.50E-04 |
| 2.89E-13 | N | 31/89 | 9.40E-05 | | | | | | |
| 2.08E-14 | | 31/123 | 1.80E-02 | | | | | | |
A-Apoptosis; E-Gene Expression; M-Protein metabolic process/protein modification process; N-Nucleic acid metabolism/Nucleic acid binding/DNA-Replication; O-response to organic substance; R-RNA processing/RNA binding/RNA metabolic process/RNA splicing/Transcription/Transcription Regulation; S-Signal transduction/Intracellular signaling/Cell communication.
NIMMI prioritized Crohn's Disease enriched KEGG/BioCarta pathways.
| enriched KEGG/BioCarta pathways | enrichment p-value |
| Adherens junction | 8.70E-07 |
| Apoptosis | 3.50E-05 |
| B-cell receptor signaling | 1.10E-02 |
| Cell cycle | 1.70E-07 |
| Chemokine signaling | 9.90E-03 |
| Control of gene expression by vitamin D receptor | 1.30E-07 |
| EGF signaling | 7.60E-04 |
| ErbB signaling | 8.10E-06 |
| Erk1/Erk2 Mapk signaling | 9.20E-05 |
| Fc gamma R-mediated phagocytosis | 5.80E-14 |
| Focal adhesion | 3.20E-06 |
| IL-2 Receptor Beta Chain in T cell Activation | 1.60E-02 |
| IL6 signaling | 2.80E-03 |
| Insulin signaling | 2.90E-04 |
| Jak-STAT signaling | 2.70E-03 |
| Neurotrophin signaling | 1.20E-02 |
| p53 signaling | 1.20E-03 |
| Pathways in cancer | 6.30E-11 |
| Pelp1 Modulation of Estrogen Receptor Activity | 1.20E-04 |
| Role of PPAR-gamma Coactivators in Obesity and Thermogenesis | 2.40E-04 |
| T Cytotoxic Cell Surface Molecules | 8.00E-03 |
| T-cell receptor signaling | 1.50E-04 |
| TPO signaling | 7.30E-03 |
| Wnt signaling | 1.70E-03 |
Summary of GWAS datasets.
| Height Datasets | n | Observed [O] / Imputed [I] | SNPs after QC | Total Genes |
| InCHIANTI | 975 | I | 2,453,309 | 17,783 |
| Korean | 8,842 | O | 352,228 | 17,408 |
| GAIN Controls | 768 | O | 722,742 | 17,720 |
Figure 4. Architecture of Network Interface Miner for Multigenic Interactions (NIMMI).
Network Interface Miner for Multigenic Interactions (NIMMI) consists of three levels: SNPs, Genes, and Networks. Each level in turn has the modules needed to identify ‘trait prioritized sub-networks’. At the SNPs level (Level 1), the SNPs are analyzed in the GWAS data module using PLINK. The SNPs are then assigned to genes and a gene-wise p-value is calculated using VEGAS (Level 2). At the Networks level (Level 3), the Database Miner and Network generator module mines the BioGRID database for human PPIs and creates two-step networks, which are then ranked using the modified Google PageRank algorithm in the Gene/Network ranker and prioritizer module. The association p-value of a gene from Level 2 and the gene weight from Level 3 are then combined using the Liptak-Stouffer method. The resulting ‘trait prioritized sub-networks’ are then evaluated in DAVID.
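The Level 3 steps can be illustrated with a small sketch. It uses the standard PageRank power iteration, not the authors' modification, and assumes that a "two-step network" means a seed gene plus every gene reachable within two interaction hops. The toy edge list, gene labels, and function names are hypothetical and stand in for PPIs mined from BioGRID.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) interaction pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def two_step_network(adj, seed):
    """Return the seed, its interactors, and the interactors of those genes."""
    first = set(adj[seed])
    second = {n for gene in first for n in adj[gene]}
    return {seed} | first | second


def induced_subgraph(adj, nodes):
    """Restrict the adjacency map to the given set of nodes."""
    return {v: adj[v] & nodes for v in nodes}


def pagerank(adj, damping=0.85, iterations=100, tol=1e-10):
    """Plain PageRank by power iteration on an undirected graph.

    Assumes every node has at least one neighbour, which holds for any
    graph produced by build_graph.
    """
    nodes = sorted(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {}
        for v in nodes:
            incoming = sum(rank[u] / len(adj[u]) for u in adj[v])
            new_rank[v] = (1.0 - damping) / n + damping * incoming
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical interactions; real input would come from BioGRID.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]
graph = build_graph(toy_edges)
subnet_nodes = two_step_network(graph, "A")        # F is three hops away
weights = pagerank(induced_subgraph(graph, subnet_nodes))
```

The resulting per-gene weights play the role of the gene weights that are combined with the Level 2 p-values; in this toy graph the most connected gene, A, receives the largest weight.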