Lu Liu, Jinmao Wei, Jianhua Ruan.
Abstract
Detecting associations between an input gene set and annotated gene sets (e.g., pathways) is an important problem in modern molecular biology. In this paper, we propose two algorithms, termed NetPEA and NetPEA', for conducting network-based pathway enrichment analysis. Our algorithms consider not only shared genes but also gene-gene interactions. Both algorithms utilize a protein-protein interaction network and a random walk with restart procedure to identify hidden relationships between an input gene set and pathways, but they use different randomization strategies to evaluate statistical significance and, as a result, emphasize different pathway properties. Compared to an over-representation-based method, our algorithms can identify more statistically significant pathways. Compared to an existing network-based algorithm, EnrichNet, our algorithms have a higher sensitivity in revealing the true causal pathways while at the same time achieving a higher specificity. A literature review of selected results indicates that some of the novel pathways reported by our algorithms are biologically relevant and important. While the evaluations are performed only with KEGG pathways, we believe the algorithms can be valuable for general functional discovery from high-throughput experiments.
Keywords: enrichment analysis; gene sets; pathway; protein–protein interaction network; random walk with restart
Year: 2017 PMID: 28956817 PMCID: PMC5664096 DOI: 10.3390/genes8100246
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. The workflow for calculating similarity scores between the input gene set and pathways. RWR refers to a Random Walk with Restart procedure.
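
The RWR step named in the workflow can be illustrated with a minimal sketch. It assumes an undirected, unweighted network stored as an adjacency dictionary, a restart probability of 0.5, and a uniform restart vector over the input genes; the gene names `G1` to `G4` and the function name are hypothetical, and none of these settings are taken from the paper. How NetPEA turns the resulting probabilities into pathway similarity scores is not shown here.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency: dict mapping each node to a list of its neighbours (assumed
               undirected, with no isolated nodes, so probability mass is kept).
    seeds:     set of input genes; the walker restarts uniformly among them.
    """
    nodes = sorted(adjacency)
    seeds_in_network = [n for n in nodes if n in seeds]
    p0 = {n: (1.0 / len(seeds_in_network) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart contribution.
        nxt = {n: restart * p0[n] for n in nodes}
        # Each node passes its remaining mass equally to its neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network G1 - G2 - G3 - G4 (hypothetical genes), seeded at G1.
network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3"],
}
scores = random_walk_with_restart(network, {"G1"})
```

In this toy network the steady-state probability decays with distance from the seed, which is the property that lets a network-based method relate an input gene set to pathway genes it does not directly share.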
Spearman correlation coefficient between GSEA and four other approaches.
| NetPEA | 0.2373 | |||
| NetPEA’ | 0.3978 | 0.1265 | 0.182 | |
| ORA | 0.3968 | 0.2406 | 0.1602 | 0.264 |
| EnrichNet | 0.2967 | 0.219 | 0.1779 | 0.2726 |
| NetPEA | 0.1195 | |||
| NetPEA’ | 0.2911 | 0.2756 | 0.1419 | |
| ORA | 0.2507 | 0.332 | 0.1421 | 0.0583 |
| EnrichNet | 0.2167 | 0.3067 | 0.0823 | 0.0599 |
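
For readers unfamiliar with the statistic reported in this table, the sketch below computes a Spearman rank correlation between two lists of pathway scores in plain Python. The score lists are hypothetical and are not values from the paper; ties are handled with average ranks, which is the usual convention but an assumption here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-pathway scores from two methods, in the same pathway order.
method_a = [0.9, 0.1, 0.5, 0.3]
method_b = [0.8, 0.2, 0.3, 0.6]
rho = spearman(method_a, method_b)
```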
Significant pathways: NetPEA vs. ORA.
| Input Gene Set | # Unique Pathways | # Common Pathways | |||
|---|---|---|---|---|---|
| NetPEA | ORA | ||||
| Parkinson | 0 | 19 | 0 | 18 | |
| Lymphoma | 0 | 10 | 0 | 5 | |
| Breast cancer [ | 0 | 6 | 0 | 12 | |
| Breast cancer [ | 4 | 1 | 4 | 0 | 16 |
| Lung cancer [ | 0 | 3 | 0 | 12 | |
| Lung cancer [ | 1 | 6 | 0 | 17 | |
| Diabetes (down) | 0 | 4 | 0 | 2 | |
| Diabetes (up) | 0 | 0 | 0 | 2 | |
| Leukemia (down) | 0 | 1 | 0 | 2 | |
| Leukemia (up) | 0 | 3 | 0 | 7 | |
| Gender (down) | 0 | 0 | 0 | 1 | |
| Gender (up) | 0 | 1 | 0 | 2 | |
| p53 (down) | 0 | 1 | 0 | 5 | |
| p53 (up) | 1 | 5 | 0 | 18 | |
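
As background for the ORA baseline in this comparison, over-representation analysis is commonly implemented as a one-sided hypergeometric test on the number of genes shared between the input set and a pathway. The sketch below assumes that variant; whether the paper's ORA uses exactly this test, and the gene counts in the example, are assumptions rather than details from the source.

```python
from math import comb


def ora_p_value(n_background, n_pathway, n_input, n_overlap):
    """Probability of seeing at least n_overlap pathway genes in the input set.

    n_background: number of genes in the background (universe)
    n_pathway:    number of background genes annotated to the pathway
    n_input:      number of genes in the input set
    n_overlap:    number of input genes that are also in the pathway
    """
    upper = min(n_pathway, n_input)
    total = comb(n_background, n_input)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_input - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 5 of 50 input genes fall in a 100-gene pathway,
# against a background of 20,000 genes.
p = ora_p_value(20000, 100, 50, 5)
```

Because this test only counts shared genes, a pathway with no overlap can never be significant, which is the limitation that network-based scoring is meant to address.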
Significant pathways: NetPEA’ vs. ORA.
| Input Gene Set | # Unique Pathways | # Common Pathways | |||
|---|---|---|---|---|---|
| NetPEA’ | ORA | ||||
| Parkinson | 7 | 28 | 3 | 1 | 5 |
| Lymphoma | 16 | 5 | 4 | 0 | 6 |
| Breast cancer [ | 11 | 9 | 0 | 0 | 9 |
| Breast cancer [ | 5 | 9 | 1 | 0 | 11 |
| Lung cancer [ | 28 | 12 | 1 | 0 | 2 |
| Lung cancer [ | 27 | 12 | 1 | 0 | 11 |
| Diabetes (down) | 19 | 4 | 0 | 0 | 2 |
| Diabetes (up) | 13 | 2 | 0 | 0 | 0 |
| Leukemia (down) | 16 | 2 | 0 | 0 | 1 |
| Leukemia (up) | 15 | 6 | 0 | 0 | 4 |
| Gender (down) | 5 | 1 | 0 | 0 | 0 |
| Gender (up) | 20 | 3 | 0 | 0 | 0 |
| p53 (down) | 13 | 5 | 0 | 0 | 1 |
| p53 (up) | 27 | 15 | 0 | 0 | 9 |
Common significant pathways analysis.
| | NetPEA | NetPEA’ | ORA |
|---|---|---|---|
| Common pathways between two breast cancer data sets ([ | glycolysis/gluconeogenesis, homologous recombination, oocyte meiosis, p53 signaling, progesterone-mediated oocyte maturation, base excision repair, cell cycle | lipoic acid metabolism, progesterone-mediated oocyte maturation, cell cycle, proteasome, ubiquitin mediated proteolysis, oocyte meiosis | glycolysis/gluconeogenesis, homologous recombination, progesterone-mediated oocyte maturation, cell cycle, oocyte meiosis |
| Common pathways between two lung cancer data sets ([ | DNA replication, ECM-receptor interaction, focal adhesion, mismatch repair, nucleotide excision repair, pancreatic cancer, pathways in cancer, prostate cancer, small cell lung cancer, base excision repair, bladder cancer | antigen processing and presentation, base excision repair, DNA replication, ErbB signaling, Fc epsilon RI signaling, Fc gamma R-mediated phagocytosis, lysosome, mismatch repair, nucleotide excision repair, prostate cancer, Vibrio cholerae infection | focal adhesion, mismatch repair, pathways in cancer, small cell lung cancer |
Figure 2. Common pathways from two different datasets of the same disease. (a) Overlap between two breast cancer data sets; (b) Overlap between two lung cancer data sets.
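
The overlap shown in this figure amounts to set operations on the lists of significant pathways from each data set. A minimal sketch follows; the two short lists and the function name are hypothetical illustrations, not the paper's results.

```python
def compare_pathway_sets(first, second):
    """Split two lists of significant pathways into common and unique parts."""
    first, second = set(first), set(second)
    return {
        "common": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Hypothetical significant pathways from two data sets of the same disease.
dataset_a = ["cell cycle", "oocyte meiosis", "p53 signaling"]
dataset_b = ["cell cycle", "oocyte meiosis", "proteasome"]
overlap = compare_pathway_sets(dataset_a, dataset_b)
```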
Pathway cross-verification analysis for NetPEA.
| Input Gene Set | Positive: NetPEA | Positive: ORA | Positive: EnrichNet | Positive: GSEA | Negative: NetPEA | Negative: ORA | Negative: EnrichNet | Negative: GSEA |
|---|---|---|---|---|---|---|---|---|
| Lung cancer [ | 16 | 16 | 16 | 7 | 0 | 0 | 0 | 2 |
| Lung cancer [ | 14 | 16 | 16 | 4 | 0 | 8 | ||
| Diabetes (down) | 17 | 19 | 16 | 5 | 0 | 0 | 0 | 3 |
| Diabetes (up) | 15 | 16 | 13 | 5 | 1 | 3 | ||
| Leukemia (down) | 18 | 18 | 15 | 5 | 0 | 0 | 0 | 7 |
| Leukemia (up) | 15 | 19 | 15 | 8 | 0 | 4 | ||
| Gender (down) | 18 | 19 | 15 | 8 | 0 | 2 | ||
| Gender (up) | 17 | 19 | 15 | 7 | 0 | 0 | 0 | 1 |
| p53 (down) | 19 | 19 | 13 | 10 | 0 | 2 | ||
| p53 (up) | 16 | 15 | 16 | 6 | 0 | 3 | ||
Pathway cross-verification analysis for NetPEA’.
| Input Gene Set | Positive: NetPEA’ | Positive: ORA | Positive: EnrichNet | Positive: GSEA | Negative: NetPEA’ | Negative: ORA | Negative: EnrichNet | Negative: GSEA |
|---|---|---|---|---|---|---|---|---|
| Lung cancer [ | 7 | 14 | 16 | 9 | 0 | 0 | 0 | 2 |
| Lung cancer [ | 4 | 15 | 15 | 4 | 4 | 0 | 1 | 7 |
| Diabetes (down) | 6 | 19 | 16 | 6 | 4 | 0 | 0 | 2 |
| Diabetes (up) | 8 | 15 | 13 | 5 | 1 | 1 | 4 | 2 |
| Leukemia (down) | 7 | 15 | 16 | 5 | 1 | 0 | 0 | 6 |
| Leukemia (up) | 7 | 18 | 15 | 9 | 1 | 0 | 0 | 2 |
| Gender (down) | 8 | 17 | 15 | 9 | 2 | 0 | 1 | 2 |
| Gender (up) | 9 | 18 | 15 | 10 | 2 | 0 | 0 | 1 |
| p53 (down) | 6 | 14 | 14 | 11 | 1 | 0 | 1 | 3 |
| p53 (up) | 6 | 13 | 15 | 8 | 0 | 0 | 2 | 2 |