Hung Nguyen, Sangam Shrestha, Duc Tran, Adib Shafi, Sorin Draghici, Tin Nguyen.
Abstract
A recent focus of computational biology has been to integrate the complementary information available in molecular profiles and in multiple network databases in order to identify connected regions that show significant changes under different conditions. This makes it possible to capture the dynamic, condition-specific mechanisms of the underlying phenomena and disease stages. Here we review 22 such integrative approaches for active module identification published over the last decade. This article focuses only on tools that are currently available for use and are well maintained. We compare these methods with respect to their primary features, integrative abilities, network structures, mathematical models, and implementations. We also describe real-world scenarios in which these methods have been successfully applied, and we highlight outstanding challenges in the field that remain to be addressed. The main objective of this review is to help potential users and researchers choose the method best suited to their data and analysis purpose.
Keywords: PPI network; active module; active subnetwork; data integration; network analysis; subnetwork identification
Year: 2019 PMID: 30891064 PMCID: PMC6411791 DOI: 10.3389/fgene.2019.00155
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Computational tools for active subnetwork identification.
| Tool | Web | Pkg | Code | GUI | Cml | Reference | Cit. | Cit/year | License |
|---|---|---|---|---|---|---|---|---|---|
| DIAMOnD | ✗ | ✓ | Python | ✗ | ✓ | Ghiassian et al., | 74 | 24 | |
| GXNA | ✗ | ✓ | C++ | ✗ | ✓ | Nacu et al., | 141 | 12 | free |
| MATISSE | ✗ | ✓ | Java | ✓ | ✗ | Ulitsky and Shamir, | 313 | 28 | free |
| CEZANNE | ✗ | ✓ | Java | ✓ | ✗ | Ulitsky and Shamir, | 111 | 12 | free |
| PinnacleZ | ✗ | ✓ | Java | ✓ | ✗ | Chuang et al., | 1,414 | 128 | free |
| RME Module Detection | ✗ | ✓ | Ruby | ✗ | ✓ | Miller et al., | 82 | 11 | MIT |
| BMRF-Net | ✗ | ✓ | C++, Java | ✓ | ✗ | Shi et al., | 4 | 1 | free |
| COSINE | ✗ | ✓ | R | ✗ | ✓ | Ma et al., | 64 | 9 | GPL-3 |
| GLADIATOR | ✗ | ✓ | Python | ✗ | ✓ | Silberberg et al., | 3 | 3 | free |
| jActiveModules | ✗ | ✓ | Java | ✓ | ✗ | Ideker et al., | 1,115 | 69 | GPL |
| MOEA | ✗ | ✓ | MATLAB | ✗ | ✓ | Chen et al., | 1 | 1 | free |
| BioNet & Heinz | ✗ | ✓ | R, Python | ✗ | ✓ | Dittrich et al., | 175 | 17 | MIT |
| | | | | | | Beisser et al., | 413 | 51 | |
| HotNet & HotNet2 | ✗ | ✓ | Python, MATLAB | ✗ | ✓ | Vandin et al., | 237 | 33 | |
| | | | | | | Leiserson et al., | 313 | 104 | |
| RegMod | ✗ | ✓ | MATLAB | ✗ | ✓ | Qiu et al., | 73 | 9 | free |
| ResponseNet | ✓ | ✗ | Python | ✓ | ✗ | Lan et al., | 26 | | free |
| | | | | | | Basha et al., | 57 | 8 | |
| | | | | | | | 15 | 3 | |
| TimeXNet | ✓ | ✓ | Java | ✓ | ✓ | Patil and Nakai, | 6 | 1 | |
| EnrichNet | ✓ | ✗ | R, PHP | ✓ | ✗ | Glaab et al., | 171 | 28 | |
| Walktrap-GM | ✗ | ✓ | R | ✗ | ✓ | Petrochilos et al., | 20 | 4 | GPL |
| MEMo | ✗ | ✓ | Java, Python | ✗ | ✓ | Ciriello et al., | 377 | 62 | LGPL |
| ModuleDiscoverer | ✗ | ✓ | R | ✗ | ✓ | Vlaic et al., | 3 | 3 | GPL |
| ClustEx | ✗ | ✓ | C++ | ✗ | ✓ | Gu et al., | 54 | 6 | free |
| SAMBA | ✗ | ✓ | Java | ✓ | ✓ | Tanay et al., | 431 | 30 | free |
Availability provides links to software. Web and Pkg indicate whether a web interface and a standalone package are available, respectively. Code lists the programming languages used to implement the software. GUI and Cml indicate whether the tool has a graphical user interface and a command-line interface, respectively. Reference provides the corresponding publication. Cit. shows the total number of citations, while Cit/year shows the number of citations per year. License gives the software license: GPL is an abbreviation for the GNU General Public License; LGPL stands for GNU Lesser General Public License; MIT stands for Massachusetts Institute of Technology; free means free for academic and non-commercial use. Note that PinnacleZ is not compatible with Cytoscape 3, but it can still be downloaded from the web archive.
Active module identification approaches along with their corresponding input, network databases and species.
| Tool | Experimental Input | MC | Network Database | Species |
|---|---|---|---|---|
| DIAMOnD | Gene expression | ✗ | TRANSFAC, IntAct, MINT, BioGRID, HPRD, CORUM, PhosphositePlus, curated network (Vinayagam et al., | |
| GXNA | Gene expression | ✗ | EntrezGene, KEGG | |
| MATISSE & CEZANNE | Gene expression | ✗ | SGD, BioGRID, BIND, HPRD, GO, MIPS, KEGG | |
| PinnacleZ | Gene expression | ✓ | GO, Cell Circuits | |
| RME Module Detection | Somatic mutation (SNP, CNV) | ✗ | – | |
| BMRF-Net | Gene expression | ✗ | HPRD | |
| COSINE | Gene expression | ✓ | HPRD | Homo sapiens |
| GLADIATOR | Multiple lists of proteins, where each list is associated with a disease | ✓ | Curated Network (Menche et al., | |
| jActiveModules | Gene expression | ✓ | BIND, TRANSFAC, GAL | |
| MOEA | Gene expression | ✗ | BioGRID, KEGG, GO | |
| BioNet & Heinz | Gene expression & survival information | ✗ | HPRD | |
| HotNet & HotNet2 | Mutation frequency of genes | ✗ | KEGG, HPRD | |
| RegMod | Gene expression | ✗ | HPRD, MSigDB, GO | |
| ResponseNet | Weighted list of DE genes and proteins | ✗ | Curated Network (Yeger-Lotem et al., | |
| TimeXNet | Three lists of DE genes and log fold changes at initial, intermediate, and late stages | ✗ | HitPredict, InnateDB, TRANSFAC, KEGG | |
| EnrichNet | A list of genes or proteins | ✗ | STRING, KEGG, BioCarta, WikiPathways, Reactome, InterPro, NCI-PID | |
| Walktrap-GM | Gene expression | ✗ | HPRD, KEGG | |
| MEMo | Gene expression, somatic mutation, copy number variation | ✗ | Reactome, MSKCC Cancer Cell Map, NCI-PID, HPRD, GO, PANTHER, INOH | |
| ModuleDiscoverer | Gene expression | ✗ | STRING | |
| ClustEx | DE genes and fold change | ✗ | HPRD | |
| SAMBA | Gene expression, transcription factor (TF) binding | ✗ | MIPS | |
Experimental Input indicates various input requirements that need to be supplied for the method to function. MC indicates whether the tool can handle multiple conditions (i.e., multiple diseases instead of the typical disease vs. control). Network Database provides network interaction information for various kinds of methods. Species gives information about species whose data could be used by the computational tools.
Figure 1. Overall workflow of active subnetwork identification. (A) Schematic representation of computational approaches that integrate molecular profiles with known interactions accumulated in knowledge bases. Most methods start by scoring the nodes and calculating node similarity, which reflect the expression change (e.g., between disease and control) and the correlation between genes, respectively. They then adjust the node scores and edge weights by taking into consideration the topological order and the interactions between genes and proteins. The next step is to construct the subnetworks using the edge weights and node scores. Typically, each method develops its own subnetwork extension strategy in order to optimize a specific subnetwork scoring function based on node scores and edge weights. After the subnetworks are constructed, each method performs hypothesis testing to assess the statistical significance of the identified modules. Some methods also repeatedly reconstruct the subnetwork after the statistical tests to find a better solution. (B) An example network and the identified active subnetwork. The subnetwork is often a well-connected component of the global network.
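The generic workflow in panel (A) can be illustrated with a minimal Python sketch. It is not the algorithm of any tool reviewed here: the aggregate score (sum of node z-scores divided by the square root of the module size), the greedy seed-and-extend search, the permutation test, and the toy graph and scores are all assumptions chosen for illustration, and edge weights are ignored for brevity.

```python
import math
import random


def module_score(nodes, z):
    """Aggregate node z-scores of a module: sum(z_i) / sqrt(k)."""
    return sum(z[n] for n in nodes) / math.sqrt(len(nodes))


def greedy_module(graph, z, seed):
    """Grow a connected module from `seed`, repeatedly adding the neighbor
    that gives the highest aggregate score, until no neighbor improves it."""
    module = {seed}
    best = module_score(module, z)
    while True:
        frontier = {nb for n in module for nb in graph[n]} - module
        candidates = [(module_score(module | {nb}, z), nb) for nb in sorted(frontier)]
        if not candidates:
            break
        score, node = max(candidates)
        if score <= best:
            break
        module.add(node)
        best = score
    return module, best


def best_module(graph, z):
    """Run the greedy search from every node and keep the top-scoring module."""
    results = [greedy_module(graph, z, seed) for seed in sorted(graph)]
    return max(results, key=lambda r: r[1])


def permutation_p_value(graph, z, observed, n_perm=200, rng=None):
    """Empirical p-value: shuffle the scores across nodes, rerun the search,
    and count how often the permuted best score reaches the observed one."""
    rng = rng or random.Random(0)
    nodes = list(z)
    values = [z[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        _, score = best_module(graph, dict(zip(nodes, values)))
        if score >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy network (adjacency sets) and node z-scores.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
z = {"A": 2.5, "B": 2.0, "C": 1.8, "D": -0.5, "E": 0.1, "F": -1.2}

module, score = best_module(graph, z)
p_value = permutation_p_value(graph, z, score)
print(sorted(module), round(score, 3), p_value)
```

On a network this small the permutation p-value is not informative; the point is only to show how scoring, extension, and significance testing fit together.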
Figure 2. Workflows of active module identification approaches. The figure highlights the key characteristics of each method and the key differences between them. From left to right are the techniques applied in each approach: (i) node scoring, (ii) edge scoring, (iii) the algorithm used to construct the subnetworks, and (iv) the statistical test for assessing the significance of the identified active subnetworks. We note that GLADIATOR scores neither nodes nor edges but uses the Jaccard index between input gene sets (of different diseases) as the objective for its simulated annealing algorithm.
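For readers unfamiliar with the Jaccard index mentioned in the caption, the following short Python sketch computes it for two gene sets. It shows only the similarity measure, not GLADIATOR's objective function or its simulated annealing; the gene lists are hypothetical, and returning 0.0 for two empty sets is a convention assumed here.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


# Hypothetical gene sets for two diseases.
disease_1 = {"TP53", "BRCA1", "EGFR"}
disease_2 = {"TP53", "EGFR", "KRAS"}
similarity = jaccard(disease_1, disease_2)  # 2 shared genes out of 4 distinct genes
```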