Richa Batra, Nicolas Alcaraz, Kevin Gitzhofer, Josch Pauling, Henrik J Ditzel, Marc Hellmuth, Jan Baumbach, Markus List.
Abstract
De novo pathway enrichment is a powerful approach to discover previously uncharacterized molecular mechanisms in addition to already known pathways. To achieve this, condition-specific functional modules are extracted from large interaction networks. Here, we give an overview of the state of the art and present the first framework for assessing the performance of existing methods. We identified 19 tools and selected seven representative candidates for a comparative analysis with more than 12,000 runs, spanning different biological networks, molecular profiles, and parameters. Our results show that none of the methods consistently outperforms the others. To mitigate this issue for biomedical researchers, we provide guidelines for choosing the appropriate tool for a given dataset. Moreover, our framework is the first attempt at a quantitative evaluation of de novo methods, which will allow the bioinformatics community to objectively compare future tools against the state of the art.
Year: 2017 PMID: 28649433 PMCID: PMC5445589 DOI: 10.1038/s41540-017-0007-2
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
List of publicly available de novo network enrichment methods in alphabetical order (February 7, 2016)
| Tool | Method | Software | Reference |
|---|---|---|---|
| BioNet* | ASO | app | Ref. |
| ClustEx | Clust. | app | Ref. |
| cMonkey | Clust. | app | Ref. |
| COSINE* | SP | app | Ref. |
| GiGA* | SP | app | Ref. |
| GXNA* | SP | app | Ref. |
| HotNet | SP | app | Ref. |
| jActiveModules | ASO | C-PL | Ref. |
| KeyPathwayMiner* | MC | app, C-PL, WS | Ref. |
| DEGAS* | MC | app | Ref. |
| MEMCover | MC | app | Ref. |
| NetWalker | ASO | app | Ref. |
| NetworkTrail | ASO | WS | Ref. |
| PinnacleZ* | ASO | app, C-PL | Ref. |
| ReactomeViz-MCL | Clust. | C-PL | Ref. |
| RegMOD | SP | app | Ref. |
| ResponseNet | ASO | WS | Ref. |
| SubExtract | ASO | app | Ref. |
| TieDIE | SP | script (Python) | Ref. |
De novo methods capable of processing gene expression data are sub-divided into four categories: aggregate score optimization (ASO), score propagation (SP), module cover (MC), and clustering-based (Clust.); see the supplementary material for details. The availability of each tool as a stand-alone application (app), as a Cytoscape plugin (C-PL), or as a web service (WS) is also indicated.
* Included in our quantitative study (see text for details)
Fig. 1 A typical workflow for de novo pathway enrichment. The underlying hypothesis is that phenotype-specific genes (foreground, FG) are differentially expressed in many case samples compared to a control group (1, 2). By using statistical tests, one can determine which genes are affected by the phenotype (3) and overlay this information on an interaction network (4). De novo pathway enrichment tools aim to extract sub-networks enriched with phenotype-specific FG genes (5). Comparing several such methods is an open issue.
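Steps 3 to 5 of the workflow in Fig. 1 can be illustrated with a minimal sketch. It is not the algorithm of any tool listed above: the function names, the Welch t-statistic cutoff (`t_cutoff`), and the use of plain connected components as "modules" are assumptions chosen only to keep the example short and dependency-free.

```python
import math
import statistics
from collections import deque


def welch_t(cases, controls):
    """Welch's t statistic for two samples (statistic only, no p-value)."""
    m1, m2 = statistics.mean(cases), statistics.mean(controls)
    v1, v2 = statistics.variance(cases), statistics.variance(controls)
    se = math.sqrt(v1 / len(cases) + v2 / len(controls))
    return (m1 - m2) / se if se > 0 else 0.0


def candidate_modules(expr_cases, expr_controls, edges, t_cutoff=3.0):
    """Flag genes (step 3), overlay them on the network (step 4), and
    return connected sets of flagged genes (a crude stand-in for step 5)."""
    flagged = {
        g for g in expr_cases
        if abs(welch_t(expr_cases[g], expr_controls[g])) >= t_cutoff
    }
    # Overlay: keep only edges whose two endpoints are both flagged.
    adj = {g: set() for g in flagged}
    for u, v in edges:
        if u in flagged and v in flagged:
            adj[u].add(v)
            adj[v].add(u)
    # Extract connected components with a breadth-first search.
    modules, seen = [], set()
    for start in sorted(flagged):
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            comp.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        modules.append(comp)
    return sorted(modules, key=len, reverse=True)


# Toy data (hypothetical): genes A, B, D differ between groups, C does not.
expr_cases = {"A": [5.1, 4.9, 5.2, 5.0], "B": [4.8, 5.3, 5.1, 4.9],
              "C": [1.0, 1.2, 0.9, 1.1], "D": [5.0, 5.2, 4.9, 5.1]}
expr_controls = {"A": [1.0, 1.1, 0.9, 1.2], "B": [1.1, 0.9, 1.0, 1.2],
                 "C": [1.1, 0.9, 1.0, 1.2], "D": [0.9, 1.0, 1.1, 1.2]}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
modules = candidate_modules(expr_cases, expr_controls, edges)
```

In this toy network, A and B form one module while D is isolated because its only neighbor, C, is not flagged. Handling such gaps (for example, by tolerating a limited number of non-flagged nodes) is one of the places where real de novo methods differ from this sketch.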
Fig. 4 Illustration of the models used for the foreground (FG) and background (BG) expression distributions generated for case and control samples: VV in (a) and VM in (b).
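The idea behind the two simulation models in Fig. 4 can be sketched as follows. The exact distributions used in the study are not given in this record, so everything here is an assumption: Gaussian expression values, a standard-normal baseline for all controls and for BG genes, and a single hypothetical `signal` parameter that shifts the mean (VM) or scales the standard deviation (VV) of FG genes in the case samples.

```python
import random
import statistics


def simulate_expression(genes, fg, n_cases, n_controls,
                        model="VM", signal=2.0, seed=0):
    """Simulate case/control expression per gene.

    Assumed baseline: N(0, 1). For FG genes in case samples only,
    model "VM" sets the mean to `signal`, and model "VV" sets the
    standard deviation to `signal`.
    """
    if model not in ("VM", "VV"):
        raise ValueError("model must be 'VM' or 'VV'")
    rng = random.Random(seed)
    cases, controls = {}, {}
    for g in genes:
        controls[g] = [rng.gauss(0.0, 1.0) for _ in range(n_controls)]
        mu, sd = 0.0, 1.0
        if g in fg:
            if model == "VM":
                mu = signal
            else:
                sd = signal
        cases[g] = [rng.gauss(mu, sd) for _ in range(n_cases)]
    return cases, controls


genes = ["g1", "g2", "g3", "g4"]
fg = {"g1", "g2"}
vm_cases, vm_controls = simulate_expression(genes, fg, 200, 200, "VM", 5.0)
vv_cases, vv_controls = simulate_expression(genes, fg, 200, 200, "VV", 5.0)
```

Under these assumptions, a larger `signal` makes FG genes easier to separate from BG genes, which matches the intuition about signal strength given in the caption of Fig. 2.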
Fig. 2 Average performance for over 80 FG sets of 20 nodes each, generated using the AVD algorithm, with varying signal strength (a, b) and varying sparsity (c, d). Expression profiles were simulated with varying mean (VM) (a, c) and varying variation (VV) (b, d). The HPRD network was used as the input network, and performance was assessed using the F-measure. The error bars (a, b) and box plots (c, d) represent performance over several FG nodes and over a range of internal parameter settings for each tool. The higher the signal strength, the more the expression profiles of FG genes differ from those of BG genes, so pathway enrichment methods can be expected to identify them more easily. For details on internal parameters, signal strength, and sparsity values, please refer to Supplementary Tables 1, 3, and 5, respectively.
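For readers unfamiliar with the F-measure, the following sketch computes it for one extracted sub-network against a known FG set. It uses the standard definition (harmonic mean of precision and recall over node sets); whether the study applies exactly this node-level variant is an assumption, and the function name is illustrative.

```python
def f_measure(predicted, foreground):
    """Harmonic mean of precision and recall between two node sets."""
    predicted, foreground = set(predicted), set(foreground)
    tp = len(predicted & foreground)  # FG nodes that were recovered
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)   # fraction of reported nodes in FG
    recall = tp / len(foreground)     # fraction of FG nodes reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 3 of 4 reported nodes are among 6 FG nodes.
score = f_measure({"g1", "g2", "g3", "x"},
                  {"g1", "g2", "g3", "g4", "g5", "g6"})
```

Here precision is 0.75 and recall is 0.5, giving an F-measure of approximately 0.6. A method that reports very large sub-networks can reach high recall but is penalized through low precision.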
Fig. 3 Average performance for over 80 FG sets of 20 nodes each, generated using the AVD algorithm. Expression profiles were simulated with VM. The HPRD network was used as the input network. Performance was assessed with the F-measure for a range of internal parameter settings for each tool, i.e. the expected pathway size (M) for GXNA, the number of allowed exceptions (outliers) in a pathway (L) for KeyPathwayMiner (KPM), and the pathway false discovery rate (FDR) for BioNet.