Yunpeng Liu1, Daniel A Tennant2, Zexuan Zhu3, John K Heath4, Xin Yao1, Shan He5.
Abstract
A disease module is a group of molecular components that interact intensively in a disease-specific biological network. Since the connectivity and activity of disease modules may shed light on the molecular mechanisms of pathogenesis and disease progression, their identification has become one of the most important challenges in network medicine, an emerging paradigm for studying complex human disease. This paper proposes a novel algorithm, DiME (Disease Module Extraction), to identify putative disease modules from biological networks. We have developed novel heuristics to optimise Community Extraction, a module criterion originally proposed for social network analysis, to extract topological core modules from biological networks as putative disease modules. In addition, we have incorporated a statistical significance measure, B-score, to evaluate the quality of extracted modules. As an application to complex diseases, we have employed DiME to investigate the molecular mechanisms that underpin the progression of glioma, the most common type of brain tumour. We have built low-grade (grade II) and high-grade (GBM) glioma co-expression networks from three independent datasets and then applied DiME to extract potential disease modules from both networks for comparison. Examination of the interconnectivity of the identified modules has revealed changes in topology and module activity (expression) between low- and high-grade tumours, which are characteristic of the major shifts in the constitution and physiology of tumour cells during glioma progression. Our results suggest that transcription factors E2F4, AR and ETS1 are potential key regulators in tumour progression. Our DiME compiled software, R/C++ source code, sample data and a tutorial are available at http://www.cs.bham.ac.uk/~szh/DiME.
Year: 2014 PMID: 24523864 PMCID: PMC3921127 DOI: 10.1371/journal.pone.0086693
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. General workflow for the DiME framework.
Algorithm 1. DiME algorithm.
| Create … |
| Create empty real-valued vector … |
| Create … |
| Return solution with highest … |
| Delete current best solution (module) from network and update … |
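The surviving steps of Algorithm 1 (return the best-scoring solution, delete it from the network, then repeat) suggest a sequential extraction loop. The following is a minimal Python sketch of such a loop, not the authors' implementation. Two things are assumptions: `ce_score` uses the adjusted Community Extraction criterion as commonly stated, W(S) = |S||S^c| [O(S)/|S|^2 - B(S)/(|S||S^c|)], and the greedy single-node search in `extract_module` is a hypothetical stand-in for DiME's local moving, sampling and seeding heuristics, whose details are not recoverable here.

```python
def ce_score(adj, module):
    """Adjusted Community Extraction score (assumed form) of `module`.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    s = set(module)
    k, rest = len(s), len(adj) - len(s)
    if k == 0 or rest == 0:
        return float("-inf")
    # O(S): ordered node pairs inside S; B(S): edges leaving S.
    inside = sum(1 for u in s for v in adj[u] if v in s)
    boundary = sum(1 for u in s for v in adj[u] if v not in s)
    return k * rest * (inside / k**2 - boundary / (k * rest))


def extract_module(adj, seed):
    """Hypothetical greedy search: toggle one node at a time while the score rises."""
    current, best = {seed}, ce_score(adj, {seed})
    improved = True
    while improved:
        improved = False
        for node in sorted(adj):
            candidate = current ^ {node}  # add the node, or remove it if present
            score = ce_score(adj, candidate)
            if score > best:
                current, best, improved = candidate, score, True
    return current, best


def extract_all(adj, max_modules=10):
    """Repeatedly extract the best module, delete it, and update the network."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    modules = []
    while len(adj) > 2 and len(modules) < max_modules:
        module, score = max(
            (extract_module(adj, seed) for seed in sorted(adj)),
            key=lambda result: result[1],
        )
        if score <= 0:
            break
        modules.append((module, score))
        for u in module:
            del adj[u]
        for u in adj:
            adj[u] -= module
    return modules


# Toy network: a 4-clique {0,1,2,3} attached to a path 3-4-5-6-7.
toy = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
       4: {3, 5}, 5: {4, 6}, 6: {5, 7}, 7: {6}}
found = extract_all(toy)
```

On this toy network the clique is extracted first; trying every node as a seed is only practical for small graphs, which is presumably why the paper relies on dedicated sampling and seeding functions.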
Algorithm 2. Local moving function.
Algorithm 3. DiME sampling function.
Algorithm 4. DiME seeding function.
Characteristics of the benchmark networks.
| Network Name | Erdös | PGP | Cond-mat |
| No. of Nodes | | | |
| No. of Edges | | | |
scores of the first module of each benchmark network.
| Algorithm | Erdös | PGP | Cond-mat |
| DiME | | | |
| Original CE | | | - |
Results in bold font are statistically significant (Student's t-tests).
Computation time (seconds) for extracting the first module in each benchmark network.
| Algorithm | Erdös | PGP | Cond-mat |
| DiME | | | |
| Original CE | | | - |
Results in bold font are statistically significant (Student's t-tests).
Relative loss of genes under different B-score cutoffs.
| Dataset | B-score cutoff 0.05 | B-score cutoff 0.001 | B-score cutoff 1 |
| Rembrandt Data ( | | | |
| TCGA Data ( | | | |
| Rembrandt Data ( | | | |
| GEO Data ( | | | |
Figure 2. Correlation of scores with B-scores.
All modules with size larger than 2 and B-score are included. A few modules whose B-score is 0 (indicating scores exceeding the lower limit of detection in the B-score algorithm) were excluded. Fitted lines of versus are shown. The Pearson's correlation coefficients are 0.57 (grade II glioma, left panel) and 0.65 (GBM, right panel), with both p-values smaller than 0.0001 in Pearson's correlation tests.
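As a reminder of what the reported coefficients measure, here is a minimal Python sketch of Pearson's correlation between two paired lists, such as module scores and B-scores. The function name `pearson_r` is hypothetical, and whether either quantity was log-transformed before fitting is not recoverable from the caption, so no transformation is applied here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)
```

A value near 1 means the two quantities rise together almost linearly; the reported 0.57 and 0.65 indicate a moderate positive association.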
Figure 3. DiME is robust to edge noise in co-expression networks.
The plots show results for the grade II glioma networks (left panel) and GBM networks (right panel). The horizontal axes display the technique used, and the vertical axes show average conservation scores. Only modules with size larger than 5 are taken into consideration. Asterisks denote statistical significance in Student's t-tests when comparing means with MCODE modules: “***” - .
Figure 4. Visualisation of grade II glioma modules with B-score less than and their inter-module connectivity.
Nodes represent extracted modules, node size represents module size and node color represents (log-transformed) fold-change in average module gene expression level compared with normal patient samples (Red - increase in average expression, green - decrease in average expression, lavender - no change in average expression). Edge widths are proportional to connectivity (i.e., number of co-expression gene pairs) between module pairs.
Figure 5. Visualisation of GBM modules with B-score less than and their inter-module connectivity.
Nodes represent extracted modules, node size represents module size and node color represents (log-transformed) fold-change in average module gene expression level compared with normal patient samples (Red - increase in average expression, green - decrease in average expression, lavender - no change in average expression). Edge widths are proportional to connectivity (i.e., number of co-expression gene pairs) between module pairs.
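The two quantities encoded in Figures 4 and 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, edges are assumed to be undirected and listed once, expression values are assumed to be positive and on a linear scale, and base 2 is assumed for the log transform (the caption does not give the base).

```python
import math
from itertools import combinations


def inter_module_connectivity(edges, modules):
    """Count co-expression gene pairs (edges) linking each pair of modules.

    edges   -- iterable of (gene_u, gene_v) pairs, each undirected edge once
    modules -- dict mapping module name to a set of genes
    """
    gene_to_module = {g: name for name, genes in modules.items() for g in genes}
    counts = {pair: 0 for pair in combinations(sorted(modules), 2)}
    for u, v in edges:
        mu, mv = gene_to_module.get(u), gene_to_module.get(v)
        if mu is None or mv is None or mu == mv:
            continue  # skip genes outside any module and within-module edges
        counts[tuple(sorted((mu, mv)))] += 1
    return counts


def log2_fold_change(tumour_values, normal_values):
    """Log2 ratio of average module expression in tumour versus normal samples."""
    mean_tumour = sum(tumour_values) / len(tumour_values)
    mean_normal = sum(normal_values) / len(normal_values)
    return math.log2(mean_tumour / mean_normal)
```

A positive fold-change would correspond to a red node, a negative one to a green node, and the pair counts would set the edge widths.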
Figure 6. Comparison of module reproducibility among different algorithms.
Box plots show the average reproducibility (Jaccard index) for each technique used. Asterisks denote statistical significance in Student's t-tests when comparing means with MCODE modules: “*” - .
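The Jaccard index used as the reproducibility measure is the size of the intersection of two gene sets divided by the size of their union. The Python sketch below computes it, together with one plausible way of averaging it over modules. The best-match averaging in `average_reproducibility` is an assumption, since the caption does not state how modules from different datasets were paired.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_reproducibility(modules_a, modules_b):
    """Mean, over modules in A, of the best Jaccard match found in B (assumed scheme)."""
    if not modules_a or not modules_b:
        return 0.0
    best_matches = [max(jaccard(m, other) for other in modules_b) for m in modules_a]
    return sum(best_matches) / len(best_matches)
```

For example, modules {1, 2, 3} and {2, 3, 4} share two of four distinct genes, giving a Jaccard index of 0.5.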
Summary of functional annotation and location information of the conserved common modules.
| Module Number | Top 3 GO BP Terms | Chromosome Locations | Transcription Factors |
| 1 | immune response ( | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 19, 20, 21, 22, X | |
| 2 | synaptic transmission ( | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 19, 20, 22, X | |
| 3 | nervous system development ( | 1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 15, 16, 17, 19, X | |
| 4 | ribonucleoside triphosphate catabolic process ( | 3, 6, 7, 8, 12, 14, 17, X | |
| 5 | antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent ( | 6 | |
| 6 | M phase ( | 1, 4, 8, 10, 15, 17, 20 | |
| 7 | type I interferon-mediated signaling pathway ( | 1, 2, 12, 21 | |
Figure 7. Heat map showing the expression landscape of all genes in the 7 conserved common modules across grade II glioma and GBM samples.
Rows correspond to genes grouped by modules and columns correspond to samples grouped by tumour grade.