Hao Chen, Zhitu Zhu, Yichun Zhu, Jian Wang, Yunqing Mei, Yunfeng Cheng.
Abstract
A disease is rarely the consequence of an abnormality in a single gene; rather, it reflects the interactions of various processes in a complex network. Annotated molecular networks offer new opportunities to understand diseases within a systems biology framework and provide an excellent substrate for network-based identification of biomarkers. Network biomarkers and dynamic network biomarkers (DNBs) represent new types of biomarkers, built on protein-protein or gene-gene interactions, that can be monitored and evaluated at different stages and time-points during the development of disease. Clinical bioinformatics, a new way to combine clinical measurements and signs with human tissue-generated bioinformatics, is crucial for translating biomarkers into clinical application, validating disease specificity, and understanding the role of biomarkers in clinical settings. This article comprehensively reviews recent advances in network biomarkers and DNBs. It also discusses how network biomarkers improve understanding of the molecular mechanisms of disease, the advantages and constraints of network biomarkers for clinical application, clinical bioinformatics as a bridge to the development of disease-specific, stage-specific, severity-specific and therapy-predictive biomarkers, and the potential of network biomarkers.
Keywords: diagnostics; disease; dynamic network biomarkers; network biomarkers; protein-protein interactions
Year: 2015 PMID: 25560835 PMCID: PMC4407592 DOI: 10.1111/jcmm.12447
Source DB: PubMed Journal: J Cell Mol Med ISSN: 1582-1838 Impact factor: 5.310
Fig 1. Network-based human disease model. In response to the dynamically varying intrinsic (genetic) and extrinsic (environmental) perturbations (A), protein interaction networks (B) that interactively link genome, epigenome, transcriptome, proteome and metabolome are of central importance to modulate cell behaviour. The interplay of these interconnected cellular signalling networks can converge towards disease states and ultimately can initiate and drive complex diseases (C).
Fig 2. The evolution of biomarker concepts. (A) Traditional molecular biomarkers, consisting of a single gene or protein or a small group of them, are static indicators of the disease state; (B) The recently developed network biomarkers, which integrate knowledge of protein annotations, interactions and signalling pathways, are static measurements of the disease state; (C) The newly developed dynamic network biomarkers provide dynamic measurements of the disease state within a systems biology framework.
Fig 3. Systematic approach to network biomarker discovery. The chart schematically illustrates the critical stages of network biomarker discovery: (A) The global expression profiles (genomics, proteomics, literature, etc.) of a disease phenotype are obtained as ‘seeds’ of the disease module; (B) These seeds are integrated into the constructed protein network (Y2H, AP-MS), literature-curated pathways, or computationally predicted networks that contain systematic pathobiological events of phenotypes; (C) Through quantitative systematic approaches, phenotype-associated subnetworks and/or pathways are then scored, ranked and identified; and (D) such disease modules or network biomarkers can distinguish a disease phenotype from a normal phenotype more accurately than traditional molecular biomarkers. Y2H: high-throughput yeast two-hybrid; AP-MS: affinity purification mass spectrometry.
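Stages (A) to (C) of Fig 3 can be illustrated with a minimal sketch. It assumes a greedy seed-expansion heuristic with an aggregate z-score, which is one plausible way to score and rank subnetworks and is not claimed to be the method of any study reviewed here. The toy network, the gene names and the z-scores are all hypothetical.

```python
import math

def aggregate_score(genes, z):
    """Aggregate per-gene z-scores of a candidate module: sum(z) / sqrt(k)."""
    genes = list(genes)
    return sum(z.get(g, 0.0) for g in genes) / math.sqrt(len(genes))

def grow_subnetwork(seed, network, z, max_size=5):
    """Greedily add the neighbouring gene that most improves the module score."""
    module = {seed}
    score = aggregate_score(module, z)
    while len(module) < max_size:
        # Candidates are network neighbours of the module not yet included.
        frontier = {n for g in module for n in network.get(g, ())} - module
        best_gene, best_score = None, score
        for cand in sorted(frontier):  # sorted so that ties resolve deterministically
            s = aggregate_score(module | {cand}, z)
            if s > best_score:
                best_gene, best_score = cand, s
        if best_gene is None:  # no neighbour improves the score, so stop
            break
        module.add(best_gene)
        score = best_score
    return module, score

# Hypothetical toy PPI network (undirected adjacency) and per-gene z-scores.
network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}
z = {"A": 2.0, "B": 1.8, "C": -0.5, "D": 2.2, "E": 0.1}

# Stage (C): grow one module per seed, then rank modules by score.
ranked = sorted(
    (grow_subnetwork(seed, network, z) for seed in ("A", "D")),
    key=lambda item: item[1],
    reverse=True,
)
```

In practice the z-scores would come from the expression profiles in stage (A) and the adjacency from one of the interaction resources in stage (B); stage (D), testing whether the module separates phenotypes, is outside this sketch.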
Information of human protein–protein interaction network databases
| Database | Method | Number of proteins | Number of interactions | Ref. | Website |
|---|---|---|---|---|---|
| CCSB | Y2H | 1549 | 2754 | ||
| MDC | Y2H | 1705 | 3186 | ||
| BIND | Literature | 6089 | 14955 | ||
| DIP | Literature | 3877 | 6103 | ||
| MINT | Literature | 8762 | 26830 | ||
| HomoMINT | Prediction & Literature | 8634 | 323595 | ||
| IntAct | Literature | 60932 | 197974 | ||
| BioGRID | Literature | 18208 | 220390 | ||
| HPRD | Literature | 30047 | 41327 | ||
| Reactome | Literature | 7085 | 6744 | ||
| UniHI | Database integration | 36023 | 374833 | ||
| HAPPI | Database integration | 70829 | 601757 |
The table displays the number of proteins and the number of interactions derived from each database. The Method column shows the general approach by which PPIs were compiled in each resource. In addition, the literature reference for the resource and the websites of the databases are given.
CCSB: Center for Cancer Systems Biology; MDC: Max Delbrück Center; BIND: Biomolecular Interaction Network Database; DIP: Database of Interacting Proteins; MINT: Molecular INTeraction database; HomoMINT: inferred human network of MINT; IntAct: the protein interaction database; BioGRID: Biological General Repository for Interaction Datasets; HPRD: Human Protein Reference Database; Reactome: a curated pathway database; UniHI: Unified Human Interactome; HAPPI: Human Annotated and Predicted Protein Interaction.
All species.
Examples of software packages for mapping phenotype-related subnetworks
| Name | Full name | Website | Description | Ref. |
|---|---|---|---|---|
| GRNInfer | Gene Regulatory Network Inference Tool | | A gene regulatory network inference tool from multiple microarray data sets | |
| MDCinfer | Inferring protein–protein interactions based on multi-domain Co-operation | | PPI prediction tool based on multiple domain co-operation analysis | |
| TRNInfer | Inferring transcriptional regulatory networks from high-throughput data | | Infers direct relationships between transcription factors and target genes | |
| Samo | Protein Structure Alignment tool based on Multiple Objective optimization | | A protein structure alignment tool based on multiple objective optimization | |
| MNAligner | Molecular Network Aligner | | Alignment of molecular networks by quadratic programming | |
| PTG | Parsimonious Tree-Grow method for haplotype inference | | Parsimonious tree-grow method for haplotype inference | |
| PRNA | Protein–RNA Binding-Site Prediction | | Prediction of protein–RNA binding sites by a random forest method with combined features | |
| NOA | Network Ontology Analysis | | Collection of gene ontology tools for analysing the functions of a gene network instead of a gene list | |
| DDN | Differential dependency network analysis | | Detects statistically significant topological changes in the transcriptional networks between two biological conditions | |
| WGCNA | Weighted correlation network analysis | | A comprehensive collection of R functions for performing various aspects of weighted correlation network analysis, which describes correlation patterns among genes across microarray samples | |
| SurvNet | N/A | | A bioinformatics web app for identifying network-based biomarkers that most correlate with patient survival data | |
| DiME | Disease Module Extraction | | A novel algorithm, based on the Community Extraction criterion, that extracts topological core modules from biological networks as putative disease modules | |
The table displays the abbreviated name and full name of the computational programs with respective description and website. In addition, the literature reference for the resource is given.
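The weighted correlation network idea in the WGCNA row can be shown in a few lines. The sketch below uses plain Python, not the WGCNA R package or its API. It builds an unsigned, soft-thresholded adjacency by raising the absolute Pearson correlation of each gene pair to a power `beta`. The expression values are hypothetical, and `beta = 6` is only an illustrative choice.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def weighted_adjacency(expr, beta=6):
    """Soft-thresholded adjacency: |cor(i, j)| ** beta for every gene pair."""
    genes = sorted(expr)
    adj = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adj[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adj

# Hypothetical expression values for three genes across five samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with g1
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to g1
}
adj = weighted_adjacency(expr)
```

Raising the correlation to a power keeps strong edges close to 1 and pushes weak ones towards 0 without imposing a hard cut-off, so the network stays weighted rather than binary.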
Examples of network biomarkers studies in humans
| Disease | Seeds | PPI data sources | Ref. | Key findings |
|---|---|---|---|---|
| Breast cancer | Microarray data of two cohorts of breast cancer patients from literature | Database integration of Y2H, prediction and literature curation, includes 11203 proteins and 57235 interactions | | The identified subnetwork biomarkers increased the reproducibility and accuracy in differentiating metastatic from non-metastatic breast tumours, compared to traditional molecular biomarkers |
| Breast cancer | Microarray data of two cohorts of breast cancer patients from literature | Database integration of HPRD database and IPA, includes 584 genes and 2280 interactions | | The identified network biomarkers are highly enriched in biological pathways associated with cancer progression and the prediction performance is much improved when tested across different data sets |
| Colorectal cancer | Two sets of proteomic targets of colorectal cancer obtained from tissue biopsies | HPRD database includes 9299 proteins and 35023 interactions | | Integration of complementary data sources can enhance the discovery of candidate subnetworks in cancer that are well-suited for mechanistic validation in disease |
| Colorectal cancer | Microarray data of two cohorts of colorectal cancer patients from literature | HPRD database includes 9299 proteins and 35023 interactions | | The identified subnetwork biomarkers outperformed other biomarkers in predicting metastasis of colorectal cancer and offered insights into the mechanisms of metastasis in cancer |
| Colorectal cancer | 67 proteins identified from tumour tissue of a cohort of colorectal cancer patients | MetaCore from GeneGo Inc. (version 4.6 build 12332) | | The identified protein subnetwork biomarkers can discriminate late stage cancer |
| Prostate cancer | A prostate gene data set from literature | Database integration of DIP and HPRD, includes 6509 proteins and 23157 interactions | | The proposed approach can discover condition-relevant functional modules efficiently |
| Gastric cancer | 272 differentially expressed genes in the metastatic gastric cancer | UniHI database | | The identified subnetwork biomarkers are promising diagnostic markers for liver metastasis of gastric cancer |
| Gastric cancer | Gene expression data from GEO database (ID: GSE27342) | HPRD database includes 9465 proteins and 37039 interactions | | Identified network biomarkers include 34 genes shown to be directly connected by the gastric cancer-related genes to all phases, and a functional transition from normal phenotypes to cancer phases was demonstrated |
| Lung cancer | Microarray data from GEO database (ID: GSE4115) | Database integration of BioGRID and HPRD | | Identified 40 proteins related to lung carcinogenesis. The network-based biomarker was effective in diagnosing smokers with signs of lung cancer |
| Cardiovascular disease | Mass spectrometry data from major adverse cardiac events patients | HPRD includes 18796 proteins and 37056 interactions | | The identified network biomarkers can classify the patients with major adverse cardiac events more accurately than traditional molecular biomarkers |
| Cardiovascular disease | 105 heart failure associated proteins from the literature | Database integration of HPRD, BioGRID and MINT | | The identified network biomarkers support accurate prediction of heart failure and provide novel clues to the underlying mechanisms |
| Cardiovascular disease | Known inflammation biomarkers from clinical practice and literature | Database integration of DIP, IntAct and MINT | | Identified a panel of gene biomarkers with high discriminatory capability that predicts clinical outcome after myocardial infarction |
| Cardiovascular disease | CHD microarray data from GEO database (ID: GSE26125 & GSE14790) | Database integration of HPRD, BIND, BioGRID, IntAct and MINT, includes 4761 proteins and 18084 interactions | | Identified 12 dysfunctional modules from the constructed CHD subnetwork, which provide clues to the molecular mechanisms of CHD |
| Cardiovascular disease | Dilated cardiomyopathy microarray data from GEO database (ID: GSE3586) | HPRD includes 9059 proteins and 34869 interactions | | Identified two functional modules from the interaction networks. The dynamics of these modules between normal and disease states suggested a potential molecular model of dilated cardiomyopathy |
| Acute myeloid leukaemia | Microarray data from GEO database (ID: GSE425) | Database integration of HPRD and OPHID, includes 9142 proteins and 41456 interactions | | Identified AML-causing genes, most of which were not detectable with gene expression analysis alone because of the minor changes in mRNA level |
| Asthma | Asthma-associated genes from OMIM database | HPRD | | The identified subnetworks were consistent with known asthma pathways. Novel asthma-associated genes were also identified |
| Glioma | Microarray data from GEO database (ID: GDS1815) | I2D database, includes 681404 interactions | | Network biomarkers related to glioma prognosis were identified. MYC expression is positively correlated with lifetime extension |
| Acute aortic dissection | 2737 genes differentially expressed between acute Stanford type A aortic dissection patients and controls | Curated human PPI network, includes 6437 proteins and 258954 interactions | | Eight PPI hotspots associated with aortic dissection were identified. In particular, JAK2 may play a key role in the occurrence of acute aortic dissection |
The table lists network biomarker studies in humans, with a description of the network approach used in each. In addition, the literature reference for each study is given.
GEO: Gene Expression Omnibus; OMIM: Online Mendelian Inheritance in Man database; IPA: Ingenuity Pathway Analysis; MetaCore: Data-mining and pathway analysis (http://thomsonreuters.com/metacore/); CHD: Congenital heart disease; OPHID: Online Predicted Human Interaction Database; I2D: Interologous Interaction Database; MYC: myelocytomatosis oncogene.
Fig 4. A proposed workflow of disease-specific biomarker identification by integration of bioinformatics and clinical informatics. Both the global expression profiles (B; genomics, proteomics, etc.) and the clinical data (D) of a disease phenotype at different stages (A) are obtained. Disease-associated functional networks (C) are measured by bioinformatics, while clinical informatics (E) is generated through a digital evaluation score system (DESS). By integrating bioinformatics and clinical informatics, the molecular-phenotype networks are measured using different methods to score, rank and identify candidate biomarkers (F). The identified disease-specific biomarkers are then validated to differentiate a disease phenotype from a normal phenotype (G) for clinical application to develop predictive, diagnostic and preventive methods for personalized medicine.
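Step (F) of Fig 4, scoring and ranking candidates against clinical information, might look like the sketch below. It assumes that each patient has a single numeric clinical score (a stand-in for a DESS value) and ranks genes by the absolute Spearman correlation between expression and that score. The caption does not name a specific statistic, so Spearman correlation is an assumption here, and all gene names and values are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical clinical scores for five patients (higher means more severe).
clinical_score = [2, 5, 9, 14, 20]

# Hypothetical expression of three candidate genes in the same patients.
expression = {
    "geneA": [0.1, 0.4, 0.5, 0.9, 1.3],  # rises with severity
    "geneB": [1.2, 1.0, 0.7, 0.4, 0.1],  # falls with severity
    "geneC": [0.5, 0.9, 0.2, 0.8, 0.4],  # no clear trend
}

# Rank candidates by strength of association, regardless of direction.
ranked = sorted(
    expression,
    key=lambda g: abs(spearman(expression[g], clinical_score)),
    reverse=True,
)
```

A rank-based statistic is used because clinical scores are often ordinal; validation against a normal phenotype, step (G), is not covered by this sketch.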