Jiajie Peng, Kun Bai, Xuequn Shang, Guohua Wang, Hansheng Xue, Shuilin Jin, Liang Cheng, Yadong Wang, Jin Chen.
Abstract
BACKGROUND: Identifying the genes associated with human diseases is crucial for disease diagnosis and drug design. Computational approaches, especially network-based approaches, have recently been developed to identify disease-related genes effectively from existing biomedical networks. Meanwhile, advances in biotechnology enable researchers to produce multi-omics data, enriching our understanding of human diseases and revealing the complex relationships between genes and diseases. However, none of the existing computational approaches is able to integrate the huge amount of omics data into a weighted integrated network and use it to enhance disease-related gene discovery.
Keywords: Disease gene prediction; Integrated network; Laplacian normalization; Supervised random walk
Year: 2017 PMID: 28198675 PMCID: PMC5310285 DOI: 10.1186/s12864-016-3263-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 The framework of SLN-SRW. The framework estimates the edge weights of the integrated network automatically and predicts disease genes based on it. The second step is the essential part of the SLN-SRW algorithm.
Integrated databases and ontologies. The first, second, and third columns give the abbreviation of the data source, a brief description of the data source, and the relationship extracted from it, respectively. Eleven data sources are used to construct the integrated network. Specific types of nodes and edges are extracted from the various data sources and integrated into a network.
| Abbreviation | Data sources | Relationship |
|---|---|---|
| STRING | Search Tool for the Retrieval of Interacting Genes/Proteins | gene-gene |
| CTD-DG | The Comparative Toxicogenomics Database - Curated Disease-Gene Interactions | disease-gene |
| OMIM | Online Mendelian Inheritance in Man Disease Subtypes | disease-gene |
| ClinVar | Clinical Variants and phenotypes | disease/phenotype-gene |
| HGNC | HUGO Gene Nomenclature Committee Database | gene name mapping |
| MeSH | Medical Subject Headings | unified vocabulary |
| UMLS | Unified Medical Language System | unified vocabulary |
| SIDD | Semantically Integrated Disease-associated Database | disease name mapping |
| DO | Human Disease Ontology | DO term-gene/DO term-DO term |
| HPO | Human Phenotype Ontology | HPO term-gene/HPO term-HPO term |
| GO | Gene Ontology | GO term-gene/GO term-GO term |
Fig. 2 The workflow of constructing the integrated network from multiple data sources.
Fig. 3 The process of training the parameter w.
Fig. 4 The AUC score at each given restart probability for the three methods. The red, blue, and yellow lines represent the SLN-SRW, SRW, and RWR methods, respectively.
Fig. 5 ROC curves for the experimental results on the testing set, calculated with SLN-SRW (green), SRW (red), and RWR (blue).
Fig. 6 True disease-gene pair rates at different top-k levels.
Fig. 7 Boxplot of the error score for SLN-SRW and SRW.
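The captions above compare methods across restart probabilities, with random walk with restart (RWR) as a baseline. As a rough illustration only, the sketch below runs a plain RWR on a symmetrically normalized adjacency matrix, one common reading of "Laplacian normalization". It is not the authors' SLN-SRW implementation: it learns no edge weights, and the toy graph, function names, and default parameters are all assumptions made for this example.

```python
import math


def symmetric_normalize(adj):
    """Return D^{-1/2} A D^{-1/2} for a symmetric, non-negative adjacency matrix.

    `adj` is a list of lists; isolated nodes (degree 0) keep all-zero rows.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(norm_adj, seeds, restart=0.5,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * S p + restart * p0 until it stabilizes.

    `seeds` are the indices of the starting nodes (hypothetical here: a
    disease node). The returned values are ranking scores; with symmetric
    normalization they need not sum to 1.
    """
    n = len(norm_adj)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(norm_adj[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph (an assumption): node 0 is a disease, nodes 1-3 are genes.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_scores = random_walk_with_restart(symmetric_normalize(toy_adj), seeds=[0])
# Rank candidate genes (nodes 1-3) by their score, highest first.
toy_ranking = sorted(range(1, 4), key=lambda i: toy_scores[i], reverse=True)
```

In this toy graph, genes closer to the seed disease node receive higher scores, so the ranking follows graph distance. A larger restart probability keeps more of the score mass near the seed, which is the parameter varied along the x-axis in Fig. 4.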