Abstract
BACKGROUND: Measuring similarity between complex diseases has significant implications for revealing disease pathogenesis and for advancing biomedicine. It is widely accepted that functional associations between disease-related genes, together with semantic associations, can be used to calculate disease similarity. A growing number of studies have demonstrated the profound involvement of non-coding RNA (ncRNA) in the regulation of genome organization and gene expression, so taking ncRNA into account can be useful in measuring disease similarity. However, existing methods ignore the regulatory functions of ncRNA in biological processes. In this study, we propose a novel deep-learning method to deduce disease similarity.
Keywords: Disease similarity; Gene functional network; Non-coding RNA; Semantic association
Year: 2022 PMID: 35255810 PMCID: PMC8902705 DOI: 10.1186/s12859-022-04613-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Density curves of three disease similarity methods
Top-5 similar diseases for 5 query diseases
| Query | Top-5 associated diseases | Score |
|---|---|---|
| Coronary artery disease | FBCP | 0.7908 |
| | SEMD | 0.7886 |
| | RT-syndrome^a | 0.7879 |
| | PMDS | 0.7796 |
| | MODY | 0.7521 |
| Genetic obesity | GSD | 0.8131 |
| | VSD | 0.7925 |
| | PC1/3 deficiency | 0.7166 |
| | LEP deficiency | 0.6885 |
| | SMS | 0.6517 |
| Hyperbilirubinemia | SLC anemia | 0.8598 |
| | Porphyria | 0.8396 |
| | DJS | 0.8376 |
| | Rotor syndrome | 0.7561 |
| | ILL | 0.7461 |
| Neuroblastoma | macrocolon | 0.8136 |
| | ADHD | 0.8054 |
| | HD | 0.78821 |
| | SCDO | 0.78649 |
| | PNPO deficiency | 0.74891 |
| Growth hormone deficiency | IGH deficiency | 0.65643 |
| | CPHD | 0.65444 |
| | Hypopituitarism | 0.62002 |
| | CRMO | 0.61325 |
| | CAGSSS | 0.57399 |

^a Rubinstein–Taybi syndrome
Similarity scores of 3 disease pairs measured by ImpAESim
| Group | Disease-pair | Score |
|---|---|---|
| Target | MCDI^a, MCDII^b | 0.67298 |
| | MCDI, MCDIII^c | 0.66728 |
| | MCDIII, MCDII | 0.78042 |
| Contrast1 | MCDI, Precocious puberty | 0.23109 |
| | MCDII, Precocious puberty | 0.30141 |
| | MCDIII, Precocious puberty | 0.33356 |
| Contrast2 | MCDI, Celiac disease | 0.38606 |
| | MCDI, Celiac disease | 0.3198 |
| | MCDI, Celiac disease | 0.36077 |

^a Mitochondrial complex I deficiency
^b Mitochondrial complex II deficiency
^c Mitochondrial complex III deficiency
Fig. 2 The pipeline of the ImpAESim algorithm. The framework has two main parts: multi-network embedding, which produces a compact low-dimensional feature vector describing the topological properties of each disease, and disease similarity calculation based on a distance measure. First, three disease-related information sources are integrated to construct three input networks (A), and random walk with restart (RWR) is run to learn the global topological properties of the networks. The RWR output is fed to a classic Auto-Encoder (B) to calculate the constraints and obtain the low-dimensional hidden-layer vectors. These vectors and constraints are then fed to ImpAE (C), which, after the hidden vectors are concatenated, yields the low-dimensional representation of disease features. Finally, the combined disease representations are used to measure disease similarity by calculating a cosine distance (D)
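To make the first and last stages of this pipeline concrete, the following is a minimal sketch of random walk with restart on a single toy network, followed by cosine similarity between the resulting node profiles. It is not the authors' implementation: the Auto-Encoder and ImpAE stages (B, C) are skipped entirely, and the function names, the restart probability of 0.5, the convergence tolerance, and the 3-node adjacency matrix are all assumptions chosen for illustration.

```python
import numpy as np


def random_walk_with_restart(adj, restart_prob=0.5, tol=1e-8, max_iter=1000):
    """Return one diffusion profile per node (row i = walk restarted at node i).

    restart_prob, tol and max_iter are illustrative defaults, not values
    taken from the paper.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    trans = adj / col_sums                 # column-stochastic transition matrix
    n = adj.shape[0]
    restart = np.eye(n)                    # column j: walker restarts at node j
    state = restart.copy()
    for _ in range(max_iter):
        nxt = (1.0 - restart_prob) * trans @ state + restart_prob * restart
        converged = np.abs(nxt - state).max() < tol
        state = nxt
        if converged:
            break
    return state.T


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of `features`."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                # leave all-zero rows as zeros
    unit = features / norms
    return unit @ unit.T


# Hypothetical 3-node path graph standing in for one disease network.
toy_adj = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
profiles = random_walk_with_restart(toy_adj)
similarity = cosine_similarity_matrix(profiles)
```

In the pipeline described in the caption, the RWR profiles of the three networks would first be compressed and combined by the auto-encoders, and the cosine measure would be applied to those learned representations rather than to the raw profiles used here.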
Fig. 3 Different disease-related information networks fed into the RWR process