Thomas Gaudelet, Noël Malod-Dognin, Jon Sánchez-Valle, Vera Pancaldi, Alfonso Valencia, Nataša Pržulj.
Abstract
Diseases involve complex modifications to the cellular machinery. The gene expression profile of the affected cells contains characteristic patterns linked to a disease. Hence, new biological knowledge about a disease can be extracted from these profiles, improving our ability to diagnose and assess disease risks. This knowledge can be used for drug re-purposing, or by physicians to evaluate a patient's condition and co-morbidity risk. Here, we consider differential gene expressions obtained by microarray technology for patients diagnosed with various diseases. Based on these data and cellular multi-scale organization, we aim to uncover disease-disease, disease-gene and disease-pathway associations. We propose a neural network whose structure is based on the multi-scale organization of proteins in a cell into biological pathways. We show that this model is able to correctly predict the diagnosis for the majority of patients. Through the analysis of the trained model, we predict disease-disease, disease-pathway, and disease-gene associations and validate the predictions by comparisons to known interactions and literature search, proposing putative explanations for the predictions.
Year: 2020 PMID: 32251458 PMCID: PMC7135208 DOI: 10.1371/journal.pone.0231059
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Example of neural network architecture.
For the first layer, the connections are defined by biological information, i.e. a unit representing a gene is connected to all the biological pathways that the gene is involved in. We do not add any prior knowledge on the last layer, thus it is fully connected.
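The connectivity described in this caption can be illustrated with a minimal sketch: a gene-to-pathway layer whose weights are masked by pathway membership, followed by a fully connected pathway-to-disease layer. The gene and pathway names, the tanh activation, the softmax output, and the random weight initialisation are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random

# Hypothetical toy annotation: gene -> pathways it is involved in.
GENE_TO_PATHWAYS = {
    "G1": ["P1"],
    "G2": ["P1", "P2"],
    "G3": ["P2"],
}
GENES = sorted(GENE_TO_PATHWAYS)
PATHWAYS = sorted({p for ps in GENE_TO_PATHWAYS.values() for p in ps})
N_DISEASES = 2  # hypothetical number of output classes


def build_mask(genes, pathways, membership):
    """mask[i][j] is 1 if gene i is annotated to pathway j, else 0."""
    return [[1 if p in membership[g] else 0 for p in pathways] for g in genes]


def init_weights(n_in, n_out, rng):
    """Small random weights; the initialisation scheme is an assumption."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w1, mask, w2):
    # Gene -> pathway layer: weights outside the mask are zeroed, so each
    # pathway unit only receives input from the genes annotated to it.
    hidden = []
    for j in range(len(mask[0])):
        s = sum(x[i] * w1[i][j] * mask[i][j] for i in range(len(x)))
        hidden.append(math.tanh(s))  # activation choice is an assumption
    # Pathway -> disease layer: fully connected, no prior knowledge.
    logits = [
        sum(hidden[j] * w2[j][k] for j in range(len(hidden)))
        for k in range(len(w2[0]))
    ]
    return hidden, softmax(logits)


rng = random.Random(0)
mask = build_mask(GENES, PATHWAYS, GENE_TO_PATHWAYS)
w1 = init_weights(len(GENES), len(PATHWAYS), rng)
w2 = init_weights(len(PATHWAYS), N_DISEASES, rng)
hidden, probs = forward([1.0, -0.5, 2.0], w1, mask, w2)
```

In this toy setup, changing the expression value of `G3` leaves the activation of pathway unit `P1` unchanged, because `G3` is not annotated to `P1`. During training, the same mask would also have to be applied to the weight updates so that masked connections stay at zero.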
Performance of the classifiers in terms of cross-entropy loss (CEL) and micro- and macro-averaged precision (Pre (micro) and Pre (macro), respectively).
Each score is the mean over 10-fold cross-validation, reported with its standard deviation. Bold scores highlight the best score for each metric.
| Algorithm | CEL | Pre (micro) | Pre (macro) |
|---|---|---|---|
| GPD | 1.09±0.06 | 0.80±0.01 | 0.71±0.02 |
| MLR | | | |
| RF | 1.56±0.24 | 0.80±0.01 | 0.70±0.03 |
| nB | 10.63±0.55 | 0.66±0.01 | 0.60±0.02 |
| SVM | 1.42±0.04 | 0.72±0.02 | 0.59±0.02 |
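The three metrics in the table above can be sketched for a single-label, multi-class classifier as follows. This is an illustrative implementation, not the authors' evaluation code: the function names, the toy labels, the use of the natural logarithm, and the convention of scoring a never-predicted class as 0 are assumptions.

```python
import math
from collections import Counter


def cross_entropy(probabilities, y_true):
    """Mean negative log-probability assigned to the true class.

    `probabilities` is a list of dicts mapping class label -> probability.
    The natural logarithm is used (an assumption).
    """
    return -sum(math.log(p[t]) for p, t in zip(probabilities, y_true)) / len(y_true)


def precision_scores(y_true, y_pred):
    """Return (micro, macro) averaged precision."""
    true_pos = Counter()
    predicted = Counter()
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        if t == p:
            true_pos[p] += 1
    classes = sorted(set(y_true) | set(y_pred))
    # Micro: pool true positives and predictions over all classes.
    micro = sum(true_pos.values()) / sum(predicted.values())
    # Macro: unweighted mean of per-class precision. A class that is never
    # predicted has undefined precision; it is counted as 0 here (assumption).
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    macro = sum(per_class) / len(classes)
    return micro, macro


# Hypothetical toy labels, not data from the paper.
y_true = ["asthma", "asthma", "asthma", "asthma", "ibs", "ibs"]
y_pred = ["asthma", "asthma", "asthma", "ibs", "asthma", "ibs"]
micro, macro = precision_scores(y_true, y_pred)
print(micro, macro)
```

Because the macro average weights every class equally, it falls below the micro average when rare classes are predicted less precisely than common ones, which is consistent with the macro scores in the table being lower than the micro scores.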
Performance in terms of area under the ROC curve (AUROC) and area under the precision–recall curve (AUPRE) for the prediction of disease–gene associations, for each method.
| Method | AUROC | AUPRE |
|---|---|---|
| GPD | | |
| MLR | 0.52 | 6.3 |
| Katz | 0.55 | 7.3 |
| FDE | 0.50 | 5.3 |
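As a rough guide to how the two scores in the table above are defined, the sketch below computes AUROC through its pairwise-ranking interpretation and approximates the area under the precision–recall curve by average precision. Average precision is only one common estimator of that area and may differ from the one used by the authors; the function names and toy scores are assumptions.

```python
def auroc(scores, labels):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Assumes at least one positive and one negative.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive.

    Assumes at least one positive; tied scores keep their input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical prediction scores for five candidate disease-gene pairs;
# label 1 marks a known association.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
roc_area = auroc(scores, labels)
pr_area = average_precision(scores, labels)
print(roc_area, pr_area)
```

An AUROC of 0.5 corresponds to random ranking, whereas the baseline for the precision–recall area equals the fraction of positives, so the two columns are not directly comparable with each other.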
Top 10 disease–pathway predictions derived from GPD.
| Disease | Pathway (Reactome ID, R-HSA-) | Literature support |
|---|---|---|
| Autistic disorder | 5653890 | |
| Irritable bowel syndrome | 532668 | PMID:20338921 |
| Irritable bowel syndrome | 391906 | PMID:16835707 |
| Type 2 diabetes mellitus | 499943 | doi: |
| Asthma | 391906 | PMID:8603274 |
| Schizophrenia | 71288 | PMID:22465051 |
| Major depressive disorder | 8934903 | PMID:27063986 |
| Type 2 diabetes mellitus | 8939245 | PMID:19667185 |
| Schizophrenia | 5683371 | |
| Sjogren’s syndrome | 389661 | |
Top 10 disease–gene associations predicted by GPD.
| Disease | Gene | Literature support |
|---|---|---|
| Asthma | UBB | |
| Schizophrenia | RHOA | PMID:16402129 |
| Alzheimer’s disease | FGF23 | PMID:26674092 |
| Autistic disorder | FGF20 | PMID:19204725 |
| Prostate cancer | RPS27A | PMID:15647830 |
| Amyotrophic lateral sclerosis | PSMD13 | |
| Amyotrophic lateral sclerosis | CASP3 | PMID:11715057 |
| Chronic obstructive pulmonary disease | SKP1 | PMID:23713962 |
| Autistic disorder | PSMB2 | |
| Irritable bowel syndrome | PSMA1 | PMID:28717845 |
Performance in terms of area under the ROC curve (AUROC) and area under the precision–recall curve (AUPRE) for the prediction of disease–pathway associations, for each method.
| Method | AUROC | AUPRE |
|---|---|---|
| GPD | | |
| AFDE | 0.47 | 6 |
Fig 2. Precision–recall (top) and ROC (bottom) curves of the test against the disease co-morbidity network built by Hidalgo et al. [12].
Top 10 disease–disease links predicted using our approach based on the trained GPD.
| Disease 1 | Disease 2 |
|---|---|
| Atrial fibrillation | Vitiligo |
| Atrial fibrillation | Peripheral vascular disease |
| Alcoholic hepatitis | Osteosarcoma |
| Rhabdoid cancer | Medulloblastoma |
| Cornelia de Lange syndrome | Vitiligo |
| Peripheral vascular disease | Vitiligo |
| Atrial fibrillation | Osteosarcoma |
| Leishmaniasis | Alcoholic hepatitis |
| Sotos syndrome | Vitiligo |
| Follicular lymphoma | Osteosarcoma |