Han-Jing Jiang, Yu-An Huang, Zhu-Hong You.
Abstract
Drug-disease associations are an important source of information at every stage of drug repositioning. Although the number of drug-disease associations identified by high-throughput technologies is increasing, experimental methods remain time-consuming and expensive. To supplement them, many computational methods have been developed for accurate in silico prediction of new drug-disease associations. In this work, we present a novel computational model that combines a sparse auto-encoder with a rotation forest (SAEROF) to predict drug-disease associations. Gaussian interaction profile kernel similarity, drug structure similarity and disease semantic similarity were extracted to explore the associations between drugs and diseases. On this basis, a rotation forest classifier based on a sparse auto-encoder is proposed to predict the associations between drugs and diseases. To evaluate the performance of the proposed model, we applied 10-fold cross-validation on two gold-standard datasets, Fdataset and Cdataset. The proposed model achieved AUCs (areas under the ROC curve) of 0.9092 on Fdataset and 0.9323 on Cdataset. For comparison, we evaluated SAEROF against a state-of-the-art support vector machine (SVM) classifier and several existing computational models. Three human diseases (Obesity, Stomach Neoplasms and Lung Neoplasms) were explored in case studies, and more than half of the top 20 predicted drugs were confirmed by the Comparative Toxicogenomics Database (CTD). The model is therefore a feasible and effective method for predicting drug-disease associations, and its performance improves on that of the existing methods it was compared with.
Year: 2020 PMID: 32188871 PMCID: PMC7080766 DOI: 10.1038/s41598-020-61616-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
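The abstract names Gaussian interaction profile (GIP) kernel similarity as one of the extracted features but gives no formula. The following is a minimal sketch, assuming the commonly used definition in which the similarity of two drugs is exp(-gamma * ||p_i - p_j||^2) over their binary association profiles, with gamma scaled by the mean squared profile norm. The function name `gip_kernel`, the default `gamma_prime=1.0` and the toy matrix are illustrative assumptions, not values from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix.

    Assumes at least one profile is non-zero, otherwise the bandwidth
    normalisation below would divide by zero.
    """
    n = len(profiles)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy association matrix: rows are drugs, columns are diseases.
associations = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]

drug_similarity = gip_kernel(associations)
# Disease similarity uses the same kernel on the transposed matrix.
disease_similarity = gip_kernel([list(col) for col in zip(*associations)])
```

In this toy example the first two drugs share a disease and so receive a higher similarity than the first and third drugs, which share none.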
Figure 1. Flowchart of the SAEROF model.
Statistics of the two benchmark datasets.
| Datasets | Drugs | Diseases | Associations |
|---|---|---|---|
| Cdataset | 663 | 409 | 2532 |
| Fdataset | 593 | 313 | 1933 |
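The counts in this table imply that known associations cover only a small fraction of all possible drug-disease pairs. A short sketch of that arithmetic, using the numbers from the table; the helper name `association_density` is an assumption introduced for illustration:

```python
# Dataset sizes copied from the table above.
datasets = {
    "Cdataset": {"drugs": 663, "diseases": 409, "associations": 2532},
    "Fdataset": {"drugs": 593, "diseases": 313, "associations": 1933},
}


def association_density(drugs, diseases, associations):
    """Fraction of all drug-disease pairs that are known associations."""
    return associations / (drugs * diseases)


density = {name: association_density(**row) for name, row in datasets.items()}

for name, value in density.items():
    print(f"{name}: {value:.2%} of drug-disease pairs are known associations")
```

Both datasets come out at roughly 1% density, so the unlabeled pairs far outnumber the known associations.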
Figure 2. Weighted drug-sharing network. Dotted lines represent drug-disease associations, and the diseases shared by a pair of drugs determine the weight of the link between them.
Figure 3. The structure of an auto-encoder.
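The weighting described in the Figure 2 caption can be illustrated with a small sketch. It assumes that the weight between two drugs is simply the number of diseases they share; the caption does not state the exact rule, and the function name `drug_sharing_weights` and the toy data are hypothetical.

```python
from itertools import combinations


def drug_sharing_weights(drug_to_diseases):
    """Map each drug pair to the number of diseases the two drugs share.

    Pairs with no shared disease are omitted (no edge in the network).
    """
    weights = {}
    for drug_a, drug_b in combinations(sorted(drug_to_diseases), 2):
        shared = drug_to_diseases[drug_a] & drug_to_diseases[drug_b]
        if shared:
            weights[(drug_a, drug_b)] = len(shared)
    return weights


# Hypothetical toy associations, not taken from Fdataset or Cdataset.
toy_associations = {
    "drug_A": {"disease_1", "disease_2"},
    "drug_B": {"disease_2", "disease_3"},
    "drug_C": {"disease_1", "disease_2", "disease_3"},
    "drug_D": {"disease_4"},
}

edge_weights = drug_sharing_weights(toy_associations)
```

Here `drug_D` shares no disease with any other drug, so it has no edges in the resulting network.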
Results of 10-fold cross-validation of SAEROF on Fdataset.
| Test set | Acc. (%) | Pre. (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
| 0 | 83.76 | 83.25 | 84.54 | 83.89 |
| 1 | 79.90 | 82.95 | 75.26 | 78.92 |
| 2 | 79.90 | 79.90 | 79.90 | 79.90 |
| 3 | 79.53 | 80.32 | 78.24 | 79.27 |
| 4 | 79.79 | 84.43 | 73.06 | 78.33 |
| 5 | 82.90 | 84.70 | 80.31 | 82.45 |
| 6 | 81.09 | 86.14 | 74.09 | 79.67 |
| 7 | 82.38 | 83.78 | 80.31 | 82.01 |
| 8 | 82.38 | 85.31 | 78.24 | 81.62 |
| 9 | 80.05 | 83.33 | 75.13 | 79.02 |
| Average |
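The per-fold values in the table above can be summarised as a mean and a spread per metric. This is a minimal sketch using only the Python standard library. The fold values are copied from the table, while the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the source does not say how spread was measured.

```python
from statistics import mean, stdev

columns = ("accuracy", "precision", "recall", "f1")

# SAEROF on Fdataset, one tuple per test fold (values in percent).
fdataset_folds = [
    (83.76, 83.25, 84.54, 83.89),
    (79.90, 82.95, 75.26, 78.92),
    (79.90, 79.90, 79.90, 79.90),
    (79.53, 80.32, 78.24, 79.27),
    (79.79, 84.43, 73.06, 78.33),
    (82.90, 84.70, 80.31, 82.45),
    (81.09, 86.14, 74.09, 79.67),
    (82.38, 83.78, 80.31, 82.01),
    (82.38, 85.31, 78.24, 81.62),
    (80.05, 83.33, 75.13, 79.02),
]

# Transpose the rows into one sequence per metric, then summarise each.
summary = {
    name: (mean(values), stdev(values))
    for name, values in zip(columns, zip(*fdataset_folds))
}

for name, (avg, spread) in summary.items():
    print(f"{name}: {avg:.2f} +/- {spread:.2f}")
```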
Results of 10-fold cross-validation of SAEROF on Cdataset.
| Test set | Acc. (%) | Pre. (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
| 0 | 86.42 | 87.45 | 85.04 | 86.23 |
| 1 | 84.45 | 84.58 | 84.25 | 84.42 |
| 2 | 82.81 | 85.78 | 78.66 | 82.06 |
| 3 | 84.78 | 84.38 | 85.38 | 84.87 |
| 4 | 82.61 | 86.67 | 77.08 | 81.59 |
| 5 | 83.40 | 84.49 | 81.82 | 83.13 |
| 6 | 81.23 | 85.27 | 75.49 | 80.08 |
| 7 | 82.81 | 86.73 | 77.47 | 81.84 |
| 8 | 81.23 | 85.27 | 75.49 | 80.08 |
| 9 | 84.98 | 87.66 | 81.42 | 84.43 |
| Average |
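The F1-score column is the harmonic mean of precision and recall, F1 = 2PR / (P + R), so the table above can be checked for internal consistency. A small sketch, with the fold values copied from the Cdataset table; the helper name `f1_from` is an assumption introduced for illustration:

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# SAEROF on Cdataset: (precision, recall, reported F1) per fold, in percent.
cdataset_folds = [
    (87.45, 85.04, 86.23),
    (84.58, 84.25, 84.42),
    (85.78, 78.66, 82.06),
    (84.38, 85.38, 84.87),
    (86.67, 77.08, 81.59),
    (84.49, 81.82, 83.13),
    (85.27, 75.49, 80.08),
    (86.73, 77.47, 81.84),
    (85.27, 75.49, 80.08),
    (87.66, 81.42, 84.43),
]

# Absolute gap between the recomputed and the reported F1 for each fold.
deviations = [abs(f1_from(p, r) - f1) for p, r, f1 in cdataset_folds]
max_deviation = max(deviations)
print(f"largest gap between recomputed and reported F1: {max_deviation:.3f}")
```

Small gaps are expected because precision and recall are themselves rounded to two decimals in the table.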
Figure 4. Comparison of ROC curves on Fdataset and Cdataset. (a) ROC curve of 10-fold cross-validation on Fdataset. (b) ROC curve of 10-fold cross-validation on Cdataset.
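The AUC summarising such ROC curves equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy predictions: 1 = known association, 0 = unlabeled pair.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable on full cross-validation folds.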
AUC results of the cross-validation experiments.
| Method | Fdataset | Cdataset |
|---|---|---|
| DrugNet | 0.778(0.001) | 0.804(0.001) |
| HGBI | 0.829(0.012) | 0.858(0.014) |
Results of 10-fold cross-validation on Fdataset with the SVM classifier.
| Test set | Acc. (%) | Pre. (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
| 0 | 75.52 | 71.06 | 86.08 | 77.86 |
| 1 | 74.23 | 70.61 | 82.99 | 76.30 |
| 2 | 76.55 | 75.12 | 79.38 | 77.19 |
| 3 | 70.73 | 69.42 | 74.09 | 71.68 |
| 4 | 74.35 | 70.26 | 84.46 | 76.71 |
| 5 | 71.50 | 67.97 | 81.35 | 74.06 |
| 6 | 72.28 | 70.48 | 76.68 | 73.45 |
| 7 | 75.39 | 71.49 | 84.46 | 77.43 |
| 8 | 74.35 | 72.38 | 78.76 | 75.43 |
| 9 | 75.65 | 72.40 | 82.90 | 77.29 |
Results of 10-fold cross-validation on Cdataset with the SVM classifier.
| Test set | Acc. (%) | Pre. (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
| 0 | 74.61 | 72.08 | 80.31 | 75.98 |
| 1 | 79.13 | 75.00 | 87.40 | 80.73 |
| 2 | 77.87 | 75.09 | 83.40 | 79.03 |
| 3 | 77.47 | 75.09 | 82.21 | 78.49 |
| 4 | 75.69 | 74.07 | 79.05 | 76.48 |
| 5 | 80.63 | 78.60 | 84.19 | 81.30 |
| 6 | 75.69 | 73.21 | 81.03 | 76.92 |
| 7 | 75.10 | 71.67 | 83.00 | 76.92 |
| 8 | 74.51 | 71.83 | 80.63 | 75.98 |
| 9 | 78.46 | 75.90 | 83.40 | 79.47 |
Figure 5. Comparison of ROC curves of the SVM classifier on Fdataset and Cdataset. (a) ROC curve of 10-fold cross-validation on Fdataset. (b) ROC curve of 10-fold cross-validation on Cdataset.
The top 20 drugs predicted to be associated with Obesity.
| Index | Drug Name | Evidence | Index | Drug Name | Evidence |
|---|---|---|---|---|---|
| 1 | Topiramate | Confirmed | 11 | Benzphetamine | Confirmed |
| 2 | Sibutramine | N.A. | 12 | Methotrexate | Confirmed |
| 3 | Phenylpropanolamine | Confirmed | 13 | Prednisone | Confirmed |
| 4 | Phentermine | Confirmed | 14 | Mitoxantrone | Confirmed |
| 5 | Phendimetrazine | N.A. | 15 | Scopolamine | Confirmed |
| 6 | Orlistat | Confirmed | 16 | Imipramine | Confirmed |
| 7 | Methamphetamine | Confirmed | 17 | Dexamethasone | Confirmed |
| 8 | Diethylpropion | Confirmed | 18 | Azathioprine | N.A. |
| 9 | Cimetidine | Confirmed | 19 | Diazepam | Confirmed |
| 10 | Bupropion | Confirmed | 20 | Clonazepam | Confirmed |
The top 20 drugs predicted to be associated with Stomach Neoplasms.
| Index | Drug Name | Evidence | Index | Drug Name | Evidence |
|---|---|---|---|---|---|
| 1 | Terazosin | Confirmed | 11 | Diethylpropion | N.A. |
| 2 | Tacrolimus | Confirmed | 12 | Beclomethasone | Confirmed |
| 3 | Spironolactone | Confirmed | 13 | Baclofen | Confirmed |
| 4 | Meloxicam | Confirmed | 14 | Prazosin | Confirmed |
| 5 | Hyoscyamine | N.A. | 15 | Metoclopramide | N.A. |
| 6 | Glatiramer acetate | N.A. | 16 | Methotrexate | Confirmed |
| 7 | Famotidine | Confirmed | 17 | Memantine | Confirmed |
| 8 | Escitalopram | Confirmed | 18 | Thalidomide | Confirmed |
| 9 | Carbamazepine | Confirmed | 19 | Ibuprofen | Confirmed |
| 10 | Phenobarbital | Confirmed | 20 | Gliclazide | Confirmed |
The top 20 drugs predicted to be associated with Lung Neoplasms.
| Index | Drug Name | Evidence | Index | Drug Name | Evidence |
|---|---|---|---|---|---|
| 1 | Pyridoxine | Confirmed | 11 | Gemcitabine | Confirmed |
| 2 | Etoposide | Confirmed | 12 | Alprostadil | Confirmed |
| 3 | Felbamate | Confirmed | 13 | Fluocinolone acetonide | Confirmed |
| 4 | Levocabastine | N.A. | 14 | Doxazosin | Confirmed |
| 5 | L-Alanine | Confirmed | 15 | Etidronic acid | Confirmed |
| 6 | Cetirizine | Confirmed | 16 | Medrysone | N.A. |
| 7 | Lamotrigine | Confirmed | 17 | Spermine | Confirmed |
| 8 | Auranofin | Confirmed | 18 | Donepezil | Confirmed |
| 9 | Alimemazine | N.A. | 19 | Decitabine | N.A. |
| 10 | Loratadine | Confirmed | 20 | Piroxicam | Confirmed |
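The three case-study tables can be tallied to see how many of the top 20 predictions per disease carry a "Confirmed" entry. A short sketch follows; the ranks marked "N.A." are copied from the tables above, and the variable names are assumptions introduced for illustration.

```python
TOP_N = 20

# Ranks whose evidence column reads "N.A." in each case-study table.
unconfirmed_ranks = {
    "Obesity": {2, 5, 18},
    "Stomach Neoplasms": {5, 6, 11, 15},
    "Lung Neoplasms": {4, 9, 16, 19},
}

# Every rank that is not marked "N.A." is listed as "Confirmed".
confirmed_counts = {
    disease: TOP_N - len(ranks) for disease, ranks in unconfirmed_ranks.items()
}

for disease, count in confirmed_counts.items():
    print(f"{disease}: {count}/{TOP_N} predicted drugs confirmed ({count / TOP_N:.0%})")
```

This gives 17 of 20 for Obesity and 16 of 20 for each of the two neoplasm case studies, consistent with the abstract's statement that more than half of the top 20 predictions were confirmed.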