Guohua Huang, Yin Lu, Changhong Lu, Mingyue Zheng, Yu-Dong Cai.
Abstract
Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, or as small-scale or large-scale according to the diversity of their datasets. Here, a classifier was constructed to predict the indications of a drug from a large drug indication dataset, based on the assumption that interacting/associated drugs, or drugs with similar structures, are more likely to target the same diseases. To examine the classifier, it was applied to a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation repeated five times, yielding five 1st order prediction accuracies that were all approximately 51.48%. In an independent test on a dataset of 32 other drugs for which drug repositioning has been confirmed, the model yielded a 1st order prediction accuracy of 50.00%. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool for associating novel molecules with new indications, or existing drugs with alternative indications.
Year: 2015 PMID: 25821813 PMCID: PMC4363546 DOI: 10.1155/2015/584546
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
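The assumption stated in the abstract (related or structurally similar drugs tend to share indications) can be illustrated with a small ranking sketch. This is not the authors' implementation: the function names, the toy scores, and the rule of scoring each indication by its best-scoring training drug are all assumptions made for the example, and scores are assumed to lie in [0, 1].

```python
from collections import defaultdict

def rank_indications(query, train_labels, score):
    """Rank candidate indications for `query`, best first.

    train_labels: dict mapping drug -> set of known indications.
    score: function (drug_a, drug_b) -> float in [0, 1]; higher means
           more related (interaction confidence or structural similarity).
    Assumption: an indication is scored by the highest-scoring training
    drug that carries it.
    """
    best = defaultdict(float)
    for drug, labels in train_labels.items():
        s = score(query, drug)
        for label in labels:
            best[label] = max(best[label], s)
    # Descending score; alphabetical order breaks ties deterministically.
    return sorted(best, key=lambda label: (-best[label], label))

def first_order_accuracy(predictions, truth):
    """Share of drugs whose top-ranked indication is a true indication."""
    hits = sum(1 for d, ranked in predictions.items()
               if ranked and ranked[0] in truth[d])
    return hits / len(predictions)

# Hypothetical toy data: three training drugs and one query drug "Q".
train = {
    "A": {"antineoplastic"},
    "B": {"antihypertensive"},
    "C": {"antineoplastic", "antiviral"},
}
toy_scores = {("Q", "A"): 0.9, ("Q", "B"): 0.4, ("Q", "C"): 0.7}

def toy_score(a, b):
    return toy_scores.get((a, b), 0.0)

ranked = rank_indications("Q", train, toy_score)
print(ranked)  # ['antineoplastic', 'antiviral', 'antihypertensive']
print(first_order_accuracy({"Q": ranked}, {"Q": {"antineoplastic"}}))  # 1.0
```

The 1st order prediction here is simply the first element of the ranked list; the 2nd order prediction is the second element, and so on.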
Figure 1: A plot of the number of drugs in DS2 versus the number of indications.
Detailed information of samples in DSte.
| Name | ID | Original indication | Reported indication |
|---|---|---|---|
| Statins | CID000446156 | Myocardial infarction | Prostate cancer, leukemia |
| Metformin | CID000004091 | Diabetes mellitus | Breast cancer, adenocarcinoma, prostate, colorectal cancer |
| Rapamycin | CID005284616 | Immunosuppressant | Colorectal cancer, lymphoma, leukemia |
| Methotrexate | CID000126941 | Acute leukemia | Osteosarcoma, breast cancer, |
| Zoledronic acid | CID000068740 | Antibone resorption | Multiple myeloma, prostate cancer, breast cancer |
| Wortmannin | CID000312145 | Antifungal | Leukemia |
| Thiocolchicoside | CID000072067 | Muscle relaxant | Leukemia, multiple myeloma |
| Noscapine | CID000275196 | Antitussive, antimalarial, analgesic | Multiple cancer types |
| Galantamine | CID000009651 | Polio, paralysis, anaesthesia | Alzheimer's disease |
| Ropinirole | CID000005095 | Hypertension | Parkinson's disease, idiopathic |
| Tofisopam | CID000005502 | Anxiety-related conditions | Irritable bowel syndrome |
| Finasteride | CID000057363 | Benign prostatic hyperplasia | Hair loss |
| Mifepristone | CID000055245 | Pregnancy termination | Psychotic major depression |
| Minoxidil | CID000004201 | Hypertension | Hair loss |
| Paclitaxel | CID000036314 | Cancer | Restenosis |
| Phentolamine | CID000005775 | Hypertension | Impaired night vision |
| Sildenafil | CID000005212 | Angina | Male erectile dysfunction |
| Tadalafil | CID000110635 | Cardiovascular disease, inflammation | Male erectile dysfunction |
| Topiramate | CID005284627 | Epilepsy | Obesity |
| Zidovudine | CID000035370 | Cancer | HIV/AIDS |
| Allopurinol | CID000002094 | Tumor lysis syndrome | Gout |
| Amphotericin | CID005280965 | Fungal infections | Leishmaniasis |
| Colchicine | CID000006167 | Gout | Recurrent pericarditis |
| Retinoic acid | CID000444795 | Acne | Acute prophylaxis |
| Bimatoprost | CID005311027 | Glaucoma | Promoting eyelash growth |
| Ceftriaxone | CID005479530 | Bacterial infections | Amyotrophic lateral sclerosis |
| Colesevelam | CID000160051 | Hyperlipidemia | Type 2 diabetes mellitus |
| Disulfiram | CID000003117 | Alcoholism | Melanoma |
| Naproxen | CID000156391 | Inflammation, pain | Anti-Alzheimer's disease |
| Minocycline | CID054675783 | Acne | Ovarian cancer, glioma |
| Dapoxetine | CID000071353 | Analgesia, depression | Premature ejaculation |
| Bromocriptine | CID000031101 | Parkinson's disease | Diabetes mellitus |
Best performance of the method based on chemical similarities for different types of fingerprint and values of k.
| Type of fingerprint | Highest 1st order prediction accuracy (%) | k |
|---|---|---|
| ECFP_2 | 48.70 | 3 |
| ECFP_4 | 49.39 | 2 |
| ECFP_6 | 49.11 | 5 |
| FCFP_2 | 42.87 | 2,3 |
| FCFP_4 | 48.07 | 3 |
| FCFP_6 | 48.99 | 3 |
| FP2 | 43.91 | 3 |
| MACCS | 43.39 | 2,3 |
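For readers unfamiliar with fingerprint-based similarity, the sketch below shows one common way to compare two fingerprints. The extract does not say which similarity measure the authors used, so the Tanimoto coefficient is an assumption here, and the bit indices are invented; real ECFP, FCFP, FP2, or MACCS fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) coefficient of two fingerprints.

    Each fingerprint is given as an iterable of "on" bit indices.
    """
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)

# Hypothetical on-bit sets for two molecules.
fp_x = {1, 4, 9, 16}
fp_y = {1, 4, 10}

print(tanimoto(fp_x, fp_y))  # 2 shared bits / 5 distinct bits = 0.4
```

A score of 1.0 means the two fingerprints set exactly the same bits; 0.0 means they share none.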
The 1st order prediction accuracies with different k obtained by the method based on chemical interactions on DS( evaluated by jackknife test.
| Value of k | The 1st order prediction accuracy |
|---|---|
| 1 | 47.77% |
| 2 | 55.92% |
| 3 | 57.59% |
| 4 | 58.26% |
| 5 | 58.48% |
| 6 | 58.37% |
| 7 | 58.15% |
| 8 | 58.04% |
| 9 | 58.04% |
| 10 | 58.04% |
| 11 | 57.81% |
| 12 | 57.81% |
| 13 | 57.70% |
| 14 | 57.70% |
| 15 | 57.70% |
| 895 | 57.59% |
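The jackknife test named in the caption above is a leave-one-out procedure: each drug is held out in turn and predicted from all the others. The sketch below is a generic illustration under assumed names. The parameter k is not defined in this extract, so it is left inside whatever predictor is passed in, and the majority-label predictor is only a placeholder baseline, not the authors' method.

```python
from collections import Counter

def jackknife_first_order_accuracy(labels, predict_top1):
    """Leave-one-out estimate of the 1st order prediction accuracy.

    labels: dict mapping drug -> set of true indications.
    predict_top1: function (query, training_dict) -> indication or None.
    """
    hits = 0
    for drug in labels:
        # Train on every drug except the one being predicted.
        training = {d: l for d, l in labels.items() if d != drug}
        if predict_top1(drug, training) in labels[drug]:
            hits += 1
    return hits / len(labels)

def majority_label(query, training):
    """Placeholder predictor: the most frequent indication in training."""
    counts = Counter(label for labels in training.values() for label in labels)
    if not counts:
        return None
    # Highest count first; alphabetical order breaks ties deterministically.
    return min(counts, key=lambda label: (-counts[label], label))

# Hypothetical toy dataset of four drugs.
toy = {"d1": {"x"}, "d2": {"x"}, "d3": {"y"}, "d4": {"x"}}
acc = jackknife_first_order_accuracy(toy, majority_label)
print(acc)  # 0.75: d1, d2, d4 are predicted correctly, d3 is not
```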
The first 20 prediction accuracies obtained by the method based on chemical interactions on DS( evaluated by 5-fold cross-validation for 5 times.
| Order | First time (%) | Second time (%) | Third time (%) | Fourth time (%) | Fifth time (%) | Mean (%) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|
| 1 | 56.37 | 55.95 | 57.31 | 57.47 | 57.92 | 57.00 | 0.82 |
| 2 | 21.98 | 24.01 | 22.03 | 22.17 | 22.35 | 22.51 | 0.85 |
| 3 | 8.91 | 7.25 | 8.90 | 6.90 | 6.84 | 7.76 | 1.06 |
| 4 | 5.98 | 5.32 | 4.22 | 5.77 | 5.25 | 5.31 | 0.68 |
| 5 | 3.16 | 4.19 | 4.11 | 4.41 | 4.56 | 4.09 | 0.55 |
| 6 | 2.59 | 2.49 | 2.40 | 2.04 | 1.94 | 2.29 | 0.29 |
| 7 | 1.47 | 2.38 | 2.51 | 2.49 | 2.51 | 2.27 | 0.45 |
| 8 | 1.69 | 1.47 | 1.26 | 1.36 | 1.60 | 1.47 | 0.18 |
| 9 | 2.37 | 1.13 | 1.48 | 1.36 | 1.25 | 1.52 | 0.49 |
| 10 | 0.68 | 1.02 | 1.48 | 1.02 | 0.91 | 1.02 | 0.29 |
| 11 | 1.24 | 1.25 | 0.80 | 1.24 | 0.91 | 1.09 | 0.22 |
| 12 | 1.01 | 1.02 | 1.37 | 1.13 | 1.37 | 1.18 | 0.18 |
| 13 | 1.35 | 1.25 | 1.03 | 1.24 | 1.14 | 1.20 | 0.12 |
| 14 | 0.90 | 0.45 | 0.57 | 0.68 | 0.57 | 0.63 | 0.17 |
| 15 | 0.56 | 0.57 | 0.91 | 0.79 | 0.68 | 0.70 | 0.15 |
| 16 | 0.68 | 0.79 | 0.46 | 0.23 | 0.57 | 0.54 | 0.22 |
| 17 | 1.13 | 0.79 | 0.68 | 1.24 | 0.46 | 0.86 | 0.32 |
| 18 | 0.90 | 0.79 | 0.23 | 1.13 | 0.57 | 0.72 | 0.34 |
| 19 | 1.13 | 0.57 | 0.91 | 1.02 | 0.68 | 0.86 | 0.23 |
| 20 | 0.56 | 1.36 | 1.26 | 0.68 | 1.14 | 1.00 | 0.36 |
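As a quick consistency check on the table above, the Mean and Standard deviation columns can be recomputed from the five runs. The snippet uses the order-1 row; the variable names are illustrative. The reported 0.82 matches the sample standard deviation (n - 1 denominator) rather than the population form.

```python
from statistics import mean, pstdev, stdev

# Order-1 accuracies (%) from the five cross-validation runs in the table.
runs_order1 = [56.37, 55.95, 57.31, 57.47, 57.92]

m = mean(runs_order1)
s_sample = stdev(runs_order1)       # n - 1 denominator
s_population = pstdev(runs_order1)  # n denominator

print(f"{m:.2f} {s_sample:.2f} {s_population:.2f}")  # 57.00 0.82 0.73
```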
The Recalls and Precisions of the first two predictions obtained by three methods on DS(, DS(, and DS2, respectively.
| Order of time | DS( Recall | DS( Precision | DS( Recall | DS( Precision | DS2 Recall | DS2 Precision |
|---|---|---|---|---|---|---|
| 1st | 61.55 | 39.18 | 48.95 | 28.79 | 56.06 | 34.65 |
| 2nd | 62.37 | 39.98 | 47.81 | 28.26 | 55.99 | 34.84 |
| 3rd | 62.42 | 39.67 | 47.24 | 27.76 | 55.69 | 34.39 |
| 4th | 62.45 | 39.82 | 49.68 | 29.32 | 56.86 | 35.22 |
| 5th | 62.68 | 40.14 | 49.39 | 29.09 | 56.80 | 35.25 |
| Mean | 62.29 | 39.76 | 48.62 | 28.65 | 56.28 | 34.87 |
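The extract does not define how recall and precision were computed for the first two predictions. One common micro-averaged definition, assumed here purely for illustration, counts the correct indications among each drug's first m predictions, then divides by the total number of true indications (recall) or by the total number of predictions made (precision). The function name and toy data are hypothetical.

```python
def recall_precision_at(predictions, truth, m):
    """Micro-averaged recall and precision over the first m predictions.

    predictions: dict mapping drug -> ranked list of predicted indications.
    truth: dict mapping drug -> non-empty set of true indications.
    """
    correct = predicted = actual = 0
    for drug, ranked in predictions.items():
        top = ranked[:m]
        correct += len(set(top) & truth[drug])
        predicted += len(top)
        actual += len(truth[drug])
    return correct / actual, correct / predicted

# Hypothetical toy data: two drugs, three candidate indications.
preds = {"d1": ["a", "b", "c"], "d2": ["b", "a", "c"]}
true = {"d1": {"a", "c"}, "d2": {"c"}}

recall, precision = recall_precision_at(preds, true, m=2)
print(recall, precision)  # 1 correct of 3 true, 1 correct of 4 predicted
```

Under this definition precision is capped whenever drugs have fewer than m true indications, which is one plausible reason for precision sitting well below recall in the table.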
The first 20 prediction accuracies obtained by the method based on chemical similarities on DS( evaluated by 5-fold cross-validation for 5 times.
| Order | First time (%) | Second time (%) | Third time (%) | Fourth time (%) | Fifth time (%) | Mean (%) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|
| 1 | 44.17 | 43.62 | 43.90 | 45.86 | 44.68 | 44.45 | 0.88 |
| 2 | 13.41 | 12.90 | 11.62 | 12.77 | 13.51 | 12.84 | 0.75 |
| 3 | 6.85 | 5.94 | 8.18 | 6.39 | 5.89 | 6.65 | 0.94 |
| 4 | 5.54 | 6.67 | 4.73 | 5.22 | 6.90 | 5.81 | 0.93 |
| 5 | 4.52 | 4.06 | 5.45 | 3.92 | 4.45 | 4.48 | 0.60 |
| 6 | 2.19 | 3.33 | 3.87 | 4.64 | 3.02 | 3.41 | 0.92 |
| 7 | 3.64 | 3.33 | 2.30 | 2.90 | 2.73 | 2.98 | 0.53 |
| 8 | 1.90 | 1.74 | 3.16 | 1.31 | 3.16 | 2.25 | 0.86 |
| 9 | 2.77 | 2.75 | 2.30 | 2.32 | 1.58 | 2.34 | 0.48 |
| 10 | 2.48 | 3.48 | 1.43 | 1.74 | 1.44 | 2.11 | 0.87 |
| 11 | 2.04 | 1.45 | 1.72 | 1.16 | 2.44 | 1.76 | 0.50 |
| 12 | 2.19 | 2.61 | 2.01 | 2.76 | 1.44 | 2.20 | 0.52 |
| 13 | 2.33 | 1.74 | 2.58 | 2.03 | 1.29 | 2.00 | 0.50 |
| 14 | 2.19 | 2.17 | 2.73 | 1.31 | 2.30 | 2.14 | 0.52 |
| 15 | 0.44 | 1.88 | 1.15 | 1.60 | 1.58 | 1.33 | 0.56 |
| 16 | 1.02 | 1.16 | 0.72 | 1.16 | 1.15 | 1.04 | 0.19 |
| 17 | 0.87 | 0.87 | 1.29 | 1.02 | 1.15 | 1.04 | 0.18 |
| 18 | 0.87 | 1.16 | 1.29 | 1.02 | 0.72 | 1.01 | 0.23 |
| 19 | 1.60 | 1.01 | 1.87 | 1.31 | 1.01 | 1.36 | 0.37 |
| 20 | 1.60 | 0.72 | 0.72 | 1.02 | 1.29 | 1.07 | 0.38 |
The first 20 prediction accuracies obtained by the integrated method on DS2 evaluated by 5-fold cross-validation for 5 times.
| Order | First time (%) | Second time (%) | Third time (%) | Fourth time (%) | Fifth time (%) | Mean (%) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|
| 1 | 51.05 | 50.54 | 51.37 | 52.38 | 52.07 | 51.48 | 0.75 |
| 2 | 18.25 | 19.14 | 17.42 | 18.05 | 18.44 | 18.26 | 0.62 |
| 3 | 8.01 | 6.68 | 8.58 | 6.68 | 6.42 | 7.27 | 0.96 |
| 4 | 5.79 | 5.91 | 4.45 | 5.53 | 5.98 | 5.53 | 0.63 |
| 5 | 3.75 | 4.13 | 4.70 | 4.20 | 4.51 | 4.26 | 0.37 |
| 6 | 2.42 | 2.86 | 3.05 | 3.18 | 2.42 | 2.78 | 0.36 |
| 7 | 2.42 | 2.80 | 2.42 | 2.67 | 2.61 | 2.58 | 0.17 |
| 8 | 1.78 | 1.59 | 2.10 | 1.34 | 2.29 | 1.82 | 0.38 |
| 9 | 2.54 | 1.84 | 1.84 | 1.78 | 1.40 | 1.88 | 0.41 |
| 10 | 1.46 | 2.10 | 1.46 | 1.34 | 1.14 | 1.50 | 0.36 |
| 11 | 1.59 | 1.34 | 1.21 | 1.21 | 1.59 | 1.39 | 0.19 |
| 12 | 1.53 | 1.72 | 1.65 | 1.84 | 1.40 | 1.63 | 0.17 |
| 13 | 1.78 | 1.46 | 1.72 | 1.59 | 1.21 | 1.55 | 0.23 |
| 14 | 1.46 | 1.21 | 1.53 | 0.95 | 1.34 | 1.30 | 0.23 |
| 15 | 0.51 | 1.14 | 1.02 | 1.14 | 1.08 | 0.98 | 0.27 |
| 16 | 0.83 | 0.95 | 0.57 | 0.64 | 0.83 | 0.76 | 0.16 |
| 17 | 1.02 | 0.83 | 0.95 | 1.14 | 0.76 | 0.94 | 0.15 |
| 18 | 0.89 | 0.95 | 0.70 | 1.08 | 0.64 | 0.85 | 0.18 |
| 19 | 1.34 | 0.76 | 1.34 | 1.14 | 0.83 | 1.08 | 0.27 |
| 20 | 1.02 | 1.08 | 1.02 | 0.83 | 1.21 | 1.03 | 0.14 |
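The evaluation protocol behind these tables, 5-fold cross-validation repeated five times, can be sketched as follows. The extract does not describe how the folds were drawn, so the seeded shuffle and the placeholder drug names are assumptions; only the dataset size of 1,573 comes from the abstract.

```python
import random

def five_fold_splits(items, seed):
    """Yield (train, test) lists for one random 5-fold partition."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into five folds of near-equal size.
    folds = [shuffled[i::5] for i in range(5)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# Placeholder identifiers; 1,573 is the dataset size given in the abstract.
drugs = [f"drug{i}" for i in range(1573)]

fold_sizes = []
for repeat in range(5):  # five independent repetitions, one seed each
    sizes = [len(test) for _, test in five_fold_splits(drugs, seed=repeat)]
    fold_sizes.append(sizes)
    print(repeat, sizes)  # each repeat: [315, 315, 315, 314, 314]
```

In each repetition every drug appears in exactly one test fold, so one accuracy per repetition can be computed over all 1,573 drugs, giving the five per-run columns shown in the tables.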
Figure 2: Two curves with recall on the X-axis and precision on the Y-axis. Recalls and precisions were obtained by the method based on chemical interactions with k = 5 and by the method based on chemical similarities with fingerprint ECFP_4 and k = 2.
Eight instances illustrating accurate prediction of new indications in the validation test dataset.
| Name | ID | 1st order prediction | 2nd order prediction | Original indication | New indication |
|---|---|---|---|---|---|
| Rapamycin | CID005284616 | Antineoplastic (a) | Antiinflammatory (c) | Immunosuppressant (acted as mTOR inhibitor) | Colorectal cancer, lymphoma, leukemia |
| Zoledronic acid | CID000068740 | Antineoplastic (a) | Antiinflammatory (c) | Antibone resorption (acted as osteoclast inhibitor) | Multiple myeloma, prostate cancer, breast cancer |
| Wortmannin | CID000312145 | Antidiabetic (b) | Antineoplastic (a) | Antifungal | Leukemia |
| Galantamine | CID000009651 | Anti-Alzheimer's disease (a) | Antihypertensive (c) | Polio (acted as acetylcholinesterase inhibitor) | Alzheimer's disease |
| Ropinirole | CID000005095 | Antipsychotic (c) | Antiparkinsonian (a) | Antihypertension (acted as dopamine-2 agonist) | Parkinson's disease |
| Zidovudine | CID000035370 | Antiviral (b) | Antineoplastic (a) | Anticancer | Anti-HIV |
| Allopurinol | CID000002094 | Uricosuric (a) | Antineoplastic (b) | Tumor lysis syndrome | Gout |
| Colesevelam | CID000160051 | Antihypolipidemic (b) | Antidiabetic (a) | Antihyperlipidemia | Type 2 diabetes mellitus |
a: correctly predicted in new indications;
b: correctly predicted in original indications;
c: incorrectly predicted in original indications.