Salma Jamal, Waseem Ali, Priya Nagpal, Sonam Grover, Abhinav Grover.
Abstract
BACKGROUND: Predicting adverse drug reactions (ADRs) has become very important owing to the huge global health burden they impose and the drug failures they cause. This indicates a need to predict probable ADRs at the preclinical stage, which can reduce drug failures and cut the time and cost of development, thus providing more efficient and safer therapeutic options for patients. Though several approaches have been put forward for in silico ADR prediction, there is still room for improvement.
Keywords: Adverse drug reactions; Feature selection; Machine learning; Random forest; Sequential minimal optimization
Year: 2019 PMID: 31118067 PMCID: PMC6530172 DOI: 10.1186/s12967-019-1918-z
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Provides the types of features used to generate the models and the number of features obtained after RemoveUseless and mRMR selection approaches
| Type of feature | Source | Initial number | RemoveUseless | mRMR | Total final features |
|---|---|---|---|---|---|
| Biological | |||||
| Targets | DrugBank | 1264 | 1207 | 50 | 150 |
| Transporters | DrugBank | 86 | 84 | 50 | |
| Enzymes | DrugBank | 182 | 182 | 50 | |
| Chemical | |||||
| Substructures | PubChem | 881 | 629 | 50 | 50 |
| Phenotypic | |||||
| Other ADRs | SIDER | 5497 | 5292 | 50 | 100 |
| Therapeutic indications | SIDER | 1840 | 1600 | 50 | |
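The two filtering steps in the table can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: `remove_constant` is a rough stand-in for a RemoveUseless-style filter (it only drops features that never vary), `mrmr` is a simple greedy, difference-form minimum-redundancy maximum-relevance selection on discrete features, and the feature names and toy data are hypothetical. The table reports 50 features retained per feature type; the toy call keeps 2.

```python
import math
from collections import Counter


def remove_constant(columns):
    """Keep only features that take more than one value (assumed stand-in for RemoveUseless)."""
    return {name: col for name, col in columns.items() if len(set(col)) > 1}


def mutual_information(x, y):
    """Mutual information in bits between two discrete sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr(columns, labels, k):
    """Greedy mRMR: repeatedly pick the feature with the best relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected = []
    remaining = sorted(columns)  # sorted so that ties are broken deterministically

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 drugs, 1 = ADR present / feature present.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "target_A": [1, 1, 1, 0, 0, 0, 0, 0],  # informative
    "target_B": [1, 1, 1, 0, 0, 0, 0, 0],  # duplicate of target_A, hence redundant
    "target_C": [1, 1, 0, 1, 0, 0, 0, 1],  # weaker but complementary
    "target_D": [0, 0, 0, 0, 0, 0, 0, 0],  # never varies, dropped by the first filter
}

kept = remove_constant(columns)
selected = mrmr(kept, labels, k=2)
print(selected)  # ['target_A', 'target_C']
```

The duplicate `target_B` is skipped in favour of `target_C` because its redundancy with the already selected `target_A` outweighs its relevance, which is the behaviour mRMR is designed to produce.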
Provides a comparison of the number of drugs and types of features used in the present study and for other ADR prediction studies
| Dataset | Drugs | Side effects | Substructures | Targets | Transporters | Enzymes | Pathways | Indications |
|---|---|---|---|---|---|---|---|---|
| Pauwels et al. 2011 | 888 | 1385 | 881 | NA | NA | NA | NA | NA |
| Wang et al. 2014 | 799 | 1385 | 881 | 775 | NA | NA | NA | 719 |
| Zhang et al. 2015 | 569 | 4192 | NA | NA | NA | NA | NA | NA |
| Kuang et al. 2014 | 404 | 461 | NA | NA | NA | NA | NA | NA |
| Huang et al. 2011 | 578 | 1447 | NA | 3880 | NA | NA | NA | NA |
| Zhang et al. 2015 | 1080 | 2260 | 881 | 1046 | 96 | 160 | 268 | 2537 |
| Liu et al. 2013 | 832 | 1384 | 881 | 786 | 72 | 111 | 173 | 869 |
| Present study | 965 | 5497 | 881 | 1264 | 86 | 182 | NA | 1840 |
Fig. 1 Depicts the outline of the computational methodology followed in the present study
Lists the 36 cardiovascular ADRs along with their SIDER ids for which the RF and SMO models were generated
| Cardiovascular side effect | SIDER id |
|---|---|
| Arrhythmia | C0003811 |
| Atrioventricular block | C0004245 |
| Atrioventricular block complete | C0151517 |
| Atrioventricular block first degree | C0085614 |
| Atrioventricular block second degree | C0264906 |
| Block heart | C0018794 |
| Bradycardia | C0428977 |
| Cardiac arrest | C0018790 |
| Cardiac death | C0376297 |
| Cardiac disorder | C0018799 |
| Cardiac failure acute | C0264714 |
| Cardiac failure congestive | C0018802 |
| Cardiac failure | C0018801 |
| Cardiac fibrillation | C0232197 |
| Cardiac flutter | C0016385 |
| Cardiac murmur | C0018808 |
| Cardiac output decreased | C0007166 |
| Cardiac tamponade | C0007177 |
| Cardiac valve disease | C0018824 |
| Cardiogenic shock | C0036980 |
| Cardiomegaly | C0018800 |
| Cardiomyopathy | C0878544 |
| Cardiopulmonary failure | C1444565 |
| Cardio-respiratory arrest | C0600228 |
| Cardiotoxicity | C0876994 |
| Cardiovascular disorder | C0007222 |
| Conduction disorder | C0264886 |
| Cor pulmonale | C0034072 |
| Decompensation cardiac | C1961112 |
| Heart malformation | C0018798 |
| Heart rate irregular | C0237314 |
| Heartburn | C0018834 |
| Left ventricular failure | C0023212 |
| Myocardial ischaemia | C0151744 |
| Shock | C0036974 |
| Tachycardia | C0039231 |
Provides the overall tenfold cross-validation performance of the models generated using training dataset with biological, chemical, and phenotypic features and the combination of the two and three levels of features
| Type of feature | RF ACC (%) | RF Precision | RF Recall | RF F-score | RF AUC | RF PRC | SMO ACC (%) | SMO Precision | SMO Recall | SMO F-score | SMO AUC | SMO PRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Biological | 78.11 | 0.77 | 0.99 | 0.87 | 0.62 | 0.81 | 76.73 | 0.76 | 0.99 | 0.86 | 0.58 | 0.73 |
| Chemical | 83.34 | 0.84 | 0.97 | 0.89 | 0.78 | 0.89 | 76.32 | 0.78 | 0.94 | 0.85 | 0.68 | 0.74 |
| Phenotypic | 77.06 | 0.75 | 1.00 | 0.86 | 0.54 | 0.75 | 77.91 | 0.77 | 1.00 | 0.87 | 0.54 | 0.74 |
| Biological + chemical | 84.80 | 0.86 | 0.95 | 0.90 | 0.81 | 0.91 | 78.87 | 0.80 | 0.95 | 0.86 | 0.72 | 0.75 |
| Biological + phenotypic | 80.87 | 0.80 | 0.99 | 0.88 | 0.66 | 0.82 | 79.13 | 0.78 | 0.99 | 0.87 | 0.63 | 0.75 |
| Chemical + phenotypic | 83.69 | 0.84 | 0.97 | 0.89 | 0.79 | 0.90 | 77.79 | 0.79 | 0.96 | 0.86 | 0.69 | 0.75 |
| Biological + chemical + phenotypic | 85.25 | 0.85 | 0.96 | 0.90 | 0.82 | 0.91 | 89.07 | 0.94 | 0.95 | 0.94 | 0.47 | 0.75 |
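For readers who want the column definitions spelled out, the sketch below computes accuracy, precision, recall and F-score from confusion-matrix counts, and AUC as the probability that a positive example is scored above a negative one. The function names and the toy counts are hypothetical and are not taken from the study. The toy case, in which every drug is predicted positive, yields a recall of 1.00 together with an AUC of 0.50; whether that mechanism accounts for any of the reported rows cannot be confirmed from the tables alone.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy (in percent), precision, recall and F-score from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / denom if denom else 0.0,
    }


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy set: 9 positives, 1 negative, everything predicted positive.
toy_labels = [1] * 9 + [0]
toy_scores = [0.5] * 10  # identical scores, so the ranking carries no information

metrics = classification_metrics(tp=9, fp=1, tn=0, fn=0)
auc = roc_auc(toy_scores, toy_labels)
print(metrics, auc)
```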
Provides the overall performance measures for the models generated using biological, chemical, and phenotypic features and the combination of the two and three levels of features on non-redundant testing dataset using over sampling of minority class
| Type of feature | RF ACC (%) | RF Precision | RF Recall | RF F-score | RF AUC | RF PRC | SMO ACC (%) | SMO Precision | SMO Recall | SMO F-score | SMO AUC | SMO PRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Biological | 93.56 | 0.94 | 0.99 | 0.96 | 0.52 | 0.93 | 91.24 | 0.93 | 0.99 | 0.96 | 0.51 | 0.93 |
| Chemical | 91.41 | 0.94 | 0.97 | 0.95 | 0.52 | 0.94 | 88.75 | 0.94 | 0.95 | 0.93 | 0.48 | 0.93 |
| Phenotypic | 93.83 | 0.95 | 1.00 | 0.97 | 0.50 | 0.93 | 93.83 | 0.94 | 1.00 | 0.97 | 0.50 | 0.93 |
| Biological + chemical | 90.24 | 0.94 | 0.96 | 0.94 | 0.53 | 0.94 | 89.06 | 0.94 | 0.95 | 0.94 | 0.53 | 0.93 |
| Biological + phenotypic | 93.66 | 0.94 | 1.00 | 0.97 | 0.52 | 0.94 | 93.30 | 0.94 | 0.99 | 0.96 | 0.52 | 0.93 |
| Chemical + phenotypic | 91.49 | 0.94 | 0.97 | 0.95 | 0.50 | 0.94 | 90.54 | 0.94 | 0.96 | 0.95 | 0.48 | 0.93 |
| Biological + chemical + phenotypic | 90.92 | 0.94 | 0.96 | 0.95 | 0.54 | 0.95 | 89.07 | 0.94 | 0.95 | 0.94 | 0.47 | 0.93 |
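Over-sampling of the minority class can be illustrated as follows. The source does not state which over-sampling method was used, so this sketch assumes the simplest variant, random duplication of minority-class rows until the classes are balanced. The function name, seed and toy rows are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of the smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # Draw with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([y] * target)
    return out_rows, out_labels


# Hypothetical toy set: 6 drugs with the ADR, 2 without.
over_rows = [[1, 0], [1, 1], [0, 1], [1, 0], [1, 1], [0, 0], [0, 0], [0, 1]]
over_labels = [1, 1, 1, 1, 1, 1, 0, 0]

balanced_rows, balanced_labels = oversample_minority(over_rows, over_labels)
print(Counter(balanced_labels))
```

Because duplicated rows carry no new information, resampling of this kind is normally applied to the training data only, so that the test set keeps its original class distribution.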
Provides the overall performance measures for the models generated using biological, chemical, and phenotypic features and the combination of the two and three levels of features on non-redundant testing dataset using under sampling of majority class
| Type of feature | RF ACC (%) | RF Precision | RF Recall | RF F-score | RF AUC | RF PRC | SMO ACC (%) | SMO Precision | SMO Recall | SMO F-score | SMO AUC | SMO PRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Biological | 93.90 | 0.88 | 0.91 | 0.89 | 0.46 | 0.94 | 92.15 | 0.88 | 0.91 | 0.89 | 0.45 | 0.94 |
| Chemical | 93.32 | 0.85 | 0.88 | 0.87 | 0.48 | 0.94 | 93.69 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
| Phenotypic | 93.65 | 0.85 | 0.88 | 0.87 | 0.44 | 0.93 | 93.85 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
| Biological + chemical | 93.07 | 0.85 | 0.88 | 0.86 | 0.46 | 0.94 | 93.72 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
| Biological + phenotypic | 93.51 | 0.85 | 0.88 | 0.87 | 0.43 | 0.93 | 93.82 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
| Chemical + phenotypic | 93.43 | 0.85 | 0.88 | 0.87 | 0.45 | 0.94 | 93.83 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
| Biological + chemical + phenotypic | 93.06 | 0.85 | 0.88 | 0.86 | 0.47 | 0.94 | 93.61 | 0.85 | 0.89 | 0.87 | 0.44 | 0.93 |
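Under-sampling of the majority class works in the opposite direction: rows are discarded rather than duplicated. As before, the exact procedure is not given in the source; the sketch assumes random selection without replacement down to the size of the smallest class, with a hypothetical function name and toy data.

```python
import random
from collections import Counter


def undersample_majority(rows, labels, seed=0):
    """Randomly keep only as many rows per class as the smallest class contains."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for y, members in by_class.items():
        # Sample without replacement so no row is used twice.
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([y] * target)
    return out_rows, out_labels


# Hypothetical toy set: 6 drugs with the ADR, 2 without.
under_rows = [[1, 0], [1, 1], [0, 1], [1, 0], [1, 1], [0, 0], [0, 0], [0, 1]]
under_labels = [1, 1, 1, 1, 1, 1, 0, 0]

reduced_rows, reduced_labels = undersample_majority(under_rows, under_labels)
print(Counter(reduced_labels))
```

The trade-off relative to over-sampling is that balance is achieved by throwing data away, which matters when the minority class is very small.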
Provides the overall performance measures for the models generated using eighty biological, chemical, and phenotypic features and the combination of the two and three levels of features on non-redundant testing dataset
| Type of feature | RF ACC (%) | RF Precision | RF Recall | RF F-score | RF AUC | RF PRC | SMO ACC (%) | SMO Precision | SMO Recall | SMO F-score | SMO AUC | SMO PRC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Biological | 93.76 | 0.94 | 0.99 | 0.96 | 0.48 | 0.94 | 93.59 | 0.93 | 0.99 | 0.96 | 0.50 | 0.93 |
| Chemical | 93.03 | 0.93 | 0.98 | 0.95 | 0.54 | 0.95 | 93.98 | 0.93 | 1.00 | 0.96 | 0.50 | 0.93 |
| Phenotypic | 94.85 | 0.95 | 0.98 | 0.97 | 0.69 | 0.96 | 94.23 | 0.94 | 0.99 | 0.96 | 0.52 | 0.94 |
| Biological + chemical | 93.81 | 0.94 | 0.98 | 0.96 | 0.55 | 0.95 | 93.94 | 0.93 | 0.99 | 0.96 | 0.50 | 0.93 |
| Biological + phenotypic | 94.31 | 0.94 | 0.99 | 0.96 | 0.72 | 0.96 | 94.57 | 0.94 | 0.99 | 0.97 | 0.50 | 0.94 |
| Chemical + phenotypic | 94.97 | 0.95 | 0.99 | 0.97 | 0.72 | 0.97 | 94.16 | 0.94 | 0.99 | 0.95 | 0.52 | 0.94 |
| Biological + chemical + phenotypic | 94.28 | 0.94 | 0.99 | 0.96 | 0.68 | 0.96 | 92.34 | 0.95 | 0.99 | 0.97 | 0.51 | 0.95 |
Provides the comparison of performances of models generated in the present study with other approaches for drug ADR prediction
| Dataset | Feature | Algorithm | AUC | Sensitivity/recall | Precision | Accuracy |
|---|---|---|---|---|---|---|
| Pauwels et al. 2011 | Substructures | RF | 0.62 | 0.97 | 0.93 | 91.30 |
| Pauwels et al. 2011 | Substructures | SMO | 0.50 | 1.00 | 0.92 | 92.42 |
| Present study | Biological | Random forest | 0.52 | 0.99 | 0.94 | 91.24 |
| Present study | Chemical | Random forest | 0.52 | 0.97 | 0.94 | 88.75 |
| Present study | Phenotypic | Random forest | 0.50 | 1.00 | 0.95 | 93.83 |
| Present study | Biological + chemical | Random forest | 0.53 | 0.96 | 0.94 | 89.06 |
| Present study | Biological + phenotypic | Random forest | 0.52 | 1.00 | 0.94 | 93.30 |
| Present study | Chemical + phenotypic | Random forest | 0.50 | 0.97 | 0.94 | 90.54 |
| Present study | Biological + chemical + phenotypic | Random forest | 0.54 | 0.96 | 0.94 | 89.07 |
| Present study | Biological | Support vector machine | 0.51 | 0.99 | 0.93 | 93.56 |
| Present study | Chemical | Support vector machine | 0.48 | 0.95 | 0.94 | 91.41 |
| Present study | Phenotypic | Support vector machine | 0.50 | 1.00 | 0.94 | 93.83 |
| Present study | Biological + chemical | Support vector machine | 0.53 | 0.95 | 0.94 | 90.24 |
| Present study | Biological + phenotypic | Support vector machine | 0.52 | 0.99 | 0.94 | 93.66 |
| Present study | Chemical + phenotypic | Support vector machine | 0.48 | 0.96 | 0.94 | 91.49 |
| Present study | Biological + chemical + phenotypic | Support vector machine | 0.47 | 0.95 | 0.94 | 90.92 |
Lists the cardiovascular ADRs predicted by the machine learning RF and SMO models on uncharacterized drugs in SIDER
| Drug name | ADR predicted by RF and SMO |
|---|---|
| DB00176 Fluvoxamine | Decompensation cardiac |
| DB00255 Diethylstilboestrol | Shock, cardiac failure congestive, cardiopulmonary failure, left ventricular failure, cor pulmonale, tachycardia |
| DB00358 Mefloquine | Tachycardia, cardiac failure congestive |
| DB00755 Tretinoin | Block heart, cardiac murmur, heart rate irregular, cardiac failure acute |
| DB00882 Clomifene | Decompensation cardiac |
| DB00927 Famotidine | Tachycardia, cardiac murmur, heart rate irregular, cardiac failure acute |
| DB01026 Ketoconazole | Tachycardia, shock, cardiac failure congestive, cardiotoxicity, cor pulmonale, left ventricular failure, heart malformation, cardiopulmonary failure |
| DB01394 Colchicine | Tachycardia, shock, cardiac failure congestive, cardiac murmur, cardiac disorder |
| DB03904 Urea | Cardiac murmur, heart rate irregular, cardiac failure acute |
| DB05260 Gallium nitrate | Decompensation cardiac |
| DB06210 Eltrombopag | Block heart, cardiac murmur, heart rate irregular, cardiac failure acute, cardiac failure, arrhythmia |
| DB08872 Gabapentin Enacarbil | Decompensation cardiac |