Ali Ezzat, Min Wu, Xiao-Li Li, Chee-Keong Kwoh.
Abstract
BACKGROUND: Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers that predict new, undiscovered interactions. However, these methods have not yet addressed a key challenge in this data, namely class imbalance, which potentially degrades prediction performance. Class imbalance can be divided into two sub-problems. First, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance because prediction results are biased towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Second, the data contain multiple types of drug-target interactions, some of which have relatively fewer members (i.e. are less represented) than others. This variation in the representation of the different interaction types leads to another kind of imbalance, referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types.
Keywords: Between-class imbalance; Class imbalance; Drug-target interaction prediction; Ensemble learning; Small disjuncts; Within-class imbalance
Year: 2016 PMID: 28155697 PMCID: PMC5259867 DOI: 10.1186/s12859-016-1377-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Statistics of the interaction dataset used in this study
| Drugs | Targets | Interactions |
|---|---|---|
| 5877 | 3348 | 12674 |
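To make the between-class imbalance concrete, the following minimal sketch derives an imbalance ratio from the dataset statistics in the table above. It assumes, purely for illustration, that every drug-target combination is a candidate pair and that any pair without a known interaction counts as non-interacting. The variable names are hypothetical and are not taken from the paper.

```python
# Dataset sizes taken from the statistics table above.
n_drugs = 5877
n_targets = 3348
n_interactions = 12674

# Assumption (not stated in the source): every drug-target combination is a
# candidate pair, and any pair without a known interaction is a negative.
n_pairs = n_drugs * n_targets
n_negative = n_pairs - n_interactions

# Between-class imbalance: negatives per known positive.
imbalance_ratio = n_negative / n_interactions
positive_fraction = n_interactions / n_pairs

print(f"candidate pairs:   {n_pairs}")
print(f"non-interacting:   {n_negative}")
print(f"imbalance ratio:   {imbalance_ratio:.1f} : 1")
print(f"positive fraction: {positive_fraction:.4%}")
```

Under this assumption there are roughly 1,550 non-interacting pairs for every known interaction, which is why a classifier trained naively on such data tends to favour the majority class.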
Fig. 1 Plot of ROC curves of the different methods. ROC curves for the different methods are plotted together, providing a visual comparison of their prediction performance
AUC results of cross-validation experiments
| Method | AUC (standard deviation) |
|---|---|
| Decision Tree | 0.760 (0.004) |
| SVM | 0.804 (0.004) |
| Nearest Neighbor | 0.814 (0.003) |
| Random Forest | 0.855 (0.006) |
Standard deviations are included between parentheses. Best AUC is indicated in bold
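As a reminder of what the AUC values above measure, here is a minimal sketch that computes AUC from predicted scores using the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen positive pair is scored higher than a randomly chosen negative pair, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical. This illustrates the metric only; it is not the evaluation code used in the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
toy_auc = auc_from_scores(labels, scores)
print(f"toy AUC: {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, so it suits small illustrations only; for datasets of the size described here, a sort-based implementation would be the practical choice.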
Top 20 targets predicted for Aripiprazole and Theophylline
| Rank | Target (Aripiprazole) | Rank | Target (Theophylline) |
|---|---|---|---|
| 1 | | 1 | |
| 2 | | 2 | |
| 3 | | 3 | |
| 4 | | 4 | |
| 5 | | 5 | |
| 6 | | 6 | |
| 7 | | 7 | |
| 8 | | 8 | |
| 9 | | 9 | Adenosine receptor A3 |
| 10 | | 10 | Thymidylate synthase |
| 11 | | 11 | Histone deacetylase 1 |
| 12 | Delta-type opioid receptor | 12 | Cyclin-dependent kinase 2 |
| 13 | | 13 | Reverse transcriptase/RNaseH |
| 14 | | 14 | Cap-specific mRNA (nucleoside-2’-O-)-methyltransferase |
| 15 | | 15 | Multi-sensor signal transduction histidine kinase |
| 16 | | 16 | Alpha-1 adrenergic receptor |
| 17 | | 17 | Serine/threonine-protein kinase pim-1 |
| 18 | | 18 | Serine-protein kinase ATM |
| 19 | | 19 | Proto-oncogene tyrosine-protein kinase Src |
| 20 | | 20 | Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform |
Targets in bold are the true known targets of the drugs
Top 20 drugs predicted for Glutamate receptor ionotropic, kainate 2 and Xylose isomerase
| Rank | Drug (Glutamate receptor ionotropic, kainate 2) | Rank | Drug (Xylose isomerase) |
|---|---|---|---|
| 1 | | 1 | |
| 2 | | 2 | alpha-D-Xylopyranose |
| 3 | | 3 | |
| 4 | | 4 | beta-D-Ribopyranose |
| 5 | | 5 | |
| 6 | | 6 | |
| 7 | | 7 | |
| 8 | | 8 | |
| 9 | | 9 | Tris-Hydroxymethyl-Methyl-Ammonium |
| 10 | | 10 | |
| 11 | | 11 | Ethanol |
| 12 | | 12 | Beta-D-Glucose |
| 13 | | 13 | D-Allopyranose |
| 14 | | 14 | 2-Deoxy-Beta-D-Galactose |
| 15 | | 15 | Tris |
| 16 | Lysine Nz-Carboxylic Acid | 16 | 3-O-Methylfructose in Linear Form |
| 17 | | 17 | Dithioerythritol |
| 18 | | 18 | (2s,3s)-1,4-Dimercaptobutane-2,3-Diol |
| 19 | Vitamin A | 19 | 1,4-Dithiothreitol |
| 20 | Mephenytoin | 20 | Glycerol |
Drugs in bold are true known drugs of the targets