Farshid Rayhan1, Sajid Ahmed1, Zaynab Mousavian2, Dewan Md Farid1, Swakkhar Shatabda1.
Abstract
Drug-target interaction prediction is an important task in pharmacology and therapeutic drug design. In this paper, we present FRnet-DTI, which combines auto-encoder based feature manipulation with a convolutional neural network based classifier for drug-target interaction prediction. Two convolutional neural networks are proposed: FRnet-Encode, used for feature manipulation, and FRnet-Predict, used for classification. Using FRnet-Encode, we generate 4096 features for each instance in each dataset; FRnet-Predict then uses those features to estimate the interaction probability. We have tested our method on four gold standard datasets extensively used by other researchers. Experimental results show that our method significantly improves over the state-of-the-art method on three out of four drug-target interaction gold standard datasets in terms of both area under the Receiver Operating Characteristic curve (auROC) and area under the Precision-Recall curve (auPR). We also introduce twenty new potential drug-target pairs for interaction based on high prediction scores. The source code and implementation details of our methods are available from https://github.com/farshidrayhanuiu/FRnet-DTI/, and the method is also available as a web application at http://farshidrayhan.pythonanywhere.com/FRnet-DTI/.
Keywords: Bioinformatics; Class imbalance; Classification; Computer Science; Drug-Target; Ensemble classifier; Feature engineering
Year: 2020 PMID: 32154410 PMCID: PMC7052404 DOI: 10.1016/j.heliyon.2020.e03444
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
A short summary of the structural and evolutionary features used for drug fingerprints and protein targets. The Feature group column lists each feature group used in this experiment; the groups are discussed in later sections.
| Feature group | Number of features | Reference |
|---|---|---|
| Molecular fingerprint | 881 | |
| PSSM bigram | 400 | |
| Secondary Structure Composition | 3 | |
| Accessible Surface Area Composition | 1 | |
| Torsional Angles Composition | 8 | |
| Torsional Angles Auto-Covariance | 80 | |
| Structural Probabilities Auto-Covariance | 30 | |
| Torsional Angles bigram | 64 | |
| Structural Probabilities bigram | 9 | |
| Total | 1476 | |
Figure 3. Inception module from [40], [41], [42].
Figure 5. Top view of the proposed system.
Figure 1. Architecture of FRnet-Encode.
Tensor shapes after each convolution operation leading up to the merge operation, also known as the Inception operation.
| Index | Input shape | Output shape after first convolution | Output shape after second convolution/Max-pool |
|---|---|---|---|
| a | X, 53, 2, 32 | X, 53, 2, 16 | X, 53, 2, 64 |
| b | X, 53, 2, 32 | X, 53, 2, 16 | X, 53, 2, 64 |
| c | X, 53, 2, 32 | X, 53, 2, 16 | X, 53, 2, 32 |
| d | X, 53, 2, 32 | – | X, 53, 2, 32 |
| Final tensor shape | | | X, 53, 2, 192 |
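The channel arithmetic in the table can be checked with a small sketch. It assumes the four branches (a to d) are merged by concatenation along the last (channel) axis, which is what the final shape (X, 53, 2, 192) implies. The function name `merge_branch_shapes` is hypothetical and not taken from the FRnet-DTI code, and `None` stands in for the batch dimension X.

```python
def merge_branch_shapes(shapes):
    """Return the shape obtained by concatenating tensors along the last axis.

    All leading dimensions must agree; only the channel counts are summed.
    """
    leading = shapes[0][:-1]
    for shape in shapes:
        if shape[:-1] != leading:
            raise ValueError(f"incompatible branch shape: {shape}")
    return leading + (sum(shape[-1] for shape in shapes),)


# Final output shapes of branches a-d from the table; None stands in for X.
branch_outputs = [
    (None, 53, 2, 64),  # a
    (None, 53, 2, 64),  # b
    (None, 53, 2, 32),  # c
    (None, 53, 2, 32),  # d
]

merged = merge_branch_shapes(branch_outputs)
print(merged)
```

Because only the channel axis grows, the spatial dimensions (53, 2) pass through the merge unchanged, so 64 + 64 + 32 + 32 gives the 192 channels reported in the last row.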
Figure 2. Accuracy curves of FRnet-Encode using binary cross-entropy as the loss function on four datasets: (a) enzymes, (b) ion channels (ic), (c) GPCRs, (d) nuclear receptors (nr).
Figure 4. Architecture of FRnet-Predict.
Description of the gold standard datasets with structural and evolutionary features [8].
| Dataset | Drugs | Proteins | Positive interactions | Imbalance ratio |
|---|---|---|---|---|
| Enzyme | 445 | 664 | 2926 | 99.98 |
| Ion Channel | 210 | 204 | 1476 | 28.02 |
| GPCR | 223 | 95 | 635 | 32.36 |
| Nuclear Receptor | 54 | 26 | 90 | 14.6 |
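The imbalance ratios in the table are consistent with the number of negative pairs divided by the number of positive pairs, assuming that every drug-protein pair without a known interaction is treated as a negative. The sketch below reproduces the column under that assumption; the helper name `imbalance_ratio` is illustrative only and is not taken from the paper.

```python
def imbalance_ratio(n_drugs, n_proteins, n_positive):
    """Negatives per positive, treating every unlabeled pair as a negative."""
    n_pairs = n_drugs * n_proteins
    n_negative = n_pairs - n_positive
    return n_negative / n_positive


# (drugs, proteins, positive interactions) copied from the table above.
datasets = {
    "Enzyme": (445, 664, 2926),
    "Ion Channel": (210, 204, 1476),
    "GPCR": (223, 95, 635),
    "Nuclear Receptor": (54, 26, 90),
}

ratios = {
    name: round(imbalance_ratio(*counts), 2)
    for name, counts in datasets.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

For the Enzyme dataset, for example, 445 × 664 = 295,480 candidate pairs yield 292,554 negatives against 2,926 positives.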
A comparison of performances among FRnet-Predict and other classifiers on the gold standard datasets in terms of auROC and auPR using 4096 features generated by FRnet-Encode.
| Dataset | Classifier | auPR | auROC |
|---|---|---|---|
| enzymes | Decision Tree | 0.28 | 0.9376 |
| | SVM | 0.53 | 0.9010 |
| | MEBoost | 0.41 | 0.9404 |
| | CUSBoost | | 0.9345 |
| | FRnet-Predict | 0.70 | |
| GPCR | Decision Tree | 0.31 | 0.9038 |
| | SVM | 0.44 | 0.8859 |
| | MEBoost | 0.46 | 0.9075 |
| | CUSBoost | 0.65 | 0.8989 |
| | FRnet-Predict | | |
| Ion Channel | Decision Tree | 0.29 | 0.933 |
| | SVM | 0.40 | 0.8904 |
| | MEBoost | 0.39 | 0.928 |
| | CUSBoost | 0.45 | 0.8851 |
| | FRnet-Predict | | |
| NR | Decision Tree | 0.46 | 0.8147 |
| | SVM | 0.41 | 0.7605 |
| | MEBoost | 0.23 | 0.9165 |
| | CUSBoost | 0.71 | 0.8989 |
| | FRnet-Predict | | |
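The two metrics reported in these tables can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code: auROC is computed here through the pairwise rank formulation, auPR is approximated by average precision (one common estimator of the area under the precision-recall curve), and the function names and toy data are made up for illustration.

```python
def au_roc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Equivalent to the area under the ROC curve; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def au_pr(labels, scores):
    """Average precision: mean precision at the rank of each positive.

    Assumes scores are distinct; tied scores would need explicit handling.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits


# Toy imbalanced example: 2 positives among 8 pairs (hypothetical scores).
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]

print(round(au_roc(labels, scores), 4))
print(round(au_pr(labels, scores), 4))
```

On imbalanced data a single highly ranked negative costs little auROC but noticeably lowers auPR, which is one reason the tables report both; the Decision Tree rows, with auROC above 0.9 and auPR near 0.3, show the same pattern.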
A performance comparison of FRnet-Predict with AdaBoost, Support Vector Machine and Random Forest classifiers on the gold standard datasets in terms of auROC and auPR.
| Dataset | Reference | Classifier | auPR | auROC |
|---|---|---|---|---|
| enzymes | | AdaBoost | 0.68 | 0.9689 |
| | | Random Forest | 0.43 | 0.9457 |
| | | SVM | 0.54 | 0.9194 |
| | | FRnet-Predict | | |
| GPCR | | AdaBoost | 0.31 | 0.9128 |
| | | Random Forest | 0.30 | 0.9168 |
| | | SVM | 0.28 | 0.8720 |
| | | FRnet-Predict | | |
| Ion Channel | | AdaBoost | 0.48 | 0.9369 |
| | | Random Forest | 0.40 | 0.9234 |
| | | SVM | 0.39 | 0.8890 |
| | | FRnet-Predict | | |
| NR | | AdaBoost | | |
| | | Random Forest | 0.29 | 0.7723 |
| | | SVM | 0.41 | 0.8690 |
| | | FRnet-Predict | 0.73 | 0.9241 |
Performance of FRnet-Predict on the four benchmark gold standard datasets in terms of auROC, compared with other state-of-the-art methods. 'N/A' denotes that the dataset was not used in the article.
| Method | Enzyme | GPCR | Ion channels | Nuclear receptor |
|---|---|---|---|---|
| Yamanishi et al. | 0.904 | 0.8510 | 0.8990 | 0.8430 |
| Yamanishi et al. | 0.8920 | 0.8120 | 0.8270 | 0.8350 |
| DBSI | 0.8075 | 0.8029 | 0.8022 | 0.7578 |
| KBMF2K | 0.8320 | 0.7990 | 0.8570 | 0.8240 |
| NetCBP | 0.8251 | 0.8034 | 0.8235 | 0.8394 |
| Wang et al. | 0.8860 | 0.8930 | 0.8730 | 0.8240 |
| Mutowo et al. | 0.9480 | 0.8990 | 0.8720 | 0.8690 |
| iDTI-ESBoost | 0.9689 | 0.9369 | 0.9222 | |
| Wang et al. | 0.9425 | 0.8743 | 0.9107 | 0.8176 |
| CFSBoost | 0.9563 | 0.9377 | 0.9278 | 0.8147 |
| Our Method | | | | 0.9241 |
Comparison of the performance of FRnet-Predict on the four benchmark gold standard datasets from [8] in terms of auPR with other state-of-the-art methods.
| Predictor | enzymes | GPCRs | Ion channels | Nuclear receptors |
|---|---|---|---|---|
| Mousavian et al. | 0.54 | 0.39 | 0.28 | 0.41 |
| iDTI-ESBoost | 0.68 | 0.48 | 0.48 | |
| CFSBoost | 0.68 | 0.54 | 0.50 | 0.73 |
| Ezzat et al. | 0.41 | 0.42 | 0.36 | 0.57 |
| FRnet-Predict | | | | 0.73 |
Performance comparison of FRnet-Predict and other classifiers on the gold standard datasets in terms of auROC and auPR using 4096 features generated by FRnet-Encode with input shape (X, 7, 211, 1).
| Dataset | Classifier | auPR | auROC |
|---|---|---|---|
| enzymes | Decision Tree | 0.27 | 0.9299 |
| | SVM | 0.54 | 0.9035 |
| | FRnet-Predict | | |
| GPCR | Decision Tree | 0.32 | 0.9038 |
| | SVM | 0.48 | 0.8859 |
| | FRnet-Predict | | |
| Ion Channel | Decision Tree | 0.60 | 0.9235 |
| | SVM | 0.52 | 0.8894 |
| | FRnet-Predict | | |
| NR | Decision Tree | 0.43 | 0.8207 |
| | SVM | 0.42 | 0.7588 |
| | FRnet-Predict | | |
New predictions made by FRnet-Predict for the four gold standard datasets used in this paper.
| Dataset | Protein Id | Drug Id | Drug name | Score |
|---|---|---|---|---|
| enzymes | hsa:10825 | D00041 | Threonine (USP) | 0.8351 |
| | hsa:4759 | D00041 | Threonine (USP) | 0.8321 |
| | hsa:129807 | D00041 | Threonine (USP) | 0.8255 |
| | hsa:4953 | D00041 | Threonine (USP) | 0.8048 |
| | hsa:1845 | D00041 | Threonine (USP) | 0.7923 |
| ion channels | hsa:285242 | D00294 | Diazoxide (JAN/USP/INN) | 0.9823 |
| | hsa:779 | D00294 | Diazoxide (JAN/USP/INN) | 0.9712 |
| | hsa:2561 | D00294 | Diazoxide (JAN/USP/INN) | 0.9723 |
| | hsa:785 | D00294 | Diazoxide (JAN/USP/INN) | 0.9634 |
| | hsa:11254 | D00294 | Diazoxide (JAN/USP/INN) | 0.9565 |
| GPCRs | hsa:9052 | D04625 | Isoetharine (USP) | 0.9013 |
| | hsa:9052 | D00632 | Dobutamine hydrochloride (JP17/USP) | 0.9013 |
| | hsa:9052 | D03880 | Dobutamine lactobionate (USAN) | 0.8912 |
| | hsa:9052 | D03881 | Dobutamine tartrate (USP) | 0.8904 |
| | hsa:1909 | D03621 | Cyclizine (INN) | 0.8898 |
| nuclear receptors | hsa:2099 | D01132 | Tazarotene (JAN/USAN/INN) | 0.9912 |
| | hsa:2101 | D00956 | Nandrolone phenpropionate (USP) | 0.9876 |
| | hsa:2101 | D00443 | Spironolactone (JP17/USP/INN) | 0.9885 |
| | hsa:2099 | D00316 | Etretinate (JAN/USAN/INN) | 0.9472 |
| | hsa:9971 | D00316 | Etretinate (JAN/USAN/INN) | 0.9102 |