Lei Wang, Zhu-Hong You, Li-Ping Li, Xin Yan, Wei Zhang.
Abstract
Accumulating evidence shows that drug-target interactions (DTIs) play a crucial role in genomic drug discovery. Although experimental techniques have made great progress, identifying DTIs experimentally remains time-consuming and expensive. In silico models are therefore needed to complement biological experiments by predicting potential DTIs. In this work, a new model is designed to predict DTIs by incorporating chemical substructures and protein evolutionary information. Specifically, we first use the Position-Specific Scoring Matrix (PSSM) to convert each protein sequence into a numerical descriptor containing evolutionary information, then apply the Discrete Cosine Transform (DCT) to extract hidden features and integrate them with the chemical substructure descriptor, and finally use a Rotation Forest (RF) classifier to predict whether a drug and a target protein interact. In 5-fold cross-validation (CV) experiments, the average accuracy of the proposed model on the benchmark datasets of Enzymes, Ion Channels, GPCRs and Nuclear Receptors reached 0.9140, 0.8919, 0.8724 and 0.8111, respectively. To evaluate the model more fully, we compare it with a different feature extraction model, a different classifier, and other state-of-the-art models. We also carried out case studies, in which 8 of the top 10 drug-target pairs with the highest prediction scores were confirmed by related databases. These results indicate that the proposed model predicts DTIs well and can provide reliable candidates for biological experiments.
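The feature-extraction step described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes the PSSM is given as a list of rows (one row per residue, 20 columns), applies an orthonormal 2-D DCT-II, keeps a low-frequency block of coefficients, and appends a binary substructure fingerprint. The function names, the size of the retained block (`keep_rows`, `keep_cols`) and the toy inputs are assumptions made for illustration; they are not the settings reported by the authors.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(matrix):
    """2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(list(r)) for r in matrix]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def pssm_dct_features(pssm, keep_rows=20, keep_cols=20):
    """Flatten the low-frequency corner of the DCT of a PSSM.

    Proteins shorter than keep_rows are zero-padded so that every
    protein yields a vector of the same length (an assumption).
    """
    coeffs = dct_2d(pssm)
    n_rows, n_cols = len(coeffs), len(coeffs[0])
    feats = []
    for r in range(keep_rows):
        for c in range(keep_cols):
            inside = r < n_rows and c < n_cols
            feats.append(coeffs[r][c] if inside else 0.0)
    return feats


def build_pair_features(pssm, fingerprint, keep_rows=20, keep_cols=20):
    """Concatenate protein DCT features with a drug fingerprint."""
    protein_part = pssm_dct_features(pssm, keep_rows, keep_cols)
    drug_part = [float(bit) for bit in fingerprint]
    return protein_part + drug_part


if __name__ == "__main__":
    toy_pssm = [[float((i + j) % 5) for j in range(20)] for i in range(30)]
    toy_fingerprint = [1, 0, 1, 1]  # hypothetical substructure bits
    vec = build_pair_features(toy_pssm, toy_fingerprint, 4, 5)
    print(len(vec))
```

Keeping only the low-frequency coefficients gives every protein a fixed-length vector regardless of sequence length, which is what allows the protein and drug descriptors to be concatenated and passed to a single classifier.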
Year: 2020 PMID: 32313024 PMCID: PMC7171114 DOI: 10.1038/s41598-020-62891-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The workflow of the proposed model to predict potential drug-target interactions.
Statistics for the drug-target interactions.
| Statistics | Enzyme | Ion Channel | GPCR | Nuclear Receptor |
|---|---|---|---|---|
| No. of drugs | 445 | 210 | 223 | 54 |
| No. of target proteins | 664 | 204 | 95 | 26 |
| No. of drug-target interactions | 2926 | 1476 | 635 | 90 |
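To put these counts in perspective, the short sketch below computes how many drug-target pairs are possible in each dataset and what fraction of them are known interactions. The counts are copied from the table above; the dictionary layout and the function name `interaction_density` are assumptions made for illustration.

```python
# Counts copied from the statistics table above.
DATASETS = {
    "Enzyme": {"drugs": 445, "targets": 664, "interactions": 2926},
    "Ion Channel": {"drugs": 210, "targets": 204, "interactions": 1476},
    "GPCR": {"drugs": 223, "targets": 95, "interactions": 635},
    "Nuclear Receptor": {"drugs": 54, "targets": 26, "interactions": 90},
}


def interaction_density(drugs, targets, interactions):
    """Fraction of all possible drug-target pairs that are known interactions."""
    return interactions / (drugs * targets)


if __name__ == "__main__":
    for name, stats in DATASETS.items():
        pairs = stats["drugs"] * stats["targets"]
        density = interaction_density(**stats)
        print(f"{name}: {pairs} possible pairs, density {density:.4f}")
```

In every dataset the known interactions cover well under a tenth of the possible pairs, so the unlabeled pairs vastly outnumber the confirmed ones.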
Figure 2. Accuracy surface of the rotation forest used to optimize the parameters K and L.
Average 5-fold CV results obtained by our model on four benchmark datasets.
| Dataset | Evaluation Criteria | Accu. | Prec. | Sen. | MCC | AUC |
|---|---|---|---|---|---|---|
| Enzyme | Average | 0.9140 | 0.9202 | 0.9070 | 0.8428 | 0.9088 |
| Enzyme | Standard Deviation | 0.0075 | 0.0139 | 0.0225 | 0.0125 | 0.0116 |
| Ion Channel | Average | 0.8919 | 0.8928 | 0.8899 | 0.7836 | 0.8925 |
| Ion Channel | Standard Deviation | 0.0107 | 0.0188 | 0.0166 | 0.0237 | 0.0140 |
| GPCR | Average | 0.8724 | 0.8799 | 0.8632 | 0.7454 | 0.8673 |
| GPCR | Standard Deviation | 0.0066 | 0.0337 | 0.0272 | 0.0134 | 0.0181 |
| Nuclear Receptor | Average | 0.8111 | 0.8040 | 0.8346 | 0.6328 | 0.7993 |
| Nuclear Receptor | Standard Deviation | 0.0412 | 0.0944 | 0.1160 | 0.0817 | 0.0593 |
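The accuracy, precision, sensitivity and MCC columns above all derive from the counts of a confusion matrix. The sketch below uses the standard definitions of these criteria and shows how per-fold values can be summarized as an average and a standard deviation. The function names and the fold values in the usage example are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which variant was used.

```python
import math
import statistics


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification criteria from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "mcc": mcc,
    }


def mean_and_std(values):
    """Average and sample standard deviation over the CV folds (assumed variant)."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical counts for one fold, not taken from the paper.
    print(classification_metrics(tp=90, tn=85, fp=15, fn=10))
    # Hypothetical per-fold accuracies.
    print(mean_and_std([0.90, 0.92, 0.91, 0.93, 0.91]))
```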
Figure 3. ROC curves of the proposed method on the Enzymes dataset.
Figure 6. ROC curves of the proposed method on the Nuclear Receptors dataset.
Average 5-fold CV results obtained by Pseudo-AAC model on four benchmark datasets.
| Dataset | Evaluation Criteria | Accu. | Prec. | Sen. | MCC | AUC |
|---|---|---|---|---|---|---|
| Enzyme | Average | 0.8450 | 0.8536 | 0.8335 | 0.6905 | 0.8435 |
| Enzyme | Standard Deviation | 0.0085 | 0.0203 | 0.0120 | 0.0174 | 0.0140 |
| Ion Channel | Average | 0.8296 | 0.8267 | 0.8354 | 0.6596 | 0.8314 |
| Ion Channel | Standard Deviation | 0.0141 | 0.0182 | 0.0269 | 0.0286 | 0.0135 |
| GPCR | Average | 0.7425 | 0.7463 | 0.7342 | 0.4846 | 0.7531 |
| GPCR | Standard Deviation | 0.0299 | 0.0321 | 0.0350 | 0.0588 | 0.0215 |
| Nuclear Receptor | Average | 0.7000 | 0.6836 | 0.7396 | 0.3982 | 0.7259 |
| Nuclear Receptor | Standard Deviation | 0.0362 | 0.0714 | 0.0702 | 0.0826 | 0.0564 |
Average 5-fold CV results obtained by SVM model on four benchmark datasets.
| Dataset | Evaluation Criteria | Accu. | Prec. | Sen. | MCC | AUC |
|---|---|---|---|---|---|---|
| Enzyme | Average | 0.8518 | 0.8479 | 0.8578 | 0.7040 | 0.8512 |
| Enzyme | Standard Deviation | 0.0085 | 0.0184 | 0.0147 | 0.0168 | 0.0104 |
| Ion Channel | Average | 0.8492 | 0.8499 | 0.8492 | 0.6984 | 0.8489 |
| Ion Channel | Standard Deviation | 0.0139 | 0.0230 | 0.0108 | 0.0279 | 0.0154 |
| GPCR | Average | 0.7803 | 0.7753 | 0.7944 | 0.5640 | 0.7800 |
| GPCR | Standard Deviation | 0.0218 | 0.0495 | 0.0425 | 0.0432 | 0.0339 |
| Nuclear Receptor | Average | 0.6778 | 0.6726 | 0.6905 | 0.3605 | 0.6665 |
| Nuclear Receptor | Standard Deviation | 0.0421 | 0.0787 | 0.1151 | 0.0786 | 0.0718 |
Figure 7. Comparison of the experimental results of three models on the benchmark datasets using 5-fold CV. (a) Enzymes dataset. (b) Ion Channels dataset. (c) GPCRs dataset. (d) Nuclear Receptors dataset.
Comparison of other excellent models and the proposed model on four benchmark datasets in terms of the AUC.
| Dataset | Our model | MLCLE | KBMF2K | AM-PSSM | SIMCOMP |
|---|---|---|---|---|---|
| Enzyme | 0.9088 | 0.842 | 0.832 | 0.843 | 0.863 |
| Ion Channel | 0.8925 | 0.795 | 0.799 | 0.722 | 0.776 |
| GPCR | 0.8673 | 0.850 | 0.857 | 0.839 | 0.867 |
| Nuclear Receptor | 0.7993 | 0.790 | 0.824 | 0.767 | |
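The AUC used for this comparison can be read as the probability that a randomly chosen interacting pair receives a higher score than a randomly chosen non-interacting pair. The sketch below computes it directly from that definition, counting ties as half a win. The function name `roc_auc` and the toy labels and scores are assumptions for illustration, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 values; scores: predicted scores, same length.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example, not data from the paper.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```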
Details of the top 10 drug-target pairs with the highest predicted scores.
| Drug ID | Drug Name | Target Protein ID | Target Protein Name | Validation Source |
|---|---|---|---|---|
| D00049 | Nikotinsaeure | hsa8843 | G109B_HUMAN | SuperTarget |
| D00348 | Isotretinoino | hsa6256 | RXRA_HUMAN | SuperTarget |
| D00437 | Nifedipine Monohydrochloride | hsa1559 | CP2C9_HUMAN | SuperTarget |
| D00139 | Xanthotoxine | hsa1543 | CP1A1_HUMAN | SuperTarget |
| D00585 | Mifepristone | hsa2099 | ESR1_HUMAN | SuperTarget |
| D00951 | Medroxyprogesteroneacetate | hsa2099 | ESR1_HUMAN | SuperTarget |
| D02340 | Loxapinsuccinate | hsa1812 | DRD1_HUMAN | SuperTarget |
| D00900 | Monomethylhydrazine | hsa1020 | CDK5_HUMAN | N/A |
| D03365 | Transdermal Nicotine | hsa1137 | ACHA4_HUMAN | SuperTarget |
| D00448 | Methylphosphonothiolate | hsa10720 | UDB11_HUMAN | N/A |