| Literature DB >> 35955587 |
Ashutosh Ghimire, Hilal Tayara, Zhenyu Xuan, Kil To Chong.
Abstract
Drug discovery, which aids in identifying potential novel treatments, spans a broad range of scientific fields, including chemistry, pharmacology, and biology. In the early stages of drug development, predicting drug-target affinity (DTA) is crucial. The proposed model, the prediction of drug-target affinity using a convolution model with self-attention (CSatDTA), applies convolution-based self-attention mechanisms to molecular drug and target sequences to predict DTA effectively, unlike previous convolutional methods, which exhibit significant limitations in this respect: a convolutional neural network (CNN) operates only on a local region of the input, missing comprehensive, sequence-wide details. Self-attention, on the other hand, is a relatively recent technique for capturing long-range interactions that has been used primarily in sequence-modeling tasks. Comparative experiments show that CSatDTA surpasses previous sequence-based and other approaches and has outstanding retention abilities.
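The abstract's contrast between a CNN's local receptive field and self-attention's global one can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the dimensions, random weights, and single-head scaled dot-product formulation are all assumptions for illustration only.

```python
# Illustrative sketch (not the authors' code): a kernel-size-3 1D convolution
# sees only 3 positions per output, while self-attention mixes all positions.
import numpy as np

rng = np.random.default_rng(0)
L, d = 12, 8                      # toy sequence length and embedding size
x = rng.normal(size=(L, d))       # embedded sequence (e.g., SMILES tokens)

# 1D convolution, kernel size 3: each output row depends on 3 input rows only.
w = rng.normal(size=(3, d, d))
conv = np.stack([sum(x[i + k] @ w[k] for k in range(3)) for i in range(L - 2)])

# Scaled dot-product self-attention: every position attends to all L positions.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
attn = weights @ v                # (L, d): each row mixes the whole sequence

print(conv.shape, attn.shape)     # (10, 8) (12, 8)
```

Stacking such attention layers after the convolutions, as CSatDTA does, lets the model retain long-range dependencies that convolution alone discards.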
Keywords: artificial intelligence; attention; binding affinity; convolution neural network; deep learning; drug discovery and development; drug–target interaction; ligands; pharmacometrics; proteins
Year: 2022 PMID: 35955587 PMCID: PMC9369082 DOI: 10.3390/ijms23158453
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Hyperparameters for CSatDTA.
| Hyperparameter | Value |
|---|---|
| Learning rate (initially) | 0.001 |
| Batch size | 64 |
| Optimizer | Adadelta |
| Kernel initializer | Glorot Normal |
| CNN layers | 2 |
| Attention layers | 2 |
| Number of attention heads for SMILES | 4 |
| Number of attention heads for proteins | 10 |
| Filters for keys and values for SMILES | 2 |
| Filters for keys and values for proteins | 5 |
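A hypothetical PyTorch sketch of how the table's values could be wired together is shown below. The embedding sizes, channel counts, and pooling are assumptions (the table's key/value filter counts are omitted for brevity); only the layer counts, head counts, optimizer, initial learning rate, and Glorot (Xavier) normal initializer come from the table.

```python
# Hypothetical wiring of the table's hyperparameters; NOT the paper's code.
import torch
import torch.nn as nn

class SeqBranch(nn.Module):
    """Two conv layers followed by two self-attention layers, per the table."""
    def __init__(self, vocab, emb=128, heads=4, channels=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb if i == 0 else channels, channels, 3, padding=1)
             for i in range(2)])                       # 2 CNN layers
        self.attns = nn.ModuleList(
            [nn.MultiheadAttention(channels, heads, batch_first=True)
             for _ in range(2)])                       # 2 attention layers
        for c in self.convs:
            nn.init.xavier_normal_(c.weight)           # Glorot Normal init

    def forward(self, tokens):
        h = self.emb(tokens).transpose(1, 2)           # (B, emb, L)
        for c in self.convs:
            h = torch.relu(c(h))
        h = h.transpose(1, 2)                          # (B, L, channels)
        for a in self.attns:
            h, _ = a(h, h, h)                          # self-attention
        return h.mean(dim=1)                           # pooled embedding

# 4 heads for SMILES, 10 for proteins, per the table (vocab/channels assumed).
smiles = SeqBranch(vocab=64, heads=4)
protein = SeqBranch(vocab=26, heads=10, channels=40)
opt = torch.optim.Adadelta(
    list(smiles.parameters()) + list(protein.parameters()),
    lr=0.001)                                          # initial learning rate
```

Note that `nn.MultiheadAttention` requires the channel width to be divisible by the head count, which constrains the assumed channel sizes above.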
Prediction performance on the KIBA dataset.
| Method | Compound Rep. | Protein Rep. | MSE | RMSE | CI |
|---|---|---|---|---|---|
| KronRLS | Pubchem-Sim | Smith–Waterman | 0.411 | 0.641 | 0.782 |
| SimBoost | Pubchem-Sim | Smith–Waterman | 0.222 | 0.471 | 0.836 |
| DeepDTA | 1D | 1D | 0.179 | 0.423 | 0.863 |
| WideDTA | 1D + LMCS | 1D + PDM | 0.194 | 0.440 | 0.875 |
| GAT_GCN | Graph | 1D | 0.140 | 0.374 | 0.891 |
| CSatDTA (Proposed) | 1D | 1D | 0.134 | 0.366 | 0.898 |
Prediction performance on the Davis dataset.
| Method | Compound Rep. | Protein Rep. | MSE | RMSE | CI |
|---|---|---|---|---|---|
| KronRLS | Pubchem-Sim | Smith–Waterman | 0.379 | 0.615 | 0.871 |
| SimBoost | Pubchem-Sim | Smith–Waterman | 0.282 | 0.531 | 0.872 |
| DeepDTA | 1D | 1D | 0.261 | 0.510 | 0.878 |
| WideDTA | 1D + LMCS | 1D + PDM | 0.262 | 0.511 | 0.886 |
| GAT_GCN | Graph | 1D | 0.245 | 0.494 | 0.881 |
| CSatDTA (Proposed) | 1D | 1D | 0.241 | 0.490 | 0.892 |
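The metrics in the two tables above can be computed as follows; this is a hedged NumPy sketch with illustrative toy values, where CI is the concordance index (the fraction of comparable affinity pairs whose predicted ordering matches the measured ordering, with prediction ties counted as half-concordant).

```python
# Sketch of the evaluation metrics reported in the tables: MSE, RMSE, and CI.
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted affinities."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs ranked in the same order by the
    predictions as by the measurements; prediction ties count 0.5."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    concordant, total = 0.0, 0
    for i in range(len(y_true)):
        for j in range(i + 1, len(y_true)):
            if y_true[i] == y_true[j]:
                continue                  # tied measurements are not comparable
            total += 1
            d = (y_pred[i] - y_pred[j]) * (y_true[i] - y_true[j])
            concordant += 1.0 if d > 0 else 0.5 if d == 0 else 0.0
    return concordant / total

# Toy affinities (illustrative only, not values from the paper).
y_true = [5.0, 6.2, 7.1, 4.3]
y_pred = [5.1, 6.0, 7.4, 4.9]
print(mse(y_true, y_pred),                # 0.125
      round(np.sqrt(mse(y_true, y_pred)), 4),
      concordance_index(y_true, y_pred))  # 1.0 (all pairs ordered correctly)
```

An RMSE of 0.366 and CI of 0.898 for CSatDTA on KIBA therefore mean its predictions are both closer in value and more consistently ordered than the baselines'.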
Figure 1. Predictions of the CSatDTA model vs. measured binding affinity values for the Davis dataset.
Figure 2. Predictions of the CSatDTA model vs. measured binding affinity values for the KIBA dataset.
Figure 3. Snapshot of the web server showing a binding affinity prediction.
Datasets.
| Dataset | Proteins | Compounds | Interactions |
|---|---|---|---|
| KIBA | 229 | 2111 | 118,254 |
| Davis | 442 | 68 | 30,056 |
Figure 4. Analysis of the KIBA and Davis datasets: (a) distribution of SMILES lengths for the KIBA dataset, (b) distribution of protein sequence lengths for the KIBA dataset, (c) distribution of SMILES lengths for the Davis dataset, and (d) distribution of protein sequence lengths for the Davis dataset.
Figure 5. The proposed model's architecture.