| Literature DB >> 36014351 |
Mohit Pandey, Mariia Radaeva, Hazem Mslati, Olivia Garland, Michael Fernandez, Martin Ester, Artem Cherkasov.
Abstract
Computational prediction of ligand–target interactions is a crucial part of modern drug discovery, as it helps to bypass the high costs and labor demands of in vitro and in vivo screening. As the wealth of bioactivity data accumulates, it provides opportunities for the development of deep learning (DL) models with increasing predictive power. Conventionally, such models were limited either to very simplified representations of proteins or to ineffective voxelizations of their 3D structures. Herein, we present the development of the PSG-BAR (Protein Structure Graph-Binding Affinity Regression) approach, which utilizes 3D structural information of the proteins along with 2D graph representations of the ligands. The method also introduces attention scores to selectively weight the protein regions that are most important for ligand binding.
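The protein structure graph at the heart of the approach can be illustrated with a common contact-graph construction over residue coordinates. This is a sketch under assumptions: the cutoff value and graph-building details below are illustrative, not taken from the paper.

```python
import numpy as np

def residue_contact_graph(ca_coords, cutoff=8.0):
    """Build an undirected residue graph from C-alpha coordinates.

    Residues whose C-alpha atoms lie within `cutoff` angstroms are
    connected by an edge (a common contact-graph construction; the
    cutoff here is illustrative). Returns a list of (i, j) edges, i < j.
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i, j] <= cutoff]

# Toy 4-residue chain spaced 3.8 A apart along the x axis.
edges = residue_contact_graph([[0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [11.4, 0, 0]])
```

With an 8 A cutoff, each residue connects to its neighbors up to two positions away in this toy chain, yielding five edges.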
Keywords: SARS-CoV-2; computer-aided drug discovery; deep learning; drug–target interaction; graph attention network; protein–ligand binding; virtual screening
Year: 2022 PMID: 36014351 PMCID: PMC9416537 DOI: 10.3390/molecules27165114
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.927
Data statistics for the datasets used to evaluate PSG-BAR.
| Dataset | Unique Targets | Unavailable | Unique Ligands | Unique Interactions |
|---|---|---|---|---|
| PDBBind | 9619 | 0 | 7981 | 9777 |
| KIBA | 467 | 181 | 2356 | 124,374 |
| DAVIS | 442 | 139 | 68 | 20,604 |
| BindingDB | 1038 | 100 | 13,222 | 41,142 |
| AID 1706 | 1 | 0 | 290,765 | 290,765 |
Figure 1. PSG-BAR architecture.
Figure 2. Virtual edge set used to calculate cross-attention between protein graph nodes and the learned representation of the drug.
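The virtual-edge cross-attention of Figure 2 can be sketched as scaled dot-product attention between residue node embeddings and a pooled drug vector. The scoring function below is an illustrative parameterization, not necessarily the paper's exact one.

```python
import numpy as np

def cross_attention(protein_nodes, ligand_vec):
    """Attend over protein graph nodes with a pooled ligand vector.

    Every residue node is linked to the drug representation by a
    virtual edge; a softmax over scaled dot-product scores weights
    residues by their relevance to the ligand. (Illustrative scoring,
    not the paper's exact parameterization.)
    """
    P = np.asarray(protein_nodes, dtype=float)    # (n_residues, d)
    q = np.asarray(ligand_vec, dtype=float)       # (d,)
    scores = P @ q / np.sqrt(P.shape[1])          # scaled dot-product
    scores = scores - scores.max()                # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()  # softmax over residues
    context = attn @ P                            # attention-pooled protein vector
    return attn, context

attn, ctx = cross_attention([[1., 0.], [0., 1.], [1., 1.]], [1., 0.])
```

Residues whose embeddings align with the ligand query receive higher attention weights, which is what makes the per-residue scores interpretable (see Figure 5).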
Comparison of PSG-BAR with other state-of-the-art methods on popular benchmark datasets. For each of the four datasets, we compare PSG-BAR against the best performance reported for that dataset in our literature survey. For brevity, RMSE values were converted to MSE when MSE was not directly reported by the authors. Since most works report performance in the warm setting, we present our benchmarking results in the same setting. For GEFA and DeepPurpose, we reproduced the respective implementations and report results on the same subset of the dataset as PSG-BAR; for the other methods, we report the best performance metric published in the literature.
| Dataset | Architecture | MSE (↓) | Pearson (↑) |
|---|---|---|---|
| DAVIS | GCNConvNet | 0.284 | 0.804 |
| | GINConvNet | 0.257 | 0.824 |
| | DGraphDTA | 0.241 | 0.837 |
| | GEFA | - | 0.846 |
| | PSG-BAR | 0.237 | 0.856 |
| KIBA | KronRLS | 0.261 | 0.752 |
| | GANsDTA | 0.387 | 0.662 |
| | SimCNN-DTA | 0.257 | 0.757 |
| | SimBoost | 0.204 | - |
| | PSG-BAR | 0.200 | 0.850 |
| | PSG-BAR w/AF | - | - |
| BindingDB | DeepAffinity | 1.212 | 0.700 |
| | DeepPurpose | 0.765 | 0.836 |
| | PSG-BAR | 0.651 | 0.864 |
| PDBBind | GAT | 3.115 | 0.601 |
| | SGCN | 2.505 | 0.686 |
| | SIGN | 1.731 | 0.797 |
| | KDeep | - | - |
| | PSG-BAR | 1.660 | 0.762 |
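The MSE and Pearson correlation reported in the benchmark tables are standard regression metrics; minimal reference implementations follow. Note that consolidating a reported RMSE into MSE amounts to squaring it.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error (lower is better); MSE = RMSE ** 2."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(((y_true - y_pred) ** 2).mean())

def pearson(y_true, y_pred):
    """Pearson correlation coefficient (higher is better)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    yt, yp = y_true - y_true.mean(), y_pred - y_pred.mean()
    return float((yt * yp).sum() / np.sqrt((yt ** 2).sum() * (yp ** 2).sum()))
```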
PSG-BAR performance on various dataset stratification schemes.
| Dataset | Split | MSE (↓) | Pearson (↑) |
|---|---|---|---|
| DAVIS | Warm | 0.237 | 0.856 |
| | Cold Drug | 0.902 | 0.456 |
| | Cold Protein | 0.436 | 0.612 |
| | Cold Protein–Ligand | 0.910 | 0.357 |
| KIBA | Warm | 0.200 | 0.850 |
| | Cold Drug | 0.362 | 0.601 |
| | Cold Protein | 0.298 | 0.756 |
| | Cold Protein–Ligand | 0.415 | 0.360 |
| BindingDB | Warm | 0.651 | 0.864 |
| | Cold Drug | 1.353 | 0.720 |
| | Cold Protein | 1.811 | 0.540 |
| | Cold Protein–Ligand | 2.102 | 0.515 |
| PDBBind | Warm | 1.660 | 0.762 |
| | Cold Drug | 1.895 | 0.694 |
| | Cold Protein | 2.011 | 0.602 |
| | Cold Protein–Ligand | 2.100 | 0.599 |
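The four stratification schemes above can be sketched as follows: a warm split randomizes over interaction pairs, while the cold splits hold out entire drugs, proteins, or both, so the test set contains unseen entities. This is a hypothetical helper for illustration, not the paper's splitting code.

```python
import random

def stratified_split(pairs, mode="warm", test_frac=0.2, seed=0):
    """Split (drug, protein) pairs under warm / cold_drug / cold_protein /
    cold_pair schemes. In cold splits, every pair involving a held-out
    drug (and/or protein) goes to the test set, so test entities are
    never seen during training."""
    rng = random.Random(seed)
    if mode == "warm":
        shuffled = pairs[:]
        rng.shuffle(shuffled)
        k = int(len(shuffled) * test_frac)
        return shuffled[k:], shuffled[:k]
    drugs = sorted({d for d, _ in pairs})
    prots = sorted({p for _, p in pairs})
    test_drugs = set(rng.sample(drugs, max(1, int(len(drugs) * test_frac))))
    test_prots = set(rng.sample(prots, max(1, int(len(prots) * test_frac))))
    if mode == "cold_drug":
        test = [x for x in pairs if x[0] in test_drugs]
    elif mode == "cold_protein":
        test = [x for x in pairs if x[1] in test_prots]
    else:  # cold_pair: both the drug and the protein are unseen
        test = [x for x in pairs if x[0] in test_drugs and x[1] in test_prots]
        train = [x for x in pairs
                 if x[0] not in test_drugs and x[1] not in test_prots]
        return train, test
    return [x for x in pairs if x not in test], test

pairs = [(d, p) for d in "ABCD" for p in "WXYZ"]
train, test = stratified_split(pairs, "cold_drug")
```

The steadily worsening metrics from warm to cold protein–ligand splits in the table reflect exactly this increasing strictness of the held-out sets.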
Figure 3. PSG-BAR predictions for SARS-CoV inhibitors. (A) ROC plot. (B) Confusion matrix. 0: inactives; 1: actives.
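The ROC plot in Figure 3A summarizes how well the scores rank actives above inactives; its area can be computed directly from raw scores via the Mann-Whitney statistic, without thresholding.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen active (label 1) scores higher than a randomly
    chosen inactive (label 0), with ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```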
Figure 4. Histogram of PSG-BAR scores for docked compounds, suggesting that our method assigns higher scores to docking hits and lower scores to poor binders. The right distribution shows PSG-BAR scores for the top-ranking hits of the docking screen against the MPro target. The left distribution shows PSG-BAR scores for 20,000 randomly chosen compounds from the docking screen that were not considered hits; on average, these compounds had a lower PSG-BAR score (−30) than the hit compounds (−15). PSG-BAR scores are unnormalized logits of our trained deep neural network.
Figure 5. Interaction attention scores predict residues on the protein surface. For the six most frequent protein families in the PDBBind dataset, the 10 residues scored highest by PSG-BAR have a mean solvent-accessible area (SAA) higher than the mean SAA of the entire protein.
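The comparison in Figure 5 amounts to checking whether the mean SAA of the highest-attention residues exceeds the protein-wide mean. A minimal sketch with a hypothetical helper name:

```python
import numpy as np

def top_k_saa_ratio(attn_scores, saa, k=10):
    """Ratio of the mean solvent-accessible area (SAA) of the k
    highest-attention residues to the protein-wide mean SAA.
    A ratio > 1 means attention concentrates on surface-exposed
    residues, as reported for the PDBBind protein families."""
    attn = np.asarray(attn_scores, float)
    saa = np.asarray(saa, float)
    top = np.argsort(attn)[-k:]   # indices of the k highest-scored residues
    return float(saa[top].mean() / saa.mean())

ratio = top_k_saa_ratio([0.9, 0.8, 0.1, 0.1], [100, 80, 10, 10], k=2)
```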
Figure 6. Evaluation of off-target effects of MPro hits on human proteins using PSG-BAR. (A) ROC-AUC on the ADR dataset. (B) Histogram of average scores across 35 ADR-related proteins for 19 predicted MPro binders; the average score is a proxy for compound promiscuity. (C) Predicted likelihood of binding to ADR proteins for the most promiscuous molecule (red) and the least promiscuous molecule (green). Lower values (closer to the center) indicate a higher likelihood of binding.
Figure 7. Effect of skip connections on model performance on the PDBBind dataset. (A) Skip connections on stacked GAT layers compared to GAT models of the same complexity without skip connections. (B) Skip connection vs. no skip connection on a 3-layer GAT: the early success of the no-skip variant is overtaken by the skip-connection variant as training epochs increase.
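The skip-connection ablation of Figure 7 compares stacks of the form h = h + layer(h) against plain h = layer(h). A toy sketch with stand-in layers (not real GAT layers) shows why the residual form preserves early features in deep stacks:

```python
import numpy as np

def gat_stack(h, layers, skip=True):
    """Apply a stack of message-passing layers with optional skip
    (residual) connections. Each `layer` maps node features to node
    features; with skip=True the input is added back, so deep stacks
    do not wash out the original node features."""
    for layer in layers:
        out = layer(h)
        h = h + out if skip else out
    return h

h = np.ones((2, 2))
# Degenerate layers that output zeros: the skip variant still carries
# the input through, while the no-skip variant loses it entirely.
layers = [lambda x: 0 * x] * 3
```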
Protein surface features improve PSG-BAR predictions across all four benchmarked datasets.
| Dataset | MSE (With Surface) | Pearson (With Surface) | MSE (Without Surface) | Pearson (Without Surface) |
|---|---|---|---|---|
| BindingDB | 0.651 | 0.864 | 0.678 | 0.851 |
| PDBBind | 1.660 | 0.762 | 1.744 | 0.749 |
| KIBA | 0.200 | 0.850 | 0.209 | 0.837 |
| DAVIS | 0.237 | 0.856 | 0.249 | 0.845 |
Figure 8. (A) Linear correlation of experimental vs. predicted pKd values on the KIBA dataset. (B) Interval-wise predictive performance on the KIBA dataset: the most common interval, 10–12, has the highest performance, while the extremes on both sides perform very poorly.
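The interval-wise analysis of Figure 8B bins test pairs by their experimental affinity and scores each bin separately, exposing where the model underperforms. The bin edges below are illustrative, not the paper's exact intervals.

```python
import numpy as np

def interval_performance(y_true, y_pred, edges=(8, 10, 12, 14)):
    """Per-interval MSE: bin samples by the experimental value and
    compute MSE within each bin. Sparse extreme bins typically show
    worse performance than well-populated middle bins."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    bins = np.digitize(y_true, edges)
    return {int(b): float(((y_true[bins == b] - y_pred[bins == b]) ** 2).mean())
            for b in np.unique(bins)}

res = interval_performance([9, 11, 11, 13], [9, 11, 12, 15])
```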