Mariarosaria Ferraro1, Elisabetta Moroni1, Emiliano Ippoliti2,3, Silvia Rinaldi1, Carlos Sanchez-Martin4, Andrea Rasola4, Luca F Pavarino5, Giorgio Colombo1,6.
Abstract
Allosteric molecules provide a powerful means to modulate protein function. However, the effect of such ligands on distal orthosteric sites cannot be easily described by classical docking methods. Here, we applied machine learning (ML) approaches to expose the links between local dynamic patterns and different degrees of allosteric inhibition of the ATPase function in the molecular chaperone TRAP1. We focused on 11 novel allosteric modulators with similar affinities for the target but with inhibitory efficacies ranging from 26.3 to 76%. Using a set of experimentally related local descriptors, ML enabled us to connect the molecular dynamics (MD) accessible to ligand-bound (perturbed) and unbound (unperturbed) systems to the degree of ATPase allosteric inhibition. The ML analysis of the comparative perturbed ensembles revealed a redistribution of dynamic states in the inhibitor-bound versus inhibitor-free systems following allosteric binding. Linear regression models were built to quantify the percentage of experimental variance explained by the predicted inhibitor-bound TRAP1 states. Our strategy provides a comparative MD-ML framework to infer allosteric ligand functionality. By alleviating the time scale issues that prevent the routine use of MD, a combination of MD and ML represents a promising strategy to support in silico mechanistic studies and drug design.
Year: 2020 PMID: 33369425 PMCID: PMC8016192 DOI: 10.1021/acs.jpcb.0c09742
Source DB: PubMed Journal: J Phys Chem B ISSN: 1520-5207 Impact factor: 2.991
Chart 1. Chemical Structures of the 11 TRAP1 Allosteric Modulators Investigated in This Study
Training and External Test Sets Used for Comparative ML Analysesᵃ
Details on trajectories included in the training/test sets are reported in the corresponding cells. The small test set was used for external validation of both the training sets (merged row); extended trajectories of compounds (cmpds) 5–7 (red text) were “unseen” only by the original training set, so the extended training set was not validated against the large test set (X).
Figure 1. MD descriptors mapped onto 180°-rotated views of the buckled (Bu.) and the straight (Str.) TRAP1 monomers in the active asymmetric state. Two ATP molecules are bound to their pockets in the NTDs (pink) and establish the featuring salt-bridge with R417 on the ATP sensor loop (vdW spheres). Protein segments with enhanced local dynamics within the dimer are shown (green) and labeled accordingly. The two NTDs make cross-monomer interactions with the N-terminal strap of the partner monomer. S582 (vdW spheres) is shown in the SMD–CTD linker; the segment 566–572 (purple) is highlighted in its ordered (straight) and disordered (buckled) structure. For clarity, in each view, labels are reported for the front monomer only.
Figure 2. Probability distributions for the eight features in TRAP1 states A (blue) and I (red) in the original (720 ns for each state A/I) and extended (2.52 μs for each state A/I) training sets. The plots were obtained by pooling the individual feature vectors collected from 9 inhibitor-unbound replicates and 9 inhibitor-bound complexes containing the compounds (5–7) with the highest inhibitory efficacy in the allosteric site (see Table ).
Internal Cross-Validated Performances of Generative and Discriminative ML Modelsᵃ
| training set | GNB accuracy (%) | GNB TPR | GNB TNR | KNB accuracy (%) | KNB TPR | KNB TNR | GDF-SVM accuracy (%) | GDF-SVM TPR | GDF-SVM TNR |
|---|---|---|---|---|---|---|---|---|---|
| original | 73 | 0.63 | 0.83 | 78.9 | 0.77 | 0.81 | 94.9 | 0.95 | 0.95 |
| extended | 66.7 | 0.55 | 0.79 | 74 | 0.70 | 0.78 | 91.0 | 0.92 | 0.92 |
Cross-validated percentage accuracy and corresponding TPR and TNR for GNB, KNB, and GDF-SVM models (“medium” preset: σ = 2.8 and C = 1). Performance metrics were calculated summing up TP and TN predictions on the five validation subsets generated from the training data.
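The pooling described in the footnote above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `pooled_metrics` and the per-fold counts are hypothetical, and state I is assumed to be the positive class, in line with the TPR (state I) / TNR (state A) convention used in the tables.

```python
from typing import Iterable, Tuple

# One confusion tuple per validation subset: (TP, FN, TN, FP).
# Assumption: state I is the "positive" class, state A the "negative" one.
Fold = Tuple[int, int, int, int]


def pooled_metrics(folds: Iterable[Fold]) -> Tuple[float, float, float]:
    """Return (accuracy %, TPR, TNR) after summing counts over all folds."""
    tp = fn = tn = fp = 0
    for f_tp, f_fn, f_tn, f_fp in folds:
        tp += f_tp
        fn += f_fn
        tn += f_tn
        fp += f_fp
    if (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative sample")
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    tpr = tp / (tp + fn)  # fraction of true state-I frames predicted as I
    tnr = tn / (tn + fp)  # fraction of true state-A frames predicted as A
    return accuracy, tpr, tnr


# Hypothetical counts for five validation subsets (illustrative only).
example_folds = [(90, 10, 80, 20)] * 5
acc, tpr, tnr = pooled_metrics(example_folds)
```

Summing the counts before dividing weights every frame equally, which differs from averaging per-fold rates when the subsets are not the same size.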
Internal Validation Metrics Reported as TPR % and TNR % for Individual Systems in the Bound/Unbound Statesᵃ
| ML model | TRAP1 system | original training set: TPR % (state I) | original training set: TNR % (state A) | extended training set: TPR % (state I) | extended training set: TNR % (state A) |
|---|---|---|---|---|---|
| GNB | 5 | 71.4 | | 49.8 | |
| GNB | 7 | 63.0 | | 52.0 | |
| GNB | 6 | 61.2 | | 62.1 | |
| GNB | inhibitor-free (rep. 1–3) | | 69.9 | | 75.1 |
| GNB | inhibitor-free (rep. 4–6) | | 84.5 | | 80.3 |
| GNB | inhibitor-free (rep. 7–9) | | 90.0 | | 80.2 |
| KNB | 5 | 70.9 | | 78.3 | |
| KNB | 7 | 94.2 | | 63.9 | |
| KNB | 6 | 66.6 | | 68.8 | |
| KNB | inhibitor-free (rep. 1–3) | | 70.1 | | 69.5 |
| KNB | inhibitor-free (rep. 4–6) | | 89.6 | | 83.4 |
| KNB | inhibitor-free (rep. 7–9) | | 82.9 | | 78.4 |
| GDF-SVM | 5 | 92.5 | | 90.9 | |
| GDF-SVM | 7 | 97.3 | | 93.7 | |
| GDF-SVM | 6 | 95.4 | | 91.3 | |
| GDF-SVM | inhibitor-free (rep. 1–3) | | 95.7 | | 92.7 |
| GDF-SVM | inhibitor-free (rep. 4–6) | | 98.4 | | 96.7 |
| GDF-SVM | inhibitor-free (rep. 7–9) | | 92.5 | | 89.7 |
Percentages are shown over chunks of 12 000 (original training set) or 18 000 (extended training set) frames. Models trained on the entire training set were used to make predictions.
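A per-chunk percentage of this kind can be computed from per-frame state predictions, as in the hedged sketch below. The helper `chunk_state_percentages`, the string labels "A"/"I", and the toy label sequence are assumptions made for illustration. The chunk size is a parameter, so values such as 12 000 or 18 000 frames would simply be passed in.

```python
from typing import List, Sequence


def chunk_state_percentages(
    labels: Sequence[str], chunk_size: int, state: str = "I"
) -> List[float]:
    """Percentage of frames predicted as `state` in each consecutive full chunk.

    Trailing frames that do not fill a whole chunk are ignored.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    percentages = []
    for start in range(0, len(labels) - chunk_size + 1, chunk_size):
        chunk = labels[start:start + chunk_size]
        hits = sum(1 for s in chunk if s == state)
        percentages.append(100.0 * hits / chunk_size)
    return percentages


# Toy per-frame predictions for one trajectory (hypothetical labels).
toy_labels = ["I", "I", "I", "A", "A", "I", "A", "A"]
percent_I = chunk_state_percentages(toy_labels, chunk_size=4, state="I")
percent_A = chunk_state_percentages(toy_labels, chunk_size=4, state="A")
```

Under this convention, the state-I percentage of an inhibitor-bound chunk corresponds to TPR %, and the state-A percentage of an inhibitor-free chunk corresponds to TNR %.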
Performance Metrics for the External Validation of the Three ML Algorithms Trained on the Original and Extended Datasetsᵃ
| training set | test set | GNB TNR | GNB TPR | GNB r² | KNB TNR | KNB TPR | KNB r² | GDF-SVM TNR | GDF-SVM TPR | GDF-SVM r² |
|---|---|---|---|---|---|---|---|---|---|---|
| original | small | 0.68 | 0.38 | 0.50 | 0.59 | 0.42 | 0.45 | 0.69 | 0.40 | 0.11 |
| original | large | 0.67 | 0.41 | 0.64 | 0.58 | 0.44 | 0.61 | 0.68 | 0.42 | 0.53 |
| extended | small | 0.77 | 0.31 | 0.56 | 0.65 | 0.37 | 0.71 | 0.68 | 0.37 | 0.18 |
TNR (states A) and TPR (states I) are extracted from 96 000 and 102 000 MD frames for each state in the small and large test sets, respectively. The r² values obtained from linear regression analyses are also shown for each set of predictions, with the best values highlighted in red.
Figure 3. External validation of GNB, KNB, and GDF-SVM models against the small and large test sets. Predicted TPR percentages (TPR%, red dots) for the 8 ligands (a, c, d, f, g, i) or 11 ligands (b, e, h) are plotted against the observed percentage of TRAP1 inhibition. In each plot, FPR percentages (FPR%) are calculated as the percentage of states I in the same number of inhibitor-free systems. ML models validated on the original training set (a, d, g) and the extended training set (c, f, i) were used for predictions. The original training set was also tested on “unseen” trajectories of compounds 5–7 (large test set) (b, e, h). Regression lines are shown as solid gray lines, with the associated equations and r² values. Ligands are numbered as in Table . Dashed gray lines identify boundaries between A/I states: the first line from the left passes through the blue point that defines the maximum FPR% found in at least 62.5% of inhibitor-free trajectories; the second line from the left goes through the first TPR% point (red) found immediately after the first boundary and delimits a region where the percentage of predicted states I in the inhibitor-bound trajectories (TPR%) is significantly greater than the threshold of states I characterizing the inhibitor-free trajectories (FPR%). Regression models built from docking scores on the small (l) and large (m) test sets are shown for comparison.
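The regression lines and r² values mentioned in the caption above correspond to an ordinary least-squares fit of one variable on another. The sketch below assumes observed inhibition on the x axis and predicted TPR% on the y axis. The function name `linear_fit_r2` and the four data points are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def linear_fit_r2(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float]:
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable linear fit, r2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical toy values, NOT data from the study.
observed_inhibition = [20.0, 40.0, 60.0, 80.0]  # x: observed % inhibition
predicted_tpr = [30.0, 35.0, 55.0, 60.0]        # y: predicted TPR%
slope, intercept, r2 = linear_fit_r2(observed_inhibition, predicted_tpr)
```

Here r² is the fraction of the variance in the y values that the fitted line accounts for, which is how the phrase "percentage of experimental variance explained" is usually read.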
Percentage Decrease in TRAP1 ATPase Function after Treatment with the 11 Allosteric Inhibitors Investigated in This Studyᵃ
| inhibitor-bound TRAP1 | % TRAP1 inhibition |
|---|---|
| | 76.0 |
| | 75.2 |
| | 73.0 |
| | 65.9 |
| | 51.3 |
| | 50.8 |
| | 50.5 |
| | 39.6 |
| | 35.5 |
| | 27.4 |
| | 26.3 |
| inhibitor-free TRAP1 | 0.0 |
Functional assays are described in our previous publication.[27] Ligands are numbered as in Chart and are ordered by decreasing effects.