Shengchao Liu, Moayad Alnammi, Spencer S. Ericksen, Andrew F. Voter, Gene E. Ananiev, James L. Keck, F. Michael Hoffmann, Scott A. Wildman, Anthony Gitter.
Abstract
Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the data set and evaluation strategy. We consider a wide range of ligand-based machine learning and docking-based approaches for virtual screening on two protein-protein interactions, PriA-SSB and RMI-FANCM, and present a strategy for choosing which algorithm is best for prospective compound prioritization. Our workflow identifies a random forest as the best algorithm for these targets over more sophisticated neural network-based models. The top 250 predictions from our selected random forest recover 37 of the 54 active compounds from a library of 22,434 new molecules assayed on PriA-SSB. We show that virtual screening methods that perform well on public data sets and synthetic benchmarks, like multi-task neural networks, may not always translate to prospective screening performance on a specific assay of interest.
Year: 2018 PMID: 30500183 PMCID: PMC6351977 DOI: 10.1021/acs.jcim.8b00363
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956
Summary Statistics for the Four Binary Data Sets
| Stage | Data set | % inhibition threshold | # actives | # inactives |
|---|---|---|---|---|
| Cross-validation | PriA-SSB AS | ≥35% | 79 | 72,344 |
| Cross-validation | PriA-SSB FP | ≥30% | 24 | 72,399 |
| Cross-validation | RMI-FANCM FP | ≥ mean + 2 SD | 230 | 49,566 |
| Prospective | PriA-SSB prospective | ≥35% | 54 | 22,380 |
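The binary labels above come from thresholding the continuous % inhibition readout, either at a fixed cutoff (e.g., ≥35% for PriA-SSB AS) or at a data-driven cutoff of mean + 2 SD (RMI-FANCM FP). A minimal sketch of both rules, with function names of my own choosing:

```python
import numpy as np

def binarize_inhibition(values, threshold=35.0):
    """Label a compound active (1) when its % inhibition meets the
    fixed threshold, e.g. >=35% for PriA-SSB AS."""
    return (np.asarray(values, dtype=float) >= threshold).astype(int)

def mean_plus_2sd_threshold(values):
    """Data-driven threshold of the kind used for RMI-FANCM FP:
    mean of the assay readout plus two standard deviations."""
    v = np.asarray(values, dtype=float)
    return v.mean() + 2.0 * v.std()
```

The mean + 2 SD rule makes the active/inactive boundary adapt to each assay's noise level rather than relying on a hand-picked percentage.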
Summary of Virtual Screening Methods and Which Labels Each Model Used during Training
| Model | Continuous % inhibition | Binary label | PCBA binary labels |
|---|---|---|---|
| Dock | | | |
| CD | | | |
| STNN-C | | √ | |
| STNN-R | √ | | |
| MTNN-C | | √ | √ |
| LSTM | | √ | |
| IRV | | √ | |
| RF | | √ | |
| Similarity baseline | | √ | |
The docking and consensus docking models do not train on the PriA-SSB or RMI-FANCM data sets.
Figure 1. Neural network structures. The neural networks map the input features (e.g., fingerprints) in the input (bottom) layer to intermediate chemical representations in the hidden (middle) layers and finally to the output (top) layer, which makes either continuous or binary predictions. Panel (a) has only one unit in the output layer. Panel (b) has multiple units in the output layer representing different targets, one for our new target of interest and the others for PCBA targets.
Figure 2. Initially, 258 neural network and random forest models were evaluated to eliminate poorly performing hyperparameter combinations. The models with the best hyperparameters advanced to cross-validation along with IRV and docking-based methods for a total of 35 models. Cross-validation identified a random forest as the best overall model. The VS methods and similarity baseline then predicted active compounds in the PriA-SSB prospective data set. After the predictions were finalized, we experimentally screened the compounds to evaluate the predictions. Black text denotes ligand-based machine learning models. Red text denotes docking-based models, which did not train on the target-specific HTS data.
Figure 3. Evaluation metric distributions on PriA-SSB AS over the cross-validation folds. The metrics are (a) AUC[ROC], (b) AUC[PR], (c) AUC[BEDROC], and (d) NEF1% as described in Section . Unlike the ligand-based models, the docking methods do not train on the PriA-SSB AS training folds and are applied directly to the test fold during cross-validation (see Section 4).
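NEF1% is the enrichment factor in the top 1% of ranked compounds, normalized by the best enrichment achievable at that fraction, so it lies in [0, 1]. A sketch of both quantities, assuming NumPy and hypothetical function names:

```python
import numpy as np

def enrichment_factor(y_true, y_score, fraction=0.01):
    """EF at a screening fraction: the hit rate among the top-ranked
    fraction of compounds divided by the overall hit rate."""
    y_true = np.asarray(y_true)
    n = len(y_true)
    n_top = max(1, int(np.ceil(fraction * n)))
    order = np.argsort(y_score)[::-1]            # rank by predicted score, best first
    hits_top = y_true[order[:n_top]].sum()
    overall_rate = y_true.sum() / n
    return (hits_top / n_top) / overall_rate

def normalized_ef(y_true, y_score, fraction=0.01):
    """NEF: EF divided by the maximum EF possible at this fraction,
    i.e. the EF of a ranking that puts all actives first."""
    y_true = np.asarray(y_true)
    n = len(y_true)
    n_top = max(1, int(np.ceil(fraction * n)))
    n_actives = int(y_true.sum())
    max_hits = min(n_top, n_actives)
    max_ef = (max_hits / n_top) / (n_actives / n)
    return enrichment_factor(y_true, y_score, fraction) / max_ef
```

A perfect ranking gives NEF1% = 1.0 regardless of how rare the actives are, which makes the metric comparable across data sets with very different hit rates.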
Top-Ranked Models by Mean versus DTK+Mean on the Three Tasks. Evaluation metric means were computed over all cross-validation folds.
| Metric | Best by Mean: PriA-SSB AS | Best by Mean: PriA-SSB FP | Best by Mean: RMI-FANCM FP | Best by DTK+Mean: PriA-SSB AS | Best by DTK+Mean: PriA-SSB FP | Best by DTK+Mean: RMI-FANCM FP |
|---|---|---|---|---|---|---|
| AUC[ROC] | RF_d | STNN-R_a | RF_h | RF_d | STNN-R_a | RF_h |
| AUC[BEDROC] | RF_h | STNN-R_b | RF_h | RF_h | STNN-R_b | RF_h |
| AUC[PR] | RF_g | STNN-R_a | RF_h | STNN-C_b | STNN-R_b | STNN-C_b |
| AUC[NEF] | RF_h | STNN-R_b | RF_h | RF_h | STNN-R_b | RF_h |
| NEF1% | RF_h | STNN-R_b | RF_h | RF_h | STNN-R_b | RF_h |
The prospective screening was only performed on PriA-SSB. Model names are mapped to their hyperparameter values in Part E of the Supporting Information.
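The "best by mean" column can be reproduced by averaging each model's fold-level metric and taking the maximum; DTK+Mean additionally uses a multiple-comparison test before falling back to the mean. A sketch of the mean-based step only, with made-up fold scores purely for illustration:

```python
import numpy as np

# Hypothetical per-fold NEF1% scores (model -> one value per
# cross-validation fold); real values live in the paper's results.
scores = {
    "RF_h":     [0.42, 0.38, 0.45],
    "STNN-R_b": [0.40, 0.36, 0.41],
    "Dock":     [0.10, 0.12, 0.09],
}

def best_by_mean(fold_scores):
    """Select the model with the highest mean metric across folds."""
    return max(fold_scores, key=lambda m: np.mean(fold_scores[m]))
```

Averaging over folds rather than picking the single best fold guards against a model that was lucky on one split of the data.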
Number of Active Compounds in the Top 250 Predictions from the Seven Selected Models and the Chemical Similarity Baseline Compared to the Number of Experimentally Identified Actives
| Model | Actives | Actives not in baseline | SIM clusters | MCS clusters |
|---|---|---|---|---|
| Experimental | 54 | – | 27 | 35 |
| Similarity baseline | 31 | – | 14 | 17 |
| CD_efr1_opt | 0 | 0 | 0 | 0 |
| STNN-C_a | 21 | 2 | 11 | 13 |
| STNN-R_b | 28 | 8 | 14 | 18 |
| LSTM_b | 1 | 1 | 1 | 1 |
| MTNN-C_b | 27 | 3 | 13 | 17 |
| RF_h | 37 | 7 | 17 | 22 |
| IRV_d | 29 | 4 | 15 | 18 |
These selected models are the best in each algorithm category from cross-validation. The last two columns correspond to the number of distinct chemical clusters from similarity or maximum common substructure clustering that are represented among the 54 actives. The consensus docking model CD_efr1_opt ranks the PriA-SSB prospective compounds without using information from the PriA-SSB AS training data. Prospective performance for all VS models is in Part L of the Supporting Information.
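The "Actives" and "Actives not in baseline" columns amount to set operations on each model's top-250 list. A sketch with hypothetical compound IDs and scores:

```python
import numpy as np

def top_k_ids(ids, scores, k=250):
    """Return the IDs of the k highest-scoring compounds
    (the prospective evaluation used k = 250)."""
    order = np.argsort(scores)[::-1][:k]
    return {ids[i] for i in order}

def recovered_actives(ids, scores, active_ids, k=250):
    """Experimentally confirmed actives among a model's top k."""
    return top_k_ids(ids, scores, k) & set(active_ids)
```

"Actives not in baseline" is then `recovered_actives(...) - baseline_hits`, where `baseline_hits` is the same set computed for the chemical similarity baseline.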
Figure 4. UpSet plot showing the overlap between the top 250 predictions from the selected VS models and the chemical similarity baseline on PriA-SSB prospective. The plot generalizes a Venn diagram by indicating the overlapping sets with dots on the bottom and the size of the overlaps with the bar graph.[58] Altogether, the combined predictions from the best-in-class VS methods and the baseline found 43 of the 54 actives.