Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

Abstract

With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With huge data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e., by defining protein-ligand interaction descriptors and analyzing them with modern machine-learning methods. As with any data-modeling approach, the model must be validated carefully. Here, we show that the measured performance of a relatively simple scoring function differs greatly, in R (0.77 vs 0.46) or R² (0.59 vs 0.21), depending on whether it is validated against the PDBbind core set or in a leave-cluster-out cross-validation. If proteins from the same family are present in both the training and validation sets, the prediction quality estimated by standard validation techniques is too optimistic.
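
The splitting idea behind leave-cluster-out cross-validation can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say how complexes are assigned to clusters, so the cluster labels below are hypothetical, and the helper name `leave_cluster_out_splits` is invented for this sketch. Each round holds out every complex from one cluster and trains on all the others, so no protein family contributes to both sides of a split.

```python
from collections import defaultdict


def leave_cluster_out_splits(cluster_ids):
    """Yield (cluster, train_indices, test_indices), holding out one cluster per round.

    cluster_ids[i] is the cluster label of complex i. How the labels are
    obtained (e.g. grouping by protein family) is an assumption of this
    sketch, not something taken from the abstract.
    """
    by_cluster = defaultdict(list)
    for idx, cluster in enumerate(cluster_ids):
        by_cluster[cluster].append(idx)

    # Visit clusters in sorted order so the splits are reproducible.
    for cluster in sorted(by_cluster):
        test = by_cluster[cluster]
        held_out = set(test)
        train = [i for i in range(len(cluster_ids)) if i not in held_out]
        yield cluster, train, test


# Hypothetical cluster labels for six complexes.
clusters = ["kinase", "protease", "kinase", "lectin", "protease", "kinase"]
splits = list(leave_cluster_out_splits(clusters))

for cluster, train, test in splits:
    # The held-out cluster never appears in the training indices.
    assert not {clusters[i] for i in train} & {clusters[i] for i in test}
    print(cluster, "train:", train, "test:", test)
```

By contrast, a fixed test set or a random split can place members of the same family on both sides, which is the situation the abstract identifies as producing overly optimistic estimates.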

Year: 2010 PMID: 20936880 DOI: 10.1021/ci100264e

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Cited by

18 in total

1. Visualizing convolutional neural network protein-ligand scoring.

Authors: Joshua Hochuli; Alec Helbling; Tamar Skaist; Matthew Ragoza; David Ryan Koes
Journal: J Mol Graph Model Date: 2018-06-18 Impact factor: 2.518

2. Protein-Ligand Scoring with Convolutional Neural Networks.

Authors: Matthew Ragoza; Joshua Hochuli; Elisa Idrobo; Jocelyn Sunseri; David Ryan Koes
Journal: J Chem Inf Model Date: 2017-04-11 Impact factor: 4.956

3. A D3R prospective evaluation of machine learning for protein-ligand scoring.

Authors: Jocelyn Sunseri; Matthew Ragoza; Jasmine Collins; David Ryan Koes
Journal: J Comput Aided Mol Des Date: 2016-09-03 Impact factor: 3.686

4. AGL-Score: Algebraic Graph Learning Score for Protein-Ligand Binding Scoring, Ranking, Docking, and Screening.

Authors: Duc Duy Nguyen; Guo-Wei Wei
Journal: J Chem Inf Model Date: 2019-07-01 Impact factor: 4.956

5. Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review.

Authors: Rocco Meli; Garrett M Morris; Philip C Biggin
Journal: Front Bioinform Date: 2022-06-17

6. Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise.

Authors: David Ryan Koes; Matthew P Baumgartner; Carlos J Camacho
Journal: J Chem Inf Model Date: 2013-02-12 Impact factor: 4.956

7. Target-Specific Prediction of Ligand Affinity with Structure-Based Interaction Fingerprints.

Authors: Florian Leidner; Nese Kurt Yilmaz; Celia A Schiffer
Journal: J Chem Inf Model Date: 2019-08-19 Impact factor: 4.956

8. Three-Dimensional Convolutional Neural Networks and a Cross-Docked Data Set for Structure-Based Drug Design.

Authors: Paul G Francoeur; Tomohide Masuda; Jocelyn Sunseri; Andrew Jia; Richard B Iovanisci; Ian Snyder; David R Koes
Journal: J Chem Inf Model Date: 2020-09-10 Impact factor: 4.956

9. Machine-learning scoring functions trained on complexes dissimilar to the test set already outperform classical counterparts on a blind benchmark.

Authors: Hongjian Li; Gang Lu; Kam-Heung Sze; Xianwei Su; Wai-Yee Chan; Kwong-Sak Leung
Journal: Brief Bioinform Date: 2021-11-05 Impact factor: 11.622

10. One Size Does Not Fit All: The Limits of Structure-Based Models in Drug Discovery.

Authors: Gregory A Ross; Garrett M Morris; Philip C Biggin
Journal: J Chem Theory Comput Date: 2013-08-05 Impact factor: 6.006