Low-Quality Structural and Interaction Data Improves Binding Affinity Prediction via Random Forest.

Hongjian Li, Kwong-Sak Leung, Man-Hon Wong, Pedro J Ballester.

Abstract

Docking scoring functions can be used to predict the strength of protein-ligand binding. It is widely believed that training a scoring function with low-quality data is detrimental to its predictive performance. Nevertheless, there is a surprising lack of systematic validation experiments in support of this hypothesis. In this study, we investigated to what extent training a scoring function on data that includes low-quality structural and binding measurements is detrimental to predictive performance. We found that low-quality data is not only harmless but actually beneficial for the predictive performance of machine-learning scoring functions, though the improvement is smaller than that obtained from high-quality data. Furthermore, we observed that classical scoring functions are unable to effectively exploit data beyond an early threshold, regardless of its quality. This demonstrates that exploiting a larger data volume is more important for the performance of machine-learning scoring functions than restricting training to a smaller set of higher-quality data.
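
The comparison described in the abstract (train on high-quality data only versus high-quality plus low-quality data, then evaluate on held-out data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method or data. The complexes are synthetic, "quality" is modelled only as the amount of label noise, and the small random forest (bootstrap samples, one randomly chosen feature per split) is a stand-in written with the Python standard library. All function names, sample sizes, and noise levels are hypothetical, and the numbers it prints say nothing about the paper's results.

```python
import random
import statistics


def make_complexes(n, noise_sd, rng):
    """Synthetic (features, affinity) pairs; noise_sd stands in for data quality."""
    data = []
    for _ in range(n):
        x = [rng.uniform(0.0, 1.0) for _ in range(4)]
        true_affinity = 4.0 + 3.0 * x[0] + 2.0 * x[1] * x[2]  # assumed toy relation
        data.append((x, true_affinity + rng.gauss(0.0, noise_sd)))
    return data


def build_tree(rows, depth, rng, min_leaf=5):
    """Regression tree that tries one randomly chosen feature at each split."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return statistics.fmean([y for _, y in rows])  # leaf: mean affinity
    feature = rng.randrange(len(rows[0][0]))
    rows = sorted(rows, key=lambda r: r[0][feature])
    ys = [y for _, y in rows]
    n, total = len(rows), sum(ys)
    prefix, best_score, best_i = 0.0, float("-inf"), None
    for i in range(1, n):
        prefix += ys[i - 1]
        if i < min_leaf or n - i < min_leaf:
            continue
        # Maximising this quantity is equivalent to minimising squared error.
        score = prefix ** 2 / i + (total - prefix) ** 2 / (n - i)
        if score > best_score:
            best_score, best_i = score, i
    threshold = (rows[best_i - 1][0][feature] + rows[best_i][0][feature]) / 2
    left = build_tree(rows[:best_i], depth - 1, rng, min_leaf)
    right = build_tree(rows[best_i:], depth - 1, rng, min_leaf)
    return (feature, threshold, left, right)


def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def train_forest(rows, n_trees, rng, depth=6):
    """Fit each tree on a bootstrap sample of the training rows."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(build_tree(sample, depth, rng))
    return forest


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return statistics.fmean([predict_tree(tree, x) for tree in forest])


def rmse(forest, rows):
    """Root-mean-square error of the forest on (features, affinity) rows."""
    errors = [(predict_forest(forest, x) - y) ** 2 for x, y in rows]
    return statistics.fmean(errors) ** 0.5


def run_experiment(seed=0):
    """Compare training on high-quality data alone with high plus low quality."""
    rng = random.Random(seed)
    high = make_complexes(150, 0.3, rng)   # small, low-noise training set
    low = make_complexes(450, 1.5, rng)    # larger, high-noise training set
    test = make_complexes(200, 0.3, rng)   # held-out, low-noise test set
    results = {}
    for name, train in (("high_only", high), ("high_plus_low", high + low)):
        results[name] = rmse(train_forest(train, 25, rng), test)
    return results


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name}: test RMSE = {value:.3f}")
```

Varying the sizes and noise levels passed to `make_complexes` shows how the trade-off between data volume and data quality can be probed in this toy setting. A real study would use curated protein-ligand complexes and an established random forest implementation.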

Keywords: binding affinity prediction; docking; machine-learning scoring functions

Year: 2015 PMID: 26076113 PMCID: PMC6272292 DOI: 10.3390/molecules200610947

Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411

Cited

22 in total

1. Improving scoring-docking-screening powers of protein-ligand scoring functions using random forest.

Authors: Cheng Wang; Yingkai Zhang
Journal: J Comput Chem Date: 2016-11-17 Impact factor: 3.376

2. DG-GL: Differential geometry-based geometric learning of molecular datasets.

Authors: Duc Duy Nguyen; Guo-Wei Wei
Journal: Int J Numer Method Biomed Eng Date: 2019-02-07 Impact factor: 2.747

3. AGL-Score: Algebraic Graph Learning Score for Protein-Ligand Binding Scoring, Ranking, Docking, and Screening.

Authors: Duc Duy Nguyen; Guo-Wei Wei
Journal: J Chem Inf Model Date: 2019-07-01 Impact factor: 4.956

4. Using diverse potentials and scoring functions for the development of improved machine-learned models for protein-ligand affinity and docking pose prediction.

Authors: Omar N A Demerdash
Journal: J Comput Aided Mol Des Date: 2021-10-28 Impact factor: 3.686

5. A review of mathematical representations of biomolecular data. (Review)

Authors: Duc Duy Nguyen; Zixuan Cang; Guo-Wei Wei
Journal: Phys Chem Chem Phys Date: 2020-02-26 Impact factor: 3.676

6. Three-Dimensional Convolutional Neural Networks and a Cross-Docked Data Set for Structure-Based Drug Design.

Authors: Paul G Francoeur; Tomohide Masuda; Jocelyn Sunseri; Andrew Jia; Richard B Iovanisci; Ian Snyder; David R Koes
Journal: J Chem Inf Model Date: 2020-09-10 Impact factor: 4.956

7. Machine-learning scoring functions trained on complexes dissimilar to the test set already outperform classical counterparts on a blind benchmark.

Authors: Hongjian Li; Gang Lu; Kam-Heung Sze; Xianwei Su; Wai-Yee Chan; Kwong-Sak Leung
Journal: Brief Bioinform Date: 2021-11-05 Impact factor: 11.622