

Enrichment of high-throughput screening data with increasing levels of noise using support vector machines, recursive partitioning, and Laplacian-modified naive Bayesian classifiers.

Meir Glick, Jeremy L Jenkins, James H Nettles, Hamilton Hitchings, John W Davies.

Abstract

High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are used to mine the HTS data and so increase the probability of identifying novel biologically active compounds. HTS data are notoriously noisy, so the choice of data mining method is important to the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers, each trained with increasing stochastic noise in the form of false positives and false negatives. All three data mining methods tolerated increasing levels of false positives, even when the ratio of misclassified compounds to true active compounds in the training set was 5:1. False negatives at a ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichment among the four data sets. This study demonstrates that data mining methods can add real value to a screen even when the data are contaminated with a high level of stochastic noise.
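The noise experiment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`inject_label_noise`, `enrichment_factor`), the parameters, and the exact definitions of the noise ratios are assumptions made here for illustration. In particular, `fp_ratio` is read as the number of inactive compounds relabelled active per true active (5.0 for a 5:1 ratio), and `fn_fraction` as the fraction of true actives relabelled inactive (0.5 under one reading of a 1:1 ratio). The abstract does not spell out either definition.

```python
import random


def inject_label_noise(labels, fp_ratio=0.0, fn_fraction=0.0, seed=0):
    """Return a noisy copy of binary training labels (1 = active, 0 = inactive).

    Assumed definitions (not taken from the paper):
      fp_ratio    - inactives relabelled active, per true active.
      fn_fraction - fraction of true actives relabelled inactive.
    """
    rng = random.Random(seed)
    actives = [i for i, y in enumerate(labels) if y == 1]
    inactives = [i for i, y in enumerate(labels) if y == 0]

    # Cap the counts so we never ask for more compounds than exist.
    n_fp = min(len(inactives), round(fp_ratio * len(actives)))
    n_fn = min(len(actives), round(fn_fraction * len(actives)))

    noisy = list(labels)
    for i in rng.sample(inactives, n_fp):  # false positives
        noisy[i] = 1
    for i in rng.sample(actives, n_fn):    # false negatives
        noisy[i] = 0
    return noisy


def enrichment_factor(scores, true_labels, top_fraction=0.01):
    """Hit rate in the top-scoring fraction divided by the overall hit rate.

    `scores` come from any classifier; `true_labels` are the noise-free labels.
    """
    n = len(scores)
    total_actives = sum(true_labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one true active")

    n_top = max(1, int(n * top_fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(true_labels[i] for i in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)


if __name__ == "__main__":
    # Toy data: 10 actives among 1000 compounds.
    labels = [1] * 10 + [0] * 990
    noisy = inject_label_noise(labels, fp_ratio=5.0, seed=1)
    print("labelled active after noise:", sum(noisy))
```

In such a setup a classifier would be trained on the noisy labels and its ranking scored against the clean labels with `enrichment_factor(..., top_fraction=0.01)`, mirroring the "top 1%" criterion in the abstract.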


Year: 2006 PMID: 16426055 DOI: 10.1021/ci050374h

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Cited by

24 in total

1. Analysis of high-throughput screening assays using cluster enrichment.

Authors: Minya Pu; Tomoko Hayashi; Howard Cottam; Joseph Mulvaney; Michelle Arkin; Maripat Corr; Dennis Carson; Karen Messer
Journal: Stat Med Date: 2012-07-05 Impact factor: 2.373

2. Quantitative high-throughput screening: a titration-based approach that efficiently identifies biological activities in large chemical libraries.

Authors: James Inglese; Douglas S Auld; Ajit Jadhav; Ronald L Johnson; Anton Simeonov; Adam Yasgar; Wei Zheng; Christopher P Austin
Journal: Proc Natl Acad Sci U S A Date: 2006-07-24 Impact factor: 11.205

3. Evaluation of machine-learning methods for ligand-based virtual screening. (Review)

Authors: Beining Chen; Robert F Harrison; George Papadatos; Peter Willett; David J Wood; Xiao Qing Lewell; Paulette Greenidge; Nikolaus Stiefl
Journal: J Comput Aided Mol Des Date: 2007-01-05 Impact factor: 3.686

4. Enhancement of chemical rules for predicting compound reactivity towards protein thiol groups.

Authors: James T Metz; Jeffrey R Huth; Philip J Hajduk
Journal: J Comput Aided Mol Des Date: 2007-03-06 Impact factor: 3.686

5. Has discovery-based cancer research been a bust?

Authors: R J Epstein
Journal: Clin Transl Oncol Date: 2013-09-04 Impact factor: 3.405

6. Predicting DPP-IV inhibitors with machine learning approaches.

Authors: Jie Cai; Chanjuan Li; Zhihong Liu; Jiewen Du; Jiming Ye; Qiong Gu; Jun Xu
Journal: J Comput Aided Mol Des Date: 2017-02-02 Impact factor: 3.686

7. Enhancing the rate of scaffold discovery with diversity-oriented prioritization.

Authors: S Joshua Swamidass; Bradley T Calhoun; Joshua A Bittker; Nicole E Bodycombe; Paul A Clemons
Journal: Bioinformatics Date: 2011-06-17 Impact factor: 6.937

8. Linking High-Throughput Screens to Identify MoAs and Novel Inhibitors of Mycobacterium tuberculosis Dihydrofolate Reductase.

Authors: John P Santa Maria; Yumi Park; Lihu Yang; Nicholas Murgolo; Michael D Altman; Paul Zuck; Greg Adam; Chad Chamberlin; Peter Saradjian; Peter Dandliker; Helena I M Boshoff; Clifton E Barry; Charles Garlisi; David B Olsen; Katherine Young; Meir Glick; Elliott Nickbarg; Peter S Kutchukian
Journal: ACS Chem Biol Date: 2017-08-29 Impact factor: 5.100

9. Integrated in silico approaches for the prediction of Ames test mutagenicity.

Authors: Sandeep Modi; Jin Li; Sophie Malcomber; Claire Moore; Andrew Scott; Andrew White; Paul Carmichael
Journal: J Comput Aided Mol Des Date: 2012-08-24 Impact factor: 3.686

10. Kinome-wide activity modeling from diverse public high-quality data sets.

Authors: Stephan C Schürer; Steven M Muskal
Journal: J Chem Inf Model Date: 2013-01-09 Impact factor: 4.956