Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening.

Kathrin Heikamp, Jürgen Bajorath.

Abstract

The choice of negative training data for machine learning is a little-explored issue in chemoinformatics. In this study, the influence of alternative sets of negative training data and different background databases on support vector machine (SVM) modeling and virtual screening has been investigated. Target-directed SVM models have been derived on the basis of differently composed training sets containing confirmed inactive molecules or randomly selected database compounds as negative training instances. These models were then applied to search background databases consisting of biological screening data or randomly assembled compounds for available hits. Negative training data were found to systematically influence compound recall in virtual screening. In addition, different background databases had a strong influence on the search results. Our findings also indicated that typical benchmark settings lead to an overestimation of SVM-based virtual screening performance compared to search conditions that are more relevant for practical applications.
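The experimental design described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' protocol: the abstract does not state the descriptors, kernel, data set sizes, or ranking cutoff, so everything below is an assumption made for illustration. The fingerprints are synthetic, the bit probabilities (`P_ACTIVE`, `P_INACTIVE`, `P_RANDOM`) are invented, and the classifier is a plain linear SVM trained with a Pegasos-style stochastic subgradient method, chosen only because it needs no external library. The sketch trains one model per type of negative training data and reports the recall of hidden actives among the top-ranked compounds of each background database.

```python
import random

N_BITS = 32  # hypothetical fingerprint length (assumption)

def make_fp(rng, on_prob):
    """Draw a synthetic binary fingerprint; on_prob[i] is P(bit i = 1).
    A trailing constant 1.0 is appended to act as a bias feature."""
    return [1.0 if rng.random() < p else 0.0 for p in on_prob] + [1.0]

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    idx = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Regularization shrink, then hinge-loss step if the margin is violated.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def score(w, x):
    """Signed distance-like score; higher means 'more likely active'."""
    return sum(wj * xj for wj, xj in zip(w, x))

def recall_at_k(scores, is_active, k):
    """Fraction of all actives in the database found among the k top-scored compounds."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(is_active[i] for i in ranked[:k])
    return hits / sum(is_active)

rng = random.Random(42)

# Toy assumption: actives share bits 0-5, confirmed inactives share part of
# that pattern (bits 0-2), and random database compounds have no pattern.
P_ACTIVE = [0.9] * 6 + [0.1] * (N_BITS - 6)
P_INACTIVE = [0.9] * 3 + [0.1] * (N_BITS - 3)
P_RANDOM = [0.2] * N_BITS

train_actives = [make_fp(rng, P_ACTIVE) for _ in range(40)]
negatives = {
    "confirmed inactive": [make_fp(rng, P_INACTIVE) for _ in range(200)],
    "random": [make_fp(rng, P_RANDOM) for _ in range(200)],
}
hidden_actives = [make_fp(rng, P_ACTIVE) for _ in range(20)]
backgrounds = {
    "screening data": [make_fp(rng, P_INACTIVE) for _ in range(500)],
    "random": [make_fp(rng, P_RANDOM) for _ in range(500)],
}

results = {}
for neg_name, negs in negatives.items():
    X = train_actives + negs
    y = [1] * len(train_actives) + [-1] * len(negs)
    w = train_linear_svm(X, y)
    for bg_name, bg in backgrounds.items():
        database = hidden_actives + bg
        labels = [1] * len(hidden_actives) + [0] * len(bg)
        scores = [score(w, x) for x in database]
        results[(neg_name, bg_name)] = recall_at_k(scores, labels, k=50)

for (neg_name, bg_name), recall in results.items():
    print(f"negatives={neg_name!r:22} background={bg_name!r:16} recall@50={recall:.2f}")
```

The four printed values correspond to the four combinations of negative training data and background database compared in the study. With real data, the synthetic generators would be replaced by fingerprints computed from confirmed actives, confirmed inactives, and database compounds; the numbers produced by this toy setup say nothing about the paper's actual results.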

Year: 2013    PMID: 23799269    DOI: 10.1021/ci4002712

Source DB: PubMed    Journal: J Chem Inf Model    ISSN: 1549-9596    Impact factor: 4.956


  14 in total

1.  Nature is the best source of anti-inflammatory drugs: indexing natural products for their anti-inflammatory bioactivity.

Authors:  Miran Aswad; Mahmoud Rayan; Saleh Abu-Lafi; Mizied Falah; Jamal Raiyn; Ziyad Abdallah; Anwar Rayan
Journal:  Inflamm Res       Date:  2017-09-27       Impact factor: 4.575

2.  Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB). (Review)

Authors:  Sean Ekins; Anna Coulon Spektor; Alex M Clark; Krishna Dole; Barry A Bunin
Journal:  Drug Discov Today       Date:  2016-11-22       Impact factor: 7.851

3.  Predicting novel substrates for enzymes with minimal experimental effort with active learning.

Authors:  Dante A Pertusi; Matthew E Moura; James G Jeffryes; Siddhant Prabhu; Bradley Walters Biggs; Keith E J Tyo
Journal:  Metab Eng       Date:  2017-10-10       Impact factor: 9.783

4.  Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation.

Authors:  Sean Ekins; Joel S Freundlich; Robert C Reynolds
Journal:  J Chem Inf Model       Date:  2013-10-30       Impact factor: 4.956

5.  Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets.

Authors:  Gerard Jp van Westen; Remco F Swier; Isidro Cortes-Ciriano; Jörg K Wegner; John P Overington; Adriaan P Ijzerman; Herman Wt van Vlijmen; Andreas Bender
Journal:  J Cheminform       Date:  2013-09-24       Impact factor: 5.514

6.  Sequential application of ligand and structure based modeling approaches to index chemicals for their hH4R antagonism.

Authors:  Matteo Pappalardo; Nir Shachaf; Livia Basile; Danilo Milardi; Mouhammed Zeidan; Jamal Raiyn; Salvatore Guccione; Anwar Rayan
Journal:  PLoS One       Date:  2014-10-16       Impact factor: 3.240

7.  Nature is the best source of anticancer drugs: Indexing natural products for their anticancer bioactivity.

Authors:  Anwar Rayan; Jamal Raiyn; Mizied Falah
Journal:  PLoS One       Date:  2017-11-09       Impact factor: 3.240

8.  Influence of Varying Training Set Composition and Size on Support Vector Machine-Based Prediction of Active Compounds.

Authors:  Raquel Rodríguez-Pérez; Martin Vogt; Jürgen Bajorath
Journal:  J Chem Inf Model       Date:  2017-04-10       Impact factor: 4.956

9.  The influence of negative training set size on machine learning-based virtual screening.

Authors:  Rafał Kurczab; Sabina Smusz; Andrzej J Bojarski
Journal:  J Cheminform       Date:  2014-06-11       Impact factor: 5.514

10.  The influence of the negative-positive ratio and screening database size on the performance of machine learning-based virtual screening.

Authors:  Rafał Kurczab; Andrzej J Bojarski
Journal:  PLoS One       Date:  2017-04-06       Impact factor: 3.240
