Support vector inductive logic programming outperforms the naive Bayes classifier and inductive logic programming for the classification of bioactive chemical compounds.

Edward O Cannon, Ata Amini, Andreas Bender, Michael J E Sternberg, Stephen H Muggleton, Robert C Glen, John B O Mitchell.

Abstract

We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently, whereas ILP combines structural fragments to create new features with higher predictive power. SVILP is a recently presented method that adds a support vector machine after the usual ILP procedures. The performance of the methods is evaluated with several statistical measures: recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is the superior method for seven of the 11 classes. The Bayes Classifier gives the best recall for eight of the 11 targets, but has much lower precision, specificity and F-measure. The SVILP model, on the other hand, has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of SVILP's superiority, we employ McNemar's test, which shows that SVILP performs significantly better (p < 0.05) than both other methods for six of the 11 activity classes, and is superior with lower significance for three of the remaining classes. Although the Bayes Classifier was previously shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
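The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the example counts, and the enrichment factor definition (precision divided by the fraction of actives in the dataset) are assumptions made for illustration, and McNemar's test is shown in its common chi-square form with continuity correction, which may differ from the variant used in the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Binary classification measures from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one active, one
    inactive, and one positive prediction).
    """
    total = tp + fp + tn + fn
    recall = tp / (tp + fn)          # fraction of actives retrieved
    specificity = tn / (tn + fp)     # fraction of inactives rejected
    precision = tp / (tp + fp)       # fraction of predicted actives that are active
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Assumed definition: hit rate among predicted actives relative to
    # the hit rate expected from random selection.
    enrichment = precision / ((tp + fn) / total)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
        "enrichment_factor": enrichment,
    }


def mcnemar_test(b, c):
    """McNemar's chi-square test with continuity correction.

    b: cases classifier A labels correctly and classifier B does not.
    c: cases classifier B labels correctly and classifier A does not.
    Returns (statistic, p_value) for one degree of freedom.
    """
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 d.o.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(classification_metrics(tp=80, fp=20, tn=880, fn=20))
    print(mcnemar_test(b=25, c=10))
```

Only the discordant pairs (b and c) enter McNemar's test, so two classifiers that make the same errors on the same compounds are not distinguished by it, whatever their overall accuracy.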

Year:  2007        PMID: 17387437     DOI: 10.1007/s10822-007-9113-3

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   4.179


  21 in total

1.  Virtual Screening for Bioactive Molecules by Evolutionary De Novo Design.

Authors: 
Journal:  Angew Chem Int Ed Engl       Date:  2000-11-17       Impact factor: 15.336

Review 2.  3-D pharmacophores in drug discovery.

Authors:  J S Mason; A C Good; E J Martin
Journal:  Curr Pharm Des       Date:  2001-05       Impact factor: 3.116

3.  Applications of rule-induction in the derivation of quantitative structure-activity relationships.

Authors:  M A-Razzak; R C Glen
Journal:  J Comput Aided Mol Des       Date:  1992-08       Impact factor: 3.686

4.  Drug design by machine learning: the use of inductive logic programming to model the structure-activity relationships of trimethoprim analogues binding to dihydrofolate reductase.

Authors:  R D King; S Muggleton; R A Lewis; M J Sternberg
Journal:  Proc Natl Acad Sci U S A       Date:  1992-12-01       Impact factor: 11.205

Review 5.  Comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures.

Authors:  Jérôme Hert; Peter Willett; David J Wilton; Pierre Acklin; Kamal Azzaoui; Edgar Jacoby; Ansgar Schuffenhauer
Journal:  J Chem Inf Comput Sci       Date:  2004 May-Jun

6.  A comparative study on feature selection methods for drug discovery.

Authors:  Ying Liu
Journal:  J Chem Inf Comput Sci       Date:  2004 Sep-Oct

7.  Representation of molecular structure using quantum topology with inductive logic programming in structure-activity relationships.

Authors:  Bård Buttingsrud; Einar Ryeng; Ross D King; Bjørn K Alsberg
Journal:  J Comput Aided Mol Des       Date:  2006-10-13       Impact factor: 3.686

8.  Circular fingerprints: flexible molecular descriptors with applications from physical chemistry to ADME.

Authors:  Robert C Glen; Andreas Bender; Catrin H Arnby; Lars Carlsson; Scott Boyer; James Smith
Journal:  IDrugs       Date:  2006-03

9.  Neighborhood behavior: a useful concept for validation of "molecular diversity" descriptors.

Authors:  D E Patterson; R D Cramer; A M Ferguson; R D Clark; L E Weinberger
Journal:  J Med Chem       Date:  1996-08-02       Impact factor: 7.446

10.  Chemoinformatics-based classification of prohibited substances employed for doping in sport.

Authors:  Edward O Cannon; Andreas Bender; David S Palmer; John B O Mitchell
Journal:  J Chem Inf Model       Date:  2006 Nov-Dec       Impact factor: 4.956

  8 in total

1.  Quantitative comparison of catalytic mechanisms and overall reactions in convergently evolved enzymes: implications for classification of enzyme function.

Authors:  Daniel E Almonacid; Emmanuel R Yera; John B O Mitchell; Patricia C Babbitt
Journal:  PLoS Comput Biol       Date:  2010-03-12       Impact factor: 4.475

2.  Discovering rules for protein-ligand specificity using support vector inductive logic programming.

Authors:  Lawrence A Kelley; Paul J Shrimpton; Stephen H Muggleton; Michael J E Sternberg
Journal:  Protein Eng Des Sel       Date:  2009-07-02       Impact factor: 1.650

Review 3.  Enzyme informatics.

Authors:  Rosanna G Alderson; Luna De Ferrari; Lazaros Mavridis; James L McDonagh; John B O Mitchell; Neetika Nath
Journal:  Curr Top Med Chem       Date:  2012       Impact factor: 3.295

4.  PyPLIF HIPPOS-Assisted Prediction of Molecular Determinants of Ligand Binding to Receptors.

Authors:  Enade P Istyastono; Nunung Yuniarti; Vivitri D Prasasty; Sudi Mungkasi
Journal:  Molecules       Date:  2021-04-22       Impact factor: 4.411

5.  A novel hybrid ultrafast shape descriptor method for use in virtual screening.

Authors:  Edward O Cannon; Florian Nigsch; John B O Mitchell
Journal:  Chem Cent J       Date:  2008-02-18       Impact factor: 4.215

6.  The influence of negative training set size on machine learning-based virtual screening.

Authors:  Rafał Kurczab; Sabina Smusz; Andrzej J Bojarski
Journal:  J Cheminform       Date:  2014-06-11       Impact factor: 5.514

7.  Machine learning methods in chemoinformatics.

Authors:  John B O Mitchell
Journal:  Wiley Interdiscip Rev Comput Mol Sci       Date:  2014-09-01

8.  Incorporating Virtual Reactions into a Logic-based Ligand-based Virtual Screening Method to Discover New Leads.

Authors:  Christopher R Reynolds; Stephen H Muggleton; Michael J E Sternberg
Journal:  Mol Inform       Date:  2015-03-20       Impact factor: 3.353
