Feature selection and transduction for prediction of molecular bioactivity for drug design.

Jason Weston, Fernando Pérez-Cruz, Olivier Bousquet, Olivier Chapelle, André Elisseeff, Bernhard Schölkopf.

Abstract

MOTIVATION: In drug discovery a key task is to identify characteristics that separate active (binding) compounds from inactive (non-binding) ones. An automated prediction system can help reduce resources necessary to carry out this task.
RESULTS: Two methods for prediction of molecular bioactivity for drug design are introduced and shown to perform well on a data set previously studied as part of the KDD (Knowledge Discovery and Data Mining) Cup 2001. The data is characterized by very few positive examples, a very large number of features (describing three-dimensional properties of the molecules) and rather different distributions between training and test data. Two techniques are introduced specifically to tackle these problems: a feature selection method for unbalanced data and a classifier which adapts to the distribution of the unlabeled test data (a so-called transductive method). We show that both techniques improve identification performance and that, in conjunction, they provide an improvement over using only one of the techniques. Our results suggest the importance of taking into account the characteristics of this data, which may also be relevant in other problems of a similar type.
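
The abstract does not spell out the feature selection criterion, so the following is only a minimal sketch of one way a score could favor features under class imbalance, assuming binary features and labels of +1 (active) and -1 (inactive). The function names `unbalanced_scores` and `select_top_k`, the penalty weight `lam`, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def unbalanced_scores(
    X: Sequence[Sequence[int]], y: Sequence[int], lam: float = 1.0
) -> List[float]:
    """Score each binary feature.

    Illustrative assumption, not the paper's stated criterion:
    score_j = (# positives with feature j) - lam * (# negatives with feature j).
    """
    n_features = len(X[0])
    scores = [0.0] * n_features
    for row, label in zip(X, y):
        # Positives add 1 per active feature; negatives subtract lam.
        weight = 1.0 if label == 1 else -lam
        for j, value in enumerate(row):
            if value:
                scores[j] += weight
    return scores


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features, in ascending order."""
    # Ties are broken in favor of the lower feature index.
    order = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    return sorted(order[:k])


# Toy data (hypothetical): 2 actives, 4 inactives, 4 binary features.
X = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
y = [1, 1, -1, -1, -1, -1]

scores = unbalanced_scores(X, y, lam=1.0)
selected = select_top_k(scores, k=1)
print(scores, selected)
```

In this toy case, feature 2 is switched on in every compound and so carries no class information; the score penalizes it because it appears in more inactives than actives. Feature 0, which appears only in actives, ranks first. A larger `lam` would penalize any feature seen in inactive compounds more heavily.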

Year:  2003        PMID: 12691989     DOI: 10.1093/bioinformatics/btg054

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  10 in total

1.  [Review] Evaluation of the performance of 3D virtual screening protocols: RMSD comparisons, enrichment assessments, and decoy selection--what can we learn from earlier mistakes?

Authors:  Johannes Kirchmair; Patrick Markt; Simona Distinto; Gerhard Wolber; Thierry Langer
Journal:  J Comput Aided Mol Des       Date:  2008-01-15       Impact factor: 3.686

2.  Boosted feature selectors: a case study on prediction P-gp inhibitors and substrates.

Authors:  Gonzalo Cerruela García; Nicolás García-Pedrajas
Journal:  J Comput Aided Mol Des       Date:  2018-10-26       Impact factor: 3.686

3.  Influence of feature rankers in the construction of molecular activity prediction models.

Authors:  Gonzalo Cerruela-García; José Pérez-Parra Toledano; Aída de Haro-García; Nicolás García-Pedrajas
Journal:  J Comput Aided Mol Des       Date:  2019-12-31       Impact factor: 3.686

4.  Using machine learning, general regression, and Cox proportional hazards regression to predict the effectiveness of treatment in patients with breast cancer.

Authors:  Xiaoyan Wang; Dawn L Hershman; Alfred I Neugut
Journal:  AMIA Annu Symp Proc       Date:  2006

5.  Prediction of enzyme mutant activity using computational mutagenesis and incremental transduction.

Authors:  Nada Basit; Harry Wechsler
Journal:  Adv Bioinformatics       Date:  2011-10-09

6.  Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs).

Authors:  Christian Rausch; Tilmann Weber; Oliver Kohlbacher; Wolfgang Wohlleben; Daniel H Huson
Journal:  Nucleic Acids Res       Date:  2005-10-12       Impact factor: 16.971

7.  Graph-Based Feature Selection Approach for Molecular Activity Prediction.

Authors:  Gonzalo Cerruela-García; José Manuel Cuevas-Muñoz; Nicolás García-Pedrajas
Journal:  J Chem Inf Model       Date:  2022-03-22       Impact factor: 4.956

8.  Entropy-based gene ranking without selection bias for the predictive classification of microarray data.

Authors:  Cesare Furlanello; Maria Serafini; Stefano Merler; Giuseppe Jurman
Journal:  BMC Bioinformatics       Date:  2003-11-06       Impact factor: 3.169

9.  Profiled support vector machines for antisense oligonucleotide efficacy prediction.

Authors:  Gustavo Camps-Valls; Alistair M Chalk; Antonio J Serrano-López; José D Martín-Guerrero; Erik L L Sonnhammer
Journal:  BMC Bioinformatics       Date:  2004-09-22       Impact factor: 3.169

10.  A review of machine learning methods to predict the solubility of overexpressed recombinant proteins in Escherichia coli.

Authors:  Narjeskhatoon Habibi; Siti Z Mohd Hashim; Alireza Norouzi; Mohammed Razip Samian
Journal:  BMC Bioinformatics       Date:  2014-05-08       Impact factor: 3.169
