
Boosting: an ensemble learning tool for compound classification and QSAR modeling.

Vladimir Svetnik, Ting Wang, Christopher Tong, Andy Liaw, Robert P Sheridan, Qinghua Song.

Abstract

A classification and regression tool, J. H. Friedman's Stochastic Gradient Boosting (SGB), is applied to predicting a compound's quantitative or categorical biological activity based on a quantitative description of the compound's molecular structure. Stochastic Gradient Boosting is a procedure for building a sequence of models, for instance regression trees (as in this paper), whose outputs are combined to form a predicted quantity, either an estimate of the biological activity, or a class label to which a molecule belongs. In particular, the SGB procedure builds a model in a stage-wise manner by fitting each tree to the gradient of a loss function: e.g., squared error for regression and binomial log-likelihood for classification. The values of the gradient are computed for each sample in the training set, but only a random sample of these gradients is used at each stage. (Friedman showed that the well-known boosting algorithm, AdaBoost of Freund and Schapire, could be considered as a particular case of SGB.) The SGB method is used to analyze 10 cheminformatics data sets, most of which are publicly available. The results show that SGB's performance is comparable to that of Random Forest, another ensemble learning method, and is generally competitive with or superior to that of other QSAR methods. The use of SGB's variable importance with partial dependence plots for model interpretation is also illustrated.
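
The stage-wise procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared-error loss only, uses one-split regression stumps in place of full regression trees, and runs on synthetic data. The names `fit_stump`, `sgb_fit`, `sgb_predict` and the parameters `n_stages`, `learning_rate` and `subsample` are hypothetical and chosen for illustration.

```python
import random


def fit_stump(X, r):
    """Least-squares stump: the single (feature, threshold) split that best fits r."""
    n, p = len(X), len(X[0])
    total, total_sq = sum(r), sum(v * v for v in r)
    mean_r = total / n
    # (sse, feature, threshold, left value, right value); default is a constant fit
    best = (total_sq - total * total / n, 0, float("-inf"), mean_r, mean_r)
    for j in range(p):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(n - 1):
            left_sum += r[order[k]]
            lo, hi = X[order[k]][j], X[order[k + 1]][j]
            if lo == hi:
                continue  # cannot split between tied feature values
            n_left, n_right = k + 1, n - k - 1
            right_sum = total - left_sum
            sse = total_sq - left_sum ** 2 / n_left - right_sum ** 2 / n_right
            if sse < best[0]:
                best = (sse, j, (lo + hi) / 2, left_sum / n_left, right_sum / n_right)
    return best[1:]


def predict_stump(stump, x):
    j, threshold, left_value, right_value = stump
    return left_value if x[j] <= threshold else right_value


def sgb_fit(X, y, n_stages=200, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared-error loss (illustrative only)."""
    rng = random.Random(seed)
    n = len(X)
    f0 = sum(y) / n                      # initial model: the mean response
    pred = [f0] * n
    stumps = []
    m = min(n, max(2, int(subsample * n)))
    for _ in range(n_stages):
        # For the loss 0.5 * (y - F)^2 the negative gradient is the residual y - F.
        # It is computed for every training sample ...
        grad = [y[i] - pred[i] for i in range(n)]
        # ... but only a random subsample is used to fit this stage's learner.
        idx = rng.sample(range(n), m)
        stump = fit_stump([X[i] for i in idx], [grad[i] for i in idx])
        stumps.append(stump)
        for i in range(n):
            pred[i] += learning_rate * predict_stump(stump, X[i])
    return f0, learning_rate, stumps


def sgb_predict(model, x):
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Synthetic data (an assumption for the demo): a step in feature 0 plus a
# weak linear trend in feature 1.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [(3.0 if x[0] > 0.5 else -1.0) + 0.5 * x[1] for x in X]

model = sgb_fit(X, y)
mean_y = sum(y) / len(y)
mse_baseline = sum((v - mean_y) ** 2 for v in y) / len(y)
mse_model = sum((sgb_predict(model, x) - v) ** 2 for x, v in zip(X, y)) / len(y)
print("baseline MSE:", mse_baseline, "boosted MSE:", mse_model)
```

The subsampling step is what distinguishes the stochastic variant from plain gradient boosting; setting `subsample=1.0` in this sketch recovers the deterministic version. For classification, the residual would be replaced by the gradient of the binomial log-likelihood, which this sketch does not cover.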

Year:  2005        PMID: 15921468     DOI: 10.1021/ci0500379

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


