Qianyi Zhang, Jacqueline M Hughes-Oliver, Raymond T Ng.
Abstract
Ensemble methods have become popular for QSAR modeling, but most studies have assumed balanced data, consisting of approximately equal numbers of active and inactive compounds. Cheminformatics data are often far from being balanced. We extend the application of ensemble methods to include cases of imbalance of class membership and to more adequately assess model output. Based on the extension, we propose an ensemble method called MBEnsemble that automatically determines the appropriate tuning parameters to provide reliable predictions and maximize the F-measure. Results from multiple data sets demonstrate that the proposed ensemble technique works well on imbalanced data.
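As a rough illustration of why the F-measure matters on imbalanced data, the sketch below computes it from binary labels and then picks the score cutoff that maximizes it. This is not the MBEnsemble procedure, which the abstract does not spell out. The function names `f_measure` and `best_threshold` are hypothetical, and the sketch assumes that each compound already has an ensemble score in [0, 1] (for example, the fraction of ensemble members voting "active").

```python
from typing import Sequence, Tuple


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (harmonic mean of precision and recall) for the active class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No active compound was found, so the F-measure is taken as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Scan the observed scores as candidate cutoffs; return (cutoff, F-measure) of the best one.

    Assumption for illustration only: a compound is predicted active when
    its score is greater than or equal to the cutoff.
    """
    best_t, best_f = 1.0, 0.0
    for t in sorted(set(scores)):
        pred = [1 if s >= t else 0 for s in scores]
        f = f_measure(y_true, pred)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


if __name__ == "__main__":
    # Toy imbalanced data: 2 actives among 10 compounds (made-up values).
    y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    scores = [0.45, 0.10, 0.05, 0.20, 0.15, 0.30, 0.02, 0.08, 0.40, 0.12]
    default_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("F-measure at cutoff 0.5:", f_measure(y_true, default_pred))
    print("Best (cutoff, F-measure):", best_threshold(y_true, scores))
```

On the toy data, the conventional 0.5 cutoff predicts every compound as inactive. That gives 80% accuracy but an F-measure of 0, which is the kind of outcome that motivates tuning for the F-measure instead of accuracy when actives are rare.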
Year: 2009 PMID: 19807194 PMCID: PMC2760036 DOI: 10.1021/ci900080f
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956