Jörg K Wegner, Holger Fröhlich, Andreas Zell.
Abstract
This paper describes several aspects of classification models built on molecular data sets, with a focus on feature selection methods. In particular, model quality and the avoidance of high variance on unseen data (overfitting) are discussed with respect to the feature selection problem. We present several standard approaches, as well as modifications of our Genetic Algorithm based on Shannon Entropy Cliques (GA-SEC) and its extension to classification problems using boosting.
Year: 2004 PMID: 15154758 DOI: 10.1021/ci0342324
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338