
Controlling feature selection in random forests of decision trees using a genetic algorithm: classification of class I MHC peptides.

Loren Hansen, Ernestine A Lee, Kevin Hestir, Lewis T Williams, David Farrelly.

Abstract

Feature selection is an important challenge in many classification problems, especially when the number of features greatly exceeds the number of available examples. We have developed a procedure, GenForest, that controls feature selection in random forests of decision trees by using a genetic algorithm. We tested this approach through our entry in the Comparative Evaluation of Prediction Algorithms 2006 (CoEPrA) competition (accessible online at http://www.coepra.org). CoEPrA was a modeling competition organized to provide objective testing of classification and regression algorithms through blind prediction. In the competition, GenForest ranked 10/23, 5/16 and 9/16 on CoEPrA classification problems 1, 3 and 4, respectively. Each problem involved classifying a different set of type I MHC nonapeptides, i.e., peptides containing nine amino acids. Each amino acid was described by a set of 643 features, for a total of 5787 features per peptide (9 × 643). The method, its application to the CoEPrA datasets, and its performance in the competition are described.
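The abstract does not give GenForest's internals, so the following is only a minimal sketch of the general idea: a genetic algorithm searching over feature subsets. It is not the authors' implementation. Each individual is a boolean mask over the features, and the fitness function is a hypothetical stand-in (`toy_fitness`). In a realistic setting the fitness would instead be something like the validation accuracy of a random forest trained on the masked features. All function names, the selection scheme, and the parameter defaults are assumptions made for illustration.

```python
import random

# Dimensions taken from the abstract: 9 residues x 643 features each.
N_POSITIONS = 9
FEATURES_PER_RESIDUE = 643
N_FEATURES = N_POSITIONS * FEATURES_PER_RESIDUE  # 5787

def random_mask(n_features, rng, p_on=0.05):
    """Random feature subset: each feature is switched on with probability p_on."""
    return [rng.random() < p_on for _ in range(n_features)]

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from either parent with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(mask, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [(not bit) if rng.random() < rate else bit for bit in mask]

def evolve(fitness, n_features, rng, pop_size=20, generations=30, mutation_rate=0.01):
    """Evolve feature masks; returns the best mask and the best fitness per generation."""
    population = [random_mask(n_features, rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection (an assumption)
        children = [ranked[0]]             # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng, mutation_rate))
        population = children
    return max(population, key=fitness), history

def toy_fitness(mask, informative=frozenset(range(0, 200, 10))):
    """Hypothetical stand-in fitness: reward selecting 'informative' indices and
    penalize subset size. A real run would score a classifier trained on the mask."""
    hits = sum(1 for i in informative if mask[i])
    return hits - 0.01 * sum(mask)

if __name__ == "__main__":
    rng = random.Random(0)
    best, history = evolve(toy_fitness, N_FEATURES, rng)
    print(sum(best), "features selected; best toy fitness:", toy_fitness(best))
```

Because the best mask of each generation is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.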


Year: 2009   PMID: 19519331   DOI: 10.2174/138620709788488984

Source DB: PubMed   Journal: Comb Chem High Throughput Screen   ISSN: 1386-2073   Impact factor: 1.339


  3 in total

1.  Prediction using step-wise L1, L2 regularization and feature selection for small data sets with large number of features.

Authors:  Ozgur Demir-Kavuk; Mayumi Kamada; Tatsuya Akutsu; Ernst-Walter Knapp
Journal:  BMC Bioinformatics       Date:  2011-10-25       Impact factor: 3.169

2.  Analysis of biological features associated with meiotic recombination hot and cold spots in Saccharomyces cerevisiae.

Authors:  Loren Hansen; Nak-Kyeong Kim; Leonardo Mariño-Ramírez; David Landsman
Journal:  PLoS One       Date:  2011-12-29       Impact factor: 3.240

3.  MHCII3D-Robust Structure Based Prediction of MHC II Binding Peptides.

Authors:  Josef Laimer; Peter Lackner
Journal:  Int J Mol Sci       Date:  2020-12-22       Impact factor: 5.923
