Multi-class HingeBoost. Method and application to the classification of cancer types using gene expression data.

Z Wang.

Abstract

BACKGROUND: Multi-class molecular cancer classification has great potential clinical implications. Such applications require statistical methods to accurately classify cancer types with a small subset of genes from thousands of genes in the data.
OBJECTIVES: This paper presents a new functional gradient descent boosting algorithm that directly extends the HingeBoost algorithm from the binary case to the multi-class case without reducing the original problem to multiple binary problems.
METHODS: The proposed HingeBoost minimizes a multi-class hinge loss with a boosting technique. It has good theoretical properties: it implements the Bayes decision rule and provides a unifying framework for both equal and unequal misclassification costs. Furthermore, we propose Twin HingeBoost, which has better feature selection behavior than HingeBoost by reducing the number of ineffective covariates. Simulated data, benchmark data and two cancer gene expression data sets are used to evaluate the performance of the proposed approach.
RESULTS: Simulations and the benchmark data showed that the multi-class HingeBoost generated accurate predictions compared with the alternative methods, especially with high-dimensional covariates. The multi-class HingeBoost also produced more accurate or comparable predictions in two cancer classification problems using gene expression data.
CONCLUSIONS: This work has shown that HingeBoost provides a powerful tool for multi-class classification problems. In many applications, the classification accuracy and feature selection behavior can be further improved by using Twin HingeBoost.
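
The abstract does not state the exact loss or base learner, so the following is only an illustrative sketch of the general idea (functional gradient descent on a multi-class hinge loss, fitted directly rather than through several binary problems). It assumes a hinge loss of the multicategory SVM type (Lee, Lin and Wahba) with a sum-to-zero constraint on the class scores, and component-wise linear least-squares base learners that update one covariate per class per iteration. The names `hinge_loss`, `negative_gradient` and `fit_hinge_boost`, the step size `nu`, and the toy data are all hypothetical; this is not the author's implementation and does not cover Twin HingeBoost.

```python
import numpy as np

def hinge_loss(F, y, cost=None):
    """Mean multi-class hinge loss (assumed form): sum over wrong classes j of
    cost[y, j] * max(0, F[:, j] + 1/(K-1)). cost must have a zero diagonal."""
    K = F.shape[1]
    if cost is None:
        cost = 1.0 - np.eye(K)              # equal misclassification costs
    return float(np.mean(np.sum(cost[y] * np.maximum(0.0, F + 1.0 / (K - 1)), axis=1)))

def negative_gradient(F, y, cost=None):
    """Negative subgradient of the loss above with respect to the scores F."""
    K = F.shape[1]
    if cost is None:
        cost = 1.0 - np.eye(K)
    active = (F + 1.0 / (K - 1)) > 0        # wrong-class hinge terms still violated
    return -cost[y] * active

def fit_hinge_boost(X, y, K, n_iter=200, nu=0.1, cost=None):
    """Boosting sketch: each iteration fits, for every class, the single
    covariate that best explains the negative gradient (least squares)."""
    n, p = X.shape
    coef = np.zeros((p, K))
    col_ss = np.sum(X ** 2, axis=0)         # assumes no all-zero column
    classes = np.arange(K)
    for _ in range(n_iter):
        U = negative_gradient(X @ coef, y, cost)
        U = U - U.mean(axis=1, keepdims=True)        # project onto sum-to-zero
        B = (X.T @ U) / col_ss[:, None]              # per-covariate LS slopes
        best = np.argmax(B ** 2 * col_ss[:, None], axis=0)  # best covariate per class
        coef[best, classes] += nu * B[best, classes]
        coef -= coef.mean(axis=1, keepdims=True)     # keep class scores summing to 0
    return coef

# Toy data (made up): 3 classes, 20 covariates, only the first two informative.
rng = np.random.default_rng(0)
centers = 4.0 * np.array([[1.0, 0.0], [-0.5, 0.866], [-0.5, -0.866]])
y = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 20))
X[:, :2] += centers[y]

coef = fit_hinge_boost(X, y, K=3)
pred = np.argmax(X @ coef, axis=1)          # classify by the largest score
acc = float(np.mean(pred == y))
selected = np.flatnonzero(np.abs(coef).sum(axis=1) > 0)  # covariates ever chosen
```

Because each iteration touches only one covariate per class, covariates that are never selected keep zero coefficients, which is the sense in which boosting of this kind performs feature selection. Unequal misclassification costs can be tried by passing a `cost` matrix with a zero diagonal.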

Year:  2012        PMID: 22378240     DOI: 10.3414/ME11-02-0020

Source DB:  PubMed          Journal:  Methods Inf Med        ISSN: 0026-1270            Impact factor:   2.176


  3 in total

1.  From bed to bench: bridging from informatics practice to theory: an exploratory analysis.

Authors:  R Haux; C U Lehmann
Journal:  Appl Clin Inform       Date:  2014-10-29       Impact factor: 2.342

2.  Statistical representation models for mutation information within genomic data.

Authors:  N Özlem Özcan Şimşek; Arzucan Özgür; Fikret Gürgen
Journal:  BMC Bioinformatics       Date:  2019-06-13       Impact factor: 3.169

3.  A novel gene selection method for gene expression data for the task of cancer type classification.

Authors:  N Özlem Özcan Şimşek; Arzucan Özgür; Fikret Gürgen
Journal:  Biol Direct       Date:  2021-02-08       Impact factor: 4.540
