Application of the Bayesian MMSE estimator for classification error to gene expression microarray data.

Lori A Dalton, Edward R Dougherty.

Abstract

MOTIVATION: With the development of high-throughput genomic and proteomic technologies, coupled with the inherent difficulties in obtaining large samples, biomedicine faces difficult small-sample classification issues, in particular, error estimation. Most popular error estimation methods are motivated by intuition rather than mathematical inference. A recently proposed error estimator based on Bayesian minimum mean square error estimation places error estimation in an optimal filtering framework. In this work, we examine the application of this error estimator to gene expression microarray data, including the suitability of the Gaussian model with normal-inverse-Wishart priors and how to find prior probabilities.
RESULTS: We provide an implementation for non-linear classification, where closed form solutions are not available. We propose a methodology for calibrating normal-inverse-Wishart priors based on discarded microarray data and examine the performance on synthetic high-dimensional data and a real dataset from a breast cancer study. The calibrated Bayesian error estimator has superior root mean square performance, especially with moderate to high expected true errors and small feature sizes.
AVAILABILITY: We have implemented in C code the Bayesian error estimator for Gaussian distributions and normal-inverse-Wishart priors for both linear classifiers, with exact closed-form representations, and arbitrary classifiers, where we use a Monte Carlo approximation. Our code for the Bayesian error estimator and a toolbox of related utilities are available at http://gsp.tamu.edu/Publications/supplementary/dalton11a. Several supporting simulations are also included.
CONTACT: ldalton@tamu.edu
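
The idea of averaging a classifier's true error over a posterior, by Monte Carlo or in closed form, can be illustrated with a deliberately simplified sketch. This is not the authors' C implementation. It assumes a one-dimensional feature, a known standard deviation `sigma`, a normal prior on each class mean (in place of the normal-inverse-Wishart prior named in the abstract), and a fixed threshold classifier. All function names and parameters (`prior_mean`, `nu`, `c0`, and so on) are hypothetical and chosen for illustration only.

```python
import math
import random


def posterior_mean_params(samples, prior_mean, nu, sigma):
    """Posterior N(m_star, s_star^2) of a class mean, assuming known sigma
    and the prior mu ~ N(prior_mean, sigma^2 / nu) (an illustrative choice)."""
    n = len(samples)
    xbar = sum(samples) / n
    m_star = (nu * prior_mean + n * xbar) / (nu + n)
    s_star = sigma / math.sqrt(nu + n)
    return m_star, s_star


def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def class_error(mu, sigma, threshold, label):
    """True error of the rule 'predict 1 if x > threshold' on a class
    whose feature is distributed N(mu, sigma^2)."""
    p_above = 1.0 - normal_cdf((threshold - mu) / sigma)
    return p_above if label == 0 else 1.0 - p_above


def bayesian_error_mc(x0, x1, threshold, sigma, prior0, prior1, nu, c0,
                      draws=20000, seed=0):
    """Monte Carlo estimate of the posterior expected true error:
    draw class means from their posteriors and average the true error."""
    rng = random.Random(seed)
    m0, s0 = posterior_mean_params(x0, prior0, nu, sigma)
    m1, s1 = posterior_mean_params(x1, prior1, nu, sigma)
    total = 0.0
    for _ in range(draws):
        mu0 = rng.gauss(m0, s0)
        mu1 = rng.gauss(m1, s1)
        total += (c0 * class_error(mu0, sigma, threshold, 0)
                  + (1.0 - c0) * class_error(mu1, sigma, threshold, 1))
    return total / draws


def bayesian_error_closed_form(x0, x1, threshold, sigma, prior0, prior1, nu, c0):
    """Exact value for this toy model: marginally, a class feature is
    N(m_star, sigma^2 + s_star^2), so the expected error is a normal tail."""
    m0, s0 = posterior_mean_params(x0, prior0, nu, sigma)
    m1, s1 = posterior_mean_params(x1, prior1, nu, sigma)
    e0 = 1.0 - normal_cdf((threshold - m0) / math.sqrt(sigma**2 + s0**2))
    e1 = normal_cdf((threshold - m1) / math.sqrt(sigma**2 + s1**2))
    return c0 * e0 + (1.0 - c0) * e1


if __name__ == "__main__":
    x0 = [-1.2, -0.4, 0.1, -0.8]   # made-up class-0 training values
    x1 = [0.9, 1.5, 0.3, 1.1]      # made-up class-1 training values
    args = (x0, x1, 0.0, 1.0, -1.0, 1.0, 2.0, 0.5)
    print("Monte Carlo:", bayesian_error_mc(*args))
    print("Closed form:", bayesian_error_closed_form(*args))
```

In this toy setting the classifier is linear, so the closed form exists and serves as a check on the Monte Carlo average. For a classifier whose true error under a sampled distribution has no closed form, `class_error` would itself have to be approximated numerically.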

Year:  2011        PMID: 21551140     DOI: 10.1093/bioinformatics/btr272

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Performance reproducibility index for classification.

Authors:  Mohammadmahdi R Yousefi; Edward R Dougherty
Journal:  Bioinformatics       Date:  2012-09-06       Impact factor: 6.937

2.  Building gene expression profile classifiers with a simple and efficient rejection option in R.

Authors:  Alfredo Benso; Stefano Di Carlo; Gianfranco Politano; Alessandro Savino; Hafeez Hafeezurrehman
Journal:  BMC Bioinformatics       Date:  2011-11-30       Impact factor: 3.169

3.  Scientific knowledge is possible with small-sample classification.

Authors:  Edward R Dougherty; Lori A Dalton
Journal:  EURASIP J Bioinform Syst Biol       Date:  2013-08-20

4.  Modeling the next generation sequencing sample processing pipeline for the purposes of classification.

Authors:  Noushin Ghaffari; Mohammadmahdi R Yousefi; Charles D Johnson; Ivan Ivanov; Edward R Dougherty
Journal:  BMC Bioinformatics       Date:  2013-10-11       Impact factor: 3.169

5.  The Model-Based Study of the Effectiveness of Reporting Lists of Small Feature Sets Using RNA-Seq Data.

Authors:  Eunji Kim; Ivan Ivanov; Jianping Hua; Johanna W Lampe; Meredith Aj Hullar; Robert S Chapkin; Edward R Dougherty
Journal:  Cancer Inform       Date:  2017-06-12

6.  On optimal Bayesian classification and risk estimation under multiple classes.

Authors:  Lori A Dalton; Mohammadmahdi R Yousefi
Journal:  EURASIP J Bioinform Syst Biol       Date:  2015-10-24

7.  MCMC implementation of the optimal Bayesian classifier for non-Gaussian models: model-based RNA-Seq classification.

Authors:  Jason M Knight; Ivan Ivanov; Edward R Dougherty
Journal:  BMC Bioinformatics       Date:  2014-12-10       Impact factor: 3.169

8.  Prediction Accuracy in Multivariate Repeated-Measures Bayesian Forecasting Models with Examples Drawn from Research on Sleep and Circadian Rhythms.

Authors:  Clark Kogan; Leonid Kalachev; Hans P A Van Dongen
Journal:  Comput Math Methods Med       Date:  2016-01-14       Impact factor: 2.238

9.  An Efficient Feature Selection Strategy Based on Multiple Support Vector Machine Technology with Gene Expression Data.

Authors:  Ying Zhang; Qingchun Deng; Wenbin Liang; Xianchun Zou
Journal:  Biomed Res Int       Date:  2018-08-30       Impact factor: 3.411
