Differential gene expression detection and sample classification using penalized linear regression models.

Baolin Wu

Abstract

Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p >> n), microarray data pose major challenges for statistical analysis. An obvious problem in the 'large p, small n' setting is over-fitting: just by chance, we are likely to find some non-differentially expressed genes that classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple and intuitive, and have proved useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using L1 penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using L1 penalized regression models. We also show that penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
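The link between an L1 penalty and centroid shrinkage can be illustrated with a short Python sketch. It is not the paper's derivation, which the abstract does not spell out. It relies only on the standard fact that minimizing 0.5*(z - d)^2 + lam*|d| over d gives the soft-thresholded value sign(z)*max(|z| - lam, 0). The function names, the simulated data and the choice lam = 1.0 are assumptions made for illustration, and the per-gene standardization used in the nearest shrunken centroid method is omitted for brevity.

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form minimizer over d of 0.5*(z - d)**2 + lam*|d|."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def shrunken_centroids(X, y, lam):
    """Shrink each class centroid toward the overall centroid, gene by gene.

    X: (n_samples, n_genes) expression matrix; y: integer class labels.
    Simplifying assumption: differences are not standardized by gene.
    """
    overall = X.mean(axis=0)
    classes = np.unique(y)
    centroids = np.empty((classes.size, X.shape[1]))
    for k, c in enumerate(classes):
        diff = X[y == c].mean(axis=0) - overall
        # The L1 penalty sets small differences exactly to zero.
        centroids[k] = overall + soft_threshold(diff, lam)
    return classes, overall, centroids

# Simulated 'large p, small n' data: 20 samples, 200 genes (hypothetical).
rng = np.random.default_rng(0)
n_per_class, p = 10, 200
X = rng.normal(size=(2 * n_per_class, p))
y = np.repeat([0, 1], n_per_class)
X[y == 1, :5] += 3.0  # only the first 5 genes carry a class effect

classes, overall, centroids = shrunken_centroids(X, y, lam=1.0)

# Genes whose centroid differs from the overall centroid in some class
# are the ones the shrinkage keeps as candidates.
selected = np.flatnonzero(np.any(centroids != overall, axis=0))
print("genes kept:", selected)
```

Genes whose class means all lie within lam of the overall mean drop out entirely, so a single tuning parameter controls both classification and gene selection. In practice lam would be chosen by cross-validation rather than fixed in advance.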

Year:  2005        PMID: 16352654     DOI: 10.1093/bioinformatics/bti827

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  10 in total

1.  Sparse regularized discriminant analysis with application to microarrays.

Authors:  Ran Li; Baolin Wu
Journal:  Comput Biol Chem       Date:  2012-07-04       Impact factor: 2.877

2.  Statistical methods for integrating multiple types of high-throughput data. (Review)

Authors:  Yang Xie; Chul Ahn
Journal:  Methods Mol Biol       Date:  2010

3.  Bias-corrected diagonal discriminant rules for high-dimensional classification.

Authors:  Song Huang; Tiejun Tong; Hongyu Zhao
Journal:  Biometrics       Date:  2010-12       Impact factor: 2.571

4.  L1 penalized continuation ratio models for ordinal response prediction using high-dimensional datasets.

Authors:  K J Archer; A A A Williams
Journal:  Stat Med       Date:  2012-02-23       Impact factor: 2.373

5.  Identification of significant features in DNA microarray data.

Authors:  Eric Bair
Journal:  Wiley Interdiscip Rev Comput Stat       Date:  2013-07

6.  EPS-LASSO: test for high-dimensional regression under extreme phenotype sampling of continuous traits.

Authors:  Chao Xu; Jian Fang; Hui Shen; Yu-Ping Wang; Hong-Wen Deng
Journal:  Bioinformatics       Date:  2018-06-15       Impact factor: 6.937

7.  Characterizing Human Cell Types and Tissue Origin Using the Benford Law.

Authors:  Sne Morag; Mali Salmon-Divon
Journal:  Cells       Date:  2019-08-29       Impact factor: 6.600

8.  Penalized negative binomial models for modeling an overdispersed count outcome with a high-dimensional predictor space: Application predicting micronuclei frequency.

Authors:  Rebecca R Lehman; Kellie J Archer
Journal:  PLoS One       Date:  2019-01-08       Impact factor: 3.240

9.  A feature selection-based framework to identify biomarkers for cancer diagnosis: A focus on lung adenocarcinoma.

Authors:  Omar Abdelwahab; Nourelislam Awad; Menattallah Elserafy; Eman Badr
Journal:  PLoS One       Date:  2022-09-06       Impact factor: 3.752

10.  Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique.

Authors:  Yuanting Yan; Tao Dai; Meili Yang; Xiuquan Du; Yiwen Zhang; Yanping Zhang
Journal:  Int J Mol Sci       Date:  2018-10-30       Impact factor: 5.923
