Sparse partial least squares classification for high dimensional data.

Dongjun Chung, Sunduz Keles.

Abstract

Partial least squares (PLS) is a well-known dimension reduction method that has recently been adapted for high dimensional classification problems in genome biology. We develop sparse versions of two recently proposed PLS-based classification methods using sparse partial least squares (SPLS). These sparse versions aim to achieve variable selection and dimension reduction simultaneously. We consider both binary and multicategory classification. We provide analytical and simulation-based insights into the variable selection properties of these approaches and benchmark them on well-known, publicly available datasets that involve tumor classification with high dimensional gene expression data. We show that incorporating SPLS into a generalized linear model (GLM) framework provides higher sensitivity in variable selection for multicategory classification with unbalanced sample sizes between classes. As the sample size increases, the two-stage approach provides comparable sensitivity with better specificity in variable selection. In binary classification and in multicategory classification with balanced sample sizes, the two-stage approach provides variable selection and prediction accuracy comparable to those of the GLM version and is computationally more efficient.
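
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general two-stage idea it mentions: first derive a sparse latent direction that selects variables and reduces dimension, then classify on the resulting component. The single component, the soft-thresholding rule with parameter `eta`, the nearest-class-mean classifier, the simulated data, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def soft_threshold(v, eta):
    """Shrink entries of v toward zero; entries below eta * max|v| become exactly 0."""
    cutoff = eta * np.max(np.abs(v))
    return np.sign(v) * np.maximum(np.abs(v) - cutoff, 0.0)


def sparse_direction(X, y, eta=0.5):
    """One sparse direction vector from the soft-thresholded covariance X^T y (assumed rule)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = soft_threshold(Xc.T @ yc, eta)
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def two_stage_classify(X_train, y_train, X_test, eta=0.5):
    """Stage 1: sparse dimension reduction. Stage 2: nearest class mean on the component."""
    w = sparse_direction(X_train, y_train.astype(float), eta)
    mu = X_train.mean(axis=0)          # center test data with the training mean
    t_train = (X_train - mu) @ w
    t_test = (X_test - mu) @ w
    m0 = t_train[y_train == 0].mean()
    m1 = t_train[y_train == 1].mean()
    pred = (np.abs(t_test - m1) < np.abs(t_test - m0)).astype(int)
    selected = np.flatnonzero(w)       # indices of variables with nonzero weight
    return pred, selected


# Simulated binary problem (hypothetical data): only the first k of p variables
# differ between the two classes.
rng = np.random.default_rng(0)
n, p, k = 60, 200, 5


def simulate(n_samples):
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.normal(size=(n_samples, p))
    X[:, :k] += 3.0 * y[:, None]
    return X, y


X_train, y_train = simulate(n)
X_test, y_test = simulate(n)
pred, selected = two_stage_classify(X_train, y_train, X_test, eta=0.5)
accuracy = float(np.mean(pred == y_test))
```

In this sketch a larger `eta` zeroes out more variables, which is one way to see the trade-off between sensitivity and specificity in variable selection that the abstract discusses.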

Year:  2010        PMID: 20361856      PMCID: PMC2861314          DOI: 10.2202/1544-6115.1492

Source DB:  PubMed          Journal:  Stat Appl Genet Mol Biol        ISSN: 1544-6115


