Matrix Completion Discriminant Analysis.

Tong Tong Wu, Kenneth Lange.

Abstract

Matrix completion discriminant analysis (MCDA) is designed for semi-supervised learning where the rate of missingness is high and predictors vastly outnumber cases. MCDA operates by mapping class labels to the vertices of a regular simplex. With c classes, these vertices are arranged on the surface of the unit sphere in (c - 1)-dimensional Euclidean space. Because all pairs of vertices are equidistant, the classes are treated symmetrically. To assign unlabeled cases to classes, the data are entered into a large matrix (cases along rows and predictors along columns) that is augmented by vertex coordinates stored in the last c - 1 columns. Once the matrix is constructed, its missing entries, including the vertex coordinates of unlabeled cases, can be filled in by matrix completion. To carry out matrix completion, one minimizes a sum of squares over the observed entries plus a nuclear norm penalty. The simplest solution invokes an MM algorithm and singular value decomposition. The penalty tuning constant can be chosen by cross-validation on randomly withheld case labels. Once the matrix is completed, an unlabeled case is assigned to the class vertex closest to the point deposited in its last c - 1 columns. A variety of examples drawn from the statistical literature demonstrate that MCDA is competitive on traditional problems and outperforms alternatives on large-scale problems.
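The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`simplex_vertices`, `soft_impute`, `mcda_predict`), the specific vertex formula, the convergence rule, the use of NaN for missing predictors, and the use of -1 for unlabeled cases are all choices made here for illustration. The completion step uses the familiar iteration that fills missing entries with the current estimate, takes an SVD, and soft-thresholds the singular values, which is one standard MM scheme for the penalized objective (1/2)·||P_Ω(X − Z)||_F² + λ·||Z||_*.

```python
import numpy as np

def simplex_vertices(c):
    """Return c equidistant unit vectors in R^(c-1), one per row.
    One common construction; the abstract does not fix a formula."""
    d = c - 1
    V = np.empty((c, d))
    V[0] = 1.0 / np.sqrt(d)                # first vertex: d^(-1/2) * ones
    r = -(1.0 + np.sqrt(c)) / d ** 1.5
    s = np.sqrt(c / d)
    V[1:] = r + s * np.eye(d)              # remaining vertices: r*ones + s*e_j
    return V

def soft_impute(X, observed, lam, n_iter=200, tol=1e-6):
    """Nuclear-norm-penalized completion by iterated SVD soft-thresholding.
    `observed` is a boolean mask; entries of X where it is False are ignored."""
    Z = np.where(observed, X, 0.0)
    for _ in range(n_iter):
        filled = np.where(observed, X, Z)  # surrogate: plug in current guess
        U, sv, Vt = np.linalg.svd(filled, full_matrices=False)
        Z_new = (U * np.maximum(sv - lam, 0.0)) @ Vt
        done = np.linalg.norm(Z_new - Z) <= tol * max(1.0, np.linalg.norm(Z))
        Z = Z_new
        if done:
            break
    return Z

def mcda_predict(features, labels, c, lam):
    """features: (n, p) array, NaN marks a missing predictor (assumption).
    labels: integer array in 0..c-1, with -1 marking an unlabeled case."""
    V = simplex_vertices(c)
    n = features.shape[0]
    Y = np.full((n, c - 1), np.nan)        # vertex columns, missing if unlabeled
    labeled = labels >= 0
    Y[labeled] = V[labels[labeled]]
    X = np.hstack([features, Y])           # predictors first, vertices last
    Z = soft_impute(X, ~np.isnan(X), lam)
    points = Z[:, -(c - 1):]               # completed vertex coordinates
    dists = np.linalg.norm(points[:, None, :] - V[None, :, :], axis=2)
    return dists.argmin(axis=1)            # index of the nearest vertex
```

In practice the penalty constant `lam` would be tuned as the abstract describes, by withholding a random subset of known labels, completing the matrix for each candidate value, and keeping the value that best recovers the withheld labels. Standardizing predictor columns beforehand is a further assumption worth considering, since the vertex columns have entries of magnitude at most 1.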

Keywords:  Classification; MM algorithm; Missing observations; Semi-supervised learning; Singular value decomposition

Year:  2015        PMID: 26549920      PMCID: PMC4634674          DOI: 10.1016/j.csda.2015.06.006

Source DB:  PubMed          Journal:  Comput Stat Data Anal        ISSN: 0167-9473            Impact factor:   1.681


