PCA in High Dimensions: An Orientation.

Iain M Johnstone, Debashis Paul.

Abstract

When the data are high dimensional, widely used multivariate statistical methods such as principal component analysis can behave in unexpected ways. In settings where the dimension of the observations is comparable to the sample size, upward bias in sample eigenvalues and inconsistency of sample eigenvectors are among the most notable phenomena that appear. These phenomena, and the limiting behavior of the rescaled extreme sample eigenvalues, have recently been investigated in detail under the spiked covariance model. The behavior of the bulk of the sample eigenvalues under weak distributional assumptions on the observations has been described. These results have been exploited to develop new estimation and hypothesis testing methods for the population covariance matrix. Furthermore, partly in response to these phenomena, alternative classes of estimation procedures have been developed by exploiting sparsity of the eigenvectors or the covariance matrix. This paper gives an orientation to these areas.
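The first two phenomena named above (upward bias of the leading sample eigenvalue and inconsistency of the leading sample eigenvector) can be seen in a small simulation. The following is a minimal sketch, not the authors' code: it assumes Gaussian observations with known zero mean, a population covariance equal to the identity except for a single spike, and a spike above the phase-transition threshold 1 + sqrt(gamma), where gamma = p/n. The function name `spiked_pca_demo` and its parameters are illustrative. The limiting values it compares against are the standard ones for this setting: the top sample eigenvalue tends to spike * (1 + gamma / (spike - 1)), and the squared cosine between the top sample and population eigenvectors tends to (1 - gamma / (spike - 1)^2) / (1 + gamma / (spike - 1)).

```python
import numpy as np


def spiked_pca_demo(n=2000, p=1000, spike=4.0, seed=0):
    """Simulate one spiked-covariance sample and compare PCA output with theory.

    Assumes spike > 1 + sqrt(p / n); the limits below do not apply otherwise.
    """
    rng = np.random.default_rng(seed)
    gamma = p / n  # aspect ratio: dimension / sample size

    # Population covariance: identity, except eigenvalue `spike` along e_1.
    X = rng.standard_normal((n, p))
    X[:, 0] *= np.sqrt(spike)

    # Sample covariance (mean treated as known and zero) and its spectrum.
    S = X.T @ X / n
    evals, evecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    top_val = evals[-1]
    # Squared cosine between the top sample eigenvector and e_1.
    cos2 = evecs[0, -1] ** 2

    # Large-n, large-p limits for a spike above the threshold.
    limit_val = spike * (1 + gamma / (spike - 1))
    limit_cos2 = (1 - gamma / (spike - 1) ** 2) / (1 + gamma / (spike - 1))
    return top_val, limit_val, cos2, limit_cos2


top_val, limit_val, cos2, limit_cos2 = spiked_pca_demo()
print(f"top sample eigenvalue: {top_val:.3f} (population 4.0, limit {limit_val:.3f})")
print(f"squared cosine with true eigenvector: {cos2:.3f} (limit {limit_cos2:.3f})")
```

With p/n held fixed, the top sample eigenvalue settles above the population value and the squared cosine stays bounded away from 1 however large n becomes, which is the sense in which the sample eigenvector is inconsistent.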

Keywords:  Marchenko-Pastur distribution; Tracy-Widom law; phase transition phenomena; principal component analysis; random matrix theory; spiked covariance model

Year: 2018; PMID: 30287970; PMCID: PMC6167023; DOI: 10.1109/JPROC.2018.2846730

Source DB: PubMed; Journal: Proc IEEE Inst Electr Electron Eng; ISSN: 0018-9219; Impact factor: 10.961


