
Fast, Exact Bootstrap Principal Component Analysis for p > 1 million.

Aaron Fisher, Brian Caffo, Brian Schwartz, Vadim Zipunnikov.   

Abstract

Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods.
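The subspace argument in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation. The function names, the p × n layout (columns are subjects), and the sign-alignment rule are assumptions made for illustration. The idea is to take one SVD of the centered data, Y = V D Uᵀ, resample columns of the n × n matrix D Uᵀ, and take a small SVD per bootstrap sample. The p-dimensional bootstrap components are then V times the low-dimensional coordinates, and their elementwise standard errors follow from the covariance of those coordinates.

```python
import numpy as np

def full_sample_svd(Y):
    """One-time SVD of the row-centered p x n data matrix (columns = subjects).

    Returns V (p x n, orthonormal columns) and DUt (n x n), the coordinates
    of each subject in the basis V.
    """
    Yc = Y - Y.mean(axis=1, keepdims=True)
    V, d, Ut = np.linalg.svd(Yc, full_matrices=False)
    return V, d[:, None] * Ut

def bootstrap_coords(DUt, idx, k):
    """Low-dimensional coordinates of the first k PCs of one bootstrap sample.

    idx holds the resampled subject indices. Only an n x n SVD is needed;
    the p-dimensional components are V @ A and are never formed here.
    """
    Z = DUt[:, idx]
    Z = Z - Z.mean(axis=1, keepdims=True)      # recenter the resampled data
    A, s, _ = np.linalg.svd(Z, full_matrices=False)
    A = A[:, :k]
    # Assumed sign convention: the j-th original PC has coordinates e_j,
    # so flip each bootstrap PC to have a non-negative j-th coordinate.
    signs = np.where(np.diag(A[:k, :k]) < 0, -1.0, 1.0)
    return A * signs, s[:k]

def pointwise_se(V, A_boot, j):
    """Elementwise bootstrap standard errors of the j-th PC.

    A_boot has shape (B, n, k). Uses Var(V a) = diag(V Cov(a) V^T), so the
    B p-dimensional bootstrap components are never stored.
    """
    cov = np.cov(A_boot[:, :, j], rowvar=False)        # n x n
    var = np.sum((V @ cov) * V, axis=1)                # diagonal of V cov V^T
    return np.sqrt(np.maximum(var, 0.0))
```

In this sketch the per-sample cost depends on n rather than p. The p-dimensional matrix V is used only once, when the standard errors are computed.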


Keywords:  PCA; SVD; functional data analysis; image analysis; singular value decomposition

Year:  2016        PMID: 27616801      PMCID: PMC5014451          DOI: 10.1080/01621459.2015.1062383

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


