CONVERGENCE AND PREDICTION OF PRINCIPAL COMPONENT SCORES IN HIGH-DIMENSIONAL SETTINGS.

Seunggeun Lee, Fei Zou, Fred A Wright.

Abstract

A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased towards 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results, and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the improved bias and numerical properties of our estimators.
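
The shrinkage described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or their bias adjustment. It is a toy single-spike Gaussian model in which the function name `simulate_shrinkage` and all parameter values (`n`, `p`, `spike`, `n_new`, `seed`) are assumptions chosen for illustration. It compares the spread of PC scores in the initial sample with the spread of naively predicted scores for new observations projected onto the same sample eigenvector.

```python
import numpy as np


def simulate_shrinkage(n=100, p=2000, spike=50.0, n_new=2000, seed=0):
    """Toy spiked-covariance simulation (illustrative assumptions only).

    Returns the standard deviation of first-PC scores in the initial
    sample and of naively predicted scores for new observations.
    """
    rng = np.random.default_rng(seed)

    # Population covariance: identity, except the first coordinate
    # has variance `spike` (a single spiked eigenvalue).
    scale = np.ones(p)
    scale[0] = np.sqrt(spike)

    train = rng.standard_normal((n, p)) * scale
    new = rng.standard_normal((n_new, p)) * scale

    # Center both data sets with the initial sample's mean.
    mean = train.mean(axis=0)
    train_c = train - mean
    new_c = new - mean

    # Leading sample eigenvector = first right singular vector.
    _, _, vt = np.linalg.svd(train_c, full_matrices=False)
    v1 = vt[0]

    # Scores for the initial sample, and naive predictions obtained by
    # projecting new observations onto the same sample eigenvector.
    train_scores = train_c @ v1
    new_scores = new_c @ v1
    return train_scores.std(), new_scores.std()


if __name__ == "__main__":
    train_sd, new_sd = simulate_shrinkage()
    print(f"SD of initial-sample PC1 scores: {train_sd:.2f}")
    print(f"SD of predicted PC1 scores:      {new_sd:.2f}")
    print(f"ratio (predicted / initial):     {new_sd / train_sd:.2f}")
```

With the dimension much larger than the sample size, as in these assumed settings, the ratio should come out noticeably below 1: the predicted scores are pulled towards 0 relative to the scores of the initial sample, which is the bias the paper sets out to correct.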

Year: 2010    PMID: 21442047    PMCID: PMC3062912    DOI: 10.1214/10-AOS821

Source DB: PubMed    Journal: Ann Stat    ISSN: 0090-5364    Impact factor: 4.028


  7 in total

1.  Principal components analysis corrects for stratification in genome-wide association studies.

Authors:  Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal:  Nat Genet       Date:  2006-07-23       Impact factor: 38.330

2.  Predicting survival from microarray data--a comparative study.

Authors:  H M Bøvelstad; S Nygård; H L Størvold; M Aldrin; Ø Borgan; A Frigessi; O C Lingjaerde
Journal:  Bioinformatics       Date:  2007-06-06       Impact factor: 6.937

3.  Additive risk models for survival data with high-dimensional covariates.

Authors:  Shuangge Ma; Michael R Kosorok; Jason P Fine
Journal:  Biometrics       Date:  2006-03       Impact factor: 2.571

4.  PLINK: a tool set for whole-genome association and population-based linkage analyses.

Authors:  Shaun Purcell; Benjamin Neale; Kathe Todd-Brown; Lori Thomas; Manuel A R Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I W de Bakker; Mark J Daly; Pak C Sham
Journal:  Am J Hum Genet       Date:  2007-07-25       Impact factor: 11.025

5.  On Consistency and Sparsity for Principal Components Analysis in High Dimensions.

Authors:  Iain M Johnstone; Arthur Yu Lu
Journal:  J Am Stat Assoc       Date:  2009-06-01       Impact factor: 5.033

6.  Population structure and eigenanalysis.

Authors:  Nick Patterson; Alkes L Price; David Reich
Journal:  PLoS Genet       Date:  2006-12       Impact factor: 5.917

7.  A whole-genome association study of major determinants for host control of HIV-1.

Authors:  Jacques Fellay; Kevin V Shianna; Dongliang Ge; Sara Colombo; Bruno Ledergerber; Mike Weale; Kunlin Zhang; Curtis Gumbs; Antonella Castagna; Andrea Cossarizza; Alessandro Cozzi-Lepri; Andrea De Luca; Philippa Easterbrook; Patrick Francioli; Simon Mallal; Javier Martinez-Picado; José M Miro; Niels Obel; Jason P Smith; Josiane Wyniger; Patrick Descombes; Stylianos E Antonarakis; Norman L Letvin; Andrew J McMichael; Barton F Haynes; Amalio Telenti; David B Goldstein
Journal:  Science       Date:  2007-07-19       Impact factor: 47.728

  19 in total

1.  On the meta-analysis of genome-wide association studies: a robust and efficient approach to combine population and family-based studies.

Authors:  Sungho Won; Qing Lu; Lars Bertram; Rudolph E Tanzi; Christoph Lange
Journal:  Hum Hered       Date:  2012-01-18       Impact factor: 0.444

2.  Improved ancestry estimation for both genotyping and sequencing data using projection procrustes analysis and genotype imputation.

Authors:  Chaolong Wang; Xiaowei Zhan; Liming Liang; Gonçalo R Abecasis; Xihong Lin
Journal:  Am J Hum Genet       Date:  2015-05-28       Impact factor: 11.025

3.  Genetic ancestry inference using support vector machines, and the active emergence of a unique American population.

Authors:  Ryan J Haasl; Catherine A McCarty; Bret A Payseur
Journal:  Eur J Hum Genet       Date:  2012-12-05       Impact factor: 4.246

4.  Improved ancestry inference using weights from external reference panels.

Authors:  Chia-Yen Chen; Samuela Pollack; David J Hunter; Joel N Hirschhorn; Peter Kraft; Alkes L Price
Journal:  Bioinformatics       Date:  2013-03-28       Impact factor: 6.937

5.  Convergence of Sample Eigenvalues, Eigenvectors, and Principal Component Scores for Ultra-High Dimensional Data.

Authors:  Seunggeun Lee; Fei Zou; Fred A Wright
Journal:  Biometrika       Date:  2014-06       Impact factor: 2.445

6.  A Split-and-Merge Approach for Singular Value Decomposition of Large-Scale Matrices.

Authors:  Faming Liang; Runmin Shi; Qianxing Mo
Journal:  Stat Interface       Date:  2016-09-14       Impact factor: 0.582

7.  Rare variant association test in family-based sequencing studies.

Authors:  Xuefeng Wang; Zhenyu Zhang; Nathan Morris; Tianxi Cai; Seunggeun Lee; Chaolong Wang; Timothy W Yu; Christopher A Walsh; Xihong Lin
Journal:  Brief Bioinform       Date:  2017-11-01       Impact factor: 11.622

8.  An improved and explicit surrogate variable analysis procedure by coefficient adjustment.

Authors:  Seunggeun Lee; Wei Sun; Fred A Wright; Fei Zou
Journal:  Biometrika       Date:  2017-04-21       Impact factor: 2.445

9.  New approaches to population stratification in genome-wide association studies. (Review)

Authors:  Alkes L Price; Noah A Zaitlen; David Reich; Nick Patterson
Journal:  Nat Rev Genet       Date:  2010-07       Impact factor: 53.242

10.  Isolation-by-distance-and-time in a stepping-stone model.

Authors:  Nicolas Duforet-Frebourg; Montgomery Slatkin
Journal:  Theor Popul Biol       Date:  2015-11-21       Impact factor: 1.570
