
Principal component analysis based methods in bioinformatics studies.

Shuangge Ma, Ying Dai.

Abstract

In the analysis of bioinformatics data, a unique challenge arises from the high dimensionality of measurements. Without loss of generality, we use genomic studies with gene expression measurements as a representative example, but note that the analysis techniques discussed in this article are also applicable to other types of bioinformatics studies. Principal component analysis (PCA) is a classic dimension reduction approach. It constructs linear combinations of gene expressions, called principal components (PCs). The PCs are orthogonal to each other, can effectively explain the variation of gene expressions, and may have a much lower dimensionality. PCA is computationally simple and can be realized using many existing software packages. This article consists of the following parts. First, we review the standard PCA technique and its applications in bioinformatics data analysis. Second, we describe recent 'non-standard' applications of PCA, including accommodating interactions among genes, pathways and network modules, and conducting PCA with estimating equations as opposed to gene expressions. Third, we introduce several recently proposed PCA-based techniques, including supervised PCA, sparse PCA and functional PCA. Supervised PCA and sparse PCA have been shown to have better empirical performance than standard PCA. Functional PCA can analyze time-course gene expression data. Last, we raise awareness of several critical but unsolved problems related to PCA. The goal of this article is to make bioinformatics researchers aware of the PCA technique and, more importantly, its most recent developments, so that this simple yet effective dimension reduction technique can be better employed in bioinformatics data analysis.
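As a rough illustration of the standard PCA step described in the abstract, the following minimal sketch computes PCs from a samples-by-genes expression matrix using a singular value decomposition. It is not the authors' implementation: the function name `pca`, the simulated data, and the choice of two components are all assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np


def pca(X, n_components):
    """Standard PCA on a samples-by-genes matrix X (illustrative helper).

    Returns (scores, loadings, explained_variance_ratio).
    """
    # Center each gene (column) so the PCs describe variation around the mean.
    Xc = X - X.mean(axis=0)
    # Thin SVD: rows of Vt are orthonormal directions in gene space.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T      # genes x k: weights of each linear combination
    scores = Xc @ loadings              # samples x k: the PCs themselves
    variances = s ** 2 / (X.shape[0] - 1)
    ratio = variances[:n_components] / variances.sum()
    return scores, loadings, ratio


# Simulated data (an assumption for illustration only): 20 samples, 500 genes,
# driven by two hidden factors plus a small amount of noise.
rng = np.random.default_rng(0)
n_samples, n_genes = 20, 500
latent = rng.normal(size=(n_samples, 2))
weights = rng.normal(size=(2, n_genes))
X = latent @ weights + 0.1 * rng.normal(size=(n_samples, n_genes))

scores, loadings, ratio = pca(X, n_components=2)
print("reduced shape:", scores.shape)
print("fraction of variance explained by each PC:", ratio)
```

In this sketch the 500 gene expressions are replaced by two PCs, which are orthogonal to each other by construction. With real expression data the number of components to keep is a modelling choice that the sketch does not address.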


Year: 2011  PMID: 21242203  PMCID: PMC3220871  DOI: 10.1093/bib/bbq090

Source DB: PubMed  Journal: Brief Bioinform  ISSN: 1467-5463  Impact factor: 11.622


