Mark Girolami (1), Rainer Breitling. (1) Bioinformatics Research Centre, Department of Computing Science, University of Glasgow, UK. girolami@dcs.gla.ac.uk
Abstract
MOTIVATION: Identifying the physiological processes that underlie and generate the expression patterns observed in microarray experiments is a major challenge. Principal component analysis (PCA) is a linear multivariate statistical method that is regularly employed for this purpose, as it provides a reduced-dimensional representation for subsequent study of the biological processes that may be responding to the particular experimental conditions. Making explicit the data assumptions underlying PCA highlights their lack of biological validity, which makes biological interpretation of the principal components problematic. A microarray data representation that enables clear biological interpretation is a desirable analysis tool.
RESULTS: We address this issue by employing the probabilistic interpretation of PCA and proposing alternative linear factor models that are based on refined biological assumptions. A practical study on two well-understood microarray datasets highlights the weakness of PCA and the greater biological interpretability of the linear models we have developed.