MOTIVATION: In high-throughput genomic and proteomic experiments, investigators monitor expression across a set of experimental conditions. To gain an understanding of broader biological phenomena, researchers have until recently been limited to post hoc analyses of significant gene lists. METHOD: We describe a general framework, significance analysis of function and expression (SAFE), for conducting valid tests of gene categories ab initio. SAFE is a two-stage, permutation-based method that can be applied to various experimental designs, accounts for the unknown correlation among genes and enables permutation-based estimation of error rates. RESULTS: The utility and flexibility of SAFE are illustrated with a microarray dataset of human lung carcinomas and gene categories based on Gene Ontology and the Protein Family database. Significant gene categories were observed in comparisons of (1) tumor versus normal tissue, (2) multiple tumor subtypes and (3) survival times. AVAILABILITY: Code to implement SAFE in the statistical package R is available from the authors. SUPPLEMENTARY INFORMATION: http://www.bios.unc.edu/~fwright/SAFE.
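The two-stage, permutation-based idea described under METHOD can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R implementation: the local statistic (a Welch t-statistic per gene), the global statistic (a rank sum of absolute local statistics within a category), the function names `welch_t`, `rank_sum` and `safe_sketch`, and the simulated data are all hypothetical choices made here. Permuting the sample labels, rather than the genes, leaves the correlation among genes intact in every permuted dataset.

```python
import random


def _mean_var(xs):
    """Sample mean and unbiased sample variance of a list of numbers."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / (n - 1)


def welch_t(values, labels):
    """Stage 1 (local statistic, assumed here): Welch t for one gene, group 1 vs group 0."""
    a = [v for v, g in zip(values, labels) if g == 1]
    b = [v for v, g in zip(values, labels) if g == 0]
    (ma, va), (mb, vb) = _mean_var(a), _mean_var(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (ma - mb) / se if se > 0 else 0.0


def rank_sum(local_stats, category):
    """Stage 2 (global statistic, assumed here): sum of ranks of |local stat| in a category.

    Ties are broken by gene order; no midrank correction is applied.
    """
    order = sorted(range(len(local_stats)), key=lambda i: abs(local_stats[i]))
    rank = {gene: r + 1 for r, gene in enumerate(order)}
    return sum(rank[g] for g in category)


def safe_sketch(expr, labels, categories, n_perm=500, seed=0):
    """Empirical p-value per category, obtained by permuting the sample labels."""
    rng = random.Random(seed)

    def global_stats(lab):
        local = [welch_t(row, lab) for row in expr]
        return {name: rank_sum(local, genes) for name, genes in categories.items()}

    observed = global_stats(labels)
    exceed = {name: 0 for name in categories}
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)  # whole samples are relabelled, so gene-gene correlation is kept
        permuted = global_stats(perm)
        for name in categories:
            if permuted[name] >= observed[name]:
                exceed[name] += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return {name: (exceed[name] + 1) / (n_perm + 1) for name in categories}


# Simulated data (hypothetical): 20 genes, 10 samples per group,
# with genes 0-4 shifted upward in group 1.
data_rng = random.Random(42)
labels = [0] * 10 + [1] * 10
expr = [
    [data_rng.gauss(0, 1) + (6.0 if gene < 5 and lab == 1 else 0.0) for lab in labels]
    for gene in range(20)
]
categories = {"shifted": [0, 1, 2, 3, 4], "null": [10, 11, 12, 13, 14]}
pvals = safe_sketch(expr, labels, categories, n_perm=500, seed=1)
print(pvals)
```

In this toy setting one would expect the category containing the shifted genes to receive a small empirical p-value. Other experimental designs, such as multiple subtypes or survival times, would need a different local statistic in place of `welch_t`, while the permutation loop would stay the same.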