K Y Yeung (1), D R Haynor, W L Ruzzo. (1) Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195, USA.
Abstract
MOTIVATION: Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is used to assess the predictive power of the resulting clusters: meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance. RESULTS: We successfully applied our methodology to compare six clustering algorithms on four gene expression data sets. We found our quantitative measures of cluster quality to be positively correlated with external standards of cluster quality.
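The leave-one-condition-out idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact quality measure, so the root-mean-square deviation from cluster means used here is an assumption, as are all function names (`kmeans_labels`, `left_out_variation`, `leave_one_out_score`) and the synthetic data. A small hand-written k-means stands in for whichever clustering algorithm is being assessed.

```python
import numpy as np


def kmeans_labels(data, k, n_iter=50, seed=0):
    """Very small k-means (illustrative stand-in for any clustering algorithm)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene to every center.
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels


def left_out_variation(values, labels):
    """Assumed measure: RMS deviation of the left-out condition
    from the mean of each cluster."""
    total = 0.0
    for j in np.unique(labels):
        members = values[labels == j]
        total += ((members - members.mean()) ** 2).sum()
    return float(np.sqrt(total / len(values)))


def leave_one_out_score(data, cluster_fn):
    """Cluster on all conditions but one, score on the left-out one,
    and sum the scores over every choice of left-out condition."""
    score = 0.0
    for e in range(data.shape[1]):
        training = np.delete(data, e, axis=1)  # drop condition e
        labels = cluster_fn(training)
        score += left_out_variation(data[:, e], labels)
    return score


# Synthetic example: 40 genes x 5 conditions, two expression patterns.
rng = np.random.default_rng(1)
low = rng.normal(0.0, 0.1, size=(20, 5))
high = rng.normal(5.0, 0.1, size=(20, 5))
data = np.vstack([low, high])

fom_kmeans = leave_one_out_score(data, lambda d: kmeans_labels(d, k=2))
# Baseline: clusters formed by chance (random labels).
fom_random = leave_one_out_score(data, lambda d: rng.integers(0, 2, size=len(d)))
```

A lower score means the clusters predict the left-out condition better. On this synthetic data the k-means score should fall well below the random-label baseline, which mirrors the comparison against clusters formed by chance.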