Ying Li1, Ye He1, Yu Zhang2. 1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China. 2. College of Computer Science and Technology, Jilin University, Changchun 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China. Electronic address: zy26@jlu.edu.cn.
Abstract
OBJECTIVE: Biological processes actually are a dynamic molecular process over time. Time course gene expression experiments provide opportunities to explore patterns of gene expression change over a time and understand the dynamic behavior of gene expression, which is crucial for study on development and progression of biology and disease. Analysis of the gene expression time-course profiles has not been fully exploited so far. It is still a challenge problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore the significant gene groups. RESULTS: Based on multi-resolution fractal features and mixture clustering model, we proposed a multi-resolution shape mixture model algorithm. Multi-resolution fractal features is computed by wavelet decomposition, which explore patterns of change over time of gene expression at different resolution. Our proposed multi-resolution shape mixture model algorithm is a probabilistic framework which offers a more natural and robust way of clustering time-course gene expression. We assessed the performance of our proposed algorithm using yeast time-course gene expression profiles compared with several popular clustering methods for gene expression profiles. The grouped genes identified by different methods are evaluated by enrichment analysis of biological pathways and known protein-protein interactions from experiment evidence. The grouped genes identified by our proposed algorithm have more strong biological significance. CONCLUSION: A novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed. Our proposed model provides a novel horizons and an alternative tool for visualization and analysis of time-course gene expression profiles. AVAILABILITY: The R and Matlab program is available upon the request.
OBJECTIVE: Biological processes actually are a dynamic molecular process over time. Time course gene expression experiments provide opportunities to explore patterns of gene expression change over a time and understand the dynamic behavior of gene expression, which is crucial for study on development and progression of biology and disease. Analysis of the gene expression time-course profiles has not been fully exploited so far. It is still a challenge problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore the significant gene groups. RESULTS: Based on multi-resolution fractal features and mixture clustering model, we proposed a multi-resolution shape mixture model algorithm. Multi-resolution fractal features is computed by wavelet decomposition, which explore patterns of change over time of gene expression at different resolution. Our proposed multi-resolution shape mixture model algorithm is a probabilistic framework which offers a more natural and robust way of clustering time-course gene expression. We assessed the performance of our proposed algorithm using yeast time-course gene expression profiles compared with several popular clustering methods for gene expression profiles. The grouped genes identified by different methods are evaluated by enrichment analysis of biological pathways and known protein-protein interactions from experiment evidence. The grouped genes identified by our proposed algorithm have more strong biological significance. CONCLUSION: A novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed. Our proposed model provides a novel horizons and an alternative tool for visualization and analysis of time-course gene expression profiles. AVAILABILITY: The R and Matlab program is available upon the request.