Minimum spanning trees for gene expression data clustering.

Y Xu, V Olman, D Xu.

Abstract

This paper describes a new framework for clustering microarray gene-expression data. The foundation of this framework is a minimum spanning tree (MST) representation of a set of multi-dimensional gene expression data. A key property of this representation is that each cluster of the expression data corresponds to one subtree of the MST, which rigorously converts a multi-dimensional clustering problem into a tree partitioning problem. We demonstrate that although the relationships among data points are greatly simplified in the MST representation, no information essential for clustering is lost. Representing a set of multi-dimensional data as an MST has two key advantages: (1) the simple structure of a tree facilitates efficient implementations of rigorous clustering algorithms, which are otherwise computationally very demanding; and (2) because MST-based clustering does not depend on the detailed geometric shape of a cluster, it can overcome many of the problems faced by classical clustering algorithms. Based on the MST representation, we have developed a number of rigorous and efficient clustering algorithms, including two with guaranteed global optimality. We have implemented these algorithms in a software package, EXCAVATOR. To demonstrate its effectiveness, we have tested it on two data sets: expression data from the yeast Saccharomyces cerevisiae, and Arabidopsis expression data in response to chitin elicitation.
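The idea of turning clustering into tree partitioning can be illustrated with a minimal sketch. It assumes Euclidean distance between expression profiles and uses one simple partitioning rule, cutting the k-1 longest MST edges so that each remaining subtree becomes a cluster. This rule is chosen here only for illustration; the abstract does not specify the algorithms implemented in EXCAVATOR, and the function names `minimum_spanning_tree` and `mst_clusters` are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Assumes Euclidean distance. Returns a list of (weight, i, j) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        # Update the cheapest links through the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Split the MST into k subtrees by dropping its k-1 longest edges.

    Illustrative rule only; returns clusters as sorted lists of point indices.
    """
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[:max(0, len(edges) - (k - 1))]

    # Union-find over the kept edges recovers the connected subtrees.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)

    groups = {}
    for idx in range(len(points)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


# Made-up two-dimensional "expression profiles" for illustration only.
profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (9.0, 0.0)]
print(mst_clusters(profiles, 3))  # prints [[0, 1, 2], [3, 4], [5]]
```

Because the partition is computed on the tree rather than on the original points, the cost of the cutting step depends only on the number of tree edges, not on the dimensionality of the data.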

Year:  2001        PMID: 11791221

Source DB:  PubMed          Journal:  Genome Inform        ISSN: 0919-9454


  8 in total

1.  Analysis of time-series gene expression data: methods, challenges, and opportunities.

Authors:  I P Androulakis; E Yang; R R Almon
Journal:  Annu Rev Biomed Eng       Date:  2007       Impact factor: 9.590

2.  Clustering of gene expression data based on shape similarity.

Authors:  Travis J Hestilow; Yufei Huang
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-04-23

3.  Protein-protein interaction analysis for functional characterization of helicases. (Review)

Authors:  Boris L Zybailov; Alicia K Byrd; Galina V Glazko; Yasir Rahmatallah; Kevin D Raney
Journal:  Methods       Date:  2016-04-20       Impact factor: 3.608

4.  Network theory analysis of antibody-antigen reactivity data: the immune trees at birth and adulthood.

Authors:  Asaf Madi; Dror Y Kenett; Sharron Bransburg-Zabary; Yifat Merbl; Francisco J Quintana; Alfred I Tauber; Irun R Cohen; Eshel Ben-Jacob
Journal:  PLoS One       Date:  2011-03-08       Impact factor: 3.240

5.  Combining Pareto-optimal clusters using supervised learning for identifying co-expressed genes.

Authors:  Ujjwal Maulik; Anirban Mukhopadhyay; Sanghamitra Bandyopadhyay
Journal:  BMC Bioinformatics       Date:  2009-01-20       Impact factor: 3.169

6.  Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets.

Authors:  Yasir Rahmatallah; Frank Emmert-Streib; Galina Glazko
Journal:  Bioinformatics       Date:  2013-11-30       Impact factor: 6.937

7.  Clustering Algorithms: Their Application to Gene Expression Data. (Review)

Authors:  Jelili Oyelade; Itunuoluwa Isewon; Funke Oladipupo; Olufemi Aromolaran; Efosa Uwoghiren; Faridah Ameh; Moses Achas; Ezekiel Adebiyi
Journal:  Bioinform Biol Insights       Date:  2016-11-30

8.  Cluster Analysis of Cell Nuclei in H&E-Stained Histological Sections of Prostate Cancer and Classification Based on Traditional and Modern Artificial Intelligence Techniques.

Authors:  Subrata Bhattacharjee; Kobiljon Ikromjanov; Kouayep Sonia Carole; Nuwan Madusanka; Nam-Hoon Cho; Yeong-Byn Hwang; Rashadul Islam Sumon; Hee-Cheol Kim; Heung-Kook Choi
Journal:  Diagnostics (Basel)       Date:  2021-12-22