Detecting clusters of different geometrical shapes in microarray gene expression data.

Dae-Won Kim, Kwang H Lee, Doheon Lee.

Abstract

MOTIVATION: Clustering is a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Many clustering methods have been proposed for gene-expression data, including hierarchical clustering, k-means clustering and the self-organizing map (SOM). However, these conventional methods are limited in their ability to identify clusters of different shapes because they use a fixed distance norm when calculating the distance between genes. A fixed distance norm imposes a fixed geometrical shape on the clusters regardless of the actual data distribution. Thus, different distance norms are required to handle different cluster shapes.
RESULTS: We present the Gustafson-Kessel (GK) clustering method for microarray gene-expression data. To detect clusters of different shapes in a dataset, we use an adaptive distance norm computed from the fuzzy covariance matrix (F) of each cluster; the eigenstructure of F indicates the shape of the cluster. Moreover, the GK method is less prone to falling into local minima than k-means and SOM because its decisions are based on the membership degrees of a gene to clusters. The algorithm uses the alternating optimization technique, which iteratively improves a sequence of sets of clusters until no further improvement is possible. To test the performance of the GK method, we applied it and well-known conventional methods to three recently published yeast datasets, and compared the performance of each method using the Saccharomyces Genome Database annotations. The clustering results of the GK method are significantly more relevant to the biological annotations than those of the other methods, demonstrating its effectiveness and potential for clustering gene-expression data.
AVAILABILITY: The software was developed in Java and runs on any platform with a JVM (Java Virtual Machine). It is available from the authors upon request.
SUPPLEMENTARY INFORMATION: Supplementary data are available at http://dragon.kaist.ac.kr/gk.
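
The alternating optimization summarized under RESULTS can be illustrated with a minimal sketch. This is not the authors' Java software. It follows the standard textbook form of Gustafson-Kessel clustering and makes several assumptions that are not stated in the abstract: the data are restricted to two dimensions so that the inverse and determinant of F have closed forms, each cluster volume is fixed to 1, the fuzzifier m defaults to 2, and a small regularization term keeps F invertible. The function name `gk_cluster_2d` and its parameters are hypothetical.

```python
import random


def gk_cluster_2d(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0, reg=1e-9):
    """Illustrative Gustafson-Kessel clustering for 2-D points.

    Returns (centers, covs, u): cluster centres, fuzzy covariance
    matrices F, and the membership matrix u[k][i] of point k in cluster i.
    """
    rng = random.Random(seed)
    n = len(points)
    # Assumption: random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])

    centers, covs = [], []
    for _ in range(max_iter):
        centers, covs, d2 = [], [], []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            ws = sum(w)
            # Step 1: cluster centre as the membership-weighted mean.
            vx = sum(wk * p[0] for wk, p in zip(w, points)) / ws
            vy = sum(wk * p[1] for wk, p in zip(w, points)) / ws
            # Step 2: fuzzy covariance matrix F = [[a, b], [b, d]].
            # 'reg' on the diagonal is an assumption to keep F invertible.
            a = sum(wk * (p[0] - vx) ** 2 for wk, p in zip(w, points)) / ws + reg
            d = sum(wk * (p[1] - vy) ** 2 for wk, p in zip(w, points)) / ws + reg
            b = sum(wk * (p[0] - vx) * (p[1] - vy) for wk, p in zip(w, points)) / ws
            det = a * d - b * b
            # Step 3: norm-inducing matrix A = det(F)^(1/2) * F^-1 in 2-D,
            # i.e. A = [[d, -b], [-b, a]] / sqrt(det).
            s = det ** -0.5
            a11, a12, a22 = s * d, -s * b, s * a
            # Step 4: adaptive squared distance (x - v)^T A (x - v).
            row = []
            for p in points:
                dx, dy = p[0] - vx, p[1] - vy
                q = a11 * dx * dx + 2.0 * a12 * dx * dy + a22 * dy * dy
                row.append(max(q, 1e-12))  # guard against division by zero
            d2.append(row)
            centers.append((vx, vy))
            covs.append(((a, b), (b, d)))

        # Step 5: membership update from the adaptive distances.
        change = 0.0
        for k in range(n):
            inv = [d2[i][k] ** (-1.0 / (m - 1.0)) for i in range(c)]
            total = sum(inv)
            for i in range(c):
                new = inv[i] / total
                change = max(change, abs(new - u[k][i]))
                u[k][i] = new
        if change < tol:  # stop when memberships no longer change
            break
    return centers, covs, u
```

Calling `gk_cluster_2d(points, c=2)` on a list of `(x, y)` tuples returns the centres, one fuzzy covariance matrix per cluster, and the membership degrees. Because each cluster has its own matrix F, the induced distance can stretch along a different direction for each cluster, which a single fixed norm cannot do. Real expression profiles have many more than two dimensions, so a practical implementation would need a general matrix inverse and determinant.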

Year:  2005        PMID: 15647300     DOI: 10.1093/bioinformatics/bti251

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  8 in total

1.  Systems analysis of high-throughput data. (Review)

Authors:  Rosemary Braun
Journal:  Adv Exp Med Biol       Date:  2014       Impact factor: 2.622

2.  SpaCEM3: a software for biological module detection when data is incomplete, high dimensional and dependent.

Authors:  Matthieu Vignes; Juliette Blanchet; Damien Leroux; Florence Forbes
Journal:  Bioinformatics       Date:  2011-02-03       Impact factor: 6.937

3.  Partition decoupling for multi-gene analysis of gene expression profiling data.

Authors:  Rosemary Braun; Gregory Leibon; Scott Pauls; Daniel Rockmore
Journal:  BMC Bioinformatics       Date:  2011-12-30       Impact factor: 3.169

4.  Systematic gene function prediction from gene expression data by using a fuzzy nearest-cluster method.

Authors:  Xiao-Li Li; Yin-Chet Tan; See-Kiong Ng
Journal:  BMC Bioinformatics       Date:  2006-12-12       Impact factor: 3.169

5.  Identification of temporal association rules from time-series microarray data sets.

Authors:  Hojung Nam; KiYoung Lee; Doheon Lee
Journal:  BMC Bioinformatics       Date:  2009-03-19       Impact factor: 3.169

6.  MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering.

Authors:  Eun-Youn Kim; Seon-Young Kim; Daniel Ashlock; Dougu Nam
Journal:  BMC Bioinformatics       Date:  2009-08-22       Impact factor: 3.169

7.  iPcc: a novel feature extraction method for accurate disease class discovery and prediction.

Authors:  Xianwen Ren; Yong Wang; Xiang-Sun Zhang; Qi Jin
Journal:  Nucleic Acids Res       Date:  2013-06-12       Impact factor: 16.971

8.  Feature selection of gene expression data for Cancer classification using double RBF-kernels.

Authors:  Shenghui Liu; Chunrui Xu; Yusen Zhang; Jiaguo Liu; Bin Yu; Xiaoping Liu; Matthias Dehmer
Journal:  BMC Bioinformatics       Date:  2018-10-29       Impact factor: 3.169
