Genetic weighted k-means algorithm for clustering large-scale gene expression data.

Fang-Xiang Wu.

Abstract

BACKGROUND: The traditional (unweighted) k-means algorithm is one of the most popular clustering methods for analyzing gene expression data. However, it suffers from three major shortcomings: it is sensitive to the initial partition, its results are prone to local minima, and it is applicable only to data with spherical clusters. The last shortcoming means that we must assume that gene expression data under different conditions are independently distributed with equal variances. In practice, this assumption does not hold.
RESULTS: In this paper, we propose a genetic weighted k-means algorithm (GWKMA) that solves the first two problems and partially remedies the third. GWKMA hybridizes a genetic algorithm (GA) with a weighted k-means algorithm (WKMA). In GWKMA, each individual is encoded by a partitioning table that uniquely determines a clustering, and the algorithm employs three genetic operators (selection, crossover, and mutation) together with a WKM operator derived from WKMA. The superiority of GWKMA over k-means is illustrated on one synthetic and two real-life gene expression datasets.
CONCLUSION: The proposed algorithm has general application to clustering large-scale biological data such as gene expression data and peptide mass spectral data.
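
The abstract does not give implementation details, so the following is only a minimal sketch of how a genetic algorithm might be combined with a weighted k-means step. Everything specific here is an assumption for illustration and is not confirmed by the source: a flat label vector stands in for the partitioning table, per-condition weights are taken as inverse variances, selection is a binary tournament, crossover is one-point, and all function names (`gwkm_sketch`, `wkm_operator`, and so on) are hypothetical.

```python
import random

def inverse_variance_weights(data):
    """Assumed weighting: one weight per condition, 1 / variance of that column."""
    n, dim = len(data), len(data[0])
    weights = []
    for j in range(dim):
        mean = sum(x[j] for x in data) / n
        var = sum((x[j] - mean) ** 2 for x in data) / n
        weights.append(1.0 / var if var > 0 else 1.0)
    return weights

def weighted_sq_dist(x, c, w):
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))

def centroids(data, labels, k):
    dim = len(data[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, lab in zip(data, labels):
        counts[lab] += 1
        for j, xj in enumerate(x):
            sums[lab][j] += xj
    cents = []
    for c in range(k):
        if counts[c]:
            cents.append([s / counts[c] for s in sums[c]])
        else:
            # Empty cluster: fall back to a data point so distances stay defined.
            cents.append(list(data[c % len(data)]))
    return cents

def cost(data, labels, k, w):
    """Total within-cluster weighted squared distance (lower is better)."""
    cents = centroids(data, labels, k)
    return sum(weighted_sq_dist(x, cents[lab], w) for x, lab in zip(data, labels))

def wkm_operator(data, labels, k, w):
    """One weighted k-means step: recompute centroids, then reassign each point."""
    cents = centroids(data, labels, k)
    return [min(range(k), key=lambda c: weighted_sq_dist(x, cents[c], w))
            for x in data]

def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # one-point crossover on label vectors
    return a[:cut] + b[cut:]

def mutate(labels, k, rate, rng):
    return [rng.randrange(k) if rng.random() < rate else lab for lab in labels]

def gwkm_sketch(data, k, w, pop_size=10, generations=20, rate=0.05, seed=0):
    rng = random.Random(seed)
    n = len(data)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best = min(pop, key=lambda ind: cost(data, ind, k, w))

    def pick():
        # Binary tournament selection (an assumed choice of selection operator).
        a, b = rng.sample(pop, 2)
        return a if cost(data, a, k, w) <= cost(data, b, k, w) else b

    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            child = crossover(pick(), pick(), rng)
            child = mutate(child, k, rate, rng)
            children.append(wkm_operator(data, child, k, w))
        pop = children
        candidate = min(pop, key=lambda ind: cost(data, ind, k, w))
        if cost(data, candidate, k, w) < cost(data, best, k, w):
            best = candidate  # keep the best partition seen so far
    return best

# Toy usage: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
labels = gwkm_sketch(data, k=2, w=inverse_variance_weights(data))
print(labels)
```

In this sketch the genetic operators explore different partitions, which is one way to reduce dependence on a single initial partition, while the weighted k-means step refines each candidate locally. The one-point crossover on raw labels is deliberately naive and ignores the fact that cluster labels can be permuted.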

Year: 2008   PMID: 18541047   PMCID: PMC2423435   DOI: 10.1186/1471-2105-9-S6-S12

Source DB: PubMed   Journal: BMC Bioinformatics   ISSN: 1471-2105   Impact factor: 3.169


