
Mining Approximate Order Preserving Clusters in the Presence of Noise.

Mengsheng Zhang, Wei Wang, Jinze Liu.

Abstract

Subspace clustering has attracted great attention due to its capability of finding salient patterns in high-dimensional data. Order-preserving subspace clusters have been shown to be important in high-throughput gene expression analysis, since functionally related genes are often co-expressed under a set of experimental conditions. Such co-expression patterns can be represented by consistent orderings of attributes. Existing order-preserving cluster models require that all objects in a cluster have an identical attribute order, without deviation. However, real data are noisy because of limitations in measurement technology and experimental variability, which prevents these strict models from revealing true clusters corrupted by noise. In this paper, we study the problem of revealing order-preserving clusters in the presence of noise. We propose a noise-tolerant model called the approximate order preserving cluster (AOPC). Instead of requiring that all objects in a cluster have an identical attribute order, we require that (1) at least a certain fraction of the objects have an identical attribute order, and (2) the other objects in the cluster deviate from the consensus order by at most a certain fraction of the attributes. We also propose an algorithm to mine AOPCs. Experiments on gene expression data demonstrate the efficiency and effectiveness of our algorithm.
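
The two AOPC conditions stated in the abstract can be illustrated with a minimal Python sketch. This is not the mining algorithm from the paper, only a check of one candidate cluster against one candidate consensus order. The abstract does not say how deviation is measured, so the sketch assumes that an object's deviation is the smallest number of attributes that must be dropped for the remaining ones to follow the consensus order. The names `is_aopc`, `attributes_out_of_order`, `min_core_fraction`, and `max_deviation_fraction` are hypothetical.

```python
from bisect import bisect_left


def attributes_out_of_order(values):
    """Smallest number of entries to drop so the rest are strictly increasing.

    ASSUMPTION: this is one possible deviation measure; the abstract does not
    define how deviation from the consensus order is counted.
    """
    tails = []  # tails[k] = smallest last value of an increasing run of length k + 1
    for v in values:
        pos = bisect_left(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    # len(tails) is the length of the longest strictly increasing subsequence.
    return len(values) - len(tails)


def is_aopc(rows, consensus, min_core_fraction, max_deviation_fraction):
    """Check a candidate cluster against the two AOPC conditions.

    rows: list of objects (e.g. genes), each a list of attribute values.
    consensus: attribute indices listed from lowest to highest expected value.
    Both threshold parameters are hypothetical names for the two fractions.
    """
    if not rows or not consensus:
        return False
    # Read each object's values along the consensus order.
    deviations = [
        attributes_out_of_order([row[a] for a in consensus]) for row in rows
    ]
    # Condition (1): enough objects follow the consensus order exactly.
    core = sum(1 for d in deviations if d == 0)
    # Condition (2): every object deviates on a bounded share of attributes.
    allowed = max_deviation_fraction * len(consensus)
    return (core >= min_core_fraction * len(rows)
            and all(d <= allowed for d in deviations))


# Toy data: three objects follow the order exactly, one has a single noisy value.
expression = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.5, 2.5, 3.5],
    [2.0, 3.0, 5.0, 7.0],
    [1.0, 9.0, 3.0, 4.0],
]
accepted = is_aopc(expression, [0, 1, 2, 3], 0.75, 0.25)
```

In the toy data, a strict order-preserving model would reject the fourth object because of its one out-of-place value, while the relaxed check can keep it as long as both thresholds are met.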

Year:  2008        PMID: 20689652      PMCID: PMC2916184          DOI: 10.1109/ICDE.2008.4497424

Source DB:  PubMed          Journal:  Proc Int Conf Data Eng        ISSN: 1084-4627


