Iterative class discovery and feature selection using Minimal Spanning Trees.

Sudhir Varma, Richard Simon.

Abstract

BACKGROUND: Clustering is one of the most commonly used methods for discovering hidden structure in microarray gene expression data. Most current methods for clustering samples are based on distance metrics that use all genes. This can obscure sample clusters that are evident only in a subset of genes, because noise from irrelevant genes dominates the signal from the relevant genes in the distance calculation.
RESULTS: We describe an algorithm for automatically detecting clusters of samples that are discernible only in a subset of genes. We iterate between Minimal Spanning Tree based clustering and feature selection to remove noise genes in a step-wise manner while simultaneously sharpening the clustering. Evaluation of this algorithm on synthetic data shows that it resolves planted clusters with high accuracy in spite of noise and the presence of other clusters. It also shows a low probability of detecting spurious clusters. Testing the algorithm on some well-known microarray data sets reveals known biological classes as well as novel clusters.
CONCLUSIONS: The iterative clustering method offers considerable improvement over clustering on all genes. This method can be used to discover partitions, and their biological significance can be determined by comparing them with clinical correlates and gene annotations. The MATLAB programs for the iterative clustering algorithm are available from http://linus.nci.nih.gov/supplement.html
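
The alternation between Minimal Spanning Tree clustering and feature selection described in RESULTS can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. The function names, the rule of cutting the longest tree edge that leaves at least `min_size` samples on each side, the Welch t-statistic used to score genes, and the schedule of halving the gene set until `keep` genes remain are all assumptions made for illustration.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph of the samples.

    Returns a list of (length, i, j) tree edges.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # shortest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_split(points, min_size=2):
    """Cut the longest MST edge that leaves >= min_size samples per side.

    Returns a 0/1 label per sample; sample 0 always receives label 0.
    (Assumed splitting rule, chosen for simplicity.)
    """
    n = len(points)
    edges = sorted(mst_edges(points), reverse=True)
    for cut in range(len(edges)):
        adj = [[] for _ in range(n)]
        for k, (_, i, j) in enumerate(edges):
            if k != cut:
                adj[i].append(j)
                adj[j].append(i)
        labels = [1] * n
        labels[0] = 0
        stack = [0]
        while stack:           # flood-fill the component containing sample 0
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == 1:
                    labels[v] = 0
                    stack.append(v)
        if min_size <= sum(labels) <= n - min_size:
            return labels
    raise ValueError("no edge yields two clusters of the requested size")


def welch_t(a, b):
    """Absolute Welch t-statistic between two lists of values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return abs(ma - mb) / math.sqrt(va / len(a) + vb / len(b) + 1e-12)


def iterative_clustering(data, keep=5, min_size=2):
    """Alternate MST splitting and gene filtering.

    data: list of samples, each a list of gene expression values.
    Returns (labels, indices of the retained genes).
    """
    genes = list(range(len(data[0])))
    while True:
        labels = mst_split([[row[g] for g in genes] for row in data], min_size)
        if len(genes) <= keep:
            return labels, sorted(genes)
        g0 = [r for r, lab in zip(data, labels) if lab == 0]
        g1 = [r for r, lab in zip(data, labels) if lab == 1]
        score = {g: welch_t([r[g] for r in g0], [r[g] for r in g1]) for g in genes}
        n_keep = max(keep, len(genes) // 2)   # drop the weaker half each round
        genes = sorted(genes, key=score.get, reverse=True)[:n_keep]


if __name__ == "__main__":
    # Synthetic data: two planted groups that differ only in the first 5 genes.
    rng = random.Random(1)
    truth = [0] * 8 + [1] * 8
    data = [[rng.gauss(4.0 * t if g < 5 else 0.0, 1.0) for g in range(60)]
            for t in truth]
    labels, genes = iterative_clustering(data, keep=5)
    print(labels, genes)
```

Each round re-clusters the samples using only the genes that best separated the previous partition, so noise genes contribute progressively less to the distances from which the tree is built.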

Year:  2004        PMID: 15355552      PMCID: PMC520744          DOI: 10.1186/1471-2105-5-126

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  6 in total

1.  HAMSTER: visualizing microarray experiments as a set of minimum spanning trees.

Authors:  Raymond Wan; Larisa Kiseleva; Hajime Harada; Hiroshi Mamitsuka; Paul Horton
Journal:  Source Code Biol Med       Date:  2009-11-20

2.  A unified computational model for revealing and predicting subtle subtypes of cancers.

Authors:  Xianwen Ren; Yong Wang; Jiguang Wang; Xiang-Sun Zhang
Journal:  BMC Bioinformatics       Date:  2012-05-01       Impact factor: 3.169

3.  Similarity searches in genome-wide numerical data sets.

Authors:  Galina Glazko; Michael Coleman; Arcady Mushegian
Journal:  Biol Direct       Date:  2006-05-30       Impact factor: 4.540

4.  Individualized markers optimize class prediction of microarray data.

Authors:  Pavlos Pavlidis; Panayiota Poirazi
Journal:  BMC Bioinformatics       Date:  2006-07-14       Impact factor: 3.169

5.  A novel strategy for gene selection of microarray data based on gene-to-class sensitivity information.

Authors:  Fei Han; Wei Sun; Qing-Hua Ling
Journal:  PLoS One       Date:  2014-05-20       Impact factor: 3.240

6.  Gene selection for classification of microarray data based on the Bayes error.

Authors:  Ji-Gang Zhang; Hong-Wen Deng
Journal:  BMC Bioinformatics       Date:  2007-10-03       Impact factor: 3.169

