SamCluster: an integrated scheme for automatic discovery of sample classes using gene expression profile.

Wuju Li, Ming Fan, Momiao Xiong.

Abstract

MOTIVATION: Feature (gene) selection can dramatically improve the accuracy of sample class prediction based on gene expression profiles. Many statistical methods for feature (gene) selection, such as stepwise optimization and Monte Carlo simulation, have been developed for tissue sample classification. In contrast to class prediction, few statistical and computational methods for feature selection have been applied to clustering algorithms for pattern discovery.
RESULTS: This report presents an integrated scheme, and the corresponding program SamCluster, for automatic discovery of sample classes based on gene expression profiles. The scheme incorporates feature selection algorithms based on the CV (coefficient of variation) and the t-test into hierarchical clustering, and proceeds as follows. First, genes whose CV is greater than a pre-specified threshold are selected for cluster analysis, which yields two putative sample classes. Then, genes that are significantly differentially expressed between the two putative sample classes, with t-test p-values ≤ 0.01, 0.05, or 0.1, are selected for further cluster analysis. These two steps are iterated until two stable sample classes are found. Finally, consensus sample classes are constructed from the putative classes derived from the different CV thresholds, and the best putative sample classes, those with the minimum distance to the consensus classes, are identified. To evaluate the performance of feature selection for cluster analysis, the proposed scheme was applied to four expression datasets: COLON, LEUKEMIA72, LEUKEMIA38, and OVARIAN. Only 5, 1, 0, and 0 samples were misclassified, respectively. We conclude that the proposed scheme, SamCluster, is an efficient method for discovery of sample classes using gene expression profiles.
AVAILABILITY: The related program SamCluster is available upon request or from the web page http://www.sph.uth.tmc.edu:8052/hgc/Downloads.asp.
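
The iterative part of the scheme described in RESULTS (CV filter, two-class hierarchical clustering, t-test gene selection, repeat until stable) can be sketched as follows. This is a minimal illustration and not the SamCluster program itself: the function names, the default thresholds, the average-linkage Euclidean clustering, and the iteration cap are all assumptions, since the abstract does not specify them. The input `expr` is assumed to be a genes-by-samples NumPy array of positive expression values.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import ttest_ind


def two_classes(expr):
    """Split samples (columns of expr) into two classes, labeled 1 and 2."""
    # Assumption: average linkage on Euclidean distance; the abstract does
    # not name the linkage method or the distance metric.
    tree = linkage(expr.T, method="average", metric="euclidean")
    return fcluster(tree, t=2, criterion="maxclust")


def same_partition(a, b):
    """True if two 2-class labelings agree up to swapping the class names."""
    return np.array_equal(a, b) or np.array_equal(a, 3 - b)


def samcluster_sketch(expr, cv_threshold=0.5, p_cutoff=0.05, max_iter=20):
    """Hypothetical sketch of the iterative scheme for one CV threshold."""
    # Step 1: keep genes whose coefficient of variation exceeds the threshold.
    cv = expr.std(axis=1, ddof=1) / np.abs(expr.mean(axis=1))
    labels = two_classes(expr[cv > cv_threshold])

    for _ in range(max_iter):
        # Step 2: t-test every gene between the two putative classes.
        _, p = ttest_ind(expr[:, labels == 1], expr[:, labels == 2], axis=1)
        selected = p <= p_cutoff
        if selected.sum() < 2:
            break  # too few genes left to cluster on; keep current classes
        # Step 3: re-cluster on the selected genes and check for stability.
        new_labels = two_classes(expr[selected])
        if same_partition(labels, new_labels):
            break
        labels = new_labels
    return labels
```

Calling `samcluster_sketch(expr, cv_threshold=0.15)` returns one label (1 or 2) per sample. The final step of the scheme, building consensus classes across several CV thresholds and picking the closest putative classes, is not covered by this sketch.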

Year: 2003   PMID: 12724290   DOI: 10.1093/bioinformatics/btg095

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


  6 in total

1.  RNA Sequencing Reveals Interacting Key Determinants of Osteoarthritis Acting in Subchondral Bone and Articular Cartilage: Identification of IL11 and CHADL as Attractive Treatment Targets.

Authors:  Margo Tuerlings; Marcella van Hoolwerff; Evelyn Houtman; Eka H E D Suchiman; Nico Lakenberg; Hailiang Mei; Enrike H M J van der Linden; Rob R G H H Nelissen; Yolande Y F M Ramos; Rodrigo Coutinho de Almeida; Ingrid Meulenbelt
Journal:  Arthritis Rheumatol       Date:  2021-03-21       Impact factor: 10.995

2.  A unified computational model for revealing and predicting subtle subtypes of cancers.

Authors:  Xianwen Ren; Yong Wang; Jiguang Wang; Xiang-Sun Zhang
Journal:  BMC Bioinformatics       Date:  2012-05-01       Impact factor: 3.169

3.  Selecting genes by test statistics.

Authors:  Dechang Chen; Zhenqiu Liu; Xiaobin Ma; Dong Hua
Journal:  J Biomed Biotechnol       Date:  2005-06-30

4.  Identification and characterization of two consistent osteoarthritis subtypes by transcriptome and clinical data integration.

Authors:  Rodrigo Coutinho de Almeida; Ahmed Mahfouz; Hailiang Mei; Evelyn Houtman; Wouter den Hollander; Jamie Soul; Eka Suchiman; Nico Lakenberg; Jennifer Meessen; Kasper Huetink; Rob G H H Nelissen; Yolande F M Ramos; Marcel Reinders; Ingrid Meulenbelt
Journal:  Rheumatology (Oxford)       Date:  2021-03-02       Impact factor: 7.580

5.  BioSunMS: a plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry.

Authors:  Yuan Cao; Na Wang; Xiaomin Ying; Ailing Li; Hengsha Wang; Xuemin Zhang; Wuju Li
Journal:  BMC Med Inform Decis Mak       Date:  2009-02-17       Impact factor: 2.796

Review 6.  multiClust: An R-package for Identifying Biologically Relevant Clusters in Cancer Transcriptome Profiles.

Authors:  Nathan Lawlor; Alec Fabbri; Peiyong Guan; Joshy George; R Krishna Murthy Karuturi
Journal:  Cancer Inform       Date:  2016-06-12