
An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data.

Arthur L Hsu, Sen-Lin Tang, Saman K Halgamuge.

Abstract

MOTIVATION: Current self-organizing map (SOM) approaches to clustering gene expression patterns require the user to predefine the expected number of clusters. Hierarchical clustering methods used in this area do not provide a unique partitioning of the data. We describe an unsupervised dynamic hierarchical self-organizing approach that suggests an appropriate number of clusters and performs class discovery and marker gene identification in microarray data. During class discovery, the proposed algorithm identifies the corresponding sets of predictor genes that best distinguish each class from the others. The approach combines the merits of hierarchical clustering with the robustness against noise known from self-organizing approaches.
RESULTS: The proposed algorithm, applied to DNA microarray data sets for two types of cancer, demonstrated its ability to produce the most suitable number of clusters. Furthermore, the marker genes identified by the unsupervised algorithm have a strong biological relationship to the specific cancer class. Tested on leukemia microarray data containing three leukemia types, the algorithm found three major clusters and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, so it is labelled an uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided, and that the subdivisions are related to two of the original clusters. A second test on colon cancer microarray data automatically derived two clusters, consistent with the number of classes in the data (cancerous and normal).
AVAILABILITY: Java software implementing the dynamic SOM tree algorithm is available upon request for academic use.
SUPPLEMENTARY INFORMATION: A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf
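
The abstract does not say how predictor genes are scored once clusters have been found. As an illustration only, the sketch below ranks genes for one cluster against all others using a signal-to-noise score, a common choice in microarray class prediction that is assumed here and not taken from the paper. The function names, the toy expression values, and the cluster labels are all hypothetical.

```python
from statistics import mean, pstdev


def signal_to_noise(in_cluster, out_cluster):
    """Score one gene as (mean_in - mean_out) / (sd_in + sd_out).

    Assumption: this scoring rule is illustrative, not the paper's method.
    """
    spread = pstdev(in_cluster) + pstdev(out_cluster)
    if spread == 0:
        return 0.0  # no variation in either group: treat as uninformative
    return (mean(in_cluster) - mean(out_cluster)) / spread


def rank_marker_genes(expression, labels, target, top_n=2):
    """Return the top_n genes that best separate cluster `target` from the rest.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of cluster ids, one per sample (e.g. from a clustering step)
    """
    scores = {}
    for gene, values in expression.items():
        inside = [v for v, lab in zip(values, labels) if lab == target]
        outside = [v for v, lab in zip(values, labels) if lab != target]
        scores[gene] = signal_to_noise(inside, outside)
    # Rank by absolute score so both over- and under-expressed genes count.
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:top_n]


# Hypothetical toy data: three genes measured in six samples, two clusters.
expression = {
    "gene_a": [9.0, 8.5, 9.2, 1.0, 1.3, 0.9],  # high in cluster 0
    "gene_b": [5.0, 5.1, 4.9, 5.0, 5.2, 4.8],  # flat across clusters
    "gene_c": [1.0, 1.2, 0.8, 7.0, 7.5, 6.8],  # low in cluster 0
}
labels = [0, 0, 0, 1, 1, 1]
top = rank_marker_genes(expression, labels, target=0)
```

In this toy case `gene_a` and `gene_c` rank above `gene_b`, because their means differ strongly between the two groups relative to their spread. In practice the cluster labels would come from the clustering step, not from a hand-written list.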

Year:  2003        PMID: 14594719     DOI: 10.1093/bioinformatics/btg296

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  12 in total

1.  Gene function prediction based on genomic context clustering and discriminative learning: an application to bacteriophages.

Authors:  Jason Li; Saman K Halgamuge; Christopher I Kells; Sen-Lin Tang
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169

2.  "New" molecular taxonomy in breast cancer. [Review]

Authors:  Marta Hergueta-Redondo; José Palacios; Amparo Cano; Gema Moreno-Bueno
Journal:  Clin Transl Oncol       Date:  2008-12       Impact factor: 3.405

3.  A unified computational model for revealing and predicting subtle subtypes of cancers.

Authors:  Xianwen Ren; Yong Wang; Jiguang Wang; Xiang-Sun Zhang
Journal:  BMC Bioinformatics       Date:  2012-05-01       Impact factor: 3.169

4.  A comparison of machine learning techniques for survival prediction in breast cancer.

Authors:  Leonardo Vanneschi; Antonella Farinaccio; Giancarlo Mauri; Mauro Antoniotti; Paolo Provero; Mario Giacobini
Journal:  BioData Min       Date:  2011-05-11       Impact factor: 2.522

5.  Non-linear mapping for exploratory data analysis in functional genomics.

Authors:  Francisco Azuaje; Haiying Wang; Alban Chesneau
Journal:  BMC Bioinformatics       Date:  2005-01-20       Impact factor: 3.169

6.  The oligonucleotide frequency derived error gradient and its application to the binning of metagenome fragments.

Authors:  Isaam Saeed; Saman K Halgamuge
Journal:  BMC Genomics       Date:  2009-12-03       Impact factor: 3.969

7.  Using growing self-organising maps to improve the binning process in environmental whole-genome shotgun sequencing.

Authors:  Chon-Kit Kenneth Chan; Arthur L Hsu; Sen-Lin Tang; Saman K Halgamuge
Journal:  J Biomed Biotechnol       Date:  2008

8.  Multiclass discovery in array data.

Authors:  Yingchun Liu; Markus Ringnér
Journal:  BMC Bioinformatics       Date:  2004-06-04       Impact factor: 3.169

9.  Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

Authors:  Alexandre G de Brevern; Serge Hazout; Alain Malpertuy
Journal:  BMC Bioinformatics       Date:  2004-08-23       Impact factor: 3.169

10.  Binning sequences using very sparse labels within a metagenome.

Authors:  Chon-Kit Kenneth Chan; Arthur L Hsu; Saman K Halgamuge; Sen-Lin Tang
Journal:  BMC Bioinformatics       Date:  2008-04-28       Impact factor: 3.169
