Clustering of high-dimensional gene expression data with feature filtering methods and diffusion maps.

Rui Xu, Steven Damelin, Boaz Nadler, Donald C Wunsch.

Abstract

OBJECTIVE: The importance of gene expression data in cancer diagnosis and treatment has become widely recognized by cancer researchers in recent years. However, one of the major challenges in the computational analysis of such data is the curse of dimensionality: the number of variables measured (genes) is overwhelming relative to the small number of samples. Here, we use a two-step method to reduce the dimension of gene expression data and thereby address the problem of high dimensionality.
METHODS: First, we extract a subset of genes based on statistical characteristics of their expression levels. Then, for further dimensionality reduction, we apply diffusion maps, which interpret the eigenfunctions of Markov matrices as a system of coordinates on the original data set, to obtain an efficient representation of the geometry of the data. Finally, a neural network clustering algorithm, fuzzy ART, is applied to the resulting data to generate clusters of cancer samples.
RESULTS: Experimental results on the small round blue-cell tumor data set show that, compared with other widely used clustering algorithms such as hierarchical clustering and K-means, our proposed method can effectively identify different cancer types and generate high-quality clusters of cancer samples.
CONCLUSION: The proposed feature selection methods and diffusion maps can extract useful information from multidimensional gene expression data and prove effective at addressing the problem of high dimensionality inherent in gene expression data analysis. 2009 Elsevier B.V. All rights reserved.
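
The two dimensionality-reduction steps described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which statistical characteristics drive the gene filter, so a plain variance ranking is assumed here, and the Gaussian kernel, the median bandwidth heuristic, the synthetic data, and the function names `filter_genes_by_variance` and `diffusion_map` are all hypothetical choices made for illustration. The final fuzzy ART clustering step is not shown.

```python
import numpy as np

def filter_genes_by_variance(X, n_keep):
    """Keep the n_keep genes (columns) with the largest variance across samples.
    Assumption: variance stands in for the unspecified filtering statistic."""
    variances = X.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:n_keep])
    return X[:, keep], keep

def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Embed samples (rows of X) with eigenvectors of a Markov matrix P = D^-1 K."""
    # Pairwise squared Euclidean distances between samples.
    sq_norms = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    if epsilon is None:
        # Assumed heuristic: median of the non-zero squared distances.
        epsilon = np.median(d2[d2 > 0])
    K = np.exp(-d2 / epsilon)            # Gaussian affinity kernel
    d = K.sum(axis=1)                    # node degrees
    # S = D^-1/2 K D^-1/2 is symmetric and shares its eigenvalues with P.
    S = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]    # sort eigenpairs in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of P are D^-1/2 times the eigenvectors of S.
    psi = eigvecs / np.sqrt(d)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1) and scale by eigenvalue^t.
    idx = slice(1, n_components + 1)
    return psi[:, idx] * eigvals[idx] ** t, eigvals

# Synthetic stand-in for an expression matrix: 20 samples x 200 genes,
# where only the first 20 genes separate two groups of 10 samples.
rng = np.random.default_rng(0)
X = rng.normal(scale=0.5, size=(20, 200))
X[:10, :20] += 3.0

X_filtered, kept = filter_genes_by_variance(X, n_keep=20)
coords, eigvals = diffusion_map(X_filtered, n_components=2)
```

Each row of `coords` is a low-dimensional representation of one sample and could be passed to any clustering algorithm. Working with the symmetric matrix `S` rather than `P` directly is a common design choice because it allows a symmetric eigensolver to be used.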

Year: 2009    PMID: 19962867    DOI: 10.1016/j.artmed.2009.06.001

Source DB: PubMed    Journal: Artif Intell Med    ISSN: 0933-3657    Impact factor: 5.326


  2 in total

1.  Molecular phenotyping using networks, diffusion, and topology: soft tissue sarcoma.

Authors:  James C Mathews; Maryam Pouryahya; Caroline Moosmüller; Yannis G Kevrekidis; Joseph O Deasy; Allen Tannenbaum
Journal:  Sci Rep       Date:  2019-09-27       Impact factor: 4.379

2.  A new avenue for classification and prediction of olive cultivars using supervised and unsupervised algorithms.

Authors:  Amir H Beiki; Saba Saboor; Mansour Ebrahimi
Journal:  PLoS One       Date:  2012-09-05       Impact factor: 3.240

