
Fuzzy ensemble clustering based on random projections for DNA microarray data analysis.

Roberto Avogadri, Giorgio Valentini.

Abstract

OBJECTIVE: Two major problems in the unsupervised analysis of gene expression data are the accuracy and reliability of the discovered clusters, and the biological fact that the boundaries between classes of patients or classes of functionally related genes are sometimes not clearly defined. The main goal of this work is to explore new strategies and develop new clustering methods that improve the accuracy and robustness of clustering results, taking into account the uncertainty underlying the assignment of examples to clusters in the context of gene expression data analysis.
METHODOLOGY: We propose a fuzzy ensemble clustering approach both to improve the accuracy of clustering results and to take into account the inherent fuzziness of biological and bio-medical gene expression data. We apply random projections that obey the Johnson-Lindenstrauss lemma to obtain several lower-dimensional instances of the original high-dimensional gene expression data, approximately preserving the information and the metric structure of the original data. We then adopt a double fuzzy approach to obtain a consensus ensemble clustering: first we apply a fuzzy k-means algorithm to the different instances of the projected low-dimensional data, and then we use a fuzzy t-norm to combine the multiple clusterings. Several variants of the fuzzy ensemble clustering algorithm are proposed, according to the different techniques used to combine the base clusterings and to obtain the final consensus clustering.
RESULTS AND CONCLUSION: We applied our proposed fuzzy ensemble methods to the gene expression analysis of leukemia, lymphoma, adenocarcinoma and melanoma patients, and we compared the results with other state-of-the-art ensemble methods. Results show that in some cases, by taking into account the natural fuzziness of the data, we can improve the discovery of classes of patients defined at the bio-molecular level. The reduction of the dimension of the data, achieved through random projection techniques, is well-suited to the characteristics of high-dimensional gene expression data, and results in improved performance with respect to single fuzzy k-means and with respect to ensemble methods based on resampling techniques. Moreover, we show that the analysis of the accuracy and diversity of the base fuzzy clusterings can be useful to explain the advantages and the limitations of the proposed fuzzy ensemble approach.
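
The pipeline outlined under METHODOLOGY (random projection, fuzzy k-means on each projected instance, t-norm combination) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the Gaussian projection matrix, the Dasgupta-Gupta form of the Johnson-Lindenstrauss bound used to pick the reduced dimension, the algebraic product as the t-norm, the fuzzifier m = 2, and a final fuzzy k-means run on the rows of the averaged similarity matrix to extract a consensus. All function names (`jl_min_dim`, `random_projection`, `fuzzy_kmeans`, `fuzzy_ensemble`) are hypothetical.

```python
import numpy as np


def jl_min_dim(n_samples, eps):
    """Reduced dimension from the Dasgupta-Gupta form of the JL bound (assumed here)."""
    return int(np.ceil(4.0 * np.log(n_samples) / (eps ** 2 / 2.0 - eps ** 3 / 3.0)))


def random_projection(X, d_new, rng):
    """Project rows of X to d_new dimensions with a scaled Gaussian matrix."""
    # Scaling by 1/sqrt(d_new) preserves squared distances in expectation.
    R = rng.normal(size=(X.shape[1], d_new)) / np.sqrt(d_new)
    return X @ R


def fuzzy_kmeans(X, k, rng, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy k-means; returns a (k, n) membership matrix whose columns sum to 1."""
    n = X.shape[0]
    U = rng.random((k, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # Squared Euclidean distance of every point to every center, shape (k, n).
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U


def fuzzy_ensemble(X, k, n_proj=10, eps=0.3, seed=0):
    """Consensus clustering over several randomly projected copies of X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    d_new = min(d, jl_min_dim(n, eps))
    S = np.zeros((n, n))
    for _ in range(n_proj):
        U = fuzzy_kmeans(random_projection(X, d_new, rng), k, rng)
        # Algebraic-product t-norm of the memberships of examples i and j,
        # summed over clusters: S[i, j] += sum_c U[c, i] * U[c, j].
        S += U.T @ U
    S /= n_proj
    # Assumed consensus step: cluster the rows of the fuzzy similarity matrix.
    U_final = fuzzy_kmeans(S, k, rng)
    return U_final.argmax(axis=0), U_final, S
```

Combining the base clusterings through the pairwise similarity matrix `S` avoids having to match cluster labels across projections, since `S[i, j]` depends only on whether examples i and j tend to share a cluster. A call such as `labels, U_final, S = fuzzy_ensemble(X, k=2)` on an examples-by-genes matrix `X` returns crisp labels, the consensus memberships, and the averaged similarity matrix.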

Year:  2008        PMID: 18801650     DOI: 10.1016/j.artmed.2008.07.014

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  8 in total

1.  A new exact test for the evaluation of population pharmacokinetic and/or pharmacodynamic models using random projections.

Authors:  Celine Marielle Laffont; Didier Concordet
Journal:  Pharm Res       Date:  2011-04-14       Impact factor: 4.200

2.  Prediction of slaughter age in pigs and assessment of the predictive value of phenotypic and genetic information using random forest.

Authors:  Ahmad Alsahaf; George Azzopardi; Bart Ducro; Egiel Hanenberg; Roel F Veerkamp; Nicolai Petkov
Journal:  J Anim Sci       Date:  2018-12-03       Impact factor: 3.159

3.  Unsupervised Algorithms for Microarray Sample Stratification.

Authors:  Michele Fratello; Luca Cattelani; Antonio Federico; Alisa Pavel; Giovanni Scala; Angela Serra; Dario Greco
Journal:  Methods Mol Biol       Date:  2022

4.  Interpolation based consensus clustering for gene expression time series.

Authors:  Tai-Yu Chiu; Ting-Chieh Hsu; Chia-Cheng Yen; Jia-Shung Wang
Journal:  BMC Bioinformatics       Date:  2015-04-16       Impact factor: 3.169

5.  Clustering cancer gene expression data by projective clustering ensemble.

Authors:  Xianxue Yu; Guoxian Yu; Jun Wang
Journal:  PLoS One       Date:  2017-02-24       Impact factor: 3.240

6.  Autoencoder-based cluster ensembles for single-cell RNA-seq data analysis.

Authors:  Thomas A Geddes; Taiyun Kim; Lihao Nan; James G Burchfield; Jean Y H Yang; Dacheng Tao; Pengyi Yang
Journal:  BMC Bioinformatics       Date:  2019-12-24       Impact factor: 3.169

7.  Phenotype clustering in health care: A narrative review for clinicians. [Review]

Authors:  Tyler J Loftus; Benjamin Shickel; Jeremy A Balch; Patrick J Tighe; Kenneth L Abbott; Brian Fazzone; Erik M Anderson; Jared Rozowsky; Tezcan Ozrazgat-Baslanti; Yuanfang Ren; Scott A Berceli; William R Hogan; Philip A Efron; J Randall Moorman; Parisa Rashidi; Gilbert R Upchurch; Azra Bihorac
Journal:  Front Artif Intell       Date:  2022-08-12

8.  Ensemble-based prediction of RNA secondary structures.

Authors:  Nima Aghaeepour; Holger H Hoos
Journal:  BMC Bioinformatics       Date:  2013-04-24       Impact factor: 3.169
