Robust complementary hierarchical clustering for gene expression data analysis by β-divergence.

Md Bahadur Badsha, Md Nurul Haque Mollah, Nusrat Jahan, Hiroyuki Kurata.

Abstract

Hierarchical clustering (HC) is one of the most widely used unsupervised statistical techniques for analyzing microarray gene expression data. When HC is applied to gene expression data to cluster individuals, most HC algorithms generate clusters based on the highly differentially expressed (DE) genes that have very similar expression patterns. These highly DE genes may sometimes be irrelevant to the biological process of interest. A serious problem is that such irrelevant, highly expressed genes can drown out weakly expressed genes that have important biological functions. To overcome this problem, Nowak and Tibshirani proposed complementary hierarchical clustering (CHC) (Biostatistics, 9, 467-483, 2008). However, CHC is not robust against outlying expression values and often produces misleading results when the gene expression data are contaminated. We therefore propose the robust CHC (RCHC) method, which robustifies CHC with respect to outliers by maximizing the β-likelihood function for sequential extraction of a gene set with proper groups of individuals. The proposed method reduces to CHC as the tuning parameter β → 0. The value of β plays a key role in the performance of RCHC because it controls the tradeoff between the robustness and the efficiency of the estimators. In simulations and in the analysis of real gene expression data, RCHC is robust to data contamination, overcomes the problem of CHC, and predicts critically important genes from breast cancer data.
Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
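
The role of β described in the abstract can be illustrated on a much simpler problem than RCHC itself. The sketch below is not the authors' algorithm: it is the standard fixed-point iteration for a minimum β-divergence (maximum β-likelihood) fit of a univariate Gaussian, in which each observation is down-weighted by exp(-β d²/2), where d is its standardized distance from the current mean. The function name `beta_gaussian_fit` and the expression values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def beta_gaussian_fit(x, beta, max_iter=200, tol=1e-10):
    """Fit a univariate Gaussian by a beta-weighted fixed-point iteration.

    Illustrative sketch only (not the RCHC method). With beta = 0 every
    weight equals 1 and the result is the ordinary maximum likelihood fit
    (sample mean and variance). With beta > 0, observations far from the
    current mean receive weights close to zero.
    """
    # Start from the median so the first weights are not dominated by outliers.
    mu = statistics.median(x)
    var = max(sum((xi - mu) ** 2 for xi in x) / len(x), 1e-12)
    weights = [1.0] * len(x)
    for _ in range(max_iter):
        # beta-weight of each observation under the current estimates.
        weights = [math.exp(-0.5 * beta * (xi - mu) ** 2 / var) for xi in x]
        w_sum = sum(weights)
        new_mu = sum(w * xi for w, xi in zip(weights, x)) / w_sum
        # The (1 + beta) factor corrects the shrinkage caused by the weighting.
        new_var = (1.0 + beta) * sum(
            w * (xi - new_mu) ** 2 for w, xi in zip(weights, x)
        ) / w_sum
        new_var = max(new_var, 1e-12)  # guard against a degenerate variance
        converged = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var, weights


# Made-up expression values for one gene: eight ordinary samples, one outlier.
expression = [0.1, -0.2, 0.05, 0.3, -0.15, 0.0, 0.2, -0.1, 10.0]

mu_classical, var_classical, _ = beta_gaussian_fit(expression, beta=0.0)
mu_robust, var_robust, w_robust = beta_gaussian_fit(expression, beta=0.5)

print("beta = 0.0 mean:", mu_classical)
print("beta = 0.5 mean:", mu_robust)
print("beta = 0.5 weight of the outlier:", w_robust[-1])
```

Setting `beta=0.0` reproduces the non-robust estimate, which mirrors the statement that the proposed method reduces to CHC as β → 0. Larger β increases robustness to the outlier at the cost of statistical efficiency on clean data, which is the tradeoff the abstract attributes to the tuning parameter.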

Keywords:  DNA microarray; Gene expression; Maximum β-likelihood; Relative gene importance; Robust complementary hierarchical clustering (RCHC); Robustness; Selection procedure of β

Year: 2013   PMID: 23608734   DOI: 10.1016/j.jbiosc.2013.03.010

Source DB: PubMed   Journal: J Biosci Bioeng   ISSN: 1347-4421   Impact factor: 2.894


