Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach.

Vasyl Pihur, Susmita Datta, Somnath Datta.

Abstract

MOTIVATION: Biologists often employ clustering techniques in the explorative phase of microarray data analysis to discover relevant biological groupings. Given the availability of numerous clustering algorithms in the machine-learning literature, a user might want to select the one that performs best for their data set or application. Various validation measures have been proposed over the years to judge the quality of the clusters produced by a given clustering algorithm, including their biological relevance. Unfortunately, a given clustering algorithm can perform poorly under one validation measure while outperforming many other algorithms under another. A manual synthesis of results from multiple validation measures is nearly impossible in practice, especially when a large number of clustering algorithms are to be compared using several measures. An automated and objective way of reconciling the rankings is needed.
RESULTS: Using a Monte Carlo cross-entropy algorithm, we successfully combine the ranks of a set of clustering algorithms under consideration via a weighted aggregation that optimizes a distance criterion. The proposed weighted rank aggregation allows for a far more objective and automated assessment of clustering results than a simple visual inspection. We illustrate our procedure using one simulated and three real gene expression data sets from various platforms, where we rank a total of 11 clustering algorithms using a combined examination of 10 different validation measures. The aggregate rankings were found for a given number of clusters k and also for an entire range of k.
AVAILABILITY: R code for all validation measures and rank aggregation is available from the authors upon request.
SUPPLEMENTARY INFORMATION: Supplementary information is available at http://www.somnathdatta.org/Supp/RankCluster/supp.htm.
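
The aggregation described under RESULTS can be illustrated with a minimal sketch. This is not the authors' R code, and the abstract does not specify the distance criterion or the sampling scheme. The sketch assumes a weighted Spearman footrule distance and a position-by-item probability matrix that is updated from the best sampled orderings, which is one common form of the cross-entropy method. All function and parameter names (`footrule`, `sample_ordering`, `aggregate_ranks`, `elite_frac`, `smoothing`) are hypothetical.

```python
import random


def footrule(candidate, ranked_list, weight=1.0):
    """Weighted Spearman footrule distance between two orderings of the same items.
    (Assumed distance; the abstract only says 'a distance criterion'.)"""
    pos = {item: i for i, item in enumerate(ranked_list)}
    return weight * sum(abs(i - pos[item]) for i, item in enumerate(candidate))


def sample_ordering(prob, items, rng):
    """Draw one ordering: fill positions left to right, sampling each position
    from its row of `prob`, restricted to items not yet placed."""
    remaining = list(range(len(items)))
    order = []
    for row in prob:
        # A small constant keeps the total weight positive if a row has collapsed.
        w = [row[j] + 1e-9 for j in remaining]
        j = rng.choices(remaining, weights=w, k=1)[0]
        order.append(j)
        remaining.remove(j)
    return [items[j] for j in order]


def aggregate_ranks(ranked_lists, weights, n_samples=500, elite_frac=0.1,
                    smoothing=0.7, n_iter=50, seed=0):
    """Search for the ordering that minimizes the weighted sum of distances
    to all input lists, using a cross-entropy style Monte Carlo search."""
    rng = random.Random(seed)
    items = sorted(ranked_lists[0])
    n = len(items)
    idx = {item: j for j, item in enumerate(items)}
    # prob[i][j]: probability that item j is placed at position i (starts uniform).
    prob = [[1.0 / n] * n for _ in range(n)]
    n_elite = max(1, int(elite_frac * n_samples))
    best, best_cost = None, float("inf")
    for _ in range(n_iter):
        samples = [sample_ordering(prob, items, rng) for _ in range(n_samples)]
        scored = sorted(
            (sum(footrule(s, r, w) for r, w in zip(ranked_lists, weights)), s)
            for s in samples
        )
        if scored[0][0] < best_cost:
            best_cost, best = scored[0]
        # Move the sampling distribution toward the lowest-cost (elite) orderings.
        elite = [s for _, s in scored[:n_elite]]
        for i in range(n):
            counts = [0] * n
            for s in elite:
                counts[idx[s[i]]] += 1
            for j in range(n):
                freq = counts[j] / n_elite
                prob[i][j] = (1 - smoothing) * prob[i][j] + smoothing * freq
    return best, best_cost
```

In this sketch, each input list would be the ordering of clustering algorithms under one validation measure, and `weights` would hold one weight per measure. For example, `aggregate_ranks([["A", "B", "C"], ["B", "A", "C"]], [1.0, 0.5])` returns the best ordering found together with its total weighted distance. Equal weights reduce it to an unweighted aggregation.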

Year:  2007        PMID: 17483500     DOI: 10.1093/bioinformatics/btm158

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  38 in total

1.  Stacking models for nearly optimal link prediction in complex networks.

Authors:  Amir Ghasemian; Homa Hosseinmardi; Aram Galstyan; Edoardo M Airoldi; Aaron Clauset
Journal:  Proc Natl Acad Sci U S A       Date:  2020-09-04       Impact factor: 11.205

2.  Identifying protein complexes in PPI network using non-cooperative sequential game.

Authors:  Ujjwal Maulik; Srinka Basu; Sumanta Ray
Journal:  Sci Rep       Date:  2017-08-21       Impact factor: 4.379

3.  Multi-Response Based Personalized Treatment Selection with Data from Crossover Designs for Multiple Treatments.

Authors:  K B Kulasekera; Chathura Siriwardhana
Journal:  Commun Stat Simul Comput       Date:  2019-09-10       Impact factor: 1.118

4.  How long does the mRNA remains stable in untreated whole bovine blood?

Authors:  Rodrigo Giglioti; Bianca Tainá Azevedo; Henrique Nunes de Oliveira; Luciana Morita Katiki; Anibal Eugênio Vercesi Filho; Márcia Cristina de Sena Oliveira; Cintia Hiromi Okino
Journal:  Mol Biol Rep       Date:  2021-10-16       Impact factor: 2.316

5.  An adaptive optimal ensemble classifier via bagging and rank aggregation with applications to high dimensional data.

Authors:  Susmita Datta; Vasyl Pihur; Somnath Datta
Journal:  BMC Bioinformatics       Date:  2010-08-18       Impact factor: 3.169

6.  Stable reference genes for expression studies in breast muscle of normal and white striping-affected chickens.

Authors:  Caroline Michele Marinho Marciano; Adriana Mércia Guaratini Ibelli; Jane de Oliveira Peixoto; Igor Ricardo Savoldi; Kamilla Bleil do Carmo; Lana Teixeira Fernandes; Mônica Corrêa Ledur
Journal:  Mol Biol Rep       Date:  2019-10-03       Impact factor: 2.316

7.  Comparison of different ranking methods in protein-ligand binding site prediction.

Authors:  Jun Gao; Qi Liu; Hong Kang; Zhiwei Cao; Ruixin Zhu
Journal:  Int J Mol Sci       Date:  2012-07-16       Impact factor: 6.208

8.  A systems approach to mapping transcriptional networks controlling surfactant homeostasis.

Authors:  Yan Xu; Minlu Zhang; Yanhua Wang; Pooja Kadambi; Vrushank Dave; Long J Lu; Jeffrey A Whitsett
Journal:  BMC Genomics       Date:  2010-07-26       Impact factor: 3.969

9.  Feature Ranking and Screening for Class-Imbalanced Metabolomics Data Based on Rank Aggregation Coupled with Re-Balance.

Authors:  Guang-Hui Fu; Jia-Bao Wang; Min-Jie Zong; Lun-Zhao Yi
Journal:  Metabolites       Date:  2021-06-14

10.  Systematic analysis of the gene expression in the livers of nonalcoholic steatohepatitis: implications on potential biomarkers and molecular pathological mechanism.

Authors:  Yida Zhang; Susan S Baker; Robert D Baker; Ruixin Zhu; Lixin Zhu
Journal:  PLoS One       Date:  2012-12-26       Impact factor: 3.240
