ClusterMine: A knowledge-integrated clustering approach based on expression profiles of gene sets.

Hong-Dong Li, Yunpei Xu, Xiaoshu Zhu, Quan Liu, Gilbert S Omenn, Jianxin Wang.

Abstract

Clustering analysis of gene expression data is essential for understanding complex biological data and is widely used in important biological applications such as the identification of cell subpopulations and disease subtypes. Commonly used methods such as hierarchical clustering (HC) and consensus clustering (CC) often use the holistic expression profiles of all genes to assess the similarity between samples. While these methods have proven successful in identifying sample clusters in many areas, they do not indicate which gene sets (functions) contribute most to the clustering, which limits the interpretability of the resulting clusters. We hypothesize that integrating prior knowledge of annotated gene sets would not only achieve satisfactory clustering performance but also, more importantly, enable biological interpretation of the clusters. Here we report ClusterMine, an approach that identifies clusters by assessing functional similarity between samples, integrating known annotated gene sets from functional annotation databases such as Gene Ontology. In addition to the cluster membership of each sample, as provided by conventional approaches, it outputs the gene sets that most likely contribute to the clustering, thus facilitating biological interpretation. We compare ClusterMine with conventional approaches on nine real-world experimental datasets that represent different application scenarios in biology. We find that ClusterMine achieves better performance than the conventional approaches and that the gene sets prioritized by our method are biologically meaningful. ClusterMine is implemented as an R package and is freely available at: www.genemine.org/clustermine.php.
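
To make the general idea concrete, the following is a minimal Python sketch of one way gene-set-based clustering could be set up. It is an illustration under stated assumptions, not the published ClusterMine algorithm or its R interface: the abstract does not specify how functional similarity is computed, so this sketch assumes that samples are clustered once per gene set, that similarity is the fraction of gene sets in which two samples co-cluster, and that gene sets are ranked by their agreement (Rand index) with the final clustering. The function names, the toy data, and the gene set names are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_labels(features, k):
    """Average-linkage hierarchical clustering of rows into at most k groups."""
    tree = linkage(features, method="average", metric="euclidean")
    return fcluster(tree, t=k, criterion="maxclust")


def gene_set_clustering(expr, gene_sets, k):
    """Cluster samples via per-gene-set co-clustering (illustrative assumption).

    expr      : samples x genes expression array
    gene_sets : dict mapping a gene set name to a list of column indices
    k         : number of clusters requested
    """
    n = expr.shape[0]
    co = np.zeros((n, n))
    per_set = {}
    for name, cols in gene_sets.items():
        labels = cluster_labels(expr[:, cols], k)
        per_set[name] = labels
        # Count sample pairs that this gene set places in the same cluster.
        co += labels[:, None] == labels[None, :]
    co /= len(gene_sets)

    # Final clustering on the co-clustering distance (1 - frequency).
    dist = 1.0 - co
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    final = fcluster(tree, t=k, criterion="maxclust")

    # Score each gene set by its Rand index against the final clustering.
    upper = np.triu_indices(n, k=1)
    final_same = (final[:, None] == final[None, :])[upper]
    scores = {}
    for name, labels in per_set.items():
        set_same = (labels[:, None] == labels[None, :])[upper]
        scores[name] = float(np.mean(set_same == final_same))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return final, ranked, scores


# Toy data (hypothetical): samples 0-2 and 3-5 form two groups.
expr = np.array([
    [0, 0, 5, 5, 0, 0],
    [1, 0, 5, 6, 9, 9],
    [0, 1, 6, 5, 9, 8],
    [10, 10, 0, 0, 0, 1],
    [11, 10, 0, 1, 8, 9],
    [10, 11, 1, 0, 8, 8],
], dtype=float)
gene_sets = {"signal_1": [0, 1], "signal_2": [2, 3], "noise": [4, 5]}

final, ranked, scores = gene_set_clustering(expr, gene_sets, k=2)
print("cluster membership:", final)
print("gene sets ranked by agreement:", ranked)
```

In this toy example the two gene sets that separate the groups dominate the co-clustering matrix, and the uninformative set receives the lowest agreement score. A real analysis would take gene sets from an annotation database such as Gene Ontology and would likely use a more robust base clustering and scoring scheme.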

Keywords:  Clustering; annotated gene sets; expression profiles; functional similarity

Year:  2020        PMID: 32698720      PMCID: PMC8864677          DOI: 10.1142/S0219720020400090

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


