Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data.

Minjie Wang, Genevera I Allen.

Abstract

In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that enjoys strong empirical performance and inherits the mathematical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data view that are best for determining the groups, often leading to improved integrative clustering. To solve our problem, we develop a new type of generalized multi-block ADMM algorithm using sub-problem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data.
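The structure described in the abstract (one convex loss per data view plus a joint fusion penalty that pulls sample centroids together across all views) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the unit view weights, the uniform pairwise weights, and the exact form of the penalty (an unsquared Euclidean norm on the concatenated centroid rows, as in generic convex clustering) are assumptions made for illustration only. The sketch only evaluates an objective of this kind; it does not include the ADMM solver or the feature-selection penalty.

```python
import numpy as np

def squared_error_loss(X, U):
    # Convex loss suited to a continuous (Gaussian-like) view.
    return 0.5 * np.sum((X - U) ** 2)

def absolute_loss(X, U):
    # Convex loss suited to a heavy-tailed view (Manhattan distance).
    return np.sum(np.abs(X - U))

def integrative_objective(views, centroids, losses, gamma, weights=None):
    """Evaluate sum_k loss_k(X_k, U_k) + gamma * sum_{i<j} w_ij * ||u_i - u_j||_2,
    where u_i stacks sample i's centroid rows from every view.
    Hypothetical helper; all views must share the same n samples (rows)."""
    n = views[0].shape[0]
    if weights is None:
        weights = np.ones((n, n))  # assumption: uniform pairwise weights
    fit = sum(loss(X, U) for X, U, loss in zip(views, centroids, losses))
    U_all = np.hstack(centroids)  # joint penalty couples the views
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            penalty += weights[i, j] * np.linalg.norm(U_all[i] - U_all[j])
    return fit + gamma * penalty

# Two samples measured in two views, each with a single feature.
views = [np.array([[0.0], [2.0]]), np.array([[1.0], [1.0]])]
losses = [squared_error_loss, absolute_loss]

# No fusion: each centroid equals its own data point.
unfused = [X.copy() for X in views]
# Full fusion: every sample shares one centroid per view (column mean here).
fused = [np.tile(X.mean(axis=0), (X.shape[0], 1)) for X in views]

obj_unfused = integrative_objective(views, unfused, losses, gamma=0.5)
obj_fused = integrative_objective(views, fused, losses, gamma=0.5)
print(obj_unfused, obj_fused)
```

Because the penalty acts on the concatenated rows, two samples are either fused in every view or in none, which is what yields a common grouping across views. Increasing `gamma` makes the fused configuration progressively cheaper relative to the unfused one.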

Keywords:  Bregman divergences; GLM deviance; Integrative clustering; convex clustering; convex optimization; feature selection; sparse clustering

Year: 2021   PMID: 34744522   PMCID: PMC8570363

Source DB: PubMed   Journal: J Mach Learn Res   ISSN: 1532-4435   Impact factor: 5.177


