Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration.

Morgane Pierre-Jean, Jean-François Deleuze, Edith Le Floch, Florence Mauger.

Abstract

Recent advances in next-generation sequencing (NGS), microarrays and mass spectrometry for omics data production have enabled the generation and collection of different modalities of high-dimensional molecular data. Integrating multiple omics datasets is a statistical challenge because of the limited number of individuals, the high number of variables and the heterogeneity of the datasets to be integrated. Many tools have recently been developed to integrate omics data, including canonical correlation analysis, matrix factorization and SM. These commonly used techniques aim to analyze two or more types of omics simultaneously. In this article, we compare a panel of 13 unsupervised methods based on these different approaches to integrate various types of multi-omics datasets: iClusterPlus, regularized generalized canonical correlation analysis, sparse generalized canonical correlation analysis, multiple co-inertia analysis (MCIA), integrative-NMF (intNMF), SNF, MoCluster, mixKernel, CIMLR, LRAcluster, ConsensusClustering, PINSPlus and multi-omics factor analysis (MOFA). We evaluate the ability of the methods to recover the subgroups and the variables that drive the clustering on eight simulated benchmarks. MOFA does not provide any results on these benchmarks. For clustering, SNF, MoCluster, CIMLR, LRAcluster, ConsensusClustering and intNMF provide the best results. For variable selection, MoCluster outperforms the others. However, the performance of the methods seems to depend on the heterogeneity of the datasets (especially for MCIA, intNMF and iClusterPlus). Finally, we apply the methods to three real studies with heterogeneous data and various phenotypes. We conclude that MoCluster is the best method to analyze these omics data. Availability: An R package named CrIMMix is available on GitHub at https://github.com/CNRGH/crimmix to reproduce all the results of this article.
© The authors 2019. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.
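
The abstract states that methods are judged on how well they recover known subgroups in simulated data, but it does not name the score used. As an illustration only, the sketch below computes the adjusted Rand index (ARI), one common measure of agreement between a true partition and a predicted one. The choice of ARI, the function name `adjusted_rand_index` and the toy labels are assumptions made for this example. They are not taken from the article or from the CrIMMix package.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two labelings of the same samples.

    Hypothetical helper for illustration; not part of CrIMMix.
    1.0 means identical partitions (up to label renaming); values
    near 0 mean agreement no better than chance.
    """
    if len(truth) != len(predicted):
        raise ValueError("both labelings must cover the same samples")
    n = len(truth)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: cluster sizes and their overlaps.
    overlap = Counter(zip(truth, predicted))
    truth_sizes = Counter(truth)
    pred_sizes = Counter(predicted)

    # Number of sample pairs placed together under each view.
    together_both = sum(comb(c, 2) for c in overlap.values())
    together_truth = sum(comb(c, 2) for c in truth_sizes.values())
    together_pred = sum(comb(c, 2) for c in pred_sizes.values())

    expected = together_truth * together_pred / comb(n, 2)
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (together_both - expected) / (maximum - expected)


# Toy simulated benchmark: three true subgroups of three samples each.
true_groups = [0, 0, 0, 1, 1, 1, 2, 2, 2]
relabeled = [2, 2, 2, 0, 0, 0, 1, 1, 1]      # same partition, new names
one_mistake = [0, 0, 1, 1, 1, 1, 2, 2, 2]    # one sample misassigned
single_cluster = [0] * 9                     # no structure recovered

ari_relabeled = adjusted_rand_index(true_groups, relabeled)
ari_mistake = adjusted_rand_index(true_groups, one_mistake)
ari_single = adjusted_rand_index(true_groups, single_cluster)
```

Because the index is invariant to label renaming, it can compare the output of any clustering method against the simulated truth without first matching cluster names.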

Keywords:  benchmarks; multi-omics; performance evaluation; real data; unsupervised integrative methods

Year:  2020        PMID: 31792509     DOI: 10.1093/bib/bbz138

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  11 in total

1.  Multi-omics subtyping of hepatocellular carcinoma patients using a Bayesian network mixture model.

Authors:  Polina Suter; Eva Dazert; Jack Kuipers; Charlotte K Y Ng; Tuyana Boldanova; Michael N Hall; Markus H Heim; Niko Beerenwinkel
Journal:  PLoS Comput Biol       Date:  2022-09-06       Impact factor: 4.779

2.  Integrative clustering methods for multi-omics data.

Authors:  Xiaoyu Zhang; Zhenwei Zhou; Hanfei Xu; Ching-Ti Liu
Journal:  Wiley Interdiscip Rev Comput Stat       Date:  2021-02-07

Review 3.  A Review and a Framework of Variables for Defining and Characterizing Tinnitus Subphenotypes.

Authors:  Eleni Genitsaridi; Derek J Hoare; Theodore Kypraios; Deborah A Hall
Journal:  Brain Sci       Date:  2020-12-04

4.  DNA methylation and gene expression integration in cardiovascular disease.

Authors:  Alba Fernández-Sanlés; Roberto Elosua; Guillermo Palou-Márquez; Isaac Subirana; Lara Nonell
Journal:  Clin Epigenetics       Date:  2021-04-09       Impact factor: 6.551

5.  Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer.

Authors:  Laura Cantini; Pooya Zakeri; Celine Hernandez; Aurelien Naldi; Denis Thieffry; Elisabeth Remy; Anaïs Baudot
Journal:  Nat Commun       Date:  2021-01-05       Impact factor: 14.919

6.  PIntMF: Penalized Integrative Matrix Factorization method for Multi-omics data.

Authors:  Morgane Pierre-Jean; Florence Mauger; Jean-François Deleuze; Edith Le Floch
Journal:  Bioinformatics       Date:  2021-11-26       Impact factor: 6.937

7.  Genome-scale metabolic modeling reveals SARS-CoV-2-induced metabolic changes and antiviral targets.

Authors:  Kuoyuan Cheng; Laura Martin-Sancho; Lipika R Pal; Yuan Pu; Laura Riva; Xin Yin; Sanju Sinha; Nishanth Ulhas Nair; Sumit K Chanda; Eytan Ruppin
Journal:  Mol Syst Biol       Date:  2021-11       Impact factor: 13.068

Review 8.  Unsupervised Multi-Omics Data Integration Methods: A Comprehensive Review.

Authors:  Nasim Vahabi; George Michailidis
Journal:  Front Genet       Date:  2022-03-22       Impact factor: 4.599

9.  A benchmark study of deep learning-based multi-omics data fusion methods for cancer.

Authors:  Dongjin Leng; Linyi Zheng; Yuqi Wen; Yunhao Zhang; Lianlian Wu; Jing Wang; Meihong Wang; Zhongnan Zhang; Song He; Xiaochen Bo
Journal:  Genome Biol       Date:  2022-08-09       Impact factor: 17.906

10.  Exploration of the Immunotyping Landscape and Immune Infiltration-Related Prognostic Markers in Ovarian Cancer Patients.

Authors:  Na Zhao; Yujuan Xing; Yanfang Hu; Hao Chang
Journal:  Front Oncol       Date:  2022-07-08       Impact factor: 5.738
