
Unsupervised multiple kernel learning for heterogeneous data integration.

Jérôme Mariette, Nathalie Villa-Vialaneix.

Abstract

Motivation: Recent advances in high-throughput sequencing have expanded the breadth of available omics datasets, and the integrated analysis of multiple datasets obtained on the same samples has yielded important insights in a wide range of applications. However, integrating various sources of information remains a challenge for systems biology because the datasets produced are often of heterogeneous types, which calls for generic methods that take their different specificities into account.
Results: We propose a multiple kernel framework that integrates multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets collected during the TARA Oceans expedition were explored to demonstrate that our method is able to retrieve previous findings in a single kernel PCA and to provide a new picture of the sample structure when a larger number of datasets is included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA with respect to the original data. Second, the multi-omics breast cancer datasets provided by The Cancer Genome Atlas are analysed using a kernel self-organizing map with both single-omics and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method for improving the representation of the studied biological system.
Availability and implementation: The proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package, and a tutorial describing the approach can be found on the mixOmics website at http://mixomics.org/mixkernel/.
Contact: jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr.
Supplementary information: Supplementary data are available at Bioinformatics online.
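
As a rough illustration of the meta-kernel idea described in the Results, the sketch below builds one kernel per dataset on the same samples, rescales each so that no dataset dominates, and combines them into a single kernel that a kernel PCA or a kernel self-organizing map could then take as input. It is written in plain Python and is not the mixKernel API. The function names, the toy data, the linear kernel, and the uniform weights are all assumptions made for illustration; the framework described in the abstract learns the combination rather than fixing it.

```python
import math

def linear_kernel(X):
    """Gram matrix K[i][j] = <x_i, x_j> for the rows (samples) of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def cosine_normalize(K):
    """Rescale K so that every sample has unit self-similarity.

    Assumes no sample is an all-zero vector (K[i][i] > 0).
    """
    n = len(K)
    d = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]

def combine_kernels(kernels, weights=None):
    """Convex combination of kernels computed on the same samples.

    Uniform weights are a naive baseline chosen here for illustration,
    not one of the learned solutions mentioned in the abstract.
    """
    m = len(kernels)
    if weights is None:
        weights = [1.0 / m] * m
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def center_kernel(K):
    """Double-center K, the usual preprocessing step before kernel PCA."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]  # equals column means for symmetric K
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

# Two hypothetical data blocks measured on the same three samples,
# with different numbers of variables per block.
block_a = [[1.0, 0.0, 2.0], [0.5, 1.0, 1.5], [3.0, 2.0, 0.0]]
block_b = [[10.0, 1.0], [9.0, 2.0], [1.0, 8.0]]

kernels = [cosine_normalize(linear_kernel(b)) for b in (block_a, block_b)]
meta_kernel = combine_kernels(kernels)
centered = center_kernel(meta_kernel)
```

The resulting `centered` matrix is what an eigendecomposition would operate on in a kernel PCA. Because each input kernel is normalized to unit diagonal and the weights sum to 1, the combined kernel also has unit diagonal.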

Year: 2018    PMID: 29077792    DOI: 10.1093/bioinformatics/btx682

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


