
A multi-omics data simulator for complex disease studies and its application to evaluate multi-omics data analysis methods for disease classification.

Ren-Hua Chung, Chen-Yu Kang.

Abstract

BACKGROUND: An integrative multi-omics analysis approach that combines multiple types of omics data, including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and microbiomics, has become increasingly popular for understanding the pathophysiology of complex diseases. Although many multi-omics analysis methods have been developed for complex disease studies, only a few simulation tools are available that simulate multiple types of omics data and model their relationships with disease status, and these tools are limited in how they simulate multi-omics data.
RESULTS: We developed the multi-omics data simulator OmicsSIMLA, which simulates genomics (i.e., single-nucleotide polymorphisms [SNPs] and copy number variations), epigenomics (i.e., bisulphite sequencing), transcriptomics (i.e., RNA sequencing), and proteomics (i.e., normalized reverse phase protein array) data at the whole-genome level. Furthermore, the relationships between different types of omics data, such as methylation quantitative trait loci (SNPs influencing methylation), expression quantitative trait loci (SNPs influencing gene expression), and expression quantitative trait methylations (methylations influencing gene expression), were modeled. More importantly, the relationships between these multi-omics data and the disease status were modeled as well. We used OmicsSIMLA to simulate a multi-omics dataset for breast cancer under a hypothetical disease model and used the data to compare the performance among existing multi-omics analysis methods in terms of disease classification accuracy and runtime. We also used OmicsSIMLA to simulate a multi-omics dataset with a scale similar to an ovarian cancer multi-omics dataset. The neural network-based multi-omics analysis method ATHENA was applied to both the real and simulated data and the results were compared. Our results demonstrated that complex disease mechanisms can be simulated by OmicsSIMLA, and ATHENA showed the highest prediction accuracy when the effects of multi-omics features (e.g., SNPs, copy number variations, and gene expression levels) on the disease were strong. Furthermore, similar results can be obtained from ATHENA when analyzing the simulated and real ovarian multi-omics data.
CONCLUSIONS: OmicsSIMLA will be useful for evaluating the performance of different multi-omics analysis methods. OmicsSIMLA can also be used to calculate sample sizes and power when planning a new multi-omics disease study.
© The Author(s) 2019. Published by Oxford University Press.
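
The chain of relationships described in RESULTS (SNPs influencing methylation and expression, methylation influencing expression, and omics features influencing disease status) can be illustrated with a minimal toy simulation. This is a sketch under stated assumptions, not OmicsSIMLA's actual model or interface: the function names, the effect-size parameters, the single-SNP/single-gene setup, and the logistic disease model are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_subject(rng, maf=0.3, meqtl_effect=-0.15, eqtl_effect=0.5,
                     eqtm_effect=-2.0, disease_effect=1.0, baseline=-1.0):
    """Simulate one subject (all parameters are illustrative assumptions)."""
    # Genotype: count of minor alleles (0, 1, or 2), two independent draws.
    genotype = sum(rng.random() < maf for _ in range(2))
    # meQTL: the genotype shifts the methylation proportion, clamped to [0, 1].
    methylation = 0.6 + meqtl_effect * genotype + rng.gauss(0, 0.05)
    methylation = min(1.0, max(0.0, methylation))
    # eQTL and eQTM: genotype and methylation both shift the expression level.
    expression = (eqtl_effect * genotype
                  + eqtm_effect * (methylation - 0.6)
                  + rng.gauss(0, 1))
    # Disease status: Bernoulli draw from a logistic model on expression.
    risk = 1.0 / (1.0 + math.exp(-(baseline + disease_effect * expression)))
    affected = int(rng.random() < risk)
    return {"genotype": genotype, "methylation": methylation,
            "expression": expression, "affected": affected}


def simulate_cohort(n, seed=0, **effects):
    """Simulate n independent subjects with a reproducible random seed."""
    rng = random.Random(seed)
    return [simulate_subject(rng, **effects) for _ in range(n)]


cohort = simulate_cohort(1000, seed=1)
```

Repeating such a simulation over many replicates and counting how often a given analysis method detects the simulated effect is one simple way to estimate power for a candidate sample size, in the spirit of the use described in CONCLUSIONS.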

Keywords:  complex disease study; multi-omics data; simulation tool

Year:  2019        PMID: 31029063      PMCID: PMC6486474          DOI: 10.1093/gigascience/giz045

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


