Douglas Arneson, Le Shu, Brandon Tsai, Rio Barrere-Cain, Christine Sun, Xia Yang.
Abstract
Elucidating the mechanisms of complex diseases such as cardiovascular disease (CVD) remains a significant challenge due to multidimensional alterations at molecular, cellular, tissue, and organ levels. To better understand CVD and offer insights into the underlying mechanisms and potential therapeutic strategies, data from multiple omics types (genomics, epigenomics, transcriptomics, metabolomics, proteomics, microbiomics) from both humans and model organisms have become available. However, individual omics data types capture only a fraction of the molecular mechanisms. To address this challenge, there have been numerous efforts to develop integrative genomics methods that can leverage multidimensional information from diverse data types to derive comprehensive molecular insights. In this review, we summarize recent methodological advances in multidimensional omics integration, exemplify their applications in cardiovascular research, and pinpoint challenges and future directions in this incipient field.
Keywords: cardiovascular disease; epigenomics; genomics; integrative genomics; metabolomics; multidimensional omics integration; proteomics; transcriptomics
Year: 2017 PMID: 28289683 PMCID: PMC5327355 DOI: 10.3389/fcvm.2017.00008
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Figure 1. Summary of different omics data types and multidimensional data integration methods. Cardiovascular disease (CVD) involves various omics spaces and complex inter-omics interactions. To discover accurate biomarkers and disentangle disease mechanisms of CVD, multidimensional data integration methods are available, broadly categorized into clustering/dimensionality reduction-based approaches, predictive modeling approaches, pairwise omics data integration, network-based approaches, and composite approaches integrating multiple modeling approaches.
Comparison of multidimensional data integration methodologies discussed in the manuscript.
| Method category | Brief description | Advantages | Limitations | Representative tools |
|---|---|---|---|---|
| Clustering/dimensionality reduction-based approaches | Transform data into common space through graph or kernel-based methods | Easy to implement using common statistical techniques; retain within-data properties; robust to different units of measurements and different data sets from the public domain | Cross-data interaction may be altered; application limited to visual overview of data and detection of subpopulations | Clustering-based: iCluster; dimensionality reduction: Biofilter |
| Predictive modeling approaches | Machine learning based methodologies to predict prognosis or diagnosis and discover biomarkers | High predictive power; versatile methodologies; data-driven approach (does not require preexisting knowledge of omics interaction) | Overfitting issue; can require high computational power; does not integrate biological knowledge; higher accuracy requires larger data sets | Camelot |
| Pairwise omics data integration | Centered on interaction information between pairs of omics data | Easy to implement; reflects inter-omics interaction; causal implication | Available data dominated by expression quantitative trait loci (eQTLs); low robustness of | MERLIN |
| Network-based approaches | Reduce data complexity by converging multi-omics information onto networks | Networks can accommodate multiple layers of data; intuitive depiction and visualization of regulatory circuits | Computationally expensive; difficult to model feedback loops in multidimensional space | Weighted gene coexpression network analysis |
| Composite approaches | Flexible integration of multiple integration models | Flexibility and adaptability to diverse research needs | Few well-acknowledged frameworks available | Analysis Tool for Heritable and Environmental Network Associations |
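
To make the network-based row of the table above more concrete, the following is a minimal sketch of how features from two omics layers measured on the same samples might be converged onto a single correlation network. It is not the algorithm of any tool named in the table: the function names (`pearson`, `build_network`), the hard correlation threshold, and the toy data with its feature names are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (sx * sy)


def build_network(layers, threshold=0.8):
    """Return edges (node_a, node_b, r) with |r| >= threshold.

    `layers` maps a layer name to {feature name: values across samples}.
    Every feature must be measured on the same samples, in the same order.
    Each node is a (layer, feature) tuple, so cross-layer edges stay visible.
    """
    # Flatten all layers into one node list, tagging each node with its layer.
    nodes = [
        ((layer, feat), values)
        for layer, feats in layers.items()
        for feat, values in feats.items()
    ]
    edges = []
    for (a, va), (b, vb) in combinations(nodes, 2):
        r = pearson(va, vb)
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Toy, made-up data (hypothetical feature names): 5 samples, two omics layers.
toy_layers = {
    "transcript": {
        "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
        "geneB": [2.0, 1.0, 4.0, 3.0, 5.0],
    },
    "metabolite": {
        "metX": [2.1, 3.9, 6.2, 8.0, 9.8],
        "metY": [5.0, 3.0, 4.0, 1.0, 2.0],
    },
}

edges = build_network(toy_layers, threshold=0.9)
for a, b, r in edges:
    print(a, b, round(r, 3))
```

A hard threshold is used here only because it is the simplest choice to read. The all-pairs loop also hints at the "computationally expensive" limitation listed in the table, since the number of correlations grows quadratically with the number of features.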