Chao Xu, Ji-Gang Zhang, Dongdong Lin, Lan Zhang, Hui Shen, Hong-Wen Deng.
Abstract
Integrating diverse genomics data can provide a global view of the complex biological processes related to complex human diseases. Although substantial efforts have been made to integrate different omics data, multi-omics integration methods face at least three challenges: (i) how to simultaneously consider the effects of various genomic factors, since these factors jointly influence phenotypes; (ii) how to effectively incorporate information from publicly accessible databases and omics datasets to fully capture the interactions among (epi)genomic factors from diverse omics data; and (iii) how to combine more than two omics datasets, which has so far been poorly explored. Current integration approaches are not sufficient to address all of these challenges together. We propose a novel integrative analysis framework that incorporates sparse modeling, multivariate analysis, Gaussian graphical models, and network analysis to address these three challenges simultaneously. Based on this strategy, we performed a systemic analysis of glioblastoma multiforme (GBM), integrating genome-wide gene expression, DNA methylation, and miRNA expression data. We identified three regulatory modules of genomic factors associated with GBM survival time and revealed a global regulatory pattern for GBM by combining the three modules with respect to their common regulatory factors. Our method can not only identify disease-associated dysregulated genomic factors from different omics, but, more importantly, it can also incorporate information from publicly accessible databases and omics datasets to infer a comprehensive interaction map of all these dysregulated genomic factors. Our work represents an innovative approach to enhance our understanding of the molecular genomic mechanisms underlying complex human diseases.
Keywords: glioblastoma multiforme; integrative analysis; multi-omics data; network analysis; sparse modeling
Year: 2017 PMID: 28500050 PMCID: PMC5499134 DOI: 10.1534/g3.117.042408
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Workflow of the integrative analysis of multi-omics data.
Figure 2. Venn diagram of the samples with different omics data. Methylation ∩ miRNA = 261; methylation ∩ mRNA = 265; miRNA ∩ mRNA = 504.
Figure 3. Heat map of correlations between and within coexpression modules constructed by WGCNA. Each row/column represents a gene. Each cell is the absolute value of the correlation coefficient between two genes. The intensity of the red coloring indicates the strength of the correlation between a pair of genes, with green corresponding to low correlation. The independent modules appear as isolated boxes along the diagonal.
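The cell values described in the Figure 3 caption (absolute correlation between every pair of genes) can be illustrated with a minimal sketch. This is not the authors' pipeline and not WGCNA itself, which additionally applies soft thresholding and a topological overlap measure; it only shows how one |r| matrix of the kind shown in the heat map could be computed. The function names and the toy expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def abs_correlation_matrix(expression):
    """Return (genes, matrix) where matrix[i][j] = |r| between gene i and gene j.

    `expression` maps a gene name to its expression values across samples
    (hypothetical input layout, assumed for this sketch).
    """
    genes = list(expression)
    matrix = [
        [abs(pearson(expression[g], expression[h])) for h in genes]
        for g in genes
    ]
    return genes, matrix


# Hypothetical toy data: four genes measured in four samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # falls as geneA rises
    "geneD": [1.0, -1.0, -1.0, 1.0],  # no linear relation to geneA
}
genes, matrix = abs_correlation_matrix(toy)
```

Because the absolute value is taken, strongly anti-correlated genes (geneA and geneC here) get the same high cell value as strongly co-expressed ones (geneA and geneB), while unrelated pairs (geneA and geneD) fall near zero.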
miRNAs and methylation sites identified for the three modules by the SPLS model
| Module | mRNAs | miRNAs | Methylation sites |
|---|---|---|---|
| Module 1 | 50 | 46 | 353 |
| Module 2 | 28 | 11 | 125 |
| Module 3 | 18 | 33 | 174 |
Figure 4. Interaction modules incorporating information on miRNAs and methylation sites. Each rectangle represents a gene; each circle represents an miRNA. Green rectangles are genes with cis effects, with brighter green indicating a higher cis effect. The pink dashed edges indicate miRNA–gene interactions annotated in previously mentioned miRNA databases. The green solid edges are gene–gene connections resulting from PCST.
Figure 5. Combined regulatory network of the three identified modules. Each rectangle represents a gene; each circle represents an miRNA. The colors of the rectangles/circles indicate whether a node belongs to a single module or to an overlap among the three modules. The pink dashed edges indicate miRNA–gene interactions annotated in previously mentioned miRNA databases. The green solid edges are gene–gene connections resulting from PCST.