Anna Pačínková, Vlad Popovici.
Abstract
BACKGROUND: Integration of multi-omics data can provide a more complex view of a biological system consisting of different interconnected molecular components, a crucial aspect for developing novel personalised therapeutic strategies for complex diseases. Various tools have been developed to integrate multi-omics data. However, an efficient multi-omics framework for regulatory network inference at the genome level that incorporates prior knowledge is still to emerge.
Keywords: Bayesian networks; Integrative analysis; Knowledge discovery; Multimodal omics; Regulatory networks
Year: 2022 PMID: 35996085 PMCID: PMC9396869 DOI: 10.1186/s12859-022-04891-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 IntOMICS framework. The IntOMICS framework takes as input (i) gene expression matrix GE with samples and genes, (ii) the associated copy number variation matrix ( x ), (iii) the associated DNA methylation matrix of beta-values ( x ) sampled from the same individuals, and (iv) the biological prior knowledge matrix ( x ) with information on known interactions among molecular features. An automatically tuned MCMC algorithm [30] estimates parameters and empirical biological knowledge. A conventional MCMC algorithm with an additional Markov blanket resampling step is used to infer the resulting regulatory network structure, which consists of three types of nodes: GE nodes (highlighted in green) refer to gene expression levels, CNV nodes (highlighted in blue) refer to copy number variations, and METH nodes (highlighted in red) refer to DNA methylation. Edge weight represents the empirical frequency of a given edge over samples of network structures.
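The edge weight described in the caption, the empirical frequency of an edge over sampled network structures, can be illustrated with a minimal sketch. This is not the IntOMICS implementation: the function name `edge_weights`, the representation of each sample as a set of directed edges, and the node names are all assumptions made for illustration.

```python
from collections import Counter


def edge_weights(sampled_networks):
    """Return, for each directed edge, the fraction of samples containing it.

    `sampled_networks` is a list of edge collections, one per sampled network
    structure (a hypothetical representation, not the package's own).
    """
    if not sampled_networks:
        raise ValueError("at least one sampled network is required")
    counts = Counter()
    for edges in sampled_networks:
        counts.update(set(edges))  # count each edge at most once per sample
    n = len(sampled_networks)
    return {edge: c / n for edge, c in counts.items()}


# Hypothetical sampled structures; node names are illustrative only.
samples = [
    {("CNV_A", "GE_A"), ("GE_A", "GE_B")},
    {("CNV_A", "GE_A")},
    {("CNV_A", "GE_A"), ("METH_B", "GE_B")},
    {("CNV_A", "GE_A"), ("GE_A", "GE_B")},
]
weights = edge_weights(samples)
```

In this toy input, the edge from `CNV_A` to `GE_A` appears in every sample and therefore receives weight 1.0, while edges seen in fewer samples receive proportionally lower weights.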
Empirical biological knowledge estimation
| Edge | Operation | Frequency | Candidate | Acceptance |
|---|---|---|---|---|
| Add | Accepted | | | |
| | Rejected | | | |
| Delete | Accepted | | | |
| | Rejected | | | |
| Reverse | Accepted | | | |
| | Rejected | | | |
Assuming there is no prior knowledge about the direct interaction from node i to node j, the empirical biological matrix is estimated, with
Summary of all used data sets
| Dataset | Investigated gene set | Samples | Details |
|---|---|---|---|
| GSE127960 ZIC5 WT | 16GE / 0CNV / 0METH | 2 | HCT116 colon cancer cell line |
| GSE127960 ZIC5 KO | 16GE / 0CNV / 0METH | 4 | HCT116 colon cancer cell line, ZIC5 knockout |
| TCGA-COAD MSS | 24GE / 24CNV / 4METH | 27 | Primary tumour stage II/III with MSS phenotype |
| TCGA-COAD MSI | 24GE / 24CNV / 3METH | 69 | Primary tumour with MSI phenotype |
| TCGA-COAD NAT | 24GE / 24CNV / 16METH | 19 | Histologically normal tissue adjacent to the tumour |
| DREAM4 1-5 | 10GE / 0CNV / 0METH | 1 | Five independent in silico networks to assess the consistency of prediction |
| TCGA-AML | 25GE / 25CNV / 25METH | 173 | Acute myeloid leukemia |
| PETACC-3 MSS | 39GE / 23CNV / 0METH | 176 | Primary tumour stage III with MSS phenotype |
MSS microsatellite stability, MSI microsatellite instability, NAT histologically normal tissue adjacent to the tumour, GE gene expression, CNV copy number variation, METH methylation probe, KO knockout, WT wild-type, AML acute myeloid leukemia
Number of samples in PETACC-3 clinical trial according to the treatment and relapse-free survival
| | 5FU/FA | FOLFIRI |
|---|---|---|
| Low-RFS | 31 | 34 |
| High-RFS | 53 | 58 |
5FU/FA fluorouracil/leucovorin, FOLFIRI fluorouracil/leucovorin with irinotecan, RFS relapse-free survival
Fig. 2 Performance comparison of IntOMICS and the W&H algorithm [4] using real datasets. The receiver-operating characteristic curve (as a function of the edge weights) serves as the main performance index. The gold standard Wnt signalling pathway is used as the ground truth. NAT histologically normal tissue adjacent to the tumour; MSI microsatellite instability; WT wild-type; KO knockout
Fig. 3 Performance comparison of IntOMICS and the W&H algorithm [4] using an in silico gene expression dataset. The receiver-operating characteristic curve (as a function of the edge weights) with 95% confidence interval serves as the main performance index
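A receiver-operating characteristic curve computed as a function of edge weights, as in Figs. 2 and 3, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' evaluation code: the function names `roc_points` and `auc`, the edge names, the weights, and the gold-standard set are all hypothetical.

```python
def roc_points(weights, gold_edges, candidate_edges):
    """Sweep a threshold over edge weights and return (FPR, TPR) points.

    Edges missing from `weights` are treated as having weight 0.0.
    """
    pos = sum(1 for e in candidate_edges if e in gold_edges)
    neg = len(candidate_edges) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one true and one false candidate edge")
    scored = sorted(candidate_edges, key=lambda e: weights.get(e, 0.0), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(scored):
        w = weights.get(scored[i], 0.0)
        # Edges tied at the same weight cross the threshold together.
        while i < len(scored) and weights.get(scored[i], 0.0) == w:
            if scored[i] in gold_edges:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical inferred weights and gold standard, for illustration only.
candidates = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
inferred = {("a", "b"): 0.9, ("a", "c"): 0.7, ("b", "c"): 0.6, ("c", "d"): 0.1}
gold = {("a", "b"), ("b", "c")}
points = roc_points(inferred, gold, candidates)
area = auc(points)
```

Lowering the threshold admits edges in order of decreasing weight, so each point on the curve corresponds to one cut-off on the empirical edge frequency.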
Fig. 4 The intersection of interactions between features originating from distinct omics modalities identified by RACER, KiMONo, and IntOMICS
Genes with ABCG2 direct interaction and the confidence of the regulation (w) determined by IntOMICS
| Gene | 5FU/FA Low-RFS | 5FU/FA High-RFS | FOLFIRI Low-RFS | FOLFIRI High-RFS |
|---|---|---|---|---|
| ELK1 | High | Med | Low | Med |
| HRAS | Low | Med | Low | High |
| MAP3K1 | High | Med | Low | NA |
| MAP2K1 | Med | High | NA | Low |
| MAPK1/ERK2 | Low | Low | Med | High |
| MAPK3/ERK1 | High | Low | Low | NA |
| MRAS | Low | Low | High | Low |
| MYC | High | Med | Med | Med |
| MYCN | Low | Low | Low | High |
| RAF1 | Med | NA | Low | High |
| RPS6KA3 | High | NA | NA | Low |
| ABCG2 CNV | High | High | High | High |
Genes with the highest predictive potential are highlighted in bold. Low, Med, and High denote quantile ranges of all edge weights (empirical edge frequencies); NA, the edge was not identified
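Mapping an edge weight to a Low/Med/High label by quantiles of all edge weights could look like the sketch below. The quantile boundaries used in the source are not recoverable from this record, so the tercile cut points here are placeholders chosen purely for illustration, and the function names `tercile_cuts` and `confidence_label` are hypothetical.

```python
import statistics


def tercile_cuts(all_weights):
    """Return two cut points splitting the weights into three quantile bins.

    Terciles are an ASSUMPTION for illustration; the source's actual
    quantile boundaries are not given in this record.
    """
    low_cut, high_cut = statistics.quantiles(all_weights, n=3)
    return low_cut, high_cut


def confidence_label(weight, low_cut, high_cut):
    """Label one edge weight; None means the edge was not identified."""
    if weight is None:
        return "NA"
    if weight <= low_cut:
        return "Low"
    if weight <= high_cut:
        return "Med"
    return "High"


# Hypothetical edge weights, for illustration only.
all_weights = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
low_cut, high_cut = tercile_cuts(all_weights)
labels = [confidence_label(w, low_cut, high_cut) for w in (0.1, 0.5, 0.9, None)]
```

Passing the cut points as arguments keeps the labelling step independent of how the boundaries are chosen, so other quantile definitions can be substituted without changing `confidence_label`.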
Spearman’s correlation coefficient between ABCG2 GE and MYCN GE in MSS stage III colon cancer and corresponding p-value
| Treatment and survival | Spearman's ρ | p-value |
|---|---|---|
| 5FU/FA low-RFS | −0.17 | 0.35 |
| 5FU/FA high-RFS | −0.11 | 0.41 |
| FOLFIRI low-RFS | −0.01 | 0.96 |
| FOLFIRI high-RFS | 0.23 | 0.09 |
ρ Spearman’s correlation coefficient, 5FU/FA fluorouracil/leucovorin, FOLFIRI fluorouracil/leucovorin with irinotecan, RFS relapse-free survival
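For readers unfamiliar with the statistic in the table above, Spearman's correlation coefficient is the Pearson correlation of the ranks of the two variables. The following is a minimal sketch with made-up values; it does not use the study data, does not compute the p-value, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average position of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up expression values for two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman_rho(gene_a, gene_b)
```

In practice a statistics library would normally be used so that the coefficient and its p-value are obtained together; SciPy's `scipy.stats.spearmanr` is one such routine.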