Hiromi W L Koh, Damian Fermin, Christine Vogel, Kwok Pui Choi, Rob M Ewing, Hyungwon Choi.
Abstract
Computational tools for multiomics data integration have usually been designed for unsupervised detection of multiomics features explaining large phenotypic variations. To achieve this, some approaches extract latent signals in heterogeneous data sets from a joint statistical error model, while others use biological networks to propagate differential expression signals and find consensus signatures. However, few approaches directly treat molecular interaction, the essential linker between different omics data sets, as a data feature. The increasing availability of genome-scale interactome data connecting different molecular levels motivates a new class of methods to extract interactive signals from multiomics data. Here we developed iOmicsPASS, a tool to search for predictive subnetworks consisting of molecular interactions within and between related omics data types in a supervised analysis setting. Based on user-provided network data and relevant omics data sets, iOmicsPASS computes a score for each molecular interaction and applies a modified nearest shrunken centroid algorithm to the scores to select densely connected subnetworks that can accurately predict each phenotypic group. iOmicsPASS detects a sparse set of predictive molecular interactions without loss of prediction accuracy compared to alternative methods, and the selected network signature immediately provides a mechanistic interpretation of the multiomics profile representing each sample group. Extensive simulation studies demonstrate a clear benefit of interaction-level modeling. iOmicsPASS analysis of TCGA/CPTAC breast cancer data also highlights a new transcriptional regulatory network underlying the basal-like subtype as positive protein markers, a result not seen through analysis of individual omics data.
Keywords: Computational biology and bioinformatics; Systems biology
Year: 2019 PMID: 31312515 PMCID: PMC6616462 DOI: 10.1038/s41540-019-0099-y
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
Fig. 1 (a) iOmicsPASS workflow. iOmicsPASS takes multiomics data, biological network data, and sample meta information as input. The omics data sets are integrated via interaction scores for all interactions in the network. The subnetwork discovery module identifies the subnetwork signatures distinguishing phenotypic groups, and the pathway enrichment module reports associated biological processes. The software also produces a set of text files containing the details of the selected subnetworks and the materials for visualizing the networks in the Cytoscape software. (b) Each omics data set is first standardized into Z-scores and converted to interaction scores over the network. Two transcription factors (TFs; gene 1 and gene 3) and their common target gene (gene 2) are shown as an example. Interaction scores are computed for the protein-protein interaction (PPI) between protein 1 and protein 3, and for the TF regulation between each of the two TF proteins and the mRNA of their target gene. (c) The resulting interaction scores are used as input to select the predictive edges for phenotypic groups using the modified nearest shrunken centroid algorithm.
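The scoring step in panel (b) can be illustrated with a minimal sketch. The caption states only that each data set is standardized into Z-scores and then converted to interaction scores; the rule used here (an edge score equal to the sum of the two endpoint Z-scores in each sample) is an assumption made for illustration, and the function names and toy values are hypothetical rather than taken from the iOmicsPASS software.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one molecule across samples (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def interaction_scores(z_a, z_b):
    """ASSUMPTION: edge score = sum of the endpoint Z-scores, per sample."""
    return [a + b for a, b in zip(z_a, z_b)]


# Hypothetical toy data: three samples per molecule.
protein = {"gene1": [1.0, 2.0, 3.0], "gene3": [2.0, 4.0, 6.0]}
mrna = {"gene2": [10.0, 10.0, 13.0]}

z_protein = {g: z_scores(v) for g, v in protein.items()}
z_mrna = {g: z_scores(v) for g, v in mrna.items()}

# The three edges of the example in panel (b): one PPI between the two TF
# proteins, and one TF-regulation edge from each TF protein to the target mRNA.
edges = {
    ("gene1", "gene3", "PPI"): interaction_scores(
        z_protein["gene1"], z_protein["gene3"]
    ),
    ("gene1", "gene2", "TF"): interaction_scores(
        z_protein["gene1"], z_mrna["gene2"]
    ),
    ("gene3", "gene2", "TF"): interaction_scores(
        z_protein["gene3"], z_mrna["gene2"]
    ),
}

for edge, scores in edges.items():
    print(edge, [round(s, 3) for s in scores])
```

Each edge ends up with one score per sample, so the edge-by-sample matrix can be passed to a classifier in place of the original molecule-by-sample matrices.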
Fig. 2 (a) Simulation results using the NSC algorithm applied to the concatenated data (black lines), the NSC algorithm applied to the interaction scores (blue lines), and the modified NSC algorithm applied to the interaction scores in iOmicsPASS (red lines). Six different parameters determining the levels of signal and noise were used to simulate data based on a biological network sampled from a real TF and PPI network. (b) Area under the curve (AUC) of the three approaches, each represented by one colored line in (a), using three simulation setups at assay sensitivity values of 0.7 and 0.8.
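For readers unfamiliar with the nearest shrunken centroid (NSC) idea compared in this figure, the following is a simplified sketch of the generic technique, not the modified algorithm in iOmicsPASS. It assumes a plain soft-thresholding of class-centroid offsets and omits the per-feature standardization used in the standard NSC formulation; all names and the toy data are hypothetical.

```python
from statistics import mean


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; offsets within delta of zero become 0."""
    if d > delta:
        return d - delta
    if d < -delta:
        return d + delta
    return 0.0


def shrunken_centroids(X, y, delta):
    """X: samples as lists of feature scores; y: class labels.

    Returns the overall centroid and, per class, the shrunken offsets
    of the class centroid from the overall centroid.
    """
    n_feat = len(X[0])
    overall = [mean(row[j] for row in X) for j in range(n_feat)]
    offsets = {}
    for k in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == k]
        offsets[k] = [
            soft_threshold(mean(r[j] for r in rows) - overall[j], delta)
            for j in range(n_feat)
        ]
    return overall, offsets


def predict(x, overall, offsets):
    """Assign x to the class whose shrunken centroid is nearest."""
    def dist(k):
        return sum(
            (x[j] - (overall[j] + offsets[k][j])) ** 2 for j in range(len(x))
        )
    return min(offsets, key=dist)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[2.0, 0.1], [2.2, -0.1], [-2.0, 0.0], [-2.2, 0.0]]
y = ["A", "A", "B", "B"]
overall, offsets = shrunken_centroids(X, y, delta=0.5)

# A feature is selected if its offset survives shrinkage in at least one class.
selected = [j for j in range(2) if any(offsets[k][j] != 0.0 for k in offsets)]
print(selected, predict([1.5, 0.3], overall, offsets))
```

Raising `delta` zeroes out more offsets, which is how this family of methods yields a sparse feature set; in the setting described here the features would be interaction scores rather than individual molecules.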
Fig. 3 (a) Heatmap of the interaction scores for the union of all four subnetworks in the BRCA data. The cyan color bar on the right side highlights the subtype-specific subnetworks. (b) Heatmap of statistical significance scores for the pathway enrichment in the subtype-specific subnetworks. The significance score was calculated as minus the logarithm (base 10) of the Benjamini–Hochberg adjusted p-value. For downregulated pathways, i.e., those enriched with genes or proteins with lower interaction scores, we multiplied the significance score by −1 to make it negative. Red and blue represent the direction of interaction scores (positive and negative, respectively).
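The signed significance score described in panel (b) can be sketched as follows. The Benjamini–Hochberg step-up adjustment is the standard procedure; the pathway names, p-values, and direction flags are hypothetical values chosen for illustration, not results from the study.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def signed_significance(adj_p, downregulated):
    """-log10(adjusted p), multiplied by -1 for downregulated pathways."""
    score = -math.log10(adj_p)
    return -score if downregulated else score


# Hypothetical enrichment results: (pathway, raw p-value, downregulated?).
results = [
    ("pathway_A", 0.01, False),
    ("pathway_B", 0.04, True),
    ("pathway_C", 0.03, False),
    ("pathway_D", 0.50, True),
]

adj = benjamini_hochberg([p for _, p, _ in results])
scores = {
    name: signed_significance(q, down)
    for (name, _, down), q in zip(results, adj)
}
for name, s in scores.items():
    print(name, round(s, 3))
```

With this convention, the sign of a heatmap cell carries the direction of enrichment and its magnitude carries the strength of evidence after multiple-testing correction.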
Fig. 4 Subnetwork signatures predictive of the four intrinsic subtypes of BRCA, illustrated in Cytoscape. Red and blue lines (edges) are interactions (TF regulation or PPI) with higher and lower interaction scores, respectively, compared to the overall centroid, i.e., the average profile in the data set. In each subtype-specific network, the proteins are indicated by cyan-colored circles and the mRNAs by green-colored triangles. Gray-colored nodes are not part of the predictive subnetwork in a given subtype. Yellow-colored nodes indicate hub proteins of the subnetworks for each breast cancer subtype.