Abstract
Cancers evolve from normal tissues and share an endogenous regulatory realm distinct from that of normal human tissues. Uncovering this endogenous realm is challenging because biological data are heterogeneous. This study analyzes petabyte-level data to reveal the endogenous regulatory networks of normal and cancerous tissues and then identifies the most important endogenous regulators in each realm. In the normal realm, proteins dominate and trans-regulate their targets across chromosomes, and ribosomal proteins serve as the most important drivers. In the cancerous realm, however, noncoding RNAs dominate, and pseudogenes work as the most important regulators. These pseudogenes cis-regulate their neighbors: they primarily regulate targets within 1 million base pairs, but they rarely regulate their cognates with complementary sequences, contrary to what has been thought. Therefore, two distinct mechanisms rule the normal and cancerous realms separately, with noncoding RNAs, rather than proteins as currently conceptualized, endogenously regulating cancers. This establishes a fundamental avenue for understanding the basis of cancerous and normal physiology.
Keywords: Big data; Cancer; Endogenous; FINET; Noncoding RNA; Pseudogene; Regulatory network; Systems
Year: 2022 PMID: 35521545 PMCID: PMC9062140 DOI: 10.1016/j.csbj.2022.04.015
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1 Gene regulatory networks endogenous in cancers and normal humans. A, The workflow of this study. B-C, Complete gene regulatory networks endogenous in normal (B) and cancer (C). The nodes (genes) and edges (interactions) were grouped into 5 gene category sets: protein (light green), antisense (blue), lincRNA (pink), p_pseudogene (red), and the rest (other, light blue). Interaction domains shift from protein interactions in normal (B) to noncoding interactions in cancer (C). D, An example sub-network, PTEN interacting with PTENP1 in the cancer network, directly extracted from our network database. Network annotation follows these 4 points: 1) Node color denotes gene category: lightGreen, blue, pink, red, and lightSkyBlue respectively denote protein_coding, antisense RNA, lincRNA, processed pseudogene, and other. 2) Edge color represents regulation strength: red, pink, blue, lightSkyBlue, and lightGray respectively represent strong positive, middle positive, strong negative, middle negative, and weak (positive or negative) regulation. 3) Edge thickness denotes confidence: the thicker the edge, the more confident the interaction. 4) The edge arrow denotes the regulatory direction, from a regulator to a target. E, Overall distribution of regulators and their targets in normal and cancer. The top 5 most abundant categories are shown. F, The target distribution of three categories of noncoding RNAs (antisense, lincRNA, and p-pseudogene) in normal (n_) and cancer (c_). Targets were counted separately for each of the three noncoding RNA categories acting as regulators. Self denotes targets in the same category as their regulators. For example, c_self antisense represents cancerous antisense RNAs that were targeted by cancerous antisense RNAs. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 2 Network module compositions and normal network centrality. A, The module composition differences between the normal and cancer networks. B, Composition of the top 1000 nodes by centrality in the normal network. C, Top 20 nodes by centrality in the normal network. D-E, First neighbors of RPS23 in normal (D) and cancer (E).
Fig. 3 Cancer network centrality. A, Composition of the top 1000 nodes by centrality in the cancer network. B, Proportion of inducers and repressors among the top 1000 nodes by centrality in the cancer network. C, Composition of the inducers shown in B. D, Top 20 nodes by centrality in the cancer network. E, ENSG00000250144.1 has more interactions in cancer than in normal. Note that the gene symbol differs between the two annotation versions, as labeled in the figure and in our database.
Fig. 4 Top 300 strongest inducers and repressors in the normal and cancer networks. A-D, The composition of the strongest inducers and repressors in the normal and cancer networks. A, The top 300 strongest inducers (left) and their targets (right) in the normal network. B, The top 300 strongest cancerous inducers and their targets. C, The top 300 strongest repressors and their targets in the normal network. D, The top 300 strongest cancerous repressors and their targets. E-F, Clinical data for the top cancerous inducers. E, Comparison of hazard ratio (HR) between proteins, p-pseudogenes, and noncoding RNAs among the top 300 strongest inducers in the cancer network built in the present study. F, HR profiling of the top 480 deadliest inducers directly extracted from Cox proportional-hazards model analysis of all TCGA RNAseq data [18]. P-values (above the line) were calculated by t-test.
Fig. 5 Target distance distribution. A-B, lincRNA regulatory network in normal (A) and cancer (B). These two networks were grouped by chromosome to show crosstalk between chromosomes. The chromosome 14 section (circled in B) is detailed in Fig. S4. LincRNAs trans-regulate their targets in normal (A) but cis-regulate their targets in cancer (B). C-D, Target location distribution of the most abundant gene categories (>10%) in normal (C) and cancer (D). OutChro represents targets located outside the regulator's chromosome, and M denotes million bp within the chromosome. E, The percentage of cognates and non-cognates targeted by antisense RNAs and pseudogenes in normal and cancer. Non denotes non-cognate.
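
The target location categories in Fig. 5C-D can be illustrated with a minimal Python sketch. It is not the study's own code: the function names, the input tuple layout, and the 10M and 100M bin edges are assumptions made here for illustration. Only the OutChro label and the 1 million bp cis window come from the text above.

```python
from collections import Counter

MBP = 1_000_000  # one million base pairs


def classify_target(reg_chrom, reg_pos, tgt_chrom, tgt_pos, bins_mbp=(1, 10, 100)):
    """Label a regulator-target pair by genomic distance.

    bins_mbp holds hypothetical upper bin edges in millions of bp;
    only the 1M edge is taken from the source text.
    """
    # A target on a different chromosome is outside the regulator's chromosome.
    if reg_chrom != tgt_chrom:
        return "OutChro"
    distance = abs(tgt_pos - reg_pos)
    # Return the first (smallest) bin whose upper edge covers the distance.
    for limit in bins_mbp:
        if distance <= limit * MBP:
            return f"<={limit}M"
    return f">{bins_mbp[-1]}M"


def target_distribution(pairs):
    """Return the fraction of regulator-target pairs in each location label."""
    counts = Counter(classify_target(*pair) for pair in pairs)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical pairs: (regulator chrom, regulator pos, target chrom, target pos).
example_pairs = [
    ("chr9", 33_000_000, "chr9", 33_400_000),   # same chromosome, within 1M
    ("chr9", 33_000_000, "chr10", 87_000_000),  # different chromosome
    ("chr14", 20_000_000, "chr14", 25_000_000), # same chromosome, within 10M
    ("chr14", 20_000_000, "chr14", 20_100_000), # same chromosome, within 1M
]
distribution = target_distribution(example_pairs)
```

Under this scheme, a network dominated by cis-regulation would put most of its pairs in the `<=1M` bin, while a network dominated by trans-regulation would put most of them in `OutChro`.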