YongKiat Wee, Yining Liu, Jiachun Lu, Xiaoyan Li, Min Zhao.
Abstract
Prognosis identifies the seriousness of a cancer and the patient's chances of survival. However, it remains a challenge to identify the key cancer genes in prognostic studies. In this study, we collected 2064 genes that were related to prognostic studies by using gene expression measurements curated from the published literature. Among them, 1820 genes were associated with copy number variations (CNVs). Further functional enrichment analysis of 889 genes with frequent copy number gains (CNGs) revealed that these genes were significantly associated with cancer pathways including regulation of cell cycle, cell differentiation and the mitogen-activated protein kinase (MAPK) cascade. We further conducted integrative analyses of CNVs and the expression of their target genes using data from matched tumour samples of The Cancer Genome Atlas (TCGA). Ultimately, 95 key prognosis-related genes were extracted, with concordant CNG events and up-regulation in at least 300 tumour samples. These genes, and the number of samples in which they were found, included: ACTL6A (399), ATP6V1C1 (425), EBAG9 (412), FADD (308), MTDH (377), and SENP5 (304). This study provides the first observation of CNV in prognosis-related genes in a pan-cancer setting. The systematic concordance between CNG and up-regulation of gene expression in these novel prognosis-related genes may indicate their prognostic significance.
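The concordance filter described in the abstract can be illustrated with a minimal Python sketch. It assumes that each gene/sample pair has already been reduced to two booleans (copy number gain called, expression up-regulated); how those calls are made is not stated here, and the function name `concordant_genes` and the input layout are hypothetical. Only the threshold of at least 300 samples comes from the abstract.

```python
from collections import defaultdict

def concordant_genes(calls, min_samples=300):
    """Count, per gene, the samples showing both a CNG and up-regulation.

    calls: iterable of (gene, sample_id, has_gain, is_upregulated) tuples.
    The boolean calls are assumed to be precomputed upstream (hypothetical input).
    Returns {gene: n_concordant_samples} for genes reaching min_samples.
    """
    samples = defaultdict(set)
    for gene, sample_id, has_gain, is_upregulated in calls:
        # A sample is concordant only when both events occur in that same sample.
        if has_gain and is_upregulated:
            samples[gene].add(sample_id)
    return {g: len(s) for g, s in samples.items() if len(s) >= min_samples}
```

With real data the default `min_samples=300` would apply; a smaller value is convenient for toy inputs.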
Year: 2018 PMID: 29459674 PMCID: PMC5818516 DOI: 10.1038/s41598-018-21691-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Pipeline for discovering concordant copy number gain and up-regulation in novel prognosis-related genes across cancer types, gene enrichment analysis of 889 genes with frequent copy number gains (CNGs), and mutational landscape of 95 genes with consistent CNGs and up-regulation. (A) Flowchart of the pipeline for finding novel prognosis-related genes whose copy number gains are consistent with their corresponding gene expression. The work is divided into several steps: identifying short descriptions containing both cancer and prognosis keywords, [(prognosis OR prognostic) AND (cancer OR tumour OR carcinoma)], in the GeneRIF (Gene Reference Into Function) database; and manually curating the data from the published literature to extract the corresponding human gene names. (B) 2309 genes from different studies (each with a unique PubMed ID) were extracted from the literature database, and 2064 genes related to prognostic studies were identified; a set of 1820 genes was associated with CNVs; a total of 1050 prognosis-related genes were identified with frequent CNGs based on the cut-off (ratio of Gain/Loss > 2), and 277 genes were associated with CNLs (ratio of Loss/Gain > 2); 889 genes were observed as frequent CNGs with more than 30 TCGA samples carrying CNGs; lastly, 95 genes were identified with consistent CNGs and over-expression in the same TCGA samples. (C) Gene enrichment analysis of 889 prognosis-related genes with concordant copy number gains (CNGs). The scatterplot presents the summarized GO terms of all 889 prognosis-related genes with CNGs. Circles show the GO clusters and are plotted in two-dimensional space according to their semantic similarity to other GO terms. The y-axis shows the similarity of the GO terms; the x-axis indicates the log of the corrected P-value (bubbles of right corrected P-values are larger); circle colour is directly proportional to the frequency of the GO term in the Gene Ontology Annotation (GOA) database. (D) A general pan-cancer overview of copy number variation (CNV) in the 95 prognosis-related genes whose up-regulated expression is conceivably caused by copy number gains (CNGs). The y-axis shows the alteration frequency in percentage (including both amplification and deletion); the x-axis indicates the cancer types. Blue: deletion; red: amplification.
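Two of the filtering steps in this caption, the keyword query and the gain/loss ratio cut-off, lend themselves to a short Python sketch. This is an illustration under stated assumptions rather than the authors' code: whole-word, case-insensitive matching is assumed for the query (so plurals such as "tumours" would not match), the function names are hypothetical, and the treatment of genes with zero losses is a choice made here. The ratio cut-off of 2 and the requirement of more than 30 samples with gains are taken from the caption.

```python
import re

# [(prognosis OR prognostic) AND (cancer OR tumour OR carcinoma)]
PROGNOSIS = re.compile(r"\b(prognosis|prognostic)\b", re.IGNORECASE)
CANCER = re.compile(r"\b(cancer|tumour|carcinoma)\b", re.IGNORECASE)

def matches_query(text):
    """True if a GeneRIF-style description contains both keyword groups."""
    return bool(PROGNOSIS.search(text)) and bool(CANCER.search(text))

def classify_cnv(gain, loss, ratio_cutoff=2.0):
    """Label a gene by its counts of samples with gains and with losses.

    Comparing gain > cutoff * loss is equivalent to gain / loss > cutoff
    and avoids dividing by zero when loss == 0 (an assumption of this sketch).
    """
    if gain > ratio_cutoff * loss:
        return "gain"
    if loss > ratio_cutoff * gain:
        return "loss"
    return "neither"

def is_frequent_gain(gain, loss, min_gain_samples=30, ratio_cutoff=2.0):
    """Gain-dominated gene with more than min_gain_samples samples with gains."""
    return classify_cnv(gain, loss, ratio_cutoff) == "gain" and gain > min_gain_samples
```

For example, a gene with gains in 90 samples and losses in 30 is labelled as a gain (90/30 = 3 > 2) and also passes the sample-count filter, while a gene with 25 gains and 5 losses has a high ratio but too few samples.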
Figure 2. Sample-based mutational and network analysis for the eight potential cross-cancer prognosis-related genes with high amplification rates. (A) Sample-based mutational patterns for the eight genes in three different cancer sample sets: TCGA esophageal carcinoma, TCGA ovarian serous cystadenocarcinoma and TCGA esophagus-stomach cancers. Columns indicate samples and rows indicate genes. The colour bar represents genomic alterations such as CNVs and somatic mutations. Different mutational types are marked in different colours: red and blue show amplification and deletion, respectively, and grey indicates no mutation in the sample. The percentage represents the alteration frequency for each gene. (B) The network of the common eight genes with high amplification rates. The network represents the molecular function-based relationships between these eight genes and the novel linker genes in cancer development. Yellow circles represent prognosis-related genes and blue circles indicate linker genes. (C) A pan-cancer global view of copy number variation (CNV) features for these common eight genes, whose increased expression is potentially induced by copy number gains (CNGs). The y-axis shows the alteration frequency in percentage (including both amplification and deletion); the x-axis indicates the cancer types. Blue: deletion; red: amplification.
Figure 3. A pan-cancer view of copy number variation (CNV) distribution in three novel prognosis-related genes, BIRC5 (A), ERBB2 (B) and EZH2 (C), and their corresponding CNV mutational landscape. The y-axis shows the mutation frequency in percentage (including both amplification and deletion); the x-axis indicates the cancer types. Blue: deletion; red: amplification.
Figure 4. Expression analysis of three up-regulated novel prognosis-related genes with CNGs, BIRC5, ERBB2 and EZH2, and their survival curves. Plots were derived from cBioPortal based on Kaplan-Meier analysis. The blue line indicates lower expression and the red line indicates higher expression. (A) The expression level of BIRC5 in TCGA sarcoma. (B) The expression level of ERBB2 in TCGA uterine corpus endometrial carcinoma. (C) The expression level of EZH2 in TCGA ovarian serous cystadenocarcinoma. (D) Overall survival analysis of BIRC5 in TCGA sarcoma. (E) Overall survival analysis of ERBB2 in TCGA uterine corpus endometrial carcinoma. (F) Overall survival analysis of EZH2 in TCGA ovarian serous cystadenocarcinoma.
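For readers unfamiliar with the Kaplan-Meier analysis behind panels (D) to (F), the estimator can be sketched in a few lines of Python. This is a generic textbook product-limit estimator, not cBioPortal's implementation; the function name and input layout are hypothetical. At each distinct event time the running survival probability is multiplied by (1 - deaths / number at risk), and censored patients leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up duration for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve
```

To compare groups as in the figure, one would compute a separate curve for the higher-expression and the lower-expression patients; how that split is defined is not stated in the caption.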