Gongyu Tang, Minsu Cho, Xiaowei Wang.
Abstract
Large-scale multi-omics datasets, most prominently from the TCGA consortium, have been made available to the public for systematic characterization of human cancers. However, to date, there is a lack of corresponding online resources that use these valuable data to study gene expression dysregulation and viral infection, two major causes of cancer development and progression. To address these unmet needs, we established OncoDB, an online database resource to explore abnormal patterns in gene expression as well as viral infection that are correlated with clinical features in cancer. Specifically, OncoDB integrated RNA-seq, DNA methylation, and related clinical data from over 10 000 cancer patients in the TCGA study as well as from normal tissues in the GTEx study. Another unique aspect of OncoDB is its focus on oncoviruses. By mining TCGA RNA-seq data, we have identified six major oncoviruses across cancer types and further correlated viral infection with changes in host gene expression and clinical outcomes. All the analysis results are presented together in OncoDB with a flexible web interface to search for data related to RNA expression, DNA methylation, viral infection, and clinical features of the cancer patients. OncoDB is freely accessible at http://oncodb.org.
Year: 2022 PMID: 34718715 PMCID: PMC8728272 DOI: 10.1093/nar/gkab970
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Workflow for the RNA-seq alignment pipeline to summarize human and viral sequencing reads.
Figure 2. Examples to demonstrate features of the Expression Analysis, Methylation Analysis, and Clinical Analysis modules. (A) A boxplot to compare RNA expression level of CCT3 in liver tumor versus normal samples. CCT3 was recently reported as a biomarker for liver cancer (17). (B) A scatter plot to compare RNA expression level of two genes (MITF and ZEB2) in breast tumor samples. Previous research indicates these two genes have highly correlated expression profiles (18). (C) A line graph to compare the DNA methylation level of CCND2 in breast tumor versus normal samples. Previous research reported that CCND2 is hypermethylated in breast cancer (19). (D) A boxplot for subgroup comparison of CCT3 RNA expression, stratified by pathological M stage in liver tumor samples. (E) A survival plot to evaluate the prognostic significance of CCT3 RNA expression in liver tumor samples.
Figure 3. Examples to demonstrate features of the Oncovirus Analysis module. (A) A boxplot to compare RNA expression level of CDKN2A in HPV-positive versus HPV-negative cervix tumor samples. CDKN2A (commonly known as p16) is a well-established marker for HPV infection (20). (B) A line graph to compare the DNA methylation level of CDKN2A in HPV-positive versus HPV-negative cervix tumor samples. (C) A survival plot to evaluate the prognostic significance of CDKN2A expression in HPV-positive cervix tumor samples only. (D) A survival plot to evaluate the prognostic significance of HPV infection in cervix tumor samples.