Deyou Tang, Bingrui Li, Tianyi Xu, Ruifeng Hu, Daqiang Tan, Xiaofeng Song, Peilin Jia, Zhongming Zhao.
Abstract
Virus integration into the human genome occurs frequently and represents a key driving event in human disease. Many studies have reported viral integration sites (VISs) proximal to structural or functional regions of the human genome. Here, we systematically collected and manually curated all VISs reported in the literature and publicly available data resources to construct the Viral Integration Site DataBase (VISDB, https://bioinfo.uth.edu/VISDB). Genomic information, including target genes, nearby genes, nearest transcription start site, chromosome fragile sites, CpG islands, viral sequences and target sequences, was integrated to annotate VISs. We further curated VIS-involved oncogenes and tumor suppressor genes, virus-host interactions involving non-coding RNA (ncRNA), and target gene and microRNA expression in five cancers, among others. Moreover, we developed tools to visualize single integration events, VIS clusters, DNA elements proximal to VISs and virus-host interactions involving ncRNA. The current version of VISDB contains a total of 77 632 integration sites of five DNA viruses and four RNA retroviruses. VISDB is currently the only active comprehensive VIS database, and it provides broad usability for the study of disease, virus-related pathophysiology, virus biology, host-pathogen interactions, sequence motif discovery and pattern recognition, and molecular evolution and adaptation, among others.
Year: 2020 PMID: 31598702 PMCID: PMC6943068 DOI: 10.1093/nar/gkz867
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of VISDB. Data were collected from three sources (literature, biological DB and curated Bio-DB) and curated into three categories. The core part is the identification and curation of the VISs. Various functions are implemented in VISDB.
Summary of VISDB and comparison with Dr.VIS 2.0 and HPVbase [a]
| Statistic | HPVbase | Dr.VIS 2.0 | VISDB |
| --- | --- | --- | --- |
| # total VISs (# viruses) | 1257 (1) | 3340 (8) | 77 632 (9) |
| # diseases | 8 | 25 | 27 |
| # samples | NA | 3339 | 2880 |
| # experimental methods | 42 | NA | 50 |
| # publications | 59 | 64 | 108 |
| # junction sequences | NA | 551 | 2577 |
| # target genes/# VISs | 481/713 | 1153 | 15 064/49 525 |
| # nearby genes/# VISs | NA | 406 | 11 643/28 132 |
| # nearest CFSs/# VISs | 98/395 | NA | 123/25 193 |
| | | | |
| # virus sequences | | | 33 066 |
| # target sequences | | | 76 290 |
| # transcripts mapped with VIS’s flanking sequences | | | 16 353 |
| # CpG islands mapped with VIS’s flanking sequences | | | 14 595 |
| # miRNAs mapped with VIS’s flanking sequences | | | 207 |
| # genes with expression/# VISs | | | 3594/8610 |
| # miRNAs with expression [b]/# VISs | | | 227/3092 |
| | | | |
| # oncogenes/# VISs | | | 451/2768 |
| # tumor suppressor genes/# VISs | | | 680/3707 |
| # pathway genes/# VISs | | | 105/624 |
| | | | |
| # oncogenes/# VISs | | | 314/940 |
| # tumor suppressor genes/# VISs | | | 450/1362 |
| | | | |
| # VISs correlated with lncRNA-associated interaction | | | 83 |
| # VISs correlated with miRNA-associated interaction | | | 26 414 |
[a] Two virus integration datasets/databases, RID and HIRIS, are not included in the comparison because no such statistics are available from their publications or websites.
[b] We include only the miRNAs whose target genes map to a VIS and have expression data in TCGA.
Figure 2. Main features of the VISDB web content. (A) Homepage. (B) Basic search. (C) Advanced search. (D) Browse function for VISs on the Virus web page. The page shows the HBV information, including reference, positions in the genome, disease and method. At the top, users can view pathway and GO analysis results. (E) Browsing VIS data by cluster using the ‘Clustered VIS browse’ submenu in the Browse menu.
Figure 3. The web page showing detailed VIS analysis results. (A) Visualization of a single integration event. (B) Visualization of DNA elements flanking a VIS. (C) Gene and miRNA expression analysis using The Cancer Genome Atlas (TCGA) data. (D) Detailed description of VIS. Only selected parts of the page are shown due to space limitations.
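
The abstract mentions annotating each VIS with its nearest transcription start site (TSS). The record does not describe how VISDB computes this, so the following is only a minimal sketch of one common approach: a binary search over TSS positions sorted per chromosome. The `TSS` type, the function names, the gene labels and all coordinates are hypothetical and are not taken from VISDB.

```python
from bisect import bisect_left
from typing import NamedTuple


class TSS(NamedTuple):
    """A transcription start site (hypothetical record layout)."""
    chrom: str
    pos: int
    gene: str


def build_index(tss_list):
    """Group TSSs by chromosome, each group sorted by position."""
    index = {}
    for tss in sorted(tss_list, key=lambda t: (t.chrom, t.pos)):
        index.setdefault(tss.chrom, []).append(tss)
    return index


def nearest_tss(index, chrom, pos):
    """Return (gene, distance) of the TSS closest to a VIS, or None."""
    sites = index.get(chrom)
    if not sites:
        return None
    positions = [t.pos for t in sites]
    i = bisect_left(positions, pos)
    # Only the TSS just left and just right of the VIS can be nearest.
    candidates = sites[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda t: abs(t.pos - pos))
    return best.gene, abs(best.pos - pos)


# Made-up coordinates, for illustration only.
index = build_index([
    TSS("chr1", 1000, "GENE_A"),
    TSS("chr1", 5000, "GENE_B"),
    TSS("chr2", 200, "GENE_C"),
])
print(nearest_tss(index, "chr1", 1200))
```

For annotating many sites, the sorted position lists would normally be precomputed once per chromosome rather than rebuilt on every lookup as this sketch does.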