Song Cao1, Michael C Wendl1,2,3, Matthew A Wyczalkowski1, Kristine Wylie1,4, Kai Ye1,2, Reyka Jayasinghe1,5, Mingchao Xie1,5, Song Wu1, Beifang Niu1, Robert Grubb6, Kimberly J Johnson7, Hiram Gay8, Ken Chen9, Janet S Rader10, John F Dipersio5,8, Feng Chen5,8, Li Ding1,2,5,8.
Abstract
We applied a newly developed bioinformatics system called VirusScan to investigate the viral basis of 6,813 human tumors and 559 adjacent normal samples across 23 cancer types and identified 505 virus-positive samples with distinctive, organ system- and cancer type-specific distributions. We found that herpes viruses (e.g., subtypes HHV4, HHV5, and HHV6) that are highly prevalent across cancers of the digestive tract showed significantly higher abundances in tumor versus adjacent normal samples, supporting their association with these cancers. We also found three HPV16-positive samples in brain lower grade glioma (LGG). Further, recurrent HBV integration at the KMT2B locus is present in three liver tumors, but absent in their matched adjacent normal samples, indicating that viral integration-induced host driver genetic alterations are required on top of viral oncogene expression for initiation and progression of liver hepatocellular carcinoma. Notably, viral integrations were found in many genes, including novel recurrent HPV integrations at PTPN13 in cervical cancer. Finally, we observed a set of HHV4 and HBV variants strongly associated with ethnic groups, likely due to viral sequence evolution under environmental influences. These findings provide important new insights into viral roles in tumor initiation and progression and potential new therapeutic targets.
Year: 2016 PMID: 27339696 PMCID: PMC4919655 DOI: 10.1038/srep28294
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The detected frequency of various viruses across 23 cancer types, classified across (A) 10 different organ systems and (B) 12 different HNSC anatomic sites. The top number in (A) is the total number of samples in the given cancer type and the bottom number in (B) is the number of samples by anatomic site in head and neck cancers. In (B), circle area is proportional to frequency. The full name of each cancer type in (A) can be found in Materials and Methods.
Figure 2. (A) Comparison of virus abundance in 81 tumor-adjacent normal pairs across 15 cancer types, with inclusion based on either the tumor or normal tissue showing a virus signature (RPHM ≥ 5). Red and blue denote tumor and normal samples, respectively, with cancer types represented by the standard color palette. Virus abundance is quantified by a white (zero) to red (maximum) continuum. (B) Comparison of HBV integration sites for the six tumor-adjacent normal pairs in which both tumor and normal samples have HBV abundance RPHM ≥ 100. The color bar shows the number of discordant read pairs (DRP) in log10 scale. Two recurrent sites, FN1 and KMT2B, are found in adjacent normal and tumor samples, respectively. (C) Comparison of HBV gene expression in the six tumor-normal pairs. For a direct comparison, we use the normalized depth (see Methods) to quantify the gene expression.
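The inclusion rule in Figure 2(A) can be illustrated with a minimal sketch. It assumes RPHM means viral reads per hundred million sequenced reads, which this record does not define; the function names, field names, and read counts are hypothetical and are not taken from VirusScan.

```python
def rphm(viral_reads, total_reads):
    """Viral reads per hundred million sequenced reads (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads * 1e8 / total_reads


def select_pairs(pairs, threshold=5.0):
    """Keep tumor/normal pairs where either sample reaches the threshold."""
    return [
        p for p in pairs
        if max(p["tumor_rphm"], p["normal_rphm"]) >= threshold
    ]


# Made-up read counts, for illustration only.
pairs = [
    {"id": "pair-1",
     "tumor_rphm": rphm(12, 150_000_000),
     "normal_rphm": rphm(0, 120_000_000)},
    {"id": "pair-2",
     "tumor_rphm": rphm(3, 100_000_000),
     "normal_rphm": rphm(2, 100_000_000)},
]
selected = select_pairs(pairs)
print([p["id"] for p in selected])
```

Under these assumptions a pair is retained when either member shows a virus signature, so a virus seen only in the adjacent normal sample still enters the comparison.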
Figure 3. (A) The histogram of HHV4- and HHV5-positive samples across different cancer types. Comparison of depths for two different viruses: (B) HHV4, and (C) HHV5, across the entire virus genome. X-axis is genomic position and y-axis is the calculated sequencing depth.
Figure 4. (A) Genes with recurrent virus integrations in LIHC (Green), HNSC (Purple), and CESC (Orange). The gray box indicates sample (x-axis) with virus integration in the specific gene (y-axis). (B) Differences in RPKM between case and the mean value of controls (without viral infection) for various exons in the longest transcripts of four important genes (ERBB2, PTPN13, KMT2B, TERT) with recurrent virus integrations. Circle area is proportional to −log10 of the difference p-value. The x coordinate of the circle’s center represents the mid-point of each exon, which is ordered from the left to right for positive strand gene and the right to left for negative strand gene. PTPN13, KMT2B and ERBB2 are positive strand genes and TERT is a negative strand gene. Different samples are marked by different colors.
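The plotting conventions described for Figure 4(B) can be sketched as follows. This is an illustration only: the `scale` factor, the function names, and the exon coordinates are hypothetical, and the authors' actual plotting code is not given in this record.

```python
import math


def circle_area(p_value, scale=10.0):
    """Area proportional to -log10(p); `scale` is an arbitrary display factor."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return scale * -math.log10(p_value)


def exon_midpoints(exons, strand):
    """Return exon mid-points in transcript order.

    exons: list of (start, end) genomic coordinates.
    strand: "+" orders left to right, "-" orders right to left.
    """
    mids = [(start + end) / 2 for start, end in exons]
    return sorted(mids, reverse=(strand == "-"))


# Made-up exon coordinates, for illustration only.
exons = [(100, 200), (300, 400), (500, 700)]
print(exon_midpoints(exons, "+"))
print(exon_midpoints(exons, "-"))
print(circle_area(0.01))
```

Reversing the order for negative-strand genes such as TERT keeps the first exon of each transcript on the same side of every panel.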
Figure 5. Unsupervised clustering (A,B) and phylogenetic analyses (C,D) based on HHV4 variants and HBV variants from PHYLIP. All variants have >10X coverage across all samples. In (A,B), the x-axis and y-axis are sample ID and virus genomic coordinates, respectively. The suffix “11A” in the sample IDs is an abbreviation of adjacent normal samples following TCGA convention.