Sagarika Banerjee1, Tian Tian2, Zhi Wei2, Kristen N Peck1, Natalie Shih3, Ara A Chalian1, Bert W O'Malley1, Gregory S Weinstein1, Michael D Feldman3, James Alwine4, Erle S Robertson5.
Abstract
The microbiome is one of the most distinctive organs in the human body. Dysbiosis can trigger critical inflammatory responses and lead to pathogenesis that contributes to neoplastic events. We used a pan-pathogen array technology (PathoChip) coupled with next-generation sequencing to establish microbial signatures unique to human oral and oropharyngeal squamous cell carcinomas (OCSCC/OPSCC). Signatures for DNA and RNA viruses, including oncogenic viruses, gram-positive and gram-negative bacteria, fungi and parasites were detected. Cluster and topological analyses identified 2 distinct groups of microbial signatures related to OCSCCs/OPSCCs. Results were validated by probe capture next-generation sequencing, which also provided a comprehensive map of integration sites and chromosomal hotspots for micro-organism genomic insertions. Identification of these microbial signatures and their integration sites may provide biomarkers for OCSCC/OPSCC diagnosis and prognosis as well as novel avenues for study of their potential role in OCSCCs/OPSCCs.
Year: 2017 PMID: 28642609 PMCID: PMC5481414 DOI: 10.1038/s41598-017-03466-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Viral signatures detected in oral cancer and control samples. (a) Viral signatures detected with hybridization signal (g–r > 30) by PathoChip screen of 100 oral cancer samples, ranked by decreasing hybridization signal (weighted score sum of all the probes per accession) and prevalence. (b and c) Hybridization signals and prevalence for the viral signatures detected in matched (MC) and non-matched (NC) controls, respectively, ranked in descending order. (d) Association of different molecular signatures of viral families with cancer and controls, represented as a Venn diagram and as colored bars. (e) Heat map of hybridization signals detected by PathoChip screen of the HPV probes (y-axis) with the oral cancer and control samples (x-axis). The hybridization signals of the cancer samples to each of these probes were compared to MCs and NCs. Samples were screened individually or in pools (marked with a ▪). (f) Percentage of HPV16 probes detected with low (g–r > 30–300), medium (g–r > 300–3000) and high (g–r > 3000) hybridization signal in 100 oral cancer samples screened individually and in pools (▪) and 20 each of MCs and NCs screened in pools of 5.
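The three intensity bands in panel (f) can be sketched as a simple binning step. This is a minimal Python illustration and not the authors' pipeline: the function names, the "not detected" label for signals at or below 30, the choice of denominator (all probes passed in) and the example values are assumptions made here for illustration.

```python
from collections import Counter

BIN_NAMES = ("not detected", "low", "medium", "high")


def signal_bin(g_minus_r):
    """Assign one probe's g-r signal to a band from the Figure 1(f) legend.

    low: > 30 up to 300, medium: > 300 up to 3000, high: > 3000.
    The "not detected" label for values <= 30 is an assumption.
    """
    if g_minus_r > 3000:
        return "high"
    if g_minus_r > 300:
        return "medium"
    if g_minus_r > 30:
        return "low"
    return "not detected"


def bin_percentages(signals):
    """Return the percentage of probes falling in each band."""
    if not signals:
        raise ValueError("at least one probe signal is required")
    counts = Counter(signal_bin(s) for s in signals)
    return {name: 100.0 * counts[name] / len(signals) for name in BIN_NAMES}


# Invented example signals, chosen to sit around the band boundaries.
example_signals = [12, 30, 45, 299, 301, 2999, 3001, 15000]
percentages = bin_percentages(example_signals)
```

With these invented values each band receives two of the eight probes. A boundary value such as 300 falls in the low band because the legend defines medium as strictly greater than 300.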
Significant detection of the probes of micro-organisms in cancer compared to the matched (MC) and non-matched control (NC) samples.
| Types | Phyla | Family/Genera | Hybridization signal (weighted score sum): Cancer | MC | NC | p-value: Cancer vs. MC | p-value: Cancer vs. NC |
|---|---|---|---|---|---|---|---|
| Viruses | | HPVs (2, 6b, 1, 18, 16, 26, 34) | 195426486 | 8029 | 6839 | 2.31E-10 | 2.3E-10 |
| | | HPV16 | 191835193 | 6468 | 2862 | 4.02E-10 | 3.98E-10 |
| | | Poxviridae | 2206031 | 3151 | 11698 | 9.95E-06 | 3.59E-05 |
| | | Retroviridae | 1696995 | 7616 | 73887 | 0.002287 | 0.291564 |
| | | Polyomaviridae | 847250 | 1153 | 69753 | 8.59E-05 | 0.427587 |
| | | Herpesviridae | 4383363 | 9881 | 74503 | 7.97E-07 | 9.1E-05 |
| | | Reoviridae | 101004 | 44 | 251 | 3.26E-09 | 7.59E-09 |
| | | Orthomyxoviridae | 266120 | 235 | 158 | 1.4E-05 | 1.3E-05 |
| Bacteria | Actinobacteria | | 4313000 | 0 | 81 | 8.44E-06 | 8.48E-06 |
| | Actinobacteria | | 2684602 | 38 | 229 | 2.83E-06 | 2.88E-06 |
| | Actinobacteria | | 2095762 | 310 | 0 | 1.46E-05 | 1.41E-05 |
| | Actinobacteria | | 1255769 | 0 | 0 | 6.62E-06 | 6.62E-06 |
| | Actinobacteria | | 1192284 | 0 | 75 | 4.33E-06 | 4.4E-06 |
| | Actinobacteria | | 632038 | 1186 | 5901 | 0.000114 | 0.000808 |
| | Proteobacteria | | 662587 | 0 | 0 | 0.007274 | 0.007274 |
| | Proteobacteria | | 53035 | 217 | 0 | 0.002417 | 0.001331 |
| | Proteobacteria | | 412231 | 4524 | 478 | 0.031211 | 0.00984 |
| | Proteobacteria | | 381307 | 188 | 70 | 0.005386 | 0.005212 |
| | Proteobacteria | | 372204 | 2567 | 435 | 0.000375 | 4.83E-05 |
| | Proteobacteria | | 347736 | 1782 | 166 | 0.008933 | 0.005065 |
| | Proteobacteria | | 327628 | 37 | 0 | 0.002719 | 0.002682 |
| | Proteobacteria | | 297486 | 168 | 248 | 0.012434 | 0.012716 |
| | Proteobacteria | | 231427 | 0 | 3448 | 0.030438 | 0.089252 |
| | Proteobacteria | | 206703 | 0 | 112 | 0.02798 | 0.028969 |
| | Proteobacteria | | 117585 | 31 | 4521 | 0.053936 | 0.311419 |
| | Proteobacteria | | 115458 | 0 | 0 | 0.365984 | 3.95E-06 |
| | Proteobacteria | | 111747 | 0 | 2046 | 0.039465 | 0.125478 |
| | Proteobacteria | | 81383 | 0 | 0 | 0.045451 | 0.045451 |
| | Proteobacteria | | 45839 | 0 | 37 | 0.015141 | 0.016169 |
| | Proteobacteria | | 16380 | 0 | 36 | 0.006651 | 0.006758 |
| | Proteobacteria | | 2278234 | 0 | 0 | 4.41E-06 | 4.41E-06 |
| | Firmicutes | | 846305 | 0 | 0 | 0.000318 | 0.000318 |
| | Firmicutes | | 338497 | 2586 | 69 | 0.003597 | 0.000755 |
| | Firmicutes | | 321750 | 288 | 72 | 0.000116 | 0.000101 |
| | Firmicutes | | 65757 | 0 | 0 | 0.061028 | 0.060022 |
| | Firmicutes | | 53035 | 217 | 0 | 0.005775 | 0.003531 |
| | Firmicutes | | 50613 | 228 | 379 | 0.05041 | 0.058804 |
| | Bacteroidetes | | 290038 | 275 | 338 | 0.012396 | 0.012614 |
| | Bacteroidetes | | 352917 | 187 | 110 | 0.000286 | 0.000275 |
| Fungi | | | 17241169 | 27419 | 62314 | 3.1E-20 | 5.16E-18 |
| | | | 12912140 | 0 | 0 | 2.1E-19 | 2.1E-19 |
| | | | 11366446 | 11539 | 25248 | 7.71E-17 | 1.64E-16 |
| | | | 8982809 | 0 | 0 | 1.42E-11 | 1.42E-11 |
| | | | 7035309 | 87 | 4425 | 5.87E-13 | 6.64E-10 |
| | | | 4991102 | 185123 | 361631 | 0.052290 | 0.401344 |
| | | | 2163898 | 2504 | 21859 | 1.11E-16 | 8.26E-14 |
| | | | 1210219 | 0 | 0 | 9.75E-08 | 9.75E-08 |
| | | | 283315 | 235 | 216 | 5.87E-13 | 5.67E-13 |
| | | | 74254 | 0 | 372 | 5.75E-10 | 5.01E-09 |
| Parasite | | | 26463760 | 0 | 0 | 3.15E-29 | 3.15E-29 |
| | | | 19026989 | 0 | 0 | 3.40901E-29 | 3.40901E-29 |
| | | | 16438588 | 2402 | 65928 | 1.38087E-19 | 1.78333E-17 |
| | | | 10743239 | 16756 | 37938 | 2.52E-18 | 9.31E-18 |
| | | | 1814992 | 0 | 0 | 5.19E-09 | 5.19E-09 |
| | | | 306180 | 0 | 0 | 1.19E-16 | 1.19E-16 |
| | | | 260321 | 0 | 0 | 6.87E-15 | 6.87E-15 |
The weighted score sum of the hybridization signals of all the probes of an organism was calculated in cancer and controls, and significance (p-value < 0.05) was assessed using one-sided t-tests.
Figure 2. Bacterial signatures detected in oral cancer samples. (a) Pie charts showing the percentage of different groups and phyla of bacteria detected in oral cancer, matched (MC) and non-matched controls (NC). (b) Bacterial signatures detected with hybridization signal (g–r > 30) by PathoChip screen of 100 oral cancer samples and in MCs and NCs, ranked by decreasing hybridization signal (weighted score sum of all the probes per accession) and prevalence. (c) Heat map of the hybridization signal for the bacterial probes of bacterial genera a–xyz, labeled in panel b, detected by PathoChip screen with the cancer, matched (MC) and non-matched control (NC) samples. Samples were screened individually and in pools (marked ▪). (d) Association of molecular signatures of different bacterial genera with oral cancer and/or controls, represented as a Venn diagram and as colored bars.
Figure 3. Fungal (a–e) and parasitic (f–j) signatures detected in oral cancer samples. (a) Fungal signatures detected with hybridization signal (g–r > 30) by PathoChip screen of 100 oral cancer samples, ranked by decreasing hybridization signal (weighted score sum of all the probes per accession) and prevalence. (b and c) Fungal signatures detected in the matched (MC) and non-matched controls (NC), respectively, ranked by decreasing hybridization signal and prevalence. (d) Heat map of the hybridization signal for the fungal probes of fungi i–x, labeled in panel a, detected by PathoChip screen with the cancer, matched (MC) and non-matched control (NC) samples. Samples were screened individually and in pools (marked ▪). (e) Association of molecular signatures of different fungal genera with oral cancer and/or controls, represented as a Venn diagram and as colored bars. (f) Parasitic signatures detected with hybridization signal (g–r > 30) by PathoChip screen of 100 oral cancer samples, ranked by decreasing hybridization signal (weighted score sum of all the probes per accession) and prevalence. (g and h) Parasitic signatures detected in the matched and non-matched controls (MC and NC), respectively, ranked by decreasing hybridization signal and prevalence. (i) Heat map of the hybridization signal for the parasitic probes of parasites i–vii, labeled in panel f, detected by PathoChip screen with the cancer, matched (MC) and non-matched control (NC) samples. Samples were screened individually and in pools (marked ▪). (j) Association of molecular signatures of different parasitic genera with oral cancer and/or controls, represented as a Venn diagram and as colored bars.
Figure 4. Hierarchical clustering of 100 oral cancer samples. (a) Hierarchical clustering in R using Euclidean distance, complete linkage and non-adjusted values. Samples marked (▪) were screened in pools; the rest were screened individually. (b) Clustering of the OCSCC samples using NbClust software [CH (Calinski and Harabasz) index, Euclidean distance, complete linkage]. (c) Topological analysis using Ayasdi software, with the Euclidean (L2) metric and L-infinity centrality lenses. OCSCC samples with similar detection of viral and microbial signatures form the nodes, and two nodes are connected by an edge if they have a detection pattern in common. Each node is color coded according to the detection of HPV16.
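The clustering settings named in panel (a), Euclidean distance with complete linkage on unscaled values, can be illustrated with a small agglomerative routine. The analysis in the paper was run in R; the Python sketch below is only an illustration of the technique, and the function names and the toy signal profiles are invented for this example.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two signal profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering with complete linkage.

    Starts with one cluster per sample and repeatedly merges the two
    clusters whose farthest pair of members is closest, until
    n_clusters remain. Returns lists of sample indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the maximum
                # pairwise distance between their members.
                d = max(
                    euclidean(profiles[p], profiles[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Invented toy profiles: two probe signals per sample, left unscaled.
toy_profiles = [[9000, 50], [8500, 40], [20, 700], [35, 650], [10, 720]]
groups = complete_linkage(toy_profiles, n_clusters=2)
```

On the toy data the two samples with a strong first signal end up in one group and the remaining three in the other. The naive search over all cluster pairs is adequate for about 100 samples but would be slow for much larger sets.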
Figure 5. Probe capture sequencing alignment for the individual capture pools (HPV16, O, B, F and P). The HPV16 capture pool comprised a set of HPV16-specific probes, the O pool consisted of certain viral and bacterial probes, the B pool comprised bacterial probes, the F pool fungal probes and the P pool parasitic probes, as listed in Table S3. The hybridization signals of the HPV probes used for the capture are shown as a heat map in the figure. Six pools of whole genome and transcriptome amplified DNA plus cDNA were hybridized to a set of biotinylated conserved and specific viral probes, then captured on streptavidin beads, and used for tagmentation library preparation and deep sequencing with paired-end 250-nt reads. When aligned with the PathoChip metagenome (chip probes), the MiSeq reads from each individual capture clustered mostly at the capture probe regions. The genomic location and the number of MiSeq reads are shown in the figure for each organism.
Figure 6. Microbial genomic integrations in the host chromosome. (a) Bar graphs showing the number of viral (HPV16 and JC polyomavirus) integration sites in host human chromosomes and the percentage of viral genomic sites for integration into host chromosomes. (b) Circos plot highlighting fusion events with ≥20 supporting reads for the bacterial, fungal and parasitic insertions into individual human chromosomes. For the viral insertions, all reads were taken into account. (c) Karyogram plot of bacterial insertion sites (red lines) in human chromosomes, with a read cutoff of ≥20. The number of insertion sites in each chromosome is given in the figure before the chromosome number. (d) Karyogram plot of virus, parasite and fungus insertion sites in human chromosomes. Color profile: green lines for parasite genomic insertion sites, red for HPV16, yellow for JC polyomavirus, blue for fungus. The read cutoff for bacteria, fungus and parasite was ≥20; for virus, all insertion sites were included. The number of insertion sites in each chromosome is given in the figure before the chromosome number. G-banding annotation for each chromosome is shown: gneg, Giemsa-negative bands; gpos25, gpos50, gpos75 and gpos100, Giemsa-positive bands, with a higher number indicating a darker stain; acen, centromeric regions; gvar, variable-length heterochromatic regions; stalk, tightly constricted regions on the short arms of the acrocentric chromosomes. (e) Schematic representation of viral and microbial genomic insertion sites in human chromosome 17. The genomic coordinates of the integrated pathogens and of the host chromosome integration sites are indicated. The coordinates for human chromosomes are from the GRCh37/hg19 assembly. (f) Association of host genes affected by viral/microbial genomic integrations with neoplasia of epithelial cells, analysed by the Ingenuity Pathway Analysis (IPA) program, which showed a p-value of 7.17E-10 for this association.
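The read-support rule described for panels (b) to (d), a cutoff of at least 20 reads for bacterial, fungal and parasitic insertions with all viral insertions retained, can be sketched as a filter followed by a per-chromosome count. The record layout (`kind`, `chrom`, `reads`) and the example events below are hypothetical and are not taken from the study data.

```python
from collections import Counter

MIN_READS = 20  # cutoff stated in the legend for non-viral insertions


def keep_event(event):
    """Keep every viral event; other events need >= MIN_READS reads."""
    return event["kind"] == "virus" or event["reads"] >= MIN_READS


def insertion_sites_per_chromosome(events):
    """Count retained insertion events on each chromosome."""
    return Counter(e["chrom"] for e in events if keep_event(e))


# Hypothetical fusion events for illustration only.
events = [
    {"kind": "virus", "chrom": "chr17", "reads": 3},
    {"kind": "bacteria", "chrom": "chr17", "reads": 25},
    {"kind": "bacteria", "chrom": "chr9", "reads": 19},
    {"kind": "fungus", "chrom": "chr4", "reads": 20},
    {"kind": "parasite", "chrom": "chr1", "reads": 5},
]
site_counts = insertion_sites_per_chromosome(events)
```

In this invented example the viral event is kept despite having only 3 reads, while the bacterial event with 19 reads and the parasitic event with 5 reads are dropped.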
Microbial genomic integration sites in the OCSCC host somatic chromosomes.
| Microbial insertion region | Human genomic integration sites |
|---|---|
| | |
| HPV16 4,188–4,243 (hotspot for integration) | Intronic (53% integrations) regions of LAMA3, ATXN10, INADL, ABCA10, EVC2, WDR89, CADPS2, HAUS6, EPHA6, FAM179B, COL14A1, MRPS27, FUCA2, ADAMTS12, TRIOBP, CSMD1, KCNQ1.<br>Upstream (12%) of genes IL12RB2, LOC388436, LOC79999, FCHO1, MRPL52, SLC7A7<br>Downstream (9%) of the genes NACAP1, GUCA2A/GUCA2B, RSPH1<br>Intronic ncRNA gene of the FAM35BP gene (6% integrations)<br>Intergenic integrations (6%):<br>- upstream of SPECC1; downstream of CCDC144CP<br>- upstream of SSTR3 and downstream of RAC2. |
| HPV16 E1 | Intronic region of SLC13A3, DLGAP1, CCDC155 and ncRNA LOC10028863<br>Intergenic regions |
| HPV16 E2 and E4 | Intronic region of LOC10272495<br>Intergenic regions |
| HPV16 L1 | Intronic region of PAFAH1B1, ncRNA LOC10050620<br>Intergenic region |
| L1 PolyA | Intronic regions of DEPDC4<br>3′ UTR region of the MKLN1<br>Intergenic regions |
| HPV16 L2 | Intronic region of SSH2 |
| | |
| JC LT Ag | Intronic regions of CMTR1 and ME1 on chr6; CPO on chr2<br>Intergenic regions of chromosomes 1, 2 and 3 |
| VP1 ORF | Intergenic regions, 41 Kb downstream of the lncRNA gene SFTA1P (chr10)<br>Upstream of ABCA9 (chr17)<br>3′ UTR of the epigenetic regulator gene MECP2 |
| VP2 and VP3 | Intronic region FAM13B (chr5) and PCCA (chr13) |
| Agnoprotein Jvgp1 | Intronic regions MSH3 (chr5) and PHLSB3 (chr19) |
| Late coding region (191–253) | Intergenic regions of chromosome 16<br>- 97 Kb downstream of NPIPA7<br>- 99 Kb upstream of the NPIPA5 gene<br>Intronic region of SSG5 (chr15) |
| | |
| | Exon of ADAMTSL1 (chr9)<br>Intron of MAP3K1 |
| | Exon of RASSF5 (1q32.1)<br>3′ UTR of SEC14L4 |
| | Exon of SRCAP (chr16)<br>Exon of WNT3 (chr17)<br>5′ UTR of C1orf162<br>Intron of SLC9A9 |
| | 3′-end of SMURF2<br>Intron of CASP9 |
| | 3′ UTR of COL1A1 |
| | Intron of RIC8B |
| | Intron of LYPD6B. |
| | |
| | Intergenic: 560 Kb upstream of the GABRG1 gene (chr4) |
| | Intron of ITCH (chr20)<br>Intron of MAGI1 (chr3) |
| | Intron of ZNRF2 (chr7) |
| | Intron of CADPS2 (chr7) |
| | |
| | Exon of ZNF383 (chr19)<br>Intron of LNP1 (chr3)<br>Intergenic: downstream of SLC10A2 (chr13)<br>Intron of SPECC1 (chr17) |
| | Exon of RHD (chr1) |
| | Intron of AKAP1 (chr17)<br>Intron of EPS15L1 (chr19)<br>Intergenic: 353 Kb upstream of NRG3 (chr10) |
| | Intron of ATRX (chrX)<br>21 Kb upstream of FGFR2 |
| | Intron of USP32 (chr17)<br>Intergenic region: 37 Kb upstream of Lyn gene |
| | Downstream of MIR3648 (chr21) |
| | Intergenic: 106 Kb upstream of TRIM49B (chr9)<br>in the ncRNA ANKRD30BL gene |