Rebecca M Rodriguez1,2, Brenda Y Hernandez3,2, Mark Menor1, Youping Deng1, Vedbar S Khadka1.
Abstract
Identification of microbial composition directly from tumor tissue permits the study of the relationship between microbial changes and cancer pathogenesis. We interrogated bacterial presence in tumor and adjacent normal tissue, strictly in pairs, using human whole exome sequencing data to generate microbial profiles. Profiles were generated for 813 cases from the stomach, liver, colon, rectal, lung, head & neck, cervical and bladder TCGA cohorts. Examination of the core microbiota revealed twelve taxa common across the nine cancer types at all classification levels. Paired analyses demonstrated significant differences in bacterial shifts between tumor and adjacent normal tissue in the stomach, colon, lung squamous cell, and head & neck cohorts, whereas little or no difference was evident in the liver, rectal, lung adenocarcinoma, cervical and bladder cancer cohorts in adjusted models. Helicobacter pylori in stomach and Bacteroides vulgatus in colon were found to be significantly higher in adjacent normal than in tumor tissue after false discovery rate correction. Computational results were validated with tissue from an independent population by species-specific qPCR, which showed similar patterns of co-occurrence of Fusobacterium nucleatum and Selenomonas sputigena in gastric samples. This study demonstrates the ability to identify differential bacterial composition from human tissue whole exome sequences. Taken together, our results suggest that microbial profiles shift with advanced disease and that the microbial composition of the adjacent tissue can be indicative of cancer stage and disease progression.
Keywords: BLCA, bladder carcinoma; CESC, cervical & endocervical squamous cell carcinomas; COAD, colon adenocarcinoma; COREAD, colon and rectal adenocarcinoma TCGA cohorts; Cancer microbiome; Exome sequencing; HNSC, head & neck squamous cell carcinoma; L2FC, log 2 fold change; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; Microbial landscape; READ, rectal adenocarcinoma; STAD, stomach adenocarcinoma; TCGA; TCGA, The Cancer Genome Atlas
Year: 2020 PMID: 32257046 PMCID: PMC7109368 DOI: 10.1016/j.csbj.2020.03.003
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Sample selection workflow and computational pipeline designed to extract microbial profiles based on PathoScope 2.0. Whole exome sequencing files (3758) from 813 cases were downloaded. From these cases, a total of 1681 sample sequences were processed through our modified pipeline (852 tumor and 829 adjacent normal). The bioinformatics pipeline includes an additional filtering step described by Zhang et al. 2015. This additional filtering step was completed against the human reference genome (hg38), with reads simultaneously aligned to a custom library of known microbial genomes. Relative abundance is calculated from the PathoReport for each microbe based on normalized values in tumor and adjacent normal tissues. Detection of DNA viral sequences was used as internal validation. Sample sequences without microbial reads in at least one tissue of a pair (tumor or adjacent normal) were removed. From the PathoReport, strict one-to-one pairs with microbial reads were selected for bacterial differential analyses (a total of 1596 samples from 798 paired cases). Demographics and correlation analyses with clinical data were completed for cases with available data (746).
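The pairing step in the workflow above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout (`case_id`, `tissue`, `microbial_reads`) is hypothetical, and dropping cases that have more than one sample per tissue is an assumed policy, since the caption does not say how such cases were resolved.

```python
from collections import defaultdict

def select_strict_pairs(samples):
    """Keep cases with exactly one tumor and one adjacent-normal sample,
    both carrying microbial reads.

    `samples` is a list of dicts with hypothetical keys:
    case_id, tissue ("tumor" or "normal"), microbial_reads (int).
    """
    by_case = defaultdict(lambda: {"tumor": [], "normal": []})
    for s in samples:
        by_case[s["case_id"]][s["tissue"]].append(s)

    pairs = {}
    for case_id, tissues in by_case.items():
        tumor, normal = tissues["tumor"], tissues["normal"]
        # Assumed policy: anything other than a one-to-one pair is dropped.
        if len(tumor) != 1 or len(normal) != 1:
            continue
        # Both tissues of the pair must have microbial reads.
        if tumor[0]["microbial_reads"] > 0 and normal[0]["microbial_reads"] > 0:
            pairs[case_id] = (tumor[0], normal[0])
    return pairs

# Illustrative records only; these are not TCGA data.
samples = [
    {"case_id": "A", "tissue": "tumor", "microbial_reads": 120},
    {"case_id": "A", "tissue": "normal", "microbial_reads": 45},
    {"case_id": "B", "tissue": "tumor", "microbial_reads": 10},
    {"case_id": "B", "tissue": "normal", "microbial_reads": 0},
    {"case_id": "C", "tissue": "tumor", "microbial_reads": 7},
]
pairs = select_strict_pairs(samples)
print(sorted(pairs))
```

In this toy input only case A survives: case B lacks microbial reads in its adjacent normal sample and case C has no adjacent normal sample at all.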
Proportion of samples with microbial reads at any detection level.
| TCGA Cohort | Samples with Bacteria in Tumor N (%) | Samples with Bacteria in Adjacent N (%) | Samples with Virus in Tumor N (%) | Samples with Virus in Adjacent N (%) |
|---|---|---|---|---|
| STAD (n = 176) | 74/88 (84) | 73/88 (83) | 30/88 (34) | 35/88 (40) |
| LIHC (n = 141) | 68/71 (81) | 66/70 (79) | 13/71 (15) | 17/70 (20) |
| COAD (n = 176) | 88/88 (100) | 88/88 (100) | 88/88 (100) | 88/88 (100) |
| READ (n = 36) | 18/18 (100) | 18/18 (100) | 11/18 (61) | 10/18 (56) |
| LUSC (n = 430) | 211/221 (95) | 209/221 (94) | 63/221 (28) | 81/221 (36) |
| LUAD (n = 393) | 200/200 (100) | 193/200 (97) | 41/200 (21) | 34/200 (17) |
| HNSC (n = 145) | 72/72 (100) | 71/73 (97) | 11/72 (16) | 9/73 (13) |
| CESC (n = 16) | 8/8 (100) | 7/8 (88) | 8/8 (100) | 8/8 (100) |
| BLCA (n = 64) | 28/35 (76) | 28/29 (76) | 2/35 (5) | 1/29 (3) |
| Totals | 767 (94) | 753 (92) | 267 (33) | 283 (35) |
Proportion of samples with microbial reads (bacterial and viral) prior to 1:1 pairing selection. There was no significant difference in the number of samples with microbial presence between tumor and its adjacent normal tissue for any cohort. Overall, we detected microbial reads in 94% of tumors and 92% of adjacent normal tissues. We found DNA viral presence, mainly HHV-4, HPV and HBV, in 33% of tumors and 35% of adjacent normal tissues, with cervical and colon cancers having 100% of samples with at least one viral read.
Basic population demographic characteristics of 9 TCGA cancer cohorts.
| | STAD | LIHC | COAD | READ | LUSC | LUAD | HNSC | CESC | BLCA | Totals |
|---|---|---|---|---|---|---|---|---|---|---|
| White | 54 (64) | 64 (79) | 37 (42) | 8 (44) | 147 (67) | 120 (81) | 56 (81) | 4 (50) | 25 (89) | 515 (69) |
| African American | 3 (3) | 6 (7) | 7 (8) | 1 (6) | 16 (7) | 23 (16) | 9 (13) | 1 (13) | 2 (7) | 68 (9) |
| Asian | 16 (19) | 7 (9) | – | – | 3 (1) | 2 (1) | 1 (1) | – | – | 29 (4) |
| Other Race | – | – | 1 (1) | – | – | 1 (1) | – | 2 (25) | – | 4 (1) |
| Not reported | 12 (14) | 4 (5) | 43 (49) | 9 (50) | 55 (25) | 2 (1) | 3 (4) | 1 (13) | 1 (4) | 130 (17) |
| Age (years), mean ± SD | 67 ± 10.5 | 64 ± 14.7 | 71 ± 12.3 | 63 ± 14.6 | 68 ± 8.4 | 65 ± 10.3 | 63 ± 12.2 | 47 ± 13.5 | 69 ± 10.7 | 64 ± 11.9 |
| Male | 48 (56) | 46 (57) | 47 (53) | 7 (39) | 64 (29) | 63 (43) | 48 (70) | – | 19 (68) | 342 (46) |
| Female | 37 (44) | 35 (43) | 41 (47) | 11 (61) | 157 (71) | 85 (57) | 21 (30) | 8 (100) | 9 (32) | 404 (54) |
| I | 14 (17) | 33 (41) | 11 (13) | 4 (22) | 121 (55) | 79 (53) | 1 (1) | 4 (50) | 3 (11) | 270 (36) |
| II-III | 47 (55) | 35 (43) | 62 (70) | 9 (50) | 96 (43) | 61 (41) | 30 (43) | 4 (50) | 11 (39) | 355 (48) |
| IV | 8 (9) | 3 (4) | 14 (16) | 4 (22) | 3 (1) | 6 (4) | 38 (55) | – | 14 (50) | 90 (12) |
| No staging | 16 (19) | 10 (12) | 1 (1) | 1 (6) | 1 (1) | 2 (1) | – | – | – | 31 (4) |
Cases with paired tumor and adjacent normal tissue and available clinical data. Of the 798 cases with bacterial presence, clinical data were available for 746. The largest fraction of cases without clinical data was from LUAD (n = 52). "Other Race" includes groups with 2 or fewer cases per group (Native American, Alaskan Native, Native Hawaiian or other Pacific Islander), combined to maintain privacy. (–) indicates not available or not applicable.
Fig. 2. Frequency of shared bacterial species across 9 TCGA cohorts. Compositional bar graph showing the size of individual core taxonomies (left horizontal bars) and the intersect of shared species (black dots and connecting lines) across cohorts, with species frequency (vertical bars). Twelve taxa (yellow highlight), Actinomyces oris, Bradyrhizobium sp. BTAi1, Bradyrhizobium sp. ORS, Cutibacterium acnes, Escherichia coli, Leptothrix cholodnii, Neisseria sicca, Ralstonia insidiosa, Rhodopseudomonas palustris, Sphingomonas melonis, Sphingomonas panacis and Bradyrhizobium diazoefficiens, were found to be shared across all nine cohorts at different rates. Of these, three (Bradyrhizobium sp. BTAi1, Cutibacterium acnes, and Escherichia coli) were detected in both tissues of the pair (tumor and adjacent normal). Colon (COAD) had the greatest number of unique taxa (260 non-shared), while cervical (CESC) and bladder (BLCA) cancers had no unique species when comparing across cohorts.
Fig. 3. The landscape of bacterial shifts across 9 tumor types at the phylum level. Proportion of bacterial reads and compositional shifts at the phylum level, arranged by anatomical proximity, per cancer type in tumor and adjacent normal tissues. Phyla accounting for >1% of the total reads per tissue are shown; phyla at <1% are grouped as "other" and may include Verrucomicrobia, Spirochaetes, Tenericutes, Fusobacteria, and Cyanobacteria, primarily from the colon, gastric and head and neck cancer cohorts. Significant shifts in bacterial composition are observed between the adjacent normal and tumor tissues, which may indicate disease status or disease progression within the tumor microenvironment in the continuum of disease.
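The 1% display rule described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the input layout (a mapping from phylum to read count for one tissue) and the example counts are hypothetical, and a phylum at exactly 1% is placed in "Other" because the caption does not specify that boundary case.

```python
def phylum_proportions(read_counts, threshold=0.01):
    """Convert per-phylum read counts to proportions of total reads,
    collapsing phyla at or below `threshold` into a single "Other" entry."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    shown = {}
    other = 0.0
    for phylum, n in read_counts.items():
        frac = n / total
        if frac > threshold:
            shown[phylum] = frac   # displayed individually
        else:
            other += frac          # folded into "Other" (assumed for == 1%)
    if other > 0:
        shown["Other"] = other
    return shown

# Hypothetical counts for a single tissue; not data from the study.
counts = {
    "Proteobacteria": 600,
    "Firmicutes": 250,
    "Bacteroidetes": 145,
    "Fusobacteria": 4,
    "Spirochaetes": 1,
}
proportions = phylum_proportions(counts)
for phylum, frac in proportions.items():
    print(f"{phylum}: {frac:.3f}")
```

Running the same function separately on tumor and adjacent normal counts gives the two compositions whose difference the figure displays.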
Comparison of qPCR validation in gastric and colorectal cancers.
| | RTR | | | TCGA | | |
|---|---|---|---|---|---|---|
| | Tumor | Adjacent normal | Cases | Tumor | Adjacent normal | Cases |
| | Gastric (N = 21) | | | Gastric (N = 85) | | |
| | positive | positive | 1 | positive | positive | 4 |
| | positive | negative | 1 | positive | negative | 3 |
| | negative | positive | 1 | negative | positive | 19 |
| TCGA | negative | negative | 18 | negative | negative | 59 |
| | positive | positive | 2 | positive | positive | 0 |
| | positive | negative | 1 | positive | negative | 8 |
| | negative | positive | 0 | negative | positive | 0 |
| TCGA | negative | negative | 18 | negative | negative | 77 |
| | positive | positive | 0 | positive | positive | 3 |
| | positive | negative | 2 | positive | negative | 12 |
| | negative | positive | 0 | negative | positive | 5 |
| TCGA | negative | negative | 19 | negative | negative | 65 |
| | Colorectal (N = 63) | | | Colorectal (N = 106) | | |
| | positive | positive | 3 | positive | positive | 62 |
| | positive | negative | 3 | positive | negative | 6 |
| | negative | positive | 2 | negative | positive | 18 |
| TCGA | negative | negative | 55 | negative | negative | 20 |
| | positive | positive | 1 | positive | positive | 5 |
| | positive | negative | 10 | positive | negative | 15 |
| | negative | positive | 2 | negative | positive | 4 |
| TCGA | negative | negative | 50 | negative | negative | 82 |
Bacterial presence counts for each taxon examined in tumor and adjacent normal tissue from gastric and colorectal cancers in the TCGA cohorts versus the Hawaii RTR. Selenomonas sputigena and Fusobacterium nucleatum were found to co-occur in Hawaii RTR gastric cases in patterns similar to those observed in the TCGA gastric cancer cohort. Percent positivity in White vs non-White cases for Fusobacterium nucleatum was similar in TCGA and Hawaii RTR (17% vs 18%, respectively), whereas percent positivity for Bacteroides vulgatus was strikingly different (64% vs 5%, respectively). The differences observed could be due to other population characteristics (Table 3B), FFPE sample degradation, and sample size. N = paired cases; P values were obtained from McNemar's test. Null hypothesis: there is no difference between tumor and adjacent normal tissue due to specific microbial presence.
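For readers unfamiliar with McNemar's test, the sketch below shows how a paired presence table of this kind yields a P value. It uses the exact (binomial) form of the test, which is an assumption: the caption does not say whether the exact or the chi-square version was used. The counts are those of the first gastric TCGA block in the table above; the table as extracted does not identify which taxon that block belongs to.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value.

    b: pairs positive in tumor and negative in adjacent normal
    c: pairs negative in tumor and positive in adjacent normal
    Concordant pairs (positive/positive, negative/negative) do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null, discordant pairs split 50/50: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# First gastric TCGA block: 3 tumor-only positives, 19 adjacent-only positives.
p_value = mcnemar_exact(3, 19)
print(f"{p_value:.6f}")
```

Only the 22 discordant pairs carry information here; the 4 double-positive and 59 double-negative cases are ignored by construction, which is why the test suits strictly paired tumor and adjacent normal samples.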
Population characteristics comparison between RTR and TCGA in gastric and colorectal cancers.
| Characteristic | RTR | TCGA |
|---|---|---|
| | Gastric (N = 21) | Gastric (N = 85) |
| Race other than White | 95% | 36% |
| Age > 60 years | 76% | 75% |
| Sex: Female | 62% | 44% |
| Diagnosis anatomical site | 43% unspecified | 31% antrum |
| Tumor classification | 62% stage III | 56% stage III |
| | Colorectal (N = 63) | Colorectal (N = 106) |
| Race other than White | 73% | 58% |
| Age > 60 years | 38% | 75% |
| Sex: Female | 49% | 49% |
| Diagnosis anatomical site | 38% sigmoid/rectosigmoid region | 34% unspecified colon/rectum |
| Tumor classification | 31% stage II | 40% stage II |
Compared to the TCGA gastric cancer subset, Hawaii RTR population characteristics were significantly different by race, sex and tumor site at initial diagnosis. In the colorectal cancer subset, differences existed in race, age at time of diagnosis, and tumor site at initial diagnosis. We believe differences in positivity ratios could be due to population differences: the TCGA population is mostly White Eastern European, whereas the Hawaii RTR population consists mostly of Hawaiian and Asian ethnic subgroups.