Jawara Allen, Axel Rosendahl Huber, Cayetano Pleguezuelos-Manzano, Jens Puschhof, Shaoguang Wu, Xinqun Wu, Charelle Boot, Aurelia Saftien, Heather M O'Hagan, Hao Wang, Ruben van Boxtel, Hans Clevers, Cynthia L Sears.
Abstract
Enterotoxigenic Bacteroides fragilis (ETBF) is consistently found at higher frequency in individuals with sporadic and hereditary colorectal cancer (CRC) and induces tumorigenesis in several mouse models of CRC. However, whether specific mutations induced by ETBF lead to colon tumor formation has not been investigated. To determine if ETBF-induced mutations impact the Apc gene, and other tumor suppressors or proto-oncogenes, we performed whole-exome sequencing and whole-genome sequencing on tumors isolated after ETBF and sham colonization of Apcmin/+ and Apcmin/+Msh2fl/flVC mice, as well as whole-genome sequencing of organoids cocultured with ETBF. Our results indicate that ETBF-induced tumor formation results from loss of heterozygosity (LOH) of Apc, unless the mismatch repair system is disrupted, in which case, tumor formation results from new acquisition of protein-truncating mutations in Apc. In contrast to polyketide synthase-positive Escherichia coli (pks+ E. coli), ETBF does not produce a unique mutational signature; instead, ETBF-induced tumors arise from errors in DNA mismatch repair and homologous recombination DNA damage repair, established pathways of tumor formation in the colon, and the same genetic mechanism accounting for sham tumors in these mouse models. Our analysis informs how this procarcinogenic bacterium may promote tumor formation in individuals with inherited predispositions to CRC, such as Lynch syndrome or familial adenomatous polyposis (FAP).
IMPORTANCE Many studies have shown that microbiome composition in both the mucosa and the stool differs in individuals with sporadic and hereditary colorectal cancer (CRC). Both human and mouse models have established a strong association between particular microbes and colon tumor induction. However, the genetic mechanisms underlying putative microbe-induced colon tumor formation are not well established. In this paper, we applied whole-exome sequencing and whole-genome sequencing to investigate the impact of ETBF-induced genetic changes on tumor formation. Additionally, we performed whole-genome sequencing of human colon organoids exposed to ETBF to validate the mutational patterns seen in our mouse models and begin to understand their relevance in human colon epithelial cells. The results of this study highlight the importance of ETBF colonization in the development of sporadic CRC and in individuals with hereditary tumor conditions, such as Lynch syndrome and familial adenomatous polyposis (FAP).
Keywords: Bacteroides; cancer; colorectal cancer; genomics; microbes; mutational studies
Year: 2022 PMID: 35587635 PMCID: PMC9241831 DOI: 10.1128/spectrum.01055-22
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
Mouse tumor samples
| Mouse genotype | Mouse no. | Experimental condition | No. of colon tumors | No. of tumors analyzed |
|---|---|---|---|---|
| Apcmin/+ | 682 | sham | 1 | 1 |
| Apcmin/+ | 735 | sham | 2 | 2 |
| Apcmin/+ | 678 | ETBF | 32 | 4 |
| Apcmin/+ | 710 | ETBF | 30 | 4 |
| Apcmin/+Msh2fl/flVC | 877 | sham | 1 | 1 |
| Apcmin/+Msh2fl/flVC | 966 | sham | 1 | 1 |
| Apcmin/+Msh2fl/flVC | 879 | ETBF | 17 | 3 |
| Apcmin/+Msh2fl/flVC | 965 | ETBF | 23 | 3 |
Tumors from Apcmin/+ mice were 3 mm in diameter, and tumors from Apcmin/+Msh2VC mice were 1 to 2 mm in diameter.
Tumors represent those analyzed by exome sequencing. For whole-genome sequencing, the two tumors sequenced from sham-inoculated Apcmin/+Msh2VC mice (mice 877, 966) were pooled, and one tumor sample was sequenced from mice 735, 710, and 965.
FIG 1 Copy number neutral loss of heterozygosity (LOH) is seen on chromosome 18 in tumors isolated from Apcmin/+ mice. (A) LOH score for each autosomal chromosome. For each chromosome, all positions along the chromosome were averaged to create an LOH score for that chromosome. Only chromosomes with large regions of LOH will show a significantly decreased fraction of reads aligning to the reference allele. (B) Location of LOH along chromosome 18. The position on the x axis represents the median chromosomal position taken from the following four intervals along chromosome 18: 3.4 × 10^7 to 3.5 × 10^7, 3.6 × 10^7 to 3.7 × 10^7, 3.7 × 10^7 to 3.8 × 10^7, and 3.8 × 10^7 to 3.9 × 10^7. The Apc gene is located in the first interval. Error bars represent standard error of the mean. At a position that has lost heterozygosity, the fraction of reads aligning to the reference allele is expected to be considerably lower than 0.5. (C) Copy number variations on chromosome 18 in Apcmin/+ sham and Apcmin/+ ETBF tumor samples as determined by CNVkit. Each gray dot represents an individual data point, and the orange lines/dot represent the average copy number variation over a given region of chromosome 18. A log2 copy ratio of 0 represents no difference in copy number at that location between the tumor sample and a normal sample from the same mouse.
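The per-chromosome LOH score described in panel A can be illustrated with a minimal Python sketch. It assumes the score is the mean fraction of reads supporting the reference allele across informative positions on a chromosome; the caption does not give the exact formula, and the function name `loh_scores`, the input layout, and the read counts below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def loh_scores(sites):
    """Return the mean reference-allele read fraction per chromosome.

    `sites` is an iterable of (chromosome, ref_reads, alt_reads) tuples,
    one per informative position (assumed input layout, not from the paper).
    """
    fractions = defaultdict(list)
    for chrom, ref_reads, alt_reads in sites:
        depth = ref_reads + alt_reads
        if depth == 0:
            continue  # skip positions with no coverage
        fractions[chrom].append(ref_reads / depth)
    return {chrom: mean(values) for chrom, values in fractions.items()}


# Made-up read counts: chr1 stays near 0.5, chr18 drops well below it.
example_sites = [
    ("chr1", 10, 10),
    ("chr1", 12, 8),
    ("chr18", 2, 18),
    ("chr18", 4, 16),
]
scores = loh_scores(example_sites)
```

Under this assumption, a chromosome carrying a large LOH region yields a score clearly below 0.5, while unaffected chromosomes stay close to 0.5.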
FIG 2 Mutation frequency and distribution in ETBF-induced tumors and sham tumors. (A and B) Number of combined SNVs and indels identified via exome and whole-genome sequencing analyses. The number of mutations present in each group per megabase pair of sequenced DNA is identified. Total number of mutations and mutations resulting only in amino acid changes (amino acid altering) are presented. Error bars represent standard error of the mean. (C and D) Distribution of SNVs and indels by chromosome across the autosomal chromosomes via exome and whole-genome sequencing analyses. For exome sequencing, results were normalized to the overall length of hybrid probes used to target exons on that chromosome. For whole-genome sequencing, results were normalized to the length of each chromosome. The dotted line at 1 indicates the expected value if the mutational burden for each chromosome was evenly distributed. Values greater than 1 represent a higher-than-expected mutation rate for each chromosome, and values less than 1 represent a lower-than-expected mutation rate for each chromosome.
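The two normalizations in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the observed-over-expected ratio is assumed to be a chromosome's share of mutations divided by its share of the sequenced length, and the function names and numbers are hypothetical.

```python
def mutations_per_mb(mutation_count, sequenced_bp):
    """Mutation burden per megabase pair of sequenced DNA."""
    if sequenced_bp <= 0:
        raise ValueError("sequenced_bp must be positive")
    return mutation_count / (sequenced_bp / 1_000_000)


def normalized_distribution(mutation_counts, lengths):
    """Observed / expected mutations per chromosome.

    `lengths` holds the chromosome length (whole genome) or the total
    probe length on that chromosome (exome). A value of 1 means the
    chromosome carries exactly its length-proportional share.
    """
    total_mutations = sum(mutation_counts.values())
    total_length = sum(lengths[chrom] for chrom in mutation_counts)
    if total_mutations == 0 or total_length == 0:
        raise ValueError("need at least one mutation and a nonzero length")
    ratios = {}
    for chrom, count in mutation_counts.items():
        expected = total_mutations * lengths[chrom] / total_length
        ratios[chrom] = count / expected
    return ratios


# Made-up counts: two equally long chromosomes with unequal burden.
ratios = normalized_distribution(
    {"chr1": 30, "chr2": 10},
    {"chr1": 200, "chr2": 200},
)
burden = mutations_per_mb(50, 25_000_000)
```

Here `chr1` lands above the dotted line at 1 and `chr2` below it, matching how panels C and D are read.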
FIG 3 ETBF-specific mutational profiles extracted from whole-genome sequencing and compared to COSMIC single-base substitution (SBS) signatures. The R package MutationalPatterns was used to create SBS mutational profiles. (A) SNV mutational profile in the 6-mutation type format for whole-genome sequencing data. In the 6-mutation type format, mutations are divided into the following 6 categories: C > A, C > G, C > T, T > A, T > C, and T > G. Additionally, C > T mutations are further subdivided into those that occur within a CpG dinucleotide context and those that do not. (B) Graphic detailing how the ETBF-specific mutational profiles and the de novo extracted ETBF-specific mutational signature were created from the whole-genome sequencing data in the 96-mutation type format. In the 96-mutation type format, the 6 mutations outlined above are further subdivided into 16 categories, which represent the 16 combinations of nucleotides immediately 5′ and 3′ to each mutated base. The de novo signature was extracted from the ETBF-specific mutational profiles in Apcmin/+ and Apcmin/+Msh2VC mice. The total number of mutations belonging to each trinucleotide mutation type is presented. (C) Heatmaps comparing SBS COSMIC signatures (vertical axis) to the mutational profiles created from whole-genome sequencing data in Apcmin/+ mice and Apcmin/+Msh2VC mice. Numbers displayed represent “cosine similarity,” which is a metric used to quantify the similarity between any two mutational matrices. Only the top 10 COSMIC SBS signatures are shown. Dots indicate the mutational profiles most similar to the ETBF signature across multiple analyses.
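The 6-type and 96-type formats can be made concrete with a small Python sketch. It is not the MutationalPatterns implementation; it assumes the usual convention of reporting each substitution relative to the pyrimidine (C or T) of the base pair, and the names `classify_snv` and `ALL_CHANNELS` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SIX_TYPES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# 6 substitution types x 4 possible 5' bases x 4 possible 3' bases = 96.
ALL_CHANNELS = [
    f"{five}[{sub}]{three}"
    for sub in SIX_TYPES
    for five in "ACGT"
    for three in "ACGT"
]


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify_snv(context, alt):
    """Classify one SNV.

    `context` is the reference trinucleotide with the mutated base in
    the middle; `alt` is the alternate base. Returns the 6-type label,
    the 96-type channel, and whether a C>T change sits in a CpG context.
    """
    if len(context) != 3 or alt == context[1]:
        raise ValueError("need a trinucleotide and a differing alt base")
    if context[1] in "AG":
        # Purine reference: describe the change on the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    six_type = f"{context[1]}>{alt}"
    channel = f"{context[0]}[{six_type}]{context[2]}"
    is_cpg = six_type == "C>T" and context[2] == "G"
    return six_type, channel, is_cpg


forward = classify_snv("ACG", "T")   # C>T on the given strand
reverse = classify_snv("CGT", "A")   # the same event seen from the other strand
```

Both calls resolve to the same channel, which is why counting over the 96 channels gives one profile per sample regardless of which strand a variant was called on.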
FIG 4 ETBF-induced mutational analyses in an organoid model of ETBF colonization. (A) Plot showing the number of combined SNVs and indels identified via whole-genome sequencing analysis. The mutation rate per day in culture is identified. Error bars represent standard error of the mean. To determine the mutational load estimate during culture, mutation numbers were divided by the number of days the samples had been in culture. (B) Distribution of mutations throughout the genome. Only autosomal chromosomes are shown. Results were normalized to the length of each chromosome. The dotted line at 1 indicates the expected value if the mutational burden for each chromosome was evenly distributed. (C) Indel mutational profiles in the 83-mutation type format. This format groups indels based on several criteria including size of the indel, nucleotides affected, and the presence of the indel in a repetitive region and/or microhomology region. (D) SNV mutational profile in the 6-mutation type format. In the 6-mutation type format, mutations are divided into 6 categories as follows: C > A, C > G, C > T, T > A, T > C, and T > G. Additionally, C > T mutations are further subdivided into those that occur within a CpG dinucleotide context and those that do not. (E) SNV mutational profile in the 96-mutation type format. In the 96-mutation type format, the 6 mutations outlined above are further subdivided into 16 categories, which represent the 16 combinations of nucleotides immediately 5′ and 3′ to each mutated base. (F) Heatmaps comparing SBS COSMIC signatures (vertical axis) to the mutational profiles created from whole-genome sequencing data in organoids. Numbers displayed represent “cosine similarity,” which is a metric used to quantify the similarity between any two mutational matrices. Only the top 10 COSMIC SBS signatures are shown. Dots indicate the mutational profiles most similar to the ETBF signature.
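Cosine similarity, the metric shown in the heatmaps, can be sketched in a few lines of Python. The vectors below are made-up counts over a handful of channels rather than real 96-channel profiles, and the function is a generic implementation, not the one used by MutationalPatterns.

```python
import math


def cosine_similarity(profile_a, profile_b):
    """Cosine of the angle between two equal-length mutation count vectors.

    For non-negative counts the result is 1 for identical proportions
    and 0 when the two profiles share no mutation types at all.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(a * b for a, b in zip(profile_a, profile_b))
    norm_a = math.sqrt(sum(a * a for a in profile_a))
    norm_b = math.sqrt(sum(b * b for b in profile_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must contain at least one mutation")
    return dot / (norm_a * norm_b)


# Same proportions at different sequencing depth: similarity close to 1.
same_shape = cosine_similarity([1, 2, 3], [2, 4, 6])
# No shared mutation types: similarity of 0.
disjoint = cosine_similarity([5, 0], [0, 7])
```

Because the metric depends only on proportions, a sample with few mutations can still score highly against a signature, which is worth keeping in mind when reading the heatmap values.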