Emily Vogtmann1,2, Xing Hua1, Georg Zeller3, Shinichi Sunagawa3, Anita Y Voigt3,4,5,6, Rajna Hercog7, James J Goedert1, Jianxin Shi1, Peer Bork3,6,8,9, Rashmi Sinha1.
Abstract
Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. 
Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect associations that are reproducible and significant after correction for multiple testing.
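The 39% figure in the abstract can be checked from the three stated counts. The short sketch below only re-does that arithmetic; the dictionary layout and the function name are illustrative assumptions, not part of the study.

```python
# (reproduced in Washington, DC; significant in the French study),
# counts taken directly from the abstract.
replicated = {"genes": (4, 10), "modules": (3, 9), "pathways": (7, 17)}


def replication_rate(counts):
    """Return (hits, total, hits / total) summed over all feature types."""
    hits = sum(h for h, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return hits, total, hits / total


hits, total, rate = replication_rate(replicated)
print(f"{hits}/{total} = {rate:.1%}")  # 14/36 = 38.9%
```

Rounded to a whole percent, 14 of 36 gives the 39% reported above.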
Year: 2016 PMID: 27171425 PMCID: PMC4865240 DOI: 10.1371/journal.pone.0155362
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Descriptive characteristics of the colorectal cancer cases and controls (population WGSS DC), Washington DC, USA, 1985–1987.
| | Cases (N = 52) | | Controls (N = 52) | |
|---|---|---|---|---|
| Characteristic | N/Mean | %/SD | N/Mean | %/SD |
| Male | 37 | 71.2% | 37 | 71.2% |
| Female | 15 | 28.8% | 15 | 28.8% |
| Age (mean, SD) | 61.8 | 13.6 | 61.2 | 11.0 |
| Non-Hispanic white | 39 | 75.0% | 47 | 90.4% |
| Non-Hispanic black | 12 | 23.1% | 3 | 5.8% |
| Other | 1 | 1.9% | 2 | 3.8% |
| Less than high school | 8 | 15.4% | 2 | 3.8% |
| High school graduate | 11 | 21.2% | 10 | 19.2% |
| 1–3 years of college/graduate | 10 | 19.2% | 9 | 17.3% |
| 4–5 years of college/graduate | 12 | 23.1% | 15 | 28.8% |
| 6+ years of college/graduate | 8 | 15.4% | 16 | 30.8% |
| Missing data | 3 | 5.8% | 0 | 0.0% |
| Never | 24 | 46.2% | 22 | 42.3% |
| Former | 18 | 34.6% | 28 | 53.8% |
| Current | 7 | 13.5% | 2 | 3.8% |
| Missing data | 3 | 5.8% | 0 | 0.0% |
| Body mass index (mean, SD) | 24.9 | 4.2 | 25.3 | 4.3 |
| | 7.4 | 11.9 | 6.1 | 10.4 |
| Right colon | 15 | 28.8% | NA | |
| Left colon | 18 | 34.6% | NA | |
| Rectal | 14 | 26.9% | NA | |
| Missing data | 5 | 9.6% | NA | |
| | | | NA | |
| Pre-invasive | 12 | 23.1% | NA | |
| Invasive, no known metastases | 21 | 40.4% | NA | |
| Known metastases | 18 | 34.6% | NA | |
| Missing data | 1 | 1.9% | NA | |
NA: Not applicable
Comparison of significant taxa detected in 16S rRNA gene sequencing data with whole-genome shotgun sequencing data (presence/absence of taxa).
| | Population 16S DC | | | Population WGSS DC | | |
|---|---|---|---|---|---|---|
| Taxa (phylum; class; order; family; genus) | Case (%) | Control (%) | P¹ | Case (%) | Control (%) | P¹ |
| Fusobacteria (phylum) | 36.2 | 16.0 | 0.007 | 76.9 | 48.1 | 0.003 |
| Fusobacteria;Fusobacteria;Fusobacteriales;Fusobacteriaceae;Fusobacterium | 31.9 | 11.7 | 0.004 | 75.0 | 48.1 | 0.006 |
| Actinobacteria;Actinobacteria;Coriobacteriales;Coriobacteriaceae;Atopobium | 19.2 | 2.1 | <0.001 | 53.8 | 44.2 | 0.328 |
| Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas | 27.7 | 7.5 | 0.001 | 61.5 | 40.4 | 0.032 |
¹ P value based on two-sided chi-squared test
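To illustrate the kind of test named in the footnote, the sketch below runs a Pearson chi-squared test on a 2x2 presence/absence table for the Fusobacteria phylum in population WGSS DC. The counts are an assumption: they are back-calculated from the reported percentages with N = 52 per group. The test is computed without a continuity correction, which the source does not specify, so the result should be close to, but need not equal, the tabulated P value.

```python
import math


def chi2_2x2(case_pos, case_n, ctrl_pos, ctrl_n):
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table."""
    a, b = case_pos, case_n - case_pos  # cases: taxon present / absent
    c, d = ctrl_pos, ctrl_n - ctrl_pos  # controls: taxon present / absent
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


N = 52  # cases and controls each (from the table of descriptive characteristics)
case_pos = round(0.769 * N)  # assumed count behind 76.9%
ctrl_pos = round(0.481 * N)  # assumed count behind 48.1%

stat, p = chi2_2x2(case_pos, N, ctrl_pos, N)
print(f"present in {case_pos}/{N} cases and {ctrl_pos}/{N} controls")
print(f"chi-squared = {stat:.2f}, two-sided p = {p:.4f}")
```

The same function can be applied to the other rows by substituting their percentages.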
Comparison of significant relative abundance of taxa detected in 16S rRNA gene sequencing data with whole-genome shotgun sequencing data.
| | Population 16S DC | | | Population WGSS DC | | |
|---|---|---|---|---|---|---|
| Taxa (phylum; class; order; family; genus) | Case (%) | Control (%) | P¹ | Case (%) | Control (%) | P¹ |
| Firmicutes;Clostridia (class) | 68.6 | 77.8 | 0.005 | 33.9 | 39.0 | 0.092 |
| Firmicutes;Clostridia;Clostridiales;Lachnospiraceae;Coprococcus | 1.7 | 3.7 | 0.002 | 1.2 | 1.4 | 0.977 |
| Firmicutes;Clostridia;Clostridiales;Lachnospiraceae;Other² | 16.1 | 21.2 | 0.005 | NA | NA | NA |
¹ P value based on two-sided non-parametric Wilcoxon test
² It was not possible to estimate the “other” genus using whole-genome shotgun metagenomics
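For readers unfamiliar with the test in the first footnote, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the normal approximation. It uses midranks for ties and applies no tie or continuity correction, which may differ from the software used in the study. The relative abundance values are hypothetical toy numbers, not study data.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for x, p value). Ties receive midranks.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    u = w - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical relative abundances (%) of one taxon, for illustration only.
cases = [30.1, 28.4, 35.0, 33.2, 31.7]
controls = [38.9, 40.2, 36.5, 41.0, 39.3]
u, p = rank_sum_test(cases, controls)
print(f"U = {u}, two-sided p = {p:.4f}")
```

With samples this small an exact test would normally be preferred; the approximation is used here only to keep the sketch short.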
Fig 1. Comparison of Shannon diversity index, species richness, and community evenness for fecal samples from the Human Microbiome Project (HMP) Phase I (N = 94), MetaHIT (N = 292), and colorectal cancer cases and controls from population WGSS DC and population F.
Statistical differences between colorectal cancer cases and controls were tested using the Kruskal-Wallis test.
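The three alpha diversity measures named in the caption can be computed from a vector of per-species counts, as in the sketch below. The source does not say which evenness definition was used; Pielou's evenness (Shannon index divided by the natural log of richness) is assumed here, and the example counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Return (Shannon index, species richness, Pielou's evenness) for one sample."""
    present = [c for c in counts if c > 0]
    total = sum(present)
    props = [c / total for c in present]
    shannon = -sum(p * math.log(p) for p in props)  # natural-log Shannon index
    richness = len(present)  # number of species observed
    # Assumed definition: Pielou's evenness, set to 0.0 when fewer than two species.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return shannon, richness, evenness


# Hypothetical species counts for a single sample.
shannon, richness, evenness = alpha_diversity([120, 80, 40, 10, 0, 5])
print(f"Shannon = {shannon:.3f}, richness = {richness}, evenness = {evenness:.3f}")
```

A perfectly even community gives an evenness of 1, and dominance by a few species pushes it toward 0.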
Fig 2. QQ plot of p values for the association between the relative abundance (top) and presence/absence (bottom) of KEGG genes, modules, and pathways with colorectal cancer case status from fecal samples from population WGSS DC and population F.
Statistically significant associations after Bonferroni correction (p < 0.05/8028) between the relative abundance of a gene and colorectal cancer case status from population F and observed associations from population WGSS DC.
| | Population F | | | Population WGSS DC | | |
|---|---|---|---|---|---|---|
| KEGG gene | Case | Control | P¹ | Case | Control | P¹ |
| K07173 | 0.045% | 0.056% | 5.64E-07 | 0.045% | 0.047% | 4.18E-01 |
| K00177 | 0.038% | 0.025% | 3.36E-06 | 0.040% | 0.037% | 1.10E-01 |
| K01586 | 0.081% | 0.092% | 4.67E-06 | 0.087% | 0.091% | 8.39E-02 |
| K00176 | 0.031% | 0.020% | 4.75E-06 | 0.029% | 0.027% | 3.22E-01 |
| K01963 | 0.036% | 0.048% | 5.04E-06 | 0.036% | 0.039% | 9.38E-02 |
| K00394 | 0.011% | 0.019% | 5.51E-06 | 0.010% | 0.011% | 5.98E-01 |
Note: Genes in bold were reproduced (p < 0.05) in population WGSS DC
¹ P value based on two-sided Wald chi-squared test after adjustment for age, sex, and body mass index
Statistically significant associations after Bonferroni correction (p < 0.05/485) between the relative abundance of a module and colorectal cancer case status from population F and observed associations from population WGSS DC.
| | Population F | | | Population WGSS DC | | |
|---|---|---|---|---|---|---|
| KEGG module | Case | Control | P¹ | Case | Control | P¹ |
| M00311 | 0.342% | 0.226% | 3.32E-06 | 0.376% | 0.349% | 1.45E-01 |
| M00045 | 0.163% | 0.096% | 1.48E-05 | 0.198% | 0.174% | 5.78E-02 |
| M00185 | 0.095% | 0.147% | 2.16E-05 | 0.081% | 0.081% | 9.67E-01 |
| M00144 | 0.646% | 0.483% | 3.57E-05 | 0.727% | 0.672% | 8.97E-02 |
| M00373 | 0.276% | 0.222% | 4.42E-05 | 0.298% | 0.280% | 6.24E-02 |
| M00173 | 1.721% | 1.519% | 5.44E-05 | 1.815% | 1.783% | 3.72E-01 |
Note: Modules in bold were reproduced (p < 0.05) in population WGSS DC
¹ P value based on two-sided Wald chi-squared test after adjustment for age, sex, and body mass index
Statistically significant associations after Bonferroni correction (p < 0.05/318) between the relative abundance of a pathway and colorectal cancer case status from population F and observed associations from population WGSS DC.
| | Population F | | | Population WGSS DC | | |
|---|---|---|---|---|---|---|
| KEGG pathway | Case | Control | P¹ | Case | Control | P¹ |
| ko04964 | 0.052% | 0.035% | 4.56E-06 | 0.052% | 0.048% | 1.70E-01 |
| ko00400 | 1.245% | 1.335% | 4.22E-05 | 1.239% | 1.258% | 2.08E-01 |
| ko00430 | 0.245% | 0.226% | 4.68E-05 | 0.248% | 0.244% | 3.23E-01 |
| ko00195 | 0.615% | 0.717% | 5.53E-05 | 0.640% | 0.669% | 1.44E-01 |
| ko00627 | 0.162% | 0.134% | 6.29E-05 | 0.159% | 0.150% | 9.51E-02 |
| ko00983 | 0.579% | 0.548% | 9.11E-05 | 0.588% | 0.589% | 8.33E-01 |
| ko00360 | 0.396% | 0.352% | 1.04E-04 | 0.395% | 0.385% | 2.58E-01 |
| ko00270 | 1.736% | 1.827% | 1.29E-04 | 1.703% | 1.730% | 1.65E-01 |
| ko00643 | 0.027% | 0.017% | 1.33E-04 | 0.021% | 0.020% | 4.31E-01 |
| ko00720 | 1.622% | 1.514% | 1.42E-04 | 1.663% | 1.645% | 3.71E-01 |
Note: Pathways in bold were reproduced (p < 0.05) in population WGSS DC
¹ P value based on two-sided Wald chi-squared test after adjustment for age, sex, and body mass index
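The Bonferroni thresholds quoted in the three table captions follow from dividing 0.05 by the number of features tested (8028 genes, 485 modules, 318 pathways). The sketch below recomputes them and checks that the largest population F P value listed in each table falls under its threshold. The variable names are illustrative assumptions, and the P values are copied from the rows shown above.

```python
ALPHA = 0.05

# Number of tests per feature type, from the table captions.
n_tests = {"genes": 8028, "modules": 485, "pathways": 318}

# Largest population F P value listed in each table above.
largest_listed_p = {"genes": 5.51e-06, "modules": 5.44e-05, "pathways": 1.42e-04}

# Bonferroni: a result is significant only if p < alpha / number of tests.
thresholds = {name: ALPHA / n for name, n in n_tests.items()}

for name, cutoff in thresholds.items():
    passes = largest_listed_p[name] < cutoff
    print(f"{name}: threshold = {cutoff:.3e}, "
          f"largest listed p = {largest_listed_p[name]:.2e}, passes = {passes}")
```

All three checks pass, so every row shown meets the corrected threshold in population F. No listed row reaches the unadjusted p < 0.05 criterion in population WGSS DC.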