Roshonda B Jones, Xiangzhu Zhu, Emili Moan, Harvey J Murff, Reid M Ness, Douglas L Seidner, Shan Sun, Chang Yu, Qi Dai, Anthony A Fodor, M Andrea Azcarate-Peril, Martha J Shrubsole.
Abstract
The purpose of this study is to evaluate similarities and differences in gut bacterial measurements and stability in the microbial communities of three different types of samples that could be used to assess different niches of the gut microbiome: rectal swab, stool, and normal rectal mucosa samples. In swab-stool comparisons, there were substantial taxa differences, with some taxa varying largely by sample type (e.g. Thermaceae), by inter-individual variation (e.g. Desulfovibrionaceae), or by both sample type and participant (e.g. Enterobacteriaceae). Comparing all three sample types with whole-genome metagenome shotgun sequencing, swab samples were much closer to stool samples than to mucosa samples, although all KEGG functional Level 1 and Level 2 pathways were significantly different across all sample types (e.g. transcription and environmental adaptation). However, the individual signature of participants was also observed and was largely stable between two time points. Thus, we found that while the distribution of some taxa was associated with these different sampling techniques, other taxa largely reflected individual differences in the microbial community that were insensitive to sampling technique. There is substantial variability in the assessment of the gut microbial community according to the type of sample.
Year: 2018 PMID: 29515151 PMCID: PMC5841359 DOI: 10.1038/s41598-018-22408-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
16S rRNA sequence reads from both stool and swab samples after various filtering steps.
| Step | Number of Samples | Number of OTUs | Total Number of Sequence Reads | Mean Reads per sample ± SD (SE) | Minimum reads per sample | Maximum reads per sample |
|---|---|---|---|---|---|---|
| 16S rRNA amplicon reads generated | 240 | — | 36,832,742 | 153,469.76 ± 214,793.55 (13,864.86) | 85 | 1,768,150 |
| After clustering into closed-reference OTUs | 240 | 8,375 | 36,826,591 | 153,444.13 ± 213,926.13 | 85 | 1,768,150 |
| After filtering out OTUs in less than 20% of samples | 240 | 1,849 | 32,222,426 | 134,260.11 ± 209,547.62 | 85 | 1,768,150 |
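
The last filtering step in the table above (dropping OTUs found in fewer than 20% of samples) can be sketched as a simple prevalence filter. This is a minimal illustration, not the authors' pipeline: the function name `filter_by_prevalence`, the dictionary layout, and the rule that "present" means a read count above zero are all assumptions made for the example.

```python
def filter_by_prevalence(otu_counts, min_fraction=0.20):
    """Keep OTUs observed in at least `min_fraction` of samples.

    otu_counts: dict mapping an OTU id to a list of read counts,
                one count per sample (hypothetical layout).
    Assumption: an OTU is "present" in a sample if its count is > 0.
    """
    kept = {}
    for otu, counts in otu_counts.items():
        n_present = sum(1 for c in counts if c > 0)
        if n_present / len(counts) >= min_fraction:
            kept[otu] = counts
    return kept


# Toy table with 5 samples (made-up counts, for illustration only).
toy = {
    "otu_a": [12, 0, 0, 0, 0],   # present in 1/5 = 20% of samples: kept
    "otu_b": [0, 0, 0, 0, 0],    # never observed: dropped
    "otu_c": [3, 7, 0, 1, 9],    # present in 4/5 samples: kept
}
filtered = filter_by_prevalence(toy)
```

The same function with `min_fraction=0.25` would express the "present in at least 25% of samples" rule that the family-level table footnote describes.
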
Whole-genome metagenome shotgun sequence reads from both stool and swab samples after various filtering steps.
| Step | Number of Samples | Total Number of Sequence Reads | Mean Reads per sample ± SD (SE) | Minimum reads per sample | Maximum reads per sample |
|---|---|---|---|---|---|
| After removing reads mapping to the human genome | 128 | 25,720,978 | 200,945.1 ± 104,515.8 (9,237.98) | 342 | 521,089 |
| After removing samples with low read counts | 127 | 25,720,636 | 202,524.7 ± 103,384.5 (9,173.9) | 13,405 | 521,089 |
| After assigning reads to gene families | 127 | 22,285,207 | 175,474.1 ± 97,062.82 (8,612.93) | 13,863 | 455,173 |
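
The standard errors (SE) reported in parentheses in the two read-count tables are consistent with the usual relation SE = SD / sqrt(n), where n is the number of samples. The short check below assumes that relation (the source does not state the formula) and uses only values taken from the tables; the function name `standard_error` is hypothetical.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean, assuming SE = SD / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)


# 16S rRNA amplicon reads: SD 214,793.55 over 240 samples.
se_16s = standard_error(214_793.55, 240)

# Shotgun reads after removing human reads: SD 104,515.8 over 128 samples.
se_wgs = standard_error(104_515.8, 128)

print(round(se_16s, 2), round(se_wgs, 2))
```

Both values agree with the tabulated SEs (13,864.86 and 9,237.98) to within rounding.
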
Figure 1. Multidimensional scaling (MDS) of closed-reference OTUs classified at the family level using 16S rRNA gene sequence reads. There are four samples (two each of stool and swab) from each of the 60 participants in our study, colored by sample origin (red is stool and blue is swab). Repeated samples from individuals were collected with an average separation of 3 months. The distinct separation of colors shows that there is separation by sample type in MDS axis 1 and MDS axis 2 (a,c) but not in MDS axes 3 and 4 (b). However, MDS axis 4 shows strong clustering by participant (d).
Differences in MDS axes of closed-reference OTUs of 16S rRNA gene sequence reads classified to family level taxa due to sample source (stool vs. swab) and participant source.
| Item | Stool samples (mean ± standard deviation) | Swab samples (mean ± standard deviation) | Stool vs. Swab p-value (a) | Participant p-value (a) | R-squared marginal (b) | R-squared conditional (c) |
|---|---|---|---|---|---|---|
| Number of sequences per sample | 139,959 ± 211,557 | 144,768 ± 191,410 | 0.848 | 0.099 | 0.005 | 0.097 |
| Sun rarified Richness | 10.44 ± 1.813 | 11.726 ± 2.348 | 3.00 × 10−08 | 4.80 × 10−08 | 0.087 | 0.386 |
| Shannon diversity | 1.83 ± 0.272 | 1.961 ± 0.311 | 1.55 × 10−05 | 1.36 × 10−10 | 0.049 | 0.418 |
| Shannon evenness | 0.514 ± 0.078 | 0.529 ± 0.08 | 0.095 | 2.74 × 10−05 | 0.009 | 0.255 |
| MDS axis 1 | −0.044 ± 0.03 | 0.045 ± 0.06 | 3.62 × 10−37 | 7.08 × 10−05 | 0.475 | 0.597 |
| MDS axis 2 | 0.022 ± 0.07 | −0.023 ± 0.05 | 1.20 × 10−11 | 3.92 × 10−12 | 0.127 | 0.496 |
| MDS axis 3 | −0.012 ± 0.06 | 0.012 ± 0.07 | 6.00 × 10−05 | 3.93 × 10−23 | 0.034 | 0.615 |
| MDS axis 4 | −0.005 ± 0.07 | 0.005 ± 0.06 | 0.0196 | 5.63 × 10−47 | 0.006 | 0.801 |
| MDS axis 5 | −0.008 ± 0.07 | 0.008 ± 0.06 | 0.001 | 1.58 × 10−30 | 0.026 | 0.688 |
| MDS axis 6 | 0.009 ± 0.07 | −0.009 ± 0.06 | 0.004 | 1.43 × 10−23 | 0.018 | 0.613 |
| MDS axis 7 | −0.002 ± 0.07 | 0.002 ± 0.06 | 0.557 | 1.40 × 10−14 | 0.005 | 0.472 |
| MDS axis 8 | 0.001 ± 0.07 | −0.001 ± 0.06 | 0.641 | 1.46 × 10−33 | 0 | 0.706 |
| MDS axis 9 | 0.002 ± 0.07 | −0.001 ± 0.07 | 0.809 | 2.05 × 10−14 | 0.002 | 0.483 |
| MDS axis 10 | −0.007 ± 0.07 | 0.006 ± 0.06 | 0.109 | 5.92 × 10−08 | 0.01 | 0.334 |
| MDS axis 11 | 0.006 ± 0.07 | −0.006 ± 0.06 | 0.109 | 2.35 × 10−11 | 0.008 | 0.416 |
| MDS axis 12 | −0.009 ± 0.07 | 0.009 ± 0.06 | 0.007 | 6.98 × 10−16 | 0.02 | 0.505 |
| MDS axis 13 | 0.002 ± 0.06 | −0.002 ± 0.07 | 0.557 | 8.64 × 10−09 | 0.001 | 0.35 |
| MDS axis 14 | 0.009 ± 0.06 | −0.008 ± 0.07 | 0.031 | 2.18 × 10−06 | 0.022 | 0.299 |
| MDS axis 15 | 0.003 ± 0.07 | −0.003 ± 0.06 | 0.438 | 1.34 × 10−14 | 0.008 | 0.475 |
(a) p-value derived from ANOVA of the mixed linear model. (b) The marginal R-squared represents the variation that is explained by the model without the random effect (participant), while (c) the conditional R-squared represents the variation that is explained by the model including both fixed effects and the random effect.
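
To make the distinction between the two R-squared columns concrete, the sketch below uses one common variance-component formulation: marginal R-squared is the fixed-effect variance divided by the total variance, and conditional R-squared adds the random-effect (participant) variance to the numerator. The source does not say which implementation was used, so this formulation, the function name, and the example variance components are assumptions for illustration only.

```python
def marginal_conditional_r2(var_fixed, var_random, var_residual):
    """Return (marginal, conditional) R-squared from variance components.

    var_fixed:    variance explained by fixed effects (e.g. sample type)
    var_random:   variance attributed to the random effect (participant)
    var_residual: residual variance
    Assumed formulation: marginal = fixed / total,
                         conditional = (fixed + random) / total.
    """
    total = var_fixed + var_random + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components, not taken from the study.
r2_marginal, r2_conditional = marginal_conditional_r2(1.0, 2.0, 1.0)
```

Under this reading, a row such as MDS axis 4 (marginal 0.006, conditional 0.801) indicates that almost all of the explained variation comes from the participant term rather than from sample type.
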
Figure 2. The first 15 MDS axes were regressed against sample type (swab or stool), participant ID and time point. The −log10 (p-value) values for the null hypotheses that sample type, participant ID and time point have no impact on the MDS axes are all shown. While the first MDS axis differs significantly between stool and swab samples, the MDS axes thereafter differ significantly between participants. Taxonomic calls were based on QIIME closed-reference OTU picking against the GreenGenes database.
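
The −log10 transform used in this figure turns very small p-values into large positive numbers, which makes the two effects easy to compare. The sketch below applies it to the sample-type and participant p-values listed for MDS axes 1 and 4 in the table above; the helper name `neg_log10` is hypothetical.

```python
import math


def neg_log10(p_value):
    """Return -log10(p) for a p-value in (0, 1]."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


# (sample-type p-value, participant p-value) as listed in the table.
axes = {
    "MDS axis 1": (3.62e-37, 7.08e-05),
    "MDS axis 4": (0.0196, 5.63e-47),
}
for name, (p_type, p_participant) in axes.items():
    print(name, round(neg_log10(p_type), 2), round(neg_log10(p_participant), 2))
```

On this scale, axis 1 is dominated by sample type while axis 4 is dominated by participant, matching the pattern the caption describes.
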
Differences in closed-reference OTUs of 16S rRNA gene sequence reads classified to family level taxa due to sample source (stool vs. swab) and participant source.
| Family-level taxa (a) | Stool: log-normalized mean abundance ± standard deviation | Swab: log-normalized mean abundance ± standard deviation | Stool vs. Swab p-value (b) | Participant p-value (c) | R-squared marginal (d) | R-squared conditional (e) |
|---|---|---|---|---|---|---|
| Acidaminococcaceae | 2.943 ± 1.1 | 2.728 ± 2.73 | 0.147 | 6.74 × 10-40 | 0.012 | 0.759 |
| Actinomycetaceae | 1.342 ± 0.69 | 1.327 ± 1.33 | 0.998 | 0.066 | 0 | 0.108 |
| Aerococcaceae | 0.169 ± 0.38 | 0.326 ± 0.33 | 0.004 | 0.341 | 0.027 | 0.078 |
| Bacillaceae_1 | 1.866 ± 0.6 | 3.338 ± 3.34 | 3.18 × 10-53 | 0.148 | 0.482 | 0.526 |
| BacillalesIncertae_Sedis_XI | 0.702 ± 0.71 | 0.51 ± 0.51 | 0.055 | 0.278 | 0.016 | 0.076 |
| Bacteroidaceae | 4.564 ± 0.29 | 4.512 ± 4.51 | 0.275 | 2.88 × 10-12 | 0.011 | 0.436 |
| Beijerinckiaceae | 0.331 ± 0.51 | 0.304 ± 0.3 | 0.934 | 5.05 × 10-04 | 0.003 | 0.212 |
| Bifidobacteriaceae | 0.451 ± 0.6 | 0.317 ± 0.32 | 0.062 | 2.44 × 10-08 | 0.014 | 0.35 |
| Burkholderiaceae | 0.425 ± 0.62 | 0.499 ± 0.5 | 0.632 | 2.13 × 10-04 | 0.006 | 0.23 |
| Burkholderialesincertae_sedis | 0.697 ± 1.03 | 0.703 ± 0.7 | 0.998 | 8.92 × 10-43 | 0 | 0.779 |
| Campylobacteraceae | 0.331 ± 0.49 | 1.485 ± 1.49 | 1.10 × 10-31 | 2.47 × 10-05 | 0.304 | 0.482 |
| Carnobacteriaceae | 0.792 ± 0.68 | 0.32 ± 0.32 | 2.83 × 10-15 | 0.004 | 0.148 | 0.293 |
| Chloroplast | 0.751 ± 0.75 | 0.25 ± 0.25 | 2.30 × 10-14 | 0.197 | 0.138 | 0.201 |
| Clostridiaceae_1 | 1.415 ± 1.15 | 1.033 ± 1.03 | 0.003 | 9.08 × 10-17 | 0.028 | 0.525 |
| ClostridialesIncertae_SedisXI | 0.884 ± 0.72 | 2.887 ± 2.89 | 1.13 × 10-71 | 0.019 | 0.611 | 0.665 |
| ClostridialesIncertae_Sedis XII | 1.403 ± 0.96 | 0.938 ± 0.94 | 5.60 × 10-07 | 3.71 × 10-26 | 0.076 | 0.664 |
| ClostridialesIncertae_Sedis XIII | 2.176 ± 0.7 | 2.076 ± 2.08 | 0.435 | 8.62 × 10-10 | 0.009 | 0.384 |
| Comamonadaceae | 0.195 ± 0.41 | 0.242 ± 0.24 | 0.648 | 0.011 | 0.003 | 0.154 |
| Coriobacteriaceae | 2.958 ± 0.56 | 2.652 ± 2.65 | 9.63 × 10-08 | 0.004 | 0.074 | 0.235 |
| Corynebacteriaceae | 0.105 ± 0.26 | 0.836 ± 0.84 | 1.52 × 10-23 | 0.320 | 0.226 | 0.269 |
| Desulfovibrionaceae | 1.91 ± 1.03 | 1.935 ± 1.94 | 0.998 | 1.43 × 10-29 | 0.001 | 0.672 |
| Enterobacteriaceae | 1.423 ± 1.16 | 2.129 ± 2.13 | 5.59 × 10-09 | 2.17 × 10-17 | 0.09 | 0.565 |
| Enterococcaceae | 0.296 ± 0.68 | 0.386 ± 0.39 | 0.562 | 2.87 × 10-27 | 0.006 | 0.649 |
| Erysipelotrichaceae | 3.189 ± 0.58 | 3.225 ± 3.23 | 0.863 | 4.53 × 10-07 | 0.009 | 0.315 |
| Eubacteriaceae | 0.859 ± 0.7 | 0.547 ± 0.55 | 7.28 × 10-06 | 0.000305 | 0.055 | 0.263 |
| Flavobacteriaceae | 0.362 ± 0.52 | 0.394 ± 0.39 | 0.921 | 4.53 × 10-11 | 0.001 | 0.406 |
| Fusobacteriaceae | 0.594 ± 0.97 | 1.199 ± 1.2 | 4.98 × 10-08 | 1.47 × 10-12 | 0.084 | 0.503 |
| Hyphomicrobiaceae | 0.96 ± 0.88 | 0.885 ± 0.89 | 0.800 | 2.24 × 10-37 | 0.004 | 0.74 |
| Incertae_Sedis_XI | 0.176 ± 0.37 | 1.618 ± 1.62 | 3.71 × 10-47 | 0.024 | 0.433 | 0.508 |
| Lachnospiraceae | 4.556 ± 0.19 | 4.512 ± 4.51 | 0.158 | 0.141 | 0.03 | 0.113 |
| Lactobacillaceae | 0.896 ± 0.84 | 0.947 ± 0.95 | 0.921 | 1.60 × 10-05 | 0.002 | 0.263 |
| Microbacteriaceae | 0.296 ± 0.58 | 0.353 ± 0.35 | 0.740 | 1.65 × 10-14 | 0.006 | 0.475 |
| Micrococcaceae | 0.436 ± 0.63 | 0.326 ± 0.33 | 0.177 | 2.95 × 10-08 | 0.009 | 0.346 |
| Moraxellaceae | 0.067 ± 0.26 | 0.986 ± 0.99 | 8.74 × 10-37 | 0.162 | 0.347 | 0.4 |
| Pasteurellaceae | 0.568 ± 0.81 | 0.55 ± 0.55 | 0.998 | 1.81 × 10-09 | 0.002 | 0.369 |
| Peptococcaceae_1 | 0.416 ± 0.67 | 0.778 ± 0.78 | 3.33 × 10-06 | 3.15 × 10-22 | 0.06 | 0.614 |
| Peptostreptococcaceae | 1.924 ± 0.99 | 2.017 ± 2.02 | 0.717 | 1.62 × 10-14 | 0.003 | 0.479 |
| Porphyromonadaceae | 3.615 ± 0.67 | 3.683 ± 3.68 | 0.648 | 3.75 × 10-19 | 0.005 | 0.548 |
| Prevotellaceae | 2.633 ± 1 | 3.126 ± 3.13 | 1.27 × 10-06 | 4.60 × 10-28 | 0.066 | 0.68 |
| Propionibacteriaceae | 0.172 ± 0.39 | 0.269 ± 0.27 | 0.077 | 0.000326 | 0.017 | 0.23 |
| Rikenellaceae | 3.356 ± 0.69 | 2.995 ± 3 | 1.77 × 10-06 | 1.60 × 10-19 | 0.052 | 0.576 |
| Ruminococcaceae | 4.368 ± 0.33 | 4.321 ± 4.32 | 0.435 | 6.04 × 10-12 | 0.009 | 0.434 |
| Streptococcaceae | 2.79 ± 0.77 | 2.825 ± 2.83 | 0.968 | 1.62 × 10-11 | 0.001 | 0.416 |
| Sutterellaceae | 2.569 ± 1.11 | 2.722 ± 2.72 | 0.435 | 7.57 × 10-41 | 0.01 | 0.767 |
| Synergistaceae | 0.544 ± 0.79 | 0.664 ± 0.66 | 0.435 | 2.57 × 10-14 | 0.007 | 0.473 |
| Thermaceae | 0.206 ± 0.4 | 1.581 ± 1.58 | 2.53 × 10-52 | 0.138 | 0.476 | 0.522 |
| Veillonellaceae | 2.225 ± 1.2 | 2.492 ± 2.49 | 0.061 | 2.15 × 10-22 | 0.017 | 0.599 |
| Verrucomicrobiaceae | 1.55 ± 1.22 | 1.202 ± 1.2 | 0.011 | 1.59 × 10-14 | 0.021 | 0.49 |
(a) Limited to families present in at least 25% of samples. (b) Benjamini-Hochberg corrected p-value derived from ANOVA of the mixed linear model. (c) Benjamini-Hochberg corrected p-value derived from ANOVA of linear models with and without participant as a random effect. (d) The marginal R-squared represents the variation that is explained by the model without the random effect (participant), while (e) the conditional R-squared represents the variation that is explained by the model including both fixed effects and the random effect.
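
The Benjamini-Hochberg correction named in the footnotes can be sketched in a few lines. This is a generic textbook version of the step-up adjustment, written in plain Python for illustration; it is not the authors' code, and the function name and the example p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical taxa.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)

# Flag taxa at a 10% false discovery rate.
significant = [q < 0.10 for q in adj]
```

Each adjusted value is the smallest false discovery rate at which that test would be called significant, so comparing it with 0.10 corresponds to a 10% false discovery rate threshold.
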
Figure 3. For each taxon at the family level present in at least 25% of samples, p-values for the null hypothesis of no difference between stool and swab are plotted against p-values for the null hypothesis of no difference by participant. Red symbols are taxa that have a p-value that is significant at a 10% false discovery rate. Taxa higher in swab than stool have a negative x-coordinate and taxa higher in stool than swab have a positive x-coordinate. Data were generated using closed-reference OTUs classified at the family level using 16S rRNA gene sequence reads.
Figure 4. Plot of the first two coordinates (a) and MDS 3 vs. MDS 4 (b) of an MDS ordination of the KEGG gene family abundance table for WGS sequence reads from swab samples (blue triangles), stool samples (red circles) and tissue samples (purple squares). The distinct separation of colors shows that there is separation by sample type in the first and third MDS axes. MDS axes plotted by participant ID show strong separation by sample type for MDS 1 (c) and by participant and sample type for MDS 3 (d).