Erin R Reichenberger, Gail Rosen, Uri Hershberg, Ruth Hershberg.
Abstract
The causes of the great variation in nucleotide composition of prokaryotic genomes have long been disputed. Here, we use extensive metagenomic and whole-genome data to demonstrate that both phylogeny and the environment shape prokaryotic nucleotide content. We show that across environments, various phyla are characterized by different mean guanine and cytosine (GC) values as well as by the extent of variation around that mean value. At the same time, we show that GC-content varies greatly as a function of environment, in a manner that cannot be entirely explained by disparities in phylogenetic composition. We find environmentally driven differences in nucleotide content not only between highly diverged environments (e.g., soil vs. aquatic vs. human gut) but also within a single type of environment. More specifically, we demonstrate that some human guts are associated with a microbiome that is consistently more GC-rich across phyla, whereas others are associated with a more AT-rich microbiome. These differences appear to be driven both by variations in phylogenetic composition and by environmental differences that are independent of these phylogenetic composition differences. Combined, our results demonstrate that both phylogeny and the environment significantly affect nucleotide composition and that the environmental differences affecting nucleotide composition are far subtler than previously appreciated.
Keywords: GC-content; evolutionary forces; genomic variation; metagenomics; mutation; natural selection
Year: 2015 PMID: 25861819 PMCID: PMC4453058 DOI: 10.1093/gbe/evv063
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Number of Data Sets and Sequences by Environment
| Environment | Number of Data Sets | Number of Sequences |
|---|---|---|
| Chicken Cecum | 2 | 384,676 |
| Contaminated Soil | 3 | 3,654,826 |
| Coral | 7 | 427,591 |
| Cow Rumen | 3 | 490,767 |
| Dental Plaque | 8 | 1,725,397 |
| Fish Slime | 2 | 80,878 |
| Fish Gut | 2 | 57,122 |
| Human Gut | 111 | 16,047,825 |
| Microbialites | 13 | 515,358 |
| Tundra | 1 | 5,894,070 |
| Water Marine | 13 | 810,607 |
| Water Mine | 2 | 359,534 |
| Water PondFresh | 4 | 325,037 |
| Water PondSaline | 12 | 665,214 |
| Total | 183 | 31,438,902 |
Note.—Sequence counts are numbers of classified sequences.
Figure: Relative abundance of each phylum within the various sampled environments.
Figure: Average GC-composition by environment. The GC-composition was averaged across the ten phyla found to be most abundant across all sampled biomes in an environmental category.
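The averaging described in this caption can be illustrated with a minimal sketch. It assumes that reads have already been assigned to phyla and that the environment-level value is the unweighted mean of per-phylum means; the record does not state how the authors weighted the average, and the function names and toy reads below are hypothetical. Restricting to the ten most abundant phyla is omitted for brevity.

```python
from collections import defaultdict


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


def mean_gc_by_phylum(reads):
    """Map each phylum to the mean GC-content of its reads.

    `reads` is an iterable of (phylum, sequence) pairs (hypothetical input shape).
    """
    per_phylum = defaultdict(list)
    for phylum, seq in reads:
        per_phylum[phylum].append(gc_content(seq))
    return {p: sum(v) / len(v) for p, v in per_phylum.items()}


def environment_gc(reads) -> float:
    """Unweighted mean of the per-phylum means (an assumed averaging scheme)."""
    means = mean_gc_by_phylum(reads)
    return sum(means.values()) / len(means)


# Toy data, not taken from the study.
toy_reads = [
    ("Actinobacteria", "GCGCGGCC"),
    ("Actinobacteria", "GCGCATGC"),
    ("Firmicutes", "ATATGCAT"),
]

if __name__ == "__main__":
    print(mean_gc_by_phylum(toy_reads))
    print(environment_gc(toy_reads))
```

Averaging per phylum first keeps a highly abundant phylum from dominating the environment-level value, which is one plausible reading of "averaged across the ten phyla".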
Spearman Correlation Coefficients for (A) All Environments, (B) Human Gut Environment, and (C) All Environments minus the Human Gut
| | Act | Bact | Chl | Cren | DT | Eury | Firm | Pro | Spiro | Ten |
|---|---|---|---|---|---|---|---|---|---|---|
| (A) All Environments | | | | | | | | | | |
| Act | 1 | 0.169 | −0.084 | 0.216 | 0.661 | 0.56 | 0.032 | 0.56 | −0.203 | −0.213 |
| Bact | 0.169 | 1 | 0.544 | 0.458 | 0.161 | 0.596 | 0.758 | 0.607 | 0.661 | 0.26 |
| Chl | −0.084 | 0.544 | 1 | 0.429 | −0.108 | 0.325 | 0.614 | 0.235 | 0.688 | 0.457 |
| Cren | 0.216 | 0.458 | 0.429 | 1 | 0.121 | 0.683 | 0.265 | 0.296 | 0.318 | 0.259 |
| DT | 0.661 | 0.161 | −0.108 | 0.121 | 1 | 0.304 | 0.104 | 0.394 | −0.13 | −0.227 |
| Eury | 0.56 | 0.596 | 0.325 | 0.683 | 0.304 | 1 | 0.296 | 0.621 | 0.249 | 0.137 |
| Firm | 0.032 | 0.758 | 0.614 | 0.265 | 0.104 | 0.296 | 1 | 0.477 | 0.774 | 0.366 |
| Pro | 0.56 | 0.607 | 0.235 | 0.296 | 0.394 | 0.621 | 0.477 | 1 | 0.245 | −0.067 |
| Spiro | −0.203 | 0.661 | 0.688 | 0.318 | −0.13 | 0.249 | 0.774 | 0.245 | 1 | 0.601 |
| Ten | −0.213 | 0.26 | 0.457 | 0.259 | −0.227 | 0.137 | 0.366 | −0.067 | 0.601 | 1 |
| (B) Human Gut Environment | | | | | | | | | | |
| Act | 1 | 0.016 | 0.085 | 0.048 | 0.444 | 0.248 | 0.056 | 0.394 | 0.075 | 0.063 |
| Bact | 0.016 | 1 | 0.575 | 0.759 | 0.022 | 0.716 | 0.627 | 0.51 | 0.804 | 0.347 |
| Chl | 0.085 | 0.575 | 1 | 0.637 | −0.008 | 0.566 | 0.555 | 0.282 | 0.725 | 0.349 |
| Cren | 0.048 | 0.759 | 0.637 | 1 | −0.048 | 0.717 | 0.559 | 0.423 | 0.779 | 0.441 |
| DT | 0.444 | 0.022 | −0.008 | −0.048 | 1 | −0.052 | 0.166 | 0.094 | −0.023 | −0.089 |
| Eury | 0.248 | 0.716 | 0.566 | 0.717 | −0.052 | 1 | 0.455 | 0.528 | 0.749 | 0.497 |
| Firm | 0.056 | 0.627 | 0.555 | 0.559 | 0.166 | 0.455 | 1 | 0.412 | 0.779 | 0.381 |
| Pro | 0.394 | 0.51 | 0.282 | 0.423 | 0.094 | 0.528 | 0.412 | 1 | 0.489 | 0.125 |
| Spiro | 0.075 | 0.804 | 0.725 | 0.779 | −0.023 | 0.749 | 0.779 | 0.489 | 1 | 0.49 |
| Ten | 0.063 | 0.347 | 0.349 | 0.441 | −0.089 | 0.497 | 0.381 | 0.125 | 0.49 | 1 |
| (C) All Environments minus the Human Gut | | | | | | | | | | |
| Act | 1 | 0.554 | −0.015 | −0.003 | 0.753 | 0.792 | 0.485 | 0.803 | 0.08 | −0.266 |
| Bact | 0.554 | 1 | 0.431 | 0.386 | 0.368 | 0.779 | 0.908 | 0.719 | 0.557 | 0.139 |
| Chl | −0.015 | 0.431 | 1 | 0.636 | −0.082 | 0.337 | 0.525 | 0.124 | 0.658 | 0.558 |
| Cren | −0.003 | 0.386 | 0.636 | 1 | 0.033 | 0.361 | 0.422 | 0.105 | 0.539 | 0.478 |
| DT | 0.753 | 0.368 | −0.082 | 0.033 | 1 | 0.524 | 0.298 | 0.584 | 0.042 | −0.233 |
| Eury | 0.792 | 0.779 | 0.337 | 0.361 | 0.524 | 1 | 0.699 | 0.815 | 0.412 | 0.067 |
| Firm | 0.485 | 0.908 | 0.525 | 0.422 | 0.298 | 0.699 | 1 | 0.693 | 0.587 | 0.195 |
| Pro | 0.803 | 0.719 | 0.124 | 0.105 | 0.584 | 0.815 | 0.693 | 1 | 0.183 | −0.254 |
| Spiro | 0.08 | 0.557 | 0.658 | 0.539 | 0.042 | 0.412 | 0.587 | 0.183 | 1 | 0.664 |
| Ten | −0.266 | 0.139 | 0.558 | 0.478 | −0.233 | 0.067 | 0.195 | −0.254 | 0.664 | 1 |
Note.—Act, Actinobacteria; Bact, Bacteroidetes; Chl, Chlamydiae; Cren, Crenarchaeota; DT, Deinococcus-Thermus; Eury, Euryarchaeota; Firm, Firmicutes; Pro, Proteobacteria; Spiro, Spirochaetes; Ten, Tenericutes. Asterisk denotes statistical significance (P < 0.05, according to the Spearman Correlation test).
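As an illustration of how one cell of such a matrix could be obtained, the sketch below computes a Spearman rank correlation in plain Python. It assumes each cell correlates the GC-content of two phyla across the same set of samples, which the caption does not spell out; the function names and the toy values in the usage lines are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-sample mean GC-content for two phyla.
    bact = [0.41, 0.44, 0.47, 0.52]
    firm = [0.38, 0.40, 0.45, 0.49]
    print(spearman(bact, firm))
```

Because the statistic depends only on ranks, it captures any monotone relationship between the two phyla and is insensitive to the differing mean GC values of the phyla themselves.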
Figure: Relative abundance of the ten most abundant phyla in the human gut samples.
Within the Human Gut, High Abundance of the GC-Rich Phylum Actinobacteria Is Associated with Higher GC-Contents of Most Other Phyla: (A) Hypergeometric Probability and (B) Mann–Whitney–Wilcoxon P values for Each Phylum Comparing the Mean GC-Content for the Top 22% Guts with the Bottom 78% Guts Ordered by Actinobacteria Abundance
| (A) | No. of Top 24 Most Actinobacteria-Rich Samples that Are Most GC-Rich (hypergeometric P value) |
|---|---|
| Actinobacteria (GC) | 9 (0.026) |
| Bacteroidetes (GC) | 10 (0.008) |
| Chlamydiae (GC) | 11 (0.002) |
| Crenarchaeota (GC) | 11 (0.002) |
| Deinococcus-Thermus (GC) | 9 (0.026) |
| Euryarchaeota (GC) | 16 (0) |
| Firmicutes (GC) | 5 (0.220) |
| Proteobacteria (GC) | 9 (0.026) |
| Spirochaetes (GC) | 11 (0.002) |
| Tenericutes (GC) | 10 (0.008) |
| (B) | Mann–Whitney–Wilcoxon P value |
| Actinobacteria | 7.4e-6 |
| Bacteroidetes | 6.4e-4 |
| Chlamydiae | 2.4e-4 |
| Crenarchaeota | 1.6e-4 |
| Deinococcus-Thermus | 0.257 |
| Euryarchaeota | 1.8e-10 |
| Firmicutes | 0.198 |
| Proteobacteria | 3.0e-4 |
| Spirochaetes | 5.2e-4 |
| Tenericutes | 0.077 |
Note.—For each phylum, the number of guts among the 24 with the highest Actinobacteria abundance (24 data sets = ∼22%) that are also among the most GC-rich guts for that phylum. The hypergeometric P value is given in parentheses; it indicates the probability that such an overlap would occur by chance.
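The hypergeometric calculation behind panel (A) can be sketched as follows. The sketch assumes a population of 111 human gut data sets (the count given in the environment table), a set of 24 Actinobacteria-rich guts, and a "most GC-rich" set that also contains 24 guts. The size of the GC-rich set and the use of a point probability rather than an upper-tail probability are inferences, not statements from the note, and the function names are hypothetical. Under these assumptions the point probabilities fall close to the parenthesized values in the table (roughly 0.22 for an overlap of 5, for example).

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(exactly k successes) when drawing `draws` items without replacement."""
    if k < 0 or k > min(successes, draws):
        return 0.0
    if draws - k > population - successes:
        return 0.0
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


N_GUTS = 111   # human gut data sets, from the environment table
TOP_SET = 24   # size of each "top" set (assumed equal for both rankings)


def overlap_probability(k: int) -> float:
    """Chance that exactly k Actinobacteria-rich guts are also in the GC-rich set."""
    return hypergeom_pmf(k, N_GUTS, TOP_SET, TOP_SET)


def overlap_tail(k: int) -> float:
    """Chance of an overlap of k or more (upper tail), shown for comparison."""
    return sum(overlap_probability(i) for i in range(k, TOP_SET + 1))


if __name__ == "__main__":
    for k in (5, 9, 10, 11, 16):
        print(k, round(overlap_probability(k), 3), round(overlap_tail(k), 3))
```

The upper-tail variant is the more conventional form of an enrichment P value; it is included only so the two readings can be compared against the reported numbers.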