John Lightfield, Noah R Fram, Bert Ely.
Abstract
The GC content of bacterial genomes ranges from 16% to 75%, and wide ranges of genomic GC content are observed within many bacterial phyla, including both Gram-negative and Gram-positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.
Year: 2011 PMID: 21423704 PMCID: PMC3053387 DOI: 10.1371/journal.pone.0017677
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Bacterial species included within each phylum analyzed in this study.
| Phylum | Species | GC% |
| (missing) | | 42 |
| | | 45.7 |
| | | 50.9 |
| | | 53.5 |
| | | 56.3 |
| | | 60 |
| | | 64.2 |
| | | 66 |
| | | 71.2 |
| | | 74.2 |
| (missing) | | 22.4 |
| | | 27.1 |
| | | 32.5 |
| | | 35 |
| | | 42 |
| | | 45 |
| | | 50.1 |
| | | 55.3 |
| | | 64.3 |
| | | 66.1 |
| (missing) | | 30.8 |
| | | 35 |
| | | 41.3 |
| | | 47 |
| | | 50 |
| | | 55.4 |
| | | 60.2 |
| | | 62 |
| (missing) | | 23.8 |
| | | 25 |
| | | 30.7 |
| | | 35 |
| | | 40 |
| | | 46.1 |
| | | 50.3 |
| | | 55.8 |
| | | 60.8 |
| | | 68.7 |
Bacterial species included within the classes analyzed in this study.
| Class | Species | GC% |
| (missing) | | 27.5 |
| | | 30.1 |
| | | 35.2 |
| | | 41.1 |
| | | 45.2 |
| | | 50 |
| | | 56.1 |
| | | 60.8 |
| | | 65 |
| (missing) | | 44.8 |
| | | 45.5 |
| | | 48.5 |
| | | 50.7 |
| | | 53.9 |
| | | 55.7 |
| | | 59.2 |
| | | 61.4 |
| | | 63.3 |
| | | 65.2 |
| | | 68.9 |
| (missing) | | 33.1 |
| | | 46.6 |
| | | 50.6 |
| | | 55.1 |
| | | 60.3 |
| | | 67.1 |
| | | 71.4 |
| | | 74.9 |
| (missing) | | 16.6 |
| | | 20.2 |
| | | 25.3 |
| | | 31.6 |
| | | 37.2 |
| | | 40 |
| | | 45.1 |
| | | 50.3 |
| | | 55 |
| | | 60.1 |
| | | 65 |
| | | 68 |
Figure 1. Distribution of genomic GC content within bacterial phyla or classes.
Within each class or phylum, the genomic GC contents of all individual genomes available on June 16, 2010 were binned in five percent increments, with the number on the X-axis representing the upper limit of the bin. A) Actinobacteria; B) Alphaproteobacteria; C) Betaproteobacteria; D) Bacteroidetes/Chlorobi; E) Cyanobacteria; F) Deltaproteobacteria; G) Firmicutes; H) Gammaproteobacteria.
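The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `bin_gc` and the convention that a value lying exactly on a boundary (for example 60%) falls into the bin it tops are assumptions. The sample input reuses the first ten GC% values from the phylum table above purely as example data; the figure itself was built from all genomes available at the time.

```python
import math
from collections import Counter


def bin_gc(gc_values, width=5):
    """Count genomes per bin, keyed by the upper limit of each bin.

    Assumption: a value exactly on a boundary belongs to the bin it tops,
    so 60.0 is counted in the (55, 60] bin.
    """
    counts = Counter()
    for gc in gc_values:
        top = math.ceil(gc / width) * width  # upper limit of the bin
        counts[top] += 1
    return dict(sorted(counts.items()))


# Example data only: GC% values of the first group in the phylum table.
sample = [42, 45.7, 50.9, 53.5, 56.3, 60, 64.2, 66, 71.2, 74.2]
print(bin_gc(sample))
# {45: 1, 50: 1, 55: 2, 60: 2, 65: 1, 70: 1, 75: 2}
```

Keying each count by the upper limit of its bin mirrors the X-axis labelling described in the caption.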
Figure 2. The GC content at each of the three codon positions plotted against the genomic GC content of the representative Alphaproteobacteria listed in the table of classes above.
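As a rough illustration of the quantity plotted in this figure, the GC content at the first, second, and third codon positions of a coding sequence can be computed as below. The function name `gc_by_codon_position` is hypothetical, and the sketch assumes an in-frame DNA sequence with no ambiguous bases; it does not reproduce the authors' pipeline.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as percentages for an in-frame coding sequence.

    Assumptions: the sequence starts at codon position 1 and contains only
    A, C, G, T. Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at pos
        gc = sum(base in "GC" for base in bases)
        result.append(100.0 * gc / len(bases) if bases else 0.0)
    return tuple(result)


# Two codons, ATG and GCC: positions 1 and 2 are half GC, position 3 is all GC.
print(gc_by_codon_position("ATGGCC"))  # (50.0, 50.0, 100.0)
```

For a whole genome, the same counts would be accumulated over all annotated coding sequences before the percentages are taken.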
Figure 3. The predicted percentage of amino acids encoded by the three high-GC codon families, proline, alanine, and glycine, plotted against the genomic GC content of the representative bacteria among the following groups.
A) Alphaproteobacteria; B) Betaproteobacteria; C) Gammaproteobacteria; D) Deltaproteobacteria; E) Actinobacteria; F) Bacteroidetes/Chlorobi; G) Cyanobacteria; H) Firmicutes.
Figure 4. The predicted percentage of amino acids encoded by the three low-GC codon families, asparagine, lysine, and isoleucine, plotted against the genomic GC content of the representative bacteria among the following groups.
A) Alphaproteobacteria; B) Betaproteobacteria; C) Gammaproteobacteria; D) Deltaproteobacteria; E) Actinobacteria; F) Bacteroidetes/Chlorobi; G) Cyanobacteria; H) Firmicutes.
Figure 5. Average of the slopes for all 20 amino acid codon families (with the six-fold degenerate amino acid codon families being divided into two groups based on the first position nucleotide) for all eight groups plotted against the average GC content of their respective codons.
For each class or phylum, the frequency of each amino acid in the coding portion of each genome was plotted against the genomic GC content of the genome. The slopes of these eight plots were then averaged and plotted versus the average GC content of the codon family for each amino acid.
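The two quantities in this figure, the average GC content of a codon family and the group-averaged slope of amino acid frequency against genomic GC content, can be sketched as below. The caption does not state the fitting method, so ordinary least squares is an assumption here, as are the function names. The numeric input in the example is synthetic and serves only to show the calculation.

```python
def codon_family_gc(codons):
    """Average GC content (percent) over all bases of a codon family."""
    total = sum(len(codon) for codon in codons)
    gc = sum(base in "GC" for codon in codons for base in codon.upper())
    return 100.0 * gc / total


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (assumed fit method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def average_slope(groups):
    """Mean of per-group slopes; each group is a (gc_values, freqs) pair."""
    slopes = [slope(xs, ys) for xs, ys in groups]
    return sum(slopes) / len(slopes)


# Alanine (GCN) is a GC-rich family; lysine (AAA, AAG) is AT-rich.
print(round(codon_family_gc(["GCA", "GCC", "GCG", "GCT"]), 1))  # 83.3
print(round(codon_family_gc(["AAA", "AAG"]), 1))                # 16.7

# Synthetic values, not from the paper: amino acid frequency (%) rising
# by 1 point per 10 points of genomic GC% gives a slope of about 0.1.
groups = [([30, 40, 50], [8.0, 9.0, 10.0]),
          ([40, 50, 60], [9.0, 10.0, 11.0])]
print(round(average_slope(groups), 3))  # 0.1
```

The six-fold degenerate families are handled by passing each sub-family separately, for example `["AGA", "AGG"]` and `["CGA", "CGC", "CGG", "CGT"]` for the two arginine groups.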
Figure 6. Arginine codon use plotted against the genomic GC content of the representative bacteria among the following groups.
A) Alphaproteobacteria; B) Betaproteobacteria; C) Gammaproteobacteria; D) Deltaproteobacteria; E) Actinobacteria; F) Bacteroidetes/Chlorobi; G) Cyanobacteria; H) Firmicutes. Arg-A and Arg-C refer to arginine codon families with either an A or a C in the first codon position, respectively.
Figure 7. Leucine codon use plotted against the genomic GC content of the representative bacteria among the following groups.
A) Alphaproteobacteria; B) Betaproteobacteria; C) Gammaproteobacteria; D) Deltaproteobacteria; E) Actinobacteria; F) Bacteroidetes/Chlorobi; G) Cyanobacteria; H) Firmicutes. Leu-T and Leu-C refer to leucine codon families with either a T or a C in the first codon position, respectively.