Adam J Hockenberry, Aaron J Stern, Luís A N Amaral, Michael C Jewett.
Abstract
The Shine-Dalgarno (SD) sequence motif is frequently found upstream of protein coding genes and is thought to be the dominant mechanism of translation initiation used by bacteria. Experimental studies have shown that the SD sequence facilitates start codon recognition and enhances translation initiation by directly interacting with the highly conserved anti-SD sequence on the 30S ribosomal subunit. However, the proportion of SD-led genes within a genome varies across species and the factors governing this variation in translation initiation mechanisms remain largely unknown. Here, we conduct a phylogenetically informed analysis and find that species capable of rapid growth contain a higher proportion of SD-led genes throughout their genomes. We show that SD sequence utilization covaries with a suite of genomic features that are important for efficient translation initiation and elongation. In addition to these endogenous genomic factors, we further show that exogenous environmental factors may influence the evolution of translation initiation mechanisms by finding that thermophilic species contain significantly more SD-led genes than mesophiles. Our results demonstrate that variation in translation initiation mechanisms across bacterial species is predictable and is a consequence of differential life-history strategies related to maximum growth rate and environmental-specific constraints.
Keywords: Shine–Dalgarno sequence; bacterial growth; genome evolution; translation initiation
Year: 2018 PMID: 29220489 PMCID: PMC5850609 DOI: 10.1093/molbev/msx310
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Sequence entropy quantifies genome-wide SD sequence utilization. (A) Illustration of the anti-Shine–Dalgarno (aSD)::Shine–Dalgarno (SD) sequence mechanism of translation initiation. (B) Representative sequence logos of the 5′ upstream region of all annotated coding sequences for individual genomes display heterogeneity in sequence entropy within and between species. (C) Illustration of the ΔI metric for Caulobacter crescentus as an example.
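The per-position information content that sequence logos like those in panel B display can be computed directly from aligned upstream sequences. What follows is a minimal sketch, assuming equal-length DNA sequences and no small-sample correction. The function name `information_content` and the toy sequences are hypothetical, and because this record does not define how ΔI is derived from such profiles, the sketch stops at the per-position values.

```python
import math
from collections import Counter


def information_content(seqs, alphabet="ACGT"):
    """Per-position information content (bits) of aligned, equal-length sequences.

    Uses I = log2(|alphabet|) - H, where H is the Shannon entropy of the
    observed base frequencies in a column. No small-sample correction.
    """
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same length")

    max_bits = math.log2(len(alphabet))
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        total = sum(counts[b] for b in alphabet)
        if total == 0:
            # Column holds only symbols outside the alphabet: report no information.
            profile.append(0.0)
            continue
        entropy = 0.0
        for b in alphabet:
            if counts[b]:
                p = counts[b] / total
                entropy -= p * math.log2(p)
        profile.append(max_bits - entropy)
    return profile


# Hypothetical toy upstream regions (not data from the study).
upstream = ["AGGAGG", "AGGAGC", "AGGAGT", "AGGAGA"]
profile = information_content(upstream)
```

In this toy input the first five columns are fully conserved and so carry the maximum of 2 bits each, while the last column contains each base once and carries none.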
Fig. 2. Relationship between ΔI and existing metrics of SD sequence utilization. (A) Comparison between different ways of summarizing SD sequence utilization; each data point represents a single genome. On the left, we show the relationship between SD motif and aSD sequence complementarity based methods (ΔfSD and ). On the right, we compare ΔI and . The four largest phyla are color-coded according to the legend. Arrows highlight phyla with “anomalous” patterns. (B) Phylogenetic tree illustrating variation in SD sequence utilization across species according to the ΔI metric (indicated by bar plots on concentric rings).
Table 1. Contribution of Several Factors for Predicting Minimum Doubling Times.
| Model for Min. Doubling Time | R² | Pagel’s λ [95% CI] | ΔR² |
|---|---|---|---|
| Full model | 0.35*** | 0.91 [0.80, 0.96] | – |
| Δ | 0.17*** | 0.96 [0.92, 0.99] | |
| Δ | 0.11*** | 0.97 [0.93, 0.99] | |
| mRNA folding | 0.08*** | 0.98 [0.95, 0.99] | |
| Internal SD-like | 0.08*** | 0.98 [0.95, 0.99] | 0.03 |
| 16S gene counts | 0.06** | 0.98 [0.95, 0.99] | 0.01 |
| tRNA gene counts | 0.06*** | 0.98 [0.95, 0.99] | 0.01 |
| ATG start % | 0.02 | 0.98 [0.95, 0.99] | <0.01 |
Note: The left column lists the individual variables considered for predicting minimum doubling times, with the full multivariate model at the top. The R² column gives the overall goodness-of-fit for each factor (*** indicates P < 0.001, ** indicates P < 0.01). Pagel’s λ is the fitted phylogenetic signal parameter, shown with 95% confidence intervals in brackets. The right column gives the change in goodness-of-fit between a model that includes all predictors and one that excludes only the variable in the given row.
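The drop-one comparison described in the note can be sketched as follows. This is an illustration only: it uses ordinary least squares and ignores phylogeny, whereas the table reports models with a fitted Pagel’s λ, so it is not the authors’ procedure. The function names and the toy predictors `trait_a` and `trait_b` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def drop_one_delta_r2(predictors, y):
    """Change in R^2 when each predictor is removed from the full model."""
    full = r_squared(list(predictors.values()), y)
    return {
        name: full - r_squared(
            [col for other, col in predictors.items() if other != name], y
        )
        for name in predictors
    }


# Hypothetical toy data: y depends only on trait_a.
predictors = {
    "trait_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "trait_b": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
y = [1.0 + 2.0 * v for v in predictors["trait_a"]]
delta = drop_one_delta_r2(predictors, y)
```

Here removing `trait_a` costs most of the fit while removing `trait_b` costs essentially nothing. Because correlated predictors share explanatory power, drop-one changes can be much smaller than single-variable R² values, which is consistent with the pattern in the table.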
Fig. 3. Relationship between SD sequence utilization and organismal growth. (A) ΔI is significantly correlated with minimum observed doubling times for 187 bacterial species. (B) Visualization of the full model listed in table 1 depicting a strong relationship between observed and predicted minimum doubling times. In both plots, individual species data points are colored according to phyla as in figure 2.
Fig. 4. SD sequence utilization covaries alongside a suite of translation-related traits and according to optimal growth temperatures. (A) Correlation matrix between listed variables used in table 1 for a set of 613 diverse bacterial species. In all instances of significant correlation, the features covary with one another in the positive direction. (B) SD sequence utilization, quantified using ΔI, is significantly higher in thermophiles than in mesophiles. Box limits show the 25th and 75th percentiles of the data, whiskers extend to the 5th and 95th percentiles, triangles depict the mean of each category, and red lines highlight the median.