Akram Abolbaghaei, Jordan R Silke, Xuhua Xia.
Abstract
The 3' end of the small subunit ribosomal RNA (ssu rRNA) in bacteria is directly involved in the selection and binding of mRNA transcripts during translation initiation, via well-documented interactions between a Shine-Dalgarno (SD) sequence located upstream of the initiation codon and an anti-SD (aSD) sequence at the 3' end of the ssu rRNA. Consequently, the 3' end of ssu rRNA (3'TAIL) is strongly conserved among bacterial species, because a change in this region may affect the translation of many protein-coding genes. Escherichia coli and Bacillus subtilis differ in the 3' ends of their ssu rRNA: GAUCACCUCCUUA3' in E. coli and GAUCACCUCCUUUCU3' or GAUCACCUCCUUUCUA3' in B. subtilis. Such differences in 3'TAIL lead to species-specific SDs (designated SDEc for E. coli and SDBs for B. subtilis) that can form strong and well-positioned SD/aSD pairing in one species but not in the other. Selection mediated by the species-specific 3'TAIL is expected to favor SDBs over SDEc in B. subtilis, but SDEc over SDBs in E. coli. Among well-positioned SDs, SDEc is used more in E. coli than in B. subtilis, and SDBs more in B. subtilis than in E. coli. Highly expressed genes and genes of high translation efficiency tend to have longer SDs than lowly expressed genes and genes with low translation efficiency in both species, but more so in B. subtilis than in E. coli. Both species overuse SDs matching the bolded part of the 3'TAIL shown above. The 3'TAIL difference contributes to the host specificity of phages.
Keywords: Bacillus subtilis; Escherichia coli; Shine-Dalgarno; anti-SD-sequence; ssu rRNA; translation efficiency
Year: 2017 PMID: 28364038 PMCID: PMC5427494 DOI: 10.1534/g3.117.039305
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
ssu rRNA 3′ ends that are free to base-pair with SD motifs in E. coli and B. subtilis and their compatible motifs
| Species and 3′ TAIL Sequence | SD Motifs | |
|---|---|---|
Bolded letters show the differences in the base composition between two species. (E. coli ends with A whereas B. subtilis ends with UCU or AUCU). The underlined nucleotides denote the alternative 3′-AUCU-5′ TAIL and motifs exclusively compatible with it.
The SD motifs shown are derived from differences in 3′TAIL (boldface) for both species.
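The way SD motifs follow from the 3′TAIL can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: it assumes strict Watson-Crick pairing (no G-U wobble), and the helper names, the minimum motif length of 4, and the `max_len` caps are assumptions chosen to resemble the motif lists in the tables that follow. The two tail sequences are the ones quoted in the abstract.

```python
# Watson-Crick complements for RNA; G-U wobble pairing is ignored (assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def motifs_from_offset(tail, offset, min_len=4, max_len=None):
    """List SD motifs whose 5' end pairs `offset` nt in from the 3' end of `tail`.

    Hypothetical helper: every motif is a prefix of the reverse complement
    of the tail, starting at `offset`, with length min_len..max_len.
    """
    template = reverse_complement(tail)[offset:]
    if max_len is None:
        max_len = len(template)
    upper = min(max_len, len(template))
    return [template[:n] for n in range(min_len, upper + 1)]


ECOLI_TAIL = "GAUCACCUCCUUA"    # E. coli 3'TAIL from the abstract
BSUB_TAIL = "GAUCACCUCCUUUCU"   # B. subtilis 3'TAIL (UCU-ending form)

# Motifs that pair with the terminal A of the E. coli tail.
sdec = motifs_from_offset(ECOLI_TAIL, 0, max_len=10)

# Motifs anchored on each of the three trailing B. subtilis nucleotides.
sdbs = {off: motifs_from_offset(BSUB_TAIL, off, max_len=12) for off in range(3)}

print(sdec)
print({off: motifs[0] for off, motifs in sdbs.items()})
```

Under these assumptions, the E. coli list runs from UAAG to UAAGGAGGUG, and the three B. subtilis families begin with AGAA, GAAA, and AAAG.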
Figure 1. A model of SD sequence and aSD interactions. (A) The free 3′ end of ssu rRNA (3′TAIL) of E. coli and B. subtilis based on the predicted secondary structure of the 3′ end of the ssu rRNA of E. coli and B. subtilis from mfold 3.1, adapted from the comparative RNA web site and project (http://www.rna.icmb.utexas.edu). (B) A schematic representation of SD and aSD interaction illustrates DtoStart as a better measure for quantifying the optimal positioning of SD and aSD than the conventional distance from putative SD to start codon. SD1 and SD2, as illustrated, are equally good at positioning the start codon AUG against the anticodon of the initiation tRNA, but they differ in their distances to the start codon. DtoStart is the same for the two SDs. (C, D) DtoStart is constrained to a narrow range in E. coli (C) and B. subtilis (D); the solid blue line denotes SD hits with the UCU-ending TAIL, and the dashed red line shows SD hits with the UCUA-ending TAIL. The y-axis in (C) and (D) represents the percentage of SD motif hits detected. See Materials and Methods section for details.
Number of SDEc hits (N) and their proportion (Prop) in E. coli and B. subtilis genes
| SDEc motifs | E. coli N | E. coli Prop | B. subtilis N | B. subtilis Prop |
|---|---|---|---|---|
| UAAG | 85 | 0.0205 | 15 | 0.0036 |
| UAAGG | 91 | 0.0220 | 54 | 0.0129 |
| UAAGGA | 151 | 0.0365 | 30 | 0.0072 |
| UAAGGAG | 117 | 0.0283 | 74 | 0.0177 |
| UAAGGAGG | 10 | 0.0024 | 74 | 0.0177 |
| UAAGGAGGU | 0 | 0 | 14 | 0.0033 |
| UAAGGAGGUG | 1 | 0.0002 | 6 | 0.0014 |
| Total | 455 | 0.1099 | 267 | 0.0640 |
SDEc, SDs that pair perfectly with the 3′ end of small subunit rRNA from E. coli, but not from B. subtilis.
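As a quick consistency check on the table above, the per-motif counts can be summed and compared with the reported totals. The sketch below also back-calculates the denominator implied by each total. It assumes that Prop is N divided by the number of genes examined in that species, and that the first pair of columns is E. coli and the second is B. subtilis (following the caption order); neither point is stated explicitly in the table itself.

```python
# Counts (N) copied from the SDEc table, UAAG through UAAGGAGGUG.
# Column-to-species assignment follows the caption order (assumption).
ecoli_n = [85, 91, 151, 117, 10, 0, 1]
bsub_n = [15, 54, 30, 74, 74, 14, 6]

# Reported totals as (N, Prop) pairs from the "Total" row.
ecoli_total = (455, 0.1099)
bsub_total = (267, 0.0640)


def implied_denominator(n, prop):
    """Denominator implied by Prop = N / denominator (assumed definition)."""
    return n / prop


for label, counts, (n_total, prop_total) in (
    ("E. coli", ecoli_n, ecoli_total),
    ("B. subtilis", bsub_n, bsub_total),
):
    print(label,
          "sum of rows:", sum(counts),
          "reported total:", n_total,
          "implied denominator:", round(implied_denominator(n_total, prop_total)))
```

Because the proportions are rounded to four decimals, the implied denominators are approximate.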
Number of SDBs hits (N) and their proportion (Prop) in all Bacillus subtilis and Escherichia coli genes considering UCU as the 3′TAIL
| SDBs motifs | B. subtilis N | B. subtilis Prop | E. coli N | E. coli Prop |
|---|---|---|---|---|
| AGAA | 12 | 0.0029 | 51 | 0.0123 |
| AGAAA | 66 | 0.0158 | 60 | 0.0145 |
| AGAAAG | 60 | 0.0144 | 14 | 0.0034 |
| AGAAAGG | 54 | 0.0129 | 7 | 0.0017 |
| AGAAAGGA | 60 | 0.0144 | 6 | 0.0014 |
| AGAAAGGAG | 28 | 0.0067 | 4 | 0.0010 |
| AGAAAGGAGG | 11 | 0.0026 | 1 | 0.0002 |
| AGAAAGGAGGU | 1 | 0.0002 | 0 | 0 |
| Subtotal | 292 | 0.0699 | 143 | 0.0345 |
| GAAA | 16 | 0.0038 | 65 | 0.0157 |
| GAAAG | 41 | 0.0098 | 28 | 0.0068 |
| GAAAGG | 68 | 0.0163 | 18 | 0.0043 |
| GAAAGGA | 51 | 0.0122 | 15 | 0.0036 |
| GAAAGGAG | 57 | 0.0137 | 10 | 0.0024 |
| GAAAGGAGG | 18 | 0.0043 | 1 | 0.0002 |
| GAAAGGAGGU | 3 | 0.0007 | 0 | 0 |
| GAAAGGAGGUG | 1 | 0.0002 | 0 | 0 |
| GAAAGGAGGUGA | 1 | 0.0002 | 0 | 0 |
| Subtotal | 240 | 0.0575 | 137 | 0.0331 |
| AAAG | 19 | 0.0046 | 38 | 0.0092 |
| AAAGG | 171 | 0.0410 | 83 | 0.0200 |
| AAAGGA | 76 | 0.0182 | 101 | 0.0244 |
| AAAGGAG | 222 | 0.0532 | 64 | 0.0155 |
| AAAGGAGG | 143 | 0.0343 | 6 | 0.0014 |
| AAAGGAGGU | 31 | 0.0074 | 3 | 0.0007 |
| AAAGGAGGUG | 6 | 0.0014 | 0 | 0 |
| AAAGGAGGUGA | 3 | 0.0007 | 1 | 0.0002 |
| Subtotal | 671 | 0.1607 | 296 | 0.0715 |
| Total | 1203 | 0.2881 | 576 | 0.1391 |
Figure 2. Distribution of SDs from 200 genes of high translation efficiency (HTE) and 200 genes of low translation efficiency (LTE) over SD length for E. coli (A) and B. subtilis (B). Classifying genes into highly expressed genes (HEGs) and lowly expressed genes (LEGs) generates equivalent results, with HEGs similar to HTE genes, and LEGs similar to LTE genes. HEGs and HTE genes tend to have longer SDs than LEGs and LTE genes.
Number of SDEc hits (N) and their proportion (Prop) in HEGs and LEGs
| SDEc motifs | E. coli HEGs N | E. coli HEGs Prop | E. coli LEGs N | E. coli LEGs Prop | B. subtilis HEGs N | B. subtilis HEGs Prop | B. subtilis LEGs N | B. subtilis LEGs Prop |
|---|---|---|---|---|---|---|---|---|
| UAAG | 22 | 0.0053 | 7 | 0.0017 | 1 | 0.0002 | 3 | 0.0007 |
| UAAGG | 32 | 0.0077 | 6 | 0.0014 | 4 | 0.0010 | 3 | 0.0007 |
| UAAGGA | 36 | 0.0087 | 20 | 0.0048 | 3 | 0.0007 | 0 | 0 |
| UAAGGAG | 40 | 0.0097 | 12 | 0.0029 | 9 | 0.0022 | 10 | 0.0024 |
| UAAGGAGG | 2 | 0.0005 | 1 | 0.0002 | 14 | 0.0034 | 2 | 0.0005 |
| UAAGGAGGU | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.0002 |
| UAAGGAGGUG | 0 | 0 | 0 | 0 | 4 | 0.0010 | 0 | 0 |
| Total | 132 | 0.0319 | 46 | 0.0111 | 35 | 0.0084 | 19 | 0.0046 |
Number of SDBs hits (N) and their proportion (Prop) in highly and lowly expressed genes
| SDBs motifs | B. subtilis HEGs N | B. subtilis HEGs Prop | B. subtilis LEGs N | B. subtilis LEGs Prop | E. coli HEGs N | E. coli HEGs Prop | E. coli LEGs N | E. coli LEGs Prop |
|---|---|---|---|---|---|---|---|---|
| AGAA | 0 | 0 | 2 | 0.0005 | 3 | 0.0007 | 3 | 0.0007 |
| AGAAA | 2 | 0.0005 | 8 | 0.0019 | 7 | 0.0017 | 9 | 0.0022 |
| AGAAAG | 6 | 0.0014 | 4 | 0.0010 | 1 | 0.0002 | 1 | 0.0002 |
| AGAAAGG | 3 | 0.0007 | 6 | 0.0014 | 1 | 0.0002 | 0 | 0 |
| AGAAAGGA | 4 | 0.0010 | 2 | 0.0005 | 2 | 0.0005 | 0 | 0 |
| AGAAAGGAG | 2 | 0.0005 | 3 | 0.0007 | 1 | 0.0002 | 0 | 0 |
| AGAAAGGAGG | 1 | 0.0002 | 2 | 0.0005 | 0 | 0 | 0 | 0 |
| AGAAAGGAGGU | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Subtotal | 18 | 0.0043 | 27 | 0.0065 | 15 | 0.0036 | 13 | 0.0031 |
| GAAA | 0 | 0 | 2 | 0.0005 | 5 | 0.0012 | 10 | 0.0024 |
| GAAAG | 2 | 0.0005 | 7 | 0.0017 | 3 | 0.0007 | 1 | 0.0002 |
| GAAAGG | 3 | 0.0007 | 11 | 0.0026 | 0 | 0 | 0 | 0 |
| GAAAGGA | 4 | 0.0010 | 5 | 0.0012 | 5 | 0.0012 | 0 | 0 |
| GAAAGGAG | 2 | 0.0005 | 6 | 0.0014 | 1 | 0.0002 | 1 | 0.0002 |
| GAAAGGAGG | 2 | 0.0005 | 2 | 0.0005 | 0 | 0 | 0 | 0 |
| GAAAGGAGGU | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| GAAAGGAGGUG | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| GAAAGGAGGUGA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Subtotal | 13 | 0.0031 | 33 | 0.0074 | 14 | 0.0034 | 12 | 0.0029 |
| AAAG | 1 | 0.0002 | 4 | 0.0010 | 2 | 0.0005 | 2 | 0.0005 |
| AAAGG | 8 | 0.0019 | 20 | 0.0048 | 7 | 0.0017 | 12 | 0.0029 |
| AAAGGA | 5 | 0.0012 | 10 | 0.0024 | 10 | 0.0024 | 9 | 0.0022 |
| AAAGGAG | 17 | 0.0041 | 26 | 0.0062 | 7 | 0.0017 | 7 | 0.0017 |
| AAAGGAGG | 14 | 0.0033 | 21 | 0.0050 | 1 | 0.0002 | 0 | 0 |
| AAAGGAGGU | 2 | 0.0005 | 1 | 0.0002 | 1 | 0.0002 | 0 | 0 |
| AAAGGAGGUG | 1 | 0.0002 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAAGGAGGUGA | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.0002 |
| Subtotal | 48 | 0.0115 | 82 | 0.0196 | 28 | 0.0068 | 31 | 0.0075 |
| Total | 79 | 0.0189 | 142 | 0.0335 | 57 | 0.0138 | 56 | 0.0135 |
Figure 3. Distribution of E. coli and B. subtilis SDs for HEGs and LEGs. SDs that are more frequent in HEGs than LEGs match the core aSD (in bold red) of 16S rRNA. The trailing 3′ nucleotides in B. subtilis are used mainly for SD/aSD pairing in LEGs. Classifying genes into genes of HTE and LTE generates similar results.
SD/aSD binding of nonhypothetical genes in B. subtilis phage φ29 in E. coli and B. subtilis
| Gene | E. coli DtoStart | E. coli SD | B. subtilis DtoStart | B. subtilis SD |
|---|---|---|---|---|
| | 14 | AAGGA | 17 | AAAGGA |
| | 17 | AAGGAG | 20 | GAAAGGAG |
| | 18 | AGGAGGU | 21 | AGGAGGU |
| | 15 | AAGGA | 18 | AAAGGA |
| | | | 19 | UAGAAAG |
| | 16 | GAGGUGA | 18,19 | UAGAAAG,GAGGUGA |
| | 18 | GAGGU | 21,21 | AGAAA,GAGGU |
| | 20 | GGAGGUG | 23 | GGAGGUG |
| | 16,19 | UAAGG,AGGUG | 22 | AGGUG |
| | 15 | GAGGUGA | 18 | GAGGUGA |
| | 16 | GGUGA | 19 | GGUGA |
| | 15 | UAAGGAGG | 18 | AAGGAGG |
| | 17 | GAGGU | 20 | GAGGU |
| | 17 | AAGGAG | 20 | AAAGGAG |
| | 17 | UAAGGAGG | 20 | AAGGAGG |
| | 16 | GAGGUG | 19 | GAGGUG |
Gene gp6, which uses a species-specific SDBs, cannot form a well-positioned SD/aSD in E. coli to be translated efficiently.
The optimal DtoStart is within the range of 10–21 in E. coli.
3′AUCUUUCCUCCACUAG is used as 3′TAIL for B. subtilis, with the optimal DtoStart within the range of 15–25.
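The two optimal DtoStart windows quoted in these notes can be applied as a simple range check. The sketch below is only an illustration: the function name and the dictionary are hypothetical, the bounds are treated as inclusive (an assumption), and the B. subtilis window is the one stated for the UCUA-ending 3′TAIL.

```python
# Optimal DtoStart windows taken from the table notes; inclusive bounds assumed.
OPTIMAL_DTOSTART = {
    "E. coli": (10, 21),
    "B. subtilis": (15, 25),  # stated for the UCUA-ending 3'TAIL
}


def is_well_positioned(dtostart, species):
    """Return True if a DtoStart value falls inside the species' optimal window."""
    low, high = OPTIMAL_DTOSTART[species]
    return low <= dtostart <= high


# Example values in the ranges seen in the phi29 table.
print(is_well_positioned(14, "E. coli"))
print(is_well_positioned(23, "B. subtilis"))
print(is_well_positioned(23, "E. coli"))
```

A value of 23 falls inside the B. subtilis window but outside the E. coli one. This shows how the same SD position can be well placed in one host and poorly placed in the other.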