Mohammad Ruhul Amin, Alisa Yurovsky, Yuping Chen, Steve Skiena, Bruce Futcher.
Abstract
We examined 20,648 unique prokaryotic taxids with respect to the annotation of the 3' end of the 16S rRNA, which contains the anti-Shine-Dalgarno sequence. We used the sequence of the highly conserved helix 45 of the 16S rRNA as a guide. By this criterion, 8,153 annotated 3' ends correctly included the anti-Shine-Dalgarno sequence, but 12,495 were foreshortened or otherwise mis-annotated, missing part or all of the anti-Shine-Dalgarno sequence, which immediately follows helix 45. We re-annotated these, giving a total of 20,648 16S rRNA 3' ends. The vast majority indeed contained a consensus anti-Shine-Dalgarno sequence, embedded in a highly conserved 13 base "tail". However, 128 exceptional organisms had either a variant anti-Shine-Dalgarno, or no recognizable anti-Shine-Dalgarno, in their 16S rRNA(s). For organisms both with and without an anti-Shine-Dalgarno, we identified the Shine-Dalgarno motifs actually enriched in front of each organism's open reading frames. This showed to what extent the Shine-Dalgarno motifs correlated with the anti-Shine-Dalgarno motifs. In general, organisms whose rRNAs lacked a perfect anti-Shine-Dalgarno motif also lacked a recognizable Shine-Dalgarno. For organisms whose 16S rRNAs contained a perfect anti-Shine-Dalgarno motif, a variety of results were obtained. We found one genus, Alteromonas, where several taxids apparently maintain two different types of 16S rRNA genes, with different, but conserved, antiSDs. The fact that some organisms do not seem to have or use Shine-Dalgarno motifs supports the idea that prokaryotes have other robust mechanisms for recognizing start codons for translation.
Year: 2018 PMID: 30138483 PMCID: PMC6107228 DOI: 10.1371/journal.pone.0202767
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of Re-annotations.
| Unique taxids | 34,439 |
|---|---|
| No annotated 16S rRNA | 11,941 |
| No helix 45 homology in sequence | 1,850 |
| Original annotation includes 13 b tail and antiSD | 8,080 |
| Annotation corrected by extension through antiSD | 12,415 |
| Variant antiSD, missing or ambiguous sequence | 25 |
| 13 b tail with variant antiSD close to consensus | 19 |
| 13 b tail, no antiSD-like sequence in tail | 109 |
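The row counts in this table can be cross-checked against the totals quoted elsewhere in the record. The following is a minimal bookkeeping sketch; the grouping of rows into "examined" and "exceptional" is an assumption made here for illustration, not a statement of how the authors derived their totals.

```python
# Row counts copied from the "Summary of Re-annotations" table.
summary = {
    "No annotated 16S rRNA": 11_941,
    "No helix 45 homology in sequence": 1_850,
    "Original annotation includes 13 b tail and antiSD": 8_080,
    "Annotation corrected by extension through antiSD": 12_415,
    "Variant antiSD, missing or ambiguous sequence": 25,
    "13 b tail with variant antiSD close to consensus": 19,
    "13 b tail, no antiSD-like sequence in tail": 109,
}
total_taxids = 34_439  # header row of the table

# Assumption: taxids without a usable helix 45 anchor are the ones excluded
# from the set of examined 3' ends.
excluded = {"No annotated 16S rRNA", "No helix 45 homology in sequence"}

all_rows = sum(summary.values())
examined = sum(n for name, n in summary.items() if name not in excluded)

# Assumption: the two "13 b tail" variant rows together make up the
# exceptional organisms mentioned in the abstract.
exceptional = (
    summary["13 b tail with variant antiSD close to consensus"]
    + summary["13 b tail, no antiSD-like sequence in tail"]
)

print(all_rows, examined, exceptional)
```

Under these assumptions the rows add up to the 34,439 unique taxids, and the two derived figures match the 20,648 examined taxids and the 128 exceptional organisms quoted in the abstract.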
Fig 1. Sequence of the 3’ 37 nucleotides of the E. coli 16S rRNA gene.
The last 24 nucleotides of helix 45 are in italics (yellow box), followed by the 13 nucleotide “tail” (underlined, red box). The anti-Shine-Dalgarno sequence CCTCCT is underlined and in bold. The terminal 3’ “A” residue is indicated with an asterisk. The 10 genomic nucleotides following the end of the tail are shown in lower case (blue box). The overall positioning of this region of the 16S rRNA is indicated (red line).
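The layout in this caption (helix 45, then a 13 nucleotide tail containing CCTCCT) suggests a simple way to check an annotated 3' end. The sketch below is a simplification under stated assumptions: it uses an exact string match for the helix 45 guide, whereas the abstract describes using helix 45 homology, and the guide and genome strings are hypothetical placeholders. Only the tail is grounded in the record, as the reverse complement of the anti-tail AAAGGAGGTGATC listed in the Shine-Dalgarno motif table.

```python
from dataclasses import dataclass
from typing import Optional

ANTI_SD = "CCTCCT"  # antiSD as written in the Fig 1 caption (DNA alphabet)
TAIL_LEN = 13       # length of the conserved tail following helix 45


@dataclass
class TailCall:
    tail: str              # the 13 bases immediately after the helix 45 guide
    has_anti_sd: bool      # True if the tail contains CCTCCT
    needs_extension: bool  # True if the annotation stops before the tail ends


def reverse_complement(seq: str) -> str:
    """Reverse-complement an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def call_tail(genome: str, guide: str, annotated_end: int) -> Optional[TailCall]:
    """Locate the guide, read the tail after it, and compare with the annotation.

    `annotated_end` is the exclusive end index of the current 16S annotation.
    Returns None if the guide is not found or the tail would run off the end.
    """
    hit = genome.find(guide)
    if hit == -1:
        return None
    tail_start = hit + len(guide)
    tail = genome[tail_start:tail_start + TAIL_LEN]
    if len(tail) < TAIL_LEN:
        return None
    return TailCall(tail, ANTI_SD in tail, annotated_end < tail_start + TAIL_LEN)


# Hypothetical toy input: the guide and flanking bases are made up.
toy_guide = "GGGTTTAAACCC"
consensus_tail = reverse_complement("AAAGGAGGTGATC")
toy_genome = "ACGTACGTACGT" + toy_guide + consensus_tail + "ACGTACGTAC"

# The toy annotation ends 5 bases into the tail, i.e. it is foreshortened.
result = call_tail(toy_genome, toy_guide, annotated_end=29)
print(result)
```

A real pipeline would replace `str.find` with an alignment-based search, since helix 45 is conserved but not identical across taxa.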
Fig 2. Consensus alignments.
In A and B, 24 bases of helix 45 are boxed in yellow; the last 13 bases of the 16S rRNA are boxed in red, and the 10 bases in the genome following the 3’ end of the 16S rRNA are boxed in blue. A. The 12,426 16S rRNAs missing the antiSD as currently annotated and corrected here, aligned by helix 45. B. The 8,069 16S rRNAs which include the antiSD as currently annotated, aligned by helix 45. C. The same 8,069 16S rRNAs as in B, but aligned by the 3’ end of the current annotation.
Sequences of 13 base tails.
| Tail Sequence | Frequency |
|---|---|
| | 14,088 |
| | 5,527 |
| | 489 |
| | 116 |
| | 87 |
| | 60 |
| | 53 |
| | 52 |
| | 41 |
For CCTCCT antiSDs, examples of Shine-Dalgarno motifs.
| Type | Tompa SD | Number | Z-Scores |
|---|---|---|---|
| 13 nt a-tail | AAAGGAGGTGATC | | |
| AGGAGG | AGGAGG | 14 | 37–74 |
| | GGAGG | 18 | 7–79 |
| | AGGAG | 116 | 7–71 |
| | AGGA | 10 | 12–25 |
| | GGAG | 3 | 9–39 |
| Shifted | AAGGA | 23 | 13–29 |
| | AAGGAG | 11 | 26–56 |
| | GAGGTG | 6 | 14–58 |
| | GAGGT | 6 | 8–48 |
| | GGTGA | 6 | 21–58 |
| | TGATC | 2 | 11–14 |
| Absent | TACACT | 1 | 43 |
| | TAGACT | 1 | 30 |
| | TATACT | 1 | 43 |
| | CGATCG | 3 | 14–36 |
SDs are shown in their relative positions along the 13 nucleotide anti-tail (“13 nt a-tail”) (i.e., the strand complementary to the 13 nucleotide tail). SDs are classified into three types: “AGGAGG”, a subset of the classic sequence; “Shifted”, not a subset of the classic sequence, and shifted either 5’ or 3’ along the 13 base tail; and “Absent”, a sequence found by the Tompa algorithm, but not complementary to the tail. The number of each kind of SD (out of 222 examined) found by the Tompa approach is shown (for example, out of 222 species examined by the Tompa method, 14 had the SD sequence AGGAGG). The Z-score is a statistical measure of the significance of the motifs found by the Tompa approach, with larger Z-scores being more significant (Tompa, 1999).
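The three-way classification described in this legend can be expressed as plain substring tests against the anti-tail. The following is a minimal sketch, assuming that "subset of the classic sequence" means a substring of AGGAGG and that "complementary to the tail" means a substring of the 13 nt anti-tail. The function name and the returned offset are illustrative choices, not part of the Tompa method itself.

```python
from typing import Optional, Tuple

A_TAIL = "AAAGGAGGTGATC"  # 13 nt anti-tail from the table
CLASSIC_SD = "AGGAGG"     # classic Shine-Dalgarno sequence


def classify_sd(motif: str) -> Tuple[str, Optional[int]]:
    """Return (type, offset along the anti-tail) for an enriched motif.

    The offset is the 0-based position of the motif within the anti-tail,
    or None when the motif does not occur in the anti-tail at all.
    """
    offset = A_TAIL.find(motif)
    if motif in CLASSIC_SD:
        return "AGGAGG", offset
    if offset != -1:
        return "Shifted", offset
    return "Absent", None


# Motifs copied from the table, one or two per type.
for motif in ["AGGAGG", "AGGAG", "AAGGA", "TGATC", "TACACT", "CGATCG"]:
    print(motif, classify_sd(motif))
```

Applied to the motifs in the table, this reproduces the listed types: for example, AGGAG falls in the "AGGAGG" group, TGATC is "Shifted" toward the 3' end of the anti-tail, and CGATCG is "Absent" even though it shares GATC with the anti-tail.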
Presence of Tompa Shine-Dalgarno as function of antiSD.
| | Has CCUCCU | % | No CCUCCU | % |
|---|---|---|---|---|
| SD present | 176 | 79 | 3 | 2 |
| SD close | 13 | 6 | 5 | 4 |
| SD absent | 33 | 15 | 120 | 94 |
| Total | 222 | | 128 | |
222 species that contained a CCUCCU antiSD in their 13 b tails were categorized as to whether the Tompa method found a complementary SD in front of genes, or an almost complementary SD, or no complementary SD at all. 128 species that did not contain a CCUCCU antiSD in their 13 b tails were categorized in the same three ways.
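The percentage columns follow directly from the counts and the two column totals. As a minimal sketch (variable names are illustrative), they can be recomputed as follows:

```python
# Counts copied from the table: (has CCUCCU, no CCUCCU).
counts = {
    "SD present": (176, 3),
    "SD close": (13, 5),
    "SD absent": (33, 120),
}

# Column totals, recomputed rather than hard-coded.
total_with = sum(with_sd for with_sd, _ in counts.values())
total_without = sum(without_sd for _, without_sd in counts.values())

# Percentages rounded to the nearest whole number, as in the table.
percentages = {
    category: (round(100 * with_sd / total_with),
               round(100 * without_sd / total_without))
    for category, (with_sd, without_sd) in counts.items()
}

print(total_with, total_without)
for category, pct in percentages.items():
    print(category, pct)
```

The recomputed totals are 222 and 128, and the rounded percentages agree with the table (79, 6, 15 and 2, 4, 94).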
Fig 3. Phylogenetic tree.
Phylogenetic tree (low resolution) showing the distribution of the 128 species that lack a CCUCCU antiSD in their 13 b tails. Green lines indicate the phylogenetic positions of the 15 species previously identified by Lim et al. (2012) (and also identified here); orange lines show the other 113 species uniquely identified here. Because of the low resolution of this tree, individual species are not visible (i.e., there are fewer than 128 colored lines).
Multiple 16S rRNA genes in Alteromonas species.
| Species | Taxid | # Genes | Tail Sequence |
|---|---|---|---|
| A. macleodii AD45 | GCA_000300175.1 | 2 | |
| | | 2 | |
| | | 1 | |
| A. mediterranea U4 | GCA_000439515.1 | 3 | |
| | | 2 | |
| A. mediterranea | GCA_000020585.3 | 3 | |
| | | 2 | |
| A. mediterranea U8 | GCA_000439555.1 | 3 | |
| | | 2 | |
| A. macleodii (BS) | GCA_000299995.1 | 3 | |
| | | 2 | |
| A. mediterranea UM4b | GCA_000439595.1 | 2 | |
| | | 3 | |
| A. mediterranea U7 | GCA_000439535.1 | 3 | |
| | | 2 | |
| A. mediterranea | GCA_001562295.1 | 5 | |
| A. mediterranea DE1 | GCA_000310085.1 | 5 | |
Nine taxids from the genus Alteromonas are shown. Each taxid represents an independent isolate and sequence. Seven of the taxids have multiple different 16S rRNA genes, differing in the sequence of the tail. Residues in bold differ from the majority, classic sequence. Other taxids of Alteromonas (not shown; see S6 Table) have the same tail sequence as A. mediterranea DE1 (GCA_000310085.1), which we refer to as the “classic” or “majority” sequence, and which contains the classic antiSD sequence CCTCCT. No taxid of Alteromonas in our dataset contains only the novel tail sequence CCTTCAAT; rather, this novel tail is found only in conjunction with the classic sequence.