Corey M Hudson, Britney Y Lau, Kelly P Williams.
Abstract
Genes for the RNA tmRNA and protein SmpB, partners in the trans-translation process that rescues stalled ribosomes, have previously been found in all bacteria and some organelles. During a major update of The tmRNA Website (relocated to http://bioinformatics.sandia.gov/tmrna), including addition of an SmpB sequence database, we found some bacteria that lack functionally significant regions of SmpB. Three groups with reduced genomes have lost the central loop of SmpB, which is thought to improve alanylation and EF-Tu activation: Carsonella, Hodgkinia, and the hemoplasmas (hemotropic Mycoplasma). Carsonella has also lost the SmpB C-terminal tail, thought to stimulate the decoding center of the ribosome. We validate recent identification of tmRNA homologs in oomycete mitochondria by finding partner genes from oomycete nuclei that target SmpB to the mitochondrion. We have moreover identified through exhaustive search a small number of complete, but often highly derived, bacterial genomes that appear to lack a functional copy of either the tmRNA or SmpB gene (but not both). One Carsonella isolate exhibits complete degradation of the tmRNA gene sequence yet its smpB shows no evidence for relaxed selective constraint, relative to other genes in the genome. After loss of the SmpB central loop in the hemoplasmas, one subclade apparently lost tmRNA. Carsonella also exhibits gene overlap such that tmRNA maturation should produce a non-stop smpB mRNA. At least some of the tmRNA/SmpB-deficient strains appear to further lack the ArfA and ArfB backup systems for ribosome rescue. The most frequent neighbors of smpB are the tmRNA gene, a ratA/rnfH unit, and the gene for RNaseR, a known physical and functional partner of tmRNA-SmpB.
Keywords: Carsonella; Mycoplasma; SmpB; tmRNA; trans-translation
Year: 2014 PMID: 25165464 PMCID: PMC4131195 DOI: 10.3389/fmicb.2014.00421
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Evaluation of primary tmRNA sequence-finding programs.
| Bacteria | 2033 | 1983 | 0 | 15 | 14 | 21 |
| Archaea | 10 | 0 | 0 | 7 | 3 | 0 |
| Bacteria | 13094 | 2037 | 10283 | 235 | 52 | 487 |
| Archaea | 1248 | 0 | 849 | 365 | 3 | 31 |
| Bacteria | 21337 | 5 | 15170 | 390 | 1138 | 4634 |
| Archaea | 808 | 0 | 402 | 159 | 45 | 202 |
See Materials and Methods.
Genomes with unusual ssrA.
| 19165 | ||
| secondary endosymbiont of | 19166 | |
| 19167 | ||
| 19168 | ||
| 19169 | ||
| 19170 | ||
| Bacillus phage G | 14561 | |
| Mycobacterium phage DS6A (TLD only) | 11587 | |
| Mycobacterium phage Bxz1 | 10675 | |
| Mycobacterium phage Cali | 13258 | |
| Mycobacterium phage Catera | 15205 | |
| Mycobacterium phage ET08 | 14080 | |
| Mycobacterium phage Rizal | 14900 | |
| Mycobacterium phage ScottMcG | 10349 | |
| Mycobacterium phage Spud | 11713 | |
| Mycobacterium phage Wildcat | 11059 | |
Includes links to tmRNA webpages for bacterial strains missing tmRNA and phages with tmRNA. tmID is the tmRNA Website (http://bioinformatics.sandia.gov/tmrna) identifier.
Highly reduced genome (<10^6 bp).
Genomes with unusual smpB.
| Pseudogene | 19190 | |
| Truncation | 12215 | |
| Truncation | 12077 | |
| Frameshift | 11952 | |
| Frameshift | 19171 | |
| Frameshift | 10063 | |
| Frameshift | 15031 | |
| Frameshift | 15428 | |
| Frameshift | 12194 | |
| Frameshift | 16329 | |
| Frameshift | 19118 | |
| Frameshift | 10352 | |
| Frameshift | 19172 | |
| Frameshift | 16792 | |
| Frameshift | 12964 | |
| Frameshift | 13623 | |
| 19173 | ||
| Contaminant | 19176 | |
| Contaminant | 19177 | |
| Endosymbiont | 19178 | |
| Chromatophore | 19174 | |
| Oomycete mito.-targeted | 19187 | |
| Oomycete mito.-targeted | 19188 | |
| Oomycete mito.-targeted | 19189 | |
| Algal plastid-targeted | 19175 | |
| Algal plastid-targeted | 19179 | |
| Algal plastid-targeted | 19180 | |
| Algal plastid-targeted | 19181 | |
| Algal plastid-targeted | 19182 | |
| Algal plastid-targeted | 19183 | |
| Algal plastid-targeted | 19184 | |
| Algal plastid-targeted | 19185 | |
| Algal plastid-targeted | 19186 | |
Includes links to webpages for bacterial strains with defective smpBs, bacterial plasmids with smpBs, and smpBs in eukaryotic genome projects (some of which are organelle-targeted). The Hodgkinia genome pseudogene has accumulated two premature stop codons in smpB. The two “truncation” cases have lost material reaching into the β-barrel at each end. We also note that SmpB lacks the central loop in the hemoplasmas, Carsonella and Hodgkinia, and lacks the C-terminal α helix in Carsonella, but these SmpBs retain all β strand segments and may therefore retain weak function. tmID is the tmRNA Website (http://bioinformatics.sandia.gov/tmrna) identifier.
Highly reduced genome (<10^6 bp).
The description of this genome (Pérez-Brocal et al., 2006) noted and discussed this frameshift, suggesting confidence in the gene sequence; any of the other frameshifts could instead be sequencing errors.
Figure 1. Each neighborhood (n = 2012) in our bacterial complete genome set was taken as the 11-gene window centered at smpB. (A) Frequent neighbors. The tmRNA gene (the only RNA gene encountered) and Pfam families present in more than 200 smpB neighborhoods are listed with a representative annotation for the instances of each family. (B) Clusters. Each neighborhood was summarized as a cluster, considering only the families of (A) (note the more specific gene annotations there). The top clusters are shown with color coding of common subclusters.
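The windowing and counting described in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes each genome is already reduced to an ordered list of gene or family labels; the function names, the clipping of windows at replicon ends, and the toy gene orders are all hypothetical and are not taken from the study.

```python
from collections import Counter


def neighborhood(genes, center_index, flank=5):
    """Return the window of up to 2*flank+1 genes centered at center_index.

    The window is clipped at the ends of the list (an assumption made here
    for simplicity; wrap-around on circular replicons is not handled).
    """
    start = max(0, center_index - flank)
    end = min(len(genes), center_index + flank + 1)
    return genes[start:end]


def frequent_neighbors(genomes, target="smpB", flank=5, min_count=1):
    """Count in how many target-centered neighborhoods each label occurs."""
    counts = Counter()
    for genes in genomes:
        for i, label in enumerate(genes):
            if label != target:
                continue
            window = neighborhood(genes, i, flank)
            # Count each label once per neighborhood, excluding the target.
            counts.update(set(window) - {target})
    return {label: n for label, n in counts.items() if n >= min_count}


# Toy gene orders (hypothetical, for illustration only).
toy = [
    ["a", "b", "rnr", "smpB", "ssrA", "c", "d"],
    ["e", "ratA", "rnfH", "smpB", "ssrA", "f"],
    ["g", "h", "smpB", "rnr", "i"],
]
result = frequent_neighbors(toy, min_count=2)
```

With `flank=5` the window holds 11 genes when it is not clipped, matching the window size in the caption; the `min_count` threshold plays the role of the "more than 200 neighborhoods" cutoff.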
Figure 2. In strain PC, the three main ssrA conserved regions, at the 5′ and 3′ termini and at the tag reading frame, have all suffered so many nucleotide changes as to be unrecognizable, yet the region is largely still present. The smpB CDS (blue) extends into ssrA (expected to produce non-stop smpB mRNAs) or the ssrA pseudogene in four cases. In the HC/HT lineage, a small deletion has caused ssrA to overlap with its downstream and oppositely oriented neighboring tRNA-Phe gene, changing the last tmRNA acceptor stem (P1) nucleotide from C to U, which apparently led to a compensating G to A mutation at the first P1 nucleotide. The tag reading frame has now been determined by comparative analysis as the most conserved reading frame in ssrA, which also shares some amino acid similarity with other tag sequences. Carsonella SmpB lacks the central loop (not shown here) and the C-terminal tail, which in Thermus is a 25-residue segment following β7. The C-terminus of SmpB does extend variably beyond β7 with apparently random amino acid sequence that depends on the extent of intrusion into ssrA, but these extensions are not as long as for normal SmpBs and they do not thread into the α helix model (Kelley and Sternberg, 2009).
Figure 3. The hemoplasmas have lost the SmpB central loop and for the suis subclade we cannot find the tmRNA gene. Genomes of 54 Mycoplasma strains were aligned using Mugsy (Angiuoli and Salzberg, 2011), yielding only the rRNA operon region as alignable for all strains; this was trimmed to 1679 bp using GBlocks requiring at least half the taxa per column (Castresana, 2000), then a maximum likelihood tree was prepared using a GTR+Γ model and autoFC bootstopping in RAxML 7.2.8 (Stamatakis, 2006). The hemoplasma clade and phylogenetic surroundings agree with recent 32-protein and 16S rRNA phylogenies (Guimaraes et al., 2014).
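The "at least half the taxa per column" criterion in the Figure 3 caption can be pictured as a column-occupancy filter. The sketch below is a deliberate simplification and not a reimplementation of GBlocks, which also selects conserved blocks; the function name, the gap character, and the toy alignment are assumptions made for illustration.

```python
def trim_columns(alignment, min_fraction=0.5, gap_chars="-"):
    """Keep columns where at least min_fraction of sequences are non-gap.

    `alignment` maps sequence names to equal-length aligned strings.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col
        for col in range(length)
        if sum(s[col] not in gap_chars for s in seqs) >= min_fraction * len(seqs)
    ]
    return {name: "".join(alignment[name][c] for c in keep) for name in names}


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {
    "strain1": "ACG-TA",
    "strain2": "AC--TA",
    "strain3": "A---TA",
    "strain4": "ACGTT-",
}
trimmed = trim_columns(toy_alignment)
```

In this toy case only the fourth column is dropped, because just one of the four sequences has a residue there.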
Functional protein CDSs that overlap t(m)RNA genes.
| No. t(m)RNA | 115660 | 4809 | |||
| Overlapping Pfam CDS | 828 | 1364 | |||
| Same orientation | 379 | 735 | |||
| CDS upstream | 250 | 244 | FTSW_RODA_SPOVE | 44 | All 44 are tRNAIle-CAT in |
| CDS downstream | 106 | 186 | Aminotran_3 | 9 | 8 are tRNALeu-CAA in |
| CDS internal | 0 | 92 | – | – | – |
| CDS spanning | 23 | 213 | GTP_EFTU | 6 | All 6 are tRNASec in Rhizobiales |
| Opposite orientation | 449 | 629 | |||
| CDS upstream | 23 | 187 | RNB (RNase R) | 4 | All 4 are tRNALeu-CAG in Burkholderiaceae |
| CDS downstream | 381 | 186 | Resolvase | 72 | Diverse settings |
| CDS internal | 0 | 83 | – | – | – |
| CDS spanning | 45 | 173 | Resolvase | 16 | Diverse settings |
Of the 6,489,445 original NCBI protein calls in the 2031 bacterial genome projects, 5,805,765 were positive for functionality with the Pfam/HMMER system (testing Pfam-A and Pfam-B) or with the CDD/RPSBLAST system, and were tested for overlap with either tmRNA genes from the tmRNA Website or tRNA genes found with a combination of tRNAscan-SE and Aragorn (see Materials and Methods for distinction between “valid” and “questionable” tRNAs).
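The row labels of the overlap table (same or opposite orientation; CDS upstream, downstream, internal, or spanning) suggest a classification that can be sketched as follows. The definitions used here are an assumed reading of those labels, taken relative to the t(m)RNA gene's own orientation, and the `Feature` type and `classify_overlap` function are hypothetical; the authors' actual criteria are given in their Materials and Methods.

```python
from typing import NamedTuple, Optional, Tuple


class Feature(NamedTuple):
    start: int   # 1-based, inclusive, start <= end
    end: int
    strand: str  # "+" or "-"


def classify_overlap(cds: Feature, rna: Feature) -> Optional[Tuple[str, str]]:
    """Return (orientation, position) for a CDS overlapping an RNA gene.

    Returns None when the two features do not overlap. Position is judged
    relative to the RNA gene's strand (assumed definitions):
      upstream   - CDS covers only the RNA gene's 5' end
      downstream - CDS covers only the RNA gene's 3' end
      internal   - CDS lies strictly inside the RNA gene
      spanning   - CDS covers the whole RNA gene
    """
    if cds.end < rna.start or cds.start > rna.end:
        return None
    orientation = "same" if cds.strand == rna.strand else "opposite"
    covers_left = cds.start <= rna.start
    covers_right = cds.end >= rna.end
    if covers_left and covers_right:
        position = "spanning"
    elif not covers_left and not covers_right:
        position = "internal"
    else:
        # Exactly one end is covered; the left end is the 5' end on "+".
        left_is_5prime = rna.strand == "+"
        overlaps_5prime = covers_left == left_is_5prime
        position = "upstream" if overlaps_5prime else "downstream"
    return orientation, position


# Toy coordinates (hypothetical, for illustration only).
rna_plus = Feature(100, 200, "+")
example = classify_overlap(Feature(50, 120, "+"), rna_plus)
```

Tallying the returned pairs over all CDS and t(m)RNA gene pairs would give counts organized like the rows of the table.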