Origins of tmRNA: the missing link in the birth of protein synthesis?

Abstract

The RNA world hypothesis refers to the early period on earth in which RNA was central in assuring both genetic continuity and catalysis. The end of this era coincided with the development of the genetic code and protein synthesis, symbolized by the appearance of the first non-random messenger RNA (mRNA). Modern transfer-messenger RNA (tmRNA) is a unique hybrid molecule which has the properties of both mRNA and transfer RNA (tRNA). It acts as a key molecule during trans-translation, a major quality control pathway of modern bacterial protein synthesis. tmRNA shares many common characteristics with ancestral RNA. Here, we present a model in which proto-tmRNAs were the first molecules on earth to support non-random protein synthesis, explaining the emergence of the early genetic code. In this way, proto-tmRNA could be the missing link between the first mRNA and tRNA molecules and modern ribosome-mediated protein synthesis.

Year: 2016 PMID: 27484476 PMCID: PMC5041485 DOI: 10.1093/nar/gkw693

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The RNA world hypothesis was first proposed over forty years ago as a major and early step in the evolution of life, at a time when there was no protein synthesis mechanism as it exists now (1–3). The theory is based on the capacity of RNA to simultaneously catalyze enzymatic reactions and store genetic information, as now done by proteins and DNA, respectively. RNA's intrinsic weaknesses support such a slow shift to modern molecular biology, in which genetic information passes from DNA to RNA and possibly to proteins. Indeed, despite the versatility of RNA, DNA has a higher molecular stability for carrying genetic information, and proteins have higher catalytic abilities. At some point during the RNA world, an evolutionary leap took place between the first system able to replicate molecules responsible for biochemical reactions (i.e. self-replicating RNA), and the cell that replicates a whole genome encoding for these biochemical activities. This evolutionary leap was embodied by the emergence of non-random coding RNA, which served as the first medium for genetic information. From there, RNA and peptides had to co-evolve through a ribonucleoprotein (RNP) world (Figure 1). This step would have been fundamental for the development of the code of life. At that time (Figure 1, red star), the world must have consisted of: a network of small RNAs sufficiently evolved as to have different catalytic and self-replicating properties; some primitive amino acids such as alanine, glycine, and aspartic acid; and the minimal prerequisites for translation, such as the first proto-ribosomes. Coding RNA may then have evolved to become non-random and specifically recognized as messenger RNA rather than another RNA type. The evolution of the translation mechanism will have been the result of ‘molecular Darwinism’, or in other words a random phenomenon leading progressively to a selective advantage.

Figure 1.

Timeline of early events, emphasizing the transition from an RNA world to modern life. During the history of life on earth, the RNA world lasted from the first appearance of short catalytic RNAs right up to the transition to a modern period in which genetic information carried by DNA and RNA became translated into proteins. The RNP (ribonucleoprotein) world was the intermediate period in which RNA and the first random peptides coexisted as informational and catalytic molecules. The red star indicates when the genetic medium stopped being random. Ga, giga-annum, or 10^9 (1 000 000 000) years.

Many molecular fossils of the RNA world are still present and even active in modern organisms. Candidates must be either catalytic, ubiquitous, and/or central to some aspect of metabolism (4). Transfer-messenger RNA (tmRNA) is a hybrid molecule present in all bacteria. It exhibits properties of both transfer and messenger RNA, and permits the rescue of ribosomes arrested during translation. So not only does tmRNA play a key cellular role in modern bacteria, but in a single molecule it also has two major and ancient functions that were necessary for the transition from an RNA world to the modern protein synthesis pathways.

ORIGINS OF THE GENETIC CODE

The genetic code is the set of rules by which the information encoded within DNA and messenger RNA (mRNA) is translated into proteins. The information is contained in the mRNA sequence, combined into codons or nucleotide triplets. Each codon corresponds to either a specific amino acid or to a stop signal which terminates protein synthesis. With four different nucleotides and a code made of nucleotide triplets, there are 4^3 = 64 possibilities to code 20 amino acids and three stop codons. Consequently, the modern genetic code is degenerate, or in other words, most amino acids are encoded by more than one codon. Although we may not know what led to the current distribution of codons and their corresponding amino acids, these distributions are not random. For example, amino acids that share the same biosynthetic pathway or similar polarities and/or side-chain sizes tend to be close to each other on the genetic code table (Figure 2) (5).
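The codon arithmetic above can be checked with a short Python sketch. It is an illustration only and not part of the original article: the compact table string is the standard genetic code written in U, C, A, G order, and the variable names are chosen here for convenience.

```python
from itertools import product
from collections import Counter

BASES = "UCAG"  # conventional row/column order of the codon table
# Standard genetic code, one letter per codon, first base varying slowest
# and third base fastest; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate every nucleotide triplet: 4 choices at each of 3 positions.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))

# Degeneracy: how many codons encode each amino acid.
degeneracy = Counter(codon_table.values())
stop_count = degeneracy.pop("*")

print(len(codons))      # 64 codons in total
print(len(degeneracy))  # 20 amino acids
print(stop_count)       # 3 stop codons
print(degeneracy["A"], degeneracy["L"], degeneracy["M"])  # 4 6 1
```

Alanine is read from four codons and leucine from six, whereas methionine has a single codon, which is what degeneracy means in practice.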

Figure 2.

Theory of the genetic code evolution. This shows the evolutionary pathway going from the GNC code (4 codons) to the SNS code (16 codons) to the universal genetic code (64 codons). (A) Adapted from Massimo Di Giulio (72). (B) Adapted from Kenji Ikehara (10). (C) Instead of the conventional representation, the modern genetic code is shown reflecting the order of codon occurrence (columns G and U inverted).

Many theories have been put forward about the origins of the genetic code. Francis Crick first described merely a ‘frozen accident’ (1). This would mean that the system of twenty amino acids and their designated codons was good enough to work, but too resistant to change to improve. Other possibilities are: that the current shape of the code depends greatly on specific primordial biochemical interactions (such as those between RNA and amino acids) (6); that the modern genetic code grew from an earlier and simpler code through a ‘biosynthetic expansion’ process (7); or that it resulted from information channels (8). Though the different forces that led to the evolution of the genetic code are unknown, there is a generally accepted model describing its appearance. In this, an RNA charged with an amino acid (proto-tRNA) targeted a proto-mRNA by complementary interactions. The evolution of interactions between the proto-tRNA and the proto-mRNA then gave birth to the first mRNAs coding for non-random short peptides (9). This reaction was probably promoted by a proto-ribosome. Today, based on the biosynthetic pathway of each amino acid and on the coevolution of transfer RNA (tRNA) and aminoacyl-tRNA synthetases (aaRS), we can reliably predict the order in which the codons appeared. The first generation of codons were of the GNC type (N = A, U, C, or G), while the second generation were SNS (S = C or G) (10). This early genetic code continued to evolve, maximizing its efficiency, until it became as it is today (Figure 2).
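The codon counts quoted for the GNC code (4 codons) and the SNS code (16 codons) in Figure 2 can be reproduced with a minimal Python sketch. It is illustrative only: the helper name `codons_matching` is hypothetical, and the amino acid assignments are read from the modern standard code, which is an assumption, since primitive assignments may have differed.

```python
from itertools import product

BASES = "UCAG"
# Modern standard genetic code in UCAG order ('*' = stop). Using it for
# primitive codons is an assumption made for illustration.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_matching(pattern):
    """Expand a three-letter pattern where N = any base and S = G or C."""
    choices = {"N": "UCAG", "S": "GC"}
    return ["".join(c) for c in product(*(choices.get(ch, ch) for ch in pattern))]

gnc = codons_matching("GNC")
sns = codons_matching("SNS")
gnc_aas = sorted({CODON_TABLE[c] for c in gnc})
sns_aas = sorted({CODON_TABLE[c] for c in sns})

print(len(gnc), gnc_aas)  # 4 ['A', 'D', 'G', 'V']
print(len(sns), sns_aas)  # 16 ['A', 'D', 'E', 'G', 'H', 'L', 'P', 'Q', 'R', 'V']
```

Under this reading, the four GNC codons specify valine, alanine, aspartic acid and glycine, and the sixteen SNS codons specify ten amino acids.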

MODERN RIBOSOMES AND PROTO-RIBOSOMES

Nowadays, the information carried by DNA is first transcribed into mRNA, which in turn can be translated into proteins by the ribosome. Ribosomes are made of two subunits, themselves made up of ribosomal RNA (rRNA) and proteins. The ribosomes translate mRNA codons into one of the twenty amino acids that make up proteins. In a broad outline of the process (see (11) for a complete review), the small subunit permits the decoding of genetic information carried by mRNA, while the large subunit catalyzes the link between amino acids (peptide bonds). The first atomic resolution structures of the ribosome allowed for a precise description of its catalytic heart, the Peptidyl Transferase Center (PTC). Strikingly, the ribosome is a ribozyme, since only RNA performs its key role of peptide bond formation (12,13). The PTC's highly conserved structure is symmetrical, made up of a ‘stem–elbow–stem’ RNA of approximately 180 nucleotides. Its structure probably originated from the dimerization of two stem–elbow–stem motifs forming a pocket. This form evolved because it offers the best substrate orientation for peptide bond catalysis. The PTC is currently considered to be the first proto-ribosome dating from the RNA world (14–16).

ORIGINS OF TRANSFER RNA: BRIDGING TWO GENETIC CODES

Transfer RNA (tRNA) molecules play a major role in linking the genetic information carried by mRNA codons to the corresponding amino acids necessary for protein synthesis. Because of their central role, tRNAs have two distinct characteristics corresponding to two different genetic codes (17). First, a tRNA carries the anticodon, a specific nucleotide triplet which corresponds to the mRNA codon. Secondly, each tRNA also binds with high specificity to the amino acid corresponding to its anticodon, in a reaction catalyzed by a specific aminoacyl-tRNA synthetase (aaRS). In this sense, tRNA is a key molecule for combining ribonucleotide information (ancient RNA world) and peptide information (modern protein world). The specific attachment of a particular amino acid to its corresponding tRNA is referred to as the ‘second genetic code’. In fact, this second code must have appeared first (17) then evolved together with the aminoacyl-tRNA synthetases, a family of enzymes believed to be among the oldest proteins on earth (18). Today, aaRSs discriminate between different tRNAs by recognizing elements in both the anticodon loop and acceptor stem of the RNA (19). The tRNA molecule is present in all organisms and its secondary structure is among the most evolutionarily conserved. Consequently, it is commonly accepted that at least from the Last Universal Common Ancestor (LUCA), its origin is monophyletic. Several models propose explanations for the molecular mechanisms leading to the formation of modern tRNA (Figure 3). Ancestral tRNA could have been encoded by split genes, which later were merged to encode modern tRNA (Figure 3A) (20). Modern tRNAs could instead be the result of a fusion between two ancient RNA minihelices (Figure 3B) (21). A third model is based on the sequence analysis of archaeal tRNA genes, focusing on the presence of introns for clues (Figure 3C) (22). 
Indeed, during archaeal evolution the nucleotide sequences of the 3′-end (CCA sequence) were more conserved than those of the 5′-halves. Since the 3′-end is required for adding the amino acid residue to tRNA, it would seem that the 3′-arm evolved first. A variety of tRNA species would have been generated at a later stage through asymmetric combination with different 5′-tRNA halves. Accordingly, most tRNA sequences have vestiges of double hairpin folding, suggesting that the structure of tRNA molecules could have been the result of double hairpin formation in the ancient prebiotic world (23).

Figure 3.

Different models explaining the origins of tRNA. (A) tRNA may originate from the dimerization of two hairpin structures. ANTI, anticodon; ID, the discrimination region for identifying tRNA. The triangle represents the position where the intron is found in tRNA genes (73). (B) tRNA may originate from the late fusion between two RNA minihelices. A new aminoacyl tRNA synthetase (aaRS) domain links the operational RNA code to the trinucleotides of the genetic code (21). (C) tRNA may originate from the fusion of split genes of non-contiguous tRNAs (22).

Whatever the scenario, tRNA evolution is closely linked to aminoacylation. Although it is not precisely known how it operated in the RNA world, aminoacylation can be achieved through spontaneous chemical reactions. Indeed, it is possible to aminoacylate an RNA minihelix with an aminoacyl phosphate oligonucleotide (24), which would explain the importance of amino acid chirality in this mechanism. In another study it was shown that a simple pocket formed by a complex of four nucleotides (the C4N RNA hairpin) can be aminoacylated with specificity at its 3′-end by a simple Val–Asp dipeptide (25). The emergence of aaRS-like ribozymes certainly increased reaction specificity. Such ribozymes, referred to as flexizymes, have not yet been identified in vivo, but they were isolated by means of in vitro selection (26). In the early stages of the translational system's development, the discrimination was made close to the CCA 3′-end (27). During evolution, it moved away from the acceptor stem loop until it arrived at the current situation, with aaRS recognition performed in both the anticodon and acceptor stem loops (28).

FUNCTION AND STRUCTURE OF TRANSFER-MESSENGER RNA

In all bacteria, trans-translation is the primary rescue system that permits the release of stalled ribosomes as well as the elimination of the related incomplete proteins and mRNA. A particular RNA performs this process: tmRNA associated with Small protein B (SmpB). tmRNA is a hybrid molecule carrying out both transfer and messenger RNA activities, and its total length varies between about 260 and 430 nucleotides, depending on the species (Figure 4). It is always aminoacylated by alanine. As for SmpB, its topology makes it similar to several other modern RNA-binding proteins associated with translation, such as ribosomal protein S17, aspartyl tRNA synthetase, or the prokaryotic translation initiation factor IF1 (29). This structural link not only suggests that they may share a common ancestor or be linked by an evolutionary relationship, but also that they emerged late, when modern translation was already set up. Thanks to SmpB, the tRNA-like domain (TLD) of tmRNA recognizes and enters the vacant decoding site of ribosomes stalled at the 3′-end of truncated mRNAs, thus restarting translation. In a sophisticated ballet, protein synthesis then switches on the mRNA part of tmRNA. This allows the truncated protein to be extended by a short sequence of amino acids that tags it for destruction by cellular proteases, while the incomplete mRNA is released to be destroyed by ribonucleases (for a full review see (30)). Trans-translation occurs frequently, accounting for as much as 2–4% of translation reactions in Escherichia coli (31). It is found almost exclusively in the bacterial world (32), with a few exceptions. Trans-translation is also found in the plastomes of some primitive algae (33–35); in some rare diatoms (Stramenopila) that acquired genes from marine bacteria (36); and in the Jakobids, sub-lineages of the very distant organismal group Excavates (37). The Jakobids are interesting due to their uniquely bacterial-like mitochondrial genomes (38). 
They are considered to be some of the most ancient living eukaryotes, which would explain the presence of tmRNA in their mitochondrial genome. Although faced with the same need for quality control of protein synthesis, eukaryotic and archaeal cells have distinct rescue systems that have evolved differently. These include nonsense-mediated decay (NMD), no-go decay (NGD), and nonstop decay (NSD) in eukaryotes (39,40); and NGD in archaea (41). At the beginning of the DNA–RNA–protein era, translation errors and (proto)-ribosome stalling must have been far more frequent, and the need for an unlocking system such as trans-translation was even greater. Therefore, one can assume that a selective pressure led to the appearance of tmRNA then to its high conservation during evolution, an assumption which seems confirmed by the fact that tmRNA is now ubiquitous in all bacteria. On the other hand, SmpB is likely to have arisen as modern tmRNA needed a more accurate way to recognize and rescue stalled ribosomes. tmRNA might therefore be the molecular fossil that coevolved with the translational system up to the present day.

Figure 4.

tmRNA secondary and tertiary structures. (A) Secondary structure diagram of Thermus thermophilus tmRNA. Watson–Crick base pairs are connected by lines and GU pairs are represented by dots. Domains are in color: the tRNA-like domain (TLD) is blue; helix 2 (H2) is red; pseudoknot 1 (PK1) is orange; helix 5 (H5) is brown; pseudoknot 2 (PK2) is green; pseudoknot 3 (PK3) is pink; and pseudoknot 4 (PK4) is light blue. The nucleotides within the internal open reading frame (ORF) are underlined and shown in a larger font. The resume codon is yellow and the STOP codon is indicated. (B) Cryo-EM map of the alanyl-tmRNA-SmpB complex bound to a stalled ribosome. 3D molecular model of tmRNA based on the homology modeling of each independent domain followed by flexible fitting into the cryo-EM density map of the accommodated step (74). EMDB entry: EMD-5188. Color code: tmRNA domains are the same as in (A); SmpB is magenta; the small 30S subunit is yellow; the large 50S subunit is light blue.

ORIGINS OF TRANSFER-MESSENGER RNA

Introns: common features in tmRNA and tRNA?

In the three domains of life, some rare tRNA molecules still undergo splicing of non-coding sequences (introns) located within the anticodon stem–loop (Figure 5A) (42). In bacteria, these introns are eliminated by self-splicing, while in archaea and eukaryotes they are taken care of by endonucleases. The origin of these introns is still being debated. In an ‘introns-early’ scenario (27,43), all essential tRNA genes had an intermediate block in the anticodon loop that had to be removed, and large parts of these introns were lost during evolution. A second scenario is ‘introns-late’ (44), and it theorizes that introns were inserted into genes after the emergence of tRNA. Although there is no evidence for a biological role for these non-coding intron regions, it must be noted that they are present in all three domains of life. The selective pressures that allowed the conservation of some of them up to the present day remain unclear, but their presence seems to be a signature of tRNA evolution. Remarkably, a simple comparison of the secondary structures shows that the positioning of the introns in the anticodon stem–loop of modern tRNA is the same as the large and structured ring abutting the TLD of tmRNA (Figure 5B). Other similarities—such as the length of these extra sequences and the presence of highly structured stem–loops—reinforce the idea that there is a link between modern tmRNA and tRNA. In bacteria, these introns belong to self-splicing group I introns which must be ancestral as they have characteristics that are omnipresent in different bacterial phyla and as they have self-catalytic activity reminiscent of the RNA world (45). The typical secondary structure of a group I intron consists of approximately ten paired elements organized into three domains (46). Self-splicing group I introns use a mechanism based on two distinct guanines as a cofactor (Figure 6A). 
The first guanine is located at the beginning of the intron in the Internal Guide Sequence (IGS), a highly conserved element in a stem–loop structure. An interesting particularity of this stem–loop is that the guanine used for the self-splicing reaction is paired with a uracil (G:U) (Figure 6A). This mismatch is conserved in all pre-tRNA molecules having a group I intron (47). Strikingly, this mismatch is also found in the TLD of tmRNA, at position G24:U328 ± 1 (Figure 6B), which would correspond to the start of a pre-tRNA intron (Figure 6C). The presence of an intron in the tRNA anticodon loop remains so puzzling that neither its origins nor any evolutionary advantage that may have led to its preservation is known (48).

Figure 5.

Position and secondary structure similarities between the tRNA intron and tmRNA pseudoknots. (A) The tRNA intron in the three major kingdoms (Bacteria, Archaea and Eukaryota), adapted from Akio Kanai (22). Introns are raspberry red and mature tRNA is gray. Intron clipping sites are indicated with black arrows. (B) Secondary structure of tmRNA. The tRNA-like (TLD) and mRNA-like (MLD) domains are indicated, and the pseudoknots are raspberry red. Note the similar positioning of the tRNA introns in the three domains of life and in the other tmRNA domains.

Figure 6.

Similarity of the positions of G:U mismatches in the tRNA intron and in tmRNA. (A) Secondary structure of the Azoarcus group I intron. Exon sequences are in lower case and blue, while introns are in upper case letters, with red arrows indicating the splice boundaries. The conserved G-U mismatch necessary for self-splicing and the guanine partner are red. (B) Secondary structure of the Escherichia coli tRNA-like domain (TLD) of tmRNA. The conserved G-U mismatches in the TLD are red. A similarity in position between the G-U mismatch of the tRNA intron and the TLD is noticeable. (C) Secondary structure of a tRNAIle (CAU) from Azoarcus. The red arrow indicates the insertion site for the intron shown in (A). The exon sequence common to figures (A) and (C) is blue. Mismatches are signaled by dots.

Although it is difficult to predict whether tmRNA derives from a tRNA carrying a group I intron or vice versa, one could imagine a scenario in line with the introns-early hypothesis (27). In this variation, a universal proto-tmRNA with a large intron could be the common ancestor of both tRNA and tmRNA molecules. Gradually, some portions of this loop would have been lost, finally giving birth to the modern panel of tRNAs (49). tmRNA could result from the evolution of this loop into an internal open reading frame, encoding several codons. 
tmRNA probably arose very early, evolving from a proto-tmRNA that was aminoacylated by alanine only (see below).

Alanines at the crossroads of tmRNA-based aminoacylation and tagging events

Alanine is the simplest chiral amino acid, and along with glycine and aspartic acid was among the first to be present on earth (50–52). Accordingly, alanine was one of the first amino acids encoded by the first generation of codons (Figure 2). Remarkably, tmRNA is always charged by alanyl-tRNA synthetase (AlaRS), a class II tRNA synthetase that also catalyzes the esterification of alanine to tRNAAla. In contrast to most aminoacyl-tRNA synthetases, AlaRS does not require a specific signature on the anticodon moiety of the corresponding tRNA. Instead, it depends mainly on the presence of both a conserved G3:U70 base pair in the acceptor stem of the tRNAAla isoacceptor, and an adenosine at the discriminator position adjacent to the 3′-terminal CCA (53). This exception is extremely conserved within the three branches of life, and without a doubt reflects the ancient recognition of RNA minihelices by the first enzymes (54). Strikingly, this age-old signature is always present in tmRNA sequences, supporting its early appearance as well as the maintenance through the ages of specific aminoacylation of tmRNA, even in the absence of an anticodon (18,55). Similarly, four alanines are also encoded by the mRNA-like domain (MLD) that has the following consensus sequence: A*AN----ALAA (e.g. A*ANDENYALAA in E. coli, with the first A* carried by the tmRNA TLD) (Figure 7A). The first alanine codon is essential to trans-translation. Indeed, this region is important because it ensures the proper placement of the reading frame which allows resumption of translation in the absence of a start codon (56). Remarkably, when comparing the sequences of the resume codon among the various bacterial species, the more primitive the bacteria, the more ancestral GNC codons are recovered (Figure 7B–C). The three conserved tmRNA alanine codons (--------A-AA) that follow are also crucial, allowing specific recognition of the tagged protein by proteases. 
Indeed, most tmRNA-tagged proteins are degraded by ClpXP or less frequently by ClpAP proteases. Within the typical tagging sequence, ClpX binds the C-terminal residues (---------LAA), whereas ClpA recognizes both C- and N-terminal tag residues (AA-----ALA-) (57,58). Degradation is also influenced by the adaptor protein SspB that binds to the tag's N-terminal portion to deliver the tagged peptide to ClpXP (59).
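The tag patterns described above can be expressed as simple string checks. The following Python sketch is a deliberately simplified illustration: the regular expressions are one reading of the motifs quoted in the text (treating A* as a single alanine residue, so that the E. coli tag is the 11-residue string AANDENYALAA), the function name `describe_tag` is hypothetical, and real protease recognition is of course not reducible to pattern matching.

```python
import re

# Patterns transcribed from the motifs in the text; '.' stands for any residue.
# They are an interpretation for illustration, not an established tool.
CONSENSUS = re.compile(r"^AAN.{4}ALAA$")   # A*AN----ALAA consensus
CLPX_C_TERMINAL = re.compile(r"LAA$")      # C-terminal residues bound by ClpX
CLPA_MOTIF = re.compile(r"^AA.{5}ALA.$")   # N- and C-terminal residues seen by ClpA

def describe_tag(tag: str) -> dict:
    """Report which of the text's motifs a one-letter tag sequence matches."""
    return {
        "matches_consensus": bool(CONSENSUS.search(tag)),
        "clpx_c_terminal_motif": bool(CLPX_C_TERMINAL.search(tag)),
        "clpa_motif": bool(CLPA_MOTIF.search(tag)),
    }

# E. coli tag from the text, including the alanine carried by the TLD.
print(describe_tag("AANDENYALAA"))
# A made-up variant with an altered C-terminus, used only to show a mismatch.
print(describe_tag("AANDENYALDD"))
```

The E. coli tag satisfies all three patterns, while the made-up variant with a modified C-terminus satisfies none of them.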

Figure 7.

Consensus of amino acids and distribution of the first codon of the mRNA-like domain in bacterial phyla. (A) Consensus sequence and diversity of the amino acids in the MLD. Sequence conservation is measured in bits, going from weakly conserved (0) to highly conserved (4). Alanine is in red and the other amino acids are in black. The alignment was obtained using 708 tag sequences from the UTHSCSA RNP database (http://rnp.uthscsa.edu/rnp/tmRDB/peptide/peptide.html) (66). The consensus logo was created using WebLogo (http://weblogo.berkeley.edu/logo.cgi) (75). (B) Analysis of the first MLD codon in different bacterial phyla. The first codon of the MLD is classified as either an ancestral codon (GNC) or another codon. The percentage of each is bold and the number of sequences is in parentheses. If the percentage of a codon in a phylum is greater than 50%, it is red. Non-bacterial tmRNA is yellow. Sequence alignments are presented for 940 different tmRNAs. They were taken from the ‘tmRNA website’ (76). Briefly, the first codon is identified thanks to the highly-conserved upstream sequence determinants. Of course, the resume codon sequence itself is another likely determinant. (C) Variations in the first MLD codon in the different bacterial phyla. The green line indicates ancestral codons for the first codon of the MLD, and the blue line indicates other codons. Adapted from the phylogenetic tree of all extant organisms based on 16S rRNA gene sequence data, as originally proposed by Woese.

tmRNA-mediated initiation and termination

In the RNA world, for an RNA to be considered a messenger molecule it had to be recognized through specific signatures. Today, this recognition is complex and involves many protein partners such as initiation and release factors. In the RNA world, the mechanism would have been simpler since only the PTC catalytic domain was present. Consequently, the first mRNAs with a non-random coding sequence needed signals for starting translation on the correct codon and for stopping it accurately. In fact, tmRNA fulfills all of these conditions. Indeed, the TLD of tmRNA can resume translation on its internal coding sequence, even in the absence of any canonical initiation factors. The presence of structured RNA regions upstream of a coding sequence is one of the simplest mechanisms known for initiating translation, as observed today in many tRNA-like structures (60,61). In the same way, tmRNA also carries a signal for terminating translation on its internal reading frame. All modern tmRNAs carry this stop codon within a stem–loop (Figure 4), known to be a basic signal to stop translation (62). Since in primordial times there were no release factors, stop codons, or ribosomes with helicase activity, the primitive stop signal was probably a stem–loop RNA structure. In the absence of a whole ribosome, the mRNA could not anchor to maintain interactions with the PTC catalytic domain. Yet these interactions were certainly necessary to promote the evolution of translation in the RNA world. Again, tmRNA possesses the ability to host these interactions via its non-coding domain. The large pseudoknot ring of tmRNA can perform many interactions with the translational complex and may have been used for anchoring the mRNA part to the first proto-ribosomes.

A UNIVERSAL PROTO-tmRNA AT THE ORIGINS OF MODERN tRNAs AND mRNAs?

We suggest that tmRNA fulfills all the criteria for being the missing link between an ancient RNA-driven world and the appearance of modern protein synthesis, which is organized around tRNA, mRNA and ribosomes. The model is based on an early prebiotic world in which various sets of short RNA molecules were first synthesized by the non-enzymatic bonding between free-floating nucleotides present in the primordial soup (Figure 8). Driven by evolutionary forces, these short RNAs would then have tended to pair with each other and/or fold into minihelices, forming hairpin structures. This stage likely promoted further interactions and permitted the first forms of the genetic code to emerge, while new RNA catalytic properties generated self-replicating molecules and the first aminoacylated RNA. We propose that in the second stage of an intriguing ‘intron scenario’, a primitive proto-tmRNA carrying a tRNA acceptor stem with a large intron was obtained by fusing two separate hairpin RNAs. Such an ancestral molecule would possess both proto-tRNA and proto-mRNA functions (archaic forms of peptidylation and coding, respectively) within a single molecule. This would in fact explain the emergence of the early genetic code, the two parts having evolved very closely. At this stage, proto-tmRNA could contain different acceptor branches charged by their specific early amino acids (e.g. alanine, see above). The large intronic loop abutted to the acceptor branch would contain several primitive codons (e.g. alanine codons). Indeed, according to a recent model by Di Giulio, the first mRNAs on earth might have also been peptidated, making tmRNA a perfect candidate to be one of the first mRNAs (63). However, it would also act as a strand of primitive anticodons (49) interacting with the primitive codons of other proto-tmRNAs simply by creating antiparallel duplexes as in modern anticodon-codon interactions. 
Even if we cannot rule out the theory that the first interactions between proto-tmRNAs occurred in the absence of any other partners (64), these proto-tmRNAs might have rapidly interacted with the first forms of the early peptidyl transferase center (PTC) while the first genetic code was evolving. Indeed, we can assume that the proto-PTC might have emerged in the same way, by self-folding and dimerization of RNA chains, thus providing the first framework for stereochemistry favoring peptide bond formation and substrate-mediated catalysis (16,65). At this stage, protein synthesis was surely limited to quite short peptides. Co-evolution between these peptides and RNAs might have created a ‘virtuous circle’, meaning that RNA persisted thanks to peptide protection, the increased RNAs produced more peptides, this led to longer RNAs and peptides, and so on (66). We propose that in a subsequent stage, proto-tmRNAs eventually gave birth to tRNA and mRNA, and evolved into modern tmRNA molecules. In this stage, tRNA was made by variable intronic processing, resulting in the placement of a particular anticodon in the contemporary anticodon position (49). Relics of such a transition can be found today in tRNA variable loops and/or introns, which may correspond to the intron loop's unprocessed excess nucleotides. As for tmRNA, it would have emerged from one of the proto-tmRNAs that had an ancestral tRNAAla acceptor branch (i.e. having a G:U base pair in the acceptor stem). Interestingly, AlaRS is one of the rare enzymes that continues to recognize only the tRNA acceptor stem (early ‘second genetic code’) rather than the tRNAAla anticodon. It is possible that, despite its lack of anticodon, the need for tmRNA to be correctly aminoacylated is part of the evolutionary force that drove AlaRS to not incorporate the anticodon into its recognition process (67). 
Due to its essential role in cells, tmRNA is present in all bacteria, reinforcing the idea that it arose before bacterial phylogenetic divergence (68). Three simple criteria can be used to establish RNA ‘antiquity’: catalysis, ubiquity, and central placement in the cellular metabolism (4). Many modern RNAs, particularly non-coding ones (ncRNAs), can thus be considered as relics of an ancient RNA world. However, during the slow and gradual replacement of RNAs by ribonucleoproteins (RNPs) and proteins in catalysis, there must have been a crucial step that allowed for the switchover to ribosome-mediated translation. While many RNAs meet the three criteria, introducing a central role in protein synthesis as a fourth criterion makes the game more complicated. Indeed, unlike tmRNA, only a few modern RNAs carry specific features related to translational events. Among these, tRNA-like structures (TLSs) are thought to be molecular fossils of the original RNA world. These structures are often located at the 3′-end of the single-stranded RNA genomes of many bacterial and plant viruses, allowing both for tRNA mimicry (including aminoacylation) and regulation of RNA genome replication (61). In an appealing ‘genomic tag model’, Weiner and Maizels (69) suggested that these 3′-terminal TLSs were molecular fossils that originally identified genomic RNA molecules for replication and also functioned as primitive telomeres. In this case, aminoacylation would have come along later, fortuitously via an aminoacylating replicase. Such a scenario does not, however, contradict our model. It could even explain how the first aminoacylated proto-tmRNAs arose after the cleavage of aminoacylated TLS-tagged genomic RNAs.

Figure 8.

Schematic model representing the different possibilities for the origin of tmRNA. In the beginning of the RNA world, there was a mix of different minihelices. A homodimerization between two of these would have generated the first proto-ribosome, the Peptidyl Transferase Center (PTC). With random amino acids or aminoacylated minihelices, the PTC created the first random peptide syntheses. A heterodimerization between two minihelices (plus sign) then surely generated a proto-tmRNA. In this step, the proto-tmRNA would have been used as a proto-mRNA. At the same time, the PTC must have evolved into a proto-ribosome by acquiring new RNA that improved its activity. The proto-ribosome would then have continued to evolve via the newly synthesized proteins, finally ending up as the ribosomal complex seen today. In contrast to this ribosomal development, the proto-tmRNA took varying evolutionary paths. Through self-splicing, it produced a proto-tRNA that evolved into modern tRNA. It also led to modern tmRNA, which now serves as a rescue system in all bacteria. In addition, proto-tmRNA provided the first non-random genetic medium, which evolved into the RNA genome and then into modern mRNA. However, we cannot exclude the theory that a proto-tRNAAla grew into modern tmRNA, through the insertion of nucleotides into its gene (question mark) (77).

Modern tmRNA has two main functions (embodied in mRNA and tRNA) that are indispensable to the emergence of protein synthesis. However, unlike isolated mRNA or tRNA, it is only found in bacteria and therefore does not fulfill the second criterion of being ubiquitous. Nevertheless, it is interesting to note that prokaryotes and eukaryotes/archaea share several common features, including the addition of a degradation signal to incomplete peptides and the specific degradation of problematic mRNAs.
However, in eukaryotes and archaea, the degradation pathway mainly involves the sophisticated ubiquitin-proteasome system (70,71), which suggests that tmRNA was lost with ubiquitin's appearance. Accordingly, Dom34p, the protein that mediates no-go decay (NGD), is universal in eukaryotes and archaea, suggesting that NGD is probably an ancient mechanism already present in their last common ancestor (41). Evolution of mRNA decay systems in eukaryotes might then have been driven by eRF1 and eRF3 gene duplications. We can therefore assume that tmRNA was lost very early, replaced by more elaborate protein-based quality control mechanisms (i.e. NGD, NSD and NMD as discussed above) in response to the greater diversity of potential clients in eukaryotes and archaea. In conclusion, we believe that a proto-tmRNA dating from the RNA world is the common ancestor of both modern tRNA and mRNA. This early RNA might have been the first player in protein synthesis and the evolution of the genetic code, making it one of the oldest known fossils of the RNA world.
