Sense codon emancipation for proteome-wide incorporation of noncanonical amino acids: rare isoleucine codon AUA as a target for genetic code expansion.

Abstract

One of the major challenges in contemporary synthetic biology is to find a route to engineer synthetic organisms with altered chemical constitution. In terms of core reaction types, nature uses an astonishingly limited repertoire of chemistries when compared with the exceptionally rich and diverse methods of organic chemistry. In this context, the most promising route to change and expand the fundamental chemistry of life is the inclusion of amino acid building blocks beyond the canonical 20 (i.e. expanding the genetic code). This strategy would allow the transfer of numerous chemical functionalities and reactions from the synthetic laboratory into the cellular environment. Due to limitations in terms of both efficiency and practical applicability, state-of-the-art nonsense suppression- or frameshift suppression-based methods are less suitable for such engineering. Consequently, we set out to achieve this goal by sense codon emancipation, that is, liberation from its natural decoding function - a prerequisite for the reassignment of degenerate sense codons to a new 21st amino acid. We have achieved this by redesigning of several features of the post-transcriptional modification machinery which are directly involved in the decoding process. In particular, we report first steps towards the reassignment of 5797 AUA isoleucine codons in Escherichia coli using efficient tools for tRNA nucleotide modification pathway engineering.

Keywords: codon emancipation; genetic code; orthogonal pairs; reassignment; sense codons; synthetic biology

Year: 2014 PMID： 24433543 PMCID： PMC4237120 DOI： 10.1111/1574-6968.12371

Source DB: PubMed Journal: FEMS Microbiol Lett ISSN： 0378-1097 Impact factor: 2.742

The genetic code: degeneracy, wobbling and codon usage

The nearly identical assignment of triplet codons to amino acids in all living organisms illustrates the universality of the genetic code and affirms the existence of a common origin. With only very few exceptions, ribosomal synthesis is restricted to 20 naturally occurring amino acids. Codon assignment in living cells relies on highly selective aminoacyl-tRNA synthetases (aaRS) which recognize both the amino acid to be charged and the related tRNA. By catalyzing a specific aminoacylation reaction, these enzymes provide the critical link between amino acids and codons and thereby establish the genetic code. As a general rule in living cells, one aminoacyl-tRNA synthetase exists for each canonical amino acid. The most remarkable feature of the genetic code is its degeneracy. Of 64 possible triplet combinations, 61 sense codons encode the 20 amino acids, while three stop codons (UAA, ocher; UGA, opal; and UAG, amber) are used to terminate translation (see Fig.1). With the exceptions Met and Trp, most amino acids are thus represented by more than one codon. On average, each amino acid is encoded by three different codons. For example, Ile is encoded by AUC, AUU, and AUA. Surprisingly, these three codons are decoded by only two different tRNA molecules. The reason for this is wobble base pairing of the third codon position as proposed by Crick (1966). Different triplets encoding the same amino acid often vary only in the third position as evident from the aforementioned Ile codons. In this case, the first two codons, AUC and AUU, are decoded by the same anticodon GAU, either involving a G-C Watson–Crick base pairing or a G-U wobble base pairing. In general, the first two bases of a triplet form Watson–Crick base pairings, whereas the third position is able to establish wobble base pairings. 
In the case of the third Ile codon, AUA, the first base (wobble position) of the anticodon CAU is post-transcriptionally modified with a lysine to form lysidine (L), which enables A–L wobble base pairing (details are described below).
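The degeneracy figures quoted above can be checked with a short Python sketch. It is an illustration only, not part of the original work: the names `CODON_TABLE`, `DEGENERACY`, and `codons_for` are assumptions introduced here, and the table is written in the conventional U, C, A, G ordering with one-letter amino acid codes.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map every RNA triplet (UUU, UUC, ..., GGG) to its amino acid or stop signal.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of sense codons assigned to each amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def codons_for(amino_acid):
    """Return all codons assigned to a one-letter amino acid code ('*' for stop)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


if __name__ == "__main__":
    sense = sum(DEGENERACY.values())
    print("sense codons:", sense)
    print("stop codons:", codons_for("*"))
    print("mean codons per amino acid:", sense / len(DEGENERACY))
    print("Ile codons:", codons_for("I"))
    print("single-codon amino acids:",
          [aa for aa, n in DEGENERACY.items() if n == 1])
```

Running the script lists the three Ile codons and shows that only Met (M) and Trp (W) have a single codon.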

Figure 1

Presentation of the genetic code in RNA format. Sixty-one triplets are assigned to 20 canonical amino acids. The three triplets UGA, UAA, and UAG are used as termination codons. All amino acids (except Met and Trp) are represented by more than one codon. The majority of proteins and peptides reach their active conformation after being post-translationally modified (PTM) or processed. Only a small portion of cellular proteins is changed in a co-translational manner, involving special proteinogenic amino acids such as selenocysteine and pyrrolysine. For example, most of all known regulatory and localization signals (e.g. phosphorylation and glycosylation) are generated by post-translational covalent side-chain modifications. These processes, catalyzed by enzymes and enzyme assemblies, are strictly separated from translation and represent events highly coordinated in time and space. For instance, in contrast to serine, its noncanonical counterpart phosphoserine, although frequently found in proteins, does not itself participate in translation (Adapted from Michael Hösl).

Another important feature of the genetic code is the codon usage, the abundance of a codon in a particular DNA sequence, which can vary significantly between different organisms. For Ile decoding by Escherichia coli, AUU and AUC are frequently used, while the AUA codon is rarely used. A comparison of AUA codons used per 1000 bases in E. coli (3.7), Saccharomyces cerevisiae (17.8), and Homo sapiens (7.5) highlights significant differences (Nakamura et al., 2000). At the time of its deciphering, the genetic code was suggested to be universal in all organisms. Crick proposed a first theory on its evolution saying that the genetic code was a result of a ‘frozen accident’ unable to evolve further, since ‘no new amino acid could be introduced without disrupting too many proteins’ (Crick, 1968; Söll & RajBhandary, 2006).
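Codon usage of the kind compared above can be estimated from any in-frame coding sequence. The following Python sketch is illustrative only: the function names and the toy sequence are hypothetical, and the normalization (occurrences per 1000 codons) is an assumption about how such tables are usually reported, not a reproduction of the cited analysis.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of a coding sequence given as DNA or RNA text."""
    seq = cds.upper().replace("T", "U")   # work in RNA notation throughout
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def per_thousand(cds, codon):
    """Frequency of one codon, expressed per 1000 codons of the sequence."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    return 1000.0 * counts[codon] / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy reading frame: AUG AUA AUU AUC AUA UAA
    toy_cds = "ATGATAATTATCATATAA"
    print(codon_counts(toy_cds))
    print("AUA per 1000 codons:", per_thousand(toy_cds, "AUA"))
```

Applied to all annotated coding sequences of a genome rather than a toy string, the same counting would yield genome-wide figures such as the number of AUA codons.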

Natural variations of the genetic code

The theory of a frozen universal genetic code is believed to be disproved by genetic code variations found in vertebrate mitochondria, where AUA encodes Met instead of Ile and UGA encodes Trp instead of being a translational termination signal (Barrell et al., 1979; Osawa et al., 1992). Today, around 20 variations of the standard genetic code are known, not only in organelles but also in free-living microorganisms (Ambrogelly et al., 2007; Campbell et al., 2013; O'Donoghue et al., 2013). Well-documented natural variations suggest the feasibility of genetic code expansion. The design of organisms capable to decode more than the common 20 building blocks could lead to proteins or even organisms with novel activities (Budisa, 2005). One of the earliest discovered variations of the genetic code were found to be deviations in codon assignment between nuclear and mitochondrial genes (Barrell et al., 1979; Himeno et al., 1987; Osawa et al., 1992; Wolstenholme, 1992; Söll & RajBhandary, 1995). Deviations in the codon meanings, especially in various mitochondrial genomes, clearly indicate that in the context of the smaller (‘minimal’) genome and proteome, such reassignments in the code structure are more likely, as lethal effects are less possible (Budisa, 2005). In other living beings, species-specific reassignments are sometimes employed as beneficial adaptive variations in a particular environment. One example is the Leu→Ser change for the CUG codon in some Candida species which probably leads to better survival of these mainly parasitic species in their hosts (Tuite & Santos, 1996). Disturbing effects of such changes are avoided in these cells as CUG is usually not used as a codon in the cellular mRNA. This reassignment is ‘reserved’ for specific genomic islands of this Candida pathogen; ‘island genes’ are usually involved in stress response (like heat shock) and pathogenicity (Suzuki et al., 1997; Santos et al., 2004). 
Changes in codon meaning can be classified into two general categories. The first category includes reassignment of standard termination codons (UGA, UAA, and UAG) to Trp or Gln in some prokaryotes, archaea, mitochondria of several organisms, and even in nuclear genes of some protozoa (e.g. in Mycoplasma mobile the UGA termination codon encodes Trp). The second category of codon reassignment includes altered meanings in mitochondrial sense codons such as Ile→Met (Barrell et al., 1979; Budisa, 2005). The mitochondrial AUA codon is deciphered by a tRNAMet containing the unusual modified nucleotide 5-formylcytidine (f5C) at the first position (number 34) of its anticodon. This tRNA translates both AUG and AUA as methionine (Takemoto et al., 2009). Sixty-two such reassignments are known in mitochondria, but only one in nuclear genes: CUG Leu→Ser in Candida cylindracea (see above) (Osawa et al., 1992; Tuite & Santos, 1996; Suzuki et al., 1997; Santos et al., 2004). As they do not affect the amino acid repertoire, such reassignments might be seen as flexibility in the ‘frozen’ structure of the genetic code. To date, two examples of a naturally expanded genetic code are known. An in-frame UGA termination codon also encodes selenocysteine (Sec), the 21st cotranslationally inserted amino acid. This recoding mechanism requires a tRNAUCA, a specialized translation elongation factor (SelB) and a particular mRNA stem-loop structure, known as the selenocysteine insertion sequence (SECIS) element (Söll, 1988; Schön et al., 1989; Cusack et al., 2005; Ambrogelly et al., 2007). UAG is ambiguous in Methanosarcinaceae where, in addition to serving as a termination codon, it also encodes pyrrolysine (Pyl), the 22nd cotranslationally inserted amino acid. In this case, a new tRNA synthetase, pyrrolysyl-tRNA synthetase (PylRS), is essential for this recoding event (Söll, 1988; Hao et al., 2002; Srinivasan et al., 2002; Blight et al., 2004; Polycarpo et al., 2004). 
In fact, no known examples for codon reassignment introduce a noncanonical amino acid in response to a sense codon.
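The effect of such natural reassignments on translation can be illustrated by reading the same message with different code tables. This Python sketch is a simplified illustration: `with_reassignments` and `translate` are hypothetical helper names, and the variant tables contain only the changes named in the text above, so they are deliberately partial rather than complete descriptions of any organism's code.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code; '*' marks a stop codon.
STANDARD = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def with_reassignments(changes):
    """Return a copy of the standard code with the given codons reassigned."""
    variant = dict(STANDARD)
    variant.update(changes)
    return variant


# Only the reassignments mentioned in the text (partial tables, by assumption).
VERTEBRATE_MITO_PARTIAL = with_reassignments({"AUA": "M", "UGA": "W"})
CANDIDA_CUG = with_reassignments({"CUG": "S"})


def translate(mrna, table):
    """Translate an in-frame RNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    message = "AUGAUAUGA"
    print("standard:", translate(message, STANDARD))
    print("vertebrate mitochondria:", translate(message, VERTEBRATE_MITO_PARTIAL))
    print("CUG in Candida:", translate("AUGCUG", CANDIDA_CUG))
```

Note that both variant tables still draw on the same 20 amino acids, which is the point made above: these reassignments change codon meaning without enlarging the repertoire.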

Experimental variations of the genetic code

As part of a synthetic genetic code, canonical amino acids can be replaced by noncanonical amino acids (ncAAs) (engineering), or alternatively, an ncAA can be added to the natural repertoire (expansion). Genetic code engineering commonly involves the use of wild-type aminoacyl-tRNA synthetases (aaRSs) to incorporate ncAAs that are close structural analogs of canonical amino acids. In this approach, a strain auxotrophic for one canonical amino acid (cAA) is used for substitution of the cAA with a structurally similar ncAA analog. In the newly synthesized proteins, the canonical amino acid is efficiently replaced by its analog at all sites (global replacement). This method is often called residue-specific incorporation, as all codons coding for a given amino acid are reassigned (Budisa, 2005). Basically, genetic code engineering leads to proteins containing 19 cAAs and one ncAA or 18 cAAs and two ncAAs. Up to three different ncAAs incorporated into a single protein were reported with this method (Lepthien et al., 2010). Wang et al. (2004) showed that one mRNA can be translated in different ways depending on the relative rates of competing aminoacylation reactions. Depending on the intracellular ncAA levels and whether the host is engineered with an enhanced ValRS or IleRS activity, (2S,3R)-4,4,4-trifluorovaline could be assigned either to isoleucine or valine codons (Wang et al., 2004). Using this genetic engineering approach, proteins with new activities and properties can be generated, but the amino acid repertoire is still restricted to 20 amino acids. Genetic code expansion allows for the incorporation of a 21st amino acid. This is usually achieved by assignment of a ‘blank’ codon (nonsense or frameshift suppression). This method enables the site-specific incorporation of an ncAA within the growing polypeptide chain, with the mRNA containing a nonsense codon at the desired position. 
This is realized by the use of an orthogonal suppressor aaRS:tRNA pair (o-pair) capable to decode the nonsense codon. To date, various orthogonal aaRS:tRNA pairs have been developed (Liu & Schultz, 2010) with the TyrRS from Methanocaldococcus jannaschii (mjTyrRS) being most frequently used (Wang et al., 2001). More recently, pyrrolysine aaRS (PylRS) and its cognate suppressor tRNA pylT from Methanosarcinaceae species have been shown to be highly potent orthogonal ncAA systems (Mukai et al., 2008; Neumann et al., 2008; Huang et al., 2010). In contrast to mjTyrRS, the ‘natural PylRS:tRNA o-pair’ already displays a broad substrate tolerance towards orthogonal ncAAs, but PylRS does not utilize the common 20 amino acids as substrates (Polycarpo et al., 2006; Li et al., 2009). Within a short time, a large number of different PylRS variants for incorporation of versatile ncAAs were developed (Fekner et al., 2010). Nonsense suppression and frameshift suppression are widely used in academic research, although these approaches are still limited by several means. Both typically result in low protein yields due to cellular toxicity of the ncAA, suffer from context effects, and compete with the highly specialized endogenous termination machinery (Doerig et al., 1988; Lepthien et al., 2010; Young et al., 2010). A detailed description about genetic code engineering and expansion is given elsewhere (Wang & Schultz, 2004; Wang et al., 2006; Liu & Schultz, 2010; Antonczak et al., 2011; Hoesl & Budisa, 2012).

Sense codon-specific incorporation of ncAAs

Ultimately, the incorporation of an ncAA typically involves outcompeting the natural aaRS substrate, recoding a stop codon, or the use of four to five base codons. As most amino acids are decoded by more than one codon (exceptions: Trp and Met), our future aim is to emancipate a codon from its natural meaning and to reassign it to an ncAA. This would allow proteome-wide incorporation of a 21st amino acid in response to a sense codon. It is assumed that between 30 and 40 sense codons are required to encode the genetic information of an organism. Therefore, a large number of sense codons (> 20) should be available for recoding experiments (Krishnakumar et al., 2013). Correspondingly, the genetic code would be substantially expanded by such proteome-wide replacements. To our understanding, a synthetic organism with an expanded genetic code is not able to live without the ncAA, as the newly reassigned sense codon then becomes unassigned. One step towards sense codon reassignment was described by Kwon et al. (2003). In E. coli, Phe is naturally encoded by UUC and UUU codons, and a single tRNAPhe decodes both via Watson–Crick base pairing and wobble base pairing, respectively. Introduction of a heterologous tRNA with an AAA anticodon and a corresponding aaRS (yPheRS) from S. cerevisiae resulted in competition between the two tRNAs for the UUU codon. The synthetic site of yPheRS was engineered to activate the ncAA 2-naphthylalanine. Depending on experimental setup, assignment of UUC to Phe was quantitative and UUU translation into 2-naphthylalanine reached 80% (Kwon et al., 2003). Nevertheless, the cells were still able to incorporate Phe at UUU positions. In the absence of the ncAA, the cells could easily grow using their endogenous tRNAPhe for the incorporation of Phe at UUU positions. A complete sense codon-specific reassignment would lead to cell death in the absence of the ncAA. 
An overview of the reassignment of sense codons in vivo and the corresponding methods is given by Link & Tirrell (2005), and the topic is extensively discussed by Budisa (2005, pp. 97–100).

Choice of the degenerate codon: AUA as a target for reassignment

Real expansion of the genetic code will take place once when we would be able to add synthetic amino acids in a proteome-wide manner by genome-wide reassignments of target codons. Here, we elaborate the concept and report first experimental steps towards the emancipation/liberation of degenerate sense codons from their canonical function, a step towards estranging the genetic code that should enable addition of novel amino acids to the existing repertoire. For the choice of the sense codon to be reassigned, two aspects were considered highly important. The first important point is the presence of a specific modification pathway, which is necessary for the decoding process of the codon. Through the selective interruption of this modification, the decoding mechanism can be eliminated. Thus, the codon is liberated from its original meaning and is available as an open sense codon. Now, the certain codon can be reassigned to a noncanonical amino acid. To minimize disrupting effects through the appearance of an ncAA in the whole proteome, it is favorably to select a codon rarely used. The second important selection criterion is the choice of a noncanonical amino acid whose proteome-wide replacement will be tolerated by the cell. Therefore, choosing a codon which encodes an amino acid with little metabolic involvement and catalytic functionality might be a good starting point. In the genetic code, all coding triplets with a central U (the so-called XUX group of codons) are cognate to apolar (hydrophobic) amino acids, for example Leu, Ile, and Val. These residues are chemically inert and rarely directly involved in the catalytic function of proteins, but frequently play an essential role in protein folding as well as in binding/recognition of hydrophobic ligands. 
Their synthetic chemical analogs such as trifluorleucine have been known for more than 50 years to be translationally competent, allowing their incorporation into the nascent polypeptide chain (Rennert & Anker, 1963). With Ile rarely involved in catalytic reactions, the rare codon AUA perfectly fits for reassignment experiments. In E. coli, the codon usage of AUA is low. With only 5797 occurrences in the genome, it is usually involved in regulatory events such as ribosomal pausing (Lajoie et al., 2013a). Consequently, we speculated that replacement of Ile by ncAA at AUA positions might be tolerated by the E. coli cell. Next, codon ambiguity has widespread effects on protein function mainly resulting in detrimental effects on cellular fitness. However, these growth defects can be compensated by multiple mutations in various proteins by improving their ability to grow in the presence of the ncAA. For example, Philippe Marliere and coworkers generated E. coli mutant strains capable of installing translational pathways specific for the incorporation of additional amino acids into proteins in vivo (Lemeignan et al., 1993). Their selection procedure resulted in stable bacterial strains with coding triplets that can be read ambiguously. Similarly, Döring & Marlière (1998) constructed stable bacterial strains with ambiguous reading of the rare coding triplet AUA by overexpression of CysRS:tRNACys. Experimentally designed selective pressure yielded almost full reassignment (Ile→Cys) of the AUA codon in E. coli. Such proteome-wide substitutions were possible as Ile side chains evidently do not act as catalytic residues in any enzyme of the E. coli proteome (Döring & Marlière, 1998; Budisa, 2005). Although the repertoire of amino acids was not enlarged, it could be shown that Cys miscoded at the Ile AUA codon was well tolerated by the cell and that codon reading as part of normal translation can be altered in vivo. Döring et al. 
(2001) were also able to globally substitute Val by the ncAA aminobutyric acid (Abu) in a quarter of the cellular proteome using an E. coli strain with defective editing function of ValRS (Döring et al., 2001). Furthermore, a nonreverting strain of E. coli with impaired editing function has been generated. Under Ile starvation conditions and relative to a wild-type E. coli strain, it achieves higher culture growth in combination with analogs such as norvaline (Pezo et al., 2004). Later, Bacher et al. (2005) abolished the editing function of IleRS and found that bacteria grow slowly even with half of their isoleucine residues being miscoded. Taken together, the rare Ile AUA codon seems to be a feasible target for sense codon reassignment experiments. Finally, proteome-wide addition of noncanonical amino acids associated with breaking the degeneracy of the genetic code should be tolerated by correspondingly engineered cells, in an ideal case by setting up long-term cultivation experiments (Lenski et al., 1991). In the Lenski experiments or using automated devices (de Crécy-Lagard et al., 2001), such genetic estrangement through codon reassignment associated with the use of ncAAs as building blocks would be enforced towards propagation and selection of robust and viable cells with built-in alien nutrients.

Rare isoleucine AUA codon decoding in E. coli

With great accuracy, tRNA isoacceptors are aminoacylated with their related amino acid in vivo. In E. coli, 49 tRNAs function to read the sense codons of the messenger RNA, converting them into 20 amino acids of the protein sequences (Budisa, 2004). Tuning of tRNA-mediated translation has been enabled by more than 70 chemically distinct post-transcriptional base modifications in tRNA species (Agris et al., 2007). In E. coli, isoleucine is encoded by three different codons. The two frequently used codons, AUC and AUU, are translated by the high-abundance tRNA species, tRNAIle1, which harbors a G34AU anticodon and pairs with both codons. The rare AUA codon is translated by the low-abundance tRNAIle2, which carries the same anticodon (CAU) as tRNAMet. Together, the tRNAIle1 and tRNAIle2 pair is able to decipher all Ile codons accurately (Muramatsu et al., 1988a, b; Soma et al., 2003; Ikeuchi et al., 2005; Nakanishi et al., 2009). Bacteria decode AUA via a tRNA in which the wobble position C34 is modified with a lysine, resulting in lysidine (L, 2-lysyl-cytidine). The enzyme catalyzing this modification is called lysidine-tRNA synthase (TilS) (Muramatsu et al., 1988a; Soma et al., 2003; Silva et al., 2006). This lysine-containing cytidine derivative occurs in almost all bacteria (Harada & Nishimura, 1974; Muramatsu et al., 1988b), whereas another chemical modification (2-agmatinylcytidine, agm2C) exists in archaea. The latter modification is introduced by agmatidine synthase TiaS (Ikeuchi et al., 2010; Mandal et al., 2010). The minor tRNAIle2 lacking the lysyl/agmatidine group on C34 is aminoacylated by MetRS and exclusively reads the Met codon AUG (Silva et al., 2006). An overview of Ile decoding in E. coli is given in Fig. 2.

Figure 2

Ile decoding in Escherichia coli. Isoleucine is decoded by three different codons. The two frequently used codons AUC and AUU are translated by the high-abundance tRNA species, tRNAIle1, which harbors a GAU anticodon. The rare AUA codon is translated by the low-abundance tRNAIle2 with the same anticodon CAU as tRNAMet. tRNAIle2 is recognized by MetRS when the anticodon at position C34 lacks a lysine modification catalyzed by an enzyme called TilS. It was shown that EctRNAIle2 bearing an LAU anticodon is charged exclusively with Ile by IleRS and recognizes only the AUA codon. According to the chemical structure of lysidine (Fig.3a), conjugation of the lysine moiety at the C2 position of cytidine induces a tautomeric conversion with protonation of the N3 position and imino group formation at the C4 position. By this modification, the proton donor–acceptor pattern of C in hydrogen bonding is completely altered towards the function of U, enabling L to pair with A but not with G (Suzuki & Miyauchi, 2010).

Figure 3

(a) Chemical structures of agmatidine and lysidine. (b) Lysidine converts both the codon and amino acid specificities of tRNAIle. Modified from Suzuki & Miyauchi (2010).

In summary, the L34 modification has a dual role: it causes the mature tRNAIle2 to be aminoacylated by IleRS and converts the codon specificity from AUG to AUA (Muramatsu et al., 1988a, b) (Fig. 3b).
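The dual role of lysidine can be made concrete with a toy decoding model in Python. This is a simplified sketch under stated assumptions, not a description of the ribosome: the pairing table `WOBBLE` covers only the three position-34 nucleotides discussed above (G, C, and lysidine, written L), and the function names are hypothetical.

```python
# Bases that anticodon position 34 (first anticodon base, 5' to 3') can read
# at the third codon position. Simplified rules taken from the text above.
WOBBLE = {
    "G": {"C", "U"},  # G-C Watson-Crick or G-U wobble (tRNAIle1, GAU)
    "C": {"G"},       # unmodified C34 reads only G (tRNAMet, CAU)
    "L": {"A"},       # lysidine pairs with A but not with G (tRNAIle2, LAU)
}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """Codons recognized by an anticodon written 5'->3' (positions 34, 35, 36)."""
    n34, n35, n36 = anticodon
    first = WATSON_CRICK[n36]    # codon position 1 pairs with anticodon position 36
    second = WATSON_CRICK[n35]   # codon position 2 pairs with anticodon position 35
    return sorted(first + second + third for third in WOBBLE[n34])


def decoding_map(tils_active):
    """Codon -> amino acid for the AUN box, with or without lysidine formation."""
    trnas = [("GAU", "Ile"), ("CAU", "Met")]  # tRNAIle1 and tRNAMet
    # With TilS, tRNAIle2 carries L34 and is charged by IleRS; without TilS it
    # keeps C34, is charged by MetRS, and reads only AUG.
    trnas.append(("LAU", "Ile") if tils_active else ("CAU", "Met"))
    mapping = {}
    for anticodon, amino_acid in trnas:
        for codon in codons_read(anticodon):
            mapping[codon] = amino_acid
    return mapping


if __name__ == "__main__":
    print("with TilS:   ", decoding_map(True))
    print("without TilS:", decoding_map(False))
```

In this model, switching off the modification leaves AUA without any decoding tRNA, which is the "open" codon state that sense codon emancipation aims for.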

An essential enzyme: tRNAIle-lysidine synthetase

Soma et al. (2003) identified TilS as the enzyme responsible for the formation of lysidine. It could be shown that partial inactivation of tilS results in an AUA codon-dependent translational defect, which supports the notion that TilS is an RNA modifying enzyme that plays a critical role in the accurate decoding of the genetic code. Complete TilS inactivation is lethal, as the AUA codon becomes an unassigned codon without its cognate decoding tRNA (Nakanishi et al., 2009). Therefore, bacterial tilS was proposed to be one of the 206 essential protein-coding genes required for maintaining bacterial cell life (Gil et al., 2004) and to be essential for reconstructing a minimal synthetic cell (Mushegian & Koonin, 1996; Moya et al., 2009; Gibson et al., 2010). Rescue of the translational defect caused by inactivation of TilS could be performed by a specialized tRNA able to decode the AUA codon. However, remember that codons within the Ile/Met decoding family box (e.g. AUA/AUG) are among the most difficult coding units to be discriminated accurately. Nonetheless, the sequenced genomes of a few-living organisms appear to lack a TilS homolog and (Jaffe et al., 2004; Silva et al., 2006), which suggest that alternative mechanisms of AUA translation exist. Bacterial genomes lacking tilS are found in Bifido bacterium (two species), M. mobile, and Neorickettsia (two species) (Fabret et al., 2011). These species possess a U34-containing tRNAIle () for deciphering AUA codons. Notably, there are bacterial species that contain all three a U34-harboring tRNAIle3, a tilS, and a C34-containing-tRNAIle2 gene pair in their genomes (Lactobacillus, Rhodopirellula, Planctomyces, Ktedonobacter, Cyanothece) (Fabret et al., 2011). In this way, these species can be regarded as evolutionary intermediates between the predominant group of TilS-containing organisms and the rare group of organisms lacking the functional pair of tilS/tiaS and genes. 
In other words, they contain a temporarily ambiguous decoding system – a prediction of the ‘ambiguous intermediate’ theory (Schultz & Yarus, 1994a, b; Jones et al., 2008; Fabret et al., 2011).

Mycoplasma mobile tRNA:IleRS pair is able to decipher AUA codons in E. coli

As mentioned before, M. mobile does not carry the tilS gene but a without any modification. The acquisition of a gene by M. mobile has led to the loss of both the and the tilS gene (Silva et al., 2006). Very recently, Taniguchi et al. (2013) demonstrated that MmIleRS recognizes the UAU anticodon without any modification, whereas E. coli IleRS did not efficiently aminoacylate in vitro. They found Arg865 in MmIleRS to be critical for the specificity to recognize the AUA codon and showed that when the corresponding site in EcIleRS W905 is mutated to Arg, EcIleRS was also able to charge tRNA with a UAU anticodon efficiently. Furthermore, they found that cannot distinguish between AUA and AUG codons in vitro on E. coli ribosomes. This finding suggests that in E. coli, might also read AUG codons as Ile to an extent that could lead to translational defects. As strongly recognizes the AUG codon in the cell, potential misreading might be reduced by competition of with endogenous (Taniguchi et al., 2013).

Escherichia coli ΔtilS strain rescued with MmtRNA:IleRS capable to read rare AUA codons

We have now tested the ability of the M. mobile tRNA:IleRS pair (MmPair) to rescue the lethal tilS deletion in E. coli through direct AUA codon reading. The ability to decode the AUA codon with a heterologous tRNA:IleRS pair – which can later be modified to incorporate an ncAA at AUA positions – is the first step towards AUA sense codon reassignment. To analyze whether the MmPair is able to read the AUA codon in a tilS-depleted E. coli strain, we constructed several rescue vectors. First, MmtRNA and MmIleRS were cloned separately as well as together into plasmid pNB26′2. Furthermore, tilS was cloned into pNB26′1 (Figure S1) as a positive control. Both plasmids carry a p15A origin of replication and a pLac promoter. These four different plasmids were used in tilS deletion experiments designed to restore AUA codon readout. The chromosomal tilS in E. coli MG1655 was deleted following the method of Datsenko & Wanner (2000). Obtained colonies were analyzed by PCR with different confirmation primers (Fig. 4a; oligonucleotide sequences are given in Supporting Information, Table S1). For experimental details, see Data S1.

Figure 4

(a) Confirmation primers used to verify tilS deletion in genomic DNA of Escherichia coli MG1655. Primer combination 1kb_up_tilS_for and tilSin_conf_rev amplifies a part of tilS and 1 kb upstream of genomic E. coli DNA. Combination tilSin_for and tilSin_rev amplifies tilS in genomic or plasmid DNA. Combination 1kb_up_tilS_for and CmR_conf_rev amplifies part of cat and 1 kb upstream of genomic E. coli DNA to verify that cat replaced tilS. (b) PCR verification of ΔtilS::cat genotype in E. coli MG1655. 1) PCR results with primers 1kb_up_tilS_for and CmR_conf_rev. For MmtRNA alone, no replacement of tilS with cat was observed. As a control, E. coli MG1655 WT also showed no PCR product. For MmIleRS, tilS on a plasmid and the combination of MmtRNA with MmIleRS, tilS replacement with cat is observed. 2) PCR results with primers 1kb_up_tilS_for and tilSin_conf_rev. For all variants except MmtRNA together with MmIleRS and tilS on a plasmid, tilS was found not to be deleted. 3) PCR results with primers tilSin_for and tilSin_rev. These results show that only for the MmtRNA/MmIleRS combination, E. coli is able to live without tilS.

For pNB26′1_tilS, genomic tilS was successfully replaced by cat (chloramphenicol acetyltransferase) as shown in Fig. 4b (1 and 2). Colonies obtained for pNB26′2_Mm and for pNB26′2_MmIleRS retained the tilS gene and were most probably spontaneous Cm-resistant colonies (Fig. 4b (2 and 3)). These findings suggest that E. coli IleRS is not able to charge MmtRNA in vivo, which is consistent with the in vitro results of Taniguchi et al. (2013). Regarding the deletion experiment with pNB26′2_MmIleRS, two explanations are plausible. First, MmIleRS might not be able to charge the two EctRNAs. Alternatively, MmIleRS could charge one or both EctRNAs, whereas the aminoacylated tRNA is not able to decipher the AUA codon on the E. coli ribosome. As Taniguchi et al. (2013) could show in vitro aminoacylation of an E. coli tRNA by MmIleRS, the second option is most probably correct. 
Nevertheless, using tRNA or IleRS separately, it was not possible to completely remove tilS from the E. coli genome. In contrast, the combination of and MmIleRS was able to rescue the lethal tilS deletion in E. coli MG1655, rendering the cells viable without the essential enzyme TilS. The generated E. coli MG1655 ΔtilS::cat strain (pNB26′2_MmPair) grows slower than the wild-type equivalent (Fig.5 and Fig. S2). The calculated doubling times are 38 and 28 min, respectively (Fig. S3). The lower growth rate might be due to interference with the AUG codon by or due to low intracellular levels of . Currently, we are investigating this phenomenon in more detail.
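For orientation, the relation between such doubling times and growth rates can be written out in a few lines of Python. This is only a sketch of the standard exponential-growth arithmetic; how the values in Fig. S3 were actually calculated is not stated here, and the optical density readings in the usage example are hypothetical.

```python
import math


def doubling_time(od_start, od_end, minutes):
    """Doubling time (min) from two readings taken during exponential growth."""
    return minutes * math.log(2) / math.log(od_end / od_start)


def specific_growth_rate(doubling_time_min):
    """Specific growth rate (per minute) for a given doubling time."""
    return math.log(2) / doubling_time_min


def relative_growth_rate(td_strain, td_reference):
    """Growth rate of a strain as a fraction of the reference strain's rate."""
    return td_reference / td_strain


if __name__ == "__main__":
    # Hypothetical readings: two doublings (OD 0.1 -> 0.4) in 56 min.
    print("doubling time (min):", doubling_time(0.1, 0.4, 56.0))
    # Doubling times reported in the text: 38 min (rescued strain), 28 min (WT).
    print("rescued strain rate relative to WT:", relative_growth_rate(38.0, 28.0))
```

With the reported doubling times of 38 and 28 min, the rescued strain grows at roughly three quarters of the wild-type rate.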

Figure 5

Comparison of growth curves between E. coli MG1655 WT (black line) and E. coli MG1655 ΔtilS::cat (pNB26'2_MmPair) (red line).

Taken together, our results show that it is possible to liberate the AUA codon from its natural deciphering mechanism in E. coli. Using MmIleRS and MmtRNA, we were able to restore AUA decoding and to rescue the lethal chromosomal tilS deletion (see Fig. 4). Figure 6 represents the suggested rescue mechanism of the MmPair. In the future, we will engineer an orthogonal aaRS to specifically incorporate a 21st ncAA at AUA positions to generate an E. coli strain with a proteome-wide expanded genetic code through sense codon reassignment.

Figure 6

AUA decoding in Escherichia coli ΔtilS::cat (pNB26′2_MmPair) cells. The MmPair (MmIleRS and its cognate tRNA) is able to rescue the lethal chromosomal tilS deletion in E. coli (A). EcIleRS is not able to charge (B) and (C), whereas EcIleRS is capable of charging (D), but this tRNA is not able to read the AUA codon at the ribosome (E). Finally, MmIleRS is able to charge (F), but not (G), as shown elsewhere (Taniguchi et al., 2013).

Outlook: towards genetic code expansion via experimental reassignment of AUA codons

Currently available methodologies for genetic code expansion are well suited for small-scale experiments focused on particular academic questions (e.g. those using fluorescence tags) but are of quite limited use for the design and creation of robust artificial diversity in synthetic cells. Thus, we believe that it is necessary to replace the traditional suppression-based approaches with reassignments of sense (preferably rare) codons. However, we are well aware that rare codons such as AUA are often part of translational fine-tuning systems such as ribosomal pausing (Zhang et al., 2009). Nonetheless, we succeeded in developing a rescue strategy that compensated for the translational defects by importing the AUA decoding system from M. mobile, which consists of MmIleRS and its cognate tRNA. In the next step, we will use an engineered isoleucyl-tRNA synthetase (IleRS) from an evolutionarily distant organism to reassign the AUA codon to an ncAA. Just like related enzymes for branched-chain amino acids (Thr, Val, Leu), this IleRS exhibits a hydrolytic editing activity that is capable of removing noncognate amino acids. In conjunction with the substrate recognition and activation by the active site architecture, this leads to the ‘double sieve’ concept (Fersht, 1977; Fersht & Dingwall, 1979). The availability of high-resolution X-ray crystal structures of different IleRSs (in apo-forms or in tRNA/ATP/amino acid complexes) is thus of enormous importance for the identification of residues crucial for amino acid recognition as well as for aaRS:tRNA interactions. The basic enzyme engineering strategy will rely on well-established mutagenesis approaches to redesign the binding pocket and to attenuate the aaRS editing activity. For example, Mursinna & Martinis (2002) reported a rational design that blocked the amino acid editing reaction in EcLeuRS.
In particular, a single Thr at position 252 was found to play a critical role, with its mutation to bulky amino acids like Phe or Tyr resulting in the loss of the editing function (Mursinna & Martinis, 2002). Tang & Tirrell (2002) identified three LeuRS mutants (Thr252Tyr, Thr252Leu, Thr252Phe) with attenuated editing activities that allow successful translation of norvaline, allylglycine, homoallylglycine, homopropargylglycine, and 2-butynylglycine at Leu positions of a model protein. This approach should be applicable to all ‘double sieve’ aaRSs for branched-chain amino acids such as ThrRS, ValRS, LeuRS, and IleRS. By a variety of mechanisms, long-term cultivation of bacteria can lead to increased mutation rates, the so-called adaptive mutation phenomenon (Marlière et al., 2011). Under such conditions, these organisms need to explore both a larger genetic space and a larger protein folding space to accommodate the desired ncAAs in their proteome and intermediary metabolism. Thus, we anticipate long-term cultivation experiments using a special turbidostat system as a promising tool for creating codon-emancipated cells (de Crécy-Lagard et al., 2001). Via a controllable and defined selective pressure, codon removal will be actively enforced in microbial cells, allowing the reassignment of AUA codons to suitable Ile analogs. Subjected to successive selection cycles in continuous culture, E. coli strains with ambiguous reading of this codon will prevail. Eventually, sufficiently stable variants will be perpetuated. We are fully aware that the reassignment of AUA codons to ncAAs potentially causes detrimental effects on cell survival due to their partial localization in essential genes. However, the same argument also applies to the elimination or reassignment of amber stop codons (UAG) in the genome of E. coli (321 ORFs).
Nonetheless, the groups of Church, Sakamoto, and Yokoyama have shown very recently that the latter codon can be either fully replaced via oligonucleotide-mediated MAGE (multiplex automated genome engineering) technology or reassigned by a simple knockout of release factor 1 (RF1), compensated by the presence of a suppressor tRNA capable of decoding the UAG codon (Mukai et al., 2010; Isaacs et al., 2011; Johnson et al., 2011). Together, the replacement of all UAG codons with UAA codons in E. coli and the deletion of RF1 allowed for the reassignment of the UAG translation function (Lajoie et al., 2013b). Recently, George Church and coworkers analyzed the possibility of replacing degenerate rare codons in essential genes with a synonymous codon. They selected 42 essential genes and removed all instances of 13 rare codons from these genes, indicating that genome-wide removal of rare codons is feasible (Lajoie et al., 2013a). In light of these results, our strategy of proteome-wide AUA codon reassignment appears feasible.
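The codon-level operations discussed above (counting in-frame occurrences of a target codon such as AUA, and replacing a codon with a synonymous one, e.g. UAG with UAA) can be illustrated with a short sketch. The function names and the toy sequences are assumptions made purely for illustration; this is not the MAGE procedure itself nor the analysis pipeline of any cited study.

```python
def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons (RNA input is converted to DNA letters)."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_codon(cds: str, codon: str) -> int:
    """Count in-frame occurrences of a codon; out-of-frame matches are ignored."""
    target = codon.upper().replace("U", "T")
    return split_codons(cds).count(target)


def recode(cds: str, old: str, new: str) -> str:
    """Replace every in-frame occurrence of codon `old` with codon `new`."""
    old_dna = old.upper().replace("U", "T")
    new_dna = new.upper().replace("U", "T")
    return "".join(new_dna if c == old_dna else c for c in split_codons(cds))


toy_gene = "AUGAUAAAAUAG"  # hypothetical ORF: Met-Ile(AUA)-Lys-amber stop
print(count_codon(toy_gene, "AUA"))
print(recode(toy_gene, "UAG", "UAA"))
```

Working codon by codon rather than with a plain substring search avoids counting or altering triplets that merely span two adjacent codons, which would change the encoded protein.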
