Yujia Huang1, Tao Liu1. 1. State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing, 100191, China.
Abstract
In nature, a limited, conserved set of amino acids is used to synthesize proteins. The genetic code expansion technique reassigns codons and incorporates noncanonical amino acids (ncAAs) through orthogonal aminoacyl-tRNA synthetase (aaRS)/tRNA pairs. The past decade has witnessed rapid growth in the diversity and scope of therapeutic applications of this technology. Here, we provide an update on recent progress using genetic code expansion in the following areas: antibody-drug conjugates (ADCs), bispecific antibodies (BsAb), immunotherapies, long-lasting protein therapeutics, biosynthesized peptides, engineered viruses and cells, as well as other therapeutic-related applications in which the technique was used to elucidate the mechanisms of biotherapeutics and drug targets.
While DNA and RNA are the two media in which organisms on Earth store hereditary information, it is protein that constitutes most of a cell's dry mass and carries out most of the cell's functions. Natural proteins are synthesized from 20 canonical amino acids. The different chemical properties of these building blocks equip proteins with diverse structures and functions but, at the same time, restrict the range of properties that proteins can attain. Post-translational modifications (PTMs) such as methylation, phosphorylation and glycosylation, as well as the binding of cofactors, are strategies that natural proteins use to overcome these limitations to a certain extent. To unleash the full power of amino acid diversity and meet the challenges of protein-related applications, scientists have devised a variety of ways to incorporate noncanonical amino acids (ncAAs) into proteins over the past few decades, greatly expanding protein properties.
Genetic code expansion is one of these techniques; it allows site-specific incorporation of ncAAs into any protein of interest in cells. This is achieved with an engineered aminoacyl-tRNA synthetase/tRNA (aaRS/tRNA) pair in which the aaRS specifically aminoacylates its cognate tRNA with a given ncAA and the tRNA is dedicated to decoding a reassigned codon, typically a nonsense codon (UAG/UGA/UAA). The suppressor tRNA charged with the ncAA is then recognized by the ribosome, allowing the ncAA to be site-specifically incorporated into the nascent peptide during translation in response to the assigned codon. This system must be orthogonal to the native translation system, including endogenous tRNAs, canonical amino acids and endogenous aaRSs (Fig. 1(a)).
Fig. 1
(a) Principles of genetic code expansion technique. aaRS: aminoacyl-tRNA synthetase. (b) Applications of ncAAs in studying protein structures and functions.
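As a rough illustration of the decoding principle in Fig. 1(a), the sketch below translates an mRNA with the standard genetic code and, optionally, reads the amber codon UAG as an ncAA instead of a stop signal. The function name `translate`, the placeholder letter `X` for the ncAA and the example transcript are hypothetical; the model deliberately ignores suppression efficiency and competition with release factors.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna, reassigned=None):
    """Translate an mRNA starting from its first codon.

    `reassigned` maps codons to a one-letter symbol for the ncAA they encode,
    e.g. {"UAG": "X"}, mimicking an orthogonal aaRS/tRNA pair that decodes
    the amber codon. Translation ends at the first stop codon that has not
    been reassigned.
    """
    reassigned = reassigned or {}
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        codon = mrna[i:i + 3]
        if codon in reassigned:
            protein.append(reassigned[codon])   # suppressor tRNA inserts the ncAA
        elif CODON_TABLE[codon] == "*":
            break                               # ordinary termination
        else:
            protein.append(CODON_TABLE[codon])
    return "".join(protein)


# Hypothetical transcript: Met-Ala-(UAG)-Lys-stop
mrna = "AUGGCUUAGAAAUAA"
native = translate(mrna)                  # UAG read as stop
expanded = translate(mrna, {"UAG": "X"})  # UAG read as the ncAA
print(native, expanded)  # MA MAXK
```

Without the orthogonal pair the amber codon terminates translation and a truncated product results; with it, the full-length protein carries the ncAA at exactly the position of the UAG codon.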
Several aaRS/tRNA pairs have been developed to incorporate more than 200 different ncAAs in different species. For recombinant expression of therapeutic proteins in E. coli, the most widely used aaRS/tRNA pairs are the archaeal TyrRS/tRNATyr pair from Methanocaldococcus jannaschii (MjTyrRS/tRNATyr) [1–4] and the archaeal pyrrolysyl-tRNA synthetase/tRNAPyl pair (Mb/MmPylRS/tRNAPyl) [5–9]. The MjTyrRS/tRNATyr pair was the first genetic code expansion system developed for site-specific ncAA incorporation in living cells, in 2000 [1], and is now widely used in prokaryotic genetic code expansion. This system is highly efficient at incorporating aromatic ncAAs with a wide range of novel chemical, physical and biological properties. While the MjTyrRS/tRNATyr pair is commonly used in E. coli, it is not orthogonal in higher organisms. Unlike other systems, the Mb/MmPylRS/tRNAPyl pair was a gift from nature. This naturally occurring orthogonal system was first described in 2002; it allows the genetic incorporation of a natural ncAA, pyrrolysine, subsequently referred to as the 22nd amino acid [10,11], site-specifically into the proteome of living cells in response to the amber stop codon. The pyrrolysine system has shown excellent orthogonality in both prokaryotic and eukaryotic cells. Such orthogonality enables PylRS/tRNAPyl pairs to be evolved in E. coli and the evolved PylRSs to be applied directly in mammalian cells. In addition, the evolved PylRS mutants can incorporate not only pyrrolysine analogs but also ncAAs with diverse side-chain structures [12].
The systems most widely used in eukaryotes today are the E. coli TyrRS/tRNATyr (EcTyrRS/tRNATyr) [13–17], E. coli LeuRS/tRNALeu (EcLeuRS/tRNALeu) [16,18,19] and Mb/MmPylRS/tRNAPyl pairs [17,20–28]. The incorporation of ncAAs in Saccharomyces cerevisiae was first achieved with the E. coli TyrRS/tRNATyr pair by Chin, J. W. et al. [14]. The E. coli TyrRS has also been paired with the Bacillus stearothermophilus tRNATyr [13,29] to achieve more efficient suppression. These pairs successfully introduced phenylalanine derivatives bearing benzophenone [15], ketone [30], iodide [30] and azide [30] groups. The E. coli LeuRS/tRNALeu pair was first developed in 2004 [18]. Because of its tolerance for structural diversity, this pair can incorporate a variety of structures through multi-site mutations. The Mb/MmPylRS/tRNAPyl pair has also been extensively applied in mammalian cells, as described above.
Other platforms such as TrpRS/tRNATrp [7,31–34], PheRS/tRNAPhe [35,36], LysRS/tRNALys [37] and SepRS/tRNACys [38,39] have also been developed and are reviewed elsewhere [40]. The orthogonality requirement of these pairs carries the disadvantage that two distinct platforms must be developed to apply a single ncAA in both bacterial and eukaryotic cells. Italia, J. S. et al. recently reported a novel strategy to expand the genetic code of E. coli by replacing its endogenous pair with a eukaryotic/archaeal counterpart, providing a general way to develop new aaRS/tRNA pairs for both E. coli and eukaryotes [34,41].
Genetic code expansion has been commonly used to study protein structures and functions both in vitro and in vivo (Fig. 1(b)). Chemically cross-linkable ncAAs were used to capture proximity-based protein-protein interactions in live cells [42,43]. Fluorescent ncAAs [44] or fluorophores that react with ncAAs through bioorthogonal ligation reactions [45,46] were used in live-cell imaging studies.
Through the incorporation of photo-switchable [47,48] or chemically switchable ncAAs [49], a new field of controllable protein function has also been opened. In addition, expanding the genetic code has shown great advantages in the study of PTMs such as serine phosphorylation [39,50], threonine phosphorylation [51], tyrosine phosphorylation [52], lysine acetylation [6], lysine crotonylation [53] and lysine hydroxyisobutyrylation [54].
The site-specific modification afforded by genetic code expansion makes it a powerful tool for improving the pharmacological properties of protein therapeutics as well as for developing novel therapeutic methods. In this review, we focus on the therapeutic applications of genetic code expansion in recent years.
Antibody-drug conjugate (ADC)
Antibody-drug conjugates (ADCs) are a class of novel biopharmaceutical drugs with a significantly enhanced therapeutic window against cancer. Four ADCs have been approved by the FDA and EMA for clinical use [55]. The conventional method of ADC production relies on nonselective coupling to lysine or cysteine residues, which yields heterogeneous products and compromises ADC potency, circulation half-life, stability, tolerability and antigen binding [56]. In 2010, the first approved ADC, gemtuzumab ozogamicin (Mylotarg™), was voluntarily withdrawn from the market due to toxicity and a lack of efficacy [57]. To avoid the problem of heterogeneity, different approaches have been explored to engineer ADCs. Reactive cysteine residues were introduced into the antibody Fab to give THIOMABs (antibodies with engineered reactive cysteine residues), which exhibit an improved therapeutic window [58]. However, the production of THIOMABs may generate side products such as triple light chain antibodies (3LC) [59]. The inherent instability of the cysteine-maleimide linkage and the multistep process have also precluded its clinical application. Other site-specific methods that rely on enzymatic modification, using enzymes such as β1,4-galactosyltransferase (Gal T), α2,6-sialyltransferase (Sial T) [60], transglutaminase (TG) [61], formylglycine-generating enzyme (FGE) or anaerobic sulfatase-maturating enzyme (anSME) [62], have also been shown to introduce specific residues into antibodies. Nevertheless, the positions that can be targeted on antibodies are limited. The incorporation of ncAAs using genetic expression systems such as auxotrophic strains or selenocysteine has provided new insights [63,64].
Genetic code expansion has the advantage of producing homogeneous ADCs by incorporating, at one or more specific sites, ncAAs bearing different moieties such as ketone, azide, alkyne and alkene groups [57,65].
To produce antibodies with excellent potency, pharmacokinetics and safety, p-acetylphenylalanine (pAcF) was genetically incorporated at two sites in an anti-Her2 antibody Fab fragment and coupled with the microtubule toxin auristatin F (AF) (Fig. 2) [66]. The same strategy was adopted to synthesize a chemically defined CXCR4-auristatin ADC and anti-CD11a IgG conjugated with liver X receptor (LXR) agonists or phosphodiesterase 4 (PDE4) inhibitors [67–69]. These ADCs not only showed significant effects in in vivo disease models but also reduced unwanted side effects owing to improved drug specificity. To produce more complex proteins such as full-length antibodies, Tian, F. et al. used CHO cells to develop a stable expression system and raised the titer of ncAA-containing antibodies to over 1 g/L [70]. Genetic code expansion has also been used in radioimmunotherapy. Through incorporation of an azido-bearing amino acid, Nε-2-azidoethyloxycarbonyl-L-lysine (NAEK), a newly synthesized bifunctional linker, DIBO-DOTA, was site-specifically conjugated to the anti-CD20 antibody rituximab, followed by radiolabeling (Fig. 2) [71]. To improve conjugation chemistry, Koehler, C. et al. used the MultiBac system in insect cells to introduce large, hydrophobic, carbocyclic reactive ncAAs into antibodies [27]. Later, Oller-Salvia, B. et al. engineered mammalian expression systems that efficiently incorporated a cyclopropene derivative of lysine (CypK) into trastuzumab through genetic code expansion [72]. In addition, this technique has been applied to immunosuppressive drugs, as exemplified by the site-specific conjugation of dasatinib to the humanized antibody HLCX, which selectively delivered this Lck inhibitor to human T lymphocytes [73].
A bispecific antibody is a bioengineered antibody able to recognize two different epitopes simultaneously. In 2009, catumaxomab (Removab) was approved in the European Union for the treatment of malignant ascites [74]. The FDA granted accelerated approval of blinatumomab (Blincyto) in December 2014 for the treatment of Philadelphia chromosome-negative relapsed or refractory precursor B-cell acute lymphoblastic leukemia (R/R ALL) [75]. However, the development of bispecific antibodies is hampered by the difficulty of producing correctly assembled structures (i.e., correct assembly of heavy and light chains). One way to produce bispecific antibodies is chemical conjugation through lysine or cysteine residues within the antibody, which often yields heterogeneous products unsuitable for therapeutic applications.
By site-specifically encoding pAcF in anti-Her2 and anti-CD3 Fabs and subsequently coupling the Fabs to bifunctional ethylene glycol linkers bearing an alkoxy-amine at one end and an azide or cyclooctyne group at the other, an anti-Her2/anti-CD3 bispecific antibody was synthesized by copper-free click chemistry; it effectively crosslinked HER2+ cells and CD3+ cells (Fig. 3) [76,77]. A similar strategy has also been applied to the synthesis of a novel bispecific antibody, αCLL1-αCD3, which completely eliminated established tumors in a cell line xenograft model, showing potential as a treatment for acute myeloid leukemia (AML) [78]. A general and straightforward approach based on Watson-Crick base pairing has been demonstrated for producing homodimeric, heterodimeric and multimeric Fabs with defined composition, valency and geometry. In this platform, a chemically reactive ncAA was used to site-specifically couple oligonucleotides or peptide nucleic acids (PNAs) to antibodies, and the resulting assemblies showed enhanced anti-tumor activity [79].
Different types of chemical reactions have been continually explored to develop novel bispecific antibodies. In 2015, Ramadoss, N. S. et al. used the reaction between tetrazine (TET) and bicyclononyne (BCN) to develop anti-BCMA and anti-CS1 bispecific antibodies (Fig. 2). The resulting BiFab-BCMA robustly activated T cells and mediated rapid tumor regression [80].
Fig. 3
Antibody-related applications of an expanded genetic code. scFv: single-chain variable fragment; ncAA: noncanonical amino acid; FITC: fluorescein isothiocyanate; Fab: antigen-binding fragment; TCR: T cell receptor; CAR: chimeric antigen receptor.
Apart from bispecific antibodies, bispecific targeting therapeutics have also been developed that incorporate the small molecule 2-[3-(1,3-dicarboxypropyl)-ureido]pentanedioic acid (DUPA), which selectively targets a tumor-associated antigen, prostate-specific membrane antigen (PSMA) (Fig. 2). With DUPA conjugated to pAcF site-specifically incorporated into an anti-CD3 Fab, the EC50 of the conjugate was reduced to 100 pM and the serum half-life was extended to 5–6 h, resulting in effective treatment in prostate cancer xenograft mouse models [81]. The same strategy was also employed to synthesize an anti-CD3 Fab-folate conjugate [82] and a switchable αGCN4-Fab conjugate [83].
Chimeric antigen receptor T (CAR-T) cell immunotherapy
Chimeric antigen receptor T (CAR-T) cell therapy is a promising immunotherapy for cancer treatment. CARs enable engineered T cells derived from patients to recognize cancer cells by equipping them with an anti-tumor single-chain variable fragment (scFv), removing the requirement for MHC expression. In August 2017, a CAR-T therapy targeting CD19 was approved by the FDA for the treatment of refractory pre-B cell acute lymphoblastic leukemia and diffuse large B cell lymphoma [84]. Conventional CAR-T cell therapy faces a number of challenges. One of the most important obstacles is severe cytokine release syndrome (CRS); hence, it is important to be able to control T cell activity when necessary. In 2012, Tamada, K. et al. developed an anti-tag CAR technology that was adaptable and capable of regulating CAR T-cell functions. T cells were genetically engineered to express a CAR that recognizes fluorescein isothiocyanate (FITC), and various FITC-conjugated antibodies served as intermediaries [85]. However, the antibodies in this study were conjugated with FITC using nonspecific labeling methods, with an average of 3 FITC molecules per antibody molecule. In vivo anti-FITC immune responses induced by the FITC-labelled antibodies might shorten the half-life of these intermediary molecules and thus weaken the curative effect. A site-specifically modified “switch” molecule is therefore needed to minimize the potential for immunogenicity. Ma, J. S. Y. et al. and Cao, Y. et al. later developed switchable CAR-T (sCAR-T) systems based on genetic code expansion, in which anti-CD19, anti-CD22 and anti-Her2 antibody fragments were site-specifically modified with FITC through para-azidophenylalanine (pAzF) and mediated distinct spatial interactions between sCAR-T cells and cancer cells (Fig. 3) [86,87].
The strict switch dose-dependence of this sCAR-T system made CAR-T cell activity both potent and controllable, thus improving efficacy and safety. Furthermore, this system can target multiple antigens without further genetic engineering of a patient's T cells, making it a versatile immunotherapy.
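To see why an average of 3 FITC molecules per antibody implies a heterogeneous mixture, the sketch below assumes, purely for illustration, that the number of labels per antibody under nonspecific conjugation follows a Poisson distribution with mean 3. The Poisson model and the name `poisson_pmf` are assumptions and are not taken from the cited study; real label distributions depend on the number and reactivity of accessible residues.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k labels under a Poisson model with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


MEAN_LABELS = 3.0  # average label number reported for the nonspecific conjugate

# Illustrative distribution of label counts (0 to 8 labels per antibody).
distribution = {k: poisson_pmf(k, MEAN_LABELS) for k in range(9)}
for k, p in distribution.items():
    print(f"{k} labels: {p:.1%}")
```

Under this assumed model only about a fifth of the molecules carry exactly three labels, and the remainder are spread over other label counts, whereas a switch molecule modified at a single encoded pAzF site has one defined stoichiometry.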
Increasing therapeutic agent serum half-lives
PEGylation is a well-established technique for improving the pharmacokinetics and biocompatibility of therapeutic agents with short in vivo half-lives. The first nonspecifically multi-PEGylated drug, Adagen, was approved by the FDA in 1990 [88]. PEGylation is usually achieved through chemical conjugation to reactive side chains [89]. The resulting heterogeneous mixture causes difficulties in batch-to-batch reproducibility and quality control, making site-specific PEGylation attractive.
Despite the efficacy of human growth hormone (hGH) therapy for pathological short stature, it requires daily subcutaneous injection because of the short circulating half-life of hGH [90]. A homogeneous hGH variant, obtained by site-specific conjugation of PEG to pAcF-containing hGH, was shown to be as effective and safe as native hGH while reducing the injection frequency to once a week [90]. In addition, site-specific multi-PEGylation of hGH was achieved by Wu, L. et al., which further reduced immunogenicity and improved the pharmacokinetic profile [91]. Apart from hGH, site-specific PEGylation using genetic code expansion has also been extended to the optimization of IFN-α2b [92], fibroblast growth factor 21 (FGF21) [93] and AAV vectors [94]. A similar strategy was also developed to achieve site-specific POxylation of interleukin-4 (IL-4) [95].
In addition to enabling site-selective conjugation, ncAAs themselves can enhance protein stability through the formation of more stable bonds. ncAAs containing long side-chain thiols were incorporated into β-lactamase, forming an unnatural disulfide bond and conferring significantly enhanced thermostability both in vitro and in vivo [96]. Genetic incorporation of p-fluorophenylalanine (pFF) [97] or para-isothiocyanatephenylalanine (pNCSF) [98] was also capable of enhancing protein stability at elevated temperature.
Biosynthesized therapeutic peptides
Genetic code expansion was used in combination with split intein catalyzed ligation of proteins and peptides (SICLOPPS) to construct a large cyclic peptide library, which was subsequently harnessed to evolve an HIV protease inhibitor [99]. More recently, co-translational, site-specific incorporation of ncAAs has been applied to the biosynthesis of ribosomally synthesized and post-translationally modified peptide (RiPP) natural products to expand their chemical and structural repertoire. For example, incorporation of ncAAs at discrete positions in nisin expanded the set of building blocks that can be biosynthetically incorporated into lanthipeptides and diversified the macrocyclic topologies of this antibacterial peptide [100]. Likewise, ncAAs have been successfully introduced into other lanthipeptides [101,102], lasso peptides [103,104], macrocyclic peptides [105], thiopeptides [106] and, recently, the natural product cinnamycin [107].
Site-specifically modified viruses
Adeno-associated virus (AAV) is a small non-enveloped human virus that has been established as a viral vector for gene therapy owing to its attractive features: lack of apparent pathogenicity, replication deficiency, persistent transgene expression and the capability to infect both dividing and nondividing cells [108]. Its icosahedral capsid is made up of three capsid proteins, VP1, VP2 and VP3 [109]. However, serum instability, immunogenicity and nonspecific cell infectivity remain major obstacles to effective gene transfer. Although reengineering of capsid proteins [110,111], fusion of high-affinity ligands to capsid proteins [112] and chemical conjugation of small-molecule effector functionality [113] have been applied to overcome these challenges, few methods can produce homogeneous vectors for specific targets.
To alleviate immunogenicity and instability, polyethylene glycol (PEG) was covalently and site-selectively conjugated to AAV2 through genetic code expansion (Fig. 4) [94]. In this study, NAEK was introduced at a series of chosen sites on the capsid protein VP1. PEGylation at these sites had a protective effect and improved stability. Selective tropism towards tumor cells overexpressing αvβ3 integrin receptors was enhanced through precise, bioorthogonal and stoichiometric attachment of a cyclic RGD peptide to AAV2, yielding chimeric viruses that serve as simple and viable targeted delivery vehicles (Fig. 4) [114,115]. This method greatly expanded the versatility of AAV vectors because it could easily be applied to different ligands and to other types of viral vectors for gene therapy. Recently, Erickson, S. B. et al. combined AAV with photocaged derivatives of lysine to precisely perturb virus-host interactions with high spatial and temporal control in vivo [116].
A photocaged amino acid, NBK, was site-specifically incorporated at a functionally important site on the virus capsid protein VP1, abrogating the positive charge of the lysine residue; upon irradiation with 365 nm light, it underwent efficient decaging to non-toxic products (Fig. 4). This methodology makes it possible to synchronize the invading viruses at a specific stage. Fluorescent molecules have also recently been attached to AAV2 for visualization and tracking of virus vectors [72].
Fig. 4
Applications of genetic code expansion related to viral vectors. aaRS: aminoacyl-tRNA synthetase; ncAA: noncanonical amino acid; AAV: adeno-associated virus; cRGD: cyclic Arg-Gly-Asp; PEG: polyethylene glycols.
Therapeutic vaccines
Protein-based vaccines
Immunization of patients with self-proteins or specific epitopes such as tumor antigens is difficult because of weak immunogenicity and immunotolerance. A number of strategies, including adjuvants, the introduction of foreign immunodominant T-helper (TH) epitopes into chimeric antigens [117,118], chemical derivatization of self-antigens [119] and DNA vaccines [120], have been explored to break immunological self-tolerance. Autologous tumor cells modified with a hapten, dinitrophenyl (DNP), were shown to elicit immune responses in patients that cross-react with unmodified, native tumor-cell antigens, likely due to the formation of strong stacking and van der Waals interactions [121]. Genetic code expansion therefore provides a way to introduce immunogenic ncAAs into autologous proteins and induce immune responses to these self-tolerated epitopes. Site-specific incorporation of p-nitrophenylalanine (pNO2Phe) into murine tumor necrosis factor-α (mTNF-α) successfully generated a sustained high-titer antibody response to both parental and antigenically distinct mTNF-α variants without the need for strong adjuvants [122]. Another DNP-containing ncAA, N-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK), was later incorporated into proteins in HEK 293T cells, although it was unstable in Escherichia coli [123]. Sulfotyrosine (SO3Tyr) and 3-nitrotyrosine (3NO2Tyr), two PTM-related ncAAs, were introduced at specific sites in mTNF-α and EGF, indicating that PTMs might play a role in autoimmune disorders [124]. This methodology could also be applied to a self-protein unrelated to immune function, murine retinol-binding protein 4 (RBP4) [125], and might equally be applied to other weakly immunogenic antigens.
Organism-based vaccines
Although traditional live-attenuated vaccines (LAVs) are potent, their potential pathogenic consequences have raised safety concerns [126]. Genetic code expansion was first coupled with the engineering of intact, infectious viruses using hepatitis D virus (HDV) as a model system [127]. The resultant HDV virions maintained near wild-type viability and infectivity. This strategy was then extended to influenza A virus and human immunodeficiency virus type 1 (HIV-1) for the development of LAVs and could potentially be adapted to almost any viral vaccine. A live but replication-incompetent influenza A virus vaccine was generated in a special transgenic cell line harboring MbPylRS/tRNAPylCUA, eliciting robust immunity against both wild-type and ncAA-containing strains (Fig. 5(a)) [128]. By introducing blank codons into essential HIV-1 proteins within the genome, all-in-one live-attenuated HIV-1 mutants were constructed whose viability was precisely controlled by ncAAs [129,130]. This live-attenuated HIV-1 vaccine constructed through genetic code expansion marked an important step towards combating the worldwide AIDS pandemic.
Fig. 5
(a) Principles of live-attenuated vaccines (LAVs) using genetic code expansion. ncAA: noncanonical amino acid. (b) Principles of genetically modified organisms (GMOs).
Genetic code expansion has shown enormous potential for developing organisms that strictly depend on ncAAs for growth. After ncAA incorporation was achieved in the intracellular pathogen Mycobacterium tuberculosis (Mtb), which facilitated studies of tuberculosis (TB) vaccine development [131], an increasing number of genetically modified organisms (GMOs) were developed. Unlike viruses, cellular organisms under evolutionary pressure can escape containment through mutagenic drift, environmental supplementation and horizontal gene transfer (HGT), resulting in high escape frequencies [132]. Computational redesign of essential enzymes and the combination of multiple redesigned enzymes resulted in a remarkably reduced escape rate, alleviating this problem to some degree [132]. Using multiplex automated genome engineering, TAG codons were introduced into permissive locations of 22 essential genes in an Escherichia coli strain that lacked all TAG codons and release factor 1, creating a notable strain with robust growth and undetectable escape frequencies (Fig. 5(b)) [133]. To addict the modified organisms to ncAAs, a selection method was developed using TEM-1 β-lactamase as a model system [134].
Inspired by PTM, scientists developed a general strategy to create genetically modified organisms with stringent dependency on N-ε-acetyl-L-Lys (AcK): AcK is incorporated in place of a functionally essential lysine residue required for cell growth, and function is rescued by endogenous deacetylases. The combination of this approach with a barnase-based suicide switch ultimately gave a low escape frequency [135]. Other ncAAs with specific chemical properties have also been used for biological containment.
The hydrophobic amino acid p-benzoyl-L-phenylalanine (pBzF) was introduced into a homodimeric chorismate mutase (CM) to disrupt the protein-protein interface, because its large nonpolar side chain cannot, in theory, be replaced by any of the 20 canonical amino acids. By screening the residues surrounding the ncAA, CM was evolved into an enzyme that remains active only in the presence of pBzF, yielding an Escherichia coli strain dependent on the ncAA [136]. However, this deficiency might be complemented by aromatic amino acids in the host. Subsequently, replacing key histidine (His) residues in essential metalloproteins with 3-methyl-L-histidine (MeH) provided a straightforward strategy to solve this problem [137]. Because two single-nucleotide mutations are required to convert the UAG codon into a native His codon (CAU or CAC), this strategy showed an extremely low reversion rate and could feasibly be applied to ncAA-conditional live vaccines.
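The codon arithmetic behind this reversion argument can be checked with a short sketch. It builds the standard genetic code and counts the minimum number of single-nucleotide changes needed to turn UAG into a codon for a given amino acid. The helper names `hamming` and `min_mutations` are hypothetical, and the calculation considers point substitutions only, not insertions, deletions or suppressor mutations elsewhere in the genome.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def min_mutations(codon, amino_acid):
    """Fewest point substitutions turning `codon` into any codon for `amino_acid`."""
    return min(hamming(codon, c) for c, aa in CODON_TABLE.items() if aa == amino_acid)


his_distance = min_mutations("UAG", "H")
# Amino acids reachable from UAG by a single substitution.
one_step = sorted({aa for c, aa in CODON_TABLE.items()
                   if hamming("UAG", c) == 1 and aa != "*"})
print(his_distance)  # 2
print(one_step)      # ['E', 'K', 'L', 'Q', 'S', 'W', 'Y']
```

Histidine is absent from the set of amino acids one substitution away from UAG, so a revertant that restores His at the recoded position needs two simultaneous point mutations, consistent with the low reversion rate described above.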
Therapeutic-related applications
Genetic code expansion has also been applied to elucidate the mechanisms of drug targets and biotherapeutics. For example, it has been used to explore interactions between drug targets such as G protein-coupled receptors (GPCRs) and their binding partners (antibodies or ligands). In 2008, the ncAAs pAcF and pBzF, which carry reactive keto groups, were incorporated at different sites into two GPCRs, CCR5 and rhodopsin [29]. Later, this strategy was applied to map antibody epitopes on another GPCR, CXC chemokine receptor 4 (CXCR4) [138]. A similar strategy was also applied to map the binding path of the neuropeptide urocortin-I (Ucn1) on the corticotropin releasing factor receptor type 1 (CRF1R) [139], the binding sites of antidepressant drugs on the human serotonin transporter (hSERT) [140] and the interactions between the exendin-4 peptide and the glucagon-like peptide-1 receptor (GLP-1R) [141], which may contribute to structure-based drug design.
Apart from these applications, it is worth mentioning that the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system has also been combined with a genetically installed photocaged lysine (PCK), achieving optochemically controllable gene editing [142]. Although the CRISPR system has not yet been applied in the clinic, it has shown great potential for future gene therapy and cell therapy. We think that genetic code expansion technology has not yet shown its full potential in the gene editing field.
Challenges and prospects
Since the first ncAA, O-methyl-L-tyrosine, was incorporated into proteins in Escherichia coli, expansion of the genetic code has allowed over 200 ncAAs to be site-specifically incorporated into proteins in bacteria, yeast [14,16,18,20,142–147], plants [23], worms [21,148,149], insects [22,25,27,150], fish [151] and mammals [13,15,17,19,24–26,28,31,34,144,151–153], including stem cells [154,155], thus affording novel avenues for biotherapeutic applications both in vitro and in vivo. Drug candidates produced with genetic code expansion such as the αHER2 ADC (Ambrx, ARX788) and PEG-FGF21 (Ambrx, ARX618) are in phase 1 and phase 2 clinical trials, respectively.
Although the ability to make precise and highly tailored structural perturbations to proteins provides abundant advantages, the expanded genetic code is far from reaching its full potential. While ncAAs have been incorporated at high levels in E. coli and several cell lines (e.g. HEK293T cells), one of the major challenges of this technique is the low suppression efficiency in more challenging cell lines and species owing to competition with the endogenous termination system. This has been partly solved by engineering components of the expression system such as eRF1 [156–158], the elongation factor Tu (EF-Tu) [159], the ribosome [160] and the nonsense-mediated mRNA decay (NMD) pathway [16,28], by codon reassignment [161,162] or by improving expression vectors [163]. Up to 30 ncAA residues can now be incorporated with high yield and accuracy into one recombinant protein in E. coli [4]. Optimization of tRNA structure has also been applied in bacteria; however, the evolution of tRNA sequences in mammalian cells is still challenging [164]. There is also a concern that suppressor tRNAs could impair cell fitness through uncontrolled proteome-wide ncAA incorporation. Quadruplet codons offer an alternative route to avoiding this problem [3,37].
Another issue is the limited range of ncAA core structural motifs, owing to the limited number of orthogonal pairs. Many natural PTMs still cannot be encoded in either natural or mimic forms. This can only be resolved by exploiting novel aaRS/tRNA systems and higher-throughput screening methods. Recently, phage-assisted continuous evolution (PACE) [165] and phage-assisted noncontinuous evolution (PANCE) [166] have been developed, achieving rapid production of orthogonal aaRS variants with high activity and amino acid specificity. Although PylRS does not recognize the tRNAPyl anticodon, the simultaneous incorporation of different ncAAs is largely limited by the facts that many aaRSs use the anticodon as an identity element [167] and that tRNAPylUCA and tRNAPylUUA display much lower suppression efficiency than tRNAPylCUA [168]. Willis, J. C. et al. recently created new PylRS/tRNAPyl pairs that are orthogonal to MmPylRS/tRNAPyl pairs [169].
Future directions, in terms of biotherapeutic applications, include genetically encoding more building blocks with diverse chemical, physical and biological characteristics, enhancing suppression efficiency in therapeutically relevant species, and applying this system to relevant therapeutic agents including proteins, cells and even organisms. Merging genetic code expansion with other disciplines such as synthetic biology may help make it a low-cost and widely available technique. Currently, genetic code expansion technology allows a maximum of two distinct ncAAs to be site-specifically incorporated into one protein [9,24,170,171]. Recent advances in cell engineering [132,133], codon reassignment [172] and expansion of the genetic alphabet [173,174] would theoretically allow the creation of organisms with multiple ncAAs in the proteome. On the other hand, more mutually orthogonal systems and structurally diverse ncAAs need to be developed to equip these engineered organisms with free or expanded codons.
More importantly, more researchers need to apply the genetic code expansion technique in their own research to explore new applications and functions of ncAAs, which will eventually benefit human therapy. As this technique develops, we are confident that new generations of biotherapeutics will emerge.