Expansion of the Genetic Alphabet: A Chemist's Approach to Synthetic Biology.

Abstract

The information available to any organism is encoded in a four nucleotide, two base pair genetic code. Since its earliest days, the field of synthetic biology has endeavored to impart organisms with novel attributes and functions, and perhaps the most fundamental approach to this goal is the creation of a fifth and sixth nucleotide that pair to form a third, unnatural base pair (UBP) and thus allow for the storage and retrieval of increased information. Achieving this goal, by definition, requires synthetic chemistry to create unnatural nucleotides and a medicinal chemistry-like approach to guide their optimization. With this perspective, almost 20 years ago we began designing unnatural nucleotides with the ultimate goal of developing UBPs that function in vivo, and thus serve as the foundation of semi-synthetic organisms (SSOs) capable of storing and retrieving increased information. From the beginning, our efforts focused on the development of nucleotides that bear predominantly hydrophobic nucleobases and thus that pair not based on the complementary hydrogen bonds that are so prominent among the natural base pairs but rather via hydrophobic and packing interactions. It was envisioned that such a pairing mechanism would provide a basal level of selectivity against pairing with natural nucleotides, which we expected would be the greatest challenge; however, this choice mandated starting with analogs that have little or no homology to their natural counterparts and that, perhaps not surprisingly, performed poorly. Progress toward their optimization was driven by the construction of structure-activity relationships, initially from in vitro steady-state kinetic analysis, then later from pre-steady-state and PCR-based assays, and ultimately from performance in vivo, with the results augmented three times with screens that explored combinations of the unnatural nucleotides that were too numerous to fully characterize individually. 
The structure-activity relationship data identified multiple features required by the UBP, and perhaps most prominent among them was a substituent ortho to the glycosidic linkage that is capable of both hydrophobic packing and hydrogen bonding, and nucleobases that stably stack with flanking natural nucleobases in lieu of the potentially more stabilizing stacking interactions afforded by cross strand intercalation. Most importantly, after the examination of hundreds of unnatural nucleotides and thousands of candidate UBPs, the efforts ultimately resulted in the identification of a family of UBPs that are well recognized by DNA polymerases when incorporated into DNA and that have been used to create SSOs that store and retrieve increased information. In addition to achieving a longstanding goal of synthetic biology, the results have important implications for our understanding of both the molecules and forces that can underlie biological processes, so long considered the purview of molecules benefiting from eons of evolution, and highlight the promise of applying the approaches and methodologies of synthetic and medicinal chemistry in the pursuit of synthetic biology.

Substances:
Nucleotides
DNA

Year: 2017 PMID: 29198111 PMCID: PMC5820176 DOI: 10.1021/acs.accounts.7b00403

Source DB: PubMed Journal: Acc Chem Res ISSN: 0001-4842 Impact factor: 22.384

Introduction

The field of synthetic biology was first defined in 1911 by Stéphane Leduc[1] with the goal of creating new biological forms and functions. The modern field is largely focused on using the engineering-like approach of design, test, and standardize to optimize naturally derived “parts”, most commonly novel DNA elements. However, the most fundamental approach to create new forms and functions is to develop unnatural base pairs (UBPs) that expand the genetic alphabet and, thus, increase the amount of information that may be stored in an organism’s DNA. The effort to expand the genetic alphabet was first championed by Steven Benner and focused on unnatural nucleotides bearing hydrogen-bonding (H-bonding) patterns that are orthogonal to those employed by the natural base pairs (Figure 1).[2] However, it was unclear if H-bonding was the only force suitable for controlling base pairing. Indeed, the Kool laboratory reported the remarkable observation that a DNA polymerase could selectively insert the difluorotoluene analog of dTTP opposite dA in the template.[3] We and the Hirao laboratory thus initiated efforts to use hydrophobic and packing forces to control UBP formation. Hirao has focused on derivatizing natural purine and pyrimidine scaffolds to create “shape complementary” UBPs (Figure 1),[4] while we focused on nucleotides bearing nucleobase analogs with little to no homology to their natural counterparts.

Figure 1

dNaM-dTPT3, dZ-dP, and dDs-dPx (R = H or −CH(OH)–CH2OH) UBPs.

Although our efforts were consistent with the tenets of modern synthetic biology, we used synthetic chemistry to generate the required parts, which is in many ways more consistent with Leduc’s original vision.[1] Furthermore, we optimized the parts using a medicinal chemistry-like approach, inspired by the perspective that any effort endeavoring to develop foreign, man-made parts that function within a cell will need to optimize solubility, cellular uptake, stability, off-target activity, toxicity, dose, and dosing regimen. The field of medicinal chemistry approaches the same problems through the construction of structure–activity relationships (SARs) that allow for empirical optimization in the absence of a complete understanding of the process being optimized, a strategy that we adapted for UBP development. Here, we recount our efforts to develop UBPs that ultimately culminated in the discovery of the dNaM–dTPT3 UBP (Figure 1), as well as a family of related UBPs, all of which have now been used to create semi-synthetic organisms (SSOs) that store[5,6] and retrieve[7] increased information. This places us at the doorstep of realizing Leduc’s vision of creating organisms with novel forms and functions.

Parts Optimization: First Generation UBP Candidates

The synthetic biology parts required for the expansion of the genetic alphabet are the unnatural nucleotides that selectively pair to form a UBP, and their optimization, at least initially, was measured by both duplex stability and DNA polymerase recognition. In most cases, analogs were synthesized as triphosphates (referred to as dXTP, where X is a nucleoside analog), as well as phosphoramidites for incorporation into DNA via solid-phase synthesis. While characterization of the stability of duplex DNA containing the UBPs revealed many interesting trends, such as the effects of solvation,[8] stability proved uncorrelated with polymerase recognition, as is also the case with natural base pairs,[9] and polymerase recognition quickly emerged as our primary focus. The majority of our early efforts employed the exonuclease-deficient Klenow fragment of E. coli DNA polymerase I (Kf). We focused specifically on two steps, the insertion of an unnatural triphosphate opposite its cognate analog in a template (a step that we also refer to as UBP synthesis or unnatural nucleotide incorporation), and the continued primer extension by insertion of the next correct triphosphate, in each case characterizing efficiency (second-order rate constant, kcat/KM) and fidelity ((kcat/KM)correct/(kcat/KM)incorrect). Steady-state conditions were employed, which allowed rapid SAR construction but only provided information about the rate-limiting step of insertion or extension, both of which are actually a complex series of reactions (e.g., triphosphate binding, conformational changes, phosphoryl transfer, product release). During early development, this was not a problem as the rate-limiting step was invariably phosphoryl transfer. Our search began with the simple benzene and naphthalene nucleobase analogs (Figure 2A).[10,11] We found that dDMTP is a poor polymerase substrate, as incorporation opposite dDM or dTM in the template was virtually undetectable. 
In contrast, dTMTP was more efficiently incorporated opposite dDM (kcat/KM = 1.4 × 10⁶ M⁻¹ min⁻¹) and dTM (kcat/KM = 2.2 × 10⁶ M⁻¹ min⁻¹), only 20–30-fold less than the efficiency with which a natural base pair is synthesized in the same sequence context. However, both the dTM–dDM heteropair and dTM–dTM self-pair are limited by poor extension and mispairing with dA, likely because dA is the most hydrophobic of the natural nucleotides. We also found that d2Np efficiently directs the insertion of d2NpTP (kcat/KM = 2.8 × 10⁶ M⁻¹ min⁻¹), but misincorporation of dATP was again problematic (kcat/KM = 1.1 × 10⁵ M⁻¹ min⁻¹). Addition of a methyl substituent to the position ortho to the glycosidic bond (hereafter referred to simply as the ortho position) yielded d2MN, which generally increased the rate of incorporation of the triphosphate against hydrophobic analogs in the template. However, d2MNTP is again efficiently inserted opposite dA. Moving the single methyl substituent from the ortho to the meta position (d3MN) reduces insertion opposite dA but also reduces the efficiency of UBP synthesis. The additional methyl substituent of dDMN resulted in efficient pairing with both dTM and d2MN, but it also increased the rate of dATP insertion when in the template. Thus, while mispairing generally remained problematic, at this point it was clear that several of these UBPs showed promising synthesis rates. However, none of them supported continued primer extension at a detectable level (kcat/KM < 10³ M⁻¹ min⁻¹).
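
The fidelity measure defined above, (kcat/KM)correct/(kcat/KM)incorrect, can be illustrated with the d2Np values quoted in this paragraph. The following is a minimal sketch only: the function name is hypothetical, and it assumes that both reported rate constants refer to insertion opposite d2Np in the template.

```python
def fidelity(eff_correct: float, eff_incorrect: float) -> float:
    """Ratio of second-order rate constants (kcat/KM), correct over incorrect."""
    if eff_incorrect <= 0:
        raise ValueError("incorrect-insertion efficiency must be positive")
    return eff_correct / eff_incorrect


# Values quoted in the text, in M^-1 min^-1.
# Assumption: both were measured with d2Np in the template.
D2NPTP_OPPOSITE_D2NP = 2.8e6  # insertion of d2NpTP (self-pair synthesis)
DATP_OPPOSITE_D2NP = 1.1e5    # misinsertion of dATP

d2np_fidelity = fidelity(D2NPTP_OPPOSITE_D2NP, DATP_OPPOSITE_D2NP)
print(f"d2NpTP is favored over dATP by roughly {d2np_fidelity:.0f}-fold")
```

Under these assumptions the self-pair is favored only about 25-fold over the dA mispair, which is consistent with the description of dATP misincorporation as problematic.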

Figure 2

(A) First generation methylated benzene and naphthalene analogs. (B) Second generation methylated benzene analogs. Sugar and phosphate groups omitted for clarity.

The isocarbostyril scaffold also received early development attention (Figure 3A,B).[10,12−14] When in the template, the simplest of the series, dICS, directs the insertion of triphosphates bearing simple substituted benzene nucleobases, such as dDMTP, dTMTP, or dDMNTP, with only modest efficiency, but d2MNTP is inserted significantly more efficiently. Addition of a 1-propynyl group to the 7-position of dICS, affording dPICS, results in a triphosphate that is generally inserted more efficiently opposite other unnatural analogs. The methyl group of dPIM resulted in efficient but indiscriminate insertion of the triphosphate, while addition of the methyl group of dMICS resulted in the indiscriminate templating of natural triphosphates. Although the methyl group of d5MICS had little systematic effect on UBP synthesis, it did dramatically reduce self-pairing. Despite variable rates of UBP synthesis, the rates of extension remained problematic. While 6-aza substitution (dNICS) results in 2-fold reduced self-pair synthesis, it also results in a 2-fold increased rate of extension. Remarkably, replacing the oxygen of dICS with sulfur (dSICS) results in an 80-fold increase in the rate of self-pair synthesis and a 4-fold increase in the rate of extension. The combination of both modifications in dSNICS retained the increased rate of synthesis but further increased the rate of extension 12-fold (kcat/KM = 2.2 × 10⁴ M⁻¹ min⁻¹). These results provided early hints as to the important but complicated role of the ortho substituent.

Figure 3

(A) Isocarbostyril analogs. (B) Heteroatom derivatized isocarbostyril analogs. (C) Furan and thiophene fused pyridone and thiopyridone analogs. Sugar and phosphate groups omitted for clarity.

A series of pyridones and thiopyridones fused to furan and thiophene rings in both meta- and para-linked orientations were investigated (Figure 3C).[15] UBP synthesis with these analogs was generally inefficient, with a para-linked furan appearing to be particularly detrimental. The thiopyridone triphosphate analogs are generally inserted more efficiently, but the effect of sulfur was less pronounced than with the isocarbostyril scaffold, and none of these analogs emerged as particularly promising. Nucleoside analogs bearing azaindole scaffolds (Figure 4) were found to efficiently pair with various unnatural nucleotides in the template, and with reasonable selectivity against the natural triphosphates when in the template, with the exception of dP7AI, which efficiently templates the insertion of dCTP.[16] However, while UBP synthesis again proved amenable to optimization, the resulting UBPs were generally refractory to extension, which by now had emerged as the greatest challenge to optimization.

Figure 4

Azaindole analogs. Sugar and phosphate groups omitted for clarity.

A Structural Interlude

The dominant SAR that emerged from these first generation UBP candidates was that while the hydrophobicity and aromatic surface area of aromatic nucleobase derivatives appeared promising for the optimization of UBP synthesis, the resulting UBPs generally proved refractory to continued primer extension. The NMR structure of a duplex containing the dPICS self-pair (Figure 5A) was solved to better understand this SAR.[17] These studies revealed a generally unperturbed duplex structure, with the propargyl moieties of dPICS disposed as expected in the major groove; however, considerable distortion was observed at the site of the UBP itself. Rather than adopting the canonical edge-on Watson–Crick geometry, the dPICS nucleobases interact via cross-strand intercalation. This intercalation appears to be driven by favorable stacking of the large aromatic surfaces of each dPICS nucleobase and has been observed with other nucleotides bearing nucleobase analogs with extended aromatic surface area.[18] We hypothesized that the same mode of pairing occurs at the primer terminus, which we envisioned would account for the efficient rates of UBP synthesis, but also the inefficient rates of continued extension, as deintercalation would be required to appropriately position the primer terminus.

Figure 5

(A) Duplex structure of DNA containing the dPICS–dPICS UBP.[16] (B) Duplex structure of DNA containing the dNaM–d5SICS UBP.[36] (C) Structure of d5SICSTP paired opposite dNaM in the polymerase active site.[37] In chemical structures, sugar and phosphate groups omitted for clarity.

Continued Parts Optimization: Second Generation UBP Candidates

Based on the SAR generated with the first generation analogs, we began testing whether smaller nucleobase analogs could be optimized for UBP synthesis, with the expectation that they would be less prone to cross-strand intercalate. These second generation efforts started with a more complete analysis of the simple benzene scaffold explored previously (Figure 2B). The parent analog, dBENTP, is poorly recognized by Kf.[19] We then explored dMM1, dMM2, and dMM3, but found little improvement.[19] The efficiency of triphosphate insertion was progressively increased with dDMTP, dDM2TP, dDM3TP, dDM4TP, and dDM5TP, and insertion opposite dA was eliminated with dTMTP. Despite this progress with insertion, extension of primers terminating with these analogs remained inefficient. We next explored heteroatom derivatization of these small scaffolds with bromo-, fluoro-, and cyano-substituents (Figure 6A). Bromo- and cyano-substituents tended to decrease the rate of mispairing with natural nucleotides,[20] but extension rates remained poor. A systematic analysis of fluoro substitution identified a single meta substituent (d3FB) as particularly interesting, with the resulting self-pair both synthesized and extended with at least moderate efficiency,[21] but further optimization efforts proved unproductive.

Figure 6

(A) Bromo-, cyano-, and fluoro-substituted benzene analogs. (B) Methoxy-substituted benzene analogs. Sugar and phosphate groups omitted for clarity.

The situation improved with a family of methoxy-derivatized analogs (Figure 6B), and those possessing an ortho-methoxy substituent were the first to provide UBPs that were consistently extended with at least reasonable efficiency.[22] In fact, with dMMO2 paired against dTM in the template, the resulting primer was extended with an efficiency that is only 36-fold lower than that of a natural base pair in the same sequence context. Mutation of the polymerase indicated that the increased extension resulted from the ability of the ortho-methoxy group to accept an H-bond, and similar substituents within the natural nucleobases are known to play a similar role.[23−25] Consistent with the need for an ortho H-bond acceptor at the primer terminus, dTM paired opposite dMMO2 in the template is only extended poorly (kcat/KM = 6.3 × 10³ M⁻¹ min⁻¹). Although the d3FB self-pair was an exception, the SAR strongly suggested that an ortho H-bond acceptor was essential, and its inclusion emerged as a central design theme. A variety of heterocyclic N- and C-nucleotides, which can also position an H-bond acceptor at the position ortho to the glycosidic linkage, were next examined[26−28] (Figures 7 and 8). The pyridone analog triphosphates were inserted with only marginal efficiency, but the resulting UBPs were extended with greater efficiency, consistent with the proposed role of the ortho H-bond acceptor. 
Conversion of dPYR to the corresponding thiopyridone (dSPYR) resulted in UBPs that were still reasonably well extended but synthesized less efficiently.[29] No pyridine analogs were efficiently inserted as triphosphates, but d2Py was better extended when at the primer terminus than was d3Py or d4Py, again consistent with the importance of an H-bond acceptor in the developing minor groove.[28] Problematically, however, the pyridine analogs paired more efficiently with dATP than any unnatural triphosphates when in the template, and no improvements were found with various alkyl or heteroatom substituents or with an increased aromatic surface (dQL).[30]

Figure 7

Derivatized monocyclic pyridone analogs. Sugar and phosphate groups omitted for clarity.

Figure 8

Pyridine and substituted pyridine analogs. Sugar and phosphate groups omitted for clarity.

The most pronounced SAR generated from these second generation candidates was that efficient UBP synthesis requires a hydrophobic ortho substituent, while efficient extension requires the same substituent to be more hydrophilic, at least hydrophilic enough to act as an H-bond acceptor. This apparent physicochemical dichotomy challenged our confidence that both UBP synthesis and extension could be simultaneously optimized.

Parts Optimization: Third Generation UBP Candidates

With no clear strategy to satisfy the conflicting demands of UBP synthesis and extension, we pivoted to a screen-based strategy. Two complementary screens were performed, one gel-based screen that focused on UBP incorporation and extension under steady-state conditions, and one fluorescence-based plate screen that focused on the efficiency and fidelity of full length synthesis.[29] Remarkably, from 3600 candidate UBPs, both screens identified dMMO2–dSICS as the most promising. This UBP appeared to satisfy the physicochemical contradiction, because the sulfur of the dSICS thioamide is relatively hydrophobic but still able to accept an H-bond, while simple bond rotation allows the methoxy group of dMMO2 to direct either a hydrophobic methyl group or the oxygen lone pairs into the developing minor groove. The identification of dMMO2–dSICS reinvigorated our design efforts. Steady-state kinetics revealed that while dMMO2 and dSICS were both incorporated and extended relatively efficiently (kcat/KM = 3.4 × 10⁵ M⁻¹ min⁻¹ and 1.7 × 10⁶ M⁻¹ min⁻¹, respectively, with dSICS in the template, and 1.4 × 10⁶ M⁻¹ min⁻¹ and 1.1 × 10⁶ M⁻¹ min⁻¹, respectively, with dMMO2 in the template), replication fidelity was limited by dSICS self-pairing. Based on previous SAR, we explored the addition of a methyl group to the distal aromatic ring of dSICS to introduce steric interactions to disfavor self-pairing. These efforts identified d5SICS, and thus dMMO2–d5SICS emerged as our lead UBP. We next turned to the optimization of dMMO2TP insertion opposite d5SICS, which was now the rate-limiting step of replication (Figure 9). Based on previous results that a meta-fluoro substituent or an expansion of aromatic surface area increased the efficiency of triphosphate incorporation, we explored the analogs d5FM and dNaM (Figure 9B).[31] Gratifyingly, opposite d5SICS, both d5FMTP and dNaMTP are more efficiently inserted than dMMO2TP. 
Although the former is limited by the synthesis and extension of the dA–d5FM mispair, the efficiency and fidelity of replicating DNA containing the dNaM–d5SICS UBP is excellent (kcat/KM = 5.0 × 10⁶ M⁻¹ min⁻¹ and 3.7 × 10⁷ M⁻¹ min⁻¹, for insertion of dNaMTP opposite d5SICS and d5SICSTP opposite dNaM, respectively, and 1.2 × 10⁶ M⁻¹ min⁻¹ and 2.7 × 10⁶ M⁻¹ min⁻¹, respectively, for each subsequent extension). In fact, dNaM–d5SICS was the first of our UBP candidates to be amplified in a standard PCR reaction with high efficiency and fidelity.[32] To determine whether dNaM is optimal for pairing with d5SICS, we further explored derivatization of the dMMO2 scaffold with Cl, Br, I, Me, and Pr meta-substituents, but none improved replication.[33] We next examined 28 analogs with different para-substituents, and the SAR was augmented with PCR amplification assays using Taq, OneTaq, and KOD polymerases (Figure 9).[34,35] Several of the most promising analogs were further derivatized with a meta-fluoro or methoxy substituent. These efforts identified dEMO, dFIMO, and dFEMO as better partners for d5SICS than dMMO2 (Figure 9B); however, none was more optimal in these in vitro assays than dNaM.

Figure 9

(A) Para- and meta-substituted dMMO2 analogs. (B) Optimized dMMO2 analogs. Sugar and phosphate groups omitted for clarity.

Parts Standardization

The successful development of a synthetic biology “part” includes not only its optimization for function, but also its standardization for function in different contexts, which here corresponds to recognition of the unnatural triphosphates by different DNA polymerases and within different local sequence contexts. To explore the extent of standardization of dNaMTP and d5SICSTP, we first examined the fidelity with which DNA containing the corresponding UBP was PCR-amplified in a variety of different sequence contexts using DeepVent, Taq, or Phusion polymerase, which demonstrated that retention of the UBP was near or in excess of 99% per doubling.[36] DNA containing a UBP within a 40-nt stretch of randomized natural nucleotides was PCR amplified 1024-fold with OneTaq, and the products were analyzed by deep sequencing. A slight enrichment of the sequence 5′-GCNaM was observed, but this bias is no greater than that observed with natural sequences, demonstrating a sufficient level of standardization of the dNaM–d5SICS UBP for in vivo use.
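
To give a sense of scale for a per-doubling retention figure, the sketch below compounds it over the number of doublings implied by a given fold amplification. This is an illustration only: the function names are hypothetical, and pairing the 99% per-doubling retention with the 1024-fold amplification is an assumption made for the example, since the text reports these numbers for different experiments.

```python
import math


def doublings(fold_amplification: float) -> float:
    """Number of doublings implied by a fold amplification (log base 2)."""
    return math.log2(fold_amplification)


def overall_retention(per_doubling: float, n_doublings: float) -> float:
    """Fraction of product retaining the UBP after n doublings,
    assuming the same retention at every doubling."""
    return per_doubling ** n_doublings


# Illustrative combination of two figures quoted in the text (an assumption).
n = doublings(1024)                    # 1024-fold corresponds to 10 doublings
overall = overall_retention(0.99, n)   # 99% retention per doubling
print(f"{n:.0f} doublings, overall retention about {overall:.1%}")
```

Under these assumptions roughly 90% of the amplified DNA would still carry the UBP, which shows why small differences in per-doubling retention matter over many rounds of amplification.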

A Second Structural Interlude

Having identified a family of promising UBPs, we again returned to a structural characterization. Collaborating with the Dwyer group, we first solved the structure of a DNA duplex containing dMMO2–d5SICS. Surprisingly, the UBP still formed a cross-strand intercalated structure,[37] and solution-state NOEs revealed that the dNaM–d5SICS UBP did as well (Figure 5B),[38] although significantly less so than the dPICS self-pair. While our interests centered around function and not structure, this mode of pairing with dNaM–d5SICS raised a perplexing question: because the UBP resembles a natural mispair more than a correct pair, how is it efficiently recognized by DNA polymerases, which are known to have evolved to reject triphosphates that form mispairs? To address this question, we collaborated with the Marx group to solve the structures of the binary complex of KlenTaq DNA polymerase bound to a primer–template with dNaM in the templating position and with a primer terminating with a ddC, as well as the corresponding ternary complex with d5SICSTP bound. The data revealed that formation of the UBP drives the same large conformational change of the polymerase caused by the formation of a natural base pair (Figure 10)[39,40] and, remarkably, that the conformational change of the polymerase reciprocally drives a conformational change within the UBP, causing it to adopt an edge-to-edge paired natural Watson–Crick-like structure (Figure 5C). Thus, while a natural base pair is replicated with an induced-fit mechanism, the UBP is replicated with a similar, but subtly different, mutually induced-fit mechanism, providing a mechanistic explanation for its efficient replication. Nonetheless, after synthesis, the nascent UBP again adopts a cross-strand intercalated structure,[41] explaining the SAR data that identified a requirement for deintercalation prior to extension. 
At this point, we surmised that any further optimization of the UBP would require the careful optimization of intrastrand packing with neighboring natural nucleotides over cross-strand intercalation.

Figure 10

(A) Superimposition of the binary complex of KlenTaq polymerase bound to DNA with dNaM in the templating position in the open conformation (yellow) and the corresponding ternary complex bound to d5SICSTP in the closed conformation (purple). (B) Superimposition of ternary complex between KlenTaq polymerase, dNaM template DNA, and d5SICSTP (purple), or a natural dG template and dCTP (gray). Reproduced from ref (38). Copyright 2012 Nature Publishing Group.

Growing the Family of in Vitro Replicated UBPs

Based on the proposed mechanism of replication, we speculated that dNaM–d5SICS might be optimized by distal ring contraction and heteroatom derivatization of d5SICS, potentially favoring deintercalation, and thus extension, while preserving synthesis efficiency. Thus, we explored four derivatives with the distal ring replaced with para- or meta-linked thienyl, methyl furanyl, or methyl thienyl rings, as well as an additional derivative to explore fluorination at the meta position (Figure 11).[42] Gratifyingly, we found that dTPT2, dTPT3, and dFTPT3 form UBPs with dNaM that are more efficiently replicated within DNA than the dNaM–d5SICS UBP (as demonstrated by a pre-steady-state kinetic assay, as the steady-state rates were now limited by product dissociation[43]). The most efficiently replicated was dNaM–dTPT3, which thus emerged as our lead UBP.

Figure 11

Distal ring-contracted d5SICS analogs. Sugar and phosphate groups omitted for clarity.

At this point, we had explored several analogs of both dMMO2 and d5SICS since the identification of dMMO2–dSICS, but we had never explored these analogs as partners with previous generation analogs. Thus, using a PCR-based screen, we examined the amplification of DNA containing 111 different unnatural nucleotides (resulting in approximately 6000 candidate UBPs) drawn from our now expanded set of analogs. While we found that dNaM–dTPT3 is generally the most efficiently replicated of the UBPs examined, we identified seven additional and physicochemically distinct UBPs that are replicated significantly better than dNaM–d5SICS (Figure 12). Again drawing on established tenets of medicinal chemistry, these results are important, because our long-term goal of using the UBPs in a living SSO brings with it additional constraints, which may be differently satisfied by UBPs with differing physicochemical properties.
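
The figure of approximately 6000 candidate UBPs can be checked with a simple count of pairs drawn from 111 nucleotides. The sketch below is an assumption about how the pairs were enumerated (each unordered pair counted once, with or without self-pairs); the text does not state the counting convention.

```python
from math import comb

N_NUCLEOTIDES = 111  # number of unnatural nucleotides quoted in the text

# Assumption: each candidate UBP is an unordered pair of nucleotides.
heteropairs = comb(N_NUCLEOTIDES, 2)            # distinct X-Y pairs only
with_self_pairs = heteropairs + N_NUCLEOTIDES   # also counting X-X self-pairs

print(heteropairs, with_self_pairs)
```

Either convention gives a number slightly above 6000 (6105 or 6216), in line with the approximate figure quoted.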

Figure 12

A family of well replicated dNaM–dTPT3-like UBPs. Sugar and phosphate groups omitted for clarity.

In Vivo Performance and Optimization

In 2014, we demonstrated that when dNaMTP and d5SICSTP are imported into Escherichia coli via transgenic expression of an algal nucleoside triphosphate transporter, they are used by cellular polymerases to replicate DNA containing the UBP.[5] However, unlike in vitro replication, replication in this SSO showed significant sequence biases, some of which were still observed with the dNaM–dTPT3 UBP.[6] This is perhaps not surprising considering that the in vivo replication environment is distinct from the in vitro environment, which had been used to generate the SAR that drove UBP optimization and standardization. Thus, we screened 135 candidate UBPs, drawn from 91 unnatural triphosphates selected to cover the range of analogs that had been explored in vitro, for those that when added to the media were able to support high level retention of the corresponding UBP on a plasmid within the SSO.[44] Much of the SAR generated was consistent with that generated in vitro, but there were several interesting differences. In particular, the in vivo environment was somewhat more permissive to the nature of the ortho substituent that had proven so critical for in vitro replication (although the best UBPs still retained these H-bond acceptors). Perhaps more importantly, the in vivo screen identified four additional UBPs, each formed by pairing a dNaM analog opposite dTPT3, that are more efficiently replicated and with less sequence bias than dNaM–dTPT3 (Figure 13), the most promising of which was dCNMO–dTPT3. While this UBP is at present the most promising lead for further SSO development, it is again the physicochemical diversity offered by a family of well replicated UBPs that is likely to prove most valuable as our efforts shift toward achieving in vivo transcription and translation, which we have only just begun to explore with dNaM–dTPT3.

Figure 13

A family of UBPs optimized for in vivo expansion of the genetic alphabet. Sugar and phosphate groups omitted for clarity.

Conclusions and the Chemist’s Approach to Synthetic Biology

We have used synthetic chemistry, coupled with the methods of medicinal chemistry, to develop a family of UBPs that function not only in vitro but also in vivo and have used them to create SSOs that can store more information in their DNA. The SARs elucidated from the examination of over 150 unnatural nucleotides have guided development and identified key elements that the unnatural nucleotides must possess, most clearly, an ortho substituent that is capable of providing a hydrophobic surface as well as an H-bond acceptor and a nucleobase surface that favors intrastrand packing over cross-strand intercalation. While this Account has recounted our efforts to optimize replication, we have also recently demonstrated that DNA containing the UBPs may be transcribed into RNA in an SSO and used during translation at the ribosome to produce proteins with noncanonical amino acids.[7] This lays the foundation for the creation of SSOs with forms and functions not available to their natural counterparts and thereby achieves a long-standing goal of synthetic biology. At its core, synthetic biology aims to create parts that function within living cells, imparting them with novel attributes. While the tenets of the field were originally implicitly founded on the use of chemistry to create those parts, its modern incarnation has focused on the use of parts assembled from natural components or components intended to mimic their natural counterparts. This would seem justified by the eons of evolution that optimized the natural components for functioning in a cell, at least for a similar function. However, most natural components are recognized by or interact with multiple other components in every cell, possibly in unknown ways, and thus their introduction may have unintended consequences. 
While truly synthetic parts made by chemists do not benefit from eons of evolution, they are foreign to cells, possibly even drawing upon forces not used by their natural counterparts, and thus they may be more orthogonal and possibly easier to introduce and optimize without perturbing the cell. In this case, optimization must proceed with less information, because less is known about how the parts might interact with their biological targets. Here the methodology and lessons of medicinal chemistry, which faces conceptually the same challenges, provide the blueprint for success. If the combination of synthetic and medicinal-like chemistry can produce molecules that function alongside those evolved by nature for the most central of its processes, the storage and retrieval of information, then this approach is likely to be capable of discovering molecules that effectively participate in any biological process, potentially opening a new vista for chemists in synthetic biology.

39 in total

1. Determinants of unnatural nucleobase stability and polymerase recognition.

Authors: Allison A Henry; Chengzhi Yu; Floyd E Romesberg
Journal: J Am Chem Soc Date: 2003-08-13 Impact factor: 15.419

2. Minor Groove Interactions between Polymerase and DNA: More Essential to Replication than Watson-Crick Hydrogen Bonds?

Authors: Juan C Morales; Eric T Kool
Journal: J Am Chem Soc Date: 1999-02-14 Impact factor: 15.419

3. Minor groove hydrogen bonds and the replication of unnatural base pairs.

Authors: Shigeo Matsuda; Aaron M Leconte; Floyd E Romesberg
Journal: J Am Chem Soc Date: 2007-04-06 Impact factor: 15.419

Review 4. Alternative Watson-Crick Synthetic Genetic Systems.

Authors: Steven A Benner; Nilesh B Karalkar; Shuichi Hoshika; Roberto Laos; Ryan W Shaw; Mariko Matsuura; Diego Fajardo; Patricia Moussatche
Journal: Cold Spring Harb Perspect Biol Date: 2016-11-01 Impact factor: 10.005

5. Natural-like replication of an unnatural base pair for the expansion of the genetic alphabet and biotechnology applications.

Authors: Lingjun Li; Mélissa Degardin; Thomas Lavergne; Denis A Malyshev; Kirandeep Dhami; Phillip Ordoukhanian; Floyd E Romesberg
Journal: J Am Chem Soc Date: 2013-10-23 Impact factor: 15.419

6. Synthetic Biology Parts for the Storage of Increased Genetic Information in Cells.

Authors: Sydney E Morris; Aaron W Feldman; Floyd E Romesberg
Journal: ACS Synth Biol Date: 2017-06-27 Impact factor: 5.110

7. Chemical Stabilization of Unnatural Nucleotide Triphosphates for the in Vivo Expansion of the Genetic Alphabet.

Authors: Aaron W Feldman; Vivian T Dien; Floyd E Romesberg
Journal: J Am Chem Soc Date: 2017-02-07 Impact factor: 15.419

8. Optimization of the pyridyl nucleobase scaffold for polymerase recognition and unnatural base pair replication.

Authors: Yoshiyuki Hari; Gil Tae Hwang; Aaron M Leconte; Nicolas Joubert; Michal Hocek; Floyd E Romesberg
Journal: Chembiochem Date: 2008-11-24 Impact factor: 3.164

9. Substituent effects on the pairing and polymerase recognition of simple unnatural base pairs.

Authors: Gil Tae Hwang; Floyd E Romesberg
Journal: Nucleic Acids Res Date: 2006-04-14 Impact factor: 16.971

10. The effects of unnatural base pairs and mispairs on DNA duplex stability and solvation.

Authors: Gil Tae Hwang; Yoshiyuki Hari; Floyd E Romesberg
Journal: Nucleic Acids Res Date: 2009-06-10 Impact factor: 16.971
