Systematic exploration of a class of hydrophobic unnatural base pairs yields multiple new candidates for the expansion of the genetic alphabet.

Kirandeep Dhami, Denis A Malyshev, Phillip Ordoukhanian, Tomáš Kubelka, Michal Hocek, Floyd E Romesberg.

Abstract

We have developed a family of unnatural base pairs (UBPs), which rely on hydrophobic and packing interactions for pairing and which are well replicated and transcribed. While the pair formed between d5SICS and dNaM (d5SICS-dNaM) has received the most attention, and has been used to expand the genetic alphabet of a living organism, recent efforts have identified dTPT3-dNaM, which is replicated with even higher fidelity. These efforts also resulted in more UBPs than could be independently analyzed, and thus we now report a PCR-based screen to identify the most promising. While we found that dTPT3-dNaM is generally the most promising UBP, we identified several others that are replicated nearly as well and significantly better than d5SICS-dNaM, and are thus viable candidates for the expansion of the genetic alphabet of a living organism. Moreover, the results suggest that continued optimization should be possible, and that the putatively essential hydrogen-bond acceptor at the position ortho to the glycosidic linkage may not be required. These results clearly demonstrate the generality of hydrophobic forces for the control of base pairing within DNA, provide a wealth of new structure-activity relationship data and importantly identify multiple new candidates for in vivo evaluation and further optimization.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25122747      PMCID: PMC4176363          DOI: 10.1093/nar/gku715

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Expansion of the genetic alphabet by development of a replicable unnatural base pair (UBP) has attracted significant attention (1–7) since the first efforts were reported in 1989 (8). For over a decade, we have explored the use of hydrophobic and packing forces to drive the stable and selective pairing of unnatural nucleotides in DNA and during replication and transcription. Our initial work focused on the replacement of the natural purine or pyrimidine nucleobases with predominantly hydrophobic analogs based on benzene-, naphthalene-, isocarbostyril-, pyridine- and pyridone-scaffolds (9). However, the number of candidate UBPs formed by these nucleotides soon exceeded the number that could be analyzed individually, and as a result, we conducted a screen wherein 3600 candidate UBPs were analyzed (10). This screen identified the pair formed between dSICS and dMMO2 (dSICS-dMMO2), which upon optimization yielded d5SICS-dMMO2 (Figure 1), whose relatively efficient replication by a variety of DNA polymerases (11) validated the use of hydrophobic and packing forces instead of the canonical Watson–Crick hydrogen-bonds (H-bonds) that underlie the replication of natural base pairs. In addition, one of the clearest structure–activity relationships (SARs) to emerge from these studies was the apparent importance of an H-bond acceptor positioned ortho to the glycosidic linkage (10,12,13), which, as with natural DNA (14–16), was thought to mediate the formation of a critical H-bond with a polymerase donor.
Figure 1.

The most promising UBPs previously identified. Sugar and phosphate backbone are omitted for clarity.

Our efforts to optimize the UBP then turned to improving dMMO2 as a partner for d5SICS, eventually yielding d5SICS-dNaM (Figure 1). d5SICS-dMMO2 and especially d5SICS-dNaM are replicated (2,10,17) and transcribed (18) sufficiently well for many applications, and we have used linker-modified versions to enzymatically synthesize site-specifically labeled DNA and RNA (4,19). Most importantly, we have incorporated d5SICS-dNaM into a plasmid that is stably propagated in Escherichia coli, creating the first semi-synthetic organism with an expanded genetic alphabet (20). Nonetheless, the demands of in vivo replication in different organisms and in all possible sequence contexts, including those containing multiple UBPs, are likely to require further optimization, which has proceeded, at least in part, based on the structure of the UBP formed in the active site of a DNA polymerase (3,21). Additional analogs of both d5SICS and dNaM were synthesized that in all cases maintained the putatively essential H-bond acceptor positioned ortho to the glycosidic bond. These efforts yielded d5SICS-dFEMO (22), and ultimately dTPT3-dNaM (23) (Figure 1). Although d5SICS-dFEMO is replicated with an efficiency and fidelity similar to that of d5SICS-dNaM, its propynyl group provides a natural site for post-amplification derivatization and thus for site-specific DNA labeling. In contrast, when incorporated into DNA, dTPT3-dNaM is replicated better than either d5SICS-dMMO2 or d5SICS-dNaM, with rates approaching those of a natural base pair. Despite the efficient replication of DNA containing dTPT3-dNaM, it was uncertain whether it was the most promising UBP formed between the nucleotides that had been synthesized. This is because the optimization efforts again resulted in the synthesis of too many nucleotide analogs to analyze all possible combinations individually.
Here, we report a screen of a library of 111 unnatural nucleotides (resulting in ∼6000 candidate UBPs) drawn from our complete set of analogs, including those synthesized after the identification of d5SICS-dMMO2, which are structurally more homologous to either d5SICS or dNaM, as well as those synthesized before the identification of d5SICS-dMMO2, which are more structurally diverse. However, unlike our previously reported screen, which was based on the steady-state synthesis of a single strand of DNA, the current screen relies on polymerase chain reaction (PCR) amplification. This approach was selected because PCR is an important practical tool with which any identified UBPs should be immediately compatible, and because it allowed us to screen for fidelity in a straightforward manner, by sequencing the amplification products. We found that dTPT3-dNaM is, in general, the most efficiently replicated of the UBPs examined; however, we also identified seven additional and structurally distinct UBPs that are better replicated than d5SICS-dNaM and should thus be sufficiently well replicated to underlie the expansion of an organism's genetic alphabet. In addition, we identified two UBPs that are reasonably well replicated despite one constituent nucleobase lacking the putatively essential ortho H-bond acceptor, which challenges the assumption, at least for UBPs, that the H-bond acceptor is required for efficient replication. Finally, additional SAR data were generated that should facilitate the further optimization of this class of UBP.
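As a rough check on the size of the screen, the number of candidate pairs can be counted directly. The sketch below assumes that the ∼6000 figure refers to unordered pairings among all 111 analogs, counted either without or with self-pairs; that interpretation is an assumption and is not stated explicitly in the text.

```python
from math import comb

N_ANALOGS = 111  # library size reported in the text

# Assumption: a candidate UBP is any unordered pair of two distinct analogs.
hetero_pairs = comb(N_ANALOGS, 2)

# Assumption: self-pairs (an analog opposite itself) may also be counted.
with_self_pairs = hetero_pairs + N_ANALOGS

print(hetero_pairs, with_self_pairs)
```

Both counts are of the order of 6000, so either reading is compatible with the approximate number quoted.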

MATERIALS AND METHODS

General

The triphosphates of the α6 group were prepared from the previously reported nucleosides (24) according to literature procedure (25) (See Synthetic Methods and Spectra in Supplementary Data and Supplementary Table S1). The purity of all other triphosphates was confirmed by MALDI-TOF and UV-VIS. Taq and OneTaq DNA polymerases were purchased from New England Biolabs (Ipswich, MA, USA). A mixture of dNTPs was purchased from Fermentas (Glen Burnie, MD, USA). SYBR Green I Nucleic Acid Gel Stain (10 000×) was purchased from Life Technologies (Carlsbad, CA, USA). The synthesis of the DNA templates, D8 (2), used for screening rounds 1–5, and D6 (26), used for all other amplifications, was described previously; sequences of templates are provided in Supplementary Table S2. Sanger sequencing was carried out as described previously (2). Raw Sanger sequencing traces were used to determine the percent retention of the UBPs, which was converted to fidelity per doubling, as described previously (2,26) and in the Supplementary Data.
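The conversion from retention to fidelity per doubling is detailed in refs (2,26) and the Supplementary Data, which are not reproduced here. The following is a minimal sketch that assumes the simplest model consistent with that description: if a fraction f of the UBP is retained at each doubling, the retention R after n = log2(amplification) doublings is R = f^n. The function name and argument conventions are illustrative only.

```python
import math


def fidelity_per_doubling(retention, amplification):
    """Estimate per-doubling fidelity from overall UBP retention.

    retention     -- fraction of amplicons retaining the UBP (0 < retention <= 1)
    amplification -- fold amplification of the template (> 1)

    Assumed model (not taken verbatim from the paper):
        retention = fidelity ** log2(amplification)
    """
    if not 0.0 < retention <= 1.0:
        raise ValueError("retention must be in (0, 1]")
    if amplification <= 1.0:
        raise ValueError("amplification must be greater than 1")
    doublings = math.log2(amplification)
    return retention ** (1.0 / doublings)


# Example inputs: 84% retention after 8.5e12-fold amplification.
example = fidelity_per_doubling(0.84, 8.5e12)
print(f"{100 * example:.2f}% per doubling")
```

Under this assumed model, 84% retention after 8.5 × 10^12-fold amplification corresponds to roughly 99.6% fidelity per doubling, in line with the corresponding Table 1 entry.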

Screen PCR assay conditions

All PCR amplifications were performed in a CFX Connect Real-Time PCR Detection System (Bio-Rad), in a total volume of 25 μl using the following conditions: 1× OneTaq reaction buffer, 0.5× SYBR Green I, MgSO4 adjusted to 4.0 mM, 0.2 mM of each dNTP, 50 μM of each unnatural triphosphate, 1 μM of Primer1 and Primer2 (see Supplementary Table S2) and 0.02 U/μl of the DNA polymerase. Other conditions specific for each round of screening are described in Supplementary Table S3. Amplified products were purified using DNA Clean and Concentrator-5 spin columns from Zymo Research (Irvine, CA, USA). After purification, the PCR products were sequenced with Primer1 on a 3730 DNA Analyzer (Applied Biosystems) to determine the retention of the UBP, from which fidelity was characterized, as described in the Supplementary Material.

Specific PCR assay conditions

PCR with the most promising UBPs was carried out with the conditions as described in Supplementary Table S3. PCR products were further purified on 2% agarose gels, followed by single band excision and subsequent clean up using the Zymo Research Zymoclean Gel DNA Recovery Kit. After elution with 20 μl of water, the DNA concentration was measured using fluorescent dye binding (Quant-iT dsDNA HS Assay kit, Life Technologies), and purified amplicons were sequenced in triplicate with both Primer1 and Primer2 to determine UBP retention and thus amplification fidelity (see Supplementary Figures S1–S3). Amplification of DNA containing the pairs involving analogs of group α6 was performed with OneTaq polymerase under the following thermal cycling conditions: initial denaturation at 96°C for 1 min; 16 cycles of 96°C for 10 s, 60°C for 15 s, 68°C for 1 min. Fidelity was determined by sequencing amplicons in the Primer1 direction in triplicate (Supplementary Figure S4). Amplification of DNA containing the UBPs formed between dTPT3 and d2MN or dDM2 was performed using OneTaq or Taq polymerases for 16 cycles under the following thermal cycling conditions: (i) OneTaq: initial denaturation at 96°C for 1 min, 96°C for 10 s, 60°C for 15 s, 68°C for 1 min; or (ii) Taq: initial denaturation at 96°C for 1 min, 96°C for 5 s, 60°C for 5 s, 68°C for 10 s. Fidelity was determined by sequencing amplicons in the Primer1 direction in triplicate (Supplementary Figure S5).

RESULTS

To screen for well replicated UBPs, unnatural deoxynucleoside triphosphates were grouped for analysis into either dMMO2/dNaM- or d5SICS/dTPT3-like analogs, although the distinction is not completely clear in all cases. In total, 80 dMMO2/dNaM analogs were grouped into 12 ‘α groups’ (α1–α12; Figure 2), and 31 d5SICS/dTPT3 analogs were grouped into six ‘β groups’ (β1–β6; Figure 3). Note that the group designations used here should not be confused with anomer designation (all analogs examined are β glycosides). In addition, to increase the SAR content of the screen, seven previously reported nucleoside analogs (dTOK576–dTOK588) with substituted pyridyl nucleobases (24) were phosphorylated as described in Supplementary Data, and included as group α6. For screening, a 134-mer single-stranded DNA template containing a centrally located dNaM (which has been used previously and is referred to as D8 (2)) was PCR amplified in the presence of the natural triphosphates (200 μM each), all pairwise combinations of an α and a β triphosphate group (50 μM each), and 0.02 U/μl DNA polymerase. During the first round of PCR, dNaM templates the incorporation of a β analog and is then replaced by an α analog when the original strand is copied in the second round, with the resulting UBP amplified in subsequent rounds. The amplification product of each reaction was analyzed by Sanger sequencing (Supplementary Data). As reported previously, the presence of an unnatural nucleotide results in the abrupt termination of the sequencing chromatogram, allowing the level of UBP retention to be quantified by the amount of read through (2,26). The percentage of UBP retained in the DNA after amplification during each round of screening is shown in Figure 4.
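The pooled design described above can be illustrated with a short sketch. Everything in it is hypothetical: the pool names, analog names and retention values are invented placeholders, and the rule that a pooled reaction reports the best pair it contains is a simplifying assumption rather than a property established in this work.

```python
from itertools import product


def screen_pools(alpha_pools, beta_pools, measure, threshold):
    """Test every alpha pool against every beta pool.

    Returns the (alpha_pool, beta_pool) name pairs whose measured
    retention is at or above the threshold; these would be carried
    forward and subdivided in the next round.
    """
    hits = []
    for (a_name, a_members), (b_name, b_members) in product(
        alpha_pools.items(), beta_pools.items()
    ):
        if measure(a_members, b_members) >= threshold:
            hits.append((a_name, b_name))
    return hits


# Hypothetical placeholder data, not taken from the study.
TOY_RETENTION = {("A1", "B1"): 95, ("A2", "B1"): 60, ("A3", "B2"): 20}


def toy_measure(a_members, b_members):
    # Assumption: a pooled reaction reports its best-replicated pair.
    return max(
        TOY_RETENTION.get((a, b), 0) for a in a_members for b in b_members
    )


alpha_pools = {"alpha_x": ["A1", "A2"], "alpha_y": ["A3"]}
beta_pools = {"beta_x": ["B1"], "beta_y": ["B2"]}

hits = screen_pools(alpha_pools, beta_pools, toy_measure, threshold=50)
print(hits)
```

The benefit of pooling is the reduction in reaction count: if only α-with-β combinations are counted, 12 α groups and six β groups require 12 × 6 = 72 first-round reactions, compared with 80 × 31 = 2480 reactions for every individual α-β pair.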
Figure 2.

α group unnatural deoxynucleoside triphosphates. d2OMe and dMMO1 were moved from Group α1 to the groups indicated after the first round of screening. Sugar and phosphate backbone are omitted for clarity. References for each compound are provided in Supplementary Table S6.

Figure 3.

β group unnatural deoxynucleoside triphosphates. Sugar and phosphate backbone are omitted for clarity. References for each compound are provided in Supplementary Table S6.

Figure 4.

UBP retention (%) after PCR amplification during each round of screening. Open squares indicate UBPs that were not evaluated; light gray squares indicate UBPs that were replicated with less than 50% retention, while those that resulted in higher retention are indicated with darker shading and with the retention value included.

The first round of screening employed 0.1 ng of template and 16 cycles of amplification under relatively permissive conditions that included OneTaq polymerase and a 1 min extension time. For our purposes, OneTaq is considered permissive because it is a mixture of Taq (a family A polymerase (27,28)) and Deep Vent (a family B polymerase (27,28)), with the latter possessing exonucleotidic proofreading that allows for the excision of an incorrectly incorporated triphosphate. Under these conditions, only the pairs involving group β5 or β6 showed high retention. The combinations of β5 or β6 and the α groups that showed the highest retention were progressed to a second round of screening, wherein they were divided into smaller groups (denoted by a, b or c; Figures 2 and 3). High retention (≥97%) was observed with β5a and α2c, α9a, α9c, α10a, α10c, α12b or α12c; with β5b and α9a, α9b, α10c or α12b and with β6b and α10c. Moderate retention (84–96%) was observed with β5a and α1a, α1b, α6a, α9b, α10b or α12a; β5b and α1a, α1b, α2c, α6a, α9c, α10a, α12a or α12c; β6a and α1b or α10c and β6b and α1a, α6a, α9a–c, α10a, α10b or α12a–c.
For a third round of screening, α analogs were analyzed in groups of only one to three compounds, and group β6a was subdivided into its two constituent triphosphates, dTPT1TP and dFPT1TP. The highest retention (≥90%) was observed with β5a and α1a, α2cII, α9a–c, α10aI, α10aII, α10c, α12b or dTfMOTP; β5b and α9a, α9c or α10c; dFPT1TP and α10aI and β6b and α1a, α9a–c, α10aI, α10aII, α10c, α12b, dNMOTP, dTfMOTP or dCNMOTP. Only slightly less retention (80–89%) was seen with β5a and α2cI, α12a, dNMOTP, dQMOTP or dTOK587TP; β5b and α1a, α2cII, α10aI, α10aII, α12b or dTOK587TP; dFPT1TP and α10c and β6b and α12a, dQMOTP, dFuMO1TP or dTOK587TP. For a fourth round of screening, all of the α derivatives were analyzed as individual triphosphates, with the exception of α9b and α9c, which remained grouped. The highest retention (≥91%) was observed with β5a and α9b, α9c, dFIMOTP, dIMOTP, dFEMOTP, dMMO2TP, d2OMeTP, dDMOTP, d5FMTP, dNaMTP, dVMOTP, dZMOTP, dClMOTP, dTfMOTP, dQMOTP, d2MNTP, dDM2TP or dTOK587TP; β5b and α9b, α9c, dFIMOTP, dIMOTP, dFEMOTP, dNaMTP, dZMOTP, dClMOTP, dQMOTP, dMM1TP, dDM2TP or dTOK587TP; β6 analog dFPT1TP and α analogs d2OMeTP or dNaMTP and β6b and α9b, α9c, dFIMOTP, dIMOTP, dFEMOTP, dMMO2TP, dDMOTP, dTMOTP, dNMOTP, d5FMTP, dNaMTP, dVMOTP, dZMOTP, dClMOTP, dTfMOTP, dQMOTP, dCNMOTP, d2MNTP, dTOK587TP or dFuMO2TP. To increase the stringency of the screen, a fifth round was performed with Taq polymerase instead of OneTaq, as it lacks exonuclease proofreading activity and thus increases the sensitivity to mispair synthesis. This round also separated all remaining α and β groups into individual triphosphates. The highest retention (≥90%) was seen with dSICSTP and dNaMTP; dSNICSTP and dNaMTP; dTPT2TP and dFDMOTP; dTPT3TP and dFIMOTP, dIMOTP or dNaMTP and dFTPT3TP and dFIMOTP, dIMOTP, dFEMOTP, dNMOTP, dNaMTP, dClMOTP, dTfMOTP or dCNMOTP. 
To better differentiate between the UBPs, we progressed the 62 most promising candidate UBPs to a sixth round of screening in which the template concentration was decreased 10-fold (to 10 pg) to allow for greater amplification, and thereby afford greater discrimination, and the template was changed to D6 (26), where the three flanking nucleotides on either side of the unnatural nucleotide are randomized among the natural nucleotides. Moreover, the denaturation and annealing steps were decreased to 5 s each, and the extension time was decreased to 10 s. Under these conditions, we explored amplification either with OneTaq or with Taq alone. The results with OneTaq showed the highest retention (>95%) with dSICSTP and dNaMTP; dSNICSTP and dFEMOTP; dTPT3TP and dFIMOTP, dIMOTP, dFEMOTP, dZMOTP or dNaMTP and dFTPT3TP and dIMOTP or dFEMOTP. Moderate retention (86–94%) was observed with dSICSTP and dFEMOTP or dDM2TP; d5SICSTP and dNaMTP; dSNICSTP and dIMOTP; dTPT2TP and dNaMTP; dTPT3TP and dNMOTP, dClMOTP, dQMOTP, dCNMOTP or d2MNTP and dFTPT3TP and dFIMOTP, dNaMTP, dZMOTP, dClMOTP, dTfMOTP or dCNMOTP. While retention during Taq-mediated amplification was in general reduced relative to that with OneTaq, the general trends were similar. The highest retention (>96%) was observed with dTPT3TP and dFIMOTP or dIMOTP, and with dFTPT3TP and dFIMOTP. Only slightly lower retention (89–94%) was observed with dTPT3TP and dFEMOTP, dNaMTP or dCNMOTP and dFTPT3TP and dIMOTP, dFEMOTP, dNaMTP, dClMOTP, dCNMOTP or d2MNTP. Amplification with the most promising combinations of triphosphates, dTPT3TP or dFTPT3TP and dFIMOTP, dIMOTP, dFEMOTP or dNaMTP, was then performed over 52 cycles with Taq and a 10 s extension time, to explore particularly stringent conditions, or with OneTaq and a 30 s extension time, to explore more practical conditions (Table 1, Figure 5). Both amplified strands were sequenced in triplicate to determine UBP retention with high accuracy.
With Taq, dTPT3-dNaM, dTPT3-dFIMO, dFTPT3-dNaM and dFTPT3-dFIMO showed the highest retention, while the pairs involving dIMO and dFEMO showed somewhat less retention. With OneTaq, dTPT3-dNaM and dFTPT3-dNaM showed the highest retention, followed closely by dFTPT3-dFIMO and dTPT3-dFIMO.
Table 1.

Characterization of the most promising UBPs

dβTP     dαTP    Amplification, ×10^12    Retention, %    Fidelity per doubling, %

Taq, 10 s Extension
TPT3     FIMO    8.5                      84 ± 3          99.60 ± 0.09
         IMO     6.3                      81 ± 5          99.50 ± 0.15
         FEMO    5.0                      79 ± 3          99.44 ± 0.09
         NaM     5.8                      86.5 ± 0.5      99.66 ± 0.01
FTPT3    FIMO    4.8                      84 ± 3          99.60 ± 0.09
         IMO     5.6                      82 ± 5          99.54 ± 0.13
         FEMO    5.7                      81 ± 4          99.51 ± 0.11
         NaM     3.7                      91 ± 6          99.76 ± 0.15
5SICS    NaM     9.3                      <50 (b)         <85 (b)

OneTaq, 1 min Extension
TPT3     FIMO    8.7                      84.7 ± 1.1      99.61 ± 0.03
         IMO     9.4                      82.9 ± 1.7      99.56 ± 0.05
         FEMO    10.4                     82.2 ± 1.0      99.55 ± 0.03
         NaM     8.3                      91.2 ± 1.3      99.79 ± 0.03
FTPT3    FIMO    8.2                      86 ± 3          99.65 ± 0.08
         IMO     7.1                      76.8 ± 1.6      99.38 ± 0.05
         FEMO    6.3                      72.4 ± 1.4      99.24 ± 0.04
         NaM     7.0                      90 ± 2          99.76 ± 0.06
5SICS    NaM     8.1                      77.1 ± 0.7      99.00 ± 0.02

(a) Retention and fidelity determined as described in Materials and Methods.

(b) UBP retention below 50%, and fidelity is thus estimated to be <85%.

Figure 5.

UBPs identified by the present study.

The screening data suggest that several pairs formed between dTPT3 and the previously unexamined pyridine-based derivatives of α6 were reasonably well replicated. Thus, we examined in triplicate the amplification of DNA containing these UBPs using OneTaq and 16 amplification cycles with 1 min extension times (Supplementary Table S4). The pairs formed between dTPT3 and dTOK580, dTOK582 or dTOK586 were poorly replicated. However, the pairs formed between dTPT3 and dTOK588, dTOK581, dTOK576 and dTOK587 were amplified with a retention of 62%, 65%, 85% and 94%, respectively. Finally, the screening data suggested that the pairs formed between dTPT3 and d2MN or dDM2 are reasonably well replicated, despite neither d2MN nor dDM2 possessing a putatively essential ortho H-bond acceptor. Thus, these pairs were further examined via 16 cycles of amplification with OneTaq or Taq alone, and with extension times of either 1 min or 10 s (Supplementary Table S5). With Taq alone, only poor retention was observed. However, with OneTaq, retention was better for both pairs. Retention of the dTPT3-dDM2 pair was 58% and 69% with 1 min and 10 s extension times, respectively. Remarkably, dTPT3-d2MN was amplified with retentions of 96% and 94% with 1 min and 10 s extension times, respectively.

DISCUSSION

By relying on the propagation of DNA containing d5SICS-dNaM, we have recently succeeded in creating the first semi-synthetic organism with an expanded genetic alphabet (20). Nonetheless, the creation of semi-synthetic organisms that indefinitely retain the UBP in all possible sequence contexts, including those that are difficult to replicate or that contain multiple UBPs, will likely be facilitated by the availability of multiple, structurally distinct UBPs. Significant progress toward this goal was recently reported with discovery of the dTPT3-dNaM UBP (23). However, as we have synthesized more analogs of d5SICS/dTPT3 (referred to herein as β derivatives) and dNaM (α derivatives) than can be evaluated individually, it was not clear whether dTPT3-dNaM was even the best UBP among those already available. Thus, we initiated a PCR-based screen to identify the most promising UBPs. In addition, to increase the SAR content of the screen, we included seven novel α derivatives that are based on a pyridyl scaffold with different substituents at the positions ortho and para to the glycosidic linkage.

SAR data

Even under permissive conditions, where exonucleotidic proofreading activity was present and extension times were 1 min, only mixed groupings of α analogs with β analogs showed significant levels of retention, demonstrating that efficient replication requires the pairing of an α scaffold with a β scaffold. However, the only β groups that showed high retention were β5 and β6. This reveals the privileged status of the d5SICS/dTPT3-like scaffold relative to all of the others examined. Surprisingly, the dominant contribution to the high retention with group β5 proved to result not from pairs involving d5SICS, but rather from pairs involving dSICS, and to a lesser extent dSNICS. For example, under all conditions, dSICS-dNaM was better replicated than d5SICS-dNaM. Interestingly, d5SICS resulted from the optimization of dSICS for pairing with dMMO2 (10); apparently, the increased bulk of dNaM makes the added methyl group deleterious. Furthermore, dSNICS-dNaM is replicated nearly as well as (with OneTaq) or better than (with Taq) d5SICS-dNaM, suggesting that a 6-aza substituent optimizes UBP synthesis by facilitating insertion of the unnatural triphosphate opposite dNaM or by increasing the efficiency with which the unnatural nucleotide templates the insertion of dNaMTP. Finally, dSNICS-dFEMO is also better replicated than d5SICS-dNaM, but only in the presence of proofreading, suggesting that while triphosphate insertion may be less efficient, increased efficiency of extension results in an overall increase in fidelity. The dominant contribution to high-fidelity retention with group β6 was provided by dTPT3 and dFTPT3. In general, both paired well with dNaM, dFEMO, dFIMO or dIMO. dTPT3 paired especially well with dFIMO and dIMO, suggesting that the para iodo substituent mediates favorable interactions, and it also paired well with dFEMO and especially dNaM when exonuclease activity was present.
dFTPT3 paired well with either dIMO or dFEMO in the presence of exonuclease activity, as well as with dFIMO and dNaM in its absence. While the nitrogen substituent of the pyridine-based α analogs (group α6) was generally detrimental for replication, a more detailed analysis of the UBPs formed with dTPT3 revealed several interesting trends. As has been observed with other scaffolds, a methyl, chloro or amino substituent at the position ortho to the C-glycosidic linkage resulted in poorly replicated pairs, presumably due to poor extension after incorporation of the unnatural triphosphate. Also as has been observed with other scaffolds, the ortho methoxy substituent of dTOK581 resulted in better replication, presumably due to its ability to both hydrophobically pack with the template during UBP synthesis and accept an H-bond with a polymerase-based H-bond donor during extension (10). Surprisingly, the data also revealed that the methylsulfanyl ortho substituent of dTOK588, dTOK576 and especially dTOK587 results in better replication. This improvement is likely due to a more optimized compromise between the ability to hydrophobically pack and the ability to accept an H-bond from the polymerase at the primer terminus. We also found that the para substituent in this series of derivatives can contribute to efficient replication, with a bromo substituent being the best, followed by a second methylsulfanyl group, and then finally a simple methyl group. When dTOK587, with its combination of the ortho methylsulfanyl and para bromo substituents, was paired with dTPT3, the resulting UBP was replicated by OneTaq and 1 min extension times with a fidelity (calculated from retention level as reported previously (2,26)) of 99.3%, which is slightly better than d5SICS-dMMO2 under similar conditions. Clearly, similar ortho methylsulfanyl and para bromo substituents should be examined with the more efficiently replicated α-like scaffolds, such as dFIMO and dNaM. 
The replication of the pairs formed between dTPT3 and d2MN or dDM2 also merits discussion. DNA containing these pairs is not amplified by Taq alone, but is surprisingly well amplified by OneTaq. This result was unexpected because neither d2MN nor dDM2 possesses the ortho H-bond acceptor that has been postulated to be essential for extension of the nascent (natural or unnatural) primer terminus. Specifically, when a nucleotide is positioned at the growing primer terminus, the H-bond acceptor is disposed into the developing minor groove where it accepts an H-bond from the polymerase, and this H-bond is thought to be required for proper terminus alignment (14–16). When amplified with OneTaq and a 1 min extension time, dTPT3-d2MN is replicated with a fidelity of 99.5%, which only drops to 99.1% when the extension time is reduced to 10 s. The absence of amplification in the absence of proofreading, coupled with the only small decrease observed in the presence of proofreading when extension times were reduced, implies that the surprisingly high-fidelity amplification of DNA containing dTPT3-d2MN results from selective extension of the UBP relative to mispairs. This suggests that the absence of an ortho H-bond acceptor is more deleterious for the extension of a mispair than for the extension of the UBPs.

Efforts toward the expansion of the genetic alphabet

Overall, the data confirm that dTPT3-dNaM is the most promising UBP of those currently available, and current efforts toward the expansion of the genetic code will focus on this UBP. However, the pairs formed between dTPT3 and dFEMO, dFIMO or dIMO, or between dFTPT3 and dNaM, dFEMO, dFIMO or dIMO, are also particularly promising. Given that each of these eight pairs is replicated more efficiently than d5SICS-dNaM, and that d5SICS-dNaM is sufficiently well replicated to be stably propagated within a cell (20), each of these UBPs is a viable candidate for use in the expansion of an organism's genetic code. Clearly, the core scaffolds represented by dTPT3 and dNaM are a general solution to the challenge of storing genetic information, a property previously only associated with the purines and pyrimidines of the natural nucleotides. In addition to the most promising UBPs noted above, it is noteworthy that a remarkable number of additional novel pairs are replicated with only a moderately reduced fidelity, or are replicated with a high fidelity when the amplification is performed under less stringent conditions (Table 2). Along with the most efficiently replicated UBPs, these pairs provide a wide range of scaffolds with diverse physicochemical properties for further optimization efforts. This is especially critical in the effort to optimize in vivo replication, where different physicochemical properties are expected to bestow the constituent nucleotides with different pharmacokinetic-like properties, the optimization of which is also likely to be important during the effort to create stable and healthy semi-synthetic organisms that are able to store and retrieve increased genetic information.
Table 2.

Characterization of additional promising UBPs

dβTP     dαTP    Retention (%)
SICS     NaM     99 (a)
SICS     FEMO    92 (b)
SNICS    NaM     90 (a)
SNICS    FEMO    95 (b)
SNICS    IMO     88 (b)
TPT3     NMO     89 (b)
TPT3     ZMO     86 (b)
TPT3     ClMO    90 (b)
TPT3     QMO     90 (b)
TPT3     CNMO    91 (b)
FTPT3    NMO     94 (a)
FTPT3    ZMO     88 (a)
FTPT3    ClMO    97 (a)
FTPT3    QMO     87 (a)
FTPT3    CNMO    94 (a)

(a) PCR conditions: 100 pg D8 template (2) amplified for 16 cycles with Taq polymerase under thermocycling conditions: initial denaturation at 96°C for 1 min, 96°C for 30 s, 60°C for 15 s, 68°C for 60 s.

(b) PCR conditions: 10 pg D6 template (26) amplified for 24 cycles with OneTaq polymerase under thermocycling conditions: initial denaturation at 96°C for 1 min, 96°C for 5 s, 60°C for 5 s, 68°C for 10 s.


SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.