Large-scale recoding of a bacterial genome by iterative recombineering of synthetic DNA.

Yu Heng Lau^1,2, Finn Stirling^1,2, James Kuo^1,2, Michiel A P Karrenbelt^1,3, Yujia A Chan^1,2, Adam Riesselman^4, Connor A Horton^1,2, Elena Schäfer^1,2, David Lips^1,2, Matthew T Weinstock^5, Daniel G Gibson^5,6, Jeffrey C Way^1,2, Pamela A Silver^1,2.

Abstract

The ability to rewrite large stretches of genomic DNA enables the creation of new organisms with customized functions. However, few methods currently exist for accumulating such widespread genomic changes in a single organism. In this study, we demonstrate a rapid approach for rewriting bacterial genomes with modified synthetic DNA. We recode 200 kb of the Salmonella typhimurium LT2 genome through a process we term SIRCAS (stepwise integration of rolling circle amplified segments), towards constructing an attenuated and genetically isolated bacterial chassis. The SIRCAS process involves direct iterative recombineering of 10-25 kb synthetic DNA constructs which are assembled in yeast and amplified by rolling circle amplification. Using SIRCAS, we create a Salmonella with 1557 synonymous leucine codon replacements across 176 genes, the largest number of cumulative recoding changes in a single bacterial strain to date. We demonstrate reproducibility over sixteen two-day cycles of integration and parallelization for hierarchical construction of a synthetic genome by conjugation. The resulting recoded strain grows at a similar rate to the wild-type strain and does not exhibit any major growth defects. This work is the first instance of synthetic bacterial recoding beyond the Escherichia coli genome, and reveals that Salmonella is remarkably amenable to genome-scale modification.

Year: 2017 PMID: 28499033 PMCID: PMC5499800 DOI: 10.1093/nar/gkx415

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The next widely anticipated breakthrough in genetic engineering is the ability to rapidly rewrite the genomes of industrially relevant microbes, plants, and animals. Rewriting entire genomes will deepen our understanding of the genetic code and dramatically transform human health, food and energy production, and our environment (1–6). A major challenge identified by the Genome Project-Write consortium is the efficiency of building and testing large modified genomes (1). In particular, genome recoding involves synonymous replacement of all instances of specific codons throughout an entire genome (2), requiring efficient assembly of large constructs containing thousands of designed base changes (3). New foundational technologies are therefore crucial for accelerating the pace of genome synthesis and modification.

New organisms based on recoded genomes can serve a variety of useful functions (1–3,7). While recoded genomes still encode the same translation products, the replaced target codons can be repurposed for incorporating non-standard amino acids into proteins and peptides (8–10). Loss of the target codon tRNAs in a recoded organism can also impair its ability to express foreign DNA acquired by horizontal gene transfer or viral infection (2,11). This may enhance the genetic containment of an organism (12,13), a desirable feature when considering safety and stability of engineered organisms for applications in open environments (14).

Several recent examples of modifying bacterial genomes illustrate the challenges associated with the construction process. Church and co-workers reported a piecewise strategy for debugging their design of a 57-codon Escherichia coli genome (3). When 55 different strains containing independent ∼50 kb segments of their recoded genome design on plasmids were tested, 44 segments conferred viability. Although the remaining 11 segments contained lethal elements, the piecewise strategy helped to expedite the troubleshooting process by narrowing down the number of possible offending mutations, enabling a design flaw to be fixed in one of the segments. While cryptic ribosome binding sites and changes in mRNA folding have been identified as potentially important factors when recoding (15,16), underlying principles of genome redesign are generally still not well-defined and fixing design flaws remains challenging (17).

The empirical nature of recoding is exemplified in recent work by Chin and co-workers, in which recoding of an essential 20 kb region of E. coli was attempted (7). Three of the eight different codon replacement schemes gave viable organisms and one lethal scheme could be rescued by fixing a single offending codon, while the remaining four schemes led to inviability. Furthermore, lessons from the de novo synthesis of a ‘minimal bacterial genome’ by Venter and co-workers have implications for genome recoding (5). While the authors halved the genome size of Mycoplasma mycoides by removing non-essential elements, this required iterative testing to address complexities such as quasi-essential genes and synthetic lethal gene pairs. Similar issues may arise when implementing genome-wide recoding.

Herein, we report a streamlined method for accumulating large-scale recoding in a single strain of Salmonella enterica serovar Typhimurium LT2 (S. typhimurium), bypassing the reliance of previous methods on site-specific enzymes. S. typhimurium is of interest in engineering the gut microbiome for diagnosis, vaccination and treatment of human disease (18–20). An attenuated and recoded S. typhimurium could serve as a stable chassis for housing various diagnostic and therapeutic circuits that can autonomously detect and act upon changes in the gut environment (21). In this work, we accumulate 1557 codon changes in 176 genes across 200 kb of the S. typhimurium genome (Figure 1), using an iterative process we term ‘stepwise integration of rolling circle amplified segments’ (SIRCAS).

Figure 1.

Accumulated recoding covering 200 kb of the S. typhimurium genome. An in silico design of the recoded genome replaced all 33 229 instances of TTA/TTG leucine codons with synonymous CTA/CTG codons respectively. From this design, regions A and B (yellow and purple) were independently recoded using SIRCAS, then combined into one strain by conjugative assembly. The ‘integrated recombineering element’ (inserted at the native hsd restriction system of S. typhimurium, blue) is an arabinose-inducible lambda red cassette that facilitates recoding by enhancing the homologous replacement of wild-type genomic DNA with synthetic fragments.

MATERIALS AND METHODS

Assembly of synthetic DNA in yeast

Synthetic DNA fragments were either purchased from commercial vendors (Gen9, SGI-DNA and Integrated DNA Technologies), or obtained by automated enzymatic assembly of oligonucleotides (BioXp™ 3200 instrument, SGI-DNA). Full details of construct design are provided in the Supporting Information (Supplementary File SI 1 Section S3). The DNA assembly method by transformation-associated recombination was adapted from previously reported protocols (22,23). To prepare yeast spheroplasts for transformation, Saccharomyces cerevisiae VL6-48 (ATCC® MYA3666™) was grown in 50 ml YPAD medium to an optical density OD600 of 1. Cells were then suspended in 20 ml of 1 M sorbitol and kept at 4°C for 4 h. Spheroplasts were generated in 20 ml SPE buffer containing 20 μl beta-mercaptoethanol (Sigma-Aldrich) and 250 μg Zymolyase-20T (US Biological) at 30°C. When the OD600 of cells diluted 1:10 in SPE buffer was 3–4 times the OD600 value of cells diluted in 2% SDS in SPE buffer, the cells were washed twice with 50 ml 1 M sorbitol. Cells were then resuspended in 2.5 ml STC buffer. After 20 min, 200 μl of spheroplasts were added to a 50 μl solution of DNA fragments (total ∼1 μg DNA, equimolar fragments). After 10 min, 1 ml of freshly prepared 20% PEG 8000 solution was added. After a further 20 min, the spheroplasts were resuspended in 800 μl SOS and incubated at 30°C for 30 min. The recovered cells were then suspended in top agar kept at 55°C and poured onto synthetically defined minus Trp plates. Buffer and media formulations are detailed in the Supporting Information (Supplementary File SI 1 Section S1). Successful assemblies were identified either by diagnostic digests of RCA products or multiplex PCR across assembly junctions (Supplementary File SI 1 Section S3.6).

Rolling circle amplification of DNA from yeast artificial chromosomes

Adapting the protocol by Hutchison et al. (5), yeast centromeric plasmid DNA was isolated from Saccharomyces cerevisiae using a modified miniprep workflow. Overnight 5 ml cultures of yeast were suspended in 250 μl P1 buffer (Qiagen) containing 250 μg Zymolyase-20T (US Biological). After incubation at 37°C for 1 h, 250 μl P2 buffer and 250 μl P3 buffer (Qiagen) were added sequentially. The plasmid DNA in the supernatant was precipitated with 1 ml isopropanol, washed with 1 ml of 70% ethanol, and the resulting DNA pellet was air-dried. The plasmid DNA was then dissolved in 40 μl Tris–EDTA buffer pH 8. Rolling circle amplification was conducted using reagents from the TempliPhi Large Construct Kit (GE Healthcare). 2 μl of plasmid DNA solution was added to 15 μl of sample buffer and heated at 95°C for 3 min. Upon cooling, 15 μl of reaction buffer, 1 μl enzyme and 2 μl of 10 mM dNTPs (New England Biolabs) were added, and the reaction was incubated for 24 h at 30°C. To release the recoded DNA as a linear construct, the amplified DNA was diluted with 40 μl water, then 8 μl of 10× FastDigest buffer and 2 μl LguI restriction enzyme (Thermo Fisher) were added. After 1 h at 37°C, the linearized DNA was precipitated with ethanol and redissolved in 40 μl Tris–EDTA buffer pH 8.

Integration into the Salmonella genome

Overnight cultures of S. typhimurium were diluted 1:100 into a 50 ml volume in LB containing 10 mM l-arabinose (Sigma-Aldrich). When an OD600 of 0.5 was reached, cells were washed twice with 20 ml and once with 1 ml of ice-cold 10% glycerol, then resuspended in 250 μl of 10% glycerol. 50 μl of the competent cell suspension was electroporated with the appropriate DNA construct (100 ng per kb of construct, Bio-Rad MicroPulser using the Ec2 setting) in a 0.2 cm cuvette (USA Scientific). After 5 h outgrowth at 37°C in 1 ml SOC medium, 250 μl of cells was plated onto antibiotic selection plates. Typically, at least 50 colonies were patched onto double antibiotic selection plates (chloramphenicol and kanamycin) to determine which colonies contained the correct markers. See the Supporting Information for further details about the SIRCAS protocol (Supplementary File SI 1, Section S4).

Conjugative assembly of recoded genomic regions

Donor strains were prepared by integrating a cassette containing spectinomycin resistance and the origin of transfer from RK2 plasmid (24) upstream of recoded region A by recombineering, and transforming with plasmid pTA-Mob (25) containing necessary parABCDE (‘partitioning’) genes. Recipient strains were prepared by integrating an ampicillin resistance cassette at the start of the recoding region to be transferred from the donor. Conjugation experiments were based on a protocol previously described by Ma, Moonan and Isaacs (26). Donor and recipient strains were grown in separate 30 ml cultures, inoculated from overnight cultures by 1:100 dilution. The recipient strain lambda red system was induced with arabinose to enhance homologous recombination. Cultures were grown to OD600 ∼0.5, centrifuged and resuspended in LB to OD600 ∼15. Donor and recipient cells were mixed in 80–120 μl aliquots and pipetted as 10–20 μl spots on LB plates, incubated at 30°C for 1–2 h, then washed and plated as 250 μl of a 1:100 dilution on selection plates. Candidates were patched out on multiple selection plates to check for correct phenotype and incubated at 37°C.

Next-generation sequencing of recoded strains

Salmonella genomic DNA was prepared using a DNeasy Blood and Tissue kit (Qiagen) and quantified using a Qubit 2.0 fluorometer (Thermo Fisher). DNA libraries for next-generation sequencing were prepared using a Nextera XT kit (Illumina) with size selection using SPRIselect beads (Beckman-Coulter) according to the manufacturers' protocols. Sequencing was conducted on a MiSeq platform using either a MiSeq or MiSeq Nano reagent kit v2 (Illumina), running 300 cycles with paired ends. Reads were trimmed and aligned to a reference genome using the Geneious software package (version 9.1.5) (27). Further details can be found in Supplementary File SI 1 Section S6.

Measurement of recoded Salmonella growth rates

Cells were inoculated into 200 μl LB in a 96-well clear flat-bottom plate (Corning) and incubated overnight at 37°C with continuous shaking. From these overnight stationary cultures, 1 μl was added to 200 μl fresh LB in a new 96-well plate, and incubated at 37°C with continuous shaking in a plate reader (BioTek Synergy HT). OD630 measurements were taken every 10 min for at least 16 h. Measurements were performed in technical triplicates, and at least three biological replicates were obtained for each sample. Errors are calculated as the standard error of the mean between the biological replicates. Doubling times were determined in a similar manner to Lajoie et al. (2), based on linear regression of ln (OD630) using five adjacent timepoints (40 min). The doubling time was calculated by td = ln (2)/m, where m is the maximum gradient of ln(OD630) as determined by the linear regression analysis. Additional growth data can be found in Supplementary File SI 1, Section S7.1. For strains B2, B3 and A13-B3, cells were inoculated into 5 ml LB cultures overnight. From these overnight stationary cultures, 250 μl was added to 25 ml fresh LB in a 250 ml unbaffled Erlenmeyer flask, and grown in a Multitron Standard (INFORS HT) shaking incubator at 37°C and 200 rpm. OD600 measurements were taken every 15–25 min during early exponential phase on an Ultrospec 10 (Amersham Biosciences), removing an aliquot of 700 μl each time. Doubling times were calculated using the same procedure except using three adjacent timepoints which had OD values lying between 0.05 and 0.75. Colony forming unit (CFU) measurements were performed by counting colonies from appropriate dilutions on LB plates. As a control, growth curves for wild-type LT2 were obtained in both plate reader and Erlenmeyer flask format, and the doubling times were found to be consistent with each other.
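The doubling-time calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the function names `max_ln_od_slope` and `doubling_time` and the synthetic growth curve are hypothetical, and an ordinary least-squares fit over each window of adjacent timepoints is assumed.

```python
import math

def max_ln_od_slope(times_min, ods, window=5):
    """Largest least-squares slope of ln(OD) over any `window` adjacent timepoints."""
    if len(times_min) != len(ods) or len(ods) < window:
        raise ValueError("need matching lists with at least `window` points")
    ln_od = [math.log(v) for v in ods]
    best = float("-inf")
    for i in range(len(ods) - window + 1):
        t = times_min[i:i + window]
        y = ln_od[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((a - t_mean) ** 2 for a in t)
        sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        best = max(best, sxy / sxx)  # slope m of this window
    return best

def doubling_time(times_min, ods, window=5):
    """td = ln(2) / m, where m is the maximum gradient of ln(OD)."""
    return math.log(2) / max_ln_od_slope(times_min, ods, window)

# Synthetic example (hypothetical data): readings every 10 min from a culture
# that doubles exactly every 30 min.
times = [10 * k for k in range(12)]
ods = [0.01 * 2 ** (t / 30) for t in times]
td = doubling_time(times, ods)
```

With readings every 10 min, a window of five points spans 40 min, matching the plate-reader analysis; passing `window=3` would correspond to the flask measurements.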

RESULTS

Computational design of a leucine-recoded S. typhimurium genome

A fully recoded S. typhimurium genome was generated computationally (Figure 1 and Supplementary File SI 1 Section S2), replacing all 33 229 instances of leucine codons TTA and TTG in open reading frames (28) with synonymous CTA and CTG codons respectively. Leucine codons were targeted because of their high frequency of occurrence throughout the genome (Supplementary File SI 1, Section S2.2), and similar to previous recoding efforts (3,7), the two specific codons were chosen because their corresponding tRNA anticodons are not involved in decoding the remaining four leucine codons (29). This recoding scheme also minimizes the number of base pair changes, requiring only a single T to C base change for each codon. In cases where target codons were present in overlapping reading frames, synonymous changes in both proteins were made where possible. In 49 instances where synonymous mutations were not possible, point mutations were chosen to minimize the impact on biological function as predicted by PROVEAN (30), an alignment-based tool for predicting the effect of amino acid substitutions and indels (Supplementary File SI 1 Table S2.3.1). There was one specific instance in which overlapping genes were split due to the presence of four target codons within the overlap (STM0521 and STM0522, see Supplementary File SI 1 Section S2.3). In addition to codon changes, ∼400 kb of pseudogenes, mobile elements and pathogenicity islands were removed to reduce genetic instability, genome size and pathogenicity (28) (Supplementary File SI 1 Table S2.4.1). Finally, 754 instances of the recognition site for restriction enzyme LguI (5΄-GCTCTTC/GAAGAGC-3΄) were removed to facilitate downstream cloning.
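The core codon substitution can be illustrated with a short Python sketch. It is a simplified assumption-laden example, not the design pipeline used in the study: the function name `recode_orf` and the toy sequences are hypothetical, and it handles only a single open reading frame read in frame, ignoring overlapping reading frames, the PROVEAN-guided exceptions and LguI site removal described above.

```python
# TTA -> CTA and TTG -> CTG: each replacement is a single T-to-C change
# at the first codon position and keeps the encoded leucine.
RECODE = {"TTA": "CTA", "TTG": "CTG"}

def recode_orf(orf: str) -> tuple[str, int]:
    """Return the recoded ORF and the number of codons changed.

    Only in-frame codons are inspected, so a TTA or TTG that spans two
    codons is left untouched.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    n_changed = sum(codon in RECODE for codon in codons)
    recoded = "".join(RECODE.get(codon, codon) for codon in codons)
    return recoded, n_changed

# Toy ORF (hypothetical): ATG TTA TTG CTT TAA
recoded, n_changed = recode_orf("ATGTTATTGCTTTAA")

# Out-of-frame TTA (A-TT|A-GC) is not a leucine codon here and is not changed.
unchanged, n_none = recode_orf("ATGATTAGCTAA")
```

Working codon by codon rather than by plain text search is the important design choice: a string replacement of every "TTA" would alter bases that do not encode leucine in the reading frame.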

Construction of an S. typhimurium recombineering strain

We used a recombineering approach based on the lambda red system for increasing integration efficiency (31–33). Recombineering is routinely used in S. typhimurium and other bacteria for small insertions and knockouts (34), but the efficiency decreases with increasing DNA insert size (35). Although the use of endonucleases in tandem with recombineering has been shown to improve integration efficiency (7 kb with I-SceI (35) and 100 kb insertions with Cas9 (7)), we postulated that these extra accessory proteins are not necessary to achieve practical rates of integration sufficient for genome recoding, even when working with large DNA segments. To generate a stable recombineering strain of S. typhimurium, we generated a construct containing the lambda red genes under arabinose-inducible control, a gentamicin resistance cassette, and homology arms for replacing the hsd region of the genome (Figure 1 and Supplementary File SI 1, Section S3.5). Using an S. typhimurium strain containing the recombineering plasmid pKD46 (32,36), the Para-lambda red cassette was successfully integrated into the hsd locus, simultaneously removing the native hsd restriction system (37) which could otherwise impede transformation. Curing the pKD46 plasmid at elevated temperature resulted in the final restriction-deficient strain containing an integrated recombineering element (IRE).

Preparation of large recoded DNA segments via rolling circle amplification

From the in silico genome design, sixteen recoded segments were constructed (A1–A13 and B1–B3, Figure 1 and Supplementary File SI 3) constituting two separate arbitrary regions of the recoded genome (Regions A and B). Each segment contained 10–25 kb of recoded DNA, a selection marker, and 1 kb flanking homology regions for integration. The 10–25 kb size range was chosen to simplify construction, decrease the likelihood of unwanted internal recombination events, and minimize the cost of fixing an error in any one segment. Each segment was assembled in yeast (23) from commercially synthesized 2–4 kb DNA fragments (Figure 2A). The cloned assemblies in yeast were checked by diagnostic digests, but were otherwise used immediately without further sequencing. To assess the utility of different commercial DNA sources, a mixture of 156 kb of clonal sequence-verified DNA and 44 kb of non-clonal DNA was used for constructing the segments (Supplementary File SI 2, Table S1). Clonal DNA refers to sequence-verified constructs propagated through a bacterial host, while non-clonal DNA constructs are synthesized completely in vitro in a faster and cheaper process to generate a heterogeneous pool containing errors derived from chemical synthesis. SIRCAS uses a marker swapping approach alternating between chloramphenicol and kanamycin selection (Figure 2B) for a simple phenotypic readout, similar to the strategy for building chromosomes in S. cerevisiae (4,6) and combining genomes in B. subtilis (38).

Figure 2.

(A) Large 10–25 kb recoded DNA segments were created by assembling short synthetic DNA fragments into a YAC, followed by rolling circle amplification and linearization. Each recoded segment contains a selection marker (M1 or M2, typically kanamycin or chloramphenicol resistance cassettes) and flanking homology regions for integration. (B) Accumulated genome recoding by SIRCAS involves iterative recombination of recoded DNA segments. After each step, one selection marker is gained while the other is lost, providing a readout for successful recoding.

We bypassed the use of bacterial plasmids, site-specific enzymes (e.g. site-specific integrases, Cas9), and associated cloning steps by amplifying each recoded segment directly from yeast using rolling circle amplification (39). RCA utilizes the bacteriophage φ29 DNA polymerase and random hexamers to selectively amplify circular DNA. Linearizing the resulting concatemer by LguI digestion gave microgram quantities of DNA ready for direct integration (Figure 2B). Carrying each segment on a bacterial plasmid would otherwise have required additional negative selection against the plasmid backbone to distinguish between the desired integration event versus the plasmid existing as an extrachromosomal replicative element (3,7). Propagation through a bacterial cloning host may also lead to toxicity issues arising from unintended expression of Salmonella genes.

Accumulated genome recoding by SIRCAS

We successfully performed SIRCAS on all sixteen recoded DNA segments to accumulate recoding in two independent S. typhimurium strains, one recoded from segment A1 to A13 and the other from B1 to B3. Each round of SIRCAS required only two days to complete. In a typical round of SIRCAS, hundreds of marker-positive colonies were obtained after recombineering (Figure 3). No colonies were obtained in negative controls in which arabinose or DNA was omitted. Upon screening for loss of the previous marker, a median of 14% of colonies was found to have correctly swapped markers, while the remaining colonies incorrectly contained both selection markers (Figure 3B and Supplementary File SI 1, Table S4.2.1). This frequency is higher than naively predicted based on lengths of DNA in the distal and central segments, but is expected based on the results of Matic et al. (40), who found that mismatches in recombining DNA can significantly suppress the isolation of recombinants. The rate of obtaining the correct marker phenotype ranged from 3–41% (Figure 3C), presumably due to differences in marker integration locus (41), as well as the size and content of the incoming recoded DNA. To check whether recombinants with the correct markers were missing internal recoding due to multiple crossovers (Figure 3A), selected colonies with the correct phenotype were Sanger sequenced at one location in the middle of each newly recoded segment before progressing to the next round of SIRCAS. Correct recoding was observed in 83% of all sequenced colonies with the correct marker phenotype over sixteen rounds of SIRCAS, indicating that multiple internal crossovers do occur but are relatively uncommon (Supplementary File SI 1 Table S4.3.1). This inference was supported by whole-genome sequencing of the final organism containing all 200 kb of recoded segments.
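The marker-swap readout can be expressed as a small classification step. The sketch below is illustrative only: the function names `classify_colony` and `swap_rate` and the class labels are hypothetical, and the input is assumed to be a record of whether each patched colony grows on M1 plates and on M1+M2 double-selection plates. The example numbers are those given for segment A13 in Figure 3B (9 correct colonies out of 60 screened).

```python
from collections import Counter

def classify_colony(grows_on_m1: bool, grows_on_m1_m2: bool) -> str:
    """Assign a patched colony to a phenotype class from its growth on two plates."""
    if not grows_on_m1:
        return "no new marker"
    if grows_on_m1_m2:
        return "both markers"   # partial integration: previous marker retained
    return "marker swap"        # +M1/-M2, candidate for sequencing

def swap_rate(colonies):
    """Return (fraction with the +M1/-M2 phenotype, per-class counts)."""
    counts = Counter(classify_colony(m1, both) for m1, both in colonies)
    total = sum(counts.values())
    return counts["marker swap"] / total, counts

# Segment A13 example: 60 colonies patched, 9 fail to grow on M1+M2 plates.
colonies = [(True, False)] * 9 + [(True, True)] * 51
rate, counts = swap_rate(colonies)
print(f"correct marker phenotype: {rate:.0%}")
```

A colony in the "marker swap" class is only a candidate: as described above, internal crossovers can still leave recoding missing, which is why selected colonies were also Sanger sequenced before the next round.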

Figure 3.

Screening for stepwise replacement of genomic segments. (A) Three possibilities after recombineering and selection for marker M1 are full integration of the entire segment leading to marker swapping, partial integration leading to the presence of both markers, or internal crossovers leading to missing recoding. (B) Hundreds of transformants were typically obtained on M1 selection plates (recoding of segment A13 shown here). On average, one in six colonies (16%) had the correct phenotype +M1/–M2, as identified by screening for no growth on M1+M2 plates. In this instance, 9 out of 60 colonies were correct (each marked with a red dot). (C) The rate of correct marker phenotype varied between different rounds of SIRCAS.

Hierarchical assembly of SIRCAS-recoded regions by conjugation

After complete recoding of regions A and B by SIRCAS in two separate strains, the two regions of recoding were consolidated into one strain by conjugative assembly (42). An origin of transfer from the Gram-negative bacterial RK2 plasmid, which directs RK2-mediated conjugal transfer of DNA to a broad range of hosts (24), was integrated upstream of the recoded region of strain A13 by recombineering using an associated spectinomycin resistance cassette (Supplementary File SI 1 Figure S5.2.1). This donor strain was transformed with the 52.7 kb conjugation plasmid pTA-Mob (25), which contains the RK2 parABCDE (‘partitioning’) genes necessary for conjugation. The B3-based recipient strain was prepared by integrating an ampicillin resistance cassette at the start of region A such that upon conjugation, the resistance marker should be lost. Conjugation between these strains resulted in the transfer of the entire 154 kb recoded segment from the donor strain to the recipient, resulting in an exconjugant strain A13-B3 containing a total of 200 kb of recoded genomic DNA. Conjugation success rate as determined by ampicillin marker loss was 40% (17 out of 42 colonies screened).

Characterization of accumulated recoding in S. typhimurium

In the final strain (designated A13-B3), recoding throughout both targeted regions spanning 200 kb of the genome was confirmed by next-generation sequencing, with a total of 1557 leucine codon changes successfully installed. An additional five point substitutions, one single point deletion and one single point insertion were found across 156 kb of the recoded regions constructed using clonal sequence-verified DNA, constituting an error rate of approximately 1 in 20 000. This number is consistent with the error rate of RCA (5) (approximately 1 in 50 000). In contrast, a total of 51 errors were found across the 44 kb of recoded regions constructed using non-clonal DNA, for an approximate error rate of 1 in 860 (Supplementary File SI 1 Section S6.3 and Supplementary File SI 2 Tab S7). The majority of these are single base deletions, consistent with expected errors in the synthesis of the chemical oligonucleotide precursors used to generate non-clonal DNA, which occur at a rate of better than 1 in 200 (43,44). Despite these errors (Supplementary File SI 2 Tab S8), the final strain A13-B3 was still viable and no other instances of incorrect leucine codon recoding were found. Growth rates of all recoded strains were measured to analyze the impact of accumulated recoding on the viability of S. typhimurium. Despite the vast number of changes introduced as part of the recoding process, no major cumulative growth defect was observed over the sixteen rounds of SIRCAS, demonstrating that bacterial genomes appear to be amenable towards large-scale rational re-engineering. No downward trend in fitness was observed across the entire 200 kb of recoding, with comparable doubling times measured across all strains at 37°C in LB (Figure 4A). Similar trends in growth behavior were also seen with growth at both 30°C and 42°C (Supplementary File SI 1 Figure S7.1.3). 
Sequencing also did not reveal any clear compensatory mutations in non-recoded regions of the genome which might suppress any defects (Supplementary File SI 2 Table S9).
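The approximate error rates quoted above follow directly from the error counts and region sizes. The short Python check below reproduces that arithmetic; the helper name `bases_per_error` is hypothetical, and the inputs are the values stated in the text (seven errors in 156 kb of clonal DNA, 51 errors in 44 kb of non-clonal DNA).

```python
def bases_per_error(n_errors: int, region_kb: float) -> float:
    """Average number of bases per observed error in a region of `region_kb` kb."""
    return region_kb * 1000 / n_errors

# Clonal, sequence-verified DNA: 5 substitutions + 1 deletion + 1 insertion in 156 kb.
clonal = bases_per_error(5 + 1 + 1, 156)   # roughly 1 error per 22 000 bases

# Non-clonal DNA: 51 errors in 44 kb.
non_clonal = bases_per_error(51, 44)       # roughly 1 error per 860 bases

print(f"clonal: 1 in {clonal:,.0f}; non-clonal: 1 in {non_clonal:,.0f}")
```

On these figures the non-clonal regions carry errors roughly 25 times more densely than the clonal regions, which is the basis for the later preference for clonal DNA.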

Figure 4.

(A) Doubling times of various recoded S. typhimurium strains growing at 37°C in LB. Each data point is the average of three technical replicates, and the error bars represent the standard deviation of three biological replicates. No major fitness defect was observed throughout the recoding process. The strain nomenclature ‘A_’ refers to the strain containing all recoded segments from A1 up to A_. We note that doubling times for B2, B3 and A13-B3 were calculated from growth curves conducted in batch cultures grown in flasks due to sedimentation, while the remaining data was obtained on a plate reader. (B) Uneven sedimentation of cells is observed when strains B2, B3 and A13-B3 are grown in a plate reader (wells in right column). (C) Growth data for B2, B3 and A13-B3 was obtained by growth in culture flasks rather than in 96-well plates to avoid sedimentation artifacts. Wild-type LT2 was used as a control and had comparable doubling times in both plate and flask growth (see Supplementary File SI 1 Figure S7.1.4).

Minor phenotypic changes were observed at one step of the recoding process. When grown in a plate reader, cells containing segment B2 (i.e. strains B2, B3 and A13-B3) were observed to sediment unevenly (Figure 4B), resulting in anomalous growth curves. Growth experiments repeated in Erlenmeyer flasks showed no major differences in doubling time (Figure 4A). Although final optical density measurements differed between strains, growth as measured by colony forming units was consistent between wild-type and recoded strains B2, B3 and A13-B3 (Figure 4C). The sedimentation effect may be due to changes in the cell surface arising from non-clonal DNA errors in regions responsible for O-antigen biosynthesis (Supplementary File SI 1 Figure S6.3.1). No stepwise accumulation of growth defects was observed upon recoding all other segments.

DISCUSSION

We have developed SIRCAS (stepwise integration of rolling circle amplified segments) as a rapid method for making genomic changes that bypasses the requirement for site-specific enzymes. Applying SIRCAS to recoding of the S. typhimurium genome, we installed 1557 TTA/TTG to CTA/CTG leucine codon changes across 200 kb of genomic DNA in a single strain. A key aspect of SIRCAS is the use of rolling circle amplification to produce DNA of sufficient quality and quantity for transformation. We found that recombineering with extended homology regions was effective on 10–25 kb linear DNA segments, without needing assistance from accessory proteins such as endonucleases. Sixteen cycles of integration into the genome were achieved efficiently, with each integration step requiring two days to complete. The recoded strains did not show any major fitness defects when compared to the wild-type strain, demonstrating the remarkable plasticity of bacterial genomes. Overall, this work paves the way for creating a completely recoded Salmonella as a genetically isolated chassis for downstream therapeutic applications.

The SIRCAS approach can be carried out in parallel in many strains for more rapid construction. In this study, we maintained two different recoded strains and unified the recoded regions by conjugation. For completing the construction of a fully recoded S. typhimurium genome, we anticipate recoding 8–16 strains in parallel, followed by hierarchical assembly via conjugation. Recoding 20 kb every two days, it is theoretically possible to rewrite 300 kb of the genome per strain in a month.

The Salmonella genome tolerated a vast number of designed codon changes, suggesting that recoding the entire genome of Salmonella will be feasible. This result is consistent with similar recoding studies in E. coli (3,7). Design flaws may affect viability, and troubleshooting may be a significant challenge when constructing a complete recoded genome. As observed in the ‘minimal genome’ work by Venter et al. (5), synthetic lethal gene pairs may become apparent as more recoding is accumulated in a single strain. In these cases, the piecewise nature of the SIRCAS strategy may prevent the process of redesigning problematic pieces from becoming a bottleneck. Further understanding of the principles governing codon choice will also help improve our existing designs (15,16).

There are currently few alternative methods for large-scale genomic engineering. Multiplex automated genome engineering (45) and multiplexed CRISPR/Cas9 systems (46) are not sufficiently high throughput to install the large number of designed changes required for leucine recoding.

Church and co-workers used site-specific integrases for testing a 57-codon E. coli genome design (3), integrating 10 out of 87 possible 50 kb recoded segments into independent strains. Site-specific integrases are an effective means for testing individual recoded segments, as there is less opportunity for internal crossover events when compared to homologous recombination. However, unification of recoded segments into a single strain is more challenging. Integrase-based methods require the installation and subsequent removal of landing pads for each integration step. The integration event itself also requires several steps, starting with introduction of the recoded segment on a plasmid, removal of the wild-type genomic segment, integration of the recoded segment, and finally removal of the plasmid. This multi-step approach compromises the efficiency of the construction process when compared to the single-step integration of SIRCAS.

The 20 kb size range may be an optimal compromise between the number of integration steps and ease of construction and troubleshooting. Chin and co-workers have combined Cas9 and recombineering to insert over 100 kb of DNA into the E. coli genome (7). The major advantage of using large 100 kb constructs is that fewer rounds of integration are required to achieve complete genome recoding when compared to the 10–25 kb segments we used for SIRCAS. Conversely, there is a greater likelihood of incomplete recoding due to multiple internal crossover events with increasing size. Larger constructs are also more difficult to assemble from small fragments (efficiency drops as more fragments are included in the assembly (23)) and more prone to shearing when handled. Furthermore, finding the causative mutations when recoding changes impact fitness becomes more challenging with increasing size. A noteworthy troubleshooting method was described by Chin and co-workers, using a sequencing approach to pinpoint where single lethal mutations arise. However, this method was only demonstrated for recoding a 20 kb region of the E. coli genome, and may be more problematic in segments with more than one lethal mutation.

The overall error rates of SIRCAS are competitive with those of other genome recoding methods. Within the 156 kb of recoded regions that were written with clonal sequence-verified DNA, an overall error rate of 1 in 20 000 was observed (7 point mutations and one leucine codon reversion in 156 kb). In comparison, Church and co-workers (3) reported an error rate of 1 in 5000 (an average of 9.7 mutations and 0.6 codon reversions in 50 kb). For the 44 kb of recoded regions written using non-clonal DNA, an overall error rate of 1 in 860 was found (51 errors, primarily single point deletions). Use of clonal DNA for SIRCAS was therefore preferable, although issues associated with construct errors in non-clonal DNA may become less significant in the future as DNA synthesis and error-correction methods continue to improve.

Beyond genome recoding, SIRCAS is a powerful enabling technology for building and testing large de novo designs that span hundreds of genes. SIRCAS is not limited to Salmonella, and could be used in any recombineering-competent host to integrate large naturally occurring or de novo designed gene clusters, facilitating the development of new industrial production strains. SIRCAS also enables interrogation of bacterial genetics on a scale that is not possible with traditional techniques. By making genome-scale construction more accessible, SIRCAS will help to empirically elucidate the underlying principles for completely de novo genome design.
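The error-rate and throughput figures quoted in this Discussion follow from simple arithmetic on the reported counts. The short Python sketch below reproduces them as a sanity check. It is illustrative only: the function names `one_error_per_bp` and `kb_per_month` are hypothetical helpers introduced here, the 30-day month is an assumption, and the inputs are the values stated in the text (8 errors in 156 kb of clonal DNA, 51 errors in 44 kb of non-clonal DNA, an average of 9.7 mutations plus 0.6 reversions per 50 kb segment for the integrase-based study, and 20 kb recoded per two-day cycle).

```python
def one_error_per_bp(errors: float, length_bp: int) -> float:
    """Return N such that the observed rate is one error per N base pairs."""
    return length_bp / errors


def kb_per_month(kb_per_cycle: int = 20, days_per_cycle: int = 2, days: int = 30) -> int:
    """Theoretical kb recoded per strain, assuming back-to-back cycles
    and a 30-day month (an assumption, not stated in the text)."""
    return kb_per_cycle * (days // days_per_cycle)


# Clonal, sequence-verified DNA: 7 point mutations + 1 codon reversion in 156 kb.
clonal = one_error_per_bp(7 + 1, 156_000)

# Non-clonal DNA: 51 errors in 44 kb.
non_clonal = one_error_per_bp(51, 44_000)

# Integrase-based comparison: average of 9.7 mutations + 0.6 reversions per 50 kb.
integrase = one_error_per_bp(9.7 + 0.6, 50_000)

print(f"clonal:     1 in {clonal:,.0f} bp")
print(f"non-clonal: 1 in {non_clonal:,.0f} bp")
print(f"integrase:  1 in {integrase:,.0f} bp")
print(f"throughput: {kb_per_month()} kb per strain per month")
```

The results round to the figures quoted in the text: roughly 1 in 20 000 for clonal DNA, 1 in 860 for non-clonal DNA, 1 in 5000 for the integrase-based study, and 300 kb per strain per month.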

46 in total

1. Combining two genomes in one cell: stable cloning of the Synechocystis PCC6803 genome in the Bacillus subtilis 168 genome.

Authors: Mitsuhiro Itaya; Kenji Tsuge; Maki Koizumi; Kyoko Fujita
Journal: Proc Natl Acad Sci U S A Date: 2005-10-18 Impact factor: 11.205

Review 2. Universal rules and idiosyncratic features in tRNA identity.

Authors: R Giegé; M Sissler; C Florentz
Journal: Nucleic Acids Res Date: 1998-11-15 Impact factor: 16.971

3. Programmable bacteria detect and record an environmental signal in the mammalian gut.

Authors: Jonathan W Kotula; S Jordan Kerns; Lev A Shaket; Layla Siraj; James J Collins; Jeffrey C Way; Pamela A Silver
Journal: Proc Natl Acad Sci U S A Date: 2014-03-17 Impact factor: 11.205

4. GENOME ENGINEERING. The Genome Project-Write.

Authors: Jef D Boeke; George Church; Andrew Hessel; Nancy J Kelley; Adam Arkin; Yizhi Cai; Rob Carlson; Aravinda Chakravarti; Virginia W Cornish; Liam Holt; Farren J Isaacs; Todd Kuiken; Marc Lajoie; Tracy Lessor; Jeantine Lunshof; Matthew T Maurano; Leslie A Mitchell; Jasper Rine; Susan Rosser; Neville E Sanjana; Pamela A Silver; David Valle; Harris Wang; Jeffrey C Way; Luhan Yang
Journal: Science Date: 2016-06-02 Impact factor: 47.728

5. Interspecies gene exchange in bacteria: the role of SOS and mismatch repair systems in evolution of species.

Authors: I Matic; C Rayssiguier; M Radman
Journal: Cell Date: 1995-02-10 Impact factor: 41.582

6. Precise manipulation of bacterial chromosomes by conjugative assembly genome engineering.

Authors: Natalie J Ma; Daniel W Moonan; Farren J Isaacs
Journal: Nat Protoc Date: 2014-09-04 Impact factor: 13.491

7. Programming cells by multiplex genome engineering and accelerated evolution.

Authors: Harris H Wang; Farren J Isaacs; Peter A Carr; Zachary Z Sun; George Xu; Craig R Forest; George M Church
Journal: Nature Date: 2009-07-26 Impact factor: 49.962

8. Genomic Recoding Broadly Obstructs the Propagation of Horizontally Transferred Genetic Elements.

Authors: Natalie Jing Ma; Farren J Isaacs
Journal: Cell Syst Date: 2016-07-14 Impact factor: 10.304

9. DNA restriction and modification systems in Salmonella. I. SA and SB, two Salmonella typhimurium systems determined by genes with a chromosomal location comparable to that of the Escherichia coli hsd genes.

Authors: C Colson; A Van Pel
Journal: Mol Gen Genet Date: 1974-04-03

10. Recoded organisms engineered to depend on synthetic amino acids.

Authors: Alexis J Rovner; Adrian D Haimovich; Spencer R Katz; Zhe Li; Michael W Grome; Brandon M Gassaway; Miriam Amiram; Jaymin R Patel; Ryan R Gallagher; Jesse Rinehart; Farren J Isaacs
Journal: Nature Date: 2015-01-21 Impact factor: 49.962
