Wen Wan, Lulu Li, Qianqian Xu, Zhefan Wang, Yuan Yao, Rongliang Wang, Jia Zhang, Haiyan Liu, Xiaolian Gao, Jiong Hong.
Abstract
The development of economical de novo gene synthesis methods using microchip-synthesized oligonucleotides has been limited by their high error rates. In this study, a low-cost, effective and improved-throughput (up to 32 oligos per run) error-removal method using an immobilized cellulose column containing the mismatch binding protein MutS was developed to generate high-quality DNA from oligos, particularly microchip-synthesized oligonucleotides. Error-containing DNA in the initial material was specifically retained on the MutS-immobilized cellulose column (MICC), and error-depleted DNA in the eluate was collected for downstream gene assembly. Significantly, this method increased the proportion of functional (fluorescent) clones of a synthetic enhanced green fluorescent protein gene (720 bp) from 0.93% to 83.22%, corresponding to a decrease in the error frequency of the synthetic gene from 11.44/kb to 0.46/kb. In addition, a parallel multiplex MICC error-removal strategy was evaluated in assembling 11 genes encoding ∼21 kb of DNA from 893 oligos. The error frequency was reduced by 21.59-fold (from 14.25/kb to 0.66/kb), resulting in a 24.48-fold increase in the percentage of error-free assembled fragments (from 3.23% to 79.07%). Furthermore, the standard MICC error-removal process could be completed within 1.5 h at a cost as low as $0.374 per MICC.
Year: 2014 PMID: 24829454 PMCID: PMC4081059 DOI: 10.1093/nar/gku405
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The effectiveness of different error-removal methods on CPG-oligos
| Method<sup>a</sup> | Tool<sup>b</sup> | Error correction stage (bp) | Analyzed DNA length (nt) | Error frequency before correction (errors per kb) | Error-free DNA ratio before correction (%) | Error frequency after correction (errors per kb) (fold) | Error-free DNA ratio after correction (%) (fold) | Ref. |
|---|---|---|---|---|---|---|---|---|
| Function selection | Synthetic ORF selection vector | Assembled DNA | 717 | NA<sup>c</sup> | 28%<sup>d</sup> | NA<sup>c</sup> | 82%<sup>d</sup> (2.93-fold) | ( |
| PAGE | PAGE | Oligos | 717 | NA<sup>c</sup> | 28%<sup>d</sup> | NA<sup>c</sup> | 64%<sup>d</sup> (2.29-fold) | ( |
| EMC | T4 endonuclease VII | Assembled DNA (616) | 616 | 6.52 | 4.10%<sup>d</sup> | 1.62 (4.02-fold) | 46.9%<sup>d</sup> (11.44-fold) | ( |
| EMC | | Assembled DNA (616) | 616 | 6.52 | 4.10%<sup>d</sup> | 1.98 (3.29-fold) | 31%<sup>d</sup> (7.56-fold) | ( |
| MMC | | Assembled DNA (993) | 993 | 1.8 | NA<sup>c</sup> | 0.1 (18-fold) | NA<sup>c</sup> | ( |
| MMC | | Fragmented DNAq (∼150) | 760 | 1.30 | 54%<sup>d</sup> | 0.3 (4.33-fold) | 93%<sup>d</sup> (1.72-fold) | ( |
| MMC | | Fragmented DNAq (∼150) | 760 | 0.98 | 67%<sup>d</sup> | 0.28 (3.50-fold) | 93%<sup>d</sup> (1.39-fold) | ( |

<sup>a</sup>EMC: enzyme-mediated correction; MMC: MutS-mediated correction.
<sup>b</sup>The protein and technology used in the corresponding error correction method.
<sup>c</sup>Not available in the literature.
<sup>d</sup>Percentage of active clones (contains perfect clones).
The effectiveness of different error-removal methods on MCp-oligos
| Method<sup>a</sup> | Tool<sup>b</sup> | Error correction stage (bp) | Analyzed DNA length (nt) | Error frequency before correction (errors/kb) | Error-free DNA ratio before correction (%) | Error frequency after correction (errors/kb) (fold) | Error-free DNA ratio after correction (%) (fold) | Ref. |
|---|---|---|---|---|---|---|---|---|
| PAGE | PAGE | Oligos (70) | 1,755 | 6.29 | NA<sup>c</sup> | 2.20 (2.86-fold) | NA<sup>c</sup> | ( |
| Hybridization | Microchip | Oligos (70) | 297–1,755 | 6.29 | NA<sup>c</sup> | 0.72 (8.74-fold) | NA<sup>c</sup> | ( |
| NGS | Pyrosequencing platform | Oligos (40) | 137–255 | 25 | 3.07% | 0.05 (500-fold) | 84.28% (27.45-fold) | ( |
| EMC | ErrASE | Assembled DNA (779) | 779 | 0.67 | 69.8% | 0.14 (4.79-fold) | 90% (1.29-fold) | ( |
| EMC | ErrASE | Assembled DNA (779) | 779 | 0.88 | 60% | 0.20 (4.40-fold) | 85.7% (1.43-fold) | ( |
| EMC | ErrASE | Assembled DNA (708–777) | 708–777 | ∼4.00 | NA<sup>c</sup> | ∼3.17 (1.26-fold) | 12.50% | ( |
| EMC | ErrASE | Assembled DNA (720–732) | 720–732 | NA<sup>c</sup> | 6.8%–7.5%<sup>d</sup> | NA<sup>c</sup> | 26%–49%<sup>d</sup> (3.82- to 6.53-fold) | ( |
| EMC | Surveyor nuclease | Assembled DNA (678) | 678 | ∼1.9 | 50.20%<sup>d</sup> | ∼0.19 (10-fold) | 84%<sup>d</sup> (1.67-fold) | ( |
| EMC | Surveyor nuclease | Assembled DNA (1134) | 723 | 1.9 | 50.20%<sup>d</sup> | 0.11 (17.27-fold) | 94%<sup>d</sup> (1.87-fold) | ( |
| MMC | MICC | MCp-oligos (69–118) and assembled DNA (258–260) | 720 | 11.44 | 0.93%<sup>d</sup> | 0.46 (24.87-fold) | 83.22%<sup>d</sup> (89.48-fold) | This study |
| MMC | MICC | MCp-oligos (63–129) and assembled DNA (286–456) | 286–456 | 14.25 | 3.23% | 0.66 (21.59-fold) | 79.07% (24.48-fold) | This study |

<sup>a</sup>EMC: enzyme-mediated correction; MMC: MutS-mediated correction.
<sup>b</sup>The primary tools applied in the corresponding error correction technology.
<sup>c</sup>Not available in the literature.
<sup>d</sup>Percentage of active clones (contains perfect clones).
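
The fold values reported in the two tables above are consistent with simple ratios: error frequency before divided by after, and error-free ratio after divided by before. The tables do not state the formula explicitly, so the following Python sketch is an assumption that happens to reproduce the "This study" rows; the function names are hypothetical.

```python
def fold_reduction(before: float, after: float) -> float:
    """Fold decrease of an error frequency (errors/kb): before / after."""
    return before / after


def fold_increase(before: float, after: float) -> float:
    """Fold increase of an error-free (or active-clone) percentage: after / before."""
    return after / before


# Values taken from the two MICC rows ("This study") of the MCp-oligo table.
micc_rows = [
    # (error freq before, error freq after, error-free % before, error-free % after)
    (11.44, 0.46, 0.93, 83.22),
    (14.25, 0.66, 3.23, 79.07),
]

for freq_before, freq_after, pct_before, pct_after in micc_rows:
    print(
        f"error frequency: {fold_reduction(freq_before, freq_after):.2f}-fold lower, "
        f"error-free ratio: {fold_increase(pct_before, pct_after):.2f}-fold higher"
    )
```

Note that the two fold values are not interchangeable: a modest drop in errors per kb can produce a large rise in error-free clones when the starting error-free fraction is close to zero.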
Figure 1. Schematic representation of removal of error-containing oligonucleotides or assembled DNA using a MICC. (a) Microchip-synthesized oligos or assembled DNA fragments are amplified via PCR. (b) Amplified oligos or DNA fragments are re-annealed to expose errors. (c) Re-annealed oligos or DNA fragments are loaded onto a MICC. (d) After elution, the error-containing oligos or DNA fragments are retained on the column, and the error-free oligos or DNA fragments elute through the column and are collected. (e) The collected error-free oligos or DNA fragments are amplified via PCR to generate additional material for subsequent applications.
Figure 2. Schematic representation of error removal of microchip-synthesized oligonucleotides during de novo gene synthesis. (a–c) Oligos are synthesized, cleaved and amplified. Specific primers (black, purple or yellow) are added to separate the oligo pool into subpools via PCR. (d) The oligos are re-annealed to expose synthetic errors, such as mismatches (black dot). (e) Errors are removed using a MICC. Each subpool is eluted through one MICC. (f) The error-depleted subpools are amplified separately. (g) Primers are removed. (h) The DNA is assembled.
Figure 3. Evaluation of the error-removal ability of various MICCs. (a) Functional analysis of the synthesized egfp gene. The ratio of ‘fluorescent clones’ to ‘analyzed clones’ is calculated as described in the manuscript for a series of assays with or without error removal using a MICC. (b) Sequencing analysis of the synthesized egfp gene. The error frequencies of synthesized genes were analyzed as described in the text for the synthesized fragments and genes with or without error removal using a MICC. Then, the occurrence of different types of errors was counted, and the error frequency (errors per kb) of various error-removal protocols was calculated as the ratio of each error to the total bases analyzed. t, tMICC; e, eMICC; et, etMICC; O, one round of error removal at the oligo stage; O+F, two rounds of error removal at both the oligo and fragment stages.
Error analysis of synthesized egfp gene sequences with or without MICC-mediated error removal
| Error type | Untreated | tMICC, one-round<sup>a</sup> | tMICC, two-round<sup>b</sup> | eMICC, one-round<sup>a</sup> | eMICC, two-round<sup>b</sup> | etMICC, one-round<sup>a</sup> | etMICC, two-round<sup>b</sup> |
|---|---|---|---|---|---|---|---|
| Multi-error<sup>c</sup> | 4 | 0 | 0 | 2 | 0 | 0 | 0 |
| Deletion | 24 | 11 | 5 | 7 | 4 | 8 | 2 |
| Deletion: A | 5 | 1 | 0 | 0 | 0 | 0 | 0 |
| Deletion: C | 8 | 2 | 0 | 1 | 0 | 1 | 0 |
| Deletion: T | 6 | 7 | 4 | 5 | 4 | 7 | 2 |
| Deletion: G | 5 | 1 | 1 | 1 | 0 | 0 | 0 |
| Insertion | 6 | 0 | 0 | 0 | 0 | 0 | 0 |
| Insertion: A | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| Insertion: C | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| Insertion: T | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Insertion: G | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Substitution | 39 | 29 | 33 | 3 | 2 | 3 | 1 |
| Substitution: transition | 27 | 13 | 9 | 1 | 2 | 0 | 0 |
| Transition: G/C to A/T | 26 | 9 | 8 | 0 | 2 | 0 | 0 |
| Transition: A/T to G/C | 1 | 4 | 1 | 1 | 0 | 0 | 0 |
| Substitution: transversion | 12 | 16 | 24 | 2 | 0 | 3 | 1 |
| Transversion: G/C to C/G | 2 | 1 | 0 | 0 | 0 | 1 | 0 |
| Transversion: G/C to T/A | 9 | 14 | 21 | 1 | 0 | 1 | 0 |
| Transversion: A/T to C/G | 0 | 1 | 1 | 1 | 0 | 0 | 1 |
| Transversion: A/T to T/A | 1 | 0 | 2 | 0 | 0 | 1 | 0 |
| Total errors | 73 | 40 | 38 | 12 | 6 | 11 | 3 |
| Bases sequenced | 6379 | 7909 | 7915 | 6943 | 9356 | 8054 | 6478 |
| Error frequency (errors per kb) | 11.44 | 5.06 | 4.80 | 1.73 | 0.64 | 1.37 | 0.46 |

<sup>a</sup>One round of error removal at the oligo stage.
<sup>b</sup>Two rounds of error removal at both the oligo and fragment stages.
<sup>c</sup>Error site located in a sequence that contains more than three adjacent consecutive nucleotide errors.
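
The error frequencies in the last row of the egfp table can be recomputed from the "Total errors" and "Bases sequenced" rows as errors per 1000 sequenced bases. A minimal Python sketch follows; the function and dictionary names are hypothetical, and the counts are copied from the table.

```python
def errors_per_kb(total_errors: int, bases_sequenced: int) -> float:
    """Error frequency expressed as errors per 1000 sequenced bases."""
    if bases_sequenced <= 0:
        raise ValueError("bases_sequenced must be positive")
    return 1000.0 * total_errors / bases_sequenced


# (total errors, bases sequenced) for each protocol in the egfp table.
egfp_counts = {
    "Untreated": (73, 6379),
    "tMICC, one-round": (40, 7909),
    "tMICC, two-round": (38, 7915),
    "eMICC, one-round": (12, 6943),
    "eMICC, two-round": (6, 9356),
    "etMICC, one-round": (11, 8054),
    "etMICC, two-round": (3, 6478),
}

for protocol, (errors, bases) in egfp_counts.items():
    print(f"{protocol}: {errors_per_kb(errors, bases):.2f} errors per kb")
```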
Figure 4. Influence of the error rates on de novo gene synthesis. The number of clones that must be sequenced to identify at least one error-free sequence with a high probability (90%) after two rounds of error removal, at both the oligo and fragment stages, using various MICCs.
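
One way to estimate the quantity plotted in Figure 4 is to assume that each sequenced clone is independently error-free with probability p. Then the chance of at least one error-free clone among n is 1 − (1 − p)^n, and the smallest n reaching a confidence c is ⌈ln(1 − c) / ln(1 − p)⌉. The Python sketch below applies this assumption; it is not necessarily the calculation used for the figure, and the function name is hypothetical.

```python
import math


def clones_to_sequence(p_error_free: float, confidence: float = 0.90) -> int:
    """Smallest number of clones n with P(at least one error-free clone) >= confidence.

    Assumes clones are independent and each is error-free with
    probability p_error_free (an illustrative simplification).
    """
    if not 0.0 < p_error_free <= 1.0:
        raise ValueError("p_error_free must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if p_error_free == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_error_free))


# Error-free fragment percentages reported for the multiplex assembly
# (3.23% untreated, 79.07% after two rounds of error removal).
print(clones_to_sequence(0.0323))  # untreated
print(clones_to_sequence(0.7907))  # two rounds of error removal
```

Under this assumption, raising the error-free fraction from a few percent to about 80% cuts the sequencing burden from dozens of clones to two.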
Error analysis of assembled fragment sequences of the sMMO gene cluster and the Epo A, B and C genes with or without error removal using the etMICC
| Error type | Untreated (%<sup>a</sup>) | One-round (%) | Two-round (%) |
|---|---|---|---|
| Multi-error<sup>b</sup> | 6 (1.34%) | 6 (2.63%) | 0 (0.00%) |
| Deletion | 175 (39.06%) | 61 (26.75%) | 3 (16.67%) |
| Insertion | 38 (8.48%) | 11 (4.82%) | 0 (0.00%) |
| Substitution | 229 (51.12%) | 150 (65.79%) | 15 (83.33%) |
| Total errors | 448 | 228 | 18 |
| Bases sequenced | 31,445 | 71,984 | 27,357 |
| Error frequency (errors per kb) | 14.25 | 3.17 | 0.66 |
| Percentage of error-free synthetic fragments or genes<sup>c</sup> (%) | 3.23 | 38.53 | 79.07 |

<sup>a</sup>The ratio of each type of error to the total number of errors.
<sup>b</sup>Error site located in a sequence that contains more than three adjacent consecutive nucleotide errors.
<sup>c</sup>The length of the synthetic fragments or genes was ∼335 bp for the sMMO gene cluster and the Epo A, B and C genes.
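
The percentages in parentheses in the table above are each error type's share of the total errors in that column. A short Python sketch that recomputes them from the raw counts is given below; the function and variable names are hypothetical, and the counts are copied from the table.

```python
from typing import Dict


def error_composition(counts: Dict[str, int]) -> Dict[str, float]:
    """Share of each error type as a percentage of all errors in one column."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no errors to apportion")
    return {error_type: 100.0 * n / total for error_type, n in counts.items()}


# Error counts per type for the three treatment columns.
columns = {
    "Untreated": {"Multi-error": 6, "Deletion": 175, "Insertion": 38, "Substitution": 229},
    "One-round": {"Multi-error": 6, "Deletion": 61, "Insertion": 11, "Substitution": 150},
    "Two-round": {"Multi-error": 0, "Deletion": 3, "Insertion": 0, "Substitution": 15},
}

for name, counts in columns.items():
    shares = error_composition(counts)
    summary = ", ".join(f"{k}: {v:.2f}%" for k, v in shares.items())
    print(f"{name} ({sum(counts.values())} errors): {summary}")
```

The recomputed shares show the pattern visible in the table: deletions and insertions fall faster than substitutions, so substitutions make up a growing share of the remaining errors after each round.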