
Spiked Genes: A Method to Introduce Random Point Nucleotide Mutations Evenly throughout an Entire Gene Using a Complete Set of Spiked Oligonucleotides for the Assembly.

Edson Cárcamo¹, Abigail Roldán-Salgado¹, Joel Osuna¹, Iván Bello-Sanmartin¹, Jorge A Yáñez¹, Gloria Saab-Rincón¹, Héctor Viadiu², Paul Gaytán¹.

Abstract

In vitro mutagenesis methods have revolutionized biological research and the biotechnology industry. In this study, we describe a mutagenesis method based on synthesizing a gene using a complete set of forward and reverse spiked oligonucleotides that have been modified to introduce a low ratio of mutant nucleotides at each position. This novel mutagenesis scheme, named "Spiked Genes", yields a library of clones with an enhanced mutation distribution due to its unbiased nucleotide incorporation. Using the far-red fluorescent protein emKate as a model, we demonstrated that Spiked Genes yields richer libraries than those obtained via enzymatic methods. We obtained a library without bias toward any nucleotide or base pair and with even mutation, transition, and transversion frequencies. Compared with enzymatic methods, the proposed synthetic approach for the creation of gene libraries represents an improved strategy for screening protein variants and does not require a starting template.


Year: 2017 PMID: 30023688 PMCID: PMC6044943 DOI: 10.1021/acsomega.7b00508

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

The ability to produce a large number of protein variants that are screened to optimize the desired protein function continues to expand our technological progress. Directed molecular evolution relying on random and semirandom mutagenesis of genes has become a powerful strategy for modifying protein properties.[1,2] There are already many proteins whose function has been modified by selecting variants from a created protein library. For example, variant proteins with stability to high temperatures,[3] extreme pH,[4,5] and organic solvents[6,7] have been obtained. Moreover, variant enzymes with improved activities[8,9] or specificity changes[10,11] are available. Other physical properties, such as oligomerization of a target protein[12,13] or color of fluorescent proteins have also been modified.[14,15] Although screening and selection are intrinsically related to the particular properties of the target protein, the generation of genetic diversity at the gene level is a more general issue and depends on the mutagenesis method employed.[1,2,16] Soon after the advent of the polymerase chain reaction (PCR) technique, it was determined that random point mutations were generated during the in vitro amplification process,[17,18] and protocols were developed to take advantage of such mutations, creating the error-prone PCR (epPCR) mutagenesis approach.[19,20] Now, it is well known that manganese ions and unbalanced dNTP ratios,[21,22] nucleotide analogs,[23] and low-fidelity DNA polymerases[24,25] promote the random incorporation of erroneous nucleotides. 
On the other hand, early studies on oligonucleotide synthesis performed on a solid phase demonstrated that when added in equimolar concentrations, the four DNA monomer phosphoramidites dA, dC, dG, and dT have similar chemical reactivity.[26] In 1986, taking advantage of the equal nucleotide reactivity, the 30 bp glucocorticoid response element of the mouse mammary tumor virus was randomized by spiked oligonucleotides, wherein each position was doped during automated DNA synthesis by the low levels of the other three phosphoramidites.[27] This work confirmed the robustness of the method to produce all of the expected mutations by sequencing 546 clones and finding 88 different base substitutions out of the 90 possible single mutants. A similar mutagenesis approach was reported the following year by Hill et al.[28] and 4 years later by Dale et al.[29] Using genes as templates, researchers incorporated spiked oligonucleotides by PCR-based methods to modify specific regions of the encoded proteins.[30] More recently, Hidalgo et al.[31] and Jin et al.[32] reused the spiked oligonucleotides to achieve a focused and directed evolution in certain regions of the regulatory elements and enzymes via in vitro recombination with oligonucleotides bearing the wild-type sequence. In the recent era of synthetic biology, using only the sequence information without any gene template, libraries of synthetic genes have been prepared by two similar approaches termed “Synthetic Shuffling”[33] and “Assembly of Designed Oligonucleotides”.[34] In both methods, the gene library is assembled by appropriately prepared oligonucleotides containing saturated degeneracies at the desired nucleotide positions, selected by sequence alignments of homologous genes. The precise annealing required to form full-length genes is achieved by controlled overlapping of the designed oligomers. 
However, contrary to the spiked oligonucleotides, ordinary primers synthesized with saturated degeneracies at some positions rapidly increase the number of variants at the nucleotide level[35] and enrich the libraries with protein variants harboring multiple amino acid mutations, which usually affect either the folding or function of most encoded proteins. The fundamental difference between spiked and ordinary degenerated oligonucleotides relies on the level of doping of the wild-type nucleotides. In spiked oligonucleotides each wild-type nucleotide is “doped” during the chemical synthesis, with a very low ratio of the other three bases, yielding oligonucleotide mutants harboring very few changes per chain and a high ratio containing the wild-type sequence. However, in ordinary commercial degenerated oligonucleotides, the target nucleotides are completely replaced with an equimolar ratio of the desired mixed bases: N: A/G/C/T, V: A/C/G, D: A/G/T, H: A/T/C, B: C/G/T, M: A/C, R: A/G, W: A/T, S: C/G, Y: C/T, or K: G/T, producing oligonucleotide mutants harboring multiple mutations per chain and practically nothing of the wild-type sequence when more than 12 nucleotides are randomized with N.
In this article, for the first time, we report the strategy of using a complete set of spiked oligonucleotides to introduce mutations throughout an entire synthetic gene, an approach that, according to a review written by Wong et al., has not been reported yet.[36] This method is not limited to a target region, as previously described,[30−32] or to a reduced number of target nucleotides.[33,34] This synthetic strategy, which we have named “Spiked Genes”, was applied to the monomeric enhanced red fluorescent protein emKate.[37,38] This relatively small protein composed of 243 amino acids, including a polyhistidine tag on its carboxy end, was selected as a model test not only because its small gene is suitable for chemical assembly but also because emKate harbors the mutation S158A with respect to the original mKate protein,[39] a mutation that makes it more fluorescent than its parental protein.[38] Moreover, some amino acid replacements on its scaffold produce variants of a different color,[37] similar to the protein DsRed that served as a scaffold for producing a palette of fluorescent proteins termed mFruits.[14]
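The contrast between spiked and fully degenerate oligonucleotides can be illustrated with a short calculation. The sketch below is not part of the original study: it assumes independent positions, takes the 0.75% per-base doping level from the Methods section, and uses a hypothetical 60-nt oligonucleotide length purely for illustration.

```python
def wild_type_fraction_spiked(length, doping_per_base=0.0075):
    """Fraction of chains with no substitution when every position is doped.

    Assumes each position is substituted independently with probability
    doping_per_base (0.75% is the level quoted in the Methods section).
    """
    return (1.0 - doping_per_base) ** length


def wild_type_fraction_degenerate(n_positions, choices=4):
    """Fraction of chains carrying the wild-type base at every position
    that was fully randomized with an equimolar mix of `choices` bases."""
    return (1.0 / choices) ** n_positions


# 60 nt is an assumed oligonucleotide length, used only for illustration.
print(f"spiked 60-mer, 0.75% doping: {wild_type_fraction_spiked(60):.3f}")
print(f"12 positions randomized with N: {wild_type_fraction_degenerate(12):.2e}")
```

Under these assumptions, most spiked chains remain wild type or carry a single change, whereas the wild-type sequence is essentially absent once 12 positions are randomized with N.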

Results and Discussion

Gene Library Synthesis

The synthetic emKate gene library was assembled, as described in the Methods section, by a one-step strategy using a diluted equimolar pool of 22 internal spiked oligonucleotides and two outermost spiked oligonucleotides as extension primers (Figure 1).[37] The optimal concentration of the internal oligonucleotides was 4 nM, 100-fold lower than that of the external primers. The assembly efficiency was sensitive to higher internal oligonucleotide concentrations, likely due to mispairing between some of them and the external primers. We did not observe any difference in assembly efficiency between the wild-type[37] and spiked mutant oligonucleotides.

Figure 1

Strategy for the assembly of the synthetic gene library. Each continuous arrow represents a spiked oligonucleotide, wherein each position was doped with 0.25% of each of the other three bases. Primer sequences are listed in Table S1. (A) Scheme of the single-step PCR assembly reaction to synthesize the emKate gene library. (B) PCR reactions, using three different starting concentrations of the internal primers (4, 8, and 16 nM) and outermost primers at 400 nM, analyzed by agarose gel electrophoresis and the GeneRuler 100 bp Plus DNA ladder as the molecular marker.

To evaluate the higher efficiency of Spiked Genes for generating random point nucleotide substitutions, we compared our method with the mutagenesis efficiencies, over the emKate gene, of three enzymatic epPCR-based methods: two commercial mutagenesis kits (GeneMorph II and Diversify) and the traditional epPCR method.[20] The Diversify PCR Random Mutagenesis Kit is based on the methods of Leung et al.[19] and Cadwell and Joyce[20] and relies on the use of Titanium Taq polymerase and different concentrations of manganese and dGTPs to achieve different mutation rates. GeneMorph II is a commercial mutagenesis kit containing Mutazyme II: a blend of two error-prone DNA polymerases.[25] One of these enzymes is Mutazyme I, which favors mutations at G’s and C’s, whereas the other is a novel Taq DNA polymerase mutant that exhibits increased misinsertion and misextension frequencies at A’s and T’s compared with those of the wild-type Taq.

Analysis of the Libraries

To evaluate the mutagenic efficiency of the Spiked Genes method versus the three error-prone enzymatic methods, we used three parameters to define the quality of DNA libraries:[16,40] (1) the mutation distribution, which evaluates the capacity of the method to spread point mutations along the complete gene sequence, (2) the mutation rate, which evaluates the prediction ability of the method to achieve an expected number of point mutations per kb, and (3) the mutational diversity, also known as the mutational spectrum, which evaluates the capacity of the method to produce all types of transition and transversion mutations.[41] To measure these parameters while avoiding misrepresented data, we randomly selected several clones (independent of the phenotype displayed) from the antibiotic-supplemented plate. A total of 57 867 bases were sequenced for this analysis (Table 1).

Table 1

Comparison of Mutational Spectra and Mutation Rate Generated by the Spiked Genes and Three Different Enzymatic Mutagenesis Approaches

parameter	classic epPCR (a)	Diversify (Clontech)	GeneMorph II (Stratagene)	Spiked Genes method	expected values (b)
Bias Indicators
Ts/Tv	1.81	3.16	1.01	0.61	0.5
AT → GC/GC → AT	4.07	8.70	2.14	0.88	0.908
A → N, T → N (c)	80.2	89.7	68.2	46.9	47.58
G → N, C → N (c)	19.7	10.3	31.8	53.1	52.38
Types of Mutations
transitions (c)
A → G, T → C	46.0	69.2	29.5	17.7	15.86
G → A, C → T	18.4	6.8	20.9	20.4	17.46
transversions (c)
A → T, T → A	26.3	17.1	34.3	13.6	15.86
A → C, T → G	7.9	3.4	4.4	15.6	15.86
G → C, C → G	0	1	4.3	11.6	17.46
G → T, C → A	1.4	2.5	6.6	21.1	17.46
insertions and deletions
deletions	1	8	6	19	0
insertions	0	0	1	1	0
Data Analyzed
clones	18	19	23	22
sequenced bases	13 122	13 395	16 038	15 312
point mutations	76	204	184	155
Mutation Frequency (Mutations/kb)
exhibited (d)	5.8 (4.5; 7.2)	15.2 (13.2; 17.4)	11.4 (9.8; 13.2)	10.1 (8.6; 11.8)
expected (e)	6.6	8	16	15

(a) Using the conditions reported by Cadwell and Joyce.[20]

(b) Expected values calculated from the proportion of each nucleotide in the emKate gene, assuming an equal mutation probability in every position.

(c) In percentage.

(d) 95% confidence intervals are shown in parentheses, assuming that the mutation frequency follows a Poisson distribution.[22,59]

(e) Expected mutation frequency of each approach based on a previous report,[20] instruction manuals of commercial kits and the theoretical rates of Spiked Genes.


Mutation Distribution

The first point to highlight from the sequencing results was the homogeneous distribution of mutations generated along the entire gene (Figure 2). Using our Spiked Genes method, we did not observe accumulation or reduction of mutations at any location of the gene, even at the boundaries of the oligonucleotides used. Similarly even distributions were also obtained for the libraries created by the three enzymatic mutagenesis methods tested, although, for the enzymatic methods, some bases were more prone to be replaced than others due to the nucleotide specificity of the DNA polymerases, which is dependent on the surrounding sequence.[24]

Figure 2

Experimental mutation distribution of the libraries constructed with the four methods examined. Each bar represents a point nucleotide substitution at a certain location of the gene from the initial ATG codon. The bar height represents the number of substitutions found in each position.

Mutation Rate

For our Spiked Genes method, the expected mutation rate was 15 changes/kb due to the 0.75% doping ratio per base per strand and therefore 1.5% per base-pair. The experimental results exhibited an average replacement rate of 10.1 point mutations per kb, which was lower than the expected value, as shown in Table 1. This difference between the theoretical and experimental values reflects a selective pressure during the PCR assembly process to favor pairing of complementary primers with the lowest ratio of mismatches. The difference between the theoretical and experimental mutation rates would probably decrease using less stringent annealing conditions during the PCR step. Because of the 1.0% experimental mutation rate, the mutagenesis of the 0.7 kb emKate gene (with our method) resulted in an average of seven nucleotide changes at the DNA level and four or five amino acid changes at the protein level, which is appropriate for performing experiments of directed evolution in most protein targets.[40] However, if a lower mutagenic rate is desired for elucidating protein function–structure relationships, the doping ratio may be simply reduced during the oligonucleotide synthesis to achieve one or two amino acid changes per protein variant. Such fine control in the mutagenic rate of the Spiked Genes method is hardly achievable by the enzymatic mutagenesis approaches. In our experiment, the epPCR protocol reported in the literature produced, on average, 5.8 replacements per kb (0.8 replacements less than the expected 6.6 changes/kb). However, the range of mutagenic rates with the epPCR methods is very broad.[24] Using similar mutagenic conditions, with different target genes and Taq DNA polymerases from different suppliers, we have observed mutation frequencies as high as 25 changes per kb (results not shown).
None of the commercial epPCR mutagenesis kits tested achieved the predicted value, even when the mutagenesis conditions were set to reach the maximal mutation rate: the 11.4 mutations per kb obtained with the GeneMorph II kit fell below the expected 16 per kb, whereas the 15.2 mutations per kb obtained with the Diversify kit surpassed the expected 8 per kb (Table 1).
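The exhibited mutation frequencies and their 95% confidence intervals can be approximated from the counts of point mutations and sequenced bases listed in Table 1. The sketch below is illustrative rather than the authors' calculation: it assumes the intervals are the usual chi-square-based Poisson limits and, to stay within the standard library, replaces the chi-square quantiles with the Wilson-Hilferty approximation, so the last digit may differ slightly from the tabulated values.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def poisson_ci_95(count):
    """Approximate chi-square-based 95% limits for a Poisson count.

    The chi-square quantiles are replaced by the Wilson-Hilferty
    cube-root approximation, so no external library is required.
    """
    if count == 0:
        lower = 0.0
    else:
        lower = count * (1 - 1 / (9 * count) - Z_975 / (3 * math.sqrt(count))) ** 3
    k = count + 1
    upper = k * (1 - 1 / (9 * k) + Z_975 / (3 * math.sqrt(k))) ** 3
    return lower, upper


def mutations_per_kb(point_mutations, sequenced_bases):
    """Return (frequency, lower limit, upper limit) in mutations per kb."""
    lower, upper = poisson_ci_95(point_mutations)
    scale = 1000 / sequenced_bases
    return point_mutations * scale, lower * scale, upper * scale


# (point mutations, sequenced bases) taken from the "Data Analyzed" rows of Table 1.
LIBRARIES = {
    "classic epPCR": (76, 13122),
    "Diversify": (204, 13395),
    "GeneMorph II": (184, 16038),
    "Spiked Genes": (155, 15312),
}

for name, (mutations, bases) in LIBRARIES.items():
    freq, low, high = mutations_per_kb(mutations, bases)
    print(f"{name}: {freq:.1f} per kb ({low:.1f}; {high:.1f})")
```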

Mutational Diversity

The mutational spectrum is the most important parameter to assess the quality of mutagenesis libraries. The ideal mutagenesis method should be able to replace any base in the gene with any of the other three bases at the same frequency. Mutational diversity can be examined by different parameters.[41,42] One is the ratio of transitions (Ts) to transversions (Tv). Transition mutations are purine (G and A) to purine changes and pyrimidine (T and C) to pyrimidine changes, whereas transversions are changes from purine to pyrimidine and from pyrimidine to purine. In this regard, the theoretical transition to transversion ratio is 0.5 because there are four possible transitions (A → G, G → A, C → T, and T → C) and eight possible transversions (A → C, C → A, T → A, A → T, C → G, G → C, T → G, and G → T). The analysis of the libraries constructed for this work showed that our Spiked Genes method, with a 0.61 Ts/Tv ratio, yielded the most diverse library. In contrast, the classical and Diversify epPCR methods produced less diverse libraries, with Ts/Tv ratios of 1.81 and 3.16, respectively. In other words, these epPCR libraries contained 64.4 and 76.0% transitions, although the expected value is 33.3%. Another way to evaluate the mutational diversity is by calculating the ratio of AT → GC to GC → AT transition mutations (AT → GC/GC → AT ratio), which would be 1 for a perfectly unbiased mutagenic method targeting a gene with the same proportion of AT and GC pairs. As observed with the Ts/Tv ratio, Spiked Genes yielded the most even library, producing an experimental AT → GC/GC → AT ratio of 0.88 versus the expected 0.91 value calculated for the nucleotide composition of the emKate gene (A 26.13%, T 21.45%, G 27.72%, C 24.66%) (Table 1). In contrast, the classical epPCR and Diversify methods yielded GC-enriched libraries.
Finally, mutational diversity can also be assessed by measuring the ratio between the frequency of mutating A’s and T’s versus the frequency of mutating G’s and C’s (AT → NN/GC → NN ratio). This ratio should also be 1 for an unbiased mutagenic method that targets a gene with equal proportions of ATs and GCs. For the Spiked Genes method, we measured the AT → NN value as 46.90%, which was very close to the 47.58% expected value for the nucleotide composition of the emKate gene, whereas the GC → NN experimental value was 53.10% as compared with the expected value of 52.38%.[21,22,43] For the enzymatic methods, GeneMorph II produced less biased results than classical epPCR or the Diversify kit, which generated a higher AT → GC bias. We dissected our results to analyze the individual mutations found in the coding strand, normalizing each type of mutation with respect to the wild-type nucleotide content of the gene. Figure 3 shows that the Spiked Genes method yielded the broadest mutagenesis spectrum among all methods, whereas the enzymatic methods had a poor representation of the transversions C → G, G → T, G → C, T → G, and C → A, a result that agrees with that reported by Alexander et al.[44] for GeneMorph II. At the protein level, the nucleotide bias of enzymatic methods favors neutral amino acid replacements due to the protecting nature found in the degeneracy of the genetic code, leaving unexplored a large fraction of the sequence space. Again, the analysis of individual mutations reflected a clear advantage of the Spiked Genes method over that of the enzymatic-based approaches to produce more complete libraries of protein variants, as reported in the supplementary Table S2, wherein all of the amino acid replacements found in the four libraries were analyzed. Clearly, Spiked Genes created amino acid substitutions unachievable by the other methods.
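The expected values quoted above and in the last column of Table 1 follow directly from the nucleotide composition of the emKate gene. The sketch below reproduces that arithmetic under the stated assumption of an equal mutation probability at every position, with each of the three possible substitutions of a base equally likely; the function and key names are illustrative only.

```python
# Nucleotide composition of the emKate gene in percent, as quoted in the text.
# The quoted values sum to 99.96 because of rounding and are used as given.
COMPOSITION = {"A": 26.13, "T": 21.45, "G": 27.72, "C": 24.66}


def expected_spectrum(composition):
    """Expected percentages for an unbiased method: every position mutates
    with equal probability and each base is replaced by any of the other
    three bases with equal probability."""
    at = composition["A"] + composition["T"]  # mutations hitting A or T
    gc = composition["G"] + composition["C"]  # mutations hitting G or C
    at_class = at / 3  # each of A->G/T->C, A->T/T->A, A->C/T->G
    gc_class = gc / 3  # each of G->A/C->T, G->C/C->G, G->T/C->A
    transitions = at_class + gc_class            # one transition class per pair
    transversions = 2 * (at_class + gc_class)    # two transversion classes per pair
    return {
        "Ts/Tv": transitions / transversions,
        "AT->GC/GC->AT": at_class / gc_class,
        "A->N, T->N (%)": at,
        "G->N, C->N (%)": gc,
        "each A/T substitution class (%)": at_class,
        "each G/C substitution class (%)": gc_class,
    }


def transition_share(ts_tv):
    """Convert a Ts/Tv ratio into the percentage of mutations that are transitions."""
    return 100 * ts_tv / (1 + ts_tv)


for key, value in expected_spectrum(COMPOSITION).items():
    print(f"{key}: {value:.3f}")
print(f"transition share at Ts/Tv = 1.81: {transition_share(1.81):.1f}%")
```

The same `transition_share` helper converts the Ts/Tv ratios of the epPCR libraries into the transition percentages cited in the text.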

Figure 3

Experimental/expected frequency ratios for each of the 12 possible nucleotide mutations. The pointed line indicates the ideal normalized value for every substitution.

Although nonbiased mutagenic libraries are mainly used to perform directed evolution studies, controlled-bias mutagenic libraries are occasionally preferred to favor certain amino acid mutations.[36,45,46] Thus, it is important to have an adjustable method for setting specific transition/transversion bias to achieve the maximal coverage of desired amino acid substitutions. For this matter, the Spiked Genes method can be finely tuned to produce the desired mutation frequency and mutational bias by simply changing, during oligonucleotide synthesis, the proportion of the spiked bases. Moreover, if an important functional region is discovered after the first screening of the library, our method allows the mutational enrichment of this specific region by reusing some of the spiked oligonucleotides used in the initial assembly, which may be selectively incorporated by additional rounds of mutagenesis via overlapped PCR.[47] Together, all of the presented results confirm that the Spiked Genes method is a superior mutagenesis approach to produce libraries with evenly distributed random point nucleotide substitutions and affords an accurate statistical description of the library composition, contrary to the libraries built by epPCR approaches.[40] Unfortunately, our method also showed a higher deletion rate than the enzymatic methods.
It is well known that, due to the inefficient coupling of the incoming nucleoside phosphoramidites and the subsequent inefficient capping of the growing oligonucleotide chains that fail to be blocked, synthetic oligonucleotides contain a high rate of single-nucleotide deletions.[48−50] The deletion rate can be reduced during synthesis by increasing the concentration of the nucleoside phosphoramidites to improve coupling yield at every nucleotide addition and using a double capping step to improve blocking.[51,52] Deletion mutations can also be reduced by postsynthesis purification of the oligonucleotides by size exclusion chromatography, high-performance liquid chromatography, or polyacrylamide gel electrophoresis, as has already been done to assemble complete virus genomes.[53] These improvements will render higher quality libraries.

Mutant Analysis

After transforming Escherichia coli competent cells with only one-tenth of the ligation reaction, nonbiased random mutagenesis of the emKate gene using the Spiked Genes method produced 181 colored colonies out of 1995 colonies selected on kanamycin. Of these colored colonies, 149, 16, 4, and 12 were red, green, yellow, and orange variants, respectively. As expected, one clone containing the intact wild-type emKate sequence emerged after sequencing the plasmid from 12 red colonies, whereas the rest of the red clones contained an average of 2 to 3 amino acid replacements and only 2 clones contained 6 and 8 changes. Most of the substitutions corresponded to neutral replacements. The color variation observed in some of the mutant clones (Figure 4A) was the result of incomplete maturation of the DsRed-type chromophore present in the emKate protein.[37] In some variants, the chromophore was trapped as a GFP-like immature intermediate (Figure S1). As a consequence, some variants showed an orange phenotype that corresponded to a mixture of two proteins: one that remained in the GFP-like state, with an absorbance peak of approximately 470–500 nm, and another that reached the DsRed-like state, with an absorbance peak of approximately 580–600 nm. The simultaneous emission of green and red light produced the orange phenotype. Proteins displaying slow and incomplete maturation of their chromophores have been extensively used as fluorescent timers,[54] and perhaps some of our variants could also be used as timers.

Figure 4

emKate mutants generated by the Spiked Genes method. (A) E. coli colonies expressing wild-type GFP, a nonfluorescent GFP mutant containing the multiple mutations L64A/C65M/Y66G/G67V, wild-type emKate, and the representative fluorescent mutants found in the library. (B) Absorbance spectra of the mutants and reference proteins shown in panel A.

The different levels of protonation of the chromophore also influenced the observed phenotypes, as revealed by the absorption properties of the green and yellow variants (Figure 4B), which exhibited different ratios of the GFP-type chromophore in its ionized state at approximately 488 nm and its neutral state at approximately 382 nm. Although the absorbance properties of green and yellow variants were very similar, their phenotype was clearly different, as shown in the streaked colonies in Figure 4A. Because green variants were colored but nonfluorescent, perhaps their chromophores achieved a trans configuration around the double bond formed between the Cα and Cβ of Tyr64 (equivalent to Tyr66 in GFP) during the oxidation step, whereas the yellow variants achieved a cis configuration. It is well known that most nonfluorescent chromoproteins display a GFP-type chromophore in the trans configuration, whereas most fluorescent proteins display a chromophore with cis configuration.[55,56] Here, it is important to mention that emKate’s chromophore was formed by post-translational modification of the amino acids Met63, Tyr64, and Gly65, whose equivalent positions in the reference GFP protein are Ser65, Tyr66, and Gly67. In this context, all amino acid changes that gave rise to drastic changes of phenotype compared with the red color of emKate were located in the vicinity of the chromophore, either on the α-helix holding the chromophore or on amino acid positions located on the β-strands surrounding the chromophore, whose side-chains face the interior of the β-barrel and are near the chromophore.
For instance, the yellow phenotype of emKate-yellow1 was generated by the mutation S66G, adjacent to the chromophore moiety, whereas the green and orange phenotypes found in emKate-green1 and emKate-orange1 were generated, respectively, by mutations Q39H and A158V (Figure S2), located in β-strands 2 and 7. These substitutions probably caused the loss of hydrogen bonds with the chromophore or crowded that region, disturbing the appropriate conformation of the GFP-like intermediate and hampering the second oxidation step, which would produce the red chromophore.[57] Surprisingly, a green shift was also found in a variant containing the external mutation A142P, which presumably destabilized the β-barrel, causing loss of the hydrogen bond between the phenolate ion of the chromophore and Ser143. The alteration of the H-bond network results in the destabilization of the cis configuration and probably neutralization of the phenolate.[58]

Conclusions

In this study, we described a new random mutagenesis method, which we call the Spiked Genes method, and compared its efficiency against that of enzymatic epPCR mutagenesis methods. Using oligonucleotides with 0.75% mutant nucleotides at each position, we obtained a mutant library that (with respect to other methods) showed: first, a higher control of mutation rate; second, a wider and homogeneous mutation distribution along the entire gene; and third, a more even mutation diversity, with minimal mutation bias. Moreover, our method can be modulated to achieve the desired mutation rate and transition/transversion ratio. In this era of synthetic biology, the continuous drop in the price of oligonucleotides makes this random mutagenesis strategy a suitable option for the improvement of protein properties. The power of the strategy was manifested in the high phenotypic variation found in the library. We have created a rich library of emKate variant proteins that, once characterized, could potentially be used as reporters of cellular processes.

Methods

Oligonucleotide Synthesis

All oligonucleotides used for the gene library assembly, forward and reverse (shown in Table S1), were synthesized with a MerMade 192 DNA synthesizer (BioAutomation), using standard DMTr-protected phosphoramidites dABz, dCAc, dGdmf, and dT. However, contrary to conventional synthesis, each nucleotide was doped with 0.25% of each of the other three nucleotides, producing an overall contamination of 0.75% per base. After deprotection and analysis by denaturing polyacrylamide gel electrophoresis, all oligonucleotides were observed as a single band and were used for the subsequent experiments without further purification.
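The doping scheme described above can be mimicked with a small simulation. This is only a sketch: it assumes that the probability of coupling each base equals its molar proportion in the phosphoramidite mix (equal reactivity, as discussed in the Introduction), and the 60-nt template is an arbitrary sequence rather than one of the oligonucleotides in Table S1.

```python
import random

BASES = "ACGT"


def synthesize_spiked(template, spike_each=0.0025, rng=random):
    """Simulate one chain: at each position the template base is coupled
    unless one of the other three bases (probability spike_each each) is."""
    chain = []
    for base in template:
        others = [b for b in BASES if b != base]
        weights = [1 - 3 * spike_each] + [spike_each] * 3
        chain.append(rng.choices([base] + others, weights=weights)[0])
    return "".join(chain)


def substitution_rate(template, n_chains, spike_each=0.0025, seed=0):
    """Fraction of positions that differ from the template across n_chains."""
    rng = random.Random(seed)  # seeded so that the run is reproducible
    mismatches = 0
    for _ in range(n_chains):
        chain = synthesize_spiked(template, spike_each, rng)
        mismatches += sum(a != b for a, b in zip(chain, template))
    return mismatches / (n_chains * len(template))


# Arbitrary 60-nt template, not an oligonucleotide from Table S1.
template = "ACGT" * 15
print(f"simulated substitution rate: {substitution_rate(template, 2000):.4f}")
```

With 0.25% of each alternative base, the simulated per-base substitution rate should fluctuate around the overall 0.75% stated above; changing `spike_each` shows how the doping level sets the mutation rate.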

Library of Synthetic Genes Encoding emKate

The library of synthetic genes encoding emKate (accession number EU383029, DNA Data Bank of Japan) was assembled in a single PCR step, as described in a previous report.[37] Oligonucleotides in one strand were designed to overlap 30 nt with the oligonucleotides in the complementary strand. To improve gene expression, the least frequent E. coli codons were substituted with more frequently used codons. An overlapping extension was carried out using an equimolar pool of the 22 internal oligonucleotides as templates (mFw2 to mFw12 and mRv1 to mRv11) (Table S1). Three different equimolar concentrations of the internal oligonucleotides were tested (4, 8, and 16 nM), plus 400 nM of the outermost primers (mFw1 and mRv12), which, for cloning purposes, also included NdeI and XhoI restriction sites at the 5′ and 3′ ends of the synthetic genes, respectively. mRv12 also contained the information to introduce a polyhistidine tag −SGGSHHHHHH at the carboxy end of the variant proteins. The final 100 μL of PCR mixture had Vent DNA polymerase (New England BioLabs), buffer, MgSO4, and dNTPs, as recommended by the supplier. The PCR reaction was performed under the following conditions: 1 cycle: 94 °C for 3 min; 25 cycles: 94 °C for 1 min, 58 °C for 1 min, 72 °C for 1 min; 1 cycle: 72 °C for 5 min. Both the gene library and the pJOQ plasmid were double-digested with the NdeI and XhoI restriction endonucleases for 12 h at 37 °C and purified from agarose gel, using the EZ-10 spin column PCR purification kit (Bio Basic Inc.). Next, 200 ng of the gene library was ligated into 200 ng of the vector in a 20 μL reaction at 16 °C for 20 h. Using electroporation, 2 μL of the ligation reaction was used to transform 100 μL of MC1061 E. coli cells. After recovering in 1 mL of LB medium at 37 °C for 1 h, the transformed cells were grown at 37 °C for 20 h on LB-Kanamycin plates.
After transformation, plasmids from several independent colonies (fluorescent and nonfluorescent) were isolated for DNA sequencing.
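The overlapping design described above, with oligonucleotides on one strand overlapping 30 nt with those on the complementary strand, can be sketched as a simple tiling routine. This is a hypothetical illustration, not the design procedure used for Table S1: the 60-nt oligonucleotide length, the gapless layout, and the 750-nt placeholder sequence are assumptions chosen so that the tiling yields 12 forward and 12 reverse oligonucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_gene(seq, oligo_len=60, overlap=30):
    """Split a gene into forward and reverse oligonucleotides.

    Forward oligos tile the top strand end to end; reverse oligos tile the
    bottom strand shifted by `overlap`, so every oligo pairs with `overlap`
    nt of each neighbour on the complementary strand.
    """
    if oligo_len != 2 * overlap:
        raise ValueError("this sketch assumes gapless tiling: oligo_len == 2 * overlap")
    if len(seq) < oligo_len + overlap or (len(seq) - overlap) % oligo_len != 0:
        raise ValueError("sequence length must equal n * oligo_len + overlap")
    forward = [seq[i:i + oligo_len] for i in range(0, len(seq) - overlap, oligo_len)]
    reverse = [reverse_complement(seq[i:i + oligo_len])
               for i in range(overlap, len(seq), oligo_len)]
    return forward, reverse


# Placeholder 750-nt sequence; it is NOT the emKate gene.
gene = "ATG" + "GCT" * 249
forward, reverse = tile_gene(gene)
print(len(forward), "forward and", len(reverse), "reverse oligonucleotides")
```

In this layout the first forward and the last reverse oligonucleotides sit at the two ends of the gene, which corresponds to the role of the outermost primers in the assembly reaction.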

Enzymatic Mutagenesis

The wild-type gene emKate was subjected to three different epPCR-based mutagenesis approaches, two using commercial kits and one established in our laboratory. We followed the instructions to obtain the highest mutation rate recommended for two commercial kits: 16/kb for the GeneMorph II mutagenesis kit from Stratagene and 8/kb for the Diversify PCR Random Mutagenesis Kit from Clontech. As a third protocol, we evaluated the classical epPCR approach at the predicted mutation rate of 6.6/kb, as described by Cadwell and Joyce.[20] All libraries generated from the three epPCR methods were cloned, selected, and evaluated, as previously described for the synthetic library.
