Fumihiro Kawai1, Akihiko Nakamura1,2, Akasit Visootsat2, Ryota Iino1,2. 1. Institute for Molecular Science, National Institutes of Natural Sciences, 5-1 Higashiyama Myodaijicho, Okazaki, Aichi 444-8787, Japan. 2. The Graduate University for Advanced Studies (SOKENDAI), Shonan Village, Hayama, Kanagawa 240-0193, Japan.
Abstract
We evaluated a method for protein engineering using plasmid-based one-pot saturation mutagenesis and robot-based automated screening. When the biases in nucleotides and amino acids were assessed for a loss-of-function point mutation in green fluorescent protein, the ratios of gain-of-function mutants were not significantly different from the expected values for the primers among the three different suppliers. However, deep sequencing analysis revealed that the ratios of nucleotides in the primers were highly biased among the suppliers. Biases for NNB were less severe than for NNN. We applied this method to screen a fusion protein of two chitinases, ChiA and ChiB (ChiAB). Three NNB codons as well as tyrosine and serine (X1YSX2X3) were inserted to modify the surface structure of ChiAB. We observed significant amino acid bias at the X3 position in water-soluble, active ChiAB-X1YSX2X3 mutants. Examination of the crystal structure of one active mutant, ChiAB-FYSFV, revealed that the X3 residue plays an important role in structure stabilization.
Saturation mutagenesis is a basic protein
engineering technique
for increasing thermostability[1,2] and substrate specificity[3,4] of enzymes and antibody affinity.[5,6] To increase
the efficiency of randomization in saturation mutagenesis, primer
design plays a critical role. Several design methods have been developed,
including Max randomization,[7] Tang’s
“small-intelligent”,[8] and
22c-tricks.[9] These designed oligonucleotide
mixtures are especially suitable for single-site saturation mutagenesis.
In contrast, conventional degenerate oligonucleotides, such as NNN,
NNB, and NNK/S, are more suitable for multisite saturation mutagenesis
in terms of cost, simplicity of primer design, and reduced number
of PCR primers. In addition to the design of degenerate oligonucleotides
for saturation mutagenesis, the development of molecular cloning techniques
also plays an important role in facilitating the mutagenesis procedure.
The most popular techniques are QuikChange mutagenesis (Agilent),
Gibson assembly[10] (NEB), and In-Fusion[11] (Clontech). These commercially available kits
are useful for ligation with single or multiple fragments in vitro
without any restriction enzymes or ligase. Recent developments in
the field include plasmid-based one-pot saturation mutagenesis[12] using a pair of nicking endonucleases (Nt.BbvCI and
Nb.BbvCI), in vivo assembly for homologous
recombination cloning (IVA),[13] and seamless
ligation cloning extract from Escherichia coli[14−17] (SLiCE). SLiCE is notable for its simplicity and low cost.

In this study, we evaluated a method of plasmid-based one-pot saturation
mutagenesis employing the three techniques, namely conventional degenerate
NNN or NNB codons for single or multisite saturation mutagenesis,
IVA cloning for primer design, and SLiCE for ligating both ends of
the fully amplified linear plasmid with homologous recombination in
vitro. We also implemented a robot-based automated method for E. coli cell culture, protein purification, and activity
measurement in 96-well plates using a liquid-handling robot. We assessed
this combination of methods for protein engineering.

First,
to assess our one-pot saturation mutagenesis and to understand
the efficiency of randomization, we applied our method to a nonfluorescent,
loss-of-function green fluorescent protein (GFP) mutant, GFPmut3-Y66H,[12] using NNN or NNB primers from three suppliers.
The recovery rates (H66Y mutation) in both colony counting and deep
sequencing were similar among the primer suppliers, but large nucleotide
biases were observed in deep sequencing. The fractions of thymine
(T) and guanine (G) from suppliers 1 and 2 were much higher than those
of adenine (A) and cytosine (C) in NNN and NNB degenerate codons.
In contrast, supplier 3 showed ratios of A, T, G, and C similar to
the expected ratios in both NNN and NNB degenerate codons. Biases
in amino acid composition given by NNN degenerate codons were also
evident in our deep sequencing results. Valine, phenylalanine, tryptophan,
and glycine all showed higher-than-expected fractions, and aspartic
acid and isoleucine were less abundant than expected. However, there
were no significant biases in amino acids from NNB degenerate codons
and the variances of observed/expected ratios were low. Therefore,
on the basis of analyses of biases in nucleotides and amino acids,
degenerate NNB codons were used for multisite saturation mutagenesis.

Next, we applied this method to ChiAB, an artificial fusion protein
of two chitinases, ChiA and ChiB, from Serratia marcescens.[18,19] S. marcescens ChiA
and ChiB processively hydrolyze crystalline chitin from reducing and
nonreducing ends, respectively, and are linear motor proteins that
move in opposite directions. Both ChiA and ChiB have a catalytic domain
(CD) and a carbohydrate-binding module (CBM) connected by a short
linker. The CDs show almost identical folds (TIM barrel), whereas
the positions of the CBM relative to the CD differ. We are
currently attempting to engineer a bidirectional motor protein based
on ChiAB that carries one CD from ChiA and two CBMs from ChiA and
ChiB. Multisite saturation mutagenesis was carried out with degenerate
NNB codons as well as tyrosine and serine residue insertions (X1YSX2X3) to modify surface properties
of ChiAB. The observed ratios of each amino acid at three positions
(X1, X2, and X3) were similar when
the sequences were analyzed irrespective of solubility of expressed
proteins. In contrast, among 120 water-soluble and active clones,
we observed an amino acid bias at the third (X3) position
in ChiAB-X1YSX2X3 mutants. In an
attempt to understand the reason for the bias, we solved the crystal
structure of one active mutant, ChiAB-FYSFV. The results showed that
the valine residue at the X3 position was oriented toward
the inside of the molecule, presumably contributing to the stability
of the structure.
Results and Discussion
Fraction of Gain-of-Function GFP Mutants Analyzed by Colony Counting and Deep Sequencing
We have designed a method to
generate and screen a wide variety of mutant proteins. The bottleneck
steps include the mutant construction and protein purification. To
resolve these limiting steps, we combined one-pot saturation mutagenesis
with robot-based small-scale purification of a large number of mutants.
Our experimental system is shown in Figure 1. To assess nucleotide and amino acid biases,
one-pot saturation mutagenesis was performed with degenerate NNN and
NNB codons from three different suppliers. The fractions of gain-of-function
mutations of GFPmut3 were analyzed by colony counting and deep sequencing.
Figure 1
Overview
of one-pot saturation mutagenesis and robot-based automated
screening used in this study. There were five steps from cloning to
purification. (1) Single-site/multisite saturation mutagenesis. (2)
Extraction/purification of fragments and SLiCE reaction (ligation).
(3) Transformation into Tuner(DE3) cells directly by electroporation
for checking protein expression. (4) Cultivation with 1 mL of Super
Broth including appropriate antibiotics. (5) Purification and activity
measurement of his6-tagged target protein by a liquid handling
robot.
Prior to comparison of NNN and
NNB, we checked bias in DNA amplification
efficiency of PCR depending on codons, using 64 primers encoding different
codons (Figure 2).
The amounts of PCR products (∼4500 bp) were similar
among all codons except ATA, which did not show a clear band. This
result indicates that the PCR step itself introduces little bias,
at least when pEDA5-GFPmut3-Y66H is used as a template.
Figure 2
Comparison
of PCR products of GFPmut3-Y66H mutant with 64 primers
encoding different codons. The intensities of the bands (∼4500
bp) of PCR products are similar for all codons except ATA,
which did not show a clear band. The codons not included in NNB
(A at the third position) are shown in red.
Then, to determine the recovery rate for each supplier’s
degenerate codons by colony counting, the pEDA5-GFPmut3-Y66H plasmid
was amplified by PCR with primers containing either NNN or NNB codons.
The linear products were ligated with SLiCE, transformed into E. coli cells, and cultured on agar plates at 37
°C overnight. Colonies were counted under visible light, and
the number of fluorescent colonies (gain-of-function, H66Y mutants)
was counted under a blue/green light-emitting diode (LED) light. The
fractions of gain-of-function colonies are shown in Table 1. For both NNN and NNB primers,
suppliers 1–3 showed gain-of-function fractions
that were slightly lower or higher than expected. However, the differences
were minor, ranging from −0.9 to 2.0 percentage points (Table 1). All experimental values were close to
the expected values, showing no apparent bias.
Table 1
Fractions of Gain-of-Function GFPmut3-H66Y Mutants from Each Primer Set and Supplier Determined by Colony Counting and Deep Sequencing
primer set | supplier | colony counting | deep sequencing | expected value
NNN | 1 | 2.2% (32/1453) | 2.2% (1909/86 048) | 3.1% (2/64)
NNN | 2 | 4.2% (86/2048) | 3.1% (2571/83 927) | 3.1% (2/64)
NNN | 3 | 5.1% (78/1527) | 4.2% (3364/80 325) | 3.1% (2/64)
NNB | 1 | 4.5% (73/1618) | 4.2% (3636/85 577) | 4.2% (2/48)
NNB | 2 | 3.8% (60/1573) | 3.5% (3078/88 751) | 4.2% (2/48)
NNB | 3 | 3.6% (77/2128) | 3.7% (3201/86 419) | 4.2% (2/48)
To investigate
the potential biases in more detail, all colonies
were collected and plasmids were extracted and analyzed by deep sequencing.
First, to estimate fractions of gain-of-function GFP mutants, tyrosine
codons (TAT, TAC) were counted (Table 1). Again, for both NNN and NNB primers, suppliers 1–3
showed fractions of gain-of-function similar to the expected values,
with differences ranging from −0.9 to 1.1 percentage points (Table 1). For each primer set and supplier,
the value estimated by deep sequencing was similar to that from colony
counting, supporting the validity of the deep sequencing analysis (Table 1).
Biases in Codons and Nucleotides Estimated by Deep Sequencing
Next, we determined
the fraction of each codon from each primer
set and supplier. Figure 3 shows the fraction of each codon, ordered from largest to
smallest. Expected fractions of each codon from NNN and NNB are 1.6%
(1/64) and 2.1% (1/48), respectively. However, NNN from suppliers
1 and 2 showed significantly larger fractions of the top 10 codons,
whereas this bias was slightly less severe in NNB (Table 2).
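The codon fractions above are obtained by counting the randomized codon in each sequencing read. The analysis pipeline is not described in this section, so the following Python sketch shows only one plausible way to do the tally. It assumes that reads are already quality-filtered, are in the forward orientation, and contain the sequences that flank the randomized codon in the forward primer listed in Methods. The function names and the toy reads are hypothetical.

```python
from collections import Counter

# Flanks of the randomized codon, taken from the forward primer in Methods.
UPSTREAM = "CACTTGTCACTACTTTCGGT"
DOWNSTREAM = "GGTGTTCAATGCTTTGCG"


def extract_codon(read):
    """Return the codon between the two flanks, or None if the read does not match."""
    start = read.find(UPSTREAM)
    if start == -1:
        return None
    codon_start = start + len(UPSTREAM)
    codon = read[codon_start:codon_start + 3]
    has_downstream = read.startswith(DOWNSTREAM, codon_start + 3)
    if len(codon) == 3 and has_downstream and set(codon) <= set("ACGT"):
        return codon
    return None


def codon_fractions(reads):
    """Return (codon, fraction) pairs ordered from most to least frequent."""
    counts = Counter(c for c in map(extract_codon, reads) if c is not None)
    total = sum(counts.values())
    return [(codon, n / total) for codon, n in counts.most_common()]


# Hypothetical toy reads, for illustration only (not data from this study).
toy_reads = [
    UPSTREAM + "TTT" + DOWNSTREAM,
    UPSTREAM + "TTT" + DOWNSTREAM,
    UPSTREAM + "TAT" + DOWNSTREAM,
    "ACGTACGT",  # no flanks: ignored
]
for codon, fraction in codon_fractions(toy_reads):
    print(codon, f"{100 * fraction:.1f}%")
```

Ranking the resulting fractions, as in Figure 3, then makes over-represented codons such as TTT easy to spot against the uniform expectation of 1/64 or 1/48.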
Figure 3
Ranking of codons in
NNN and NNB from three suppliers, determined
by deep sequencing. Fractions of each codon as determined by deep
sequencing were estimated and ranked. (a) Degenerate NNN includes
64 codons. (b) NNB includes 48 codons. Results from suppliers 1, 2,
and 3 are shown by blue, red, and green lines, respectively.
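Table 2
Top 10 Rankings of the Codons in NNN and NNB from Three Suppliers, Determined by Deep Sequencing
primer set | rank | supplier 1 (codon, %) | supplier 2 (codon, %) | supplier 3 (codon, %) | expected (%)
NNN | 1 | TTT, 8.6 | TTT, 7.5 | GGG, 3.7 | 1.6 (1/64)
NNN | 2 | GTT, 6.4 | TGG, 5.0 | GGA, 3.5 | 1.6 (1/64)
NNN | 3 | GGG, 5.9 | GTT, 4.3 | CAG, 3.2 | 1.6 (1/64)
NNN | 4 | TGG, 5.9 | GGT, 4.2 | ACG, 3.1 | 1.6 (1/64)
NNN | 5 | TTG, 5.3 | TGT, 3.9 | GGC, 3.0 | 1.6 (1/64)
NNN | 6 | TGT, 5.0 | TTG, 3.8 | GCC, 2.9 | 1.6 (1/64)
NNN | 7 | GTG, 4.7 | GGG, 3.1 | CTT, 2.8 | 1.6 (1/64)
NNN | 8 | TTC, 3.3 | GTG, 3.0 | TAT, 2.6 | 1.6 (1/64)
NNN | 9 | TCG, 2.9 | TTC, 2.8 | TCT, 2.6 | 1.6 (1/64)
NNN | 10 | TCT, 2.9 | ATT, 2.5 | GGT, 2.6 | 1.6 (1/64)
NNB | 1 | TTT, 7.2 | TTT, 4.8 | TTT, 3.8 | 2.1 (1/48)
NNB | 2 | GTT, 4.4 | TGG, 4.8 | GGG, 3.4 | 2.1 (1/48)
NNB | 3 | TGG, 4.2 | TTG, 4.6 | TGG, 3.3 | 2.1 (1/48)
NNB | 4 | TTC, 4.0 | GTT, 4.4 | GGC, 3.1 | 2.1 (1/48)
NNB | 5 | GGT, 3.8 | TGT, 4.3 | CGG, 3.0 | 2.1 (1/48)
NNB | 6 | ATT, 3.8 | GGG, 3.9 | GTT, 2.8 | 2.1 (1/48)
NNB | 7 | TGT, 3.6 | ATT, 3.5 | GAG, 2.6 | 2.1 (1/48)
NNB | 8 | GGG, 2.9 | GGT, 3.4 | GTG, 2.6 | 2.1 (1/48)
NNB | 9 | TAT, 2.9 | GTG, 3.1 | CAG, 2.6 | 2.1 (1/48)
NNB | 10 | TCT, 2.8 | GAG, 2.9 | ACG, 2.6 | 2.1 (1/48)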
We then assessed the nucleotide bias more
directly by estimating
fractions of each nucleotide (Table 3). NNN primers from suppliers 1 and 2 had significantly
higher fractions of T and G than those of A and C. Notably, NNN from
supplier 1 showed an extremely low fraction of A. Proportions of A,
T, G, and C in primers from supplier 3 were similar to the expected
values (25%). In the case of NNB, the expected values were different
because A was not present at the third position of NNB; therefore,
the expected value of fraction of A was 16% and those of T, G, and
C were 28%. In primers from supplier 1, the value of the fraction
of T was highest and that of G was equal to the expected value. However,
the value of the fraction of C was lower than expected. From supplier
2, the values of the fractions of T and G were higher than expected
and that of C was lower than expected. In contrast, the fractions
of A, T, G, and C were highly similar to expectations in both NNN
and NNB primers from supplier 3. A previous study
reported that hand-mixed degenerate primers showed lower nucleotide
bias than machine-mixed degenerate primers when used for PFunkel.[20] In our study, we did not use hand-mixed degenerate
primers for either NNN or NNB codons from any of the three suppliers. The biases
observed in our study might be reduced if hand-mixed degenerate primers
were used.
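The expected nucleotide fractions quoted above follow directly from enumerating the codons in each degenerate scheme. The short Python sketch below illustrates that arithmetic; the function name is hypothetical, and the calculation assumes that every codon in the scheme is equally likely.

```python
from collections import Counter
from itertools import product

# IUPAC codes: N = any base, B = any base except A.
IUPAC = {"N": "ACGT", "B": "CGT"}


def expected_base_fractions(degenerate_codon):
    """Expected fraction of each base over all three positions, assuming equimolar mixing."""
    codons = list(product(*(IUPAC[symbol] for symbol in degenerate_codon)))
    counts = Counter(base for codon in codons for base in codon)
    total = 3 * len(codons)
    return {base: counts[base] / total for base in "ATGC"}


for scheme in ("NNN", "NNB"):
    fractions = expected_base_fractions(scheme)
    print(scheme, {base: f"{100 * f:.1f}%" for base, f in fractions.items()})
```

For NNB the exact values are 1/6 for A and 5/18 for each of T, G, and C, which agree with the rounded expected values of about 16% and 28% used in the text.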
Table 3
Fractions of A, T, G, and C in Each Primer Set, from Each Supplier, Determined by Deep Sequencing
primer set | supplier | A | T | G | C
NNN (a) | 1 | 9% (23 183/258 114) | 41% (106 320/258 114) | 34% (87 976/258 114) | 16% (40 665/258 114)
NNN (a) | 2 | 16% (40 428/251 781) | 38% (95 853/251 781) | 31% (76 807/251 781) | 15% (38 691/251 781)
NNN (a) | 3 | 24% (56 992/240 975) | 26% (61 972/240 975) | 28% (68 464/240 975) | 22% (53 547/240 975)
NNB (b) | 1 | 13% (33 350/256 731) | 39% (98 622/256 731) | 28% (72 912/256 731) | 20% (51 847/256 731)
NNB (b) | 2 | 13% (32 967/266 253) | 35% (93 144/266 253) | 34% (89 211/266 253) | 18% (47 073/266 253)
NNB (b) | 3 | 15% (39 392/259 257) | 29% (74 528/259 257) | 31% (81 671/259 257) | 25% (63 666/259 257)
(a) In NNN, the expected fractions of A, T, G, and C were 25%.
(b) In NNB, the expected fractions were 16% A and 28% each of T, G, and C.
To examine the nucleotide bias in more detail, we
analyzed the
fractions of A, T, G, and C at each of the three positions of each
codon (Table 4). NNN
and NNB primers from suppliers 1 and 2 showed higher fractions of
T and G than those of A and C at all three positions. In contrast,
NNN from supplier 3 showed smaller biases than those from suppliers
1 and 2 at all three positions. This was also the case for NNB primers
from supplier 3. Overall, codon and nucleotide biases were lower in
NNB than in NNN, although the effects of different template sequences
and DNA polymerases were not evaluated in our study. To draw general
conclusions about the difference in biases between NNN and NNB,
further quantitative analysis of different target proteins will be
required.
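One simple way to compare the per-position values reported in Table 4 is to take, for each primer set and supplier, the largest deviation from the expected percentage. The Python sketch below does this with the Table 4 values. This summary statistic is an illustrative choice made here, not a measure used in the study, and the function name is hypothetical.

```python
# Percentages transcribed from Table 4:
# (first, second, third) codon positions, each given as (A, T, G, C).
TABLE4 = {
    ("NNN", 1): ((9, 42, 33, 16), (8, 43, 33, 16), (9, 39, 37, 15)),
    ("NNN", 2): ((16, 38, 30, 16), (17, 38, 30, 15), (16, 39, 31, 14)),
    ("NNN", 3): ((21, 28, 32, 19), (28, 19, 26, 27), (22, 30, 27, 21)),
    ("NNB", 1): ((19, 36, 26, 19), (20, 36, 26, 18), (0, 44, 33, 23)),
    ("NNB", 2): ((18, 34, 30, 17), (19, 33, 31, 17), (0, 39, 41, 20)),
    ("NNB", 3): ((21, 26, 29, 24), (24, 24, 30, 22), (0, 36, 36, 28)),
}

UNIFORM = (25.0, 25.0, 25.0, 25.0)
NO_A = (0.0, 100 / 3, 100 / 3, 100 / 3)  # third position of NNB excludes A
EXPECTED = {"NNN": (UNIFORM, UNIFORM, UNIFORM), "NNB": (UNIFORM, UNIFORM, NO_A)}


def max_abs_deviation(primer_set, supplier):
    """Largest |observed - expected| in percentage points over all positions and bases."""
    observed = TABLE4[(primer_set, supplier)]
    expected = EXPECTED[primer_set]
    return max(
        abs(obs - exp)
        for obs_position, exp_position in zip(observed, expected)
        for obs, exp in zip(obs_position, exp_position)
    )


for key in sorted(TABLE4):
    print(key, f"{max_abs_deviation(*key):.1f} percentage points")
```

Other summaries, such as the sum of absolute deviations, would serve equally well; the point is only that the Table 4 values can be reduced to one number per primer set for a quick comparison.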
Table 4
Fractions of A, T, G, and C at the First, Second, and Third Nucleotide Positions of the Codons in Each Primer Set and Supplier, Determined by Deep Sequencing
primer set | supplier | first nucleotide (%): A | T | G | C | second nucleotide (%): A | T | G | C | third nucleotide (%): A | T | G | C
NNN (a) | 1 | 9 | 42 | 33 | 16 | 8 | 43 | 33 | 16 | 9 | 39 | 37 | 15
NNN (a) | 2 | 16 | 38 | 30 | 16 | 17 | 38 | 30 | 15 | 16 | 39 | 31 | 14
NNN (a) | 3 | 21 | 28 | 32 | 19 | 28 | 19 | 26 | 27 | 22 | 30 | 27 | 21
NNB (b) | 1 | 19 | 36 | 26 | 19 | 20 | 36 | 26 | 18 | 0 | 44 | 33 | 23
NNB (b) | 2 | 18 | 34 | 30 | 17 | 19 | 33 | 31 | 17 | 0 | 39 | 41 | 20
NNB (b) | 3 | 21 | 26 | 29 | 24 | 24 | 24 | 30 | 22 | 0 | 36 | 36 | 28
(a) In NNN, the expected fractions of A, T, G, and C were all 25%.
(b) In NNB, the expected fractions of T, G, and C at the third position were all 33% and the expected fraction of A at the third position was 0%.
Biases in Amino Acids
We also translated our deep sequencing
results into amino acids (Figures 4 and 5). We found obvious biases
in amino acids translated from NNN (Figure 4, top panels). Several amino acids with significantly
higher and lower fractions than expected were observed for suppliers
1–3. Indeed, fractions more than twice the expected
values were observed for valine, phenylalanine, and tryptophan with
both suppliers 1 and 2 (Figure 5, top panels). Furthermore, the fractions of lysine, histidine,
glutamic acid, threonine, asparagine, and glutamine were less than
half of those expected for supplier 1, whereas no amino acids with
significantly low fractions were observed with supplier 2. Codons
from supplier 3 showed only one amino acid, glycine, with a ratio
2 times higher than expected, whereas three amino acids, lysine, arginine,
and isoleucine, showed ratios significantly lower than expected. The
amino acid analysis therefore revealed biases in NNN codons from all
three suppliers (Table 5).
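The expected amino acid fractions used for these comparisons depend only on the standard genetic code and on which codons a degenerate scheme contains. The following Python sketch shows how they can be derived. It assumes equal frequencies for all codons in the scheme, and the function and variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expected_amino_acid_fractions(third_position_bases):
    """Expected fraction of each amino acid (and stop) for N, N, then the given third bases."""
    codons = [c for c in CODON_TABLE if c[2] in third_position_bases]
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


nnn = expected_amino_acid_fractions("ACGT")  # 64 codons
nnb = expected_amino_acid_fractions("CGT")   # 48 codons, no A at the third position

for aa in sorted(nnn):
    print(aa, f"NNN {100 * nnn[aa]:.1f}%", f"NNB {100 * nnb[aa]:.1f}%")
```

An observed/expected ratio for an amino acid is then its observed fraction divided by the corresponding value from `nnn` or `nnb`.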
Figure 4
Observed vs expected fractions of 20 amino acids in primer sets
from each supplier as determined by deep sequencing. The 20 amino
acids were classified and colored on the basis of their properties.
Basic amino acids lysine (K), arginine (R), and histidine (H) are
colored blue. Acidic amino acids aspartic acid (D) and glutamic acid
(E) are colored red. Uncharged polar amino acids serine (S), threonine
(T), tyrosine (Y), asparagine (N), and glutamine (Q) are colored green.
Nonpolar amino acids alanine (A), valine (V), leucine (L), isoleucine
(I), proline (P), phenylalanine (F), methionine (M), tryptophan (W),
glycine (G), and cysteine (C) are colored purple. Stop codons are
colored gray.
Figure 5
Observed/expected ratios
for 20 amino acids in primer sets from
each supplier determined by deep sequencing. All codons determined
by deep sequencing were classified on the basis of 20 amino acids
and stop codons. Observed/expected ratios are depicted in each histogram.
The histograms of NNN and NNB primer sets are colored light blue (top)
and light green (bottom), respectively.
Table 5
Statistics of Observed/Expected Ratios of 20 Amino Acids in Each Primer Set, from Each Supplier, Determined by Deep Sequencing
primer set | supplier | mean | variance | SD | median
NNN | 1 | 1.06 | 1.20 | 1.10 | 0.63
NNN | 2 | 1.11 | 0.66 | 0.81 | 0.79
NNN | 3 | 1.01 | 0.25 | 0.50 | 1.20
NNB | 1 | 1.04 | 0.29 | 0.53 | 0.88
NNB | 2 | 1.03 | 0.23 | 0.48 | 0.92
NNB | 3 | 1.03 | 0.08 | 0.28 | 0.94
Biases were relatively less severe
in amino acids translated from
NNB (Figures 4 and 5, bottom panels). Except for phenylalanine from
supplier 1, and tryptophan, histidine, and asparagine from supplier
2, no observed/expected ratio from NNB was more than 2 or less than 0.5
(Figure 5). The observed/expected ratios were nearly equal to
1 for each amino acid for supplier 3. Variances from NNB were lower
than those from NNN, and, notably, the variance was lowest (0.08, Table 5) for supplier 3.

Overall, the observed and expected fractions of each amino acid
from NNN and NNB codons showed wide variations (Figure 6). Ratios of mutations were significantly
different among amino acids. However, observed fractions were relatively
similar to the expected values in NNB codons from all three suppliers.
Moreover, the variance among amino acid fractions was much lower for
NNB than for NNN (Tables 5 and 6). As a result, frequencies of mutations from NNB
were more uniform than from NNN. In terms of nucleotide, amino acid,
and mutational biases, NNB codons were superior to NNN under the experimental
conditions we evaluated. This was especially true of NNB from supplier
3.
Figure 6
Observed fractions of each amino acid in NNN and NNB from three
suppliers. All histograms show experimental amino acid fractions.
Light blue: NNN (64 codons, top), green: NNB (48 codons, bottom).
The thick horizontal lines (dark blue) on the histograms indicate
expected amino acid fractions.
Table 6
Statistics of Each Experimental Fraction of 20 Amino Acids in Each Primer Set, from Each Supplier, Determined by Deep Sequencing
primer set | supplier | mean (%) | variance | SD (%) | median (%) | stop codon (%)
NNN | 1 | 4.93 | 21.4 | 4.63 | 2.9 | 1.49
NNN | 2 | 4.84 | 10.7 | 3.27 | 3.79 | 3.29
NNN | 3 | 4.74 | 13.9 | 3.73 | 3.50 | 5.19
NNB | 1 | 4.96 | 7.78 | 2.79 | 4.21 | 0.74
NNB | 2 | 4.92 | 8.47 | 2.91 | 4.22 | 1.63
NNB | 3 | 4.94 | 5.31 | 2.30 | 4.35 | 1.28
Saturation Mutagenesis of ChiAB
Multisite saturation
mutagenesis using NNB codons and insertion of tyrosine and serine
residues were performed simultaneously on ChiAB to generate ChiAB-X1YSX2X3. This incorporated the curved
α-helix that stabilizes the CBM-linker domain from ChiB. When
ratios of each amino acid at three positions (X1, X2, and X3) were compared without screening by solubility,
they were very similar to each other (Figure 7). Therefore, the amino acid ratio is not
affected by the position or by the successive introduction of NNB codons. Then,
we obtained 120 water-soluble and catalytically active samples and
24 insoluble and inactive samples. These 144 clones were sequenced,
and amino acid residues in the X1, X2, and X3 positions were determined (Figure 8). The results showed no large biases at
the X1 and X2 positions in either soluble or
insoluble mutants or at the X3 position in insoluble mutants.
However, in water-soluble mutants, there was a remarkable bias at
the X3 position in favor of valine (27/120) and leucine
(56/120) residues. Amino acid residues with basic (K, R, H), acidic
(D, E), or uncharged polar (S, T, Y, N, Q) side chains were not introduced
at the X3 position of the water-soluble mutants, presumably
because these residues could not support correct folding and/or resulted
in aggregation.
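To put the X3 bias in perspective, the observed share of valine and leucine among soluble clones can be compared with the share expected from an unbiased NNB codon. The Python sketch below is a back-of-the-envelope calculation using the counts given above. It assumes equal frequencies for all 48 NNB codons and ignores the primer biases discussed earlier; the variable names are hypothetical.

```python
VAL_CODONS = ("GTT", "GTC", "GTA", "GTG")
LEU_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")
NNB_CODON_COUNT = 4 * 4 * 3  # N, N, then B (T, G, or C)


def in_nnb(codon):
    """NNB excludes codons with A at the third position."""
    return codon[2] in "TGC"


# Expected share of Val + Leu from an unbiased NNB codon.
expected = sum(in_nnb(c) for c in VAL_CODONS + LEU_CODONS) / NNB_CODON_COUNT

# Observed share at X3 among the 120 water-soluble, active clones.
observed = (27 + 56) / 120

enrichment = observed / expected
print(f"expected {100 * expected:.1f}%, observed {100 * observed:.1f}%, "
      f"enrichment {enrichment:.1f}-fold")
```

Under these assumptions, valine and leucine together are several-fold more frequent at X3 in soluble clones than an unbiased NNB codon would give, which is consistent with selection acting at this position rather than with primer bias alone.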
Figure 7
Ratios of each amino acid at X1, X2, and
X3 positions of ChiAB-X1YSX2X3 without screening by solubility. The 265 clones were sequenced,
and coded amino acid residues were counted at each position, irrespective
of solubility of expressed proteins. The thick horizontal lines (dark
blue) on the histograms indicate expected amino acid fractions.
Figure 8
Ratios of each amino acid at X1,
X2, and
X3 positions for soluble and insoluble ChiAB-X1YSX2X3. The soluble 120 clones and insoluble
24 clones were sequenced, and coded amino acid residues were counted
at each position.
Structure Determination of ChiAB-FYSFV
To elucidate
the reason for the bias at the X3 position of ChiAB-X1YSX2X3 mutants, we solved the crystal
structure of one of the most active mutants, ChiAB-FYSFV, at 2.6 Å resolution
(Figure 9a,b, Table 7). In the ChiAB-FYSFV
structure, the ChiA moiety was not significantly different from the
wild-type ChiA structure. Clear electron density was observed for
the CBM domain of ChiB, which was located on the bottom surface of
ChiA. However, the flexible linker region of the CBM-linker from ChiB
did not show clear electron density.
Figure 9
Crystal structure of
ChiAB-FYSFV. Side (a) and top (b) views of
ChiAB-FYSFV (PDB ID: 5ZL9). The ChiA-part is shown in green, and the CBM-linker from ChiB
is shown in blue. (c) The original α-helix in the wild-type
ChiB (PDB ID: 1E6N). (d) The original α-helix in the wild-type ChiA (PDB ID: 1EIB). (e) The α-helix
in the crystal structure of ChiAB-FYSFV. Pink circles indicate the
mutated α-helix.
Table 7
Statistics of Data Collection and Refinement of the Crystal Structure of ChiAB-FYSFV
data collection statistics | SmChiAB-FYSFV (PDB ID: 5ZL9)
beam line | Aichi SR BL2S1
wavelength (Å) | 1.12
space group | C2221
unit cell parameters (Å) | a = 71.8, b = 190.2, c = 132.9
exposure time (frame/s) | 10
number of frames | 360 (ω = 1°)
resolution (Å) | 47.5–2.6 (2.69–2.6)
observed/unique reflections | 418 227/28 411
multiplicity | 14.7 (14.7)
completeness (%) | 100 (100)
Rmerge (%) | 19.8 (101.4)
mean ⟨I/σ(I)⟩ | 14.5 (2.6)
In the wild-type ChiB structure, a curved α-helix
that stabilizes
the CBM-linker domain is observed (Figure 9c). Wild-type ChiA does not have a similarly
curved α-helix, but has an α-helix containing a
short loop at the corresponding position (Figure 9d). In the present study, this α-helix
was mutated by multisite saturation mutagenesis and YS residue insertion.
The results showed that the mutated part of the α-helix in the
ChiAB-FYSFV structure was similar to neither the ChiA nor the ChiB
α-helix. Instead, a straight α-helix was observed (Figure 9e). The side chain
of the valine residue at the X3 position of the X1YSX2X3 (FYSFV) moiety of the mutated α-helix
was oriented toward the inside of the protein core. These results
suggest that the residue at the X3 position of X1YSX2X3 plays an important role in stabilizing
the structure and preventing aggregation, supporting the abovementioned
hypothesis that mutants with basic, acidic, or uncharged polar amino
acid residues at the X3 position cannot fold correctly,
resulting in aggregation. Therefore, it seems very likely that the
bias in favor of valine and leucine residues at the X3 position
reflects their importance for stabilization and solubilization of the ChiAB-X1YSX2X3 mutants.
Conclusions
The method of protein engineering evaluated in this study was based
on one-pot saturation mutagenesis and robot-based automated screening.
According to the nucleotide and amino acid biases in our gain-of-function
GFP mutant experiments, the NNB primer was much more useful than the
NNN primers for multisite saturation mutagenesis. The least-biased
NNB primer was obtained from supplier 3. In this study, we did not
attempt to use other degenerate primers, such as NNK and NNS. The
NNK and NNS primers each encode only 32 codons, including one stop codon,
and therefore carry less redundancy than NNB. They may provide
improved random mutagenesis by mitigating biases. Furthermore, as
a proof-of-concept, we applied our methods to ChiAB with three degenerate
NNB codons and the insertion of tyrosine and serine residues. After
screening 120 active clones, we observed heavy amino acid biases at
the X3 position of ChiAB-X1YSX2X3. This result illustrates how amino acid
bias in mutant generation can affect the efficiency of protein engineering.
With our method, we will try to engineer non-natural motor proteins
with novel functions.
Methods
Reagents
All chemicals
were purchased from Wako. The
template plasmid pEDA5_GFPmut3_Y66H was a gift from Timothy Whitehead
(Michigan State University, Addgene plasmid #80085). PrimeSTAR HS
DNA polymerase was purchased from Takara. The plasmid extraction kit was
purchased from NIPPON Genetics Co., Ltd. The Wizard SV gel and PCR clean-up
system was purchased from Promega. All primers (containing NNN or
NNB, where B = T/G/C) were purchased from supplier 1 (Fasmac), supplier
2 (Eurofins), and supplier 3 (Integrated DNA Technologies).
Primer Design for Saturation Mutagenesis of Loss-of-Function GFP
All NNN degenerate codon primers contained all 64 codons,
including three stop codons. The forward primer (5′-CACTTGTCACTACTTTCGGTNNNGGTGTTCAATGCTTTGCG-3′), containing one degenerate
NNN motif, had a melting temperature (Tm) of 68 °C. The NNB primers included 48 codons, including one
stop codon. The forward primer (5′-CACTTGTCACTACTTTCGGTNNBGGTGTTCAATGCTTTGCG-3′), containing one degenerate
NNB motif, had a Tm of 68 °C. The
concentration of all primers was adjusted to 10 pmol/μL for
PCR. The following reverse primer was used for all saturation mutagenesis
experiments: 5′-ACCGAAAGTAGTGACAAGTGTTGGCCATGGAACAGGTAG-3′.
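The codon and stop-codon counts stated for the NNN and NNB primers can be checked by expanding each degenerate codon with the IUPAC base codes. The Python sketch below does this and also covers NNK and NNS, which were not tested in this study; the function names are hypothetical.

```python
from itertools import product

# IUPAC base codes for the degenerate positions considered here.
IUPAC = {"N": "ACGT", "B": "CGT", "K": "GT", "S": "CG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expand(degenerate_codon):
    """Return every concrete codon encoded by a three-letter degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate_codon))]


def summarize(degenerate_codon):
    """Return (number of codons, sorted list of stop codons) for a degenerate codon."""
    codons = expand(degenerate_codon)
    stops = sorted(c for c in codons if c in STOP_CODONS)
    return len(codons), stops


for scheme in ("NNN", "NNB", "NNK", "NNS"):
    n_codons, stops = summarize(scheme)
    print(f"{scheme}: {n_codons} codons, stop codons: {', '.join(stops)}")
```

Excluding A at the third position removes TAA and TGA, so TAG is the only stop codon remaining in NNB, NNK, and NNS.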
PCR and Treatments
All three-step PCRs were carried out with PrimeSTAR HS polymerase.
The PCR mixture was as follows: 10 μL of 5× PS buffer,
4 μL of 2.5 mM dNTP, 0.5 μL of PrimeSTAR HS DNA polymerase,
2 μL of 10 pmol/μL primer mix, and 1 ng of template plasmid,
made up to a volume of 50 μL with sterilized water. The thermocycling
protocol was as follows: 30 cycles of 98 °C for 10 s, 55 °C
for 5 s, and 72 °C for 4 min and 30 s. After amplification, all
products were incubated with 1 μL of (NEB) for 15 min at 37 °C to digest the template
plasmid in the reaction mixture. To increase the efficiency of SLiCE cloning,
the products were electrophoresed in 1% agarose, and each fragment was extracted
and purified with the gel clean-up system.
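As a quick check of the recipe, the final concentrations in the 50 μL reaction and the nominal cycling time can be worked out from the stated volumes. This is a rough Python sketch; the function name is hypothetical, and it assumes that "10 pmol/μL primer mix" refers to the concentration of the mix as a whole.

```python
# Volumes and stock concentrations taken from the PCR mixture described above.
TOTAL_VOLUME_UL = 50.0

def final_concentration(stock, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Simple dilution: final = stock * (added volume / total volume)."""
    return stock * volume_ul / total_ul

dntp_mM = final_concentration(2.5, 4.0)     # 2.5 mM dNTP stock, 4 uL added
primer_uM = final_concentration(10.0, 2.0)  # 10 pmol/uL equals 10 uM, 2 uL added
buffer_x = final_concentration(5.0, 10.0)   # 5x PS buffer, 10 uL added

# One cycle: 98 C for 10 s, 55 C for 5 s, 72 C for 4 min 30 s.
cycle_seconds = 10 + 5 + (4 * 60 + 30)
total_minutes = 30 * cycle_seconds / 60     # hold times only, ramping ignored

print(f"dNTP {dntp_mM:.2f} mM, primer mix {primer_uM:.2f} uM, buffer {buffer_x:.0f}x")
print(f"cycling time: {total_minutes:.1f} min")
```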
SLiCE Cloning and Transformation
SLiCE cloning uses cellular extracts to join DNA fragments by
homologous recombination
in vitro. SLiCE and 10× SLiCE reaction buffer (0.5 M Tris–HCl
pH 7.5, 100 mM MgCl2, 10 mM ATP, and 10 mM DTT) were prepared
as previously described.[15] Briefly, the purified
product (20 ng), 1 μL of SLiCE, and 1 μL of 10× SLiCE reaction buffer
were mixed in a reaction volume of 10 μL. The mixture was incubated
at 37 °C for 30 min. After the SLiCE reaction, the samples were
immediately used for transformation or stored at −30 °C.
The SLiCE reaction mixture (5 μL) was added to 50 μL of Tuner(DE3)
competent cells on ice and mixed gently. Within 1 min, the mixture
was transferred to a cold cuvette, and transformation was carried
out with a MicroPulser electroporator (Bio-Rad). Immediately after
electroporation, 300 μL of ice-cold super optimal broth with catabolite
repression (SOC) was added. The transformed cells were incubated at
37 °C for 30 min. Finally, 150 μL of each transformed culture
was spread on a 15 cm LB-agar plate containing 50 μg/mL ampicillin
and incubated at 37 °C overnight.
Colony Counting
Colonies were counted under a blue/green
LED (Handy Blue/Green LED). The recovery rate was defined as the percentage of illuminated
colonies among all colonies for each primer set and supplier
[(number of illuminated colonies/total number of colonies) ×
100]. For calculating the fraction of gain-of-function
by deep sequencing, the numbers of TAT and TAC codons were counted [(number
of TAT and TAC/total number of counts) × 100]. For calculating
the expected fraction of gain-of-function, the number of TAT and TAC
codons was counted and divided by the number of codons in each primer
set [(number of TAT and TAC/total number of codons) × 100].
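The percentages defined above reduce to simple ratios. The sketch below, in Python, computes the expected gain-of-function fraction for each primer set and a recovery rate; the function names and the colony counts are hypothetical and are not data from this study.

```python
from itertools import product

GAIN_OF_FUNCTION = {"TAT", "TAC"}  # the two codons counted as gain-of-function

def percentage(part, total):
    return 100.0 * part / total

def expected_fraction(third_position_bases):
    """Expected gain-of-function percentage if every codon is equally likely."""
    codons = ["".join(c) for c in product("ACGT", "ACGT", third_position_bases)]
    hits = sum(codon in GAIN_OF_FUNCTION for codon in codons)
    return percentage(hits, len(codons))

expected_nnn = expected_fraction("ACGT")  # 2 of 64 codons
expected_nnb = expected_fraction("CGT")   # 2 of 48 codons

# Hypothetical plate: 45 illuminated colonies out of 1500 in total.
recovery = percentage(part=45, total=1500)

print(f"expected NNN: {expected_nnn:.2f}%, expected NNB: {expected_nnb:.2f}%")
print(f"recovery rate (hypothetical counts): {recovery:.2f}%")
```

Because both TAT and TAC end in a base allowed by B, removing the 16 codons that end in A raises the expected fraction from 2/64 to 2/48.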
Comparison of PCR Products with 64 Primers Encoding Different Codons
The pEDA5_GFPmut3_Y66H was amplified with 64 kinds
of forward primers encoding different codons (5′-CACTTGTCACTACTTTCGGTXXXGGTGTTCAATGCTTTGCG-3′, where X is A, C, G, or T)
and the same reverse primer as described above. The other PCR conditions
were the same as in the saturation mutagenesis experiments. Products
(2 μL) were electrophoresed in 1% agarose.
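The 64 defined-codon primers differ only at the XXX position, so the full set can be listed programmatically. The Python sketch below is illustrative; the function name is an assumption, and the template string is the forward primer given above.

```python
from itertools import product

TEMPLATE = "CACTTGTCACTACTTTCGGTXXXGGTGTTCAATGCTTTGCG"

def defined_codon_primers(template=TEMPLATE, placeholder="XXX"):
    """Map each of the 64 codons to the primer that carries it at the placeholder."""
    primers = {}
    for bases in product("ACGT", repeat=3):
        codon = "".join(bases)
        primers[codon] = template.replace(placeholder, codon)
    return primers

primers = defined_codon_primers()
print(len(primers), "primers")
print("TAT:", primers["TAT"])
```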
Deep Sequencing
All colonies on each plate were collected
with 10 mL of LB medium and centrifuged at 4000g for
10 min. The numbers of transformants used for deep sequencing were as follows:
supplier 1 (NNN: 1453, NNB: 1618), supplier 2 (NNN: 2048, NNB: 1573),
and supplier 3 (NNN: 1527, NNB: 2128). Plasmid mixtures were extracted
from each cell pellet by a plasmid extraction kit, diluted to 1 ng/μL,
and amplified by PCR for deep sequencing. The forward primer was 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG + ATATTCAGGGAGACCACAACGGTTTC-3′,
and the reverse primer was 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG + GTGTCTTGTAGTTCCCGTCATCTTTG-3′. Both primers had additional
5′ adapter sequences (shown before the "+") for deep sequencing. The PCR was performed
in two steps, each with 30 cycles of 94 °C for 10 s and 68 °C
for 4 min and 30 s, with PrimeSTAR HS DNA polymerase. All products
were electrophoresed in 1% agarose and purified with a gel clean-up
system. All subsequent steps were performed by Hokkaido
System Science Co., Ltd. Additional adapter sequences were attached
with KAPA HiFi DNA polymerase (HotStart ReadyMix) in a second, index
PCR (preincubation at 95 °C for 3 min; 8 cycles of 95 °C for
10 s, 55 °C for 30 s, and 72 °C for 30 s, followed by 72
°C for 5 min). Nextera XT index primers N7xx and S5xx were used.
All fragments were purified with AMPure XP beads (Beckman Coulter Genomics).
Deep sequencing was performed with the MiSeq system (Illumina) by Hokkaido
System Science Co., Ltd. The processed codon counts used for analysis
were as follows: supplier 1 (NNN: 86 048, NNB: 82 657),
supplier 2 (NNN: 83 927, NNB: 79 696), and supplier
3 (NNN: 80 325, NNB: 83 835). For calculation of the
gain-of-function fraction from the results of deep sequencing, the
numbers of TAT and TAC codons were counted and used [(number of TAT
and TAC/total number of counts) × 100]. For calculation of the
expected value of the gain-of-function fraction, the number of TAT and
TAC codons in each primer set was counted and divided by the total
number of codons.
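One way to obtain codon counts from reads is to locate the mutated position by its flanking sequence. The Python sketch below is an assumption about how such counting could be done, not the pipeline used for the processed data; the flanks are taken from the forward mutagenic primer, reads are assumed to be in the coding orientation, and the example reads are invented.

```python
from collections import Counter

# Flanking sequences taken from the forward mutagenic primer.
UPSTREAM = "CTACTTTCGGT"
DOWNSTREAM = "GGTGTTCAATGC"

def codon_at_site(read):
    """Return the codon between the flanks, or None if the read does not match."""
    start = read.find(UPSTREAM)
    if start == -1:
        return None
    site = start + len(UPSTREAM)
    codon = read[site:site + 3]
    if len(codon) == 3 and read.startswith(DOWNSTREAM, site + 3):
        return codon
    return None

def gain_of_function_percent(reads):
    """Percentage of matched reads carrying TAT or TAC at the mutated site."""
    counts = Counter(c for c in map(codon_at_site, reads) if c is not None)
    total = sum(counts.values())
    return 100.0 * (counts["TAT"] + counts["TAC"]) / total if total else 0.0

# Hypothetical reads, for illustration only.
reads = [
    "ACTTGTCACTACTTTCGGTTATGGTGTTCAATGCTTTG",
    "ACTTGTCACTACTTTCGGTCATGGTGTTCAATGCTTTG",
    "ACTTGTCACTACTTTCGGTTACGGTGTTCAATGCTTTG",
    "ACTTGTCACTACTTTCGGTGCGGGTGTTCAATGCTTTG",
]
print(gain_of_function_percent(reads))
```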
Multisite Saturation Mutagenesis for ChiAB-X1YSX2X3
To generate a curved α-helix in the ChiA helix (V373, A374, Y375), tyrosine
and serine residues were inserted between V373 and A374. The three
adjacent residues (373–375) were mutated with degenerate NNB
codons. The synthetic ChiAB gene was cloned into
pET-27b with and restriction sites. To introduce three-site
saturation mutagenesis and the two extra residues, tyrosine and serine
(X1YSX2X3), the following primers
were used: forward primer 5′-GACAAGATCGACAAGNNBTATAGCNNBNNBAACGTTGCGCAGAACTCGATGGATCACATC-3′
(Tm 70 °C) and reverse primer 5′-CTTGTCGATCTTGTCCTTACCGGCGCTGATG-3′
(Tm 62 °C). All reactions were carried
out with three-step PCRs using PrimeSTAR HS DNA polymerase. The PCR
mixture consisted of 10 μL of 5× PS buffer, 4 μL
of 2.5 mM dNTP, 0.5 μL of PrimeSTAR HS DNA polymerase, 2 μL
of 10 pmol/μL primer mix, and 1 ng of template plasmid, made
up to a volume of 50 μL with sterilized water. The thermocycling
protocol included 30 cycles of 98 °C for 10 s and 68 °C for
7 min. All products were incubated with 1 μL of (NEB) for 15 min at 37 °C to digest the
template plasmid in the reaction mixture. To increase the efficiency
of SLiCE cloning, the products were electrophoresed in 1% agarose, and each
amplified linear plasmid was extracted and purified with a gel clean-up
system. The fragments were ligated by SLiCE and transformed into E. coli cells by electroporation. Transformed cells
were spread on agar plates containing 25 μg/mL of kanamycin and incubated
at 37 °C overnight. The SLiCE reaction and transformation protocols
were the same as described above.
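The theoretical size of this three-site library follows from the number of NNB motifs in the forward primer. The short Python sketch below is only an arithmetic illustration and assumes that all codons are equally represented; the variable names are introduced here.

```python
FORWARD_PRIMER = "GACAAGATCGACAAGNNBTATAGCNNBNNBAACGTTGCGCAGAACTCGATGGATCACATC"

CODONS_PER_NNB = 4 * 4 * 3   # N, N, B
STOP_CODONS_PER_NNB = 1      # TAG is the only stop codon ending in C, G, or T

n_sites = FORWARD_PRIMER.count("NNB")
dna_variants = CODONS_PER_NNB ** n_sites
stop_free_variants = (CODONS_PER_NNB - STOP_CODONS_PER_NNB) ** n_sites
protein_variants = 20 ** n_sites  # NNB still encodes all 20 amino acids

print(n_sites, "randomized sites")
print(dna_variants, "codon combinations")
print(f"{100.0 * stop_free_variants / dna_variants:.1f}% without a stop codon")
print(protein_variants, "amino acid combinations")
```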
Ratio of Each Amino Acid at X1, X2, and X3 Positions of ChiAB-X1YSX2X3
A 96-well deep well plate was used for small-scale
cultivation. Super Broth (1 mL, 3.2% tryptone, 2% yeast extract, and
0.5% sodium chloride) with 25 μg/mL of kanamycin was added to
each well and inoculated with a colony from the plate. Master plates
were prepared to store the cells. The 96-well deep well plate was
cultured overnight at 37 °C with shaking at 1300 rpm and
centrifuged at 4400g for 10 min. Harvested cells
were used for plasmid extraction, and the obtained codons and amino
acids at each position were analyzed.
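To judge bias, the observed amino acid ratios can be compared with the ratios expected from an unbiased NNB codon. The Python sketch below shows one way to tabulate both; the helper name and the example codons are hypothetical, and the codon table is the standard genetic code.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def percent_by_amino_acid(codons):
    """Translate codons and return the percentage of each amino acid."""
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: 100.0 * n / len(codons) for aa, n in sorted(counts.items())}

# Expected ratios for one unbiased NNB codon (third base restricted to T, C, G).
nnb_codons = ["".join(c) for c in product(BASES, BASES, "TCG")]
expected = percent_by_amino_acid(nnb_codons)

# Hypothetical codons sequenced at one position, for illustration only.
observed = percent_by_amino_acid(["GTT", "GTC", "TTT", "GTG"])

for aa, pct in expected.items():
    print(aa, f"expected {pct:.1f}%", f"observed {observed.get(aa, 0.0):.1f}%")
```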
Small-Scale Culture, Purification, and Activity Measurement of ChiAB-X1YSX2X3 by Liquid-Handling Robot
Small-scale cultivation was essentially the same as described
above except that the 96-well deep well plate was cultured at 30 °C
with shaking at 1300 rpm until the cells reached an OD600 ∼ 1. Then, 0.25 mM isopropyl β-d-1-thiogalactopyranoside
(IPTG) was added, and the cells were cultured at 20 °C overnight.
Cells were then harvested by centrifugation (4400g at 25 °C for 10 min). To disrupt the cells, 300 μL of
BugBuster (Novagen) containing 10 units/mL of Benzonase (Novagen) was
added to each well, followed by shaking at 1000 rpm for 20 min. The
disrupted cells were centrifuged at 4400g at 25 °C
for 10 min. A Biomek 4000 liquid-handling robot (Beckman Coulter) was used for purification of
ChiAB-X1YSX2X3 mutants. After centrifugation,
the supernatant was transferred to a new 96-well deep well plate and
100 μL of a 50% slurry of Ni-NTA agarose (QIAGEN) was added to each
well. Mixing was then carried out at 1000 rpm for 5 min to facilitate
binding of the target protein. The Ni-NTA agarose was washed twice with 200
μL of buffer A (50 mM sodium phosphate, pH 7.0) and three times with 200 μL
of buffer B (50 mM sodium phosphate, pH 7.5, 50 mM imidazole),
and the protein was eluted with 150 μL of buffer C (50 mM sodium phosphate, pH
7.5, 100 mM imidazole). The eluted solution (5 μL) was loaded on
a 12% acrylamide gel to check the purity. Purified enzymes (18 μL)
were incubated with 0.1% (w/v) crystalline β-chitin in 100 mM
sodium phosphate buffer (pH 6.0) at 37 °C for 8 min in a reaction
volume of 180 μL. Reactions were stopped by adding
240 μL of Schales' reagent (500 mM sodium carbonate, 1.5 mM
potassium ferricyanide), and insoluble chitin was separated on 96-well
0.45 μm PVDF filter plates. The filtered solution was heated
at 95 °C for 15 min, and 200 μL of the samples were transferred
to 96-well plates. Absorbance at 420 nm was measured, and amounts
of soluble products were calculated from a standard curve with chitobiose.
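Converting absorbance to product amount with a standard curve amounts to a linear fit followed by inversion. The Python sketch below assumes a linear standard curve; the standard concentrations and A420 readings are invented for illustration, and the function names are introduced here. The invented readings fall as concentration rises, on the assumption that ferricyanide is consumed by reducing sugars.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def to_concentration(absorbance, slope, intercept):
    """Invert the standard curve to get a concentration from an absorbance."""
    return (absorbance - intercept) / slope

# Hypothetical chitobiose standards (uM) and A420 readings, for illustration only.
standards_uM = [0.0, 50.0, 100.0, 150.0, 200.0]
a420 = [1.00, 0.90, 0.80, 0.70, 0.60]

slope, intercept = fit_line(standards_uM, a420)
sample_uM = to_concentration(0.85, slope, intercept)
print(f"slope {slope:.4f} per uM, intercept {intercept:.2f}, sample {sample_uM:.1f} uM")
```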
Plasmid Extraction and Sequencing of Active Mutants
Each clone was inoculated from the master plate into 10 mL of LB medium
containing 25 μg/mL of kanamycin and cultured at 37 °C overnight.
The cells were harvested by centrifugation (4000g at 4 °C for 10 min). Plasmid extractions were performed following
the manufacturer’s protocol. Sequencing of all plasmids was
carried out by FASMAC Co., Ltd.
Large-Scale Culture and Purification of ChiAB-FYSFV
ChiAB-FYSFV, one of the ChiAB-X1YSX2X3 mutants, was selected on the basis of its
solubility and activity. Cells were
inoculated from the master plate into 10 mL of LB medium and cultured
overnight at 37 °C. Overnight culture (5 mL) was added to 1 L
of LB medium. After the cells were grown in LB medium containing 25
μg/mL of kanamycin at 37 °C to an OD600 ∼
1, protein overexpression was induced by adding 0.5 mM IPTG, followed
by overnight incubation at 20 °C. After harvesting by centrifugation
at 8000g for 20 min, the cells were resuspended in
100 mL of 100 mM Tris–HCl (pH 8.0) and sonicated in the presence of an ethylenediaminetetraacetic
acid-free protease inhibitor cocktail (Complete Mini, Roche). Then,
2.5 mL of 4 M NaCl and 5 mL of Ni-NTA Superflow (50% slurry, QIAGEN)
were added directly to the supernatant, which was mixed gently
and incubated for 5 min at 25 °C. The Ni-NTA resin was packed
into an open column and washed with a buffer (50 mM sodium phosphate,
pH 7.0, 100 mM NaCl) containing 50 mM imidazole. The sample was eluted
with the same buffer containing 100 and 200 mM imidazole in a stepwise
manner. Protein fractions were mixed and concentrated to 500 μL
using a VIVASPIN Turbo 15 concentrator with a 10 kDa cutoff (Sartorius). The concentrated
sample was further purified on a Superdex 200 10/300 GL column (GE Healthcare)
equilibrated with a buffer (50 mM sodium phosphate, pH 7.0, 100 mM
NaCl). Peak fractions were combined and concentrated to 20 mg/mL for
crystallization.
Crystallization, Data Collection, and Structure Determination
The protein concentration of ChiAB-FYSFV was adjusted
to 20 mg/mL.
All drops contained 1 μL of protein solution and 1 μL
of reservoir solution. Plate-shaped crystals were obtained in a few days at 20 °C over a
wide range of sodium citrate concentrations (0.4–0.7 M) at
pH 6.4–7.4 with 5–10% MeOH.
These crystals were taken to beamline BL2S1 at the Aichi Synchrotron.[21] The crystals were washed briefly in mother liquor
containing 30% (v/v) glycerol and flash-cooled in a nitrogen gas stream.
Data were collected using an ADSC Q315r detector (1.12 Å wavelength,
1° oscillation angle). All datasets were processed with XDS[22] (XDSGUI and XDSME[23]). ChiAB-FYSFV crystals belonged to space group C222₁ (a = 71.8 Å, b = 190.2 Å, c = 133.0 Å, α = β = γ = 90°).
The initial structure was solved by molecular replacement with Phaser-MR
(PHENIX[24]) using ChiA (PDB ID: 1eib) and the chitin-binding
domain of ChiB (PDB ID: 1e6n) as template structures. Further modeling and refinement
were carried out with COOT[25] and phenix.refine
(PHENIX[24]), respectively. Finally, the
ChiAB-FYSFV structure was determined at a resolution of 2.6 Å.
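Two quantities follow directly from the reported parameters: the unit-cell volume, which for an orthorhombic cell is the product of the three edges, and the scattering angle at the resolution limit from Bragg's law (λ = 2d sin θ). The Python sketch below is an illustrative calculation only; the variable names are introduced here.

```python
import math

# Reported unit-cell edges (angstroms); all angles are 90 degrees.
a, b, c = 71.8, 190.2, 133.0
wavelength = 1.12   # angstroms
resolution = 2.6    # angstroms, d-spacing at the resolution limit

# Orthorhombic cell: volume is the product of the edges.
volume = a * b * c

# Bragg's law: wavelength = 2 * d * sin(theta).
theta = math.asin(wavelength / (2.0 * resolution))
two_theta_deg = math.degrees(2.0 * theta)

print(f"unit-cell volume: {volume:.0f} cubic angstroms")
print(f"scattering angle at {resolution} angstroms: {two_theta_deg:.1f} degrees")
```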