Computer-Assisted Recombination (CompassR) Teaches us How to Recombine Beneficial Substitutions from Directed Evolution Campaigns.

Haiyang Cui, Hao Cao, Haiying Cai, Karl-Erich Jaeger, Mehdi D Davari, Ulrich Schwaneberg.

Abstract

A main remaining challenge in protein engineering is how to recombine beneficial substitutions. Systematic recombination studies show that poorly performing variants are usually obtained after recombination of 3 to 4 beneficial substitutions. This limits researchers in exploiting nature's potential in generating better enzymes. The Computer-assisted Recombination (CompassR) strategy provides a selection guide for beneficial substitutions that can be recombined to gradually improve enzyme performance by analysis of the relative free energy of folding (ΔΔG fold). The performance of CompassR was evaluated by analysis of 84 recombinants at 13 positions of Bacillus subtilis lipase A. The finally obtained variant F17S/V54K/D64N/D91E had a 2.7-fold improved specific activity in 18.3 % (v/v) 1-butyl-3-methylimidazolium chloride ([BMIM][Cl]). In essence, the deduced CompassR rule allows recombination of beneficial substitutions in an iterative manner and empowers researchers to generate better enzymes in a time-efficient manner.
© 2020 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

Keywords:  Bacillus subtilis lipase A; directed evolution; foldX; protein engineering; recombination

Year:  2019        PMID: 31553080      PMCID: PMC7003928          DOI: 10.1002/chem.201903994

Source DB:  PubMed          Journal:  Chemistry        ISSN: 0947-6539            Impact factor:   5.236


Introduction

Directed evolution of proteins has matured into a powerful methodology to improve enzyme properties, such as stability, selectivity, and specific activity.1 The Nobel Prize in chemistry in 2018 was awarded in recognition of the significant impact of directed evolution on both the gaining of scientific knowledge and applications in the chemical industries and in medicine.2 The beauty of directed protein evolution is that any protein property that can be reflected in the employed screening/selection system can, within physical boundaries, be improved without any molecular understanding or hypothesis.3 Subsequent analysis of the identified amino acid exchanges enables the discovery of new fundamental design principles of enzymes. The key technologies required for directed evolution are diversity generation and high‐throughput screening. The diversity generation challenge is largely solved today, with the development of random and multi‐site saturation mutagenesis methods.2 For example, epPCR already generates ≈10^12 variants under standard error‐prone conditions within two to three hours.4 Remaining challenges are how to navigate through the huge protein sequence space (numbers problem in screening) and how to recombine beneficial substitutions. Beneficial substitutions can be obtained from directed evolution experiments after screening a few thousand variants or by (semi‐)rational design studies.
Numerous reports point out that recombining more than two or three beneficial substitutions does not necessarily yield further improved enzyme variants.5 Additionally, the best performing variants are often obtained in early stages of recombination, for example, after one or two recombined substitutions.6, 7 Several studies also point out that beneficial substitutions drive each other to “extinction”.8 Interestingly, the simultaneous site saturation of two sets of amino acids (each comprising five beneficial positions with four to five substitutions per recombined position) yielded after screening of 1500 variants a fraction of only 0.67 % active clones for the phenylacetone monooxygenase (PAMO, 10 clones).9 Comparable results were reported for the alcohol dehydrogenase (cpADH5)7d with a fraction of 1.2 % of active clones (4 simultaneously saturated positions, screening of 3500 variants). In another report, ten identified positions in limonene epoxide hydrolase (LEH) were simultaneously recombined using a multi‐site directed mutagenesis method (one substitution per position) and after screening of 3320 clones, 533 active variants were obtained. The most beneficial variant with inverted enantioselectivity had only three substitutions.10 All of these reports confirm that recombination experiments are limited by a low fraction of active recombinants, and that rules and methods to guide recombination are highly desirable in the field of protein engineering for generating better performing catalysts. How can we ensure that enzymes are active after several iterative recombinations?
Several factors affect enzyme activity and function (e.g., substrate binding,11 product release,12 temperature,13 pH14); however, it is generally accepted that enzymes must be able to fold stably in order to function properly.15 The relationship between stability and function of an enzyme (referred to as the stability‐activity tradeoff) is well studied with respect to thermostability and catalytic activity.16 The relative free energy of folding (ΔΔG fold) has been employed as a measure of protein stability and to assess the relationship between stability and function in several enzymes (e.g., TEM‐1 β‐lactamase,17 cytochrome P450 BM3,18 green fluorescent protein avGFP19 and others16c, 20). It is known that most proteins are marginally stable, and substitutions can be tolerated until the “robustness threshold” is reached.17b, 17d Variants with higher stability tend to have higher protein fitness,17c and extra stability could increase evolvability to accept a wider range of beneficial substitutions.18 All of the above studies indicate that ΔΔG fold is an important factor for predicting the evolvability and/or performance of proteins.
To analyze the stability of single substitutions, researchers have used the reported accuracy of ΔΔG fold predictors to bin ΔΔG fold values into several stabilizing/destabilizing categories.16c Computed ΔΔG fold values (in kcal mol−1) of single substitutions are regarded as highly stabilizing (<−1.84), stabilizing (−1.84 to −0.92), slightly stabilizing (−0.92 to −0.46), neutral (−0.46 to +0.46), slightly destabilizing (+0.46 to +0.92), destabilizing (+0.92 to +1.84), and highly destabilizing (>+1.84).12 Several computational protein stability predictors are available to calculate ΔΔG fold, for example, FoldX,21 Rosetta,22 CUPSAT,23 PoPMuSiC16b and others.24 Although stabilizing/destabilizing categories can be applied to classify single substitutions,16c thresholds of ΔΔG fold values for the recombination of single beneficial substitutions are still missing. We selected FoldX as it is a popular and reliable method for determining changes in the free energy of folding caused by substitutions. Compared with other predictors, FoldX achieved the highest correlation (r=0.96) for binned data in a recent evaluation25 and has often been used successfully for identifying beneficial positions.16c, 24, 26 Bacillus subtilis lipase A (BSLA) is a well‐studied enzyme and was chosen to develop CompassR as a predictor for recombining substitutions. The “BSLA‐SSM” library covers all the natural diversity with a single amino acid exchange at each position of BSLA (in total 181 positions; 3439 variants; “site‐saturation mutagenesis” denoted as “SSM”). The “BSLA‐SSM” library was constructed in our previous study27 and screened for improved 1‐butyl‐3‐methylimidazolium chloride ([BMIM][Cl]) resistance. CompassR was developed by selecting 13 positions in three genes, which encoded in total 39 substitutions that were recombined in a staggered extension process (StEP) library.
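The seven stability bins listed above can be expressed as a small lookup. The following is a minimal sketch, not the authors' code: the function name `stability_bin` is hypothetical, and the treatment of a value lying exactly on a bin edge (assigned to the upper bin here) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Bin edges in kcal/mol as listed in the text (multiples of 0.46).
EDGES = [-1.84, -0.92, -0.46, 0.46, 0.92, 1.84]
LABELS = [
    "highly stabilizing",
    "stabilizing",
    "slightly stabilizing",
    "neutral",
    "slightly destabilizing",
    "destabilizing",
    "highly destabilizing",
]


def stability_bin(ddg_fold):
    """Return the stability label for a computed ddG_fold (kcal/mol).

    Assumption: a value exactly on an edge falls into the upper bin.
    """
    return LABELS[bisect_right(EDGES, ddg_fold)]


if __name__ == "__main__":
    # ddG_fold values taken from Table 1 of this article.
    for name, ddg in [("G155P", -1.49), ("F17S", -0.03), ("Y129N", 1.83), ("V165E", 4.89)]:
        print(name, ddg, stability_bin(ddg))
```

With these edges, G155P (−1.49) is the only Table 1 substitution in a stabilizing bin, while F17S, D64N, V54K, and D91E all fall into the neutral bin.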
Out of 39 substitutions, 13 beneficial substitutions (one substitution per position) were finally selected based on their ΔΔG fold values for further recombination studies. The calculated ΔΔG fold values of the 13 substitutions were used to place them into three categories. Three of the most stabilizing substitutions, F17S, V54K, and G155P, were selected for two recombination campaigns (“intra‐category” and “inter‐category”) with all other substitutions, and up to four subsequent recombination experiments were performed, generating in total 84 recombinants (see Figure 2). Analysis of the activity of the BSLA recombinants and their corresponding ΔΔG fold values was used to define the CompassR rule for recombination of beneficial substitutions.
Figure 2

Overview of all BSLA recombinants generated in the recombination of each category (“intra‐category”) and the beneficial substitutions F17S, V54K and G155P with beneficial substitutions from categories A (light green), B (light blue), and C (light purple) (“inter‐category”). Categories (A, B, and C; on the left) are composed of 13 selected beneficial substitutions obtained from the BSLA‐SSM library and grouped according to their ΔΔG fold values. Notations of recombinants: dark green: residual activity (in buffer) ≥80 % of the BSLA wild type activity. Orange: residual activity (in buffer) between 10–80 % of the BSLA wild type activity. Red: residual activity (in buffer) is between 0–10 % of the BSLA wild type activity and referred to as “inactive” recombinant.

Results

The results section is divided into four parts to illustrate how the CompassR rule was developed. The first part describes the results of standard recombination experiments in which 39 BSLA beneficial substitutions (39=3 substitutions×13 positions) were distributed over three (synthetic) genes and recombined with the staggered extension process (StEP) method.28 The analysis of the StEP recombination library demonstrated that the recombination challenge applies to BSLA in the same way as to previously reported enzymes (see Introduction; for example, Pseudomonas aeruginosa lipase,5 β‐glucuronidase,8b PAMO,9 cpADH5,7d LEH10). The second part describes the ΔΔG fold calculation and recombination analysis. In detail, the 13 beneficial substitutions were placed in three categories based on their ΔΔG fold and recombined in different modes (“intra‐category” and “inter‐category” recombination; in total 84 variants). In the third part, the CompassR rule was postulated based on the obtained recombination results. In the concluding fourth part, 33 variants were analyzed in detail, and a molecular understanding of BSLA′s improved resistance towards the ionic liquid [BMIM][Cl] is provided.

BSLA recombination by the StEP method to obtain the fraction of active population after recombination

Thirty‐nine beneficial substitutions at 13 positions were identified in the “BSLA‐SSM” library.27 The 13 mutated positions were selected according to the following criteria: 1) the distance between neighboring positions was larger than the minimum gap distance in the gene that can be resolved by the StEP method (≈30 bp),28 2) the targeted positions were evenly distributed over the whole bsla gene, and 3) at least 3 of the 19 possible substitutions at each selected position were beneficial. To enable an efficient recombination by the StEP method,28 the 39 substitutions were distributed over three synthetic genes (three different substitutions per position, Figure S1 and Table S1 in Supporting Information) and recombined with the BSLA wild type (“fourth” substitution per position) employing the StEP method; the latter generates a theoretical diversity of 4^13 (≈10^8) different variants. The recombination of the 13 selected positions of BSLA yielded mainly inactive variants (82 %) after screening of approximately 5000 clones. Sequencing of 30 randomly chosen variants showed that all eleven active variants harbored one to three substitution(s). In detail, 3 had one substitution, 6 had two, and 2 had three (Figure 1, Table S2 in Supporting Information). Inactive recombinants of BSLA harbored two to eleven substitutions. The high fraction of inactive recombinants of BSLA and the low number of substitutions in active BSLA variants correlate well with reports on the recombination challenge (see Introduction; refs. 8b, 9, 10). The “best” variant obtained from the StEP‐BSLA recombination experiment after screening of 5000 variants was the BSLA recombinant F17S/V54K/Y129M with a 1.7‐fold improved [BMIM][Cl] resistance when compared to BSLA wild type.
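The theoretical diversity quoted above follows from four options (three substitutions plus the wild-type residue) at each of 13 positions. As a brief worked example (variable names are illustrative), the arithmetic and the resulting screening coverage can be checked as follows:

```python
# Worked arithmetic for the StEP library size described in the text.
positions = 13
options_per_position = 4  # three substitutions + wild-type residue

theoretical_diversity = options_per_position ** positions
screened_clones = 5000  # approximate number of clones screened

# Fraction of the theoretical sequence space sampled by the screen.
coverage = screened_clones / theoretical_diversity

if __name__ == "__main__":
    print(f"theoretical diversity: {theoretical_diversity:,}")
    print(f"coverage: {coverage:.2e}")
```

The exact value, 67,108,864, is about 6.7 × 10^7, which the text rounds to ≈10^8; a screen of 5000 clones therefore samples less than 0.01 % of the theoretical library.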
Figure 1

Overview of the diversity of the StEP recombination library in respect to the number of recombined substitutions determined by sequencing of 30 randomly picked variants. Yellow: active variants. Blue: inactive variants. Four picked variants were the BSLA wild type.

ΔΔG fold calculations and analysis of intra‐category and inter‐category recombinations

ΔΔG fold values of the selected 39 beneficial substitutions were calculated using the FoldX method.21 As shown in Figure S2 in Supporting Information, substitutions were classified according to binned ΔΔG fold values16c as follows: 20/39 substitutions (51.3 %) were highly destabilizing (ΔΔG fold>+1.84 kcal mol−1), 8/39 substitutions (20.5 %) were slightly destabilizing (+0.46<ΔΔG fold<+1.84 kcal mol−1), 9/39 substitutions (23.1 %) had a neutral effect on stability (−0.46<ΔΔG fold<+0.46 kcal mol−1), and only 2/39 substitutions (5.1 %) were stabilizing (ΔΔG fold<−0.46 kcal mol−1). As the starting point for the CompassR rule, 13 substitutions (one substitution per position) spanning the lowest to the highest ΔΔG fold (−1.49 < ΔΔG fold < +18.64 kcal mol−1) were selected from the 39 beneficial substitutions and grouped in three categories (category A, five substitutions: ΔΔG fold from −1.49 to +0.36 kcal mol−1; category B, four substitutions: ΔΔG fold from +1.83 to +4.89 kcal mol−1; category C, four substitutions: ΔΔG fold from +7.52 to +18.64 kcal mol−1; see Table 1).
Table 1

Thirteen selected substitutions at 13 positions of the BSLA grouped in three categories according to ΔΔG fold values.

Category[a]    Substitution    ΔΔG fold [kcal mol−1]
A              G155P           −1.49
A              F17S            −0.03
A              D64N            +0.09
A              V54K            +0.10
A              D91E            +0.36
B              Y129N           +1.83
B              L114E           +2.29
B              A81E            +3.00
B              V165E           +4.89
C              L36P            +7.52
C              G104Q           +14.38
C              P5W             +14.75
C              G46H            +18.64

[a] Category A comprises five beneficial substitutions with the “lowest” ΔΔG fold values, category B comprises four beneficial substitutions within the range of neutral ΔΔG fold values and category C comprises four beneficial substitutions with the largest ΔΔG fold values. The larger the ΔΔG fold negative values, the higher the stability.

To identify the threshold values of ΔΔG fold at which BSLA variants are active or inactive, two recombination campaigns (“intra‐category” and “inter‐category”) were performed as follows. In the first “intra‐category” campaign, substitutions within category A, category B, and category C were recombined. The main results in the Supporting Information show that all possible recombinants in category A yielded active variants until recombinants with five substitutions (F17S/V54K/D64N/D91E/G155P) were obtained in round IV (see Figure S3 in Supporting Information; twelve variants had a reduced activity). In category B, except for one double substitution variant (A81E/L114E), all of the recombinants were inactive, and in category C only inactive variants were obtained (already after recombining two beneficial substitutions). Overall, the fraction of active recombinants was 100 % (26/26) in category A, 13 % (1/8) in category B, and 0 % (0/6) in category C (Table S3). All these results are in agreement with the common view (see Introduction; refs. 17b–17d, 18, 20a) that protein stability and function often appear to trade off at the level of individual substitutions and prove that ΔΔG fold is an excellent predictor for selecting beneficial substitutions that can be recombined.
In a second “inter‐category” campaign, three beneficial substitutions of category A (F17S, V54K, and G155P) were individually recombined in three sets of experiments with all substitutions of category A, B, and C until inactive BSLA variants were obtained (Figure S4, S5 and S6 in Supporting Information). Figure 2 summarizes the results from the three “inter‐category” recombination campaigns with the beneficial substitutions F17S, V54K and G155P. The comparison of the three sets of experiments shows highly similar trends. Recombinants within category A yielded in all cases active variants; recombination within category B led to unpredictable recombination results (few active recombinants with two to three substitutions; none with four substitutions) and recombinants within category C were all inactive except one variant with a double substitution. Overall, the “inter‐category” recombination campaign with F17S (Figure S4) yielded in category A 100 % active recombinants (15/15), in category B 33 % (4/12), and in category C 14 % (1/7), respectively (Table S3). The “inter‐category” recombination campaign with V54K (Figure S5) yielded in category A 100 % active recombinants (15/15), in category B 14 % (1/7), and in category C 0 % (0/7) (Table S3). The “inter‐category” recombination campaign with the most stabilizing substitution G155P (Figure S6) yielded in category A 100 % active recombinants (15/15), in category B 14 % (1/7), and in category C 0 % (0/4) (Table S3).
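The active fractions reported for the intra‐ and inter‐category campaigns can be recomputed from the counts given in the text. The sketch below is illustrative only: the dictionary layout and function name are hypothetical, and the counts are copied from the paragraphs above (Table S3 itself is not reproduced here).

```python
# (active, total) recombinant counts per campaign and category, as reported in the text.
COUNTS = {
    "intra-category": {"A": (26, 26), "B": (1, 8), "C": (0, 6)},
    "inter F17S": {"A": (15, 15), "B": (4, 12), "C": (1, 7)},
    "inter V54K": {"A": (15, 15), "B": (1, 7), "C": (0, 7)},
    "inter G155P": {"A": (15, 15), "B": (1, 7), "C": (0, 4)},
}


def percent_active(active, total):
    """Percentage of active recombinants, to one decimal place."""
    return round(100 * active / total, 1)


def category_totals(category):
    """Sum active and inactive counts for one category over all campaigns."""
    active = sum(c[category][0] for c in COUNTS.values())
    total = sum(c[category][1] for c in COUNTS.values())
    return active, total - active


if __name__ == "__main__":
    for campaign, cats in COUNTS.items():
        summary = ", ".join(f"{k}: {percent_active(*v)} %" for k, v in cats.items())
        print(f"{campaign}: {summary}")
    print("category B (active, inactive):", category_totals("B"))
```

Summing the category B counts over the four campaigns gives 7 active and 27 inactive recombinants, which matches the ratio quoted in the Discussion; the percentages in the text are these values rounded to whole numbers.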

CompassR rule postulation

Based on the obtained results of, in total, 84 recombinants (Figure 2 and Table S3 in Supporting Information), the following thresholds are postulated to place substitutions: in category A: “active recombinants” (substitutions with ΔΔG fold ≤ +0.36 kcal mol−1), in category B: “recombinants with unpredictable activity” (substitutions within +0.36 < ΔΔG fold < +7.52 kcal mol−1), in category C: “deactivating recombinants” (ΔΔG fold ≥ +7.52 kcal mol−1). In summary, the Computer‐assisted Recombination (CompassR, Figure 3) rule guides experimentalists in how to recombine beneficial substitutions based on the ΔΔG fold values of the beneficial substitutions. CompassR expects that active recombinants are generated by recombining amino acid substitutions that fall into category A (ΔΔG fold≤+0.36 kcal mol−1). Beneficial substitutions in category C should be omitted from recombination. Beneficial substitutions in category B should be considered for recombination only when few beneficial substitutions have been identified, or after all beneficial substitutions from category A have been recombined.
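The postulated thresholds translate directly into a three-way classifier. The following is a minimal sketch under the thresholds stated above, not the authors' implementation; the function name `compassr_category` and the dictionary `TABLE_1` are hypothetical, and the ΔΔG fold values are those listed in Table 1.

```python
# CompassR thresholds in kcal/mol, as postulated in the text for BSLA.
CATEGORY_A_MAX = 0.36  # ddG_fold <= +0.36: expected to give active recombinants
CATEGORY_C_MIN = 7.52  # ddG_fold >= +7.52: expected to give inactive recombinants

# FoldX ddG_fold values of the 13 selected substitutions (Table 1).
TABLE_1 = {
    "G155P": -1.49, "F17S": -0.03, "D64N": 0.09, "V54K": 0.10, "D91E": 0.36,
    "Y129N": 1.83, "L114E": 2.29, "A81E": 3.00, "V165E": 4.89,
    "L36P": 7.52, "G104Q": 14.38, "P5W": 14.75, "G46H": 18.64,
}


def compassr_category(ddg_fold):
    """Assign a substitution to CompassR category A, B, or C."""
    if ddg_fold <= CATEGORY_A_MAX:
        return "A"
    if ddg_fold >= CATEGORY_C_MIN:
        return "C"
    return "B"


def recombination_candidates(ddg_by_substitution):
    """Category A substitutions, most stabilizing first."""
    selected = [s for s, d in ddg_by_substitution.items() if compassr_category(d) == "A"]
    return sorted(selected, key=ddg_by_substitution.get)


if __name__ == "__main__":
    print(recombination_candidates(TABLE_1))
```

Applied to Table 1, the classifier returns the five category A substitutions, four in category B, and four in category C. The thresholds were determined for BSLA and may need adjusting for other proteins.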
Figure 3

Computer‐assisted Recombination (CompassR) rule for selecting beneficial substitutions in recombination experiments. When substitutions with ΔΔG fold values≤+0.36 kcal mol−1 are recombined one can expect active and property improved recombinants (green). When beneficial substitutions are recombined with ΔΔG fold values ranging from +0.36 to +7.52 kcal mol−1 one cannot predict whether the recombinants will be inactive or active (unpredictable behavior; orange). Recombination of beneficial substitutions with ΔΔG fold≥+7.52 kcal mol−1 results in deactivated and in activity‐reduced recombinants (red). ΔΔG fold is calculated by the FoldX method; surface representation of the BSLA (PDB ID: 1i6w, Chain A) is shown in grey. The highlighted substitutions in green, orange, and red are the selected 13 beneficial single substitutions that were obtained from the “BSLA‐SSM” library.

Ionic liquid resistance analysis of all active recombinants

The catalytic activity and ionic liquid ([BMIM][Cl]) resistance values of all active recombinants selected by CompassR are shown in Figure S7 in Supporting Information. As a general trend, for most BSLA recombinants in categories A and B the ionic liquid ([BMIM][Cl]) resistance increased with an increasing number of substitutions. Examples are: 1st round, F17S/D91E (1.3‐fold) and F17S/D64N (1.5‐fold); 2nd round, F17S/V54K/D64N (2.4‐fold) and F17S/V54K/D91E (2.2‐fold); 3rd round, F17S/V54K/D64N/D91E (2.7‐fold, the best performing variant). The variant from the 4th round, F17S/V54K/D64N/D91E/G155P, had a 1.4‐fold improved resistance against the ionic liquid [BMIM][Cl] and exhibited a high level of residual activity (approximately 96 % of the wild type activity). Visualization of all substitutions of the best performing BSLA variant F17S/V54K/D64N/D91E from category A shows that they are located on the surface of BSLA (Figure S8). Among them, two substitutions pertain to charged amino acids (V54K, D91E) and two to polar ones (F17S, D64N). It is reported that the interaction of [BMIM][Cl] with the BSLA protein surface is the dominating factor that reduces BSLA activity.29 The identified beneficial substitutions on the BSLA surface with changes to polar and charged residues are in accordance with these previous findings.29

Discussion

In directed evolution experiments, more than ten beneficial positions are often identified in a single round of directed evolution after screening of only a few thousand variants.30 As outlined in the Introduction, methodologies are missing that empower researchers to efficiently and quickly recombine more than three amino acid substitutions and to capitalize on identified beneficial substitutions. The recombination of beneficial variants from directed evolution experiments clearly represents a main challenge that hampers the design of efficient enzymes for biocatalysis. In the present work, the StEP recombination experiment (three bsla genes; each gene encoding 13 substitutions + wild type) confirmed that BSLA is not more tolerant to recombination than many other enzymes (18 % fraction of active clones). Active BSLA variants carried three or fewer amino acid substitutions (see Results section). Analysis of active and inactive variants in the StEP library indicated a clear trend that ΔΔG fold is a predictor of recombination outcomes. “Intra‐” and “inter‐category” recombinations by site‐directed mutagenesis were performed in a stepwise manner as in previous reports.31, 32 Based on the obtained results of, in total, 84 recombinants (Figure 2 and Table S3 in Supporting Information), thresholds for recombining beneficial substitutions are postulated as the CompassR rule in the Results section. CompassR expects that active recombinants are generated by recombining amino acid substitutions that fall into category A (ΔΔG fold≤+0.36 kcal mol−1), and recombinations with beneficial substitutions in category C should be omitted. Notably, only inactive variants were obtained for recombination experiments of all beneficial substitutions in category C (“intra‐” and “inter‐category”) after recombination of three substitutions.
Beneficial substitutions in category B yielded unpredictable behavior (7 active variants: 27 inactive variants) and should, at least from our point of view, be considered only in cases in which few beneficial positions are identified or after recombining all beneficial substitutions from category A. The CompassR rule can be of high value for experimentalists, enabling them to generate small and highly active recombination libraries of substitutions that fall in category A. The latter will significantly reduce experimental efforts; for example, 5000 StEP variants of BSLA were screened in this study yielding the BSLA variant F17S/V54K/Y129M with a 1.7‐fold improved ionic liquid resistance, compared to four recombined BSLA variants in category A yielding the BSLA variant F17S/V54K/D64N/D91E with a 2.7‐fold improved resistance. Interestingly, the improved variant F17S/V54K/Y129M found by the StEP recombination experiment also comprised the stabilized single substitutions (F17S and V54K), indicating that CompassR could find the substitutions obtained by the StEP recombination experiment. The CompassR rule enables the reduction of screening efforts by recombining beneficial substitutions and generating highly functional variant libraries. CompassR is based on calculations of the relative free energy of folding, but differs from sequence‐ and structure‐based computational methods (e.g., FoldX,21 Rosetta,22 FireProt,24a MuStab,33 I‐Mutant2.0,34 FuncLib,35 PoPMuSiC,16b and others32) in its focus on beneficial recombinants. The mentioned methods concentrate mostly on predicting the effect of individual substitutions on protein stability; none of these methods has been used to categorize beneficial substitutions and to guide recombination of beneficial substitutions at experimentally determined beneficial positions.
It is reported that inclusion of the most stable substitution (in our case G155P) is beneficial to compensate for destabilizing substitutions.18 To see whether the most stabilizing substitution compensates better as a parent than the two other substitutions (F17S and V54K), a CompassR recombination experiment with 11 recombinants was performed (Figure S6 and Table S3). Surprisingly, the “most” stabilized variant G155P did not increase the number of active clones after the first recombination in category B and C compared to F17S and V54K. In our BSLA experiments, the CompassR results from category A indicate that thermodynamic stability and enzymatic activity are not opposite sides of a coin, and can jointly be improved by recombination, as shown by the stepwise increased resistance against the ionic liquid ([BMIM][Cl]). This finding agrees well with the generally accepted concept that the function of a protein typically depends on its ability to fold into a sufficiently thermodynamically stable structure.18 The thermodynamic stability varies from protein to protein according to the equilibrium stability, kinetic folding/unfolding processes and the temperature at which assays are conducted.36 Therefore, the exact CompassR thresholds might change slightly depending on the type of protein and its fold. CompassR differs from methods based on statistical analysis of protein sequence/activity relationships (e.g., ProSAR®[37] and MOSAIC®[38]) by establishing a correlation between ΔΔG fold and catalytic activity. In addition, CompassR could be implemented in protein engineering strategies such as KnowVolution8d (4th recombination phase), CASTing,7b or MORPHING39 to guide recombination of beneficial substitutions and thereby speed up the design of significantly improved enzymes.
CompassR could also be implemented as a preselector in gene recombination experiments (e.g., gene shuffling40 and StEP28) or in rationally guided methods like SCHEMA41 (e.g., to select beneficial substitutions in parents and/or introduce substitutions which can rescue non‐functional chimeric proteins) or PTRec42 by limiting the recombination process (e.g., through synthetic genes) to encoded beneficial substitutions that fall into category A.

Conclusions

CompassR enables the design of better enzymes with minimal experimental efforts through recombination of multiple beneficial substitutions that were previously identified by directed evolution and/or (semi‐)rational design. The CompassR rule guides recombination of beneficial substitutions through analysis of the relative free energy of folding and an experimentally determined threshold; all BSLA recombinants in category A were active, and improvements gradually increased with an increasing number of recombined beneficial substitutions. The latter is in contrast to standard recombination methods (e.g., StEP28 or OmniChange43) which yield active populations ranging from 0.67 % to 16.55 %. CompassR is therefore of high value for experimentalists, since highly active libraries are generated and screening efforts can be minimized to a few variants or even be omitted through gene synthesis by ordering genes that encode recombinants with multiple “category A” substitutions. Furthermore, the gradually increased ionic liquid resistance with an increased number of substitutions (rounds of recombination) makes it likely that more than five beneficial substitutions can be recombined, and much better performing enzymes can be designed in the future.

Experimental Section

Chemicals

All chemicals were of analytical grade or higher quality and purchased from Carl Roth (Karlsruhe, Germany), AppliChem (Darmstadt, Germany), and Sigma–Aldrich Chemie (Steinheim, Germany) unless specified. [BMIM][Cl] was synthesized by IoLiTec Ionic Liquids Technologies (Heilbronn, Germany) and was dissolved to 1.2 m by adding 18.3 % (v/v) Milli‐Q water before use.

Strains and plasmids

The plasmid pET22b(+)‐bsla WT was constructed in the previous work27 and was used as the template for the polymerase chain reactions (PCRs) performed in the present work unless specified. Chemically competent Escherichia coli DH5α and Escherichia coli BL21‐Gold (DE3) (Agilent Technologies; Santa Clara, USA) were used as hosts for plasmid amplification and protein expression, respectively.

StEP recombination library construction and expression

The StEP library of BSLA was generated using a modified StEP PCR protocol.28 The two‐step StEP PCR protocol is shown in Tables S4 and S5 in Supporting Information. Three bsla genes with 13 substitutions each were synthesized by Invitrogen (Germany). The BSLA StEP library was cloned into the pET22b(+) vector using the PLICing method;43 the specific primers are listed in Table S6. The StEP recombination library was then transformed into and expressed in Escherichia coli BL21‐Gold (DE3) using standard methods.

Site‐directed mutagenesis

BSLA variants were constructed stepwise by PCR according to the QuikChange site‐directed mutagenesis method,31 using the primers listed in Table S7 in the Supporting Information.

Activity assay in 96‐well microtiter plate

The screening procedure and activity determinations with the p‐nitrophenyl butyrate (pNPB) assay were performed in 96‐well MTPs as previously reported.27, 44 BSLA resistance (wild type or variant) was evaluated as the activity in the presence of ionic liquid divided by the activity in the absence of ionic liquid,27, 44 measured with an Infinite M200 PRO microtiter plate reader (Tecan, Maennedorf, Switzerland). Residual activity and background of an empty vector were determined and subtracted in all analyses. All data shown were measured at least in triplicate.
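The resistance calculation described above can be written out as a minimal Python sketch. It assumes that replicate activities are averaged before the empty-vector background is subtracted and the ratio is taken; the function name `resistance`, its parameters, and the numbers in the example are hypothetical and serve only as illustration.

```python
from statistics import mean


def resistance(act_il, act_buffer, bg_il=0.0, bg_buffer=0.0):
    """Background-corrected activity with ionic liquid / activity without.

    act_il, act_buffer: replicate activities (e.g. triplicates) measured
    with and without ionic liquid. bg_il, bg_buffer: empty-vector
    background under the same two conditions.
    """
    corrected_il = mean(act_il) - bg_il
    corrected_buffer = mean(act_buffer) - bg_buffer
    if corrected_buffer <= 0:
        raise ValueError("no measurable activity without ionic liquid")
    return corrected_il / corrected_buffer


# Invented triplicate readings, for illustration only.
value = resistance([0.30, 0.32, 0.31], [1.02, 1.00, 0.98], bg_il=0.01)
print(round(value, 3))
```

Dividing the resistance of a variant by that of the wild type would then give a fold improvement, if the same conditions were used for both.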

Computational procedures

The relative folding free energies (ΔΔGfold = ΔGfold,sub − ΔGfold,wt) were computed using FoldX (version 3b5.1)21 employing the YASARA Plugin45 in YASARA Structure (version 13.9.8).46 The initial structure of BSLA for analysis was taken from the BSLA crystal structure (PDB ID: 1i6w,47 chain A, resolution 1.5 Å). Default FoldX parameters were used for temperature (298 K), ionic strength (0.05 M), and pH (7). The structure of the BSLA wild type was rotamerized and energy minimized using the “RepairObject” command to correct residues with non‐standard torsion angles. Five FoldX runs were performed for each substitution to ensure that the minimum energy conformation is identified even for large residues that possess many rotamers. The accuracy of the FoldX method in predicting relative folding free energies is reported to be 0.46 kcal mol⁻¹ (the standard deviation of the difference between ΔΔGfold calculated by FoldX and the experimental values).21 PyMOL48 was used to visualize the BSLA structure.
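The definition ΔΔGfold = ΔGfold,sub − ΔGfold,wt can be sketched in a few lines of Python. How the five FoldX runs were combined is not stated here, so the sketch takes the lowest energy per structure as one possible reading; passing `statistics.mean` as `reduce` would average instead. The function name `ddg_fold` and all energies in the example are hypothetical, and the sketch does not call FoldX or YASARA.

```python
def ddg_fold(dg_sub_runs, dg_wt_runs, reduce=min):
    """Relative folding free energy in kcal/mol.

    dg_sub_runs, dg_wt_runs: folding free energies from repeated runs for
    the substituted and wild-type structures. `reduce` collapses the
    repeated runs to one value (assumption: lowest energy by default).
    """
    return reduce(dg_sub_runs) - reduce(dg_wt_runs)


# Invented energies from five runs each, for illustration only.
sub_runs = [-10.2, -10.5, -9.8, -10.1, -10.4]
wt_runs = [-11.0, -11.3, -10.9, -11.1, -11.2]

ddg = ddg_fold(sub_runs, wt_runs)
print(f"ddG_fold = {ddg:+.2f} kcal/mol")
```

With this sign convention a positive value means the substitution is predicted to destabilize the fold relative to the wild type, and a negative value means it is predicted to stabilize it.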

Conflict of interest

The authors declare no conflict of interest.