Vânia Cardoso, Joana L A Brás, Inês F Costa, Luís M A Ferreira, Luís T Gama, Renaud Vincentelli, Bernard Henrissat, Carlos M G A Fontes.
Abstract
In nature, the deconstruction of plant carbohydrates is carried out by carbohydrate-active enzymes (CAZymes). A high-throughput (HTP) strategy was used to isolate and clone 1476 genes obtained from a diverse library of recombinant CAZymes covering a variety of sequence-based families, enzyme classes, and source organisms. All genes were successfully isolated by either PCR (61%) or gene synthesis (GS) (39%) and were subsequently cloned into Escherichia coli expression vectors. Most proteins (79%) were obtained at a good yield during recombinant expression. A significantly lower number (p < 0.01) of proteins from eukaryotic (57.7%) and archaeal (53.3%) origin were soluble compared to bacteria (79.7%). Genes obtained by GS gave a significantly lower number (p = 0.04) of soluble proteins while the green fluorescent protein tag improved protein solubility (p = 0.05). Finally, a relationship between the amino acid composition and protein solubility was observed. Thus, a lower percentage of non-polar and higher percentage of negatively charged amino acids in a protein may be a good predictor for higher protein solubility in E. coli. The HTP approach presented here is a powerful tool for producing recombinant CAZymes that can be used for future studies of plant cell wall degradation. Successful production and expression of soluble recombinant proteins at a high rate opens new possibilities for the high-throughput production of targets from limitless sources.
Keywords: HTP expression; PCR; carbohydrate-active enzymes (CAZymes); gene synthesis; high-throughput (HTP) cloning; plant biomass
Year: 2022 PMID: 35409382 PMCID: PMC8999789 DOI: 10.3390/ijms23074024
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. The distribution of the 1476 enzymes produced in this study by the 5 CAZy classes.
Figure 2. Protein expression and purity analysis of CAZymes exemplified by SDS-PAGE. Lane M contains NZYTech Low Molecular Weight (LMW) Protein Marker; Lanes 1–8 contain purified recombinant proteins with the following accession numbers: (1) ABD81807.1, (2) AAK06049.1, (3) AAG04556.1, (4) CAA84537.1, (5) CAC83072.1, (6) CAA84537.1, (7) CAB55348.1, and (8) AAP09638.1.
Figure 3. Efficacy of the recombinant production of CAZymes. (A) The three domains of life (proportion of recombinant proteins obtained among eukaryotic, bacterial, and archaeal targets); (B) production strategy (GS, genes synthesized with codon optimization for E. coli; PCR, genes obtained from genomic DNA); (C) expression vectors: pHTP1, expression vector encoding an N-terminal histidine tag; pHTP9, vector containing an N-terminal GFP tag; (D) molecular mass distribution of the proteins. SOL: soluble protein, INS: insoluble protein.
Analysis of the amino acid composition in primary sequences of soluble (SOL) and insoluble (INS) CAZymes produced in this study.
| | INS | SOL | SEM | p-value |
|---|---|---|---|---|
| (n) | 310 | 1166 | ||
| Amino acid groups (%) | ||||
| Non-polar amino acids 1 | 51.4 a | 50.2 b | 0.31 | 0.002 |
| Polar neutral amino acids 2 | 26.0 | 26.0 | 0.38 | 0.976 |
| Negatively charged amino acids 3 | 11.2 b | 12.2 a | 0.17 | 0.001 |
| Positively charged amino acids 4 | 11.4 | 11.5 | 0.17 | 0.420 |
| Amino acid contents (%) | ||||
| Isoleucine (I) | 5.0 | 5.2 | 0.11 | 0.051 |
| Leucine (L) | 7.6 | 7.4 | 0.15 | 0.213 |
| Lysine (K) | 4.4 b | 5.0 a | 0.15 | 0.001 |
| Methionine (M) | 1.9 | 2.0 | 0.06 | 0.138 |
| Phenylalanine (F) | 4.1 | 4.1 | 0.08 | 0.828 |
| Threonine (T) | 6.2 | 6.0 | 0.14 | 0.231 |
| Tryptophan (W) | 2.3 a | 2.2 b | 0.07 | 0.042 |
| Valine (V) | 6.5 | 6.5 | 0.10 | 0.610 |
| Arginine (R) | 4.7 a | 4.3 b | 0.12 | 0.005 |
| Histidine (H) | 2.3 | 2.3 | 0.07 | 0.659 |
| Alanine (A) | 9.0 a | 8.5 b | 0.20 | 0.014 |
| Asparagine (N) | 5.1 | 5.2 | 0.15 | 0.511 |
| Aspartic acid (D) | 6.0 b | 6.5 a | 0.10 | 0.001 |
| Cysteine (C) | 1.0 | 1.0 | 0.07 | 0.784 |
| Glutamic acid (E) | 5.2 b | 5.6 a | 0.12 | 0.002 |
| Glutamine (Q) | 3.6 | 3.4 | 0.09 | 0.287 |
| Glycine (G) | 8.9 a | 8.6 b | 0.13 | 0.016 |
| Proline (P) | 5.0 | 4.9 | 0.10 | 0.418 |
| Serine (S) | 6.6 | 6.5 | 0.15 | 0.246 |
| Tyrosine (Y) | 4.5 b | 4.9 a | 0.11 | 0.001 |
SEM: Standard error of the mean; Means in the same line with different letter superscripts (a, b) are significantly different (p < 0.05) or tend to be significantly different (p < 0.1). 1 The non-polar amino acid group is composed of glycine (G), cysteine (C), alanine (A), leucine (L), isoleucine (I), valine (V), methionine (M), proline (P), phenylalanine (F), and tryptophan (W); 2 The polar neutral amino acid group is composed of asparagine (N), glutamine (Q), serine (S), threonine (T), and tyrosine (Y); 3 The negatively charged amino acid group, also named acidic amino acids, comprises aspartic acid (D) and glutamic acid (E); 4 Positively charged amino acids, also named basic amino acids, comprise arginine (R), histidine (H), and lysine (K).
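The group percentages in the table above can be reproduced for any single sequence from the group definitions given in the footnote. The following is a minimal Python sketch of that calculation, not the authors' analysis pipeline: the function name `group_composition`, the choice to ignore non-standard residue codes, and the toy sequence are all assumptions made for illustration.

```python
# Amino acid groups exactly as listed in the table footnote (one-letter codes).
GROUPS = {
    "non_polar": set("GCALIVMPFW"),
    "polar_neutral": set("NQSTY"),
    "negative": set("DE"),
    "positive": set("RHK"),
}

# The 20 standard residues are the union of the four groups.
STANDARD = set().union(*GROUPS.values())


def group_composition(sequence):
    """Return the percentage of residues in each footnote-defined group.

    Assumption: characters outside the 20 standard residues (e.g. X, *, gaps)
    are dropped before percentages are computed.
    """
    residues = [aa for aa in sequence.upper() if aa in STANDARD]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {
        name: 100.0 * sum(aa in members for aa in residues) / total
        for name, members in GROUPS.items()
    }


# Hypothetical toy sequence, not a protein from the study.
composition = group_composition("MKDEELLAGY")
print(composition)
```

For the toy sequence, five of ten residues (M, L, L, A, G) are non-polar and three (D, E, E) are negatively charged, so the function reports 50% and 30% for those groups. Under the trend reported in the abstract, a lower non-polar share and a higher negatively charged share would be associated with better solubility in E. coli, although the group means in the table differ by only about one percentage point.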
Activity of the 1166 soluble recombinant enzymes organized by CAZy family and expressed through the EC number.
| CAZy Family | EC Number |
|---|---|
| AA1 | 1.10.3.2 |
| AA3 | 1.1.3.12 |
| AA7 | 1.1.3.- |
| AA10 | 1.-.-.-/1.14.99.54 |
| AA NC | 1.10.3.-/1.3.3.5 |
| CE1 | 3.1.1.73 |
| CE2 | 3.1.1.72 |
| CE3 | 3.1.1.72 |
| CE4 | 3.2.1.8/3.5.1.- |
| CE6 | 3.1.1.72 |
| CE7 | 3.1.1.41/3.1.1.72 |
| CE8 | 3.1.1.11 |
| CE9 | 3.5.1.25 |
| CE11 | 3.5.1.- |
| CE12 | 3.1.1.-/3.1.1.72 |
| CE14 | 3.5.1.-/3.5.1.89 |
| CE15 | 3.1.1.-/3.1.1.72 |
| GH1 | 3.2.1.-/3.2.1.21/3.2.1.23/3.2.1.25/3.2.1.37/3.2.1.74/3.2.1.85/3.2.1.86 |
| GH2 | 3.2.1.23/3.2.1.25/3.2.1.31/3.2.1.165 |
| GH3 | 3.2.1.21/3.2.1.37/3.2.1.45/3.2.1.52/3.2.1.74/3.2.1.120 |
| GH4 | 3.2.1.20/3.2.1.22/3.2.1.67/3.2.1.86/3.2.1.122/3.2.1.139 |
| GH5 | 3.2.1.4/3.2.1.8/3.2.1.73/3.2.1.74/3.2.1.78/3.2.1.91/3.2.1.123/3.2.1.132/3.2.1.151 |
| GH6 | 3.2.1.4 |
| GH8 | 3.2.1.4/3.2.1.73/3.2.1.132/3.2.1.156 |
| GH9 | 3.2.1.-/3.2.1.4/3.2.1.91/3.2.1.151/3.2.1.165 |
| GH10 | 3.2.1.4/3.2.1.8 |
| GH11 | 3.2.1.8 |
| GH12 | 3.2.1.4/3.2.1.151 |
| GH13 | 2.4.1.4/2.4.1.7/2.4.1.18/2.4.1.19/2.4.1.25/3.2.1.1/3.2.1.4/3.2.1.10/3.2.1.20/3.2.1.41/3.2.1.68/3.2.1.70/3.2.1.93/3.2.1.98/3.2.1.133/3.2.1.135/3.2.1.141/5.4.99.11/5.4.99.15/5.4.99.16 |
| GH14 | 3.2.1.2 |
| GH15 | 3.2.1.3/3.2.1.28/3.2.1.70 |
| GH16 | 3.2.1.-/3.2.1.4/3.2.1.6/3.2.1.39/3.2.1.73/3.2.1.81/3.2.1.83/3.2.1.103/3.2.1.178 |
| GH17 | 2.4.1.- |
| GH18 | 3.2.1.-/3.2.1.14/3.2.1.96 |
| GH19 | 3.2.1.14 |
| GH20 | 3.2.1.52 |
| GH23 | 3.2.1.17/4.2.2.n1 |
| GH24 | 3.2.1.17 |
| GH25 | 3.2.1.17 |
| GH26 | 3.2.1.78/3.2.1.100 |
| GH27 | 3.2.1.88/3.2.1.94 |
| GH28 | 3.2.1.15/3.2.1.67/3.2.1.82 |
| GH29 | 3.2.1.51/3.2.1.111 |
| GH30 | 3.2.1.8/3.2.1.31/3.2.1.38/3.2.1.136/3.2.1.164 |
| GH31 | 3.2.1.-/3.2.1.20/3.2.1.84/2.4.1.161/3.2.1.177 |
| GH32 | 3.2.1.26/3.2.1.64/3.2.1.65/3.2.1.80/3.2.1.153/4.2.2.16 |
| GH33 | 3.2.1.-/3.2.1.18/2.4.1.- |
| GH35 | 3.2.1.23/3.2.1.165 |
| GH36 | 3.2.1.22/3.2.1.49 |
| GH37 | 3.2.1.28 |
| GH38 | 3.2.1.-/3.2.1.24 |
| GH39 | 3.2.1.37 |
| GH42 | 3.2.1.-/3.2.1.23 |
| GH43 | 3.2.1.-/3.2.1.37/3.2.1.55/3.2.1.99 |
| GH44 | 3.2.1.4 |
| GH45 | 3.2.1.4 |
| GH46 | 3.2.1.132 |
| GH47 | 3.2.1.113 |
| GH48 | 3.2.1.4/3.2.1.176 |
| GH49 | 3.2.1.11/3.2.1.95 |
| GH50 | 3.2.1.23/3.2.1.81 |
| GH51 | 3.2.1.55 |
| GH52 | 3.2.1.37 |
| GH53 | 3.2.1.89 |
| GH55 | 3.2.1.39 |
| GH57 | 2.4.1.18/2.4.1.25/3.2.1.1/3.2.1.41/3.2.1.54 |
| GH62 | 3.2.1.55 |
| GH63 | 3.2.1.20/3.2.1.84/3.2.1.170 |
| GH64 | 3.2.1.39 |
| GH65 | 2.4.1.8/2.4.1.64/2.4.1.216/2.4.1.230/2.4.1.279/2.4.1.282 |
| GH66 | 3.2.1.11 |
| GH67 | 3.2.1.139 |
| GH68 | 2.4.1.9/2.4.1.10/3.2.1.26 |
| GH70 | 2.4.1.5/2.4.1.140/2.4.4.- |
| GH73 | 3.2.1.- / |
| GH74 | 3.2.1.-/3.2.1.4 |
| GH75 | 3.2.1.132 |
| GH76 | 3.2.1.101 |
| GH77 | 2.4.1.25 |
| GH78 | 3.2.1.40 |
| GH79 | 3.2.1.31 |
| GH80 | 3.2.1.132 |
| GH81 | 3.2.1.39 |
| GH82 | 3.2.1.157 |
| GH84 | 3.2.1.35/3.2.1.52/3.2.1.169 |
| GH85 | 3.2.1.96 |
| GH86 | 3.2.1.81/3.2.1.178 |
| GH87 | 3.2.1.59/3.2.1.61 |
| GH88 | 3.2.1.- |
| GH91 | 4.2.2.18 |
| GH92 | 3.2.1.-/3.2.1.24/3.2.1.113 |
| GH94 | 2.4.1.-/2.4.1.20/2.4.1.49 |
| GH95 | 3.2.1.51/3.2.1.63 |
| GH97 | 3.2.1.3/3.2.1.20/3.2.1.22 |
| GH98 | 3.2.1.102 |
| GH99 | 3.2.1.130 |
| GH100 | 3.2.1.26 |
| GH101 | 3.2.1.97 |
| GH102 | 4.2.2.n1 |
| GH103 | 4.2.2.n1 |
| GH104 | 4.2.2.n1 |
| GH105 | 3.2.1.-/3.2.1.172 |
| GH106 | 3.2.1.40 |
| GH107 | 3.2.1.- |
| GH108 | 3.2.1.17 |
| GH109 | 3.2.1.49 |
| GH110 | 3.2.1.-/3.2.1.22 |
| GH111 | 3.2.1.- |
| GH112 | 2.4.1.211/2.4.1.247 |
| GH113 | 3.2.1.78 |
| GH114 | 3.2.1.109 |
| GH115 | 3.2.1.- |
| GH116 | 3.2.1.21/3.2.1.37/3.2.1.52 |
| GH117 | 3.2.1.- |
| GH118 | 3.2.1.81 |
| GH119 | 3.2.1.1 |
| GH120 | 3.2.1.37 |
| GH121 | 3.2.1.- |
| GH122 | 3.2.1.20 |
| GH123 | 3.2.1.53 |
| GH125 | 3.2.1.- |
| GH126 | 3.2.1.- |
| GH127 | 3.2.1.185 |
| GH129 | 3.2.1.49 |
| GH130 | 2.4.1.281/2.4.1.319 |
| GH134 | 3.2.1.78 |
| GH137 | 3.2.1.31 |
| GH142 | 3.2.1.185 |
| GH143 | 3.2.1.185 |
| PL1 | 4.2.2.2/4.2.2.10 |
| PL2 | 4.2.2.2/4.2.2.9 |
| PL3 | 4.2.2.2 |
| PL4 | 4.2.2.- |
| PL5 | 4.2.2.3 |
| PL6 | 4.2.2.-/4.2.2.3 |
| PL7 | 4.2.2.-/4.2.2.3/4.2.2.11 |
| PL8 | 4.2.2.1/4.2.2.5/4.2.2.12/4.2.2.20 |
| PL9 | 4.2.2.2/4.2.2.9 |
| PL10 | 4.2.2.2 |
| PL11 | 4.2.2.23/4.2.2.24 |
| PL12 | 4.2.2.8 |
| PL13 | 4.2.2.7 |
| PL15 | 4.2.2.-/4.2.2.3 |
| PL17 | 4.2.2.- |
| PL18 | 4.2.2.3/4.2.2.11 |
| PL21 | 4.2.2.7/4.2.2.8 |
| PL22 | 4.2.2.6 |
| PL24 | 4.2.2.- |
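Each cell in the EC Number column packs several activities into one slash-separated string, and some entries are only partially assigned (a `-` field, or an `n`-prefixed serial such as `4.2.2.n1`, which is used for provisional numbers). A minimal Python sketch of how such cells could be split and classified is given below. The helper names are hypothetical, the three rows are copied from the table for illustration, and the class names follow the standard top-level EC classification.

```python
# Top-level EC classes (first field of an EC number).
EC_CLASSES = {
    "1": "oxidoreductases",
    "2": "transferases",
    "3": "hydrolases",
    "4": "lyases",
    "5": "isomerases",
    "6": "ligases",
    "7": "translocases",
}


def parse_ec_cell(cell):
    """Split a cell such as '3.2.1.-/3.2.1.4' into individual EC numbers.

    Empty fragments (for example after a trailing '/') are skipped.
    """
    return [part.strip() for part in cell.split("/") if part.strip()]


def is_fully_assigned(ec):
    """True when all four EC fields are numeric (no '-' or 'n1' placeholder)."""
    fields = ec.split(".")
    return len(fields) == 4 and all(field.isdigit() for field in fields)


def ec_class(ec):
    """Map an EC number to its top-level class name."""
    return EC_CLASSES.get(ec.split(".")[0], "unknown")


# Three rows copied from the table above, used only as sample input.
rows = {
    "GH9": "3.2.1.-/3.2.1.4/3.2.1.91/3.2.1.151/3.2.1.165",
    "GH23": "3.2.1.17/4.2.2.n1",
    "GH73": "3.2.1.- /",
}

for family, cell in rows.items():
    for ec in parse_ec_cell(cell):
        status = "assigned" if is_fully_assigned(ec) else "partial"
        print(family, ec, ec_class(ec), status)
```

Treating partially assigned numbers separately matters when counting distinct activities per family, since an entry such as `3.2.1.-` identifies a glycosidase without specifying the substrate.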
Figure 4. Number of CAZy families and ECs obtained in this study. (A) CAZy database numbers (www.cazy.org, accessed on 31 January 2022), (B) selected biochemically characterized CAZymes, (C) CAZymes produced in this study, (D) enzymes obtained in the soluble form, and (E) final NZYTech CAZy commercial library.
Figure 5. Number of CAZy classes and families represented in the current library.