Benjamin Kumwenda, Derek Litthauer, Ozlem Tastan Bishop, Oleg Reva.
Abstract
Elucidation of evolutionary factors that enhance protein thermostability is a critical problem and was the focus of this work on Thermus species. Pairs of orthologous sequences of T. scotoductus SA-01 and T. thermophilus HB27, with the largest negative minimum folding energy (MFE) as predicted by the UNAFold algorithm, were statistically analyzed. Favored substitutions of amino acid residues and their properties were determined. Substitutions were analyzed in modeled protein structures to determine their locations and contributions to energy differences using the PyMOL and FoldX programs, respectively. Dominant trends in amino acid substitutions consistent with differences in thermostability between orthologous sequences were observed. Proteins of the thermophilic T. thermophilus showed an increase in non-polar, tiny, and charged amino acids. In T. thermophilus HB27, alanine frequently replaced serine and threonine, and arginine frequently replaced glutamine and lysine. Structural comparison showed that stabilizing mutations occurred on surfaces and loops in protein structures.
Keywords: 3D structures; biotechnology; enzyme; evolution; folding energy; thermostability
Year: 2013 PMID: 24023508 PMCID: PMC3762613 DOI: 10.4137/EBO.S12539
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
PDB Templates and target pairs of orthologous proteins used for constructing 3D protein structure models.
| Template (PDB ID) | Annotation | MFE (kcal/mol) | Sequence identity (%) | Coverage (%) | Coverage range | E-value | DOPE Z score |
|---|---|---|---|---|---|---|---|
| 2DP9 | TSC_C20070 ASCH domain superfamily protein | −0.39 | 82 | 96.77 | 3–124 | 1.4E-49 | −1.56 |
| | TTC1891 hypothetical protein | −0.50 | 100 | 82.55 | 3–124 | 9.9E-41 | −0.65 |
| 2EBJ | TSC_C13250 peptidase | −0.36 | 69 | 99.48 | 1–192 | 2.7E-62 | −2.42 |
| | TTC0531 peptidase | −0.49 | 99 | 99.48 | 1–192 | 5E-63 | −2.42 |
| 2CWY | TSC_C00450 conserved hypothetical protein | −0.44 | 60 | 93.62 | 5–94 | 5.9E-35 | −2.19 |
| | TTC1937 hypothetical protein | −0.54 | 100 | 100 | 1–94 | 3.7E-38 | −2.78 |
| 2FK5 | TSC_C04250 fuculose-1-phosphate aldolases | −0.48 | 83 | 94.5 | 1–190 | 1.4E-51 | −1.78 |
| | TTC1459 L-fuculose phosphate aldolases | −0.57 | 98 | 99.5 | 1–200 | 1E-54 | −1.73 |
| 1V8D | TSC_C14180 conserved hypothetical protein | −0.46 | 80 | 82.55 | 40–235 | 5E-104 | −0.84 |
| | TTC0214 transcriptional regulator | −0.55 | 95 | 82.55 | 40–235 | 5E-107 | −1.14 |
Figure 1. Distribution of MFE calculated for genes of five bacterial genomes. Negative MFE values are ordered along the Y axis from larger to smaller.
Figure 2. Amino acid composition of T. thermophilus HB27 and T. scotoductus SA-01 sequences.
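A composition comparison of the kind summarized in Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the property classes (tiny, non-polar, charged) follow a common EMBOSS pepstats-style grouping, which is an assumption because the record does not list the residues in each class, and the two sequences are hypothetical toy fragments rather than real Thermus proteins.

```python
from collections import Counter

# Assumed residue groupings (pepstats-style); the source does not define them.
PROPERTY_CLASSES = {
    "tiny": set("ACGST"),
    "non_polar": set("ACFGILMPVWY"),
    "charged": set("DEHKR"),
}


def class_fractions(sequence):
    """Return the fraction of residues in each property class."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {
        name: sum(counts[aa] for aa in members) / total
        for name, members in PROPERTY_CLASSES.items()
    }


# Hypothetical toy fragments used only to exercise the function.
sa01_like = "MKTSIQNLKS"
hb27_like = "MRAAVERLRA"

for label, seq in (("SA-01-like", sa01_like), ("HB27-like", hb27_like)):
    print(label, class_fractions(seq))
```

In the toy pair, the second fragment carries Ala in place of Ser/Thr and Arg in place of Lys/Gln, so its non-polar and charged fractions are higher, which mirrors the direction of the trend reported in the abstract.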
Pairwise substitutions of amino acids in orthologous proteins of thermotolerant T. scotoductus SA-01 and thermophilic T. thermophilus HB27.
| | Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr | AMC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Difference | 1836 | −2 | 63 | 128 | −184 | 415 | −57 | −986 | −795 | 143 | −185 | −263 | 234 | −823 | 1297 | −816 | −560 | 690 | −60 | −75 | |
| Increased | | | | | | | | Val | Arg | | | Asp | | Arg, Glu | | Ala | Ala | | | His | |
| Decreased | Ser, Thr | | Asn | Gln | | | Tyr | | | | | | | | Gln, Lys | | | Ile | | | |
| Ala | 0 | | | | | | | | | | | | | | | | | | | | 32594 |
| Cys | 0 | 0 | | | | | | | | | | | | | | | | | | | 936 |
| Asp | 20 | 0 | 0 | | | | | | | | | | | | | | | | | | 9433 |
| Glu | 290 | −1 | 13 | 0 | | | | | | | | | | | | | | | | | 25858 |
| Phe | 23 | −1 | 0 | −3 | 0 | | | | | | | | | | | | | | | | 10072 |
| Gly | 10 | −1 | −47 | −39 | −15 | 0 | | | | | | | | | | | | | | | 25751 |
| His | 22 | 0 | 8 | 20 | −13 | 8 | 0 | | | | | | | | | | | | | | 5125 |
| Ile | 46 | 0 | 0 | 9 | 9 | 1 | 4 | 0 | | | | | | | | | | | | | 6434 |
| Lys | 71 | 0 | 11 | 110 | 1 | 47 | −5 | −1 | 0 | | | | | | | | | | | | 9729 |
| Leu | 83 | 4 | 5 | 6 | −112 | 19 | −14 | −227 | −9 | 0 | | | | | | | | | | | 43451 |
| Met | 40 | 1 | 0 | 20 | 4 | 9 | 0 | 2 | 0 | 44 | 0 | | | | | | | | | | 3299 |
| Asn | 52 | −1 | | 25 | 4 | 40 | 10 | 0 | 3 | 5 | −1 | 0 | | | | | | | | | 3745 |
| Pro | 19 | 0 | −5 | 22 | −1 | −6 | −3 | −4 | −30 | −32 | −3 | −7 | 0 | | | | | | | | 18184 |
| Gln | 108 | 1 | 12 | | 0 | 35 | 16 | 0 | 16 | 46 | 2 | −2 | 13 | 0 | | | | | | | 7155 |
| Arg | 47 | −5 | −4 | −92 | −16 | 19 | −110 | −11 | − | −71 | −13 | −41 | 26 | − | 0 | | | | | | 24379 |
| Ser | | 2 | 21 | 51 | 5 | 114 | 12 | −6 | 7 | −1 | −4 | −26 | 89 | −5 | 92 | 0 | | | | | 9364 |
| Thr | | 1 | 14 | 41 | −5 | 33 | 0 | −9 | 2 | 13 | 4 | 11 | 30 | −6 | 42 | −34 | 0 | | | | 9772 |
| Val | 235 | −1 | 0 | 2 | −22 | 2 | 1 | − | −7 | −110 | −52 | −5 | 20 | −11 | 10 | −13 | −79 | 0 | | | 22490 |
| Trp | −3 | 1 | 1 | 11 | 16 | 3 | 7 | −2 | 0 | 0 | 2 | 0 | −2 | 1 | 19 | 1 | 3 | −2 | 0 | | 3126 |
| Tyr | 8 | −2 | −2 | −2 | −20 | −1 | | −3 | −2 | 4 | 0 | 0 | 8 | −2 | 18 | −3 | 2 | 6 | −4 | 0 | 7595 |
Notes: Positive values mean that the amino acids shown in columns accumulated in proteins of thermophilic T. thermophilus HB27. The overall increase or decrease of each amino acid is given in the row 'Difference'. The rows 'Increased' and 'Decreased' list the amino acids that underwent an overall increase or decrease relative to the amino acid in the column title.
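One way to read the sign convention in these notes is that each lower-triangular entry is a net exchange between a row residue and a column residue: a positive value is a gain for the column residue and an equal loss for the row residue. The sketch below applies that assumed reading to the first five rows of the table (Ala to Phe). The function name `net_change` is hypothetical, and because only five residues are included, the totals differ from the full-table 'Difference' row.

```python
AMINO_ACIDS = ["Ala", "Cys", "Asp", "Glu", "Phe"]

# Lower-triangular values copied from the first five rows of the table.
# LOWER[i][j] (j <= i): row residue i, column residue j.
LOWER = [
    [0],
    [0, 0],
    [20, 0, 0],
    [290, -1, 13, 0],
    [23, -1, 0, -3, 0],
]


def net_change(lower):
    """Net gain or loss per residue under the assumed sign convention."""
    n = len(lower)
    net = [0] * n
    for i in range(n):
        for j in range(i):  # skip the diagonal, which is always 0
            net[j] += lower[i][j]  # column residue accumulates
            net[i] -= lower[i][j]  # row residue loses the same amount
    return net


for name, value in zip(AMINO_ACIDS, net_change(LOWER)):
    print(f"{name}: {value:+d}")
```

Because every exchange is counted once as a gain and once as a loss, the net changes always sum to zero, which is a quick sanity check on any such matrix.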
Figure 3. Protein structural models based on templates: (A) 2DP9; (B) 2EBJ; (C) 2CWY; (D) 2FK5; and (E) 1V8D. Positions of minor structural changes between the orthologous proteins of T. scotoductus SA-01 and T. thermophilus HB27 are depicted by arrows.
Energy difference impacts of predominant amino acid substitutions on protein stability as predicted by FoldX.
| PDB ID | Mutation (SA-01 → HB27) | Property change | Position | Location in 3D | Energy change (kcal/mol) |
|---|---|---|---|---|---|
| 2DP9 | Lys → Arg | Conserved | 3 | S, L | −0.61* |
| | Lys → Arg | Conserved | 29 | S, L | +0.03 |
| | Ile → Val | Conserved | 41 | B, BS | +0.61* |
| | Asn → Asp | To charged | 51 | S, BS | −0.09 |
| | Gln → Glu | To charged | 55 | S, L | 0.00 |
| | Gln → Glu | To charged | 56 | S, BS | +0.10 |
| | Ile → Val | Conserved | 104 | B, L | +0.52* |
| | Lys → Arg | Conserved | 106 | S, L | −0.48 |
| Total energy change | | | | | |
| 2EBJ | Thr → Ala | To non-polar | 20 | S, H | −1.50* |
| | Ile → Val | Conserved | 38 | S, BS | +0.48 |
| | Ser → Ala | To non-polar | 43 | S, L | −0.86* |
| | Ile → Val | Conserved | 80 | S, BS | −0.49 |
| | Lys → Arg | Conserved | 129 | S, BS | −1.20 |
| Total energy change | | | | | − |
| 2CWY | Glu → Arg | To charged | 9 | S, H/L | −0.82* |
| | Lys → Arg | Conserved | 49 | S, H | −0.32 |
| | Lys → Arg | Conserved | 59 | S, H | +0.19 |
| | Lys → Arg | Conserved | 64 | S, H | +0.31 |
| Total energy change | | | | | − |
| 2FK5 | Lys → Arg | Conserved | 4 | S, H | +0.27 |
| | Ser → Ala | To non-polar | 7 | B, H | −0.75* |
| | Ile → Val | Conserved | 73 | S, H | −0.14 |
| | Tyr → His | To charged | 113 | S, H | +0.30 |
| | Lys → Arg | Conserved | 141 | S, H | +0.25 |
| | Ile → Val | Conserved | 156 | S, BS | +0.38 |
| Total energy change | | | | | |
| 1V8D | Lys → Arg | Conserved | 6 | L, B | N/A |
| | Lys → Arg | Conserved | 10 | H, S | N/A |
| | Ile → Val | Conserved | 12 | H, B | N/A |
| | Lys → Arg | Conserved | 40 | L, S | N/A |
| | Lys → Arg | Conserved | 44 | H, S | N/A |
| | Ile → Val | Conserved | 53 | H, B | N/A |
| | Ile → Val | Conserved | 66 | L, S | N/A |
| | Ile → Val | Conserved | 68 | BS, B | N/A |
| | Thr → Ala | To non-polar | 89 | H, S | N/A |
| | Thr → Ala | To non-polar | 96 | L, S | N/A |
| | Thr → Ala | To non-polar | 106 | B, H | N/A |
| | Glu → Arg | To charged | 181 | H, S | N/A |
Notes: Locations of substitutions are marked as S (surface), BS (beta sheet), H (helix), and L (loop). Asterisks mark energy changes that exceed the FoldX error margin of ±0.5 kcal/mol. FoldX calculations were not applicable for the template 1V8D due to its low identity (95%) with the studied proteins.
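The error-margin rule in these notes can be applied mechanically. The sketch below uses the eight energy changes listed for template 2DP9 and flags those whose magnitude exceeds 0.5 kcal/mol. The labels "stabilizing" for negative values and "destabilizing" for positive values follow the usual FoldX sign convention and are an assumption here, as is the helper name `classify`.

```python
FOLDX_ERROR_MARGIN = 0.5  # kcal/mol, as stated in the table notes

# (substitution, position, energy change in kcal/mol) for template 2DP9,
# copied from the table.
DDG_2DP9 = [
    ("Lys->Arg", 3, -0.61),
    ("Lys->Arg", 29, 0.03),
    ("Ile->Val", 41, 0.61),
    ("Asn->Asp", 51, -0.09),
    ("Gln->Glu", 55, 0.00),
    ("Gln->Glu", 56, 0.10),
    ("Ile->Val", 104, 0.52),
    ("Lys->Arg", 106, -0.48),
]


def classify(ddg, margin=FOLDX_ERROR_MARGIN):
    """Label an energy change relative to the error margin (assumed labels)."""
    if ddg < -margin:
        return "stabilizing"
    if ddg > margin:
        return "destabilizing"
    return "within error margin"


for substitution, position, ddg in DDG_2DP9:
    print(f"{substitution} at {position}: {ddg:+.2f} -> {classify(ddg)}")

# Positions whose energy change exceeds the margin in either direction.
flagged = [pos for _, pos, ddg in DDG_2DP9 if classify(ddg) != "within error margin"]
```

For 2DP9 the flagged positions are 3, 41, and 104, the same three rows that carry asterisks in the table.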