Tom van den Bergh, Giorgio Tamo, Alberto Nobili, Yifeng Tao, Tianwei Tan, Uwe T Bornscheuer, Remko K P Kuipers, Bas Vroling, René M de Jong, Kalyanasundaram Subramanian, Peter J Schaap, Tom Desmet, Bernd Nidetzky, Gert Vriend, Henk-Jan Joosten.
Abstract
CorNet is a web-based tool for the analysis of co-evolving residue positions in protein super-family sequence alignments. CorNet projects external information, such as mutation data extracted from the literature, onto interactively displayed groups of co-evolving residue positions to shed light on the functions associated with these groups and the residues in them. We used CorNet to analyse six enzyme super-families and found that groups of strongly co-evolving residues tend to consist of residues involved in the same function, such as activity, specificity, co-factor binding, or enantioselectivity. This finding makes it possible to assign a function to residues for which no data is available yet in the literature. A mutant library was designed to mutate residues that belong to a group of co-evolving residues predicted to be involved in enantioselectivity, but for which no literature data is available yet. The resulting set of mutations indeed showed many instances of increased enantioselectivity.
Year: 2017 PMID: 28545124 PMCID: PMC5436653 DOI: 10.1371/journal.pone.0176427
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Visualisation of correlated mutation networks in the protein structures.
The boxes show the correlated mutation networks. Nodes represent alignment positions. Node sizes indicate the number of edges. Nodes shown in cyan indicate residue positions for which keyword-related mutation data is available in the literature. Edge colours indicate the strength of the pair-wise correlation (yellow to red). The residues visualized in the structures correspond to the nodes in the network and are coloured to match them. a. Correlated mutation network of the isocitrate lyases (ICL) visualised in structure pdb-code: 1IGW. The cyan nodes in this network are related to the keyword ‘specificity’. b. Correlated mutation network of the alcohol dehydrogenases (ADH) visualized in pdb-code: 1D1T, which contains a substrate analog in the active site and the NAD co-factor (magenta). Position 41, which is the central hub of the correlation network, is also the centre of the 3D network and is located between the NAD and the substrate-binding pocket. c. Correlated mutation network of the amino acid oxidases (AAO) visualized in pdb-code: 1B37, which contains the FAD co-factor. This network consists of two sub-networks (blue surrounding the FAD, red surrounding the substrate-binding pocket). d. Correlated mutation network of the α/β-hydrolase fold enzymes (a-bH) visualized in pdb-code: 1VA4. This network consists of two sub-networks. The smaller sub-network is highly enriched with positions (cyan nodes) related to the keyword ‘enantioselectivity’.
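The network description in this caption (positions as nodes, pair-wise correlations as edges, node size equal to the number of edges) can be illustrated with a minimal Python sketch. The pair scores below are invented for illustration only, and the function names `build_network` and `node_degrees` are hypothetical; this is not CorNet's implementation.

```python
from collections import defaultdict


def build_network(pair_scores, cutoff):
    """Keep only position pairs whose correlation score reaches the cutoff.

    Returns a dict mapping each position to the set of its neighbours.
    """
    neighbours = defaultdict(set)
    for (pos_a, pos_b), score in pair_scores.items():
        if score >= cutoff:
            neighbours[pos_a].add(pos_b)
            neighbours[pos_b].add(pos_a)
    return dict(neighbours)


def node_degrees(network):
    """Number of edges per node; the caption uses this value as node size."""
    return {pos: len(nbrs) for pos, nbrs in network.items()}


# Hypothetical pair-wise correlation scores (assumption, not data from the paper).
toy_scores = {
    (41, 159): 0.93,
    (41, 87): 0.88,
    (41, 210): 0.85,
    (87, 210): 0.62,
    (12, 300): 0.40,
}

network = build_network(toy_scores, cutoff=0.80)
degrees = node_degrees(network)
hub = max(degrees, key=degrees.get)  # node with the most edges
print(hub, degrees[hub])  # prints: 41 3
```

In this toy input, position 41 ends up as the hub because it takes part in all three pairs that pass the cut-off; pairs below the cut-off contribute neither nodes nor edges.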
Fig 2. Escores for a series of keywords related to mutations in the families as a function of the correlated mutation analysis cut-off.
a. Keyword enrichments for the alcohol dehydrogenases (ADH). b. Keyword enrichments for the amino acid oxidases (AAO). c. Keyword enrichments for the Cupins. d. Keyword enrichments for the isocitrate lyases (ICL). e. Keyword enrichments for the UDP-Glycosyltransferases (UDP-GT). f. Keyword enrichments for a subset of the UDP-Glycosyltransferases (UDP-GT) alignment. This subset is composed of all sequences that have a proline at 3D-number 218. g. Keyword enrichments for the α/β-hydrolases (a-bH).
Enrichment scores for control keywords.
| Super-family / keyword | and | the | stability | zinc |
|---|---|---|---|---|
| ADH | 1.15 | 1.18 | 2.02 | – |
| AAO | 1.16 | 1.15 | 1.94 | 0.00 |
| Cupin | 1.00 | 0.94 | 0.87 | 1.40 |
| ICL | 4.04 | 3.83 | 0.00 | 1.00 |
| UDP-GT | 1.54 | 1.53 | 0.00 | 0.00 |
| a-bH | 0.00 | 0.00 | 1.35 | 1.57 |
The enrichments were calculated at a CMA cut-off of 0.80.
The keyword ‘zinc’ is not shown for the ADH super-family because zinc is a co-factor in this family and thus not a control keyword.
Specific activities and apparent enantioselectivity for the top esterase variants.
| Variant | Specific activity a, (R)-3PB-pNP | Specific activity a, (S)-3PB-pNP | Eapp b |
|---|---|---|---|
| | 1.44 (± 0.09) | 0.30 (± 0.11) | 5 |
| | 3.22 (± 0.19) | 0.54 (± 0.03) | 6 |
| | 4.48 (± 0.72) | 0.47 (± 0.04) | 10 |
| | 6.86 (± 1.08) | 0.51 (± 0.03) | 13 |
a One unit corresponds to 1 μmol converted min⁻¹ mg⁻¹ protein.
b Eapp is the ratio of the activities for the two enantiomers, (R)- and (S)-3PB-pNP.
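As a worked check of footnote b, the following Python sketch recomputes Eapp from the two activity columns of the table above. It assumes that the first activity column refers to (R)-3PB-pNP and the second to (S)-3PB-pNP, and that the tabulated Eapp is the rounded quotient of the two; the function name is hypothetical.

```python
def apparent_enantioselectivity(activity_r, activity_s):
    """Ratio of the two specific activities (assumed order: R over S)."""
    return activity_r / activity_s


# (activity column 1, activity column 2) as listed in the table above.
activities = [
    (1.44, 0.30),
    (3.22, 0.54),
    (4.48, 0.47),
    (6.86, 0.51),
]

e_app = [round(apparent_enantioselectivity(r, s)) for r, s in activities]
print(e_app)  # prints: [5, 6, 10, 13]
```

Under these assumptions the rounded ratios reproduce the last column of the table.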
Fig 3. Alcohol dehydrogenase family structure 1CDO-A with CMA network positions of three different alignments visualized.
The red residues represent the CMA positions for the complete super-family alignment. The yellow residues represent a network generated for a sub-alignment composed of sequences with a cysteine at 3D-number 41. The blue residues represent a network generated for a sub-sub-alignment composed of sequences with a cysteine at position 41 and a glycine at position 159. The catalytic zinc ion is shown in magenta.
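The nested sub-alignments described in this caption amount to filtering sequences by the residue found at a given alignment position. A minimal Python sketch follows; the toy alignment and the helper `select_subalignment` are hypothetical, and zero-based column indexes stand in for 3D-numbers.

```python
def select_subalignment(alignment, column, residue):
    """Keep the aligned sequences that carry `residue` at `column`.

    `alignment` maps sequence names to equal-length aligned strings;
    `column` is a zero-based index standing in for a 3D-number.
    """
    return {
        name: seq for name, seq in alignment.items() if seq[column] == residue
    }


# Toy alignment invented for illustration (not data from the paper).
toy_alignment = {
    "seq1": "ACDG",
    "seq2": "ACEG",
    "seq3": "AHDG",
    "seq4": "ACDA",
}

# Sub-alignment: sequences with a cysteine at column 1.
sub = select_subalignment(toy_alignment, column=1, residue="C")
# Sub-sub-alignment: of those, sequences with a glycine at column 3.
sub_sub = select_subalignment(sub, column=3, residue="G")

print(sorted(sub))      # prints: ['seq1', 'seq2', 'seq4']
print(sorted(sub_sub))  # prints: ['seq1', 'seq2']
```

A correlated mutation analysis would then be rerun on each filtered set, which is why the three networks in the figure can differ.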
Fig 4. Isocitrate lyase family structure 1DQU-A with CMA networks and dimer interface visualized.
a. The red residues represent the network for the complete super-family alignment. The blue residues represent the network for an alignment subset that contains a proline at 3D-number 157. b. The purple residues represent the 3D-positions that make an inter-molecular contact in most of the 70 available structures of the ICL family.
Sequences, structures, and mutations found for the six super-families.
| Name | Sequences | Core alignment positions | Structures | Articles scanned | Mutation data extracted |
|---|---|---|---|---|---|
| ADH | 14696 | 353 | 447 | 15144 | 10437 |
| AAO | 12155 | 253 | 356 | 14442 | 6203 |
| Cupin | 1650 | 43 | 338 | 53400 | 4362 |
| ICL | 3019 | 170 | 70 | 2013 | 160 |
| UDP-GT | 36402 | 313 | 475 | 26919 | 7610 |
| a-bH | 59904 | 88 | 1665 | 60926 | 60755 |
Fig 5. Example to illustrate the use of 3D-numbers.
We are interested in histidine 22 in the human sequence; however, mutation-related information from the bibliome is only available for the homologous mouse sequence. In the main text we find a description of the effect of mutating histidine 49 to an alanine. In the structure, this histidine residue is at the position equivalent to human histidine 22 and therefore shares the same 3D number (17).
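The lookup described in this caption can be sketched in Python as a two-step mapping: sequence position to 3D number, then 3D number to the position in the other sequence. The data structure and the function name `equivalent_position` are hypothetical; only the human 22, mouse 49, and 3D number 17 values come from the caption.

```python
# Per-species mapping from sequence position to 3D number
# (only the entries mentioned in the caption are filled in).
three_d_numbers = {
    "human": {22: 17},
    "mouse": {49: 17},
}


def equivalent_position(mapping, source, source_pos, target):
    """Return the position in `target` sharing the 3D number of
    `source_pos` in `source`, or None if there is no equivalent."""
    number = mapping[source].get(source_pos)
    if number is None:
        return None
    for pos, candidate in mapping[target].items():
        if candidate == number:
            return pos
    return None


# Mutation data reported for mouse His49 transfers to human His22.
print(equivalent_position(three_d_numbers, "mouse", 49, "human"))  # prints: 22
```

Because both residues map to the same 3D number, literature data reported under the mouse numbering can be attached to the human residue of interest.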
3D positions selected, codons used for library design and corresponding encoded amino acids.
| 3D position | Codons | Amino acids encoded |
|---|---|---|
| | TKG/TWT | L,W,Y,F |
| | GBC/ACC | V,A,G,T |
| | VTT/GGT | V,I,L,G |
| | GSC/ARC | A,G,N,S |
| | YAT/CGT/GTT | H,Y,R,V |
Residues in bold correspond to wild-type esterase.
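The amino acids encoded by each codon mix in the table above follow from the IUPAC nucleotide ambiguity codes and the standard genetic code. The Python sketch below expands a degenerate codon mix into its amino-acid set; the function names are hypothetical, and the standard genetic code is assumed to apply to the expression host.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate_codon):
    """List all concrete codons covered by one degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon)
    return ["".join(bases) for bases in product(*choices)]


def encoded_amino_acids(codon_mix):
    """Sorted one-letter amino acids encoded by a '/'-separated codon mix."""
    residues = set()
    for degenerate_codon in codon_mix.split("/"):
        for codon in expand_codon(degenerate_codon):
            residues.add(CODON_TABLE[codon])
    return sorted(residues)


print(encoded_amino_acids("YAT/CGT/GTT"))  # prints: ['H', 'R', 'V', 'Y']
print(encoded_amino_acids("TKG/TWT"))      # prints: ['F', 'L', 'W', 'Y']
```

The first result matches the H, Y, R, V entry listed for YAT/CGT/GTT in the table; the same call can be used to check the other rows.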