Md Anowar Hossain, Hairul Azman Roslan.
Abstract
β-D-N-Acetylhexosaminidase, a family 20 glycosyl hydrolase, catalyzes the removal of β-1,4-linked N-acetylhexosamine residues from oligosaccharides and their conjugates. We constructed a phylogenetic tree of β-hexosaminidases to analyze the evolutionary history and predicted functions of plant hexosaminidases. The phylogenetic analysis reveals a complex evolutionary history of plant β-hexosaminidases that can be described by gene duplication events. The 3D structure of tomato β-hexosaminidase (β-Hex-Sl) was predicted by homology modeling using 1now as a template. Structural conformity studies of the best-fit model showed that more than 98% of the residues lie inside the favoured and allowed regions, whereas only 0.9% lie in the unfavourable region. The predicted 3D structure contains 531 amino acid residues with a glycosyl hydrolase20b domain I and a glycosyl hydrolase20 superfamily domain II, including the (β/α)8 barrel in the central part. The α and β contents of the modeled structure were found to be 33.3% and 12.2%, respectively. Eleven amino acids were found to be involved in the ligand-binding site; Asp(330) and Glu(331) could play important roles in enzyme-catalyzed reactions. The predicted model provides a structural framework that can guide hypotheses for β-Hex-Sl mutagenesis experiments exploring the functions of this class of enzymes in the plant kingdom.
Year: 2014 PMID: 25165734 PMCID: PMC4129151 DOI: 10.1155/2014/186029
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Figure 1. Conserved domains of tomato β-hexosaminidase, analyzed using the Conserved Domain Database search in NCBI-BLAST.
Figure 2. Sequence alignment of β-Hex-Sl with nine other sequences by CD search. The amino acid residues Arg(178), Asp(207), His(261), Asp(330), Glu(331), Trp(378), Trp(404), Tyr(430), Asp(432), Trp(494), and Glu(496) were predicted to be responsible for the activity of β-Hex-Sl. The conserved amino acids are shown in yellow.
Table 1. Protein sequences used for the phylogenetic analysis.
| SL | GI number | Name used in the tree | Description | Organism | Taxonomy |
|---|---|---|---|---|---|
| 1. | 4261632 | Homo sapiens-A | | | Eukaryota (Primates) |
| 2. | 426379627 | Gorilla gorilla gorilla | | | // |
| 3. | 329112561 | Pongo abelii-A | Predicted | | // |
| 4. | 332844225 | Pan troglodytes-A | | | // |
| 5. | 387849165 | Macaca mulatta-A | | | // |
| 6. | 402874775 | Papio anubis-A | | | // |
| 7. | 635134633 | Chlorocebus sabaeus-A | | | // |
| 8. | 296213630 | Callithrix jacchus-A | | | // |
| 9. | 640780361 | Tarsius syrichta-A | | | // |
| 10. | 441617200 | Nomascus leucogenys-A | Predicted | | // |
| 11. | 867691 | Homo sapiens-B | | | // |
| 12. | 114599673 | Pan troglodytes-B | | | // |
| 13. | 297675458 | Pongo abelii-B | Predicted | | // |
| 14. | 635028815 | Chlorocebus sabaeus-B | Predicted | | // |
| 15. | 388454685 | Macaca mulatta-B | | | // |
| 16. | 402871850 | Papio anubis-B | Predicted | | // |
| 17. | 296194339 | Callithrix jacchus-B | | | // |
| 18. | 403256462 | Saimiri boliviensis-B | Predicted | | // |
| 19. | 478492476 | Ceratotherium simum-B | Predicted | | // |
| 20. | 17647501 | Drosophila melanogaster-1 | | | Eukaryota (Insect) |
| 21. | 557771663 | Musca domestica1 | | | // |
| 22. | 498964043 | Ceratitis capitata1 | | | // |
| 23. | 498931058 | Ceratitis capitata1-1 | | | // |
| 24. | 157106934 | Aedes aegypti1 | | | // |
| 25. | 170057261 | Culex quinquefasciatus1 | | | // |
| 26. | 508082176 | Spodoptera frugiperda | | | // |
| 27. | 294988604 | Agrotis ipsilon | | | // |
| 28. | 19072855 | Trichoplusia ni | | | // |
| 29. | 62722476 | Choristoneura fumiferana | | | // |
| 30. | 114842947 | Ostrinia furnacalis1 | | | // |
| 31. | 37678109 | Manduca sexta | | | // |
| 32. | 17933586 | Drosophila melanogaster-2 | | | // |
| 33. | 557764625 | Musca domestica2 | | | // |
| 34. | 499003284 | Ceratitis capitata2 | | | // |
| 35. | 157117066 | Aedes aegypti2 | | | // |
| 36. | 642910295 | Tribolium castaneum | | | // |
| 37. | 170029661 | Culex quinquefasciatus2 | | | // |
| 38. | 157804574 | Ostrinia furnacalis2 | | | // |
| 39. | 145651816 | Bombyx mori | | | // |
| 40. | 350540008 | Solanum lycopersicum2 | | | Eukaryota (Plantae) |
| 41. | 565386664 | Solanum tuberosum2 | Predicted | | // |
| 42. | 315440799 | Capsicum annuum2 | | | // |
| 43. | 225450263 | Vitis vinifera2 | Predicted | | // |
| 44. | 449532074 | Cucumis sativus2 | Predicted | | // |
| 45. | 255581813 | Ricinus communis | Putative | | // |
| 46. | 440355382 | Prunus persica2 | | | // |
| 47. | 568858509 | Citrus sinensis2 | Predicted | | // |
| 48. | 15220590 | Arabidopsis thaliana2 | | | // |
| 49. | 568879684 | Citrus sinensis3 | Predicted | | // |
| 50. | 356528621 | Glycine max2 | Predicted | | // |
| 51. | 357116549 | Brachypodium distachyon2 | Predicted | | // |
| 52. | 30694211 | Arabidopsis thaliana1 | | | // |
| 53. | 567186303 | Eutrema salsugineum | Hypothetical protein | | // |
| 54. | 449459940 | Cucumis sativus1 | Predicted | | // |
| 55. | 356568953 | Glycine max1 | Predicted | | // |
| 56. | 401065909 | Prunus persica1 | | | // |
| 57. | 565358237 | Solanum tuberosum1 | Predicted | | // |
| 58. | 350538741 | Solanum lycopersicum1 | Predicted | | // |
| 59. | 357134815 | Brachypodium distachyon1 | | | // |
| 60. | 573945166 | Oryza brachyantha | Predicted | | // |
| 61. | 115461737 | Oryza sativa | Putative | | // |
| 62. | 169766420 | Aspergillus oryzae | | | Eukaryota (Fungi) |
| 63. | 238483137 | Aspergillus flavus | Putative | | // |
| 64. | 115491163 | Aspergillus terreus | Putative | | // |
| 65. | 119484544 | Neosartorya fischeri | Putative | | // |
| 66. | 145241784 | Aspergillus niger | | | // |
| 67. | 70983560 | Aspergillus fumigatus | | | // |
| 68. | 358375826 | Aspergillus kawachii-1 | | | // |
| 69. | 121719823 | Aspergillus clavatus | Putative | | // |
| 70. | 358372216 | Aspergillus kawachii-2 | | | // |
| 71. | 525585306 | Penicillium oxalicum | Putative | | // |
| 72. | 557727225 | Byssochlamys spectabilis | Putative | | // |
| 73. | 13786695 | Streptomyces plicatus | | | Prokaryote (Bacteria) |
| 74. | 494714113 | Streptomyces coelicoflavus | Predicted | | // |
| 75. | 511095822 | Streptomyces lividans | Putative | | // |
| 76. | 490099150 | Streptomyces viridochromogenes1 | Putative | | // |
| 77. | 499338878 | Streptomyces coelicolor1 | Putative | | // |
| 78. | 640930344 | Streptomyces olindensis | Predicted | | // |
| 79. | 493092893 | Streptomyces gancidicus | Predicted | | // |
| 80. | 594145706 | Streptomyces coelicolor2 | | | // |
| 81. | 505473521 | Streptomyces davawensis | | | // |
| 82. | 490088482 | Streptomyces viridochromogenes2 | | | // |
| 83. | 119720203 | Thermofilum pendens | Glycoside hydrolase family protein | | Archaea |
Bold font indicates the experimentally characterized beta-N-acetylhexosaminidases.
Figure 3. Phylogenetic tree based on β-hexosaminidase amino acid sequences, obtained by the maximum likelihood method. Thermofilum (Archaea) was used as an outgroup to reconstruct the phylogenetic tree. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (100 replicates) is shown next to the branches. All analyses were performed with the WAG amino acid substitution model and 1 invariable and 4 gamma-distributed site rate categories. Detailed information about the sequences is shown in Table 1.
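The bootstrap values in Figure 3 are percentages of replicate trees that recover a given grouping. A minimal sketch of that calculation, assuming each replicate tree has already been reduced to a set of clades; the taxa labels, the toy replicates, and the function name `bootstrap_support` are hypothetical and are not taken from the study:

```python
from collections import Counter


def bootstrap_support(replicate_trees):
    """Return {clade: percentage of replicate trees that contain the clade}.

    Each replicate tree is given as a set of clades, and each clade is a
    frozenset of taxon labels.
    """
    counts = Counter(clade for tree in replicate_trees for clade in tree)
    n_replicates = len(replicate_trees)
    return {clade: 100.0 * n / n_replicates for clade, n in counts.items()}


# Hypothetical toy data: four replicates over the taxa A, B, C, and D.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
bd = frozenset({"B", "D"})
replicates = [{ab, cd}, {ab, cd}, {ab, cd}, {ac, bd}]

support = bootstrap_support(replicates)
for clade, percent in support.items():
    print(sorted(clade), f"{percent:.0f}%")
```

With 100 replicates, as in the figure, the same function would return the percentages that are drawn next to the branches.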
Figure 4. Ramachandran plot of the modeled structure of tomato β-N-acetyl hexosaminidase provided by PROCHECK.
Figure 5. ANOLEA, QMEAN, and DSSP (Define Secondary Structure of Proteins) results obtained from the structural assessment by the SWISS-MODEL Workspace online software.
Figure 6. Molecular 3D modeling of tomato β-N-acetyl hexosaminidase (β-Hex-Sl). SPDB viewer and Chimera were used to prepare the images. (a) The predicted 3D structure shown as a ribbon diagram. The structure contains two domains (I and II) comprising α-helices (red), β-pleated sheets (purple), and coils (gray). The catalytic domain II is a (β/α)8 barrel with the active site located at the C terminus of the barrel. The template used for building this structure was 1now_B (PDB). (b) Magic-fit superimposition of the modeled β-Hex-Sl structure (blue) with the template structure human 1now, human β-N-acetyl-hexosaminidase (red), and human β-hexosaminidase B-subunit. (c) The predicted ligand-binding site (active site) residues, depicted in blue. (d) Space-filled view of the ligand-binding site of β-Hex-Sl with the docked substrate N-acetyl-β-D-glucosamine (NAG).
Table 2. Top 10 structural analogs in PDB identified by COFACTOR.
| Rank | PDB Hit | TM-score | RMSDa | IDENa | Cov. |
|---|---|---|---|---|---|
| 1 | 1nowB | 0.785 | 2.21 | 0.339 | 0.823 |
| 2 | 2gjxH | 0.777 | 2.60 | 0.298 | 0.827 |
| 3 | 3s6tA | 0.769 | 3.07 | 0.297 | 0.844 |
| 4 | 1c7sA | 0.751 | 3.66 | 0.198 | 0.848 |
| 5 | 3rcnA | 0.723 | 3.76 | 0.230 | 0.815 |
| 6 | 4h04A | 0.709 | 4.33 | 0.168 | 0.842 |
| 7 | 3gh7A | 0.707 | 3.63 | 0.244 | 0.795 |
| 8 | 1hp5A | 0.701 | 3.47 | 0.236 | 0.783 |
| 9 | 2eplX | 0.671 | 3.89 | 0.120 | 0.787 |
| 10 | 1qba_3 | 0.566 | 3.18 | 0.236 | 0.622 |
TM-score is a measure of global structural similarity between query and template protein.
RMSDa is the RMSD between residues that are structurally aligned by TM-align.
IDENa is the percentage sequence identity in the structurally aligned region.
Cov. represents the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by length of the query protein.
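The coverage definition above can be inverted to estimate how many residues were structurally aligned for each hit. A minimal sketch, assuming a query length of 531 residues (the size of the model reported in the abstract; the length actually used by COFACTOR may differ), with the hypothetical helper `aligned_residues`:

```python
# Assumption: query length taken from the 531-residue model in the abstract.
QUERY_LENGTH = 531

# (PDB hit, TM-score, coverage) for the three top-ranked structural analogs.
hits = [
    ("1nowB", 0.785, 0.823),
    ("2gjxH", 0.777, 0.827),
    ("3s6tA", 0.769, 0.844),
]


def aligned_residues(coverage, query_length=QUERY_LENGTH):
    """Invert Cov. = aligned residues / query length."""
    return round(coverage * query_length)


aligned = {pdb: aligned_residues(cov) for pdb, _tm_score, cov in hits}
for pdb, n_aligned in aligned.items():
    print(f"{pdb}: about {n_aligned} aligned residues")
```

Under this assumption the top-ranked hit 1nowB aligns over roughly 437 of the 531 residues.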
Table 3. Top 5 enzyme homologs in PDB identified by COFACTOR.
| Rank | CscoreEC | PDB Hit | TM-score | RMSDa | IDENa | Cov. | EC number | Predicted active site residues |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.576 | 2gjxA | 0.776 | 2.53 | 0.300 | 0.825 | 3.2.1.52 | 330, 331 |
| 2 | 0.512 | 1hp4A | 0.698 | 3.47 | 0.236 | 0.781 | 3.2.1.52 | 330, 331 |
| 3 | 0.508 | 3gh4A | 0.706 | 3.64 | 0.244 | 0.795 | 3.2.1.52 | 330, 331 |
| 4 | 0.173 | 1o7aA | 0.784 | 2.34 | 0.338 | 0.825 | 3.2.1.52 | 330, 331 |
| 5 | 0.142 | 1yhtA | 0.502 | 3.60 | 0.166 | 0.565 | 3.2.1.52 | 330, 331 |
CscoreEC is the confidence score for the enzyme classification (EC) number prediction. CscoreEC values range between 0 and 1, where a higher score indicates a more reliable EC number prediction.
TM-score is a measure of global structural similarity between query and template protein.
RMSDa is the RMSD between residues that are structurally aligned by TM-align.
IDENa is the percentage sequence identity in the structurally aligned region.
Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
Table 4. Template proteins with similar binding sites identified by COFACTOR.
| Rank | CscoreLB | PDB Hit | TM-score | RMSDa | IDENa | Cov. | BS-score | Lig. Name | Predicted binding sites |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.64 | 3lmyA | 0.78 | 2.19 | 0.345 | 0.82 | 1.55 | CP6 | 178, 204, 207, 261, 330, 404, 430, 432, 433, 494, 496 |
| 2 | 0.45 | 2gk1G | 0.77 | 2.56 | 0.300 | 0.82 | 1.50 | NGT | 178, 251, 330, 331, 378, 404, 429, 494, 496 |
| 3 | 0.06 | 2gjx1 | 0.78 | 2.56 | 0.344 | 0.82 | 0.95 | Peptide | 178, 179, 227, 228, 230, 231, 464, 496, 497, 499, 500, 501, 502, 505, 506 |
CscoreLB is the confidence score of the predicted binding site. CscoreLB values range between 0 and 1, where a higher score indicates a more reliable binding-site prediction.
BS-score is a measure of local similarity (sequence and structure) between the template binding site and the predicted binding site in the query structure. Based on large-scale benchmarking analysis, a BS-score > 1 reflects a significant local match between the predicted and template binding sites.
TM-score is a measure of global structural similarity between query and template protein.
RMSDa is the RMSD between residues that are structurally aligned by TM-align.
IDENa is the percentage sequence identity in the structurally aligned region.
Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
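The BS-score threshold described above can be applied to the three templates in the table to see which predicted residues are supported by more than one significant match. A minimal sketch using only the values listed in the table; the variable names and the choice to intersect the residue sets are illustrative assumptions, not the authors' procedure:

```python
# Binding-site templates copied from the table: PDB hit, BS-score, residues.
templates = [
    {"pdb": "3lmyA", "bs_score": 1.55,
     "residues": {178, 204, 207, 261, 330, 404, 430, 432, 433, 494, 496}},
    {"pdb": "2gk1G", "bs_score": 1.50,
     "residues": {178, 251, 330, 331, 378, 404, 429, 494, 496}},
    {"pdb": "2gjx1", "bs_score": 0.95,
     "residues": {178, 179, 227, 228, 230, 231, 464, 496, 497, 499,
                  500, 501, 502, 505, 506}},
]

# A BS-score above 1 is described as a significant local match.
BS_THRESHOLD = 1.0
significant = [t for t in templates if t["bs_score"] > BS_THRESHOLD]

# Residues predicted by every significant template, and by at least one.
shared = set.intersection(*(t["residues"] for t in significant))
combined = set.union(*(t["residues"] for t in significant))

print("significant templates:", [t["pdb"] for t in significant])
print("shared residues:", sorted(shared))
print("combined residues:", sorted(combined))
```

Only the first two templates pass the threshold, and they agree on residues 178, 330, 404, 494, and 496.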
Table 5. Consensus prediction of gene ontology terms by COFACTOR.
| Molecular function GO term | GO score | Biological process GO term | GO score | Cellular component GO term | GO score |
|---|---|---|---|---|---|
| GO:0043169 | 0.96 | GO:0006689 | 0.80 | GO:0016020 | 0.80 |
| GO:0046982 | 0.80 | GO:0030203 | 0.80 | GO:0005764 | 0.80 |
| GO:0005529 | 0.56 | GO:0042552 | 0.80 | GO:0005625 | 0.56 |
| GO:0016231 | 0.56 | GO:0050885 | 0.80 | GO:0001669 | 0.56 |
| GO:0042803 | 0.56 | GO:0019915 | 0.80 | | |
| | | GO:0007605 | 0.80 | | |
| | | GO:0007040 | 0.80 | | |
| | | GO:0001501 | 0.80 | | |
| | | GO:0008219 | 0.80 | | |
| | | GO:0031323 | 0.56 | | |
Table 5 shows the consistency of function (GO terms) among the top-scoring templates. The GO score associated with each prediction is defined as the average weight of the GO term, where the weights are assigned based on the CscoreGO of the template from which the GO term is derived.