Abstract
BACKGROUND: As a rule, about 1% of the genes in a given genome encode glycoside hydrolases and their homologues. On the basis of sequence similarity, they have been grouped into more than ninety GH families over the last 15 years. The GH97 family was established very recently and initially included only 18 bacterial proteins. However, the evolutionary relationships among the genes encoding proteins of this family remain unclear, as does their distribution among the main groups of living organisms.
Year: 2005 PMID: 16131397 PMCID: PMC1249566 DOI: 10.1186/1471-2164-6-112
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Structure of . Arrows indicate the direction of gene transcription. Red arrows correspond to glycosidase (GH) and glycosyltransferase (GT) genes; the family of each is indicated. Yellow arrows correspond to genes encoding outer membrane proteins involved in starch binding (susC-susF) and their homologues. Green arrows correspond to the gene of the transcriptional activator SusR and to genes of predicted transcriptional regulators homologous to AraC.
Table I. Glycoside hydrolases analyzed in this work
| Name | Family, subfamily | Organism | Accession number (a) | Protein function (annotation) | Length (b) |
| 97A1_BACTH | GH97, 97a | | AAC44671 | alpha-glucosidase SusB | 738 |
| 97A2_BACTH | GH97, 97a | | AAO79686 | ORF: alpha-glucosidase | 719 |
| 97A3_BACTH | GH97, 97a | | AAO75790 | ORF: alpha-glucosidase | 671 |
| 97B1_BACTH | GH97, 97b | | AAO76978 | ORF: putative alpha-glucosidase | 662 |
| 97B2_BACTH | GH97, 97b | | AAO78400 | ORF: putative alpha-glucosidase | 650 |
| 97B3_BACTH | GH97, 97b | | AAO77727 | ORF: alpha-glucosidase | 649 |
| 97B4_BACTH | GH97, 97b | | AAO78269 | ORF: putative alpha-glucosidase | 674 |
| 97C1_BACTH | GH97, 97c | | AAO78766 | ORF: alpha-glucosidase | 647 |
| 97C2_BACTH | GH97, 97c | | AAO78769 | ORF: putative alpha-glucosidase | 638 |
| 97E1_BACTH | GH97, 97e | | AAO75239 | ORF: putative alpha-glucosidase | 644 |
| 97A1_BACFR | GH97, 97a | | BAD47941 | ORF: alpha-glucosidase | 719 |
| 97A2_BACFR | GH97, 97a | | BAD48072 | ORF: alpha-glucosidase | 671 |
| 97B1_BACFR | GH97, 97b | | BAD50730 | ORF: putative alpha-glucosidase | 649 |
| 97B2_BACFR | GH97, 97b | | BAD50235 | ORF: putative alpha-glucosidase | 649 |
| 97A1_TANFO | GH97, 97a | | AAO33827 | alpha-D-glucosidase SusB | 708 |
| 97A1_PREIN | GH97, 97a | | (TIGR_246198) | ORF | 733 |
| 97A1_PRERU | GH97, 97a | | (TIGR_264731) | ORF | 737 |
| 97B1_PRERU | GH97, 97b | | (TIGR_264731) | ORF | 645 |
| 97B2_PRERU | GH97, 97b | | (TIGR_264731) | ORF | 658 |
| 97C1_PRERU | GH97, 97c | | (TIGR_264731) | ORF | 621 |
| 97C2_PRERU | GH97, 97c | | (TIGR_264731) | ORF | 639 |
| 97C3_PRERU | GH97, 97c | | (TIGR_264731) | ORF | 645 |
| 97A1_SALRU | GH97, 97a | | (NC_006812) | ORF | 708 |
| 97A1_AZOVI | GH97, 97a | | EAM07225 | ORF: alpha-glucosidase | 673 |
| 97A1_XANAX | GH97, 97a | | AAM37448 | ORF: alpha-glucosidase | 693 |
| 97D1_XANAX | GH97, 97d | | AAM38156 | ORF: alpha-glucosidase | 654 |
| 97A1_XANCA | GH97, 97a | | AAM41744 | ORF: alpha-glucosidase | 692 |
| 97D1_XANCA | GH97, 97d | | AAM42433 | ORF: alpha-glucosidase | 654 |
| 97A1_MICDE | GH97, 97a | | ZP_00315606 | ORF: hypothetical protein | 684 |
| 97B1_MICDE | GH97, 97b | | ZP_00317369 | ORF: hypothetical protein | 679 |
| 97C1_MICDE | GH97, 97c | | ZP_00317507 | ORF: hypothetical protein | 674 |
| 97C2_MICDE | GH97, 97c | | ZP_00315142 | ORF: hypothetical protein | 661 |
| 97A1_SHEON | GH97, 97a | | AAN55484 | ORF: alpha-glucosidase | 699 |
| 97A1_SHEBA | GH97, 97a | | EAN43632 | ORF: alpha-glucosidase | 710 |
| 97A1_SHEFR | GH97, 97a | | EAN73178 | ORF: alpha-glucosidase | 697 |
| 97A1_SHEDE | GH97, 97a | | EAN70289 | ORF: alpha-glucosidase | 727 |
| 97A1_SHEAM | GH97, 97a | | EAN38820 | ORF: alpha-glucosidase | 676 |
| 97A1_NOVAR | GH97, 97a | | ZP_00303588 | ORF: transketolase | 682 |
| 97A1_SPHAL | GH97, 97a | | EAN45679 | ORF: alpha-glucosidase | 680 |
| 97D1_CAUCR | GH97, 97d | | AAK22781 | ORF: putative alpha-glucosidase | 670 |
| 97A1_ERYLI | GH97, 97a | | EAL74063 | ORF: alpha-glucosidase | 681 |
| 97E1_RHOBA | GH97, 97e | | CAD78916 | ORF: alpha-glucosidase | 645 |
| 97C1_LEIXY | GH97, 97c | | (NC_006087)* | ORF: similar to alpha-glucosidase | 775* |
| 97X1_SOLUS | GH97 | | EAM58489 | ORF: hypothetical protein | 619 |
| 97A1_HALMA | GH97, 97a | | AAV45265 | ORF: alpha-glucosidase | 1144 |
| 97A1_ANOGA | GH97, 97a | | (AAAB01006165) | ORF | 380* |
| 97A2_ANOGA | GH97, 97a | | (AAAB01064948) | ORF | 209* |
| 97A3_ANOGA | GH97, 97a | | (AAAB01020110) | ORF | 231* |
| 97A4_ANOGA | GH97, 97a | | (AAAB01068263) | ORF | 229* |
| 97A1_UNBAC | GH97, 97a | uncultured murine large bowel bacterium BAC31B | AAX16382 | ORF: alpha-glucosidase | 720 |
| 97A2_UNBAC | GH97, 97a | uncultured bacterium | (AY350337) | ORF | 106* |
| 97A1_ENSEQ | GH97, 97a | environmental sequence (cf. | EAJ06144* | ORF: unknown | 703 |
| 97A2_ENSEQ | GH97, 97a | environmental sequence (cf. | EAI69763 | ORF: unknown | 699 |
| 97A3_ENSEQ | GH97, 97a | environmental sequence | EAJ75652 | ORF: unknown | 714 |
| 97A4_ENSEQ | GH97, 97a | environmental sequence | EAI51202 | ORF: unknown | 713 |
| 97A5_ENSEQ | GH97, 97a | environmental sequence | EAI80962 | ORF: unknown | 702* |
| 97A6_ENSEQ | GH97, 97a | environmental sequence | EAH92811, EAI03708, EAD44407, EAG79875, EAH92819, EAI36772 | ORF: unknown | 711 |
| 97A7_ENSEQ | GH97, 97a | environmental sequence | EAJ99185, EAD99255, EAH48404, EAH57728, EAD83763, EAH04981, EAC91563, EAH85977, EAD11728 | ORF: unknown | 710 |
| 97A8_ENSEQ | GH97, 97a | environmental sequence | EAJ85380, EAH86891 | ORF: unknown | 669* |
| 97C1_ENSEQ | GH97, 97c | environmental sequence | EAD85224* | ORF: unknown | 218* |
| GH27_ORYSA | GH27, 27a | | BAB12570 | alpha-galactosidase | 417 |
| GH36_LACPL | GH36, 36A | | AAF02774 | alpha-galactosidase MelA | 738 |
| GH31_ECOLI | GH31 | | AAC76680 | alpha-xylosidase YicI | 772 |
(a) Accession numbers of protein sequences are given according to the NCBI database [72]. Accession numbers of nucleotide sequences are given (in parentheses) if the corresponding protein sequences have not been deposited. In some cases (asterisked), protein sequences were edited by changing the start codon.
(b) Protein length is the number of amino acids in the corresponding precursor. Incomplete sequences (protein fragments) are asterisked.
Figure 2. Portion of the multiple sequence alignment of the sequences analyzed. The ten-letter name of each sequence is given in the leftmost column (for the origin of the sequences see Table I). The alignment runs continuously across three panels. Distances to the N- and C-termini and the lengths of omitted fragments are indicated. Highly conserved residues are highlighted. Positions that are highly conserved within several subfamilies but carry different residues in different subfamilies are coloured. The subfamily of each GH97 sequence is indicated at the far right. Amino acid residues that interact with the substrate in the active center of GH27 and GH31 family glycosidases are marked by arrows at the bottom [50-54]. The arrow on the gray background marks the Asp residue that acts as the nucleophile in glycosidases of families GH27 and GH31. Red asterisks above and below the alignment mark three conserved positions (in red) that probably correspond to the nucleophile and proton donor in glycosidases of family GH97 (see text). The alignment of GH27_ORYSA and GH31_ECOLI is structure-based. At the bottom of the figure, the β-strands and α-helices of the (β/α)8-barrel are indicated. The first part of the barrel (β1–β4) is shown according to the known structures of GH27 and GH31 family members [51, 54]. The second part of the barrel (α4–α8) is based on a generalization of predictions for several GH97 family proteins made with the 3D-PSSM, GOR IV, and nnpredict programs.
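The idea of a "highly conserved" alignment position can be illustrated with a minimal Python sketch. It is not the procedure used to prepare the figure: the function name `conserved_columns`, the `threshold` parameter, the gap symbol `-`, and the toy sequences are all assumptions made for illustration, and the source does not state the conservation criterion applied.

```python
from collections import Counter

def conserved_columns(sequences, threshold=0.9):
    """Return 0-based indices of alignment columns in which the most
    common residue (gaps excluded from the count) is present in at
    least `threshold` of all sequences."""
    n = len(sequences)
    hits = []
    for i, column in enumerate(zip(*sequences)):
        counts = Counter(res for res in column if res != "-")
        if counts and counts.most_common(1)[0][1] / n >= threshold:
            hits.append(i)
    return hits

# Toy aligned sequences (hypothetical, not taken from the paper)
toy = ["GDWE", "GDFE", "GD-E", "GEWE"]
print(conserved_columns(toy, threshold=1.0))  # → [0, 3]
```

Lowering `threshold` to 0.75 in this toy case also admits column 1, where three of the four sequences share the same residue.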
Figure 3. Phylogenetic trees of family GH97. The trees were reconstructed with the PHYLIP package. Each node was tested by bootstrapping, and the number of supporting pseudoreplicates (out of 1000) is indicated for each internal node. The subfamily of each sequence is indicated; the bootstrap support value for each subfamily is highlighted in yellow. Red arrows point to the enzymatically characterized proteins 97A1_BACTH and 97A1_TANFO (see text). The origin of the sequences is given in Table I. (A) The maximum parsimony phylogenetic tree. The bootstrap values were determined using the maximum parsimony (PROTPARS) method. (B) The neighbor-joining phylogenetic tree. Branch length is measured as the number of amino acid substitutions per site.
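The bootstrap support values in the caption rest on a generic resampling idea that can be sketched in a few lines of Python. This is an illustration of the principle only, not the PHYLIP implementation: the function names, the toy alignment, and the representation of a tree as a set of clades are assumptions. Alignment columns are drawn with replacement to build a pseudoreplicate, a tree is inferred from each pseudoreplicate (that step is omitted here), and the support of a clade is the number of replicate trees that contain it.

```python
import random

def bootstrap_columns(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with
    replacement; the same column picks are applied to every sequence."""
    names = list(alignment)
    length = len(alignment[names[0]])
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}

def clade_support(replicate_clades, clade):
    """Count the replicate trees (each given as a set of clades, where a
    clade is a frozenset of sequence names) that contain `clade`."""
    return sum(1 for clades in replicate_clades if clade in clades)

# Toy alignment (hypothetical sequences, not data from the paper)
toy_alignment = {"seqA": "MKTAYD", "seqB": "MKSAYD", "seqC": "MRTGYE"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicate = bootstrap_columns(toy_alignment, rng)

# Hypothetical clade sets standing in for trees built from three replicates
ab = frozenset({"seqA", "seqB"})
trees = [{ab}, {frozenset({"seqA", "seqC"})}, {ab}]
support = clade_support(trees, ab)
```

With 1000 pseudoreplicates, as in the figure, `clade_support` would return a value between 0 and 1000 for each internal node.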