Peng Zhang, Zheng Zhang, Lijuan Zhang, Jingjing Wang, Changsheng Wu.
Abstract
Glycosyltransferases (GTs) transfer glycosyl moieties from activated sugar donors to specific acceptors. Among them, the GT1 family enzymes are known for their outstanding glycosylation capacity toward diverse natural products, such as glycolipids, flavonoids, and macrolides. However, a systematic collation of the members of this important family is still lacking. In this minireview, all GT1 family sequences were phylogenetically analyzed, and the grouping of GT1 proteins exhibited a pattern that depends on the taxonomic domain of life, revealing many untapped clades of GTs. Further phylogenetic analysis of the characterized GTs facilitated the classification of the substrate coverage of GT1 family enzymes from different life domains: GTs from bacteria tolerate a wider spectrum of chemical skeletons as substrates, showing higher promiscuity than those from other domains. Furthermore, the sequence lengths of GTs from different domains were compared to understand their different substrate selectivities. Based on multiple sequence alignments of 28 representative GT1 enzymes with crystal structures, two critical regions located in the N-terminal portion of GTs were identified; these regions are the most variable among sequences from different taxonomic domains and are essential for substrate binding and/or catalysis. The key roles of these two regions were validated by enumerating the influential residues that interact with substrates in representative structures from bacteria and plants. This atlas of the GT1 family in terms of phylogeny, substrate selectivity, sequence length, and critical motifs provides clues for the exploration of unknown GT1s and the rational engineering of known enzymes to synthesize novel, promising glycoconjugates for pharmaceutical application.
Keywords: Distinct binding motifs; GT1 family; Glycosyltransferase (GT); Phylogenetic distribution; Sequence length disparity; Substrates spectrum
Year: 2020 PMID: 32637037 PMCID: PMC7316871 DOI: 10.1016/j.csbj.2020.06.003
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. The distribution statistics of GT1 enzymes from diverse life domains. (A) The GT1 family exhibits an uneven distribution pattern across the domains of life. (B) The statistics of paralogs of GT1 family genes in different life domains.
Fig. 2. The phylogenetic tree of all GT1 family members. The three groups A, B and C are distinguished by different background colors. The origin of each GT1 is color-labeled in the peripheral circle. The characterized GT1 enzymes in the branches are highlighted by red lines; the functions of the GT1s in many clades await characterization. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Phylogenetic tree analysis of the characterized GTs from the GT1 family. The three groups A1, B1 and C1 are color-labeled with different backgrounds corresponding to those in Fig. 2. The substrate specificities of the characterized GT1 family enzymes are depicted by different colors in the peripheral circle. The characterized GT1 enzymes with reported co-crystal structures are highlighted by red lines. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. The proportion of proteins of different lengths from each life domain. Sequence lengths were binned in intervals of 20 amino acid residues, with each bin plotted as one point. The frequency and variation of protein sequence length are represented by peak height and width, respectively.
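The binning described in the Fig. 4 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `bin_lengths`, the choice to start bins at multiples of 20, and the example lengths are all assumptions introduced here.

```python
from collections import Counter

BIN_WIDTH = 20  # bin width in amino acid residues, as stated in the caption


def bin_lengths(lengths, bin_width=BIN_WIDTH):
    """Return {bin_start: proportion} for a list of sequence lengths.

    Assumption: bins start at multiples of bin_width (0, 20, 40, ...).
    The caption does not say where the bin edges fall.
    """
    if not lengths:
        return {}
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    total = len(lengths)
    return {start: counts[start] / total for start in sorted(counts)}


# Hypothetical lengths (aa) for illustration only; not data from the paper.
example_lengths = [385, 392, 401, 404, 418, 455]
proportions = bin_lengths(example_lengths)

for start, share in proportions.items():
    print(f"{start}-{start + BIN_WIDTH - 1} aa: {share:.2f}")
```

Plotting the proportions per bin for each life domain would give one curve per domain, with peak height reflecting how common a length is and peak width reflecting how much lengths vary.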
Comparison of the average length (aa) of GTs from different life domains.
| GTs from different life domains | Length (aa) of whole protein | Length (aa) of Region A | Length (aa) of Region B |
|---|---|---|---|
| 17 BGTs with crystal structures | 400.8 ± 22.0 | 38.06 ± 9.66 | 43.29 ± 14.03 |
| 11 PGTs with crystal structures | 460.7 ± 14.9 | 24.36 ± 2.77 | 63.00 ± 13.39 |
| All archaea-derived GTs (ArGTs) | 400.6 ± 69.8 | 37.11 ± 7.56 | 21.68 ± 6.76 |
| All bacteria-derived GTs (BGTs) | 402.0 ± 27.9 | 39.85 ± 8.03 | 41.45 ± 14.62 |
| All plant-derived GTs (PGTs) | 470.3 ± 31.7 | 40.89 ± 14.02 | 49.45 ± 13.33 |
| All fungi-derived GTs (FGTs) | 479.1 ± 68.1 | 27.77 ± 6.70 | 64.48 ± 10.15 |
| All animal-derived GTs (AGTs) | 494.7 ± 42.0 | 47.31 ± 7.02 | 73.13 ± 10.48 |
| All virus-derived GTs (VGTs) | 514.3 ± 38.0 | 45.89 ± 3.52 | 75.48 ± 5.31 |
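To compare domains on a common scale, one can relate the combined mean length of Region A and Region B to the mean whole-protein length. The sketch below uses only the mean values from the table; the helper name `region_share` is hypothetical. Note that a ratio of means is only a rough proxy for the mean per-protein ratio, and the ± spreads in the table are ignored here.

```python
# Mean lengths (aa) copied from the table above:
# (whole protein, Region A, Region B) for the "All ..." rows.
MEAN_LENGTHS = {
    "ArGTs": (400.6, 37.11, 21.68),
    "BGTs": (402.0, 39.85, 41.45),
    "PGTs": (470.3, 40.89, 49.45),
    "FGTs": (479.1, 27.77, 64.48),
    "AGTs": (494.7, 47.31, 73.13),
    "VGTs": (514.3, 45.89, 75.48),
}


def region_share(whole, region_a, region_b):
    """Fraction of the mean protein length covered by Regions A and B.

    Assumption: this ratio of means is an illustrative summary only;
    it is not a statistic reported in the source.
    """
    return (region_a + region_b) / whole


shares = {domain: region_share(*vals) for domain, vals in MEAN_LENGTHS.items()}

# Print domains from the smallest to the largest combined share.
for domain, share in sorted(shares.items(), key=lambda kv: kv[1]):
    print(f"{domain}: {share:.1%}")
```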
Fig. 5. Comparison of the structures and sequence lengths of Region A and Region B in BGTs and PGTs. (A) Representative crystal structures of the bacterial protein GtfA and the plant protein UGT78K6. The major differences between GtfA and UGT78K6 lie in Region A and Region B, which are depicted in red and blue, respectively. (B) The multiple sequence alignment of the 10 GT sequences for which co-crystal structures are known, including 4 BGTs labeled in orange and 6 PGTs labeled in green. Regions A and B are marked with lines, and the numbers in brackets give the number of amino acids omitted. The influential amino acid residues that interact with substrates in the co-crystals are marked with blue boxes. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)