Vivek Anantharaman, L Aravind.
Abstract
BACKGROUND: Eukaryotic extracellular matrices such as proteoglycans, sclerotinized structures, mucus, external tests, capsules, cell walls and waxes contain highly modified proteins, glycans and other composite biopolymers. Using comparative genomics and sequence profile analysis, we identify several novel enzymes that could potentially be involved in the modification of cell-surface glycans or glycoproteins.
Year: 2010 PMID: 20056006 PMCID: PMC2824669 DOI: 10.1186/1745-6150-5-1
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Figure 1. Multiple alignment of representatives of the PC-Esterase family. Multiple sequence alignment of the PC-Esterase domain was constructed using Kalign after parsing high-scoring pairs from PSI-BLAST search results. The alignment was refined based on the pairwise alignments produced by the profile-profile searches with the HHpred program using the PC-Esterase profile. The secondary structure from the crystal structures is shown above the alignment, with E representing a strand and H a helix. The 85% consensus shown below the alignment was derived using the following amino acid classes: hydrophobic (h: ALICVMYFW, yellow shading); small (s: ACDGNPSTV, green); polar (p: CDEHKNQRST, blue) and its charged subset (c: DEHKR, pink); and big (b: FILMQRWYEK, grey shading). The limits of the domains are indicated by the residue positions at each end of the sequence. The numbers within the alignment indicate non-conserved inserts that are not shown. The sequences are denoted by their gene name followed by the species abbreviation and GenBank Identifier (gi). The species abbreviations are Afum: Aspergillus fumigatus; Atha: Arabidopsis thaliana; Bflo: Branchiostoma floridae; Bfuc: Botryotinia fuckeliana; Caps: Capitella sp. I; Cele: Caenorhabditis elegans; Cneo: Cryptococcus neoformans; Dpul: Daphnia pulex; Efae: Enterococcus faecalis; Hsap: Homo sapiens; Lgig: Lottia gigantea; Mmus: Mus musculus; Nvec: Nematostella vectensis; Ppat: Physcomitrella patens; Spur: Strongylocentrotus purpuratus; Xtro: Xenopus tropicalis. The bacterial (1yzf) prototype of the GDSL/SGNH family is the first sequence shown. The conserved residues of this family are marked by triangles above the alignment, with the red triangles denoting active-site residues. The conserved residues of the PC-Esterase domains are shown using circles below the alignment, with the red ones denoting active-site residues. The PC-Esterase domains also have an extra N-terminal Helix-0 that is not seen in the canonical SGNH domains.
The PC-Esterase domain in the Lottia gigantea protein has an NxxxH motif instead of DxxH, with the polar N shifted one position to the left. The metazoan C7orf58 proteins have an S in place of the D in the catalytic triad, but the hydroxyl group of this residue can function equivalently. The subfamily names are shown to the right.
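The class-based consensus described in the Figure 1 legend can be illustrated with a minimal Python sketch. The residue classes and the 85% threshold are taken from the legend; everything else is an assumption made for illustration and is not the authors' actual procedure: the function names `column_consensus` and `consensus` are hypothetical, a single residue reaching the threshold is reported in upper case, classes are tried from the smallest residue set to the largest, gaps count toward the column total, and "." marks a column with no consensus.

```python
from collections import Counter

# Residue classes as listed in the Figure 1 legend.
CLASSES = {
    "h": set("ALICVMYFW"),   # hydrophobic
    "s": set("ACDGNPSTV"),   # small
    "p": set("CDEHKNQRST"),  # polar
    "c": set("DEHKR"),       # charged subset of polar
    "b": set("FILMQRWYEK"),  # big
}

# Assumption: the most specific (smallest) class is preferred.
# sorted() is stable, so equal-sized classes keep the order given above.
CLASS_ORDER = sorted(CLASSES, key=lambda label: len(CLASSES[label]))


def column_consensus(column, threshold=0.85):
    """Return the consensus symbol for one alignment column.

    Assumption: gaps ('-') count toward the column total, so gappy
    columns rarely reach the threshold.
    """
    n = len(column)
    counts = Counter(column.upper())
    residue, top = counts.most_common(1)[0]
    # A single conserved residue takes precedence over any class.
    if residue != "-" and top / n >= threshold:
        return residue
    for label in CLASS_ORDER:
        covered = sum(cnt for aa, cnt in counts.items() if aa in CLASSES[label])
        if covered / n >= threshold:
            return label
    return "."  # no residue or class reaches the threshold


def consensus(aligned, threshold=0.85):
    """Return the consensus line for a list of equal-length aligned sequences."""
    return "".join(
        column_consensus("".join(col), threshold) for col in zip(*aligned)
    )


if __name__ == "__main__":
    # Toy alignment, not taken from the paper.
    toy = ["DKAL", "DRIV", "DEL-", "DHVG"]
    print(consensus(toy))  # prints "Dch."
```

With only four toy sequences, the 85% threshold effectively requires every sequence to agree, which is why the last column (containing a gap) yields no consensus. Other tools may resolve overlapping classes in a different order, so the symbol reported for a given column depends on the precedence assumed here.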
Figure 2. A) A classification scheme for the PC-Esterase proteins. Each family is represented by an ellipse, while the higher-level relationships such as the groups and the core cluster are represented as ellipses surrounding them. The outer circle represents the classical GDSL/SGNH domains. The PSI-BLAST e-values provide a numerical measure of these relationships: each subfamily recovers other members of the same family at e-values in the range of 10^-64 to 10^-7 within a single iteration; the members of any group are recovered by members of other groups in multiple iterations. The domain architecture representative of each family is shown along with the gene name and organism. The phyletic pattern of the family is given and any lineage-specific expansion is shown in red. The class abbreviations are Cho - Chordates, Cru - crustacea, Ann - annelid, Fun - Fungi, Asc - Ascomycota, Basi - Basidiomycota, B-Fun - Basal Fungi, Ani - Animals, Ech - echinodermata, Mol - molluscs, Cni - cnidaria, Nem - Nematoda, Pla - Plants, Chl - Chlorophytes, Str - Stramenopiles, Hap - Haptophyceae, and the species abbreviations are as above. B) Domain architectures of the Cadherin-like domains. The domain architectures of the Cadherin-like (Cad-L) domains are shown. The gene name, organism name and gi are given below. The phyletic pattern is also given if the architecture is widespread. Domain abbreviations: Fn3 - Fibronectin 3; CBM - Carbohydrate binding domain; LamGL - LamG-like jelly roll domain; CWB - Cell Wall Binding repeats.