Prediction of mucin-type O-glycosylation sites by a two-staged strategy.

YuDong Cai, JianFeng He, Lin Lu.

Abstract

Mucin-type O-glycosylation is an important type of protein post-translational modification. It is mediated by a family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which transfer N-acetylgalactosamine (GalNAc) to serine or threonine residues with unknown specificity. To determine the glycosylation sites of a given protein, we present a two-stage prediction method that first determines whether the protein is a glycoprotein and then determines the glycosylation sites of proteins predicted to be glycosylated in the first stage. In the first stage, a protein is encoded by its protein families in Pfam, an annotated database of classified protein families, and is then classified by a predictor trained on the training set. In the second stage, nonapeptides from the predicted mucin-type glycoproteins, with a serine or threonine residue at the fifth position, are represented by amino acid indices from AAindex. A predictor built on features chosen by two feature selection methods, Maximum Relevance Minimum Redundancy (mRMR) and Incremental Feature Selection, then predicts whether GalNAc is attached to each nonapeptide. The prediction accuracy of the first stage is 94.9%, validated by the leave-one-out method; the prediction accuracy of the second stage is 99.4%. These results show that the method is valuable for studying mucin-type O-glycosylation. Analysis of the features used to construct the second-stage predictor confirms results previously obtained by other groups. The residues at positions -1 and +3 have a great impact on the prediction. Among the other amino acid indices, those describing alpha and turn propensities and the hydrophobicity of the residues in the nonapeptide also influence recognition by the GalNAc transferases. A web server is available at http://chemdata.shu.edu.cn/gal/.
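The second-stage input described above (nonapeptides with serine or threonine at the fifth position, represented by amino acid indices) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `X` padding used near the sequence termini, and the values in `TOY_INDEX` are all assumptions made for illustration; the abstract does not say how terminal residues are handled, and the toy values are not taken from AAindex.

```python
from typing import Dict, List, Tuple

FLANK = 4  # residues on each side of the central Ser/Thr, giving a 9-residue window


def extract_nonapeptides(sequence: str, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, nonapeptide) for every Ser/Thr in the sequence.

    Padding with `pad` near the termini is an assumption of this sketch.
    """
    padded = pad * FLANK + sequence + pad * FLANK
    windows = []
    for i, residue in enumerate(sequence):
        if residue in ("S", "T"):
            # The residue sits at i + FLANK in padded coordinates,
            # so the window starting at i places it at the fifth position.
            windows.append((i + 1, padded[i:i + 2 * FLANK + 1]))
    return windows


# Hypothetical per-residue index with placeholder values (NOT real AAindex data).
TOY_INDEX: Dict[str, float] = {"A": 1.0, "P": -1.0, "S": -0.5, "T": -0.4}


def encode(window: str, index: Dict[str, float]) -> List[float]:
    """Map each residue of a window to its index value (0.0 if unknown or padding)."""
    return [index.get(residue, 0.0) for residue in window]


if __name__ == "__main__":
    for position, peptide in extract_nonapeptides("APPTSAPA"):
        print(position, peptide, encode(peptide, TOY_INDEX))
```

In a full pipeline each window would be encoded with many indices, and a feature selection step such as mRMR followed by incremental feature selection would choose which of the resulting features the predictor uses; that step is omitted here.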

Year: 2010    PMID: 20652405    DOI: 10.1007/s11030-010-9240-y

Source DB: PubMed    Journal: Mol Divers    ISSN: 1381-1991    Impact factor: 2.943


