GalNAc-transferase specificity prediction based on feature selection method.

Lin Lu, Bing Niu, Jun Zhao, Liang Liu, Wen-Cong Lu, Xiao-Jun Liu, Yi-Xue Li, Yu-Dong Cai.

Abstract

GalNAc-transferase catalyzes the biosynthesis of O-linked oligosaccharides. Its specificity is determined by nine amino acid residues, denoted R4, R3, R2, R1, R0, R1', R2', R3', R4'. To predict whether the reducing monosaccharide will be covalently linked to the central residue R0 (Ser or Thr), we propose a new method based on feature selection. The training set consists of 277 nonapeptides taken from [Chou KC. A sequence-coupled vector-projection model for predicting the specificity of GalNAc-transferase. Protein Sci 1995;4:1365-83]. Each nonapeptide is represented by hundreds of amino acid properties collected from the Amino Acid Index database (http://www.genome.jp/aaindex) and transformed into a numeric vector with 4554 features. The Maximum Relevance Minimum Redundancy (mRMR) method, combined with Incremental Feature Selection (IFS) and Feature Forward Selection (FFS), is then applied for feature selection, and the Nearest Neighbor Algorithm (NNA) is used to build the prediction models. The optimal model contains 54 features, and its correct prediction rate under the jackknife cross-validation test reaches 91.34%. Final feature analysis indicates that the amino acid residue at position R3' plays the most important role in the recognition of GalNAc-transferase specificity, which is confirmed by experiments [Elhammer AP, Poorman RA, Brown E, Maggiora LL, Hoogerheide JG, Kezdy FJ. The specificity of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase as inferred from a database of in vivo substrates and from the in vitro glycosylation of proteins and peptides. J Biol Chem 1993;268:10029-38; O'Connell BC, Hagen FK, Tabak LA. The influence of flanking sequence on the O-glycosylation of threonine in vitro. J Biol Chem 1992;267:25010-8; Yoshida A, Suzuki M, Ikenaga H, Takeuchi M. Discovery of the shortest sequence motif for high level mucin-type O-glycosylation. J Biol Chem 1997;272:16884-8]. Our method can be used as a tool for predicting O-glycosylation sites and for investigating GalNAc-transferase specificity, which is useful for designing competitive inhibitors of GalNAc-transferase. The prediction software is available upon request.
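
The pipeline described in the abstract (property encoding, a ranked feature list, incremental selection scored by a jackknife nearest-neighbor test) can be illustrated with a minimal Python sketch. This is not the authors' software. The function names, the toy property table, and the toy data are all hypothetical. The mRMR ranking is assumed to be computed elsewhere and is passed in as a plain list of feature indices, and Euclidean distance is an assumption, since the abstract does not state which distance the NNA uses. Note that 4554 = 9 x 506, which would be consistent with 506 property values per residue position, although the abstract does not say so explicitly.

```python
from math import dist  # Euclidean distance, Python 3.8+


def encode(peptide, props):
    """Concatenate the property values of every residue, position by position.

    `props` maps a one-letter residue code to a tuple of numeric properties.
    A nonapeptide with P properties per residue yields 9 * P features.
    """
    vector = []
    for residue in peptide:
        vector.extend(props[residue])
    return vector


def nearest_label(train_X, train_y, x, features):
    """Return the label of the training sample closest to x (1-nearest neighbor).

    Only the feature indices listed in `features` enter the distance.
    """
    best_label, best_d = None, float("inf")
    for row, label in zip(train_X, train_y):
        d = dist([row[i] for i in features], [x[i] for i in features])
        if d < best_d:
            best_label, best_d = label, d
    return best_label


def jackknife_accuracy(X, y, features):
    """Leave-one-out test: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if nearest_label(rest_X, rest_y, X[i], features) == y[i]:
            correct += 1
    return correct / len(X)


def incremental_feature_selection(X, y, ranked):
    """Score the top-1, top-2, ... prefixes of a ranked feature list.

    `ranked` is assumed to come from an external ranking such as mRMR.
    Returns the best-scoring prefix and its jackknife accuracy; on ties
    the shorter prefix is kept.
    """
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked) + 1):
        acc = jackknife_accuracy(X, y, ranked[:k])
        if acc > best_acc:
            best_k, best_acc = k, acc
    return ranked[:best_k], best_acc


# Toy demonstration with made-up property values (NOT taken from AAindex).
TOY_PROPS = {"S": (1.0, 0.2), "T": (0.9, 0.4), "P": (0.1, 0.8), "A": (0.3, 0.1)}
toy_peptides = ["PPS", "APT", "AAS", "PAT"]  # 3-mers only to keep the demo short
toy_labels = [1, 1, 0, 0]                    # hypothetical glycosylation labels
toy_X = [encode(p, TOY_PROPS) for p in toy_peptides]
toy_ranked = list(range(len(toy_X[0])))      # stand-in for an mRMR ranking
selected, accuracy = incremental_feature_selection(toy_X, toy_labels, toy_ranked)
print(selected, accuracy)
```

In the setting of the abstract, `X` would hold 277 vectors of 4554 features each, and the selected prefix would correspond to the reported 54-feature model. The sketch covers only the IFS stage; the FFS refinement mentioned in the abstract is omitted.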

Year:  2008        PMID: 18955094     DOI: 10.1016/j.peptides.2008.09.020

Source DB:  PubMed          Journal:  Peptides        ISSN: 0196-9781            Impact factor:   3.750


  6 in total

1.  Prediction of mucin-type O-glycosylation sites by a two-staged strategy.

Authors:  YuDong Cai; JianFeng He; Lin Lu
Journal:  Mol Divers       Date:  2010-07-22       Impact factor: 2.943

2.  Isoform-specific O-glycosylation of osteopontin and bone sialoprotein by polypeptide N-acetylgalactosaminyltransferase-1.

Authors:  Hazuki E Miwa; Thomas A Gerken; Oliver Jamison; Lawrence A Tabak
Journal:  J Biol Chem       Date:  2009-10-30       Impact factor: 5.157

3.  A novel model to predict O-glycosylation sites using a highly unbalanced dataset.

Authors:  Kun Zhou; Chunzhi Ai; Peipei Dong; Xuran Fan; Ling Yang
Journal:  Glycoconj J       Date:  2012-08-03       Impact factor: 2.916

4.  The role of mucin-type O-glycans in eukaryotic development. (Review)

Authors:  Lawrence A Tabak
Journal:  Semin Cell Dev Biol       Date:  2010-02-06       Impact factor: 7.727

5.  Logic minimization and rule extraction for identification of functional sites in molecular sequences.

Authors:  Raul Cruz-Cano; Mei-Ling Ting Lee; Ming-Ying Leung
Journal:  BioData Min       Date:  2012-08-16       Impact factor: 2.522

6.  Classification and analysis of regulatory pathways using graph property, biochemical and physicochemical property, and functional property.

Authors:  Tao Huang; Lei Chen; Yu-Dong Cai; Kuo-Chen Chou
Journal:  PLoS One       Date:  2011-09-28       Impact factor: 3.240
