Support vector machine-based mucin-type O-linked glycosylation site prediction using enhanced sequence feature encoding.

Manabu Torii, Hongfang Liu, Zhang-Zhi Hu.

Abstract

Glycosylation is a common and complex protein post-translational modification (PTM). In particular, mucin-type O-linked glycosylation is abundant and serves important biological functions. The number of experimentally determined glycosylation sites is still small, so there remains a need for accurate computational prediction to support annotation and functional understanding of proteins. PTM site prediction can be formulated as a machine learning task, and an important step in applying machine learning to this task is encoding protein fragments as feature vectors. Here we assess existing encoding methods, as well as an enhanced encoding method named composition of monomer spectrum (CMS), using support vector machines (SVMs). SVMs employing the existing encoding methods achieved an AUC (area under the ROC curve) of 90.3-91.3%, and those employing CMS achieved an AUC of 92.4%. Analysis of the different encoding methods suggests potential for further improving the prediction.
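To make the encoding step concrete, the following is a minimal sketch of how a protein fragment around a candidate Ser/Thr site might be turned into a feature vector, and how AUC can be computed from classifier scores. It does not implement CMS, whose definition is not given in this abstract; the two encodings shown (position-specific binary and amino acid composition) are generic illustrations, and the function names, the window size `flank`, and the padding symbol `X` are all assumptions made for this example.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends


def candidate_windows(seq: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (position, fragment) for every Ser/Thr, padded to 2*flank+1 residues."""
    padded = PAD * flank + seq + PAD * flank
    return [(i, padded[i:i + 2 * flank + 1])
            for i, aa in enumerate(seq) if aa in "ST"]


def one_hot(fragment: str) -> List[int]:
    """Position-specific binary encoding: 20 indicator values per residue."""
    return [int(aa == ref) for aa in fragment for ref in AMINO_ACIDS]


def composition(fragment: str) -> List[float]:
    """Position-independent encoding: fraction of each amino acid in the fragment."""
    return [fragment.count(ref) / len(fragment) for ref in AMINO_ACIDS]


def auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Vectors produced this way could be passed to any SVM implementation, with the resulting decision values scored by `auc` against known site labels; the kernel and parameters used in the study are not stated here.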

Year: 2009    PMID: 20351933    PMCID: PMC2815398

Source DB: PubMed    Journal: AMIA Annu Symp Proc    ISSN: 1559-4076


