
Protein sumoylation sites prediction based on two-stage feature selection.

Lin Lu, Xiao-He Shi, Su-Jun Li, Zhi-Qun Xie, Yong-Li Feng, Wen-Cong Lu, Yi-Xue Li, Haipeng Li, Yu-Dong Cai.

Abstract

Protein sumoylation is one of the most important post-translational modifications, and accurate prediction of sumoylation sites is very useful for proteome analysis. Although the putative motif ΨKXE can be used for prediction, optimizing prediction models remains a challenge. In this study, we developed a prediction system based on a feature selection strategy. A total of 1,272 peptides of 14 residues each from SUMOsp (Xue et al. [8] Nucleic Acids Res 34:W254-W257, 2006) were investigated, comprising 212 substrates and 1,060 non-substrates. Among the substrates, only 162 comply with the motif ΨKXE. First, the 1,272 peptides were divided into a training set and a test set. All peptides were encoded into feature vectors using hundreds of amino acid properties collected in the Amino Acid Index Database (AAindex, http://www.genome.jp/aaindex ). Then, the mRMR (minimum redundancy-maximum relevance) method was applied to extract the most informative features. Finally, the Nearest Neighbor Algorithm (NNA) was used to build the prediction models. Tested by leave-one-out (LOO) cross-validation, the optimal prediction model reached an accuracy of 84.4% for the training set and 76.4% for the test set. Notably, 180 substrates were correctly predicted, 18 more than with the motif ΨKXE alone. The final selected features indicate that the amino acid residues two positions downstream and one position upstream of the sumoylation site play the most important role in determining the occurrence of sumoylation. Because it is based on feature selection, our prediction system can be used not only for high-throughput prediction of sumoylation sites but also as a tool to investigate the mechanism of sumoylation.
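
The three stages described in the abstract (property-based encoding, mRMR feature selection, nearest-neighbor prediction scored by leave-one-out cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the eight 5-residue peptides and their labels are synthetic (the study used 14-residue peptides from SUMOsp), only one real index (Kyte-Doolittle hydropathy) stands in for the hundreds of AAindex properties, the charge table is a toy indicator, features are binarized at their mean before mutual information is computed, mRMR uses the difference form (relevance minus mean redundancy), and Euclidean distance is assumed for the nearest-neighbor step because the abstract does not name a distance measure.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale (one of the indices catalogued in AAindex).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Toy charge indicator (assumption for illustration; not an AAindex entry).
CHARGE = {aa: 0.0 for aa in HYDROPATHY}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def encode(peptide, tables=(HYDROPATHY, CHARGE)):
    """Flatten a peptide into [prop0(res0), prop1(res0), prop0(res1), ...]."""
    return [table[aa] for aa in peptide for table in tables]


def discretize(column):
    """Binarize a feature column at its mean (a simplifying assumption)."""
    mean = sum(column) / len(column)
    return [value > mean for value in column]


def mutual_info(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr(X, y, k):
    """Greedy mRMR: maximize relevance minus mean redundancy with chosen features."""
    cols = [discretize([row[j] for row in X]) for j in range(len(X[0]))]
    relevance = [mutual_info(col, y) for col in cols]
    selected = []
    while len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(mutual_info(cols[j], cols[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        candidates = (j for j in range(len(cols)) if j not in selected)
        selected.append(max(candidates, key=score))
    return selected


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor rule (squared Euclidean)."""
    hits = 0
    for i, xi in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features))
        hits += y[nearest] == y[i]
    return hits / len(X)


# Synthetic 5-mers centered on lysine (index 2); 1 = substrate, 0 = non-substrate.
PEPTIDES = ["GVKAE", "GIKAE", "GLKAE", "GFKAS",
            "GVKAA", "GSKAG", "GIKAS", "GDKAL"]
y = [1, 1, 1, 1, 0, 0, 0, 0]
X = [encode(p) for p in PEPTIDES]

selected = mrmr(X, y, k=2)
accuracy = loo_accuracy(X, y, selected)
print("selected feature indices:", selected)
print("LOO accuracy:", accuracy)
```

With two properties per residue, feature index j corresponds to residue position j // 2 and property j % 2, so the selected indices can be mapped back to positions relative to the central lysine. That mapping is what allows a feature selection step of this kind to be read as a statement about which flanking positions matter.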

Year:  2009        PMID: 19472067     DOI: 10.1007/s11030-009-9149-5

Source DB:  PubMed          Journal:  Mol Divers        ISSN: 1381-1991            Impact factor:   2.943


References (18 in total)

1.  AAindex: amino acid index database.

Authors:  S Kawashima; M Kanehisa
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

Review 2.  Proteomic analysis of post-translational modifications.

Authors:  Matthias Mann; Ole N Jensen
Journal:  Nat Biotechnol       Date:  2003-03       Impact factor: 54.908

Review 3.  Protein modification by SUMO.

Authors:  Erica S Johnson
Journal:  Annu Rev Biochem       Date:  2004       Impact factor: 23.643

Review 4.  SUMO and transcriptional regulation.

Authors:  David W H Girdwood; Michael H Tatham; Ronald T Hay
Journal:  Semin Cell Dev Biol       Date:  2004-04       Impact factor: 7.727

5.  Sumo1 conjugates mitochondrial substrates and participates in mitochondrial fission.

Authors:  Zdena Harder; Rodolfo Zunino; Heidi McBride
Journal:  Curr Biol       Date:  2004-02-17       Impact factor: 10.834

Review 6.  SUMO wrestling with type 1 diabetes.

Authors:  Manyu Li; Dehuang Guo; Carlos M Isales; Decio L Eizirik; Mark Atkinson; Jin-Xiong She; Cong-Yi Wang
Journal:  J Mol Med (Berl)       Date:  2005-04-02       Impact factor: 4.599

Review 7.  SUMO: a history of modification.

Authors:  Ronald T Hay
Journal:  Mol Cell       Date:  2005-04-01       Impact factor: 17.970

8.  Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy.

Authors:  Hanchuan Peng; Fuhui Long; Chris Ding
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2005-08       Impact factor: 6.226

9.  AAindex: Amino Acid Index Database.

Authors:  S Kawashima; H Ogata; M Kanehisa
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

10.  Regulation of Smad4 sumoylation and transforming growth factor-beta signaling by protein inhibitor of activated STAT1.

Authors:  Min Liang; Frauke Melchior; Xin-Hua Feng; Xia Lin
Journal:  J Biol Chem       Date:  2004-03-17       Impact factor: 5.157

Cited by (13 in total)

1.  Prediction of interactiveness of proteins and nucleic acids based on feature selections.

Authors:  YouLang Yuan; XiaoHe Shi; XinLei Li; WenCong Lu; YuDong Cai; Lei Gu; Liang Liu; MinJie Li; XiangYin Kong; Meng Xing
Journal:  Mol Divers       Date:  2009-10-09       Impact factor: 2.943

2.  Posttranslational modifications in proteins: resources, tools and prediction methods.

Authors:  Shahin Ramazi; Javad Zahiri
Journal:  Database (Oxford)       Date:  2021-04-07       Impact factor: 3.451

3.  Fuzzy clustering of physicochemical and biochemical properties of amino acids.

Authors:  Indrajit Saha; Ujjwal Maulik; Sanghamitra Bandyopadhyay; Dariusz Plewczynski
Journal:  Amino Acids       Date:  2011-10-13       Impact factor: 3.520

4.  Systematic Characterization of Lysine Post-translational Modification Sites Using MUscADEL.

Authors:  Zhen Chen; Xuhan Liu; Fuyi Li; Chen Li; Tatiana Marquez-Lago; André Leier; Geoffrey I Webb; Dakang Xu; Tatsuya Akutsu; Jiangning Song
Journal:  Methods Mol Biol       Date:  2022

5.  Large-scale comparative assessment of computational predictors for lysine post-translational modification sites.

Authors:  Zhen Chen; Xuhan Liu; Fuyi Li; Chen Li; Tatiana Marquez-Lago; André Leier; Tatsuya Akutsu; Geoffrey I Webb; Dakang Xu; Alexander Ian Smith; Lei Li; Kuo-Chen Chou; Jiangning Song
Journal:  Brief Bioinform       Date:  2019-11-27       Impact factor: 11.622

6.  SUMOgo: Prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications.

Authors:  Chi-Chang Chang; Chi-Hua Tung; Chi-Wei Chen; Chin-Hau Tu; Yen-Wei Chu
Journal:  Sci Rep       Date:  2018-10-19       Impact factor: 4.379

7.  Identification of protein SUMOylation sites by mass spectrometry using combined microwave-assisted aspartic acid cleavage and tryptic digestion.

Authors:  Omoruyi Osula; Stephen Swatkoski; Robert J Cotter
Journal:  J Mass Spectrom       Date:  2012-05       Impact factor: 1.982

8.  Classification and analysis of regulatory pathways using graph property, biochemical and physicochemical property, and functional property.

Authors:  Tao Huang; Lei Chen; Yu-Dong Cai; Kuo-Chen Chou
Journal:  PLoS One       Date:  2011-09-28       Impact factor: 3.240

9.  Cooperativity among short amyloid stretches in long amyloidogenic sequences.

Authors:  Lele Hu; Weiren Cui; Zhisong He; Xiaohe Shi; Kaiyan Feng; Buyong Ma; Yu-Dong Cai
Journal:  PLoS One       Date:  2012-06-22       Impact factor: 3.240

10.  Prediction of pharmacological and xenobiotic responses to drugs based on time course gene expression profiles.

Authors:  Tao Huang; Weiren Cui; Lele Hu; Kaiyan Feng; Yi-Xue Li; Yu-Dong Cai
Journal:  PLoS One       Date:  2009-12-02       Impact factor: 3.240
