AMS 4.0: consensus prediction of post-translational modifications in protein sequences.

Dariusz Plewczynski, Subhadip Basu, Indrajit Saha.

Abstract

We present the 2011 update of the AutoMotif Service (AMS 4.0), which predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTMs) in protein sequences. The experimentally confirmed modifications used for training are taken from the latest UniProt and Phospho.ELM databases. The sequence vicinity of each modified residue is represented by amino acid physico-chemical features encoded with high quality indices (HQI), obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising a different objective during training (for example recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of a machine learning algorithm with data fusion of different representations of the training objects, in order to boost the overall prediction accuracy for conserved short sequence motifs. The performance of AMS 4.0 is compared with that of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machines). Our software improves the average AUC score of the earlier version by close to 7%, as calculated on the test datasets of all 88 PTM types. Moreover, for selected most-difficult sequence motif types it improves the prediction performance by almost 32% compared with the previously used single machine learning methods. In summary, the brainstorming consensus meta-learning methodology boosts the AUC score to around 89%, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with a comparison with previously published methods and state-of-the-art software tools.
The source code and precompiled binaries of the brainstorming tool are available at http://code.google.com/p/automotifserver/ under the Apache 2.0 license.
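
The two ideas described in the abstract, encoding the sequence vicinity of a residue with a physico-chemical index and combining the outputs of several classifiers into one consensus score, can be illustrated with a minimal Python sketch. It is not the AMS 4.0 implementation: the Kyte-Doolittle hydropathy scale stands in for the HQI indices (which are not listed here), the window half-width and zero padding are assumptions, and `consensus_score` is a plain weighted average rather than the brainstorming procedure. The function names `encode_window` and `consensus_score` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Kyte-Doolittle hydropathy scale, used here only as a stand-in index.
# Assumption: the HQI indices used by AMS 4.0 are different and not shown.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_window(
    sequence: str,
    position: int,
    half_width: int = 4,  # assumed window size, not taken from the paper
    index: Dict[str, float] = KYTE_DOOLITTLE,
    pad: float = 0.0,  # value used beyond the sequence ends and for unknown residues
) -> List[float]:
    """Encode the residues around `position` (0-based) as index values."""
    if not 0 <= position < len(sequence):
        raise IndexError("position lies outside the sequence")
    features: List[float] = []
    for i in range(position - half_width, position + half_width + 1):
        if 0 <= i < len(sequence):
            features.append(index.get(sequence[i], pad))
        else:
            features.append(pad)
    return features


def consensus_score(
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted mean of per-classifier scores (a simple stand-in consensus)."""
    if not scores:
        raise ValueError("at least one classifier score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total
```

In such a setup, each feature vector returned by `encode_window` would be passed to several classifiers trained for different objectives (for example recall, precision or AUC), and their output scores would then be combined with `consensus_score`.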

Year:  2012        PMID: 22555647      PMCID: PMC3397139          DOI: 10.1007/s00726-012-1290-2

Source DB:  PubMed          Journal:  Amino Acids        ISSN: 0939-4451            Impact factor:   3.520


