Subhadip Basu, Dariusz Plewczynski.
Abstract
BACKGROUND: We present a recent update of the AMS algorithm for identifying post-translational modification (PTM) sites in proteins from sequence information alone, using an artificial neural network (ANN) method. The query protein sequence is dissected into overlapping short sequence segments. Ten different physicochemical features describe each amino acid; a nine-residue segment is therefore represented as a point in a 90-dimensional space. A database of sequence segments with experimentally confirmed post-translational modification sites is used to train a set of ANNs.
Year: 2010 PMID: 20423529 PMCID: PMC2874555 DOI: 10.1186/1471-2105-11-210
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
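The segment encoding described in the abstract (overlapping nine-residue windows, ten features per residue, 90 values per segment) can be illustrated with a minimal sketch. The feature table below is a deterministic placeholder and does not contain real AAindex values; the function names and the query sequence are hypothetical and are not taken from the AMS implementation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 9        # residues per segment, as stated in the abstract
N_FEATURES = 10   # physicochemical features per residue, as stated in the abstract

# Placeholder feature table (assumption): NOT real AAindex values.
# A real run would load the ten selected AAindex scales here.
FEATURES = {
    aa: [(((i + 1) * (j + 1)) % 7) / 7.0 for j in range(N_FEATURES)]
    for i, aa in enumerate(AMINO_ACIDS)
}


def overlapping_segments(sequence, window=WINDOW):
    """Yield (start, segment) for every window-length stretch, stepping one residue at a time."""
    for start in range(len(sequence) - window + 1):
        yield start, sequence[start:start + window]


def encode_segment(segment, features=FEATURES):
    """Concatenate the per-residue feature lists into one flat vector."""
    vector = []
    for residue in segment:
        vector.extend(features[residue])
    return vector


def encode_sequence(sequence):
    """Return a list of (start, vector) pairs, one per overlapping segment."""
    return [(start, encode_segment(seg)) for start, seg in overlapping_segments(sequence)]


if __name__ == "__main__":
    query = "MKTAYIAKQRQISFVK"  # hypothetical 16-residue query sequence
    encoded = encode_sequence(query)
    print(len(encoded), len(encoded[0][1]))
```

A sequence shorter than the window simply yields no segments in this sketch; how AMS treats sequence termini is not stated in the abstract.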
Figure 1. MLP algorithm. A block diagram of an MLP shown as a feed-forward layered neural network.
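As a rough illustration of what a feed-forward layered network computes, the sketch below passes a 90-value input through one hidden layer to a single output score. The hidden-layer size, the sigmoid activation, and the random untrained weights are all assumptions made for the example; the record does not specify the architecture or training procedure used in AMS 3.0.

```python
import math
import random


def sigmoid(x):
    """Logistic activation (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, inputs):
    """Propagate the inputs through each layer in turn and return the final activations."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)
# 90 inputs (one encoded segment), 8 hidden units (arbitrary), 1 output score.
layers = [init_layer(90, 8, rng), init_layer(8, 1, rng)]
x = [0.1] * 90  # hypothetical encoded segment
score = forward(layers, x)[0]
```

With untrained weights the score carries no meaning; the point is only the layer-by-layer flow from input vector to output.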
Single Amino-Acids Features Selection.
| Accession number | Brief feature description | Reference | Selected/Rejected |
|---|---|---|---|
| ARGP820101 | Hydrophobicity index | Eur. J. Biochem. 128, 565-575 (1982) | Rejected |
| BIOV880101 | Information value for accessibility; average fraction 35% | Protein Engineering 2, 185-191 (1988) | Selected |
| BIOV880102 | Information value for accessibility; average fraction 23% | Protein Engineering 2, 185-191 (1988) | Selected |
| BLAS910101 | Scaled side chain hydrophobicity values | Analytical Biochemistry 193, 72-82 (1991) | Selected |
| BLAS910101 | Scaled side chain hydrophobicity values | Analytical Biochemistry 193, 72-82 (1991) | Rejected |
| BULH740101 | Surface tension of amino acid solutions: A hydrophobicity scale of the amino acid residues | Arch. Biochem. Biophys. 161, 665-670 (1974) | Rejected |
| FASG760101 | Molecular weight | Handbook of Biochemistry and Molecular Biology, 3rd ed., Proteins - Volume 1, CRC Press, Cleveland (1976) | Rejected |
| HOPA770101 | Hydration number | Intermolecular Interactions and Biomolecular Organizations, Wiley, New York (1977) | Selected |
| KRIW710101 | Side chain interaction parameter | Biochim. Biophys. Acta 229, 368-383 (1971) | Selected |
| KRIW790101 | Side chain interaction parameter | Biochim. Biophys. Acta 576, 204-228 (1979) | Selected |
| KRIW790102 | Fraction of site occupied by water | Biochim. Biophys. Acta 576, 204-228 (1979) | Selected |
| KRIW790103 | Side chain volume | Biochim. Biophys. Acta 576, 204-228 (1979) | Selected |
| LAWE840101 | Transfer free energy, CHP/water | J. Biol. Chem. 259, 2910-2912 (1984) | Selected |
| OOBM850105 | Optimized side chain interaction parameter | Bull. Inst. Chem. Res., Kyoto Univ. 63, 82-94 (1985) | Selected |
| WARP780101 | Average interactions per side chain atom | J. Mol. Biol. 118, 289-304 (1978) (Gly 0.81) | Rejected |
List of experimentally chosen features used in the current experiment from the AAindex database with the corresponding feature accession number, brief description and the journal reference.
Improvement of the performance of the AutoMotifServer 3.0.
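| PTM type | Kinase / subtype | AMS 2.0 | AMS 2.0 | AMS 3.0 | AMS 3.0 | AMS 3.0 | AMS 3.0 (New SwissProt dataset) | AMS 3.0 (New SwissProt dataset) | AMS 3.0 (New SwissProt dataset) |
|---|---|---|---|---|---|---|---|---|---|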
| 3,4-dihydroxyproline | - | 88,89 | 66,67 | 95,45 | 87,5 | 0,962273 | 97,37 | 97,37 | 0,985088 |
| 4-carboxyglutamate | - | 90,56 | 95,93 | 96,27 | 99,65 | 0,980856 | 96,54 | 99,35 | 0,982076 |
| 4-hydroxyproline | - | 65,59 | 85,31 | 86,88 | 83,94 | 0,905902 | 95,03 | 94,14 | 0,969559 |
| 5-hydroxylysine | - | 12,79 | 84,62 | 97,3 | 38,92 | 0,845236 | 90,16 | 95,93 | 0,948272 |
| Asymmetric dimethylarginine | - | 84,62 | 82,09 | 94,55 | 94,55 | 0,965227 | - | - | - |
| dihydroxyphenylalanine | - | 29,41 | 76,92 | 86,17 | 86,17 | 0,914601 | - | - | - |
| Glycine amide | - | 84,38 | 90 | 97,92 | 97,92 | 0,977679 | 99,3 | 99,3 | 0,984574 |
| Hydroxyproline | - | 28,8 | 85,71 | 89,62 | 94,06 | 0,942113 | 94,62 | 99,19 | 0,972567 |
| Leucine amide | - | 98,97 | 97,96 | 98,1 | 99,36 | 0,989049 | - | - | - |
| Methionine amide | - | 100 | 93,75 | 100 | 93,75 | 0,666667 | - | - | - |
| N-acetylglycine | - | 79,17 | 90,48 | 97,83 | 97,83 | 0,985797 | - | - | - |
| N-acetylmethionine | - | 100 | 95,42 | 98,4 | 98,93 | 0,990593 | - | - | - |
| N-acetylserine | - | 98,24 | 99,64 | 98,22 | 99,4 | 0,990124 | - | - | - |
| N-acetylthreonine | - | 76,47 | 88,64 | 89,8 | 97,78 | 0,94648 | - | - | - |
| N6-acetyllysine | - | 12,76 | 74,32 | 94,3 | 93,23 | 0,953465 | - | - | - |
| N6, N6, N6-trimethyllysine | - | 7,5 | 60 | 85,71 | 97,5 | 0,926065 | - | - | - |
| Omega-N-methylated arginine | - | 0 | 0 | 78 | 58,21 | 0,82 | - | - | - |
| Phenylalanine amide | - | 98,59 | 93,33 | 99 | 99 | 0,986667 | - | - | - |
| Phospho | PKA | 63,64 | 77,78 | 90,72 | 86,27 | 0,939608 | 97,39 | 95,39 | 0,983802 |
| Phospho | PKC | 32,56 | 84,85 | 86,17 | 97,59 | 0,928851 | 94,78 | 99,18 | 0,973369 |
| Phospho | autocatalysis | 13,16 | 96,15 | 79,35 | 93,59 | 0,888392 | 95,96 | 99,15 | 0,97851 |
| Phospho | CDC2 | 85,71 | 80,9 | 89,55 | 75,95 | 0,916095 | 93,73 | 92,36 | 0,963488 |
| Phosphoserine | - | 97,06 | 91,34 | 94,33 | 95,56 | 0,94975 | 96,87 | 98,02 | 0,912848 |
| Phosphoserine | PKA | 68,42 | 78,31 | 91,67 | 72,64 | 0,922083 | 97,66 | 98,43 | 0,987096 |
| Phosphoserine | PKC | 23,29 | 80,95 | 96,43 | 95,29 | 0,975476 | 94,93 | 98,94 | 0,973939 |
| Phosphoserine | autocatalysis | 10 | 85,71 | 75 | 90,7 | 0,868311 | 91,57 | 99,13 | 0,957295 |
| Phosphoserine | CK2 | 29,73 | 70,97 | 88,14 | 59,77 | 0,882345 | 93,02 | 98,23 | 0,962999 |
| Phosphothreonine | - | 46,12 | 75,46 | 78,16 | 69,73 | 0,828495 | 85,2 | 97,4 | 0,905598 |
| Phosphothreonine | autocatalysis | 0 | 0 | 73,47 | 80 | 0,849347 | 94,96 | 99,82 | 0,974381 |
| Phosphotyrosine | - | 9,32 | 71,2 | 84,24 | 96,69 | 0,91775 | 91,75 | 98,97 | 0,952053 |
| Phosphotyrosine | autocatalysis | 1,41 | 12,5 | 92 | 92 | 0,951404 | 96,82 | 97,21 | 0,979045 |
| Pyrrolidone carboxylic acid | - | 60,31 | 97,11 | 91,58 | 97,92 | 0,955236 | - | - | - |
| Sulfotyrosine | - | 70,19 | 85,88 | 98,39 | 97,6 | 0,988929 | 99,48 | 98,47 | 0,990853 |
| Valine amide | - | 94,29 | 97,06 | 98,57 | 98,57 | 0,986607 | - | - | - |
| Phospho.ELM | - | 98 | 91,68 | - | - | - | 98,56 | 96,07 | 0,949397 |
| Phospho.ELM | Abl | 0 | 0 | - | - | - | 81,82 | 52,94 | 0,862032 |
| Phospho.ELM | AMPK_group | 6,25 | 100 | - | - | - | 100 | 72,22 | 0,975 |
| Phospho.ELM | ATM | 92,98 | 81,54 | - | - | - | 94,87 | 90,24 | 0,967439 |
| Phospho.ELM | CaM-KIIalpha | 41,67 | 88,24 | - | - | - | 100 | 77,42 | 0,981081 |
| Phospho.ELM | CaM-KII_group | 14,55 | 88,89 | - | - | - | 82,05 | 94,12 | 0,906796 |
| Phospho.ELM | CDK1 | 41,73 | 63,04 | - | - | - | 96,94 | 62,91 | 0,94711 |
| Phospho.ELM | CDK2 | 7,14 | 45,45 | - | - | - | 92,31 | 64,86 | 0,928627 |
| Phospho.ELM | CDK_group | 59,8 | 67,03 | - | - | - | 95,65 | 67,35 | 0,947785 |
| Phospho.ELM | CK1_group | 0 | 0 | - | - | - | 86,67 | 72,22 | 0,911594 |
| Phospho.ELM | CK2 alpha | 38,98 | 67,65 | - | - | - | 78,82 | 70,53 | 0,872036 |
| Phospho.ELM | CK2_group | 43,33 | 72,22 | - | - | - | 73,33 | 64,36 | 0,839759 |
| Phospho.ELM | EGFR | 0 | 0 | - | - | - | 73,17 | 78,95 | 0,852909 |
| Phospho.ELM | Fyn | 0 | 0 | - | - | - | 87,5 | 84,85 | 0,9275 |
| Phospho.ELM | GRK_group | 2,7 | 100 | - | - | - | 83,33 | 90,91 | 0,911404 |
| Phospho.ELM | GSK-3beta | 18,37 | 75 | - | - | - | 85,29 | 64,44 | 0,896282 |
| Phospho.ELM | GSK-3_group | 12,5 | 66,67 | - | - | - | 65,22 | 93,75 | 0,823128 |
| Phospho.ELM | IGF1R | 26,09 | 100 | - | - | - | 85 | 73,91 | 0,90625 |
| Phospho.ELM | IKK_group | 0 | 0 | - | - | - | 95,83 | 92 | 0,973761 |
| Phospho.ELM | InsR | 6,67 | 60 | - | - | - | 70,97 | 88 | 0,848289 |
| Phospho.ELM | Lck | 11,76 | 60 | - | - | - | 80,56 | 74,36 | 0,884596 |
| Phospho.ELM | Lyn | 0 | 0 | - | - | - | 66,67 | 84,62 | 0,825137 |
| Phospho.ELM | MAPK1 | 45,88 | 67,24 | - | - | - | 88,98 | 55,85 | 0,898547 |
| Phospho.ELM | MAPK14 | 4 | 22,22 | - | - | - | 82,35 | 77,78 | 0,896017 |
| Phospho.ELM | MAPK3 | 74,7 | 78,48 | - | - | - | 87,93 | 77,27 | 0,922801 |
| Phospho.ELM | MAPK8 | 14,71 | 41,67 | - | - | - | 91,67 | 84,62 | 0,947523 |
| Phospho.ELM | MAPKAPK2 | 3,03 | 100 | - | - | - | 69,57 | 80 | 0,836332 |
| Phospho.ELM | MAPK_group | 54,9 | 77,78 | - | - | - | 82,35 | 75,68 | 0,894784 |
| Phospho.ELM | PDK-1 | 42,86 | 85,71 | - | - | - | 80 | 95,24 | 0,897283 |
| Phospho.ELM | PKA alpha | 42,42 | 82,35 | - | - | - | 86,96 | 95,24 | 0,931824 |
| Phospho.ELM | PKA_group | 58,15 | 90,43 | - | - | - | 91,74 | 62,99 | 0,904783 |
| Phospho.ELM | PKB_group | 79,76 | 65,05 | - | - | - | 88,14 | 77,61 | 0,924011 |
| Phospho.ELM | PKC alpha | 19,7 | 72,22 | - | - | - | 75,56 | 48,57 | 0,825223 |
| Phospho.ELM | PKC_group | 25,21 | 80 | - | - | - | 82,02 | 84,39 | 0,900038 |
| Phospho.ELM | PLK1 | 0 | 0 | - | - | - | 68,97 | 45,45 | 0,788753 |
| Phospho.ELM | Src | 0,67 | 5 | - | - | - | 74,31 | 65,32 | 0,845308 |
| Phospho.ELM | Syk | 6,67 | 30 | - | - | - | 85,71 | 53,57 | 0,878378 |
Comparison of the best performance of the AMS 2.0 server with that of the current AMS 3.0, based on the best results obtained on the training datasets of the respective PTM types.
Figure 2. AUC for four kinase families. Range of AUC values for the kinase families PKA, PKC, CDK and CK2, computed on sample training and test datasets using AMS3.
Figure 3. AUC values for predictors. Comparison of the range of best AUC values for the kinase families PKA, PKC, CDK and CK2, using AMS3, GPS, KinasePhos, NetPhosK, PPSP, PredPhospho, Scansite and Meta Predictor.
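For readers unfamiliar with the AUC values reported in these figures, the following sketch computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented for illustration, and this is not necessarily how the AUC values in the figures were computed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictor outputs: 1 = modified site, 0 = unmodified site.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a sort-based rank computation would be preferable for large datasets.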
The best PTM predictors.
| Predictor | Sensitivity | Specificity | Accuracy | MCC | POS |
|---|---|---|---|---|---|
| GPS | 0,908 | 0,8 | 0,844 | 0,695 | 294 |
| KinasePhos_90 | 0,884 | 0,717 | 0,784 | 0,589 | |
| KinasePhos_95 | 0,799 | 0,837 | 0,822 | 0,632 | |
| KinasePhos_100 | 0,571 | 0,923 | 0,782 | 0,542 | |
| KinasePhos_bitscore | 0,912 | 0,685 | 0,776 | 0,588 | |
| NetPhosK_0.3 | 1 | 0 | 0,4 | N/A | |
| NetPhosK_0.5 | 0,639 | 0,748 | 0,705 | 0,387 | |
| NetPhosK_0.7 | 0,065 | 0,998 | 0,624 | 0,188 | |
| PPSP_highsens | 0,983 | 0,075 | 0,438 | 0,128 | |
| PPSP_balanced | 0,905 | 0,796 | 0,839 | 0,687 | |
| PPSP_highspec | 0,054 | 0,982 | 0,611 | 0,1 | |
| PredPhospho | 0,898 | 0,823 | 0,853 | 0,708 | |
| Scansite_low | 0,667 | 0,884 | 0,797 | 0,571 | |
| Scansite_medium | 0,405 | 0,971 | 0,744 | 0,479 | |
| Scansite_high | 0,153 | 0,993 | 0,657 | 0,29 | |
| Meta Predictor | 0,912 | 0,832 | 0,864 | 0,73 | |
| GPS | 0,699 | 0,895 | 0,816 | 0,613 | 229 |
| KinasePhos_90 | 0,581 | 0,904 | 0,774 | 0,523 | |
| KinasePhos_95 | 0,476 | 0,95 | 0,76 | 0,504 | |
| KinasePhos_100 | 0,266 | 0,985 | 0,698 | 0,386 | |
| KinasePhos_bitscore | 0,594 | 0,901 | 0,778 | 0,53 | |
| NetPhosK_0.3 | 0,961 | 0,525 | 0,699 | 0,506 | |
| NetPhosK_0.5 | 0,755 | 0,948 | 0,871 | 0,73 | |
| NetPhosK_0.7 | 0,245 | 1 | 0,698 | 0,403 | |
| PPSP_highsens | 0,93 | 0,227 | 0,509 | 0,208 | |
| PPSP_balanced | 0,742 | 0,933 | 0,857 | 0,7 | |
| PPSP_highspec | 0,048 | 1 | 0,619 | 0,171 | |
| PredPhospho | 0,594 | 0,959 | 0,813 | 0,616 | |
| Scansite_low | 0,576 | 0,983 | 0,82 | 0,64 | |
| Scansite_medium | 0,38 | 0,997 | 0,75 | 0,512 | |
| Scansite_high | 0,135 | 1 | 0,654 | 0,293 | |
| Meta Predictor | 0,878 | 0,904 | 0,893 | 0,779 | |
| GPS | 0,817 | 0,809 | 0,812 | 0,618 | 360 |
| KinasePhos_90 | 0,722 | 0,843 | 0,794 | 0,569 | |
| KinasePhos_95 | 0,65 | 0,887 | 0,792 | 0,56 | |
| KinasePhos_100 | 0,361 | 0,952 | 0,716 | 0,405 | |
| KinasePhos_bitscore | 0,775 | 0,804 | 0,792 | 0,573 | |
| NetPhosK_0.3 | 0,878 | 0,724 | 0,786 | 0,59 | |
| NetPhosK_0.5 | 0,694 | 0,874 | 0,802 | 0,583 | |
| NetPhosK_0.7 | 0,483 | 0,959 | 0,769 | 0,525 | |
| PPSP_highsens | 0,967 | 0,231 | 0,526 | 0,27 | |
| PPSP_balanced | 0,85 | 0,806 | 0,823 | 0,645 | |
| PPSP_highspec | 0,008 | 0,998 | 0,602 | 0,048 | |
| PredPhospho | 0,808 | 0,839 | 0,827 | 0,642 | |
| Scansite_low | 0,644 | 0,917 | 0,808 | 0,596 | |
| Scansite_medium | 0,422 | 0,981 | 0,758 | 0,515 | |
| Scansite_high | 0,158 | 0,991 | 0,658 | 0,288 | |
| Meta Predictor | 0,883 | 0,828 | 0,85 | 0,699 | |
| GPS | 0,718 | 0,753 | 0,739 | 0,466 | 348 |
| KinasePhos_90 | 0,649 | 0,789 | 0,733 | 0,441 | |
| KinasePhos_95 | 0,48 | 0,864 | 0,71 | 0,378 | |
| KinasePhos_100 | 0,129 | 0,977 | 0,638 | 0,211 | |
| KinasePhos_bitscore | 0,687 | 0,722 | 0,708 | 0,404 | |
| NetPhosK_0.3 | 0,716 | 0,695 | 0,703 | 0,403 | |
| NetPhosK_0.5 | 0,491 | 0,841 | 0,701 | 0,358 | |
| NetPhosK_0.7 | 0,333 | 0,935 | 0,694 | 0,348 | |
| PPSP_highsens | 0,954 | 0,274 | 0,546 | 0,289 | |
| PPSP_balanced | 0,741 | 0,743 | 0,743 | 0,477 | |
| PPSP_highspec | 0,006 | 1 | 0,602 | 0,059 | |
| PredPhospho | 0,598 | 0,805 | 0,722 | 0,412 | |
| Scansite_low | 0,411 | 0,866 | 0,684 | 0,315 | |
| Scansite_medium | 0,17 | 0,946 | 0,636 | 0,189 | |
| Scansite_high | 0,069 | 0,994 | 0,624 | 0,179 | |
| Meta Predictor | 0,773 | 0,791 | 0,784 | 0,558 | |
Comparison of best recognition performances of different state-of-the-art phosphorylation site prediction programs with AMS-3 for four kinase families CDK, CK2, PKA and PKC.
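The four measures in the table follow from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the counts in the example are hypothetical and are not taken from the benchmark datasets.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero, e.g. when every
    # example is predicted positive; None is returned in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else None
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 positive sites, 100 negative sites.
m = binary_metrics(tp=90, fn=10, tn=80, fp=20)

# A predictor that labels everything positive has no predicted negatives,
# so its MCC is undefined.
all_positive = binary_metrics(tp=50, fn=0, tn=0, fp=70)
```

The undefined case is consistent with the table row that reports a sensitivity of 1, a specificity of 0 and an MCC of N/A.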
Figure 4. ROC values for four kinase families. Comparison of ROC values for the kinase families PKA, PKC, CDK and CK2, using GPS, KinasePhos, NetPhosK, PPSP, PredPhospho, Scansite and Meta Predictor with the corresponding ROC curves for training and test datasets using AMS3.
NetPhosK and AMS web servers.
| Kinase | NetPhosK Positives | NetPhosK Recall (%) | AMS 3.0 Positives | AMS 3.0 Recall (%) |
|---|---|---|---|---|
| PKA | 258 | 82 | 121 | 87.5 |
| PKC | 193 | 62 | 118 | 70.83 |
| CaM-II | 26 | 73 | 57 | 82.05 |
| cdc2 | 22 | 37 | 84 | 88.24 |
| CKII | 85 | 75 | 248 | 73.3 |
Comparison of best recognition performances on independent test sets between the servers NetPhosK and AMS 3.0.