Cangzhi Jia, Xin Lin, Zhiping Wang.
Abstract
Protein S-nitrosylation is a reversible post-translational modification in which nitric oxide covalently modifies the thiol group of cysteine residues. Growing evidence shows that protein S-nitrosylation plays an important role in normal cellular function as well as in various pathophysiologic conditions. Because of the inherent chemical instability of the S-NO bond and the low abundance of endogenous S-nitrosylated proteins, the unambiguous identification of S-nitrosylation sites by commonly used proteomic approaches remains challenging. Computational prediction of S-nitrosylation sites has therefore been considered a powerful auxiliary tool. In this work, we mainly adopted an adapted normal distribution bi-profile Bayes (ANBPB) feature extraction model to characterize the distinction of position-specific amino acids in 784 S-nitrosylated and 1568 non-S-nitrosylated peptide sequences. We developed a support vector machine prediction model, iSNO-ANBPB, by incorporating ANBPB with Chou's pseudo amino acid composition. In jackknife cross-validation experiments, iSNO-ANBPB yielded an accuracy of 65.39% and a Matthews correlation coefficient (MCC) of 0.3014. When tested on an independent dataset, iSNO-ANBPB achieved an accuracy of 63.41% and an MCC of 0.2984, which are much higher than the values achieved by the existing predictors SNOSite, iSNO-PseAAC, the Li et al. algorithm, and iSNO-AAPair. On another training dataset, iSNO-ANBPB also outperformed GPS-SNO and iSNO-PseAAC in the 10-fold cross-validation test.
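The abstract does not spell out how ANBPB is computed, so the following is only a minimal sketch of the plain bi-profile Bayes (BPB) idea that it adapts: each peptide is encoded by the position-specific frequency of its residues in the positive training set, followed by the same frequencies in the negative training set. The function names and the three-residue toy peptides are hypothetical, and the normal-distribution adaptation that distinguishes ANBPB is deliberately not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(peptides):
    """Return one {amino acid: frequency} table per sequence position.

    Assumes all peptides have the same length (hypothetical helper).
    """
    length = len(peptides[0])
    total = len(peptides)
    tables = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        tables.append({aa: counts.get(aa, 0) / total for aa in AMINO_ACIDS})
    return tables


def bpb_encode(peptide, pos_tables, neg_tables):
    """Plain BPB vector: positive-set frequencies, then negative-set ones."""
    pos_part = [pos_tables[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    neg_part = [neg_tables[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    return pos_part + neg_part


# Toy data for illustration only; these are not peptides from the study.
positives = ["ACD", "ACE", "GCD"]
negatives = ["KCL", "KCM", "ACL"]

pos_tables = position_frequencies(positives)
neg_tables = position_frequencies(negatives)
vector = bpb_encode("ACD", pos_tables, neg_tables)
print(vector)
```

A vector of this kind, possibly concatenated with other features, could then be passed to a support vector machine; the abstract states only that iSNO-ANBPB combines ANBPB with pseudo amino acid composition.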
Year: 2014 PMID: 24918295 PMCID: PMC4100159 DOI: 10.3390/ijms150610410
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Best predictive performances of four sequence encoding schemes.
| Sequence Encoding Scheme | | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| BPB + Ecomposition a + Scomposition b | 2 | 65.31 | 65.63 | 65.52 | 0.2933 |
| BRABSB + Ecomposition + Scomposition | 2.5 | 73.09 | 58.16 | 63.14 | 0.2949 |
| ANBPB + Ecomposition + Scomposition | | | | | |
| RANS + Ecomposition + Scomposition | 2.5 | 63.90 | 61.42 | 62.24 | 0.2391 |
a Ecomposition denotes the composition of positively charged amino acids; b Scomposition denotes the composition of α-helix propensities of amino acids.
Performance comparison of different computational approaches on different datasets.
| Dataset | Method | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| Li training dataset | Li et al. | 42.86 | 70.98 | 61.61 | 0.1381 |
| | iSNO-ANBPB | 67.60 | 64.29 | 65.39 | 0.3014 |
| Xu dataset | GPS-SNO a | 18.88 | 89.63 | 56.07 | 0.1210 |
| | GPS-SNO b | 28.04 | 81.98 | 56.39 | 0.1193 |
| | GPS-SNO c | 45.01 | 73.33 | 59.90 | 0.1915 |
| | iSNO-PseAAC | 67.01 | 68.15 | 67.62 | 0.3515 |
| | iSNO-ANBPB | 67.33 | 73.78 | 70.77 | 0.4146 |
| Li test dataset | SNOSite | 74.42 | 28.10 | 40.24 | 0.0248 |
| | iSNO-AAPair | 27.91 | 80.17 | 66.46 | 0.0858 |
| | Li et al. | 51.16 | 69.42 | 64.63 | 0.1886 |
| | iSNO-PseAAC | 58.14 | 63.64 | 62.20 | 0.1940 |
| | iSNO-ANBPB | 74.12 | 59.50 | 63.41 | 0.2984 |
a, b, c Values were taken from Table 1 in Xu et al. [16], with the GPS-SNO threshold set to “high” (a), “medium” (b), and “low” (c).
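As a consistency check on the table above, the sketch below recomputes accuracy and MCC for the iSNO-ANBPB row of the Li training dataset (67.60, 64.29, 65.39, 0.3014). It assumes that the first two percentages are sensitivity and specificity and that the training set holds the 784 positive and 1568 negative peptides named in the abstract; the function name is hypothetical, and the confusion-matrix counts are left fractional because the exact counts are not given.

```python
import math


def accuracy_and_mcc(sn, sp, n_pos, n_neg):
    """Derive accuracy and MCC from sensitivity, specificity and class sizes.

    sn and sp are fractions in [0, 1]; counts are kept fractional because
    the record reports only rounded percentages (an assumption).
    """
    tp = sn * n_pos          # true positives
    fn = n_pos - tp          # false negatives
    tn = sp * n_neg          # true negatives
    fp = n_neg - tn          # false positives

    acc = (tp + tn) / (n_pos + n_neg)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# iSNO-ANBPB on the Li training dataset, read as Sn = 67.60%, Sp = 64.29%.
acc, mcc = accuracy_and_mcc(0.6760, 0.6429, n_pos=784, n_neg=1568)
print(f"accuracy = {100 * acc:.2f}%, MCC = {mcc:.4f}")
```

Under these assumptions the recomputed values land within rounding error of the reported 65.39% and 0.3014, which supports reading the four numeric columns as sensitivity, specificity, accuracy, and MCC.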
Figure 1. Potential S-nitrosylation sites predicted on 37 proteins by the SNOSite, iSNO-PseAAC, iSNO-AAPair, and iSNO-ANBPB predictors (SNO: S-nitrosothiols; ANBPB: adapted normal distribution bi-profile Bayes).