Yan Liu, Wenxiang Gu, Wenyi Zhang, Jianan Wang.
Abstract
Glycation is a nonenzymatic process in which proteins react with reducing sugar molecules. Identifying glycation sites in proteins may provide guidance for understanding the biological function of protein glycation. In this study, we developed a computational method that predicts protein glycation sites with a support vector machine classifier. In the experiments, the method achieved a prediction accuracy of 85.51% and an overall Matthews correlation coefficient (MCC) of 0.70. Feature analysis indicated that the composition of k-spaced amino acid pairs feature contributed the most to glycation site prediction.
Year: 2015 PMID: 25961025 PMCID: PMC4413511 DOI: 10.1155/2015/561547
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. The MCC value against feature number.
Top 10 features in the optimal feature set.
| Order | Feature number | Feature name | Explanation |
|---|---|---|---|
| 1 | 1793 | S∧∧∧∧W | Pair of serine and tryptophan spaced by 4 residues |
| 2 | 33 | SS | The secondary structure of site 3 |
| 3 | 1494 | C∧∧∧∧Q | Pair of cysteine and glutamine spaced by 4 residues |
| 4 | 1752 | Q∧∧∧∧Y | Pair of glutamine and tyrosine spaced by 4 residues |
| 5 | 91 | EC | The electrostatic charge of site 14 |
| 6 | 710 | H∧∧H | Pair of histidine and histidine spaced by 2 residues |
| 7 | 129 | MV | The molecular volume of site 22 |
| 8 | 341 | L∧S | Pair of leucine and serine spaced by one residue |
| 9 | 629 | D∧∧L | Pair of aspartic acid and leucine spaced by 2 residues |
| 10 | 210 | E∧M | Pair of glutamic acid and methionine spaced by one residue |
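The k-spaced pair features in the table above can be illustrated with a minimal sketch. It assumes a 20-letter amino acid alphabet, spacings k = 0 to 4 (the largest spacing shown in the table), and normalization by the number of pairs in the window; the function name `cksaap`, the use of an ASCII `^` in place of ∧, and the feature ordering are illustrative choices and do not reproduce the feature numbering used in the paper.

```python
from itertools import product

# Assumption: the 20 standard amino acids; other characters are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(window, k_max=4):
    """Composition of k-spaced amino acid pairs for one sequence window.

    For each spacing k in 0..k_max, count every pair (a, b) where b occurs
    exactly k + 1 positions after a, then divide by the number of such
    position pairs in the window. Returns a dict keyed like "S^^^^W".
    """
    features = {}
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        n_pairs = len(window) - k - 1  # position pairs available at spacing k
        for i in range(max(n_pairs, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard symbols
                counts[pair] += 1
        for pair, count in counts.items():
            name = pair[0] + "^" * k + pair[1]
            features[name] = count / n_pairs if n_pairs > 0 else 0.0
    return features


if __name__ == "__main__":
    feats = cksaap("SAAAAWK")  # hypothetical 7-residue window
    print(len(feats))          # 5 spacings x 400 pairs = 2000 features
    print(feats["S^^^^W"])     # 1 of the 2 pairs at spacing 4 -> 0.5
```

With this encoding each window yields 400 values per spacing, so a feature such as `S^^^^W` is simply the fraction of 4-spaced position pairs in the window that are serine followed by tryptophan.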
Figure 2. Amino acids classification distribution.
Figure 3. Number of corresponding specific sites.
Comparison of PreGly with GlyNN.
| Method | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC |
|---|---|---|---|---|
| PreGly | 71.06 | 95.85 | 85.51 | 0.70 |
| GlyNN | 78.65 | 80.15 | 79.50 | 0.58 |
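The four columns in the comparison above follow the standard confusion-matrix definitions. The sketch below shows how they are computed; the counts in the usage example are hypothetical, since the record does not report the underlying true/false positive and negative counts, and the function name `binary_metrics` is an illustrative choice.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    sens, spec, acc, mcc = binary_metrics(tp=40, fp=5, tn=45, fn=10)
    print(f"Sensitivity {sens:.2%}, specificity {spec:.2%}, "
          f"accuracy {acc:.2%}, MCC {mcc:.2f}")
```

Because MCC uses all four cells of the confusion matrix, it can separate two methods whose accuracies look similar but whose sensitivity and specificity are balanced differently, as in the PreGly and GlyNN rows.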
Figure 4. The distribution of each feature type in the optimal feature set.
Figure 5. The distribution of the amino acid types in the optimal features.