Jagat Singh Chauhan, Alka Rao, Gajendra P S Raghava.
Abstract
Glycosylation is one of the m…
Year: 2013 PMID: 23840574 PMCID: PMC3695939 DOI: 10.1371/journal.pone.0067008
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Flowchart showing the process for creating the various datasets used for developing GlycoEP models.
Figure 2. The process of creating overlapping patterns in a glycoprotein and assigning glycosylated and non-glycosylated patterns.
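The pattern-generation step named in the Figure 2 caption can be sketched as follows. This is only one plausible reading of the caption: it assumes a fixed-length window centred on each candidate residue, padding with a dummy residue at the sequence ends, and labelling by annotated glycosite position. The function name `make_patterns`, the default window length of 21, and the pad character `X` are illustrative assumptions, not values taken from the source.

```python
def make_patterns(sequence, glycosites, residues="N", window=21, pad="X"):
    """Build overlapping, residue-centred patterns from one protein sequence.

    sequence   : protein sequence as a string of one-letter codes
    glycosites : set of 1-based positions annotated as glycosylated
    residues   : candidate residue types (e.g. "N" for N-linked, "ST" for O-linked)
    window     : odd pattern length (assumed value, not from the source)
    pad        : dummy residue used to pad the termini (assumption)
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the pattern has a centre residue")
    half = window // 2
    padded = pad * half + sequence + pad * half
    patterns = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            # In the padded string, residue i sits at index i + half,
            # so the slice [i, i + window) is centred on it.
            pattern = padded[i:i + window]
            patterns.append((pattern, (i + 1) in glycosites))
    return patterns


# Toy sequence with one annotated glycosite at position 2.
for pattern, is_glycosylated in make_patterns("MNGSAN", {2}, window=5):
    print(pattern, "glycosylated" if is_glycosylated else "non-glycosylated")
```

Patterns from neighbouring candidate residues share most of their residues, which is why they overlap.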
Non-redundant glycosylated and non-glycosylated (positive+negative) patterns at different levels of similarity cut-off.
| Redundancy cut-off | N-linked patterns: total = (positive+negative) | O-linked patterns: total = (positive+negative) | C-linked patterns: total = (positive+negative) |
| | 39024 = (2604+36420) | 10403 = (451+9952) | 157 = (48+109) |
| 100% | 39019 = (2604+36415) | 10371 = (451+9920) | 157 = (48+109) |
| 90% | 35293 = (2588+32705) | 7314 = (339+6975) | 150 = (48+102) |
| 80% | 32245 = (2549+29696) | 5669 = (289+5380) | 116 = (32+84) |
| 70% | 29376 = (2506+26870) | 4566 = (258+4308) | 106 = (27+79) |
| 60% | 26505 = (2454+24051) | 3776 = (235+3541) | 105 = (27+78) |
| 50% | 23076 = (2361+20715) | 3234 = (214+3020) | 99 = (23+76) |
| 40% | 10102 = (1599+8503) | 2390 = (174+2216) | 90 = (16+74) |
Performance of sequon (motif) detection for N-linked glycosylation on five independent glycoproteins using the GlycoEP server.
| Glycoprotein ID | Total asparagine residues in whole sequence | N-linked sequons detected in sequence (NXS/T) | Actual N-linked sequons |
| P28825 | 41 | 10 | 9 |
| P81447 | 8 | 1 | 1 |
| P06756 | 50 | 13 | 4 |
| P31809 | 47 | 16 | 8 |
| P01833 | 39 | 7 | 7 |
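The sequon counts in the table above come from motif matching, and a minimal sketch of such a scan is given here. It is not the GlycoEP server's implementation. The table header gives the motif only as NXS/T; excluding proline at the X position follows the commonly stated form of the N-glycosylation sequon and is an assumption in this sketch, as are the names `SEQUON` and `find_sequons`.

```python
import re

# N followed by any residue except proline, then S or T.
# The lookahead keeps matches zero-width after the N, so overlapping
# sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


toy = "MNGSANPTNKTNN"  # toy sequence, not one of the proteins in the table
positions = find_sequons(toy)
print(f"{toy.count('N')} asparagines, sequons at positions {positions}")
```

As the table shows for P06756 (13 detected, 4 actual), a motif match is only a candidate site: not every sequon is glycosylated, which is why a trained model is applied on top of motif detection.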
Figure 3. Performance of various models on standard datasets in terms of ROC for N-, O- and C-linked glycosites (panels A, B and C, respectively) in eukaryotic proteins.
Performance of models developed on advanced datasets for predicting N-linked, O-linked and C-linked glycosites, together with performance on balanced patterns of the advanced datasets (results are reported with the standard deviation over five folds).
| Datasets | Type | Sensitivity | Specificity | Accuracy | MCC | AUC |
| Advanced datasets | N-linked | 98.16±0.54 | 82.82±0.58 | 84.24±0.49 | 0.54±0.001 | 0.93±0.001 |
| | O-linked | 35.75±6.28 | 90.26±0.79 | 86.87±0.86 | 0.20±0.05 | 0.71±0.02 |
| | C-linked | 70.67±8.94 | 93.59±2.98 | 91.43±3.999 | 0.78±0.1 | 0.92±0.08 |
| Advanced datasets (balanced patterns) | N-linked | 98.25±0.53 | 86.27±1.02 | 92.26±0.42 | 0.85±0.01 | 0.929±0.001 |
| | O-linked | 63.4±9.57 | 62.13±17.31 | 62.77±5.48 | 0.27±0.13 | 0.69±0.08 |
| | C-linked | 82.67±16.73 | 80±36.13 | 79.82±14.77 | 0.66±0.22 | 0.91±0.09 |
Performance on an independent dataset of models developed on standard datasets.
| Types | No. of patterns | Sensitivity | Specificity | Accuracy | MCC | AUC |
| N-linked | 521 | 96.93 | 88.87 | 92.90 | 0.86 | 0.935 |
| O-linked | 90 | 72.22 | 74.44 | 73.33 | 0.47 | 0.783 |
| C-linked | 9 | 100.00 | 88.89 | 94.44 | 0.89 | 1.000 |
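The threshold-dependent measures reported in these tables (sensitivity, specificity, accuracy and MCC) can be computed from confusion-matrix counts using their standard definitions, as sketched below. The source does not list the underlying counts, so the example counts are an assumption: 9 glycosylated and 9 non-glycosylated patterns with a single false positive. The function name `binary_metrics` is likewise illustrative.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification measures from confusion-matrix counts.

    Sensitivity, specificity and accuracy are returned as percentages;
    MCC is returned on its usual -1..1 scale.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed counts (not given in the source): 9 positives, 9 negatives, 1 false positive.
metrics = binary_metrics(tp=9, fn=0, tn=8, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts the values round to 100.00, 88.89, 94.44 and 0.89, which match the C-linked row above. With so few patterns, a single misclassified pattern shifts specificity by about 11 percentage points, so that row should be read with caution.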
Comparative performance of existing methods and our models developed on standard datasets.
| Glycosites | Methods | Sensitivity | Specificity | Accuracy | MCC |
| | | | | | |
| | GPP [1] | 96.6 | 91.8 | 92.8 | 0.85 |
| | EnsembleGly [2] | 98.0 | 77.0 | 95.0 | 0.84 |
| | NetNGlyc [3] | 43.9 | 95.7 | 76.7 | 0.49 |
| | | | | | |
| | GPP [1] | 94.9 | 90.7 | 91.4 | 0.83 |
| | EnsembleGly [2] | 59.0 | 68.0 | 89.0 | 0.64 |
| | NetOGlyc [4] | 76.0 | 92.8 | 88.6 | 0.66 |
| | | | | | |
| | GPP [1] | 96.1 | 88.9 | 90.8 | 0.81 |
| | NetOGlyc [4] | 66.7 | 95.3 | 91.8 | 0.62 |
| | | | | | |
| | GPP [1] | 93.6 | 92.4 | 92.0 | 0.84 |
| | NetOGlyc [4] | 81.5 | 89.5 | 84.9 | 0.67 |
| | | | | | |
| | GlycoEP | 89.80 | 81.63 | 85.71 | 0.72 |
| | EnsembleGly [2] | 79.0 | 77.0 | 83.0 | 0.63 |
Note: GlycoEP: http://www.imtech.res.in/raghava/glycoep/; [1] http://www.comp.chem.nottingham.ac.uk/glyco/; [2] http://www.turing.cs.iastate.edu/EnsembleGly/; [3] http://www.cbs.dtu.dk/services/NetNGlyc/; [4] http://www.cbs.dtu.dk/services/NetOGlyc.