David T Jones, Domenico Cozzetto.
Abstract
MOTIVATION: A sizeable fraction of eukaryotic proteins contain intrinsically disordered regions (IDRs), which act in unfolded states or by undergoing transitions between structured and unstructured conformations. Over time, sequence-based classifiers of IDRs have become fairly accurate, and a major challenge now is linking IDRs to their biological roles from the molecular to the systems level.
Year: 2014 PMID: 25391399 PMCID: PMC4380029 DOI: 10.1093/bioinformatics/btu744
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. ROC curves of probability-based IDR predictions for DISOPRED3 and DISOPRED2 on the CASP10 data
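As a reading aid for the AUC rows in the tables, here is a minimal sketch of how the area under a ROC curve can be computed from per-residue disorder probabilities. It uses the standard pairwise (rank-based) definition, not necessarily the procedure used in the paper, and the function name `roc_auc` and the toy scores are illustrative assumptions.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities of disorder, one per residue.
    labels: 1 for disordered residues, 0 for ordered ones.
    AUC equals the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the CASP10 benchmark).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

The quadratic loop keeps the definition explicit; a sort-based version would be preferable for full-length benchmark sets.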
Comparison of DISOPRED3 and DISOPRED2 performance divided into IDR length ranges
| Measure | DISOPRED2, no IDR shorter than 4 aas | DISOPRED3, no IDR shorter than 4 aas | DISOPRED2, no IDR shorter than 20 aas | DISOPRED3, no IDR shorter than 20 aas |
|---|---|---|---|---|
| Sensitivity | 0.396 | 0.384 | 0.343 | 0.533 |
| Specificity | 0.941 | 0.991 | 0.941 | 0.991 |
| Precision | 0.323 | 0.755 | 0.134 | 0.616 |
| MCC | 0.307 | 0.517 | 0.181 | 0.563 |
| AUC | 0.787 | 0.880 | 0.731 | 0.904 |

| Measure | DISOPRED2, no IDR shorter than 30 aas | DISOPRED3, no IDR shorter than 30 aas | DISOPRED2, no IDR shorter than 40 aas | DISOPRED3, no IDR shorter than 40 aas |
|---|---|---|---|---|
| Sensitivity | 0.419 | 0.581 | 0.340 | 0.319 |
| Specificity | 0.941 | 0.991 | 0.941 | 0.991 |
| Precision | 0.084 | 0.459 | 0.024 | 0.131 |
| MCC | 0.166 | 0.509 | 0.076 | 0.199 |
| AUC | 0.745 | 0.900 | 0.726 | 0.891 |
Performance comparison between DISOPRED releases by IDR position along sequences
| Measure | DISOPRED2, terminal protein regions | DISOPRED3, terminal protein regions | DISOPRED2, internal protein regions | DISOPRED3, internal protein regions |
|---|---|---|---|---|
| Sensitivity | 0.665 | 0.584 | 0.275 | 0.293 |
| Specificity | 0.825 | 0.914 | 0.947 | 0.995 |
| Precision | 0.645 | 0.766 | 0.209 | 0.745 |
| MCC | 0.486 | 0.541 | 0.195 | 0.452 |
| AUC | 0.825 | 0.877 | 0.755 | 0.850 |
Benchmark of SVM classifiers of disordered protein-binding residues trained on different sets of sequence-derived features
| Measure | Naïve | Sequence | PSSM | PSSM, IDR location, length and AA composition |
|---|---|---|---|---|
| Sensitivity | 0.500 | 0.640 | 0.781 | 0.270 |
| Specificity | 0.500 | 0.517 | 0.401 | 0.940 |
| Precision | 0.174 | 0.218 | 0.215 | 0.485 |
| MCC | 0.000 | 0.119 | 0.143 | 0.269 |
| F1 | 0.258 | 0.325 | 0.337 | 0.347 |
Naïve predictions correspond to random labelling of the known disordered residues as folding upon protein binding or not with equal probability.
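The Naïve column can be checked by hand. Below is a minimal sketch, using the standard confusion-matrix definitions of the reported measures, of what a coin-flip labelling is expected to score. The positive fraction of 0.174 is an assumption read off the Naïve precision in the table, not a figure stated in the text, and the function names are illustrative.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard measures from confusion-matrix counts (or fractions)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


def naive_expected(prevalence):
    """Expected measures when each residue is labelled positive with probability 0.5.

    prevalence: fraction of residues that are truly positive
    (assumed here; inferred from the Naive precision in the table).
    """
    half_pos = prevalence / 2          # positives split evenly into TP and FN
    half_neg = (1 - prevalence) / 2    # negatives split evenly into FP and TN
    return binary_metrics(tp=half_pos, fp=half_neg, tn=half_neg, fn=half_pos)


naive = naive_expected(0.174)
```

Under that assumption, sensitivity and specificity are both 0.5, precision equals the positive fraction, MCC is zero, and the harmonic mean of precision and sensitivity comes to about 0.258, in line with the Naïve column.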
Benchmark results of DISOPRED3 against other approaches for disordered protein-binding prediction
| Method | Sensitivity | Specificity | Precision | F1 | MCC |
|---|---|---|---|---|---|
| DISOPRED3 | 0.147 | 0.958 | 0.218 | 0.176 | 0.126 |
| MoRFpred | 0.190 | 0.922 | 0.162 | 0.175 | 0.104 |
| MFSPSSMpred | 0.206 | 0.900 | 0.152 | 0.175 | 0.093 |
| Naïve | 0.500 | 0.500 | 0.074 | 0.129 | 0.000 |
| DISOPRED3 no DPB SVM | 0.307 | 0.613 | 0.059 | 0.100 | −0.043 |
| ANCHOR | 0.288 | 0.536 | 0.047 | 0.081 | −0.092 |
DISOPRED3 no DPB SVM is a baseline method that considers all disordered residues identified by DISOPRED3 as involved in protein binding.