Ashraf Yaseen, Mais Nijim, Brandon Williams, Lei Qian, Min Li, Jianxin Wang, Yaohang Li.
Abstract
BACKGROUND: The fluctuation of atoms around their average positions in protein structures provides important information about protein dynamics. This flexibility is associated with various biological processes. Predicting the flexibility of residues from protein sequences is important for analyzing the dynamic properties of proteins, which in turn helps in predicting their functions.
Year: 2016 PMID: 27587065 PMCID: PMC5009531 DOI: 10.1186/s12859-016-1117-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. Distribution of the normalized B-factors in Cull5547. Large normalized B-values (to the right) indicate more flexible residues, and small normalized B-values (to the left) indicate more rigid residues. Most residues fall in the middle (intermediate flexibility).
Fig. 2. Encoding and neural network architecture for flexibility prediction. PSSM(20): position-specific scoring matrix. SCRS(3): context-based scores of residue Ri in the Rigid, Intermediate, and Flexible states. SS(3): predicted secondary structure, given as the probabilities of residue Ri being in Helix, Sheet, and Coil. SA(2): predicted relative solvent accessibility, given as the probabilities of residue Ri being in the Extended and Buried states. AAP(5): amino acid properties.
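The group widths in the Fig. 2 caption add up to 20 + 3 + 3 + 2 + 5 = 33 values per residue. A minimal sketch of that concatenation follows. The function name `encode_residues` and the dictionary layout are hypothetical, and the caption does not say whether the network also takes a window of neighbouring residues, so none is assumed here.

```python
# Feature group widths as listed in the Fig. 2 caption.
FEATURE_WIDTHS = {"PSSM": 20, "SCRS": 3, "SS": 3, "SA": 2, "AAP": 5}


def encode_residues(features):
    """Concatenate per-residue feature groups into rows of 33 floats.

    `features` maps each group name to a list with one entry per residue,
    where each entry is a sequence of that group's width.
    (Hypothetical input layout, chosen for illustration only.)
    """
    lengths = {len(features[name]) for name in FEATURE_WIDTHS}
    if len(lengths) != 1:
        raise ValueError("all feature groups must cover the same residues")
    n_residues = lengths.pop()

    encoded = []
    for i in range(n_residues):
        row = []
        for name, width in FEATURE_WIDTHS.items():
            values = list(features[name][i])
            if len(values) != width:
                raise ValueError(f"{name} must have {width} values per residue")
            row.extend(float(v) for v in values)
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # Dummy all-zero features for a 4-residue chain.
    dummy = {name: [[0.0] * width for _ in range(4)]
             for name, width in FEATURE_WIDTHS.items()}
    matrix = encode_residues(dummy)
    print(len(matrix), len(matrix[0]))
```

Keeping the widths in one dictionary makes it easy to drop groups and reproduce reduced encodings such as PSSM only.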
Prediction performance on Cull5547 dataset
| Encoding | QR | QI | QF | Q3 |
|---|---|---|---|---|
| PSSM Only | 56.7 | 50.4 | 51.3 | 52.6 |
| PSSM+Scores | 57.5 | 56.0 | 58.9 | 57.3 |
| All-features (FLEXc) | 61.7 | 57.2 | 66.6 | 61.0 |
Comparison of prediction accuracy using PSSM-only encoding, PSSM+context-based scores encoding, and all-features encoding on Cull5547 using 7-fold cross-validation. The all-features encoding includes PSSM, context-based scores, predicted secondary structure and solvent accessibility, and amino acid physicochemical properties.
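The column definitions are not spelled out in this record. The sketch below assumes the usual convention: QR, QI, and QF are the percentages of truly Rigid, Intermediate, and Flexible residues that are predicted correctly, and Q3 is the overall percentage of correctly predicted residues. The function name `per_state_accuracy` and the single-letter labels are hypothetical.

```python
def per_state_accuracy(true_states, predicted_states, states=("R", "I", "F")):
    """Return per-state accuracies (QR, QI, QF) and overall Q3, in percent.

    Assumes Q<state> = correct predictions for that true state divided by
    the number of residues in that true state (an assumed definition).
    """
    if len(true_states) != len(predicted_states):
        raise ValueError("label sequences must have equal length")
    if not true_states:
        raise ValueError("at least one residue is required")

    totals = {s: 0 for s in states}
    correct = {s: 0 for s in states}
    for truth, pred in zip(true_states, predicted_states):
        totals[truth] += 1
        if truth == pred:
            correct[truth] += 1

    result = {}
    for s in states:
        # A state absent from the true labels has no defined accuracy.
        result["Q" + s] = (100.0 * correct[s] / totals[s]
                           if totals[s] else float("nan"))
    result["Q3"] = 100.0 * sum(correct.values()) / len(true_states)
    return result


if __name__ == "__main__":
    truth = list("RRIIFF")
    pred = list("RIIIFR")
    print(per_state_accuracy(truth, pred))
```

Under this definition Q3 is a weighted average of the three per-state values, weighted by how many residues fall in each true state.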
Prediction performance on benchmark datasets
| Encoding | CASP11 | CASP10 | CASP9 | CASP8 |
|---|---|---|---|---|
| PSSM Only | 47.1 | 48.6 | 50.8 | 50.7 |
| PSSM+Scores | 52.6 | 52.7 | 52.9 | 52.6 |
| All-features (FLEXc) | 54.4 | 54.2 | 54.9 | 53.8 |
Comparison of Q3 prediction performance of protein flexibility using PSSM-only encoding, PSSM+context-based scores encoding, and all-features encoding on CASP8, CASP9, CASP10, and CASP11 targets
Comparison of prediction performance of FLEXc with PredyFlexy on benchmarks of CASP(8-11) targets
| Benchmark | Method | QR | QI | QF | Q3 |
|---|---|---|---|---|---|
| CASP11 | PredyFlexy | 41.5 | 41.3 | 58.3 | 42.0 |
| CASP11 | FLEXc | 48.3 | 55.4 | 65.2 | 54.4 |
| CASP10 | PredyFlexy | 36.4 | 42.5 | 53.5 | 42.4 |
| CASP10 | FLEXc | 47.6 | 56.4 | 62.2 | 54.2 |
| CASP9 | PredyFlexy | 37.9 | 42.0 | 57.4 | 42.3 |
| CASP9 | FLEXc | 50.1 | 55.2 | 62.2 | 54.9 |
| CASP8 | PredyFlexy | 40.3 | 41.4 | 55.6 | 41.8 |
| CASP8 | FLEXc | 49.1 | 58.4 | 57.0 | 53.8 |
Comparison of performance of 2-state FLEXc prediction with 2-state PredyFlexy and PROFbval prediction results using F-measure
| Definition | PredyFlexy | PROFbval | FLEXc |
|---|---|---|---|
| Strict (threshold = 0.03) | 48.08 | 53.30 | 58.46 |
| Non-Strict (threshold = −0.3) | 71.99 | 71.90 | 72.80 |
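The two-state comparison can be sketched as follows. This record does not say which side of the threshold counts as flexible, so the sketch assumes that residues with a normalized B-factor at or above the threshold are flexible and all others are rigid. It also assumes the F-measure is the harmonic mean of precision and recall on the flexible class, reported in percent. The function names `to_two_state` and `f_measure` are hypothetical.

```python
STRICT_THRESHOLD = 0.03       # "Strict" definition from the table above
NON_STRICT_THRESHOLD = -0.3   # "Non-Strict" definition from the table above


def to_two_state(normalized_b, threshold):
    """Label each residue True (flexible) or False (rigid).

    Assumption: values at or above the threshold are flexible.
    """
    return [b >= threshold for b in normalized_b]


def f_measure(true_flexible, predicted_flexible):
    """F-measure (percent) for the flexible class."""
    if len(true_flexible) != len(predicted_flexible):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(true_flexible, predicted_flexible))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 100.0 * 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    observed_b = [-1.0, -0.3, 0.0, 0.03, 1.2]
    truth = to_two_state(observed_b, STRICT_THRESHOLD)
    # Hypothetical predictor output, for illustration only.
    predicted = [False, True, True, True, True]
    print(round(f_measure(truth, predicted), 2))
```

A lower threshold labels more residues as flexible, which changes the class balance and so the F-measure.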
Fig. 3. Frequency distribution of normalized B-factors associated with secondary structure assignment (coil and others) in the Cull5547 dataset. a Distribution of B-factors from -2.9 to 0. b Distribution of B-factors from 4 to 9.4.
Fig. 4. Frequency distribution of normalized B-factors associated with relative solvent accessibility assignment (buried and exposed) in the Cull5547 dataset. a Distribution of B-factors from -2.9 to 0. b Distribution of B-factors from 4 to 9.4.