Jianzhao Gao, Yuedong Yang, Yaoqi Zhou.
Abstract
BACKGROUND: Protein structure can be described by backbone torsion angles: the rotational angles about the N-Cα bond (φ) and the Cα-C bond (ψ), or the angle between Cαᵢ₋₁-Cαᵢ-Cαᵢ₊₁ (θ) and the rotational angle about the Cαᵢ-Cαᵢ₊₁ bond (τ). Accurate prediction of these angles is therefore useful for structure prediction and model refinement. Early methods predicted torsion angles in a few discrete bins, whereas most recent methods have focused on predicting angles as real, continuous values. Real-value prediction, however, cannot provide probabilities for the predicted angles.
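The angle definitions above can be made concrete with a small geometric sketch. It is a minimal illustration in plain Python, not code from the paper: the helper names `planar_angle` and `dihedral` are hypothetical, and coordinates are assumed to be 3-tuples in any consistent length unit.

```python
import math

def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def planar_angle(a, b, c):
    """Angle at vertex b in degrees, e.g. theta for three consecutive C-alpha atoms."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p0, p1, p2, p3):
    """Signed rotation about the p1-p2 bond in degrees, in [-180, 180]."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

With the usual backbone atom ordering, φ of residue i would be `dihedral(C[i-1], N[i], CA[i], C[i])` and ψ would be `dihedral(N[i], CA[i], C[i], N[i+1])`. θ would be `planar_angle(CA[i-1], CA[i], CA[i+1])`, and τ is commonly taken as the dihedral over four consecutive Cα atoms; the exact index convention used by the authors is an assumption here.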
Keywords: Deep learning neural network; Intrinsically disordered region; Model quality assessment; Torsion angle
Year: 2018 PMID: 29390958 PMCID: PMC5796405 DOI: 10.1186/s12859-018-2031-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Accuracy for four angles, 5° for each bin
| Dataset | Method | φ (Top 5)ᶜ | ψ (Top 5)ᶜ | θ (Top 5)ᶜ | τ (Top 5)ᶜ |
|---|---|---|---|---|---|
| TR4590 | SPIDER2ᵃ | 0.166 | 0.162 | 0.318 | 0.161 |
| | M1ᵇ | 0.196(0.607) | 0.179(0.583) | 0.365(0.799) | 0.174(0.504) |
| | M2ᵇ | 0.203(0.636) | 0.187(0.616) | 0.379(0.828) | 0.185(0.547) |
| TS1199 | ANGLOR | 0.141 | 0.055 | NA | NA |
| | SPIDER2ᵃ | 0.162 | 0.151 | 0.304 | 0.153 |
| | SPIDER3ᵃ | 0.156 | 0.157 | 0.325 | 0.162 |
| | M1ᵇ | 0.192(0.598) | 0.171(0.567) | 0.358(0.794) | 0.171(0.497) |
| | M2ᵇ | 0.196(0.615) | 0.174(0.588) | 0.367(0.810) | 0.178(0.528) |

ᵃ Predicted real angle values from SPIDER2/SPIDER3 were evaluated according to 5° bins. ᵇ M1 and M2 are the models without and with SPIDER2 as input, respectively. ᶜ The number in parentheses is the accuracy of matching the native angle to one of the top five predicted angle bins.
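The two accuracy figures reported per angle (exact-bin accuracy and top-five accuracy) can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation script: angles are assumed to lie in [-180°, 180°] and to be mapped onto 72 bins of 5° starting at -180°, and the function names `angle_to_bin` and `top_k_accuracy` are hypothetical.

```python
def angle_to_bin(angle_deg, bin_width=5.0):
    """Map an angle in degrees to a bin index; -180 and 180 share bin 0 (assumed layout)."""
    n_bins = int(round(360.0 / bin_width))
    # The final modulo guards against a floating-point result of exactly 360.0.
    return int(((angle_deg + 180.0) % 360.0) // bin_width) % n_bins

def top_k_accuracy(prob_rows, native_angles, k=1, bin_width=5.0):
    """Fraction of residues whose native bin is among the k most probable predicted bins."""
    hits = 0
    for probs, angle in zip(prob_rows, native_angles):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if angle_to_bin(angle, bin_width) in ranked[:k]:
            hits += 1
    return hits / len(native_angles)
```

A real-valued predictor such as SPIDER2 can be scored the same way by passing its predicted angle through `angle_to_bin` and comparing it with the native bin, which corresponds to the k=1 case.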
Accuracy for four angles, 10° for each bin in TS1199
| Method | φ | ψ | θ | τ |
|---|---|---|---|---|
| SPIDER2ᵃ | 0.292 | 0.263 | 0.458 | 0.241 |
| M2–5°ᵇ | 0.337 | 0.297 | 0.516 | 0.274 |
| M2–10°ᶜ | 0.340 | 0.300 | 0.520 | 0.277 |

ᵃ Predicted real angle values from SPIDER2 were evaluated based on 10° bins. ᵇ Trained with SPIDER2 input and 5° bins, and evaluated by combining each two neighboring 5° bins. ᶜ Trained with SPIDER2 input and 10° bins.
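One plausible reading of "combining two neighboring 5° bins" is to sum the predicted probabilities of each adjacent pair, which turns a 72-bin distribution into a 36-bin one. The sketch below shows that reading only; whether the authors summed probabilities or merged bins in another way is an assumption, and `merge_neighboring_bins` is a hypothetical name.

```python
def merge_neighboring_bins(probs, group=2):
    """Sum each run of `group` adjacent bin probabilities into one coarser bin."""
    if len(probs) % group != 0:
        raise ValueError("number of bins must be divisible by the group size")
    return [sum(probs[i:i + group]) for i in range(0, len(probs), group)]
```

The merged distribution still sums to the same total, so its most probable entry can be compared directly with the native angle's 10° bin.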
Accuracy for four angles, 5° for each bin, using different combinations of feature groups in M2 on the training dataset TR4590 with 10-fold cross-validation. The number in parentheses is the accuracy of matching the native angle to one of the top five predicted angle bins
| Method | φ (Top 5) | ψ (Top 5) | θ (Top 5) | τ (Top 5) |
|---|---|---|---|---|
| Angle-based features (Angles)ᵃ | 0.200(0.629) | 0.183(0.608) | 0.374(0.823) | 0.180(0.542) |
| Structure-based features (Struct)ᵇ | 0.193(0.602) | 0.176(0.583) | 0.363(0.804) | 0.174(0.521) |
| PSSM-based features (PSSM)ᶜ | 0.188(0.588) | 0.168(0.555) | 0.353(0.784) | 0.167(0.493) |
| Angles+PSSM | 0.202(0.633) | 0.186(0.613) | 0.377(0.826) | 0.184(0.545) |
| Angles+Struct | 0.201(0.632) | 0.185(0.611) | 0.376(0.825) | 0.182(0.544) |
| PSSM+Struct | 0.198(0.622) | 0.183(0.603) | 0.373(0.819) | 0.180(0.534) |
| All features of M2 model | 0.203(0.636) | 0.187(0.616) | 0.379(0.828) | 0.185(0.547) |

ᵃ Predicted-angle feature group: φ and ψ angles, the Cα-atom-based angle θ, and the rotational angle τ. ᵇ Structure-based feature group: predicted secondary structure probability, relative solvent accessibility, half-sphere exposure, and contact numbers. ᶜ PSSM-based feature group: the features from the PSSM profile.
Fig. 1 Receiver operating characteristic curves for disorder prediction given by a single feature, the entropy of the angle probabilities predicted by M1 (PSSM + amino acid properties) or M2 (with SPIDER2 as input), compared with SPOT-disorder, a deep-learning neural-network-based technique that employs multiple features
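The single feature referred to in this caption is the entropy of a residue's predicted angle-bin distribution. A minimal sketch of that quantity is given below, assuming Shannon entropy with the natural logarithm; the log base and any normalization used in the paper are not stated here, and `shannon_entropy` is a hypothetical name.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of one residue's predicted bin probabilities."""
    # Zero-probability bins contribute nothing, by the convention 0 * log(0) = 0.
    return -sum(p * math.log(p) for p in probs if p > 0)
```

A flat distribution over 72 bins gives the maximum value log(72), while a distribution concentrated in one bin gives 0, so a higher entropy means the predictor is less certain about the residue's angle; that uncertainty is the disorder signal plotted in the figure.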
Performance in model selection according to average Pearson correlation coefficient (PCC) and average Global Distance Test (GDT) score of top 1 ranked models in the CASP11MOD dataset
| Method | PCCᵃ (medianᵇ) | GDT |
|---|---|---|
| DFIRE | −0.24(−0.23) | 0.46 |
| dDFIRE | −0.27(−0.31) | 0.45 |
| RWplus | −0.20(−0.21) | 0.47 |
| M2 | 0.45(0.47) | |
| M2-ψ | 0.49(0.49) | |
| M2-θ | 0.53(0.55) | 0.47 |
| M2-τ | | 0.47 |

ᵃ Average of the PCCs of 72 targets. ᵇ Median of the PCCs of 72 targets. The best results were emphasized.
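The two columns of this table can be illustrated for a single target as follows: the Pearson correlation between a quality score and the GDT score over all candidate models, and the GDT of the model ranked first by that score. This is a plain-Python sketch, not the authors' pipeline; `pearson` and `top1_gdt` are hypothetical names, and the `higher_is_better` flag is an assumption added to cover energy-like scores, where lower values indicate better models.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences (non-constant inputs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def top1_gdt(scores, gdts, higher_is_better=True):
    """GDT of the model ranked first by the quality score."""
    sign = 1.0 if higher_is_better else -1.0
    best = max(range(len(scores)), key=lambda i: sign * scores[i])
    return gdts[best]
```

Averaging `pearson` and `top1_gdt` over all targets would give the per-method summary values. For statistical energy functions a good score correlates negatively with GDT, which is consistent with the negative PCCs in the DFIRE, dDFIRE and RWplus rows.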
Fig. 2 Average Pearson correlation coefficients for the four angle-based scores and the statistical energy scores DFIRE, dDFIRE and RWplus
Fig. 3 Scatter plots of quality scores against GDT scores for target T0848. The dashed line is the regression line between quality scores and GDT scores. (A) dDFIRE energy score vs. GDT score; the Pearson correlation coefficient is −0.09. (B) M2-τ score vs. GDT score; the Pearson correlation coefficient is 0.55
Fig. 4 Alignment between the first domain of the model selected using the M2-τ quality score (purple) and the first domain of the actual structure of target T0848 (PDB ID: 4R4G, green)
Performance in model selection according to average Pearson correlation coefficient (PCC) and average Global Distance Test (GDT) score of models in the Rosetta decoy set
| Method | PCCᵃ (medianᵇ) | GDT |
|---|---|---|
| DFIRE | −0.53(−0.71) | |
| dDFIRE | −0.38(−0.48) | 0.59 |
| RWplus | −0.51(−0.68) | 0.70 |
| M2 | 0.43(0.51) | 0.66 |
| M2-ψ | 0.48(0.65) | 0.69 |
| M2-θ | 0.50(0.66) | |
| M2-τ | | 0.69 |

ᵃ Average of the PCCs of 58 native structures. ᵇ Median of the PCCs of 58 native structures. The best results were emphasized.