Yuanpeng Janet Huang, Antonio Rosato, Gautam Singh, Gaetano T Montelione.
Abstract
We describe the RPF web server, a quality assessment tool for protein NMR structures. The RPF server measures the 'goodness-of-fit' of the 3D structure with NMR chemical shift and unassigned NOESY data, and calculates a discrimination power (DP) score, which estimates the differences between the fits of the query structures and random coil structures to these experimental data. The DP-score is an accuracy predictor of the query structure. The RPF server also maps local structure quality measures onto the 3D structure using an online molecular viewer, and onto the NMR spectra, allowing refinement of the structure and/or NOESY peak list data. The RPF server is available at: http://nmr.cabm.rutgers.edu/rpf.
Year: 2012 PMID: 22570414 PMCID: PMC3394279 DOI: 10.1093/nar/gks373
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. RPF output. (A) The distribution of the Precision Violations (a.k.a. false-positive interactions) mapped on the query structure based on a heat index. Red represents residues with strong Precision Violations and blue represents residues with few or no Precision Violations. In this example, residues 29 and 32 are colored red, indicating that several very short distances based on the input structure do not have corresponding NOE data in the NOESY peak list and/or one or more of the corresponding resonances are mis-assigned in the chemical shift list. (B) The ‘Precision Violations’ page reports all distances ≤5.0 Å calculated from the query structures that are not supported by the NOESY data. In this example, there are six Precision Violations involving residues 29 or 32 with a maximum distance of 3.0 Å. (C) The ‘Recall Violations’ page reports the input NOESY peaks that are not supported by the query structures within the average distance of 5.0 Å.
Figure 2. Correlation between accuracy measures (backbone RMSD to the reference structure and GDT_TS score) and the DP-score. The various thresholds mentioned in the text are highlighted by the continuous (RMSD ≤ 2 Å; GDT_TS ≥ 80) and dashed (DP-score ≥ 0.7) lines. These results demonstrate the discriminating power of the DP-score in distinguishing accurate from less accurate protein NMR models.
Pearson’s correlation coefficient between various accuracy and quality scores for the same data shown in Figure 2
| | DP-score | Verify3D | ProsaII | PROCHECK (phi–psi) | PROCHECK (all) | MolProbity clash score |
|---|---|---|---|---|---|---|
| RMSD | −0.659 | −0.139 | −0.156 | 0.108 | 0.257 | 0.065 |
| GDT_TS | 0.887 | 0.283 | 0.260 | −0.065 | −0.246 | −0.085 |
Confusion matrix and metrics for accuracy prediction on the basis of the DP-score
| DP-score prediction | Success: Positive | Success: Negative |
|---|---|---|
| Positive | 44 (TP)^a | 2 (FP)^b |
| Negative | 4 (FN)^c | 13 (TN)^d |
| Metrics | | |
| Sensitivity [TP/(TP + FN)] | 0.917 | |
| Specificity [TN/(TN + FP)] | 0.867 | |
| Precision [TP/(TP + FP)]^e | 0.957 | |
^a True positives (TP) are accurate structures (i.e. RMSD ≤ 2.0 Å or GDT_TS ≥ 80) that are correctly predicted to be accurate on the basis of their DP-score being higher than the threshold (i.e. 0.7).
^b False positives (FP) are inaccurate structures that are erroneously predicted to be accurate on the basis of their DP-score being higher than the threshold.
^c False negatives (FN) are accurate structures that are erroneously predicted to be inaccurate on the basis of their DP-score being lower than the threshold.
^d True negatives (TN) are inaccurate structures that are correctly predicted to be inaccurate on the basis of their DP-score being lower than the threshold.
^e The precision (i.e. the ratio of true positives among all positive predictions) becomes 1.00 at a DP-score cut-off of 0.76.
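The metrics in the confusion matrix follow directly from the four counts, and the footnotes define how each structure is classified. The Python sketch below recomputes the three reported metrics from the published counts and shows one way the classification rule could be applied. It is an illustration and not the RPF server's code: the names `ConfusionMatrix`, `is_accurate` and `tally` are hypothetical, and treating a DP-score exactly equal to the threshold as a positive prediction is an assumption taken from the "DP-score ≥ 0.7" wording of Figure 2.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Thresholds quoted in the table footnotes and in Figure 2.
RMSD_MAX = 2.0      # backbone RMSD to the reference structure, in angstroms
GDT_TS_MIN = 80.0   # GDT_TS score
DP_CUTOFF = 0.7     # DP-score threshold


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical container for the four counts in the table."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        # TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    @property
    def precision(self) -> float:
        # TP / (TP + FP): true positives among all positive predictions
        return self.tp / (self.tp + self.fp)


def is_accurate(rmsd: float, gdt_ts: float) -> bool:
    """Accuracy criterion from the footnotes: RMSD <= 2.0 or GDT_TS >= 80."""
    return rmsd <= RMSD_MAX or gdt_ts >= GDT_TS_MIN


def tally(records: Iterable[Tuple[float, float, float]],
          cutoff: float = DP_CUTOFF) -> ConfusionMatrix:
    """Count TP/FP/FN/TN over (rmsd, gdt_ts, dp_score) records.

    Assumption: a DP-score equal to the cutoff counts as a positive prediction.
    """
    tp = fp = fn = tn = 0
    for rmsd, gdt_ts, dp_score in records:
        actual = is_accurate(rmsd, gdt_ts)
        predicted = dp_score >= cutoff
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


# Counts taken from the confusion matrix above.
table = ConfusionMatrix(tp=44, fp=2, fn=4, tn=13)
print(f"sensitivity = {table.sensitivity:.3f}")
print(f"specificity = {table.specificity:.3f}")
print(f"precision   = {table.precision:.3f}")
```

Run on the published counts, the three values round to 0.917, 0.867 and 0.957, matching the table. Passing a stricter `cutoff` to `tally` on per-structure data would show how precision trades off against sensitivity, as footnote e describes for a cut-off of 0.76.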