Timothy J Ragan, Rasmus H Fogh, Roberto Tejero, Wim Vranken, Gaetano T Montelione, Antonio Rosato, Geerten W Vuister.
Abstract
We performed a comprehensive structure validation of both automated and manually generated structures of the 10 targets of the CASD-NMR-2013 effort. We established that automated structure determination protocols are capable of reliably producing structures of comparable accuracy and quality to those generated by a skilled researcher, at least for small, single domain proteins such as the ten targets tested. The most robust results appear to be obtained when NOESY peak lists are used either as the primary input data or to augment chemical shift data without the need to manually filter such lists. A detailed analysis of the long-range NOE restraints generated by the different programs from the same data showed a surprisingly low degree of overlap. Additionally, we found that there was no significant correlation between the extent of the NOE restraint overlap and the accuracy of the structure. This result was surprising given the importance of NOE data in producing good quality structures. We suggest that this could be explained by the information redundancy present in NOEs between atoms contained within a fixed covalent network.
Keywords: Blind testing; CASD-NMR; NMR; NOE; Protein; Quality; Structure determination; Validation
Year: 2015 PMID: 26032236 PMCID: PMC4569653 DOI: 10.1007/s10858-015-9949-0
Source DB: PubMed Journal: J Biomol NMR ISSN: 0925-2738 Impact factor: 2.835
CASD-2013 targets
| Target ID | PDB ID | Valid range(s) | Reference ensemble authors |
|---|---|---|---|
| HR2876B | 2LTM | 13–105 | Liu, G., Xiao, R., Janjua, H., Hamilton, K., Shastry, R., Kohan, E., Acton, T.B., Everett, J.K., Lee, H., Huang, Y.J., Montelione, G.T. |
| HR2876C | 2M5O | 17–91 | Liu, G., Xiao, R., Janjua, H., Hamilton, K., Shastry, R., Kohan, E., Acton, T.B., Everett, J.K., Pederson, K., Huang, Y.J., Montelione, G.T. |
| HR5460A | 2LAH | 14–25, 33–158 | Liu, G., Shastry, R., Ciccosanti, C., Hamilton, K., Acton, T.B., Xiao, R., Everett, J.K., Montelione, G.T. |
| HR6430A | 2LA6 | 14–99 | Liu, G., Xiao, R., Janjua, H., Lee, H., Ciccosanti, C.T., Acton, T.B., Everett, J.K., Huang, Y.J., Montelione, G.T. |
| HR6470A | 2L9R | 554–608 | Liu, G., Xiao, R., Lee, H.-W., Hamilton, K., Ciccosanti, C., Wang, H.B., Acton, T.B., Everett, J.K., Huang, Y.J., Montelione, G.T. |
| HR8254A | 2M2E | 15–56 | Lemak, A., Yee, A., Houliston, S., Garcia, M., Ong, M., Arrowsmith, C. |
| OR135 | 2LN3 | 4–74 | Liu, G., Koga, R., Koga, N., Xiao, R., Lee, H., Janjua, H., Kohan, E., Acton, T.B., Everett, J.K., Baker, D., Montelione, G.T. |
| OR36 | 2LCI | 2–46, 53–125 | Liu, G., Koga, N., Koga, R., Xiao, R., Lee, H.T., Janjua, H., Ciccosanti, C., Acton, T.B., Everett, J., Baker, D., Montelione, G.T. |
| StT322 | 2LOJ | 23–63 | Wu, B., Yee, A., Houliston, S., Garcia, M., Savchenko, A., Arrowsmith, C.H. |
| YR313A | 2LTL | 17–41, 45–115 | Liu, G., Xiao, R., Hamilton, K., Janjua, H., Shastry, R., Kohan, E., Acton, T.B., Everett, J.K., Lee, H., Huang, Y.J., Montelione, G.T. |
The PDB ID, valid residue range(s) and reference ensemble authors are given for each target
Fig. 5 Overlap of long-range NOE restraints between the targets and the entries. Fraction of overlapping NOE restraints between the target and each entry determined on the basis of a pseudo-atom or b residue. Symbols and labels are explained in the legend for Fig. 1. c, d Heatmaps of the fractions of overlapping long-range NOE restraints between the OR36 target and entries, determined on the basis of pseudo-atom (c) or residue (d). The total number of long-range restraints present for each target/entry is shown on the diagonal. The off-diagonal values denote the percentage of restraints used in the entry indicated along the row that are also found in the entry indicated along the column. The top row shows the percentage of NOEs used in the reference structure that were found in each entry, while the left-most column shows the percentage of NOEs used by each entry that were found in the reference structure. For example, the entry in the square marked by the black box in (c) shows that 238 restraints (22 %) used in the OR36 target are also present in the OR36_ASDP-CNS_c entry
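The row/column convention of the heatmaps can be illustrated with a minimal sketch. It assumes each restraint is reduced to an unordered pair of (residue number, atom name) tuples and that two restraints overlap only when those pairs are identical; the function names and the toy restraints are hypothetical and are not taken from the paper or from any of the programs tested.

```python
from itertools import permutations


def restraint(atom_a, atom_b):
    """Order-independent key for a restraint between two (residue, atom) tuples."""
    return frozenset((atom_a, atom_b))


def to_residue_basis(restraints):
    """Collapse atom-level restraints to unordered residue pairs."""
    return {frozenset(res for res, _name in r) for r in restraints}


def overlap_fractions(entries):
    """Map (row, column) to the fraction of the row's restraints also found in the column."""
    fractions = {}
    for row, col in permutations(entries, 2):
        row_set = entries[row]
        shared = row_set & entries[col]
        fractions[(row, col)] = len(shared) / len(row_set) if row_set else 0.0
    return fractions


# Hypothetical toy restraints, for illustration only (not data from the study).
toy = {
    "target": {
        restraint((5, "HA"), (40, "HB2")),
        restraint((5, "HA"), (40, "QG")),
        restraint((12, "H"), (60, "QD1")),
        restraint((20, "HA"), (33, "H")),
    },
    "entry_x": {
        restraint((5, "HA"), (40, "HB3")),
        restraint((12, "H"), (60, "QD1")),
    },
}

by_atom = overlap_fractions(toy)
by_residue = overlap_fractions({name: to_residue_basis(r) for name, r in toy.items()})
```

In this toy case the overlap is asymmetric, as in the heatmaps: a quarter of the target's atom-level restraints appear in `entry_x`, while half of `entry_x`'s restraints appear in the target. On a residue basis the overlap is higher because restraints between the same pair of residues collapse into one.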
Fig. 1 Comparison of targets and entries. a Structural similarity (accuracy): the mean all versus all pairwise backbone RMSD for well-defined residues for each of the entries with respect to the target. The dashed line at 1.5 Å indicates a reasonable upper threshold for identity within experimental uncertainty (see text for details). b The pairwise backbone RMSD for well-defined residues within each ensemble for each of the targets and entries. The dashed line at 1.0 Å indicates an estimated upper threshold for a converged structure. Symbols for each target are indicated on the left. Open symbols indicate entries generated from truncated input sequences. Horizontal axis labels: targets are labeled in green, entries generated from curated lists in black, curated lists plus RDCs in bold-black, un-curated lists in blue, un-curated lists plus RDCs in bold-blue, CS only in magenta, CS plus RDCs in bold magenta and raw data in purple
Median accuracy of paired entries
| Program | Curated (Å) | Un-curated (Å) | Raw (Å) | Number of targets |
|---|---|---|---|---|
| ARIA | 0.78 | 0.91 | – | 5 |
| ASDP-CNS | 1.27 | 1.20 | – | 8 |
| ASDP-ROSETTA | 1.16 | 1.43 | – | 6 |
| CHESHIRE-YAPP | 1.05 | 1.24 | – | 7 |
| CYANA | 0.84 | 0.97 | – | 5 |
| UNIO | – | 1.01 | 1.11 | 6 |
Only targets calculated using both curated and un-curated data (or un-curated and raw data, in the case of UNIO) are included. Note that no program submitted paired entries for all targets and therefore comparison of accuracies made across programs is potentially inappropriate (see text)
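The pairing rule in the table note can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the function name, the data layout and the toy accuracy values are all hypothetical.

```python
from statistics import median


def paired_medians(accuracy, first, second):
    """Median accuracy for two input types over targets that have both.

    accuracy maps target name -> {input type: backbone RMSD in Å}.
    Returns (median for first, median for second, number of paired targets).
    """
    paired = [t for t, vals in accuracy.items() if first in vals and second in vals]
    if not paired:
        raise ValueError("no target has entries for both input types")
    return (
        median(accuracy[t][first] for t in paired),
        median(accuracy[t][second] for t in paired),
        len(paired),
    )


# Hypothetical values for illustration only (not results from the study).
toy_accuracy = {
    "T1": {"curated": 0.8, "uncurated": 1.0},
    "T2": {"curated": 1.2},  # no un-curated entry, so excluded from both medians
    "T3": {"curated": 0.6, "uncurated": 0.9},
    "T4": {"curated": 1.0, "uncurated": 1.4},
}

result = paired_medians(toy_accuracy, "curated", "uncurated")
```

Because each program's medians are taken over its own subset of paired targets, the medians of different programs may rest on different targets, which is why the note cautions against comparing accuracies across programs.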
Fig. 2 Overall quality scores of the targets and the entries. a Molprobity Ramachandran outliers (Lovell et al. 2003). b Molprobity number of clashes per thousand atoms in the ensemble. c WHAT-IF Ramachandran Z-scores (Vriend 1990). d WHAT-IF side chain Z-scores. Symbols and labels are explained in the legend for Fig. 1
Fig. 3 ROG scores (Doreleijers et al. 2012a) of the targets and the entries. a The fraction of residues with a green ROG score. The lower threshold of 0.5 is indicated by a dashed line. b The fraction of residues with a red ROG score. The upper threshold of 0.3 is indicated by a dashed line. Symbols and labels are explained in the legend for Fig. 1
Fig. 4 Agreement with experimental data of the targets and the entries. a The DP score (Huang et al. 2005). The dashed line indicates the lower threshold of 0.75 for agreement between the structure and the input data. b The NOE completeness determined by Wattos. The dashed line indicates the median NOE completeness (44.2 %) for all structures in the NRG-CING database (Doreleijers et al. 2012b). Symbols and labels are explained in the legend for Fig. 1. Only entries calculated from NOESY lists have been included
Fig. 6 Correlation between entry pairwise RMSD and NOE restraint overlap. For every pair of entries for a given target, the all-by-all RMSD and NOE restraint overlap between those entries are shown. NOE restraint overlap is calculated on a (a) pseudo-atom or (b) residue basis. Symbols are explained in the legend for Fig. 1