Antonio Rosato, Wim Vranken, Rasmus H Fogh, Timothy J Ragan, Roberto Tejero, Kari Pederson, Hsiau-Wei Lee, James H Prestegard, Adelinda Yee, Bin Wu, Alexander Lemak, Scott Houliston, Cheryl H Arrowsmith, Michael Kennedy, Thomas B Acton, Rong Xiao, Gaohua Liu, Gaetano T Montelione, Geerten W Vuister.
Abstract
The second round of the community-wide initiative Critical Assessment of automated Structure Determination of Proteins by NMR (CASD-NMR-2013) comprised ten blind target datasets, consisting of unprocessed spectral data, assigned chemical shift lists, and unassigned NOESY peak and RDC lists, that were made available in both curated (i.e. manually refined) and un-curated (i.e. automatically generated) form. Ten structure calculation programs, using fully automated protocols only, generated a total of 164 three-dimensional structures (entries) for the ten targets, sometimes using both curated and un-curated lists to generate multiple entries for a single target. The accuracy of the entries could be established by comparing them to the corresponding manually solved structure of each target, which was not available at the time the data were provided. Across the entire data set, 71% of all entries submitted achieved an accuracy relative to the reference NMR structure better than 1.5 Å. Methods based on NOESY peak lists achieved even better results, with up to 100% of the entries within the 1.5 Å threshold for some programs. However, some methods did not converge for some targets using un-curated NOESY peak lists. Over 90% of the entries achieved an accuracy better than the more relaxed threshold of 2.5 Å that was used in the previous CASD-NMR-2010 round. Comparisons between entries generated with un-curated versus curated peaks show only marginal improvements for the latter in those cases where both calculations converged.
Keywords: Accuracy; Automation; Blind testing; CASD-NMR; Chemical shift; NMR; NOE; Precision; Protein; Quality; Structure determination; Validation
Year: 2015 PMID: 26071966 PMCID: PMC4569658 DOI: 10.1007/s10858-015-9953-4
Source DB: PubMed Journal: J Biomol NMR ISSN: 0925-2738 Impact factor: 2.835
Fig. 1. Side-by-side superimposed backbone ribbon traces and cartoon representations for the ten manually determined CASD-NMR-2013 reference structures, labeled with PDB codes and coloured blue to red from N- to C-terminus. Ill-defined regions are shown in light grey.
Targets and data of the CASD-NMR-2013 round
| Target¹ | Protein length | PDB/BMRB codes | Secondary structure (% α/β/coil/ill-defined) | CS (#) | CS completion² (% 1H/13C/15N) | Un-curated NOESY peaks, 13C (#) | Un-curated NOESY peaks, 13C-Arom. (#) | Un-curated NOESY peaks, 15N (#) | Curated NOESY peaks, 13C (#) | Curated NOESY peaks, 13C-Arom. (#) | Curated NOESY peaks, 15N (#) | 1H-15N RDCs (#) | r (#curated/#un-curated)³ | Well-defined residue range(s)⁴ | Closest homologue⁵ | Homology (% identical/similar)⁶ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HR2876B | 107 | 2LTM/18489 | 23/22/41/13 | 1095 | 90/80/77 | 11,345 | 697 | 2060 | 5231 | 387 | 1436 | 121 | 0.50 | 13–105 | 2K1H | 30/45 |
| HR2876C | 97 | 2M5O/19068 | 40/18/20/23 | 955 | 90/80/85 | 6870 | 217 | 2212 | 4580 | 161 | 1596 | 95 | 0.68 | 17–91 | 1VEH | 93/97 |
| HR5460A | 160 | 2LAH/17524 | 65/1/20/14 | 1734 | 92/79/84 | 12,444 | 634 | 4172 | 8259 | 792 | 2964 | 83 | 0.70 | 14–25, 33–158 | 2WVI | 39/60 |
| HR6430A | 99 | 2LA6/17508 | 19/28/39/13 | 975 | 89/80/81 | 4932 | 265 | 1628 | 4839 | 303 | 1501 | 126 | 0.97 | 14–99 | 2CPE | 63/90 |
| HR6470A | 69 | 2L9R/17484 | 48/0/13/39 | 732 | 86/77/83 | 3103 | 169 | 990 | 3098 | 168 | 950 | 73 | 0.99 | 15–56 | 1NK2 | 69/79 |
| HR8254A | 73 | 2M2E/18909 | 56/0/19/25 | 859 | 96/88/89 | 15,073 | 421 | 3768 | 2549 | 163 | 853 | None | 0.19 | 554–608 | 2CQR | 47/61 |
| OR135 | 83 | 2LN3/18145 | 37/25/23/14 | 916 | 92/82/89 | 4669 | 143 | 2937 | 4680 | 150 | 1529 | 116 | 0.82 | 4–74 | 2L69 | 20/48 |
| OR36 | 134 | 2LCI/17613 | 49/22/17/12 | 1558 | 94/84/91 | 10,846 | 314 | 2634 | 7125 | 209 | 2125 | 165 | 0.69 | 2–46, 53–125 | 1MEJ | 27/33 |
| StT322 | 63 | 2LOJ/18214 | 0/35/30/35 | 681 | 93/85/86 | 7454 | 28 | 1793 | 1596 | 26 | 835 | None | 0.18 | 23–63 | 1ON4 | 34/54 |
| YR313A | 119 | 2LTL/18487 | 28/22/31/19 | 1259 | 91/81/87 | 10,171 | 148 | 1984 | 4897 | 90 | 1605 | 112 | 0.54 | 17–41, 45–115 | 2M6Q | 19/39 |
¹ See Supplementary Table S1 for additional target information.
² Assignments as a percentage of all proton signals, all carbon signals, and backbone-only nitrogen signals, respectively. The totals include signals that are not assignable with standard experiments, such as proline backbone N and exchangeable protons.
³ Ratio of curated to un-curated peak count across all NOESY experiments used.
⁴ Well-defined ranges as determined by CyRange (Kirchner and Guentert 2011).
⁵ Closest homologue in the PDB dated prior to the release date of the target. Where several entries had the same homology, we have given priority to structure ensembles for systems and conditions as similar to the target as possible.
⁶ Percentage of identical/similar residues for the well-defined regions.
⁷ Recorded in D2O.
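As a worked check of the ratio r defined in the footnotes, the following minimal sketch sums the three NOESY peak counts (13C, 13C-aromatic, 15N) for each list type and divides curated by un-curated. The function name `curated_ratio` is hypothetical and introduced only for illustration; the peak counts are taken from the HR2876B and HR6470A rows of the table.

```python
def curated_ratio(curated_counts, uncurated_counts):
    """Ratio of curated to un-curated NOESY peak counts, summed over all spectra."""
    total_uncurated = sum(uncurated_counts)
    if total_uncurated == 0:
        raise ValueError("un-curated peak count must be non-zero")
    return sum(curated_counts) / total_uncurated


# Counts in the order 13C, 13C-aromatic, 15N (values from the table rows).
r_hr2876b = curated_ratio([5231, 387, 1436], [11345, 697, 2060])
r_hr6470a = curated_ratio([3098, 168, 950], [3103, 169, 990])

print(f"HR2876B r = {r_hr2876b:.2f}")  # 0.50, as listed in the table
print(f"HR6470A r = {r_hr6470a:.2f}")  # 0.99, as listed in the table
```

A ratio close to 1 (e.g. HR6470A) means manual curation removed few peaks, whereas a low ratio (e.g. 0.18 for StT322) means most automatically picked peaks were discarded.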
CASD-NMR-2013 participants
| Program | References |
|---|---|
| ASDP (CNS/Rosetta) | Huang et al. ( |
| Aria | Mareuil et al. ( |
| Autonoe | Zhang et al. ( |
| CHESHIRE (YAPP) | Cavalli and Vendruscolo ( |
| CS-HM-Rosetta | Thompson et al. ( |
| Cyana | Herrmann et al. ( |
| Ponderosa | Lee et al. ( |
| BE-Metadynamics | Granata et al. ( |
| i-TASSER | Jang et al. ( |
| Rosetta-web | van der Schot and Bonvin ( |
| UNIO | Guerry et al. ( |
Programs submitting to CASD-NMR-2013 and references to the protocols used
Fig. 2. Targets submitted per program. A target is counted as submitted if there is at least one entry; programs often submitted multiple entries for a single target. Calculations that did not converge are ignored. Colour (see legend) encodes the input data sets used for each target, e.g. curated NOESY peak lists only (light blue), or two entries for one target, one using curated and one using un-curated peak lists (dark blue). Results for ASDP were provided using two different refinement methods (CNS and Rosetta), but at least one submission was provided for each of the 10 targets.
Fig. 3. Accuracy of CASD-NMR-2013 entries based on un-curated information. (a) Number of CASD-NMR-2013 entries based on un-curated NOESY, raw spectral data, or chemical shift (CS)-only data, colour-coded by accuracy. For each program, we include all targets for which at least one entry was submitted. Each column is colour-coded according to the average accuracy of the submitted entry(ies) (compare to Table 2). Green: high accuracy (RMSD bias to the reference <1.5 Å); orange: intermediate accuracy (RMSD bias to the reference <2.5 Å); red: low accuracy. (b) Histograms of the accuracy of entries using curated (blue, n = 55) or un-curated (orange, n = 63) peak lists. The RMSD values were calculated as the average of the pairwise root mean square deviations of the backbone atoms between the conformers in the target and entry ensembles, using the well-defined regions as determined by CyRange (cf. Table 2).
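The accuracy measure described in the caption (average pairwise backbone RMSD between the conformers of the reference and entry ensembles, restricted to well-defined residues) can be sketched as follows. This is a minimal illustration under stated assumptions, not the assessment code used in CASD-NMR: it assumes all conformers are already superimposed in a common frame (the superposition step is omitted), it takes N, CA and C as the backbone atoms, and the data layout and function names are hypothetical.

```python
import math
from itertools import product

# Assumption: backbone is defined as N, CA and C atoms.
BACKBONE = ("N", "CA", "C")


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def mean_pairwise_rmsd(reference, entry, ranges):
    """Average RMSD over all (reference conformer, entry conformer) pairs.

    Each conformer is a dict mapping (residue_number, atom_name) to (x, y, z).
    Only backbone atoms of residues inside the inclusive `ranges` are used.
    Assumption: conformers are already superimposed; no fitting is done here.
    """
    values = []
    for ref_conf, ent_conf in product(reference, entry):
        keys = [
            key
            for key in sorted(ref_conf)
            if key in ent_conf
            and key[1] in BACKBONE
            and any(low <= key[0] <= high for low, high in ranges)
        ]
        values.append(
            rmsd([ref_conf[k] for k in keys], [ent_conf[k] for k in keys])
        )
    return sum(values) / len(values)
```

For a target such as HR5460A, `ranges` would be `[(14, 25), (33, 158)]`, so that the ill-defined segments do not contribute to the reported accuracy.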
Accuracy of CASD-NMR-2013 entries based on un-curated NOESY, raw spectral data, or chemical shift (CS)-only data
| Target | Reference | ARIA (un-curated NOESY) | ASDP (un-curated NOESY) | CYANA (un-curated NOESY) | I-Tasser (un-curated NOESY) | UNIO (un-curated NOESY) | UNIO (raw spectra) | Ponderosa (raw spectra) |
|---|---|---|---|---|---|---|---|---|
| HR2876B | (0.62) | 0.98 (0.64) | 0.94 (0.68) | 1.03 (0.45) | 1.04 (0.58) | 1.29 (0.11) | ||
| HR2876C | (0.53) | 0.81 (0.42) | 1.41 (0.65) | 0.97 (0.30) | 1.11 (n.a.) | 1.12 (0.50) | 1.32 (0.63) | 0.95 (0.17) |
| HR5460A | (0.60) | 1.52 (1.14) | 2.26 (1.40) | 9.33 (0.93) | ||||
| HR6430A | (0.52) | 0.82 (0.43) | 0.95 (0.77) | 0.90 (0.34) | 0.91 (0.73) | 1.13 (0.74) | 1.03 (0.05) | |
| HR6470A | (0.40) | 0.56 (0.42) | 1.00 (0.51) | 0.59 (0.46) | 0.66 (0.51) | 1.09 (0.61) | 0.66 (0.06) | |
| HR8254A | (0.72) | 1.73 (1.32) | 1.31 (0.25) | 1.45 (0.95) | 3.01 (0.02) | |||
| OR135 | (0.64) | 0.90 (0.53) | 0.96 (0.38) | 0.98 (0.38) | 0.98 (0.60) | 1.01 (0.56) | 1.91 (0.28) | |
| OR36 | (0.77) | 1.34 (0.95) | 1.14 (0.33) | 1.43 (0.70) | 1.57 (1.06) | 2.78 (0.27) | ||
| StT322 | (0.57) | 2.56 (1.46) | 3.69 (0.24) | |||||
| YR313A | (0.97) | 1.44 (0.93) | 2.94 (1.82) | 1.67 (0.11) | ||||
| Median | n.a. | 0.82 | 1.34 | 0.91 | 1.21 | 1.08 | 1.55 | 2.63 |
| <1.5 Å | n.a. | 5 (100 %) | 7 (70 %) | 5 (100 %) | 2 (100 %) | 7 (100 %) | 5 (63 %) | 4 (40 %) |
| <2.0 Å | n.a. | 5 (100 %) | 9 (90 %) | 5 (100 %) | 2 (100 %) | 7 (100 %) | 6 (75 %) | 6 (60 %) |
| <2.5 Å | n.a. | 5 (100 %) | 9 (90 %) | 5 (100 %) | 2 (100 %) | 7 (100 %) | 7 (88 %) | 6 (60 %) |
Structure precision is shown in parentheses. When multiple structures were submitted for a given target based on the same method, the average accuracy and precision are given. Structures marked as not converged or incorrect at the time of submission were excluded. At the bottom of the table, we report the median accuracy of each tool. In addition, the number of structures with an accuracy better than a fixed threshold is reported, together with the percentage of entries that were better than (i.e. correct within) the threshold. Tools were grouped according to the main input data used.
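The summary rows at the bottom of the table (median accuracy, and the count and percentage of entries below each threshold) can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name `summarize` and the accuracy values in the usage example are hypothetical and are not taken from the table.

```python
from statistics import median


def summarize(accuracies, thresholds=(1.5, 2.0, 2.5)):
    """Median accuracy plus (count, percentage) of entries below each threshold.

    `accuracies` holds one RMSD value (in Å) per submitted target; targets
    without a converged entry are passed as None and skipped.
    """
    values = [a for a in accuracies if a is not None]
    if not values:
        raise ValueError("no converged entries to summarize")
    summary = {"median": median(values)}
    for threshold in thresholds:
        count = sum(1 for a in values if a < threshold)
        summary[threshold] = (count, round(100 * count / len(values)))
    return summary


# Hypothetical accuracies for one program; None marks a target with no entry.
example = summarize([0.8, 1.2, None, 1.6, 2.2, 3.0])
print(example)
# {'median': 1.6, 1.5: (2, 40), 2.0: (3, 60), 2.5: (4, 80)}
```

Percentages are computed relative to the number of targets for which the program submitted a converged entry, which is why different columns can have different denominators.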