Fabien Mareuil, Thérèse E Malliavin, Michael Nilges, Benjamin Bardiaux.
Abstract
In biological NMR, assignment of NOE cross-peaks and calculation of atomic conformations are critical steps in the determination of reliable high-resolution structures. ARIA is an automated approach that performs NOE assignment and structure calculation concomitantly in an iterative procedure. The log-harmonic shape of the distance restraint potential and the Bayesian weighting of distance restraints, recently introduced in ARIA, were shown to significantly improve the quality and accuracy of the determined structures. In this paper, we propose two modifications of the ARIA protocol: (1) the softening of the force field together with adapted hydrogen radii, which is meaningful in the context of the log-harmonic potential with Bayesian weighting, and (2) a procedure that automatically adjusts the violation tolerance used in the selection of active restraints, based on the fitting of the structure to the input data sets. The new ARIA protocols were fine-tuned on a set of eight protein targets from the CASD-NMR initiative. As a result, the convergence problems previously observed for some targets were resolved and the obtained structures exhibited better quality. In addition, the new ARIA protocols were applied to the structure calculation of ten new CASD-NMR targets in a blind fashion, i.e. without knowing the actual solution. Even though optimisation of parameters and pre-filtering of unrefined NOE peak lists were necessary for half of the targets, ARIA consistently and reliably determined very precise and highly accurate structures in all cases. In the context of integrative structural biology, an increasing number of experimental methods are used that produce distance data for the determination of 3D structures of macromolecules, stressing the importance of methods that successfully make use of ambiguous and noisy distance data.
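To make the two restraint potential shapes named in this work more concrete, the following is a minimal Python sketch, not ARIA's implementation. It assumes the commonly quoted log-harmonic form, in which the energy grows with the squared logarithm of the ratio between the model distance and the target distance, and a plain flat-bottom potential with harmonic walls. The function names and the fixed `weight` argument are hypothetical; in particular, the Bayesian estimation of the weight is not reproduced here.

```python
import math


def log_harmonic_energy(d, d0, weight=1.0):
    """Assumed log-harmonic form: weight * ln(d / d0)**2.

    d      -- distance measured in the model (Å), must be > 0
    d0     -- target distance derived from the NOE (Å), must be > 0
    weight -- restraint weight (fixed here; hypothetical stand-in for the
              Bayesian weight mentioned in the abstract)
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    return weight * math.log(d / d0) ** 2


def flat_bottom_harmonic_energy(d, lower, upper, weight=1.0):
    """Zero inside [lower, upper], quadratic penalty outside the bounds."""
    if d < lower:
        return weight * (lower - d) ** 2
    if d > upper:
        return weight * (d - upper) ** 2
    return 0.0
```

Under this assumed form, the log-harmonic energy has a single minimum at the target distance and penalises deviations by their ratio, so doubling and halving the distance cost the same, whereas the flat-bottom potential is indifferent to any distance inside its bounds.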
Keywords: ARIA; Automated NOE assignment; CASD–NMR; Nuclear magnetic resonance; Structure determination
Year: 2015 PMID: 25861734 PMCID: PMC4569677 DOI: 10.1007/s10858-015-9928-5
Source DB: PubMed Journal: J Biomol NMR ISSN: 0925-2738 Impact factor: 2.835
Protein targets from the CASD–NMR 1 data set (Rosato et al. 2012), used for the development of ARIA protocols presented here
| Target name | Sequence length | No. of peak lists | No. of peaks | Residue range for RMSD | PDB entry |
|---|---|---|---|---|---|
| VpR247 | 102 | 3 | 5756 | 2–13,21–31,35–46,57–58,68–80,92–97 | 2KIF |
| atc0905 | 118 | 3 | 8036 | 4–19,22–27,36–41,61–66,70–93,97–102 | 2KNR |
| PGR122A | 73 | 3 | 3515 | 418–423,426–432,437–443,447–451,453–457,460–462,472–478 | 2KMM |
| HR5537A | 135 | 2 | 8370 | 39–54,59–79,83–105,117–134 | 2KK1 |
| ET109A_ox | 102 | 3 | 6751 | 91–101,107–110,129–133,140–155,168–170,174–180,184–188 | 2KKY |
| ET109A_red | 102 | 3 | 6474 | 91–101,107–110,114–117,146–155,177–180,184–188 | 2KKX |
| CtR69A | 63 | 3 | 1975 | 8–16,19–36,43–53 | 2KRU |
| CGR26A | 146 | 3 | 5133 | 57–59,66–83,86–92,100–111,116–132,138–154,157–168 | 2KPT |
| NeR103A | 105 | 3 | 4648 | 23–33,42–52,58–61,67–76,91–96 | 2KPM |
For each protein, the number of residues, the number of peak lists, and the total number of peaks, as well as the residue ranges used for RMSD calculations and the corresponding PDB entry, are given
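The residue ranges in the table are written as comma-separated spans joined by an en dash. A small helper like the one below could expand such a string into an explicit residue list before an RMSD calculation. This is only an illustrative sketch; the function name `parse_residue_ranges` is hypothetical and is not part of ARIA.

```python
def parse_residue_ranges(spec):
    """Expand a string such as '8–16,19–36,43–53' into a list of residue numbers.

    Accepts either an en dash or an ASCII hyphen between the first and last
    residue of a span, and tolerates spaces after commas.
    """
    residues = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # Normalise the en dash used in the table to an ASCII hyphen.
        start, _, end = part.replace("–", "-").partition("-")
        first = int(start)
        last = int(end) if end else first
        if last < first:
            raise ValueError(f"descending range: {part!r}")
        residues.extend(range(first, last + 1))
    return residues


# Residue range listed for target CtR69A in the table above.
ctr69a = parse_residue_ranges("8–16,19–36,43–53")
```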
Protein targets from the CASD–NMR 2 data set
| Target name | Sequence length | No. of peak lists | No. of peaks (unrefined/refined) | Residue range for RMSD | PDB entry |
|---|---|---|---|---|---|
| HR6470A | 69 | 3 | 4262/4216 | 12–58 | 2L9R |
| HR6430A | 99 | 3 | 6825/6643 | 14–99 | 2LA6 |
| HR5460A | 160 | 3 | 17,250/12,015 | 12–158 | 2LAH |
| OR36 | 129 | 3 | 13,794/9459 | 2–128 | 2LCI |
| OR135 | 83 | 3 | 7749/6359 | 4–73 | 2LN3 |
| StT322 | 63 | 4 | 12,437/2727 | 26–62 | 2LOJ |
| HR2876B | 107 | 3 | 14,102/7054 | 12–105 | 2LTM |
| YR313A | 119 | 3 | 12,303/6592 | 17–111 | 2LTL |
| HR8254A | 73 | 3 | 19,262/3565 | 553–612 | 2M2E |
| HR2876C | 97 | 3 | 9299/6337 | 17–93 | 2M5O |
For each protein, the number of residues, the number of peak lists, and the total number of peaks (unrefined and refined), as well as the residue ranges used for RMSD calculation and corresponding PDB entry, are given
van der Waals radii of hydrogen atoms for hydrogen–hydrogen interactions in the version of the PROLSQ force field used in ARIA
| Hydrogen type | CNS atom type | Former radius (Å) | New radius (Å) |
|---|---|---|---|
| H aliphatic | HA | 1.0 | 1.2 |
| H amide | H | 0.8 | 1.0 |
| H charged | HC | 0.8 | 1.0 |
Fig. 1 Average WHAT-IF RMS Z-scores according to the distance potential and the force field parameters used for CASD–NMR 1 targets calculated with ARIA. The WHAT-IF RMS Z-scores for bond angles, peptide-bond dihedral angles, side-chain planarity and improper dihedral angles are reported (average and standard deviation among all conformers calculated for CASD–NMR 1 targets)
Fig. 2 Quality scores according to the distance potential and the force field parameters used for CASD–NMR 1 targets calculated with ARIA. (Top) MolProbity score versus MolProbity clashscore on a log scale. Reference denotes the corresponding structure deposited in the PDB. (Bottom) Accuracy versus MolProbity quality score
Fig. 3 Percentages of green residues determined using the CING ROG score on the conformations obtained in the last iteration and after water refinement for CASD–NMR 1 targets calculated with ARIA
Precision (convergence) and accuracy (RMSD from the reference structure) for the CASD–NMR targets VpR247 and atc0905, using the standard or the adaptive criterion to determine the violation tolerance
| Target name | Potential/force field | No. of conformers per iteration | Violation tolerance | Backbone precision (Å) | Backbone accuracy (Å) |
|---|---|---|---|---|---|
| VpR247 | FBHW | 50 | Standard | 0.53 ± 0.10 | 1.75 |
| VpR247 | LogH | 50 | Standard | 7.70 ± 1.19 | 9.44 |
| VpR247 | LogHs* | 50 | Standard | 5.15 ± 1.37 | 6.30 |
| VpR247 | LogHs* | 500 | Standard | 1.31 ± 0.49 | 1.41 |
| VpR247 | LogHs* | 200 | Manualᵃ | 0.73 ± 0.27 | 1.25 |
| VpR247 | LogHs* | 50 | Adaptive | 0.77 ± 0.14 | 1.12 |
| atc0905 | FBHW | 50 | Standard | 1.87 ± 0.39 | 1.52 |
| atc0905 | LogH | 50 | Standard | 1.20 ± 0.43 | 1.46 |
| atc0905 | LogHs* | 50 | Standard | 0.72 ± 0.18 | 1.55 |
| atc0905 | LogHs* | 50 | Adaptive | 0.54 ± 0.15 | 1.34 |
ᵃ Manual determination of the optimal violation tolerance parameter t to achieve convergence (final values: 1000, 6, 4, 2, 2, 2, 2, 1.1, 1.1 Å)
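The role of the violation tolerance t can be illustrated with a short Python sketch. It rests on assumptions that this page does not state: a restraint is treated as violated when its distance lies outside the bounds by more than t in more than half of the conformers, and the nine values in the footnote are read as one tolerance per iteration. The names `is_violated` and `select_active`, the 0.5 threshold, and the toy restraints are hypothetical. The adaptive adjustment of t proposed in the paper is not reproduced.

```python
# Manually tuned tolerances (Å) from the table footnote, assumed to be one per iteration.
MANUAL_TOLERANCES = [1000.0, 6.0, 4.0, 2.0, 2.0, 2.0, 2.0, 1.1, 1.1]


def is_violated(distances, lower, upper, tolerance, threshold=0.5):
    """Return True if the restraint is violated in more than `threshold` of conformers.

    distances -- restrained distance in each conformer (Å)
    lower, upper -- restraint bounds (Å)
    tolerance -- violation tolerance t (Å) added to both bounds
    """
    n_violating = sum(
        1 for d in distances if d < lower - tolerance or d > upper + tolerance
    )
    return n_violating / len(distances) > threshold


def select_active(restraints, tolerance):
    """Names of restraints kept active for the next iteration."""
    return [
        name
        for name, (distances, lower, upper) in restraints.items()
        if not is_violated(distances, lower, upper, tolerance)
    ]


# Toy data (invented for illustration): name -> (distances per conformer, lower, upper).
restraints = {
    "r1": ([3.0, 3.2, 3.1], 1.8, 4.0),
    "r2": ([7.5, 8.0, 7.8], 1.8, 5.0),
}
active_per_iteration = [select_active(restraints, t) for t in MANUAL_TOLERANCES]
```

With these toy numbers, both restraints stay active while the tolerance is 4 Å or larger, and `r2` is discarded once the tolerance drops to 2 Å, which shows why a schedule that tightens too early can remove restraints before the ensemble has converged.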
Fig. 4 Conformer ensembles determined by ARIA according to the method used to determine the restraint violation tolerance for the CASD–NMR 1 target VpR247. The average structure of the reference PDB entry is shown in blue. a Standard tolerance and 50 conformers per iteration. b Standard tolerance and 500 conformers per iteration. c Manual monitoring of the tolerance and 200 conformers per iteration. d Adaptive tolerance and 50 conformers per iteration
Fig. 5 Average validation scores of the structures determined by ARIA on CASD–NMR 2 targets. Blind calculations starting from unrefined peak lists are represented as triangles (successful) or squares (unsuccessful), while blind calculations starting from refined peak lists are represented as diamonds. Structures re-calculated from unrefined peak lists using manually optimised parameters are shown as dots. a Ensemble precision (average backbone RMSD between the conformers and the average conformer). b Ensemble accuracy (backbone RMSD between the average conformer and the average reference PDB structure). c Average GDT_TS (Global distance test, total score) between the average conformer and the average reference PDB structure. d Average GDT_HA (Global distance test, high accuracy) between the average conformer and the average reference PDB structure. e CING percentage of green residues. f Average backbone normality Z-score reported by WHAT-IF. g Average χ1/χ2 correlation Z-score reported by WHAT-IF. h Average Ramachandran plot appearance Z-score reported by WHAT-IF. i Average 2nd generation packing quality Z-score reported by WHAT-IF. j Average MolProbity clashscore Z-score reported by PSVS. k Average Procheck Z-score reported by PSVS
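The definitions of ensemble precision and accuracy given for panels a and b can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the paper: it assumes that all conformers and the reference are already superposed on the same frame and restricted to the same backbone atoms in the same order, so no fitting step is performed. The function names are hypothetical.

```python
import math


def mean_structure(conformers):
    """Average coordinates over conformers (each a list of (x, y, z) tuples)."""
    n = len(conformers)
    n_atoms = len(conformers[0])
    return [
        tuple(sum(conf[i][k] for conf in conformers) / n for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(a, b):
    """Root-mean-square deviation between two pre-superposed coordinate lists."""
    if len(a) != len(b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(squared / len(a))


def precision(conformers):
    """Average RMSD between each conformer and the average conformer."""
    average = mean_structure(conformers)
    values = [rmsd(conf, average) for conf in conformers]
    return sum(values) / len(values)


def accuracy(conformers, reference):
    """RMSD between the average conformer and a reference structure."""
    return rmsd(mean_structure(conformers), reference)


# Toy two-atom ensemble (invented): two conformers offset by 2 Å along z.
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
```

In practice each conformer would first be fitted onto a common frame over the ordered residue ranges listed in the tables, a step that is deliberately left out of this sketch.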
Fig. 6 Overview of structures obtained with ARIA calculations for the ten targets from the CASD–NMR 2 data set. For each target, the average ARIA conformer is overlaid with the average reference PDB structure (in blue). Structures obtained by blind calculation from unrefined and refined peak lists are shown in red and orange, respectively. Structures re-calculated from unrefined peak lists using manually optimised parameters are shown in pink. Only the regions corresponding to ordered residues, determined by PSVS on the reference PDB structures, are drawn
Fig. 7 Example of peak list filtering results for the 3D NOESY peak lists of CASD–NMR target HR8254A. The cross-peak positions are projected on the – plane. For each peak list, the number of cross-peaks is given along with the percentage of cross-peaks having a match in the refined peak list. See “Material and methods” section for a definition of the filters applied