Asaminew H Aytenfisu, Aleksandar Spasic, Alan Grossfield, Harry A Stern, David H Mathews.
Abstract
The backbone dihedral parameters of the Amber RNA force field were improved by fitting using multiple linear regression to potential energies determined by quantum chemistry calculations. Five backbone and four glycosidic dihedral parameters were fit simultaneously to reproduce the potential energies determined by a high-level density functional theory calculation (B97D3 functional with the AUG-CC-PVTZ basis set). Umbrella sampling was used to determine conformational free energies along the dihedral angles, and these agree better with the population of conformations observed in the Protein Data Bank for the new parameters than for the conventional parameters. Molecular dynamics simulations performed on a set of hairpin loops, duplexes and tetramers with the new parameter set show improved modeling for the structures of tetramers CCCC, CAAU, and GACC, and an RNA internal loop of noncanonical pairs, as compared to the conventional parameters. For the tetramers, the new parameters largely avoid the incorrect intercalated structures that dominate the conformational samples from the conventional parameters. For the internal loop, the major conformation solved by NMR is stable with the new parameters, but not with the conventional parameters. The new force field performs similarly to the conventional parameters for the UUCG and GCAA hairpin loops and the [U(UA)6A]2 duplex.
Year: 2017 PMID: 28048939 PMCID: PMC5312698 DOI: 10.1021/acs.jctc.6b00870
Source DB: PubMed Journal: J Chem Theory Comput ISSN: 1549-9618 Impact factor: 6.006
Figure 1. Procedure for fitting force field parameters. A diverse database of conformations was generated from X-ray structures and from dihedral scanning. For X-ray structures, bonds and angles were fixed to an A-form reference value using sander from the Amber software package.[24] The dihedral scan structures were generated by energy minimization with restraints on selected torsions. Additional restraints were applied on bonds and angles to set them to the A-form reference value. The database was then reduced in size by considering sequence identity and dihedrals to remove redundancies. Structures generated by scanning and taken from the PDB were both used for the linear least-squares fit.
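The linear least-squares step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the target is the QM energy minus the force-field energy evaluated without the torsion terms being fit, and it expands each torsion as a cosine/sine series so that the problem stays linear in the coefficients. The function name `fit_dihedral_terms`, the choice of three periodicities, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_dihedral_terms(angles_deg, target_energy, n_max=3):
    """Fit E ~ c0 + sum over torsions and n of [a_n cos(n phi) + b_n sin(n phi)].

    angles_deg:    (n_conf, n_torsions) dihedral angles in degrees
    target_energy: (n_conf,) energies to reproduce, in kcal/mol (assumed to be
                   E_QM minus the force-field energy without the fitted torsions)
    Returns (coeffs, residual_rms). coeffs is ordered as
    [c0, then for each torsion, for n = 1..n_max: a_n, b_n].
    """
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    if phi.ndim == 1:
        phi = phi[:, None]
    y = np.asarray(target_energy, dtype=float)

    columns = [np.ones(phi.shape[0])]  # constant energy offset
    for t in range(phi.shape[1]):
        for n in range(1, n_max + 1):
            columns.append(np.cos(n * phi[:, t]))
            columns.append(np.sin(n * phi[:, t]))
    design = np.column_stack(columns)

    # Ordinary linear least squares on the design matrix.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = design @ coeffs - y
    return coeffs, float(np.sqrt(np.mean(residual ** 2)))

# Synthetic check with two torsions and a known energy surface.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 360.0, size=(500, 2))
phi = np.radians(angles)
energy = 1.0 + 0.8 * np.cos(phi[:, 0]) - 0.3 * np.sin(2 * phi[:, 1])
coeffs, rms = fit_dihedral_terms(angles, energy)
```

Because cos(nφ − γ) = cos(nφ)cos(γ) + sin(nφ)sin(γ), each fitted pair (a_n, b_n) can be converted to the amplitude and phase of an Amber-style term (V_n/2)[1 + cos(nφ − γ)], with the constant parts absorbed into c0.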
Sequence Composition of All Dinucleotides and Nucleosides Used in Fitting
| sequence | no. conformations | sequence | no. conformations |
|---|---|---|---|
| AA | 2198 | CA | 1913 |
| AC | 1832 | CC | 1756 |
| AG | 2152 | CG | 1782 |
| AU | 1903 | CU | 1610 |
| GA | 2048 | UA | 1737 |
| GC | 1714 | UC | 1700 |
| GG | 1753 | UG | 1728 |
| GU | 1785 | UU | 1636 |
| A | 717 | G | 724 |
| C | 718 | U | 718 |
Fitting Torsion Names, Atom Names, and Atom Types According to Amber Force Field Nomenclature
| no. | torsion name | atom name | atom type | alternative paths |
|---|---|---|---|---|
| 1 | α | O3 | OS/OH–P–OS–CI | OP1; OP2 |
| 2 | β | P | P–OS–CI–CF | P; P |
| 3 | γ | O5 | OS/OH–CI–CF–CT | H5; H5; H5; H5; H5; H5; O5; O5 |
| 4 | ε | C4 | CF–CT–OS–P | H3; C2 |
| 5 | ζ | C3 | CT–OS–P–OS | C3; C3′ |
| 6 | χ adenine | O4′ | OS–CT–N*–C5 | O4; C2; C2; H1; H1 |
| 7 | χ cytosine | O4 | OS–CT–N*–C4 | O4; C2; C2; H1; H1 |
| 8 | χ guanine | O4 | OS–CT–N*–CP | O4; C2; C2; H1; H1 |
| 9 | χ uracil | O4 | OS–CT–N*–CS | O4; C2; C2; H1; H1 |
The atom type “OS/OH” covers the atom types of both terminal and internal O3′ atoms.
RNA Molecules Used for Molecular Dynamics Simulations
| sequence | source | model no. | PDB ID | no. NaCl |
|---|---|---|---|---|
| 5′ GGUGAAGGC 3′ | NMR | 22 | 2DD2 | 7 |
| 3′ CCGAAGCCG 5′ | | | | |
| 5′ GGCACUUCGGUGCC 3′ | NMR | 11 | 2KOC | 6 |
| 5′ U(UA)6A 3′ | X-ray | 1 | 1RNA | 11 |
| 3′ A(AU)6U 5′ | | | | |
| 5′ GCGCAAGC 3′ | NMR | 6 | 1ZIH | 5 |
| 5′ AAAA 3′ | single strand of A-form RNA built using NAB module of Amber | | | 3 |
| 5′ CAAU 3′ | | | | |
| 5′ CCCC 3′ | | | | |
| 5′ GACC 3′ | | | | |
| 5′ UUUU 3′ | | | | |
| 5′ CCCC 3′ | C2′ endo anti (taken from the work of Condon et al.) | | | |
| | C2′ endo syn | | | |
| | C3′ endo syn | | | |
Two base pairs were removed from the 1ZIH sequence for simulations relative to the solution structure.
Mean Structural Parameters for Watson–Crick Duplex [U(UA)6A]2
| parameter | this work | ff10 | X-ray |
|---|---|---|---|
| local base-pair parameters | |||
| shear [Å] | 0.02 (0.9) | 0.0 (0.5) | –0.1 (0.4) |
| stretch [Å] | –0.05 (0.5) | 0.01 (0.3) | –0.2 (0.1) |
| stagger [Å] | 0.08 (0.5) | –0.04 (0.6) | –0.01 (0.2) |
| buckle [deg] | 0.03 (13) | 0.5 (15) | 1.0 (5) |
| propeller [deg] | –12.2 (12) | –14.1 (14) | –18.8 (2) |
| opening [deg] | –0.3 (12) | 1.9 (9) | 0.1 (3) |
| local base-pair step parameters | |||
| shift [Å] | 0.0 (0.8) | –0.01 (0.7) | 0.03 (0.4) |
| slide [Å] | –1.4 (0.7) | –1.2 (0.6) | –1.3 (0.1) |
| rise [Å] | 3.1 (0.8) | 3.1 (0.9) | 3.3 (0.2) |
| tilt [deg] | 0.0 (6) | 0.07 (6) | –0.2 (3) |
| roll [deg] | 8.5 (8) | 12.2 (9) | 10.7 (5) |
| twist [deg] | 29.1 (10) | 27.7 (8) | 31.1 (5) |
| local base-pair helical parameters | |||
| X-displacement [Å] | –3.7 (3) | –3.8 (2) | –4.1 (1) |
| Y-displacement [Å] | 0.01 (2) | 0.03 (2) | –0.1 (0.7) |
| helical rise [Å] | 2.6 (1) | 2.3 (1) | 2.7 (0.3) |
| inclination [deg] | 14.5 (14) | 21.1 (15) | 19.5 (10) |
| tip [deg] | –0.03 (11) | –0.1 (10) | 0.02 (6) |
| helical twist [deg] | 31.4 (12) | 31.6 (10) | 33.3 (4) |
Closing base pairs have been left out of the analysis to avoid noise from end fraying. The data presented are averages over all other nucleotides and over all four trajectories. The second column gives results using the revised force field from this work, and the third column gives results from simulations with the conventional Amber force field. The last column contains the values from the starting X-ray structure (PDB 1RNA). The values in parentheses are standard deviations.
Figure 2. Dihedral term potential energy (kcal/mol) as a function of dihedral angle for Amber99 (red),[25] Amber ff10 (Zgarbova et al. χ and α/γ bsc0; green),[17,18] and this work (blue). For comparison, the average energy of each curve was set to zero.
Figure 3. Dihedral potentials of mean force (PMF) for nine torsions. Shown are the new dihedrals from this work (blue), Amber ff10[17,18] with χ and α/γ bsc0 correction (green), and a statistical potential derived from a set of crystal structures in the PDB (red). For bins of statistical data where there were no representatives in the PDB, the points are not plotted. The PMF is an average over 16 RNA dinucleotide molecules in explicit TIP3P water.
RMSD of Umbrella Sampling of Torsion Angles with Respect to Experimental Statistical Potential
| torsion | ff10 | this work |
|---|---|---|
| α | 1.12 | 0.60 |
| β | 1.36 | 1.41 |
| γ | 2.24 | 1.41 |
| ε | 2.31 | 1.86 |
| ζ | 1.37 | 0.92 |
| χ adenine | 1.54 | 1.95 |
| χ guanine | 1.56 | 2.02 |
| χ cytosine | 2.14 | 1.52 |
| χ uracil | 2.02 | 1.37 |
RMSD is calculated over 5° bins and is given in kcal/mol. Bins that were empty because there were no examples in the PDB were excluded from the RMSD calculation.
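The comparison described in this note can be sketched as below. This is an illustration only: it assumes the statistical potential is obtained by Boltzmann inversion of binned PDB counts at 300 K, and that both curves are shifted to zero mean over the shared bins before comparison; neither the temperature nor the alignment is stated here. The function names are hypothetical.

```python
import math

# k_B * T in kcal/mol at an assumed temperature of 300 K.
KT_KCAL_MOL = 0.0019872041 * 300.0

def statistical_potential(counts, kT=KT_KCAL_MOL):
    """Boltzmann-invert a histogram of angles: -kT ln(p) per bin.

    Bins with no observations have no defined energy and are returned as None.
    """
    total = sum(counts)
    return [(-kT * math.log(c / total)) if c > 0 else None for c in counts]

def pmf_rmsd(pmf, reference):
    """RMSD (kcal/mol) between a PMF and a reference potential.

    Bins where the reference is None (empty in the PDB) are skipped.
    Both curves are shifted to zero mean over the remaining bins (assumption).
    """
    pairs = [(p, r) for p, r in zip(pmf, reference) if r is not None]
    mean_p = sum(p for p, _ in pairs) / len(pairs)
    mean_r = sum(r for _, r in pairs) / len(pairs)
    squared = [((p - mean_p) - (r - mean_r)) ** 2 for p, r in pairs]
    return math.sqrt(sum(squared) / len(squared))

# Toy example with four bins, one of them empty.
counts = [10, 0, 30, 60]
reference = statistical_potential(counts)
# A PMF that differs from the reference only by a constant offset.
pmf = [r + 2.5 if r is not None else 99.0 for r in reference]
rmsd = pmf_rmsd(pmf, reference)
```

For 5° bins over a full rotation, `counts` would have 72 entries, one per bin.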
Figure 4. Histogram of mass-weighted RMSD to A-form-like reference. Histograms are provided for AAAA (panel A), CAAU (panel B), CCCC (panel C), GACC (panel D), and UUUU (panel E). Each histogram was generated by merging the conformations from four independent simulations. The bin widths are 0.01 Å. For major peaks in the histogram, corresponding centroid structures from clustering (see the clustering results table) are labeled.
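A mass-weighted RMSD and a fixed-width histogram can be computed as in the sketch below. It assumes the two structures have already been superimposed, which a real analysis tool would do first, and the function names are hypothetical.

```python
import math
from collections import Counter

def mass_weighted_rmsd(coords, ref_coords, masses):
    """sqrt( sum_i m_i |r_i - r_ref_i|^2 / sum_i m_i ), in the units of coords.

    coords and ref_coords are sequences of (x, y, z) tuples for matching atoms,
    assumed to be already superimposed.
    """
    total_mass = sum(masses)
    weighted = sum(
        m * sum((a - b) ** 2 for a, b in zip(p, q))
        for m, p, q in zip(masses, coords, ref_coords)
    )
    return math.sqrt(weighted / total_mass)

def histogram(values, bin_width=0.01):
    """Count values per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return dict(sorted(counts.items()))

# Toy example: two atoms, the lighter one displaced by 2 Å along x.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
value = mass_weighted_rmsd(coords, ref, masses=[12.0, 4.0])

# RMSD values pooled from several trajectories go into one histogram.
hist = histogram([1.234, 1.236, 2.505])
```

Merging trajectories before histogramming, as the caption describes, amounts to concatenating the per-frame RMSD lists before calling `histogram`.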
Clustering Results from the Combined Trajectories of All Five Tetramers Run Using the Current Amber Force Field (ff10) and the Force Field Derived in This Work
| sequence | force field | % noise | cluster no. | % of frames | average distance within cluster (Å) | standard deviation within cluster (Å) | average distance between clusters (Å) | RMSD of centroid to A-form (Å) |
|---|---|---|---|---|---|---|---|---|
| AAAA | ff10 | 32.7 | 1 | 55.9 | 2.67 | 0.72 | 4.49 | 1.89 |
| | | | 2 | 4.2 | 1.08 | 0.38 | 4.02 | 5.27 |
| | | | 3 | 2.5 | 0.78 | 0.24 | 3.73 | 5.01 |
| | | | 4 | 2 | 1.39 | 0.35 | 4.33 | 4.42 |
| | this work | 39.7 | 1 | 28.1 | 1.67 | 0.54 | 3.89 | 2.42 |
| | | | 2 | 9.2 | 2.18 | 0.67 | 3.80 | 3.68 |
| | | | 3 | 5.5 | 1.53 | 0.40 | 3.63 | 4.74 |
| | | | 4 | 5.4 | 1.68 | 0.43 | 4.16 | 1.33 |
| CAAU | ff10 | 13.3 | 1 | 73.3 | 0.97 | 0.48 | 4.57 | 4.60 |
| | | | 2 | 8.8 | 2.00 | 0.53 | 3.26 | 2.65 |
| | | | 3 | 3.5 | 1.94 | 0.65 | 3.79 | 1.00 |
| | | | 4 | 0.7 | 0.96 | 0.36 | 4.04 | 4.16 |
| | this work | 19.1 | 1 | 66.0 | 1.43 | 0.52 | 3.04 | 2.52 |
| | | | 2 | 10.0 | 1.42 | 0.45 | 3.47 | 1.46 |
| | | | 3 | 3.6 | 1.06 | 0.36 | 3.39 | 4.56 |
| | | | 4 | 1.0 | 0.88 | 0.28 | 3.13 | 4.00 |
| CCCC | ff10 | 6.2 | 1 | 72.7 | 1.13 | 0.43 | 3.75 | 4.87 |
| | | | 2 | 11.4 | 1.76 | 0.76 | 4.02 | 1.91 |
| | | | 3 | 7.7 | 1.88 | 0.64 | 3.97 | 2.78 |
| | | | 4 | 0.8 | 0.99 | 0.42 | 3.81 | 4.40 |
| | this work | 3.3 | 1 | 93.9 | 1.71 | 0.64 | 4.11 | 2.25 |
| | | | 2 | 1.6 | 1.25 | 0.42 | 4.96 | 4.76 |
| | | | 3 | 1.1 | 0.83 | 0.27 | 5.21 | 4.40 |
| GACC | ff10 | 20.0 | 1 | 21.3 | 1.66 | 0.77 | 3.93 | 1.46 |
| | | | 2 | 19.6 | 1.76 | 0.61 | 3.98 | 2.61 |
| | | | 3 | 7.7 | 1.34 | 0.48 | 3.85 | 4.83 |
| | | | 4 | 6.9 | 1.11 | 0.35 | 5.43 | 5.89 |
| | this work | 15.0 | 1 | 45.0 | 1.43 | 0.53 | 3.78 | 2.47 |
| | | | 2 | 28.0 | 1.18 | 0.41 | 3.78 | 1.74 |
| | | | 3 | 3.8 | 0.79 | 0.24 | 4.23 | 4.44 |
| | | | 4 | 2.6 | 0.92 | 0.30 | 4.38 | 4.75 |
| UUUU | ff10 | 28.5 | 1 | 39.8 | 1.52 | 0.61 | 4.41 | 5.64 |
| | | | 2 | 9.9 | 1.27 | 0.40 | 4.43 | 5.79 |
| | | | 3 | 8.8 | 2.04 | 0.57 | 4.52 | 2.67 |
| | | | 4 | 4.7 | 1.27 | 0.40 | 4.29 | 5.04 |
| | this work | 18.5 | 1 | 72.8 | 1.43 | 0.60 | 3.06 | 2.78 |
| | | | 2 | 3.7 | 0.88 | 0.33 | 4.33 | 4.28 |
| | | | 3 | 3.3 | 1.41 | 0.43 | 3.23 | 1.53 |
| | | | 4 | 1.0 | 1.22 | 0.29 | 3.67 | 3.71 |
Only the top 4 clusters are presented. The sizes of the remaining clusters (if they existed) were always less than 5% of total frames in trajectory. Starting from the left, columns denote molecule type, force field used, % of noise frames (those that did not get placed in a cluster), cluster number, % of frames within that cluster, average distance and standard deviation between elements of that cluster, average distance between that and all other clusters, and finally mass-weighted RMSD of the centroid of the cluster to the A-form conformation of the respective molecule.
Figure 5. Comparison of Amber ff10 (right panels, green) to the dihedral parameters fit in this work (left panels, blue) for dynamics of the Watson–Crick duplex, 5′ U(UA)6A 3′.[66] Mass-weighted atomic RMSD to the solution structure is shown as a function of time for four independent simulations. The higher RMSD in simulation 3 with ff10 reflects unfolding of four nucleotides of the 5′ terminal pairs. For comparison, these RMSDs as a function of time when excluding terminal base pairs are provided in Figure S5. The same trends are seen in both plots.
Mean Values of Backbone and Glycosidic Torsions, as Well as Sugar Puckers, from the Simulations of Watson–Crick Duplex [U(UA)6A]2
| torsion | this work | ff10 | X-ray | A-RNA PDB average |
|---|---|---|---|---|
| α [deg] | 263 (66) | 277 (33) | 287 (36) | 295 (8) |
| β [deg] | 167 (32) | 174 (11) | 171 (14) | 174 (8) |
| γ [deg] | 86 (54) | 64 (16) | 65 (31) | 54 (6) |
| δ [deg] | 83 (15) | 81 (13) | 80 (5) | 81 (3) |
| ε [deg] | 217 (18) | 203 (16) | 213 (17) | 212 (10) |
| ζ [deg] | 268 (54) | 284 (40) | 280 (12) | 289 (7) |
| χ [deg] | 200 (20) | 208 (17) | 201 (9) | NA |
| pucker (% C3′-endo) | 95.7 | 97.1 | 100 | NA |
Closing base pairs have been left out to avoid noise from fraying of ends. The numbers are values in degrees averaged over all internal nucleotides and over all trajectories, except for pucker, where the % of C3′-endo conformation is indicated. The second column contains the results using the parameters derived in this work, and the third column contains the results from the conventional Amber force field. The fourth column contains average values taken from the starting X-ray structure (PDB ID 1RNA). The fifth column contains averages from analysis of many structures with the A-RNA conformation taken from the work of Richardson et al.[5] The values in parentheses are standard deviations. The notes indicate angles where the average is affected by the presence of a secondary peak. See Figure S6 for more details.
20% of population is in a trans conformation with a peak at 150°, 76.4% is in a gauche– population with a peak at 295°, and the remaining 3.6% is in gauche+ conformation.
The X-ray structure has 2 out of 24 nucleotides in trans position, that is, 8%; the rest is gauche–.
21.5% of population is in a trans conformation with a peak at 180°; the remaining 78.5% is in a gauche+ population with a peak at 58°.
The X-ray structure has 2 out of 24 nucleotides in trans position, that is, 8%; the rest is gauche+.
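Populations like those in the notes above can be tallied with a few lines of code. The sketch below is illustrative: it assumes the common convention of 120° windows (gauche+ in [0°, 120°), trans in [120°, 240°), gauche– in [240°, 360°)), which is consistent with the quoted peak positions but is not stated in the source, and the function names are hypothetical. A circular mean is included because a plain arithmetic mean of angles is distorted when a secondary peak is present or when values straddle 0°/360°.

```python
import math

def rotamer_populations(angles_deg):
    """Percent of angles in gauche+, trans, and gauche- windows.

    Windows (an assumed convention): gauche+ [0, 120), trans [120, 240),
    gauche- [240, 360), with angles first wrapped into [0, 360).
    """
    counts = {"gauche+": 0, "trans": 0, "gauche-": 0}
    for angle in angles_deg:
        wrapped = angle % 360.0
        if wrapped < 120.0:
            counts["gauche+"] += 1
        elif wrapped < 240.0:
            counts["trans"] += 1
        else:
            counts["gauche-"] += 1
    n = len(angles_deg)
    return {name: 100.0 * c / n for name, c in counts.items()}

def circular_mean_deg(angles_deg):
    """Mean direction of a set of angles, returned in degrees."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

# Toy sample: four gauche- values and one trans value.
sample = [295.0, 295.0, -65.0, 295.0, 150.0]
populations = rotamer_populations(sample)
mean_near_zero = circular_mean_deg([350.0, 10.0])
```

In the toy sample, the single trans value pulls the arithmetic mean away from the dominant gauche– peak, which is the effect the notes describe for α and γ.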
Figure 6. Comparison of Amber ff10 (right panel, green) and the dihedral parameters fit in this work (left panels, blue) for dynamics of 5′GGUGAAGGC3′/3′CCGAAGCCG5′ (major conformation).[71] Mass-weighted atomic RMSD to the solution structure is shown as a function of time for four independent simulations. The higher RMSD for ff10 is a result of stem nucleotides U3 and C17 frequently moving away from their base-pairing partners and flipping out into solution.
Figure 7. Comparison of Amber ff10 (right panel; green) and the dihedral parameters fit in this work (left panel; blue) for dynamics of the UUCG tetraloop, 2KOC.[29] Mass-weighted atomic RMSD to the solution structure is shown as a function of time for four independent simulations. The higher RMSD for simulations with parameters derived in this work reflects C8 (the loop C) leaving the conformation of the solution structure and either becoming exposed to solvent or extending into the helix major groove.
Figure 8. Comparison of Amber ff10 (right panel; green) and the dihedral parameters fit in this work (left panel; blue) for dynamics of the GCAA tetraloop, 1ZIH.[77] Mass-weighted atomic RMSD to the solution structure is shown as a function of time for four independent simulations. The higher RMSD for both ff10 and this work is due to unfolding of the loop region away from the solution structure.