Literature DB >> 24803866

Improvement of DNA and RNA Sugar Pucker Profiles from Semiempirical Quantum Methods.

Ming Huang¹, Timothy J Giese², Tai-Sung Lee², Darrin M York².

Abstract

Neglect of diatomic differential overlap (NDDO) and self-consistent density-functional tight-binding (SCC-DFTB) semiempirical models commonly employed in combined quantum mechanical/molecular mechanical simulations fail to adequately describe the deoxyribose and ribose sugar ring puckers. This failure limits the application of these methods to RNA and DNA systems. In this work, we provide benchmark ab initio gas-phase two-dimensional potential energy scans of the RNA and DNA sugar puckering. The benchmark calculations are compared with semiempirical models. Pucker corrections are introduced into the semiempirical models via B-spline interpolation of the potential energy difference surface relative to the benchmark data. The corrected semiempirical models are shown to well reproduce the ab initio puckering profiles. Furthermore, we demonstrate that the uncorrected semiempirical models do not usually produce a transition state between the A-form and B-form sugar puckers, but the ab initio transition state is reproduced when the B-spline correction is used.

Entities: Chemical Disease Species

Year: 2014 PMID： 24803866 PMCID： PMC3985690 DOI： 10.1021/ct401013s

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

Sugar rings play a central role in biological systems. They are one of the building blocks of carbohydrates[1−3] and are the flexible link between the nucleic acid nucleobase and phosphate backbone.[4,5] By adopting different puckering conformations, sugar rings alter the relative orientation of the phosphate backbone, nucleobase, and ribonucleotide 2′-OH. As a result, changes in sugar ring pucker lead to variations in the overall structure and function of nucleic acids.[4,5] The sugar ring pucker conformations affect the overall structure of RNA and DNA by maintaining a stable helical form.[6−8] This effect on structure translates to an affect on function. For example, the structure of A-form and B-form helices display differences in their major and minor grooves, which in turn play a vital role in protein recognition.[9,10] The adoption of specific sugar pucker conformations have also been found to be critical for nucleic acid polymerization and catalysis reactions.[11−14] An accurate quantum mechanical description of sugar ring pucker is thus a vital component to the modeling of larger biological macromolecules. High-level quantum mechanical (QM) methods are successful at describing the relative energy of ring conformations[15−22] but at a cost that exceeds their utility in molecular simulation. It has thus become common practice to incorporate neglect of diatomic differential overlap (NDDO) or density functional tight-binding (DFTB) semiempirical quantum models within combined quantum mechanical/molecular mechanical (QM/MM) methods[23−26] to provide a tractable description suitable for simulation. Recent evidence suggests that semiempirical methods fail to adequately model the conformational landscape of sugar rings,[27−29] presumably due to approximations inherent within these methods used to achieve their efficiency. Accurate modeling of conformational energy barriers is particularly problematic for the NDDO-based methods (although progress has been made for the OMx methods[30−34]), which in the present case is exacerbated by the strain of the five-membered ring. In this work, we improve the semiempirical description of sugar rings with an empirical correction without unnecessarily complicating their functional forms nor requiring a significant amount of additional computation. We introduce a suitable coordinate system to characterize the sugar ring conformations and use this definition to map the conformational landscape of sugar rings in RNA and DNA nucleosides with benchmark ab initio calculations. The reference calculations are compared to the AM1/d-PhoT[35] and DFTB3[36] semiempirical methods. The difference between the reference and semiempirical landscapes produces a table of correction energies from which we interpolate. By interpolating the correction from two dihedral angles within the ring, we avoid making direct modifications to the underlying electronic structure method. Upon applying the interpolated corrections, the semiempirical models are shown to well reproduce the reference ab initio calculations. The significance of this work is in providing the high-level reference data, identifying and quantifying the errors in semiempirical models, and describing our strategy for their improvement.

Methods

Sugar rings adopt puckered conformations to reduce steric and electronic repulsions, and several mathematical definitions have been put forth to characterize and distinguish between them[37−43] with so-called pseudorotation or puckering parameters. The definition we use, Z and Z (described below), is highly related to those developed by Sato[42] and Altona[39] and requires only two proper endocyclic torsions within the ringwhere ν1 and ν3 are the proper torsions shown in Figure 1. Z and Z are linear combinations of these angles and can be interpreted as a Cartesian representation of the pseudorotation wheel illustrated in Figure 1. The relationships between this definition of pseudorotation and the more common[43] usage of phase Pθ and amplitude A, which can be viewed as the polar coordinate representation of Figure 1, are

Figure 1

Furanose pseudorotation wheel. The furanose proper torsions ν are inset with a molecular structure. A ring’s pseudorotation can be characterized by a position within the wheel using the A–Pθ polar coordinates or the Z–Z Cartesian coordinates. The red dashed, green dotted, and black lines indicate Exo (E), Endo (E), and Twist (T) conformations, respectively, where the integers m and n denote the O4′, C1′, C2′, C3′, and C4′ atoms in respective order. The energy profiles of the deoxyribonucleosides deoxyadenosine (dA), deoxyguanosine (dG), deoxycytidine (dC), and thymidine (dT) and the ribonucleosides adenosine (rA), guanosine (rG), cytidine (rC), and uridine (rU) are shown in Figures 2 and 3 as 2D contours of the Z and Z puckering parameters. These contours were constructed by scanning the Z and Z in a series of constrained geometry optimizations from −60° to 60° in steps of 6°. The geometry optimizations imposed several additional torsion constraints listed in Table 1, together with constraint parameters from NAB program,[44] to mimic the nuceleoside connection to the B-DNA or A-RNA backbone while avoiding intramolecular hydrogen bonding interactions. Although intramolecular hydrogen bonding involving the 2′-OH group is known to affect the RNA conformation,[45] the purpose here is to characterize and formulate a correction for the intrinsic (steric) conformational profile that is decoupled from intramolecular hydrogen bonding. The scans were performed using MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p) and B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p) as implemented in Gaussian 09,[46] which we shall refer to as MP2 and B3LYP henceforth unless otherwise explicitly noted, and the semiempirical models PM6,[47] AM1/d-PhoT,[35] DFTB2-mio,[48] and DFTB3-mio.[49] The DFTB2-mio and PM6 results and conclusions are sufficiently similar to DFTB3-mio and AM1/d-PhoT, respectively, that we have placed them in the Supporting Information. We will refer to the AM1/d-PhoT and DFTB3-mio quantum models as AM1/d and DFTB3, respectively.

Figure 2

2D contours of sugar pseudorotation of nucleosides. The energy units along each contour curve are in kcal/mol, and all energy values are calculated with respect to the global minimum.

Figure 3

2D contours of sugar pseudorotation of nucleosides calculated with corrected DFTB3 and AM1/d methods and compared with the MP2 method. The energy units along each contour curve are in kcal/mol, and all energy values are calculated with respect to the global minimum.

Table 1

Constrained Dihedral Angles in Nucleosidesa

	dihedral angle	B-DNA	A-RNA
β	H5′–O5′–C5′–C4′	–151.5	–179.9
γ	O5′–C5′–C4′–C3′	30.9	47.4
ε	C4′–C3′–O3′–H3′	159.1	–151.7
χ_R	O4′–C1′–N9–C4	–99.4	–166.1
χ_Y	O4′–C1′–N1–C2	–99.4	–166.1
	C3′–C2′–O2′–H2′	–	–169.7

The constrained torsion values are taken from the NAB program in the AmberTools 13 program suite. For glycosidic bond torsion (χ), the numbering scheme for purines (R) and pyrimidines (Y) are labeled by subscripts and indicated separately.

2D contours of sugar pseudorotation of nucleosides. The energy units along each contour curve are in kcal/mol, and all energy values are calculated with respect to the global minimum. 2D contours of sugar pseudorotation of nucleosides calculated with corrected DFTB3 and AM1/d methods and compared with the MP2 method. The energy units along each contour curve are in kcal/mol, and all energy values are calculated with respect to the global minimum. The constrained torsion values are taken from the NAB program in the AmberTools 13 program suite. For glycosidic bond torsion (χ), the numbering scheme for purines (R) and pyrimidines (Y) are labeled by subscripts and indicated separately. Nucleobases in the dH and rH are replaced by hydrogen atoms. CBS-QB3//MP2/6-31++G(d,p). MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p). MP2/cc-pVTZ//MP2/6-31+G(d). B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p). The pseudorotation angles, puckering amplitudes, and the relative energies of the stationary points of dH and rH, DNA, and RNA nucleosides are listed in Tables 2, 3, and 4, respectively. MIN1 and MIN2 are the “Eastern” (large Z) and “Western” (small Z) minima as observed in the 2D contours. TS1 and TS2 are the “Northern” (large Z) and “Southern” (small Z) transition states. Minimum energy paths and associated pseudorotation amplitudes resulting from a nudged elastic band[50] analysis of the 2D contours are provided in the Supporting Information.

Table 2

Pseudorotation Phases Pθ, Pucker Amplitudes A, and Relative Energies ΔE of Minima (MIN) and Transition States (TS) of Abasic Nucleosides, dH and rHa

		MIN1			MIN2			TS1			TS2
		P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)
dH	CBS-QB3b	6.3	42.0	2.5	156.2	40.5	0.0	77.8	42.6	2.9	317.0	42.0	3.0
	CBS-QB3	336.4	38.4	2.9	157.8	39.8	0.0	78.2	39.4	3.0	263.4	36.8	1.9
	MP2c	6.3	42.0	2.4	156.2	40.5	0.0	77.8	42.6	2.8	317.0	42.0	2.9
	MP2tzd	2.5	42.1	2.6	155.6	40.5	0.0	76.6	42.8	3.0	319.0	42.2	2.9
	B3LYPe	358.4	38.6	1.4	153.1	38.3	0.0	58.9	38.8	1.6	293.2	37.5	1.8
rH	CBS-QB3b	357.4	43.2	1.5	150.8	43.2	0.0	74.2	42.1	3.6	278.9	32.7	5.1
	CBS-QB3	355.1	39.9	1.6	153.0	41.3	0.0	68.2	38.7	3.6	274.8	30.9	5.1
	MP2c	357.4	43.2	1.4	150.8	43.2	0.0	74.2	42.1	3.5	278.9	32.7	4.9
	MP2tzd	356.4	43.1	1.5	150.5	43.2	0.0	73.2	42.2	3.4	280.1	32.8	4.7
	B3LYPe	359.0	39.9	1.2	149.3	40.9	0.0	64.1	38.2	2.3	276.6	30.0	4.0

Nucleobases in the dH and rH are replaced by hydrogen atoms.

CBS-QB3//MP2/6-31++G(d,p).

MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p).

MP2/cc-pVTZ//MP2/6-31+G(d).

B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p).

Table 3

Pseudorotation Phases Pθ, Pucker Amplitudes A, and Relative Energies ΔE of Minima (MIN) and Transition States (TS) of DNA Nucleosides

		MIN1			MIN2			TS1			TS2
		P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)
dA	MP2a	358.7	34.3	2.9	189.8	34.7	0.0	83.7	40.6	4.5	295.0	13.4	4.9
	B3LYPb	13.6	33.1	1.8	187.3	32.6	0.0	75.4	37.2	2.2	300.4	13.1	3.0
	DFTB3	347.0	23.4	0.6	194.6	28.3	0.0	66.5	28.6	1.1	289.1	8.1	0.9
	AM1/d	281.5	11.3	0.0	–	–	–	–	–	–	–	–	–
dG	MP2a	357.8	33.9	3.1	190.6	34.8	0.0	83.2	40.4	4.7	293.7	14.2	5.0
	B3LYPb	18.1	33.8	1.9	186.8	33.1	0.0	74.3	37.0	2.4	300.3	13.5	3.2
	DFTB3	345.1	23.4	0.7	195.3	28.6	0.0	67.8	28.3	1.3	290.8	8.9	1.0
	AM1/d	–	–	–	269.5	11.9	0.0	–	–	–	–	–	–
dC	MP2a	36.8	37.5	2.8	180.2	35.4	0.0	89.6	39.9	3.9	344.2	3.6	4.9
	B3LYPb	40.9	35.2	1.5	179.3	33.3	0.0	81.4	37.1	1.7	–	–	–
	DFTB3	353.1	18.4	0.5	190.3	27.1	0.0	91.4	29.0	1.1	282.6	4.2	0.7
	AM1/d	292.8	9.6	0.0	–	–	–	–	–	–	–	–	–
dT	MP2a	26.5	33.5	3.2	180.8	35.4	0.0	90.6	38.3	4.5	4.9	3.4	5.0
	B3LYPb	38.6	34.1	1.7	180.1	33.3	0.0	85.4	35.9	2.1	–	–	–
	DFTB3	354.9	17.3	0.7	189.3	27.5	0.0	95.2	22.5	1.7	310.2	4.0	0.8
	AM1/d	277.0	7.9	0.0	–	–	–	–	–	–	–	–	–

MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p).

B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p).

Table 4

Pseudorotation Phases Pθ, Pucker Amplitudes A, and Relative Energies ΔE of Minima (MIN) and Transition States (TS) of RNA Nucleosidesa

		MIN1			MIN2			TS1
		P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)	P_θ (deg)	A_r (deg)	ΔE (kcal/mol)
rA	MP2b	8.5	42.5	0.0	200.8	33.1	0.0	87.9	32.2	3.7
	B3LYPc	10.5	38.6	0.6	185.6	30.5	0.0	57.7	33.9	1.5
	DFTB3	14.3	31.8	2.2	172.8	30.2	0.0	36.5	27.8	2.4
	AM1/d	–	–	–	191.2	19.8	0.0	–	–	–
rG	MP2b	9.1	42.5	0.1	197.9	33.3	0.0	85.5	34.0	3.7
	B3LYPc	10.7	38.8	0.6	181.7	31.6	0.0	57.2	34.4	1.6
	DFTB3	14.5	31.6	2.6	173.0	30.5	0.0	36.2	27.1	2.7
	AM1/d	–	–	–	189.6	21.7	0.0	–	–	–
rC	MP2b	13.5	41.4	0.0	207.5	31.1	0.9	91.2	12.6	3.8
	B3LYPc	17.3	36.8	0.0	193.3	26.5	0.5	106.6	16.8	1.3
	DFTB3	–	–	–	193.2	24.2	0.0	–	–	–
	AM1/d	–	–	–	185.5	16.9	0.0	–	–	–
rU	MP2b	12.7	42.0	0.0	207.8	31.9	0.9	89.2	14.3	4.3
	B3LYPc	15.0	37.6	0.0	193.8	28.3	0.5	102.8	19.7	1.7
	DFTB3	–	–	–	193.4	26.1	0.0	–	–	–
	AM1/d	–	–	–	188.9	19.2	0.0	–	–	–

Only TS1 of RNA nucleosides are present.

MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p).

B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p).

MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p). B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p). Only TS1 of RNA nucleosides are present. MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p). B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p). Table 2 compares the stationary point energies and geometries for the abasic nucleosides dH and rH computed with our MP2 reference and several other ab initio methods. The purpose of this comparison is to validate our MP2 calculations, which are similar to those used in previous related works.[15−22,45] Specifically, Table 2 compares MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p) to CBS-QB3,[51] CBS-QB3//MP2/6-31++G(d,p), MP2/cc-pVTZ//MP2/6-31+G(d) (abbreviated as MP2tz here), and B3LYP/6-311++G(3df,2p)//B3LYP/6-31++G(d,p). We note that the CBS-QB3 protocol, by construction, involves a B3LYP/6-311G(3df,2d,p) geometry optimization. Figure 3 displays the DFTB3 and AM1/d energy contours upon applying a molecular mechanical (i.e., nonelectronic) sugar pucker correction Ecorr() ≡ Ecorr(Z,Z), defined by the difference between the MP2 and semiempirical energies relative to their respective minima, i.e.,where Ecorr(Z) is an energy correction at ZE(Z) is the energy of method X (MP2 or model), and min{E(Z)} is the energy at the global minimum. The above procedure could be used to produce a separate correction for each nucleoside; however, we noticed that the four DNA and four RNA nucleoside correction profiles were similar enough that it sufficed to use only two corrections: an average correction for DNA nucleosides and an average correction for RNA nucleosides. The two averaged correction profiles are a discrete sampling of the energy correction landscape on a uniform grid. In order to obtain atomic forces, continuous representations of the corrections are constructed from Cardinal B-spline interpolation of the pretabulated values. Cardinal B-splines are useful for multi-dimensional interpolation in molecular applications, including application in smooth particle mesh Ewald methods and have been thoroughly described elsewhere.[52,53] In brief, let t index the uniform grid. The location of the grid points are then Z = Ẑ(t – 1)L/N + Ẑ(t – 1)L/N, where L and Ẑ are the lengths and directions of the “unit cell” containing the pretabulated values, and N is the number of pretabulated points in each direction. The energy correction interpolated from the pretabulated grid data iswhereis a weight constructed from nth-order Cardinal B-splineswhich is nonzero only for u ∈ (0,n), andis a modified set of pretabulated values of the energy correction to remove the artifacts of weighting the data. In this notation, k = 2πk1L/N is an “angular wave number” of a plane wave basis function, where k1 is an integer. If we had simply set Ẽ(Z) = Ecorr(Z), then interpolation of the data at Z using eq 7 would not yield Ecorr(Z) because nearby values would still contribute with a nonzero weight. To resolve this, eq 10 projects Ecorr(Z) and w(Z) into a basis of plane waves via Fast Fourier Transforms (FFTs), divides the Fourier coefficients of Ecorr(Z) by those of w(Z), and re-evaluates the scaled data back onto the grid points with a reverse FFT. Although the present work is focused on applying corrections to improve the sugar puckers of DNA and RNA nucleosides, our approach could just as easily be extended to other systems, such as furanose sugars with different substituents. Our approach is to choose the correction by performing scans of the two-dimensional potential, and we found it sufficient to use a single correction for the DNA nucleosides and a single correction for the RNA nucleosides. If one were to apply this correction to other systems, such as furanose sugars, and found that different corrections were needed for different substituents, one can either construct scans for each substituent or one can explore the possibility of introducing a minimal number of parameters to make rational changes to a base correction. For example, it is common for semiempirical models to use Gaussian functions to make adjustments to the core–core interactions, and one, too, could alter the pretabulated two-dimensional pucker corrections with two-dimensional Gaussians to adjust the location and height of maxima. The proposed correction extends easily to higher-dimensional problems without introducing significant additional computational cost, which may be necessary to extend this method to 6- or 7-membered rings; however, a suitable reduced set of generalized coordinates would need to be identified to accurately describe the puckering correction or else the correction would require 3- or 4-dimensional tables of correction energies. The burden, in that case, is not the use of the correction but in obtaining the reference values that one desires from ab initio calculation.

Results and Discussion

Comparison of ab Initio Methods

Foloppe and MacKerell found that the structural and energetic properties of a nucleoside analog computed with MP2/6-31G(d) satisfactorally agreed with experimental results.[21] Other works that have examined sugar ring conformations in full nucleosides and nucleoside analogues have therefore followed similar protocols.[15−20,22] The largest basis sets used in those works have been MP2/6-311+G(d,p)//MP2/6-31G(d)[19] and MP2/6-311++G(d,p)//B3LYP/6-31G(d,p),[20] and ref (45) has recently used RI-MP2/cc-pVTZ//MP2/6-31+G(d). Our choice for using MP2/6-311++G(3df,2p)//MP2/6-31++G(d,p) is based on the success of all of the above cited works. To further validate our MP2 reference in light of the choices described above, we compare the stationary points of abasic nucleosides (dH and rH) in Table 2 to methods similar to those used in previous works and to CBS-QB3.[51] In brief, the MP2 energies and geometries are not particularly sensitive to basis set, whereas the B3LYP geometries do show sensitivity to basis. We therefore find the MP2 geometries to be more reliable than the B3LYP geometries, and our MP2 energies should be of similar quality to those used in previous works.

Benchmark Data

The MP2 energy profiles of the DNA nucleosides shown in Figure 2 contain two minima: a global minimum C2′-endo (MIN2) conformation and a C3′-endo (MIN1) local minimum, which is on average 3.00 ± 0.18 kcal/mol higher in energy. The two minima are connected by two transition state pathways with average barrier heights of 4.40 ± 0.28 (TS1) and 4.96 ± 0.05 (TS2) kcal/mol. Both pathways are thus feasible but with only a small preference for the northern routes (TS1), which is in agreement with previous work.[19] The unfavorability of the southern routes appear to originate from steric contact between the C4′ hydroxymethyl group and C1′ nucleobase within their transition state structures. The energy profiles of the pyrimidine nucleosides dC and dT are less unfavorable than the purine nucleosides dA and dG near Z ≈ Z ≈ 0, i.e., the area of small pucker amplitude. In other words, the dC and dT sugar rings can be more easily flattened, and it is for this reason that the dC and dT MP2 transition state pucker amplitudes in Table 3 are much smaller than the purine nucleosides. The MP2 energy profiles of the RNA nucleosides shown in Figure 2 also contain C3′-endo (MIN1) and C2′-endo (MIN2) minima. However, unlike the DNA nucleosides, these minima are much closer in energy and separated by a single transition state. The conformational minima for the purine nucleosides are very similar (within 0.1 kcal/mol), whereas the pyrimidine nucleosides favor the C3′-endo (MIN1) conformation by around 0.9 kcal/mol. Canonical duplex A-form RNA has a C3′-endo sugar pucker, although it is widely known that RNA adopts a wide range of noncanonical secondary and tertiary structures. It should be emphasized that the sugar puckering profiles in the present work were generated using constraints on the orientation of the 2′-OH group so as to minimize intramolecular hydrogen bonding. It is known that these interactions have important effects on the conformational heterogeneity of RNA and have been recently characterized in other work.[45] The purpose here is to generate benchmark data from which to derive correction potentials for systematic errors in the intrinsic ring puckering that will be used with models where the intramolecular hydrogen bonding will be accounted for explicitly. The B3LYP energy profiles of the DNA and RNA nucleosides are in qualitative agreement with MP2, but the relative energies of the stationary points are noticeably different. The B3LYP energy barriers listed in Tables 3 and 4 are almost half those of MP2, which is consistent with previous findings.[19] In previous work, the North and South minima of free nucleosides were examined[19] to make comparison with experimental results;[54−57] however, it was also noted that the free energy nucleoside potential energy surface is not representative of the puckers encountered in duplex DNA and RNA (except rC).[17,19] Foloppe et al.[19] therefore examined the effect of constraints on the relative energies between the North and South minima. The relative energies appearing in Table 3 of this work are larger than those reported in Table 3 of ref (19) because our choice of constraints were meant to mimic B-form DNA (and A-form RNA), whereas the North–South relative energies in ref (19) involve comparison to the minima observed in free nucleosides.

Uncorrected Semiempirical Methods

The agreement between the DFTB3 and ab initio DNA nucleoside energy profiles is poor. The two most striking features of DFTB3’s DNA profiles are the overall depth and flatness of the potential energy surface. The observed flatness of the DFTB3 pucker profile is in agreement with a previous study that applied DFTB methods to furanose ring systems.[29] The average relative energy of the local minima is 3 to 5 times smaller than B3LYP and MP2, respectively, suggesting a lack of preference between the C2′-endo and C3′-endo conformations. Furthermore, the barriers for pseudorotation are only 1.30 ± 0.21 and 0.85 ± 0.11 kcal/mol for the northern and southern pathways, respectively. Not only are these barriers approximately 3.5 kcal/mol lower than the ab initio results, but the southern rather than the northern pseudorotation path is preferred. This is counter to steric arguments,[58] previous ab initio computation,[19,21,22] and our MP2 and B3LYP results. The DFTB3 RNA nucleoside profiles vaguely resemble B3LYP, but upon closer inspection, the pyrimidine nucleosides have only one minimum and thus lack a transition state. Similarly, the local minima in the RNA purine nucleosides are barely bound with an average depth of 0.14 ± 0.01 kcal/mol. The AM1/d DNA nucleoside sugar rings are flat. It predicts only a single circular-shaped minimum for both RNA and DNA nucleosides. This behavior is presumably a symptom of AM1/d’s neglect of diatomic differential overlap. The enforcement of molecular orbital orthogonality constraints within the solution of the wave function would normally act to raise the electronic energy as atoms overlapped. Conversely, ignoring those constraints largely prevents steric distortions of the sugar ring. DFTB3 does enforce orthogonality constraints on the molecular orbitals, and one thus does observe structure in its potential energy surfaces. The poor quality of the DFTB3 structures and its overly flat pucker profile is likely caused by its use of a minimal valence atomic orbital basis.

Pucker-Corrected Semiempirical Methods

The MP2 potential energy surface is used to construct pucker-energy corrections Ecorr to the AM1/d and DFTB3 Hamiltonians. The potential energy surfaces of the corrected semiempirical methods upon reoptimizing their structures are shown in Figure 3, and 1D minimum potential energy paths are provided in the Supporting Information. The corrected semiempirical models do not exactly reproduce the MP2 results because each of the four DNA and four RNA nucleosides use the same average correction; however, the corrected methods agree with MP2 much more than even B3LYP. Table 5 summarizes the stationary point relative energies, torsion angles, and pseudorotation parameters of the corrected models, which are now in excellent agreement with MP2.

Table 5

Statistical Analysis of Relative Energies ΔE , ν1 and ν3 Torsions, and Pseudorotation Parameters Pθ and A Evaluated at Stationary Points (MIN1, MIN2, TS1, and TS2) Calculated with Corrected Semiempirical Methods and DFT Methodsa

	DFTB3+E_corr			AM1/d+E_corr			B3LYP
	n	mse	mue	n	mse	mue	n	mse	mue
ΔE (kcal/mol)	28	–0.09	0.23	28	–0.01	0.16	26	–1.04	1.13
ν₁,ν₃ (deg)	56	0.38	2.01	56	0.77	2.38	52	–0.84	4.24
P_θ (deg)	26	1.68	4.83	26	1.94	5.30	25	–0.97	9.13
A_r (deg)	28	1.50	1.51	28	1.77	1.85	26	–1.60	2.56

“n” is the number of points used to generate the statistics. The phase angles of TS2 of pyrimidine deoxyribonucleosides are not included in the statistics because when their amplitudes of puckering are very small, these phase angles are sensitive to perturbation. Figures 4 and 5 display the corrected and uncorrected AM1/d and DFTB3 deoxycytidine stationary structures superimposed on to the MP2 structures, respectively. The corrected structures are in such good agreement with MP2 that it is difficult to distinguish between them in the figures.

Figure 4

Figure 5

Superimposed geometries of MIN and TS of deoxycytidine optimized using MP2 (dull) and DFTB3 (shiny) with and without sugar pucker correction. The RMS values (Å) from heavy atom alignment are shown in the parentheses.

Superimposed geometries of MIN and TS of deoxycytidine optimized using MP2 (dull) and AM1/d (shiny) with and without sugar pucker correction. The RMS values (Å) from heavy atom alignment are shown in red within the parentheses. Superimposed geometries of MIN and TS of deoxycytidine optimized using MP2 (dull) and DFTB3 (shiny) with and without sugar pucker correction. The RMS values (Å) from heavy atom alignment are shown in the parentheses. In this work, we provided benchmark gas phase ab initio potential energy surfaces and described a method for correcting semiempirical models to reproduce benchmark data. Whether the gas phase ab initio potential energy surface or the Amber potential energy surface are a more appropriate reference for correcting the semiempirical models in condensed phase environments remains an open question to be resolved in future work. The most appropriate correction depends on the manner in which it is used. For condensed phase simulations, the appropriate reference may not result from gas phase scans at all; instead, it may be sought by reproducing the two-dimensional pucker free energy profile computed from simulation.[59] In this case, one must resort to the use of MM methods to obtain the reference free energy profile because a MP2 or B3LYP simulation would be prohbitively slow. Furthermore, one may also prefer to use the MM Hamiltonian as the reference if the corrected semiempirical model is specifically used within the QM/MM simulation so that the semiempirical model is made consistent with the remainder of the system. The use of a MP2 ab initio reference in the present work is for the application of corrected semiempirical models to small molecules in the gas phase.

Conclusion

We performed benchmark 2D potential energy scans of sugar pseudorotations of DNA and RNA nucleosides with MP2 and B3LYP and compared those results with the AM1/d and DFTB3-mio semiempirical Hamiltonians. The semiempirical models poorly reproduced the ab initio pseudorotation energy profiles. AM1/d sugar rings have only one stable conformation corresponding to a nearly flat ring. The AM1/d surfaces resemble 2D parabola as opposed to having the rich features found in the ab initio surfaces. DFTB3-mio correctly produces two stable DNA nucleoside sugar pucker conformations, but the pseudorotation barrier between them is greatly underestimated. Furthermore, DFTB3-mio incorrectly predicts that the southern pseudorotation pathway is preferred. Like AM1/d, DFTB3-mio exhibits only one stable RNA nucleoside sugar pucker conformation, but their potential energy surfaces resemble the features produced by the ab initio methods only vaguely. The failure of semiempirical methods at reproducing sugar ring pucker is attributed to the use of a minimal basis set, which for the case of AM1/d is exacerbated by the neglect of diatomic differential overlap. We presented a method for correcting semiempirical methods by introducing a pucker correction computed from Cardinal B-spline interpolation of pretabulated values. The correction has smooth energies and forces and is a nonelectronic term not coupled with the SCF procedure, thus avoiding the need to complicate the underlying electronic structure calculation. The corrected semiempirical models were shown to well reproduce the ab initio potential energy surfaces, barriers of pseudorotation, and stationary point geometries. The pucker correction introduced in this work will be useful in the modeling RNA and DNA in condensed phase using QM/MM simulation with semiempirical QM methods.

28 in total

1. The double helix: a tale of two puckers.

Authors: Alexander Rich
Journal: Nat Struct Biol Date: 2003-04

Review 2. Illuminating the silence: understanding the structure and function of small RNAs.

Authors: Tariq M Rana
Journal: Nat Rev Mol Cell Biol Date: 2007-01 Impact factor: 94.444

3. Intrinsic conformational properties of deoxyribonucleosides: implicated role for cytosine in the equilibrium among the A, B, and Z forms of DNA.

Authors: N Foloppe; A D MacKerell
Journal: Biophys J Date: 1999-06 Impact factor: 4.033

4. Another method for specifying furanose ring puckering.

Authors: T Sato
Journal: Nucleic Acids Res Date: 1983-07-25 Impact factor: 16.971

5. Less is more when simulating unsulfated glycosaminoglycan 3D-structure: comparison of GLYCAM06/TIP3P, PM3-CARB1/TIP3P, and SCC-DFTB-D/TIP3P predictions with experiment.

Authors: Benedict M Sattelle; Andrew Almond
Journal: J Comput Chem Date: 2010-12 Impact factor: 3.376

Review 6. Conformational analysis of furanoside-containing mono- and oligosaccharides.

Authors: Hashem A Taha; Michele R Richards; Todd L Lowary
Journal: Chem Rev Date: 2012-10-16 Impact factor: 60.622

7. Specific Reaction Parametrization of the AM1/d Hamiltonian for Phosphoryl Transfer Reactions: H, O, and P Atoms.

Authors: Kwangho Nam; Qiang Cui; Jiali Gao; Darrin M York
Journal: J Chem Theory Comput Date: 2007-03 Impact factor: 6.006

8. Parametrization and Benchmark of DFTB3 for Organic Molecules.

Authors: Michael Gaus; Albrecht Goez; Marcus Elstner
Journal: J Chem Theory Comput Date: 2012-11-26 Impact factor: 6.006

9. Theoretical study of large conformational transitions in DNA: the B<-->A conformational change in water and ethanol/water.

Authors: Agnes Noy; Alberto Pérez; Charles A Laughton; Modesto Orozco
Journal: Nucleic Acids Res Date: 2007-04-25 Impact factor: 16.971

22 in total

1. Nucleic acid reactivity: challenges for next-generation semiempirical quantum models.

Authors: Ming Huang; Timothy J Giese; Darrin M York
Journal: J Comput Chem Date: 2015-05-06 Impact factor: 3.376

Review 2. Semiempirical Quantum Mechanical Methods for Noncovalent Interactions for Chemical and Biochemical Applications.

Authors: Anders S Christensen; Tomáš Kubař; Qiang Cui; Marcus Elstner
Journal: Chem Rev Date: 2016-04-13 Impact factor: 60.622

Review 3. RNA Structural Dynamics As Captured by Molecular Simulations: A Comprehensive Overview.

Authors: Jiří Šponer; Giovanni Bussi; Miroslav Krepl; Pavel Banáš; Sandro Bottaro; Richard A Cunha; Alejandro Gil-Ley; Giovanni Pinamonti; Simón Poblete; Petr Jurečka; Nils G Walter; Michal Otyepka
Journal: Chem Rev Date: 2018-01-03 Impact factor: 60.622

4. Quantum mechanical force fields for condensed phase molecular simulations.

Authors: Timothy J Giese; Darrin M York
Journal: J Phys Condens Matter Date: 2017-08-17 Impact factor: 2.333

5. Intermolecular interactions in the condensed phase: Evaluation of semi-empirical quantum mechanical methods.

Authors: Anders S Christensen; Jimmy C Kromann; Jan H Jensen; Qiang Cui
Journal: J Chem Phys Date: 2017-10-28 Impact factor: 3.488

6. Synthesis and antiviral evaluation of 2',2',3',3'-tetrafluoro nucleoside analogs.

Authors: Ozkan Sari; Leda Bassit; Christina Gavegnano; Tamara R McBrayer; Louise McCormick; Bryan Cox; Steven J Coats; Franck Amblard; Raymond F Schinazi
Journal: Tetrahedron Lett Date: 2017-01-04 Impact factor: 2.415

7. Framework for Conducting and Analyzing Crystal Simulations of Nucleic Acids to Aid in Modern Force Field Evaluation.

Authors: Şölen Ekesan; Darrin M York
Journal: J Phys Chem B Date: 2019-05-03 Impact factor: 2.991

8. A Multidimensional B-Spline Correction for Accurate Modeling Sugar Puckering in QM/MM Simulations.

Authors: Ming Huang; Thakshila Dissanayake; Erich Kuechler; Brian K Radak; Tai-Sung Lee; Timothy J Giese; Darrin M York
Journal: J Chem Theory Comput Date: 2017-08-17 Impact factor: 6.006

9. Improved Treatment of Nucleosides and Nucleotides in the OPLS-AA Force Field.

Authors: Michael J Robertson; Julian Tirado-Rives; William L Jorgensen
Journal: Chem Phys Lett Date: 2017-02-16 Impact factor: 2.328

10. Characterization of the three-dimensional free energy manifold for the uracil ribonucleoside from asynchronous replica exchange simulations.

Authors: Brian K Radak; Melissa Romanus; Tai-Sung Lee; Haoyuan Chen; Ming Huang; Antons Treikalis; Vivekanandan Balasubramanian; Shantenu Jha; Darrin M York
Journal: J Chem Theory Comput Date: 2015-02-10 Impact factor: 6.006