Antarip Halder1, Dhruv Data1, Preethi P Seelam1, Dhananjay Bhattacharyya2, Abhijit Mitra1.
Abstract
Noncoding RNA molecules contain a large variety of noncanonical base pairs that shape their functionally competent folded structures. Each base pair is composed of at least two interbase hydrogen bonds (H-bonds). It is expected that the characteristic geometry and stability of different noncanonical base pairs are determined collectively by the properties of these interbase H-bonds. We have studied the ground-state electronic properties [using density functional theory (DFT) and DFT-D3-based methods] of all the 118 normal base pairs and 36 modified base pairs, belonging to 12 different geometric families (cis and trans of WW, WH, HH, WS, HS, and SS), that occur in a nonredundant set of high-resolution RNA crystal structures. Having addressed some of the limitations of the earlier approaches, we provide here a comprehensive compilation of the average energies of different types of interbase H-bonds (EHB). We have also characterized each interbase H-bond using 13 different parameters that describe its geometry, the charge distribution at its bond critical point (BCP), and the n → σ*-type charge transfer from filled π orbitals of the H-bond acceptor to the empty antibonding orbital of the H-bond donor. On the basis of the extent of their linear correlation with the H-bonding energy, we have shortlisted five parameters to model linear equations for predicting EHB values. They are (i) the electron density at the BCP: ρ, (ii) its Laplacian: ∇²ρ, (iii) the stabilization energy due to n → σ*-type charge transfer: E(2), (iv) the donor-hydrogen distance, and (v) the hydrogen-acceptor distance. We have performed single-variable and multivariable linear regression analysis over the normal base pairs and have modeled sets of linear relationships between these five parameters and EHB. Performance testing of our model over the set of modified base pairs shows promising results, at least for the moderately strong H-bonds.
Year: 2019 PMID: 31459834 PMCID: PMC6648064 DOI: 10.1021/acsomega.8b03689
Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343
Figure 1. (a) RNA bases are characterized by three edges: Watson–Crick, Hoogsteen (C–H edge for pyrimidines), and Sugar. (b) In principle, any edge of a base can interact with any edge of a second base. Depending on the mutual orientation of the two glycosidic bonds, the base pair can be annotated as either cis or trans.
Count of Different Types of H-Bonds Studied in This Work
| sl. | H-bond type | donor | acceptor | notation | count |
|---|---|---|---|---|---|
| 1 | N–H···N | primary N | imino N | NI–H···NIII | 42 |
| 2 | N–H···N | secondary N | imino N | NII–H···NIII | 17 |
| 3 | N–H···O | primary N | carbonyl O | NI–H···Oc | 38 |
| 4 | N–H···O | secondary N | carbonyl O | NII–H···Oc | 23 |
| 5 | N–H···O | primary N | hydroxyl O | NI–H···Oh | 17 |
| 6 | O–H···N | hydroxyl O | imino N | Oh–H···NIII | 18 |
| 7 | O–H···O | hydroxyl O | carbonyl O | Oh–H···Oc | 4 |
| 8 | O–H···O | hydroxyl O | hydroxyl O | Oh–H···Oh | 9 |
| 9 | C–H···N | C–H | imino N | C–H···NIII | 11 |
| 10 | C–H···O | C–H | carbonyl O | C–H···Oc | 14 |
Figure 3. IR spectra (calculated at the B3LYP-D3(BJ)/6-31+G(d,p) level) of four nucleobases, (A) adenine, (B) uracil, (C) guanine, and (D) cytosine; two canonical base pairs, (E) A:U W:W Cis and (F) G:C W:W Cis; and two noncanonical base pairs, (G) A:U H:H Cis and (H) G:U W:H Trans. For the nucleobases, the orange and green arrows point to the frequencies corresponding to symmetric stretching of the N–H bonds of the primary amino and secondary amino groups, respectively. Interbase H-bonds of the base pairs are shown as broken lines. Frequencies corresponding to symmetric stretching of the N–H (or C–H) bond in interbase H-bonds are marked with arrows colored by H-bond type: NI–H···Oc (blue), NII–H···NIII (green), NII–H···Oc (cyan), C–H···N (orange), and C–H···O (black).
Figure 2. Comparison of the ZPVE- and BSSE-corrected interaction energy (Eint) of a base pair with the sum of the H-bonding energies of the interbase H-bonds present in it. Comparisons are made on the basis of the magnitude of the energies (in kcal mol⁻¹) for 15 guanine-containing base pairs.
Average Hydrogen Bonding Energy (EHB) of Different Interbase Hydrogen Bonds Obtained Using B3LYP and B3LYP-D3 Functionals, Reported in kcal mol⁻¹
| name | type of base pair | B3LYP | B3LYP-D3(BJ) |
|---|---|---|---|
| NI–H···NIII | all | 4.31 (1.23) | 4.24 (1.35) |
| | NS | 4.35 (1.38) | 4.40 (1.52) |
| | S | 4.26 (1.02) | 4.00 (1.04) |
| NII–H···NIII | all | 6.42 (0.94) | 6.37 (1.16) |
| | NS | 6.74 (0.84) | 6.81 (0.92) |
| | S | 5.66 (0.76) | 5.26 (1.01) |
| NI–H···Oc | all | 2.85 (1.52) | 3.09 (1.46) |
| | NS | 3.52 (1.51) | 3.75 (1.48) |
| | S | 2.25 (1.28) | 2.57 (1.25) |
| NII–H···Oc | all | 5.13 (1.50) | 5.29 (1.39) |
| | NS | 5.56 (0.81) | 5.80 (0.83) |
| | S | 4.33 (2.16) | 4.40 (1.74) |
| NI–H···Oh | all | 2.56 (1.62) | 2.65 (1.38) |
| Oh–H···NIII | all | 5.24 (1.24) | 5.34 (1.19) |
| Oh–H···Oc | all | 4.09 (0.75) | 4.79 (1.01) |
| Oh–H···Oh | all | 3.33 (1.34) | 2.76 (1.1) |
Values in parentheses are the corresponding standard deviations.
“all”: all base pairs studied; “S”: base pairs of the WS, HS, and SS families; “NS”: base pairs of the WW, WH, and HH families.
Pearson Correlation Coefficients between EHB and Different QTAIM, NBO, and Geometric Parameters (at B3LYP/6-31+G(d,p) Level) for Eight Different Types of Interbase H-Bonds Studied in This Work
| QTAIM parameters | NBO | geometrical parameters | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sl. | type of HB | type of BP | ρ | ∇²ρ | D–A | H–A | D–H | ∠D–H–A | ΔDHA | | | | | |
| 1 | NI–H···NIII | all | 0.89 (0.91) | 0.85 (0.90) | −0.88 (−0.89) | 0.87 (0.89) | −0.89 (−0.79) | 0.90 (0.92) | −0.74 (−0.78) | −0.88 (−0.82) | 0.92 (0.95) | 0.75 (0.73) | −0.85 (−0.81) | −0.77 (−0.74) |
| | | NS | 0.96 (0.96) | 0.93 (0.92) | −0.95 (−0.94) | 0.94 (0.93) | −0.89 (−0.88) | 0.97 (0.94) | −0.82 (−0.81) | −0.94 (−0.91) | 0.97 (0.97) | 0.87 (0.82) | −0.91 (−0.88) | −0.88 (−0.84) |
| | | S | 0.76 (0.89) | 0.74 (0.91) | −0.75 (−0.89) | 0.75 (0.90) | −0.63 (−0.76) | 0.80 (0.93) | −0.69 (−0.83) | −0.64 (−0.78) | 0.80 (0.91) | 0.52 (0.60) | −0.66 (−0.80) | −0.54 (−0.63) |
| 2 | NII–H···NIII | all | 0.83 (0.88) | 0.72 (0.80) | −0.79 (−0.84) | 0.76 (0.82) | −0.86 (−0.86) | 0.87 (0.91) | −0.77 (−0.84) | −0.82 (−0.88) | 0.99 (0.99) | 0.70 (0.65) | −0.78 (−0.85) | −0.71 (−0.66) |
| | | NS | 0.76 (0.79) | 0.56 (0.60) | −0.71 (−0.74) | 0.65 (0.67) | −0.92 (−0.92) | 0.79 (0.82) | −0.61 (−0.66) | −0.69 (−0.73) | 0.99 (0.99) | 0.66 (0.67) | −0.63 (−0.67) | −0.66 (−0.66) |
| | | S | 0.98 (0.96) | 0.97 (0.93) | −0.96 (−0.92) | 0.97 (0.93) | −0.86 (−0.88) | 0.93 (0.95) | −0.98 (−0.89) | −0.98 (−0.94) | 0.99 (0.99) | 0.53 (0.46) | −0.98 (−0.92) | −0.56 (−0.51) |
| 3 | NI–H···Oc | all | 0.93 (0.95) | 0.84 (0.83) | −0.91 (−0.91) | 0.88 (0.88) | −0.24 (−0.17) | 0.93 (0.93) | −0.62 (−0.56) | −0.83 (−0.89) | 0.94 (0.95) | 0.62 (0.74) | −0.80 (−0.81) | −0.63 (−0.76) |
| | | NS | 0.95 (0.97) | 0.96 (0.89) | −0.94 (−0.91) | 0.95 (0.90) | 0.12 (0.41) | 0.95 (0.93) | −0.91 (−0.89) | −0.90 (−0.97) | 0.95 (0.95) | 0.95 (0.75) | −0.92 (−0.94) | −0.73 (−0.77) |
| | | S | 0.94 (0.93) | 0.87 (0.85) | −0.93 (−0.92) | 0.91 (0.89) | −0.07 (−0.04) | 0.87 (0.91) | −0.57 (−0.47) | −0.88 (−0.89) | 0.96 (0.97) | 0.47 (0.71) | −0.87 (−0.84) | −0.48 (−0.73) |
| 4 | NII–H···Oc | all | 0.84 (0.89) | 0.78 (0.83) | −0.78 (−0.83) | 0.78 (0.83) | −0.16 (−0.18) | 0.90 (0.94) | −0.62 (−0.71) | −0.81 (−0.86) | 0.96 (0.98) | 0.76 (0.79) | −0.73 (−0.79) | −0.78 (−0.82) |
| | | NS | 0.74 (0.75) | 0.68 (0.71) | −0.66 (−0.67) | 0.67 (0.69) | −0.38 (−0.35) | 0.85 (0.85) | −0.69 (−0.72) | −0.79 (−0.81) | 0.98 (0.97) | 0.23 (0.11) | −0.71 (−0.74) | −0.26 (−0.15) |
| | | S | 0.87 (0.93) | 0.87 (0.93) | −0.84 (−0.91) | 0.86 (0.92) | 0.42 (0.45) | 0.93 (0.97) | −0.69 (−0.77) | −0.81 (−0.88) | 0.98 (0.99) | 0.84 (0.91) | −0.76 (−0.84) | −0.86 (−0.94) |
| 5 | NI–H···Oh | all | 0.39 (0.17) | 0.48 (0.31) | −0.41 (−0.17) | 0.44 (0.23) | −0.25 (−0.20) | 0.31 (0.10) | −0.80 (−0.63) | −0.41 (−0.12) | 0.18 (0.21) | 0.04 (−0.15) | −0.55 (−0.34) | −0.08 (0.12) |
| 6 | Oh–H···NIII | all | 0.54 (0.60) | 0.43 (0.49) | −0.55 (−0.60) | 0.51 (0.57) | −0.61 (−0.65) | 0.55 (0.63) | −0.48 (−0.49) | −0.46 (−0.55) | 0.63 (0.64) | −0.10 (0.26) | −0.46 (−0.52) | 0.08 (−0.27) |
| 7 | Oh–H···Oc | all | 0.46 (0.64) | 0.38 (0.61) | −0.43 (−0.65) | 0.40 (0.63) | −0.67 (−0.60) | 0.64 (0.60) | −0.44 (−0.58) | −0.47 (−0.59) | 0.41 (0.66) | 0.73 (0.66) | −0.46 (−0.58) | −0.65 (−0.63) |
| 8 | Oh–H···Oh | all | −0.45 (0.04) | −0.51 (0.07) | 0.49 (−0.01) | −0.50 (−0.04) | 0.34 (0.35) | −0.34 (−0.06) | 0.64 (0.04) | 0.50 (−0.01) | −0.01 (0.07) | −0.09 (−0.02) | 0.58 (−0.01) | 0.22 (0.05) |
Values in parentheses are the Pearson correlation coefficients calculated at the B3LYP-D3/6-31+G(d,p) level of theory.
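For readers unfamiliar with the statistic tabulated above, the following is a minimal sketch of how a Pearson correlation coefficient between one H-bond descriptor and EHB can be computed. The function name `pearson_r` and the sample numbers are illustrative assumptions, not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (NOT taken from the study):
# electron density at the BCP (a.u.) and H-bond energy (kcal/mol).
rho_example = [0.020, 0.025, 0.030, 0.035]
e_hb_example = [3.1, 4.2, 5.0, 6.1]
r_example = pearson_r(rho_example, e_hb_example)
print(f"r = {r_example:.3f}")
```

A value of r close to +1 or −1 indicates a nearly linear relationship, which is the criterion the abstract describes for shortlisting parameters.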
Linear Relationships (y = Ax + B) between H-Bonding Strength and the Individual Parameters, Which Have High Pearson Correlation Coefficient (r) for All the Four Types of H-Bonds in Table , Were Derived from Single Variable Linear Regression Analysis
| H-bond type | independent variable (x) | slope (A) | intercept (B) |
|---|---|---|---|
| NI–H···NIII | ρ | 190.34 ± 14.68 | −0.60 ± 0.38 |
| | ∇²ρ | 90.09 ± 7.60 | −1.35 ± 0.48 |
| | E(2) | 0.22 ± 0.02 | 1.33 ± 0.22 |
| | H–A | −6.90 ± 0.81 | 18.44 ± 1.67 |
| | D–H | 210.20 ± 11.88 | −210.78 ± 12.16 |
| NII–H···NIII | ρ | 135.45 ± 21.24 | 1.89 ± 0.72 |
| | ∇²ρ | 57.20 ± 12.58 | 1.98 ± 0.99 |
| | E(2) | 0.13 ± 0.02 | 3.65 ± 0.39 |
| | H–A | −9.90 ± 1.58 | 25.42 ± 3.04 |
| | D–H | 150.20 ± 5.17 | −149.41 ± 5.36 |
| NI–H···Oc | E(2) | 0.26 ± 0.02 | −0.14 ± 0.24 |
| | ρ | 251.43 ± 13.83 | −3.42 ± 0.37 |
| | ∇²ρ | 75.03 ± 8.71 | −2.71 ± 0.69 |
| | H–A | −12.32 ± 1.10 | 27.04 ± 2.14 |
| | D–H | 253.08 ± 13.98 | −254.87 ± 14.25 |
| NII–H···Oc | E(2) | 0.18 ± 0.01 | 1.88 ± 0.27 |
| | ρ | 145.50 ± 16.13 | 0.60 ± 0.54 |
| | ∇²ρ | 46.48 ± 6.76 | 0.88 ± 0.66 |
| | H–A | −8.57 ± 1.17 | 21.28 ± 2.21 |
| | D–H | 212.44 ± 9.71 | −213.70 ± 10.01 |
For this, two topological parameters (ρ, ∇²ρ), one charge-transfer-based parameter (E(2)), and two geometry-based parameters (H–A distance and D–H distance) were considered as independent variables (x), and EHB was considered as the scalar dependent variable (y). The values of the slope (A) and y-intercept (B), along with their respective standard deviations, are tabulated here. Units of the different parameters are as follows: EHB [kcal mol⁻¹]; ρ [a.u., 1 a.u. = e a0⁻³, where e is the elementary charge and a0 is the Bohr radius]; ∇²ρ [a.u., 1 a.u. = e a0⁻⁵]; E(2) [kcal mol⁻¹]; H–A [Å]; and D–H [Å].
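As an illustration of how the single-variable relationships y = Ax + B can be applied, the sketch below evaluates EHB from ρ or from the H–A distance using the slopes and intercepts in the table above. The dictionary layout, the ASCII bond-type keys, the function name `predict_e_hb`, and the input values are assumptions made for this example; the tabulated uncertainties are ignored for simplicity.

```python
# (slope A, intercept B) for E_HB = A * x + B, copied from the table above.
# Keys are ASCII spellings of the bond-type notation (an assumption of this sketch).
# Units: rho in a.u., H-A in angstrom, E_HB in kcal/mol.
COEFFS = {
    ("NI-H...NIII", "rho"): (190.34, -0.60),
    ("NI-H...NIII", "H-A"): (-6.90, 18.44),
    ("NII-H...NIII", "rho"): (135.45, 1.89),
    ("NII-H...NIII", "H-A"): (-9.90, 25.42),
    ("NI-H...Oc", "rho"): (251.43, -3.42),
    ("NI-H...Oc", "H-A"): (-12.32, 27.04),
    ("NII-H...Oc", "rho"): (145.50, 0.60),
    ("NII-H...Oc", "H-A"): (-8.57, 21.28),
}


def predict_e_hb(bond_type, parameter, x):
    """Estimate E_HB (kcal/mol) from one descriptor via the linear fit."""
    slope, intercept = COEFFS[(bond_type, parameter)]
    return slope * x + intercept


# Hypothetical descriptor values, chosen only to demonstrate the call:
e_from_rho = predict_e_hb("NI-H...NIII", "rho", 0.030)  # rho = 0.030 a.u.
e_from_dist = predict_e_hb("NI-H...NIII", "H-A", 1.90)  # H-A = 1.90 angstrom
print(f"{e_from_rho:.2f} kcal/mol from rho, {e_from_dist:.2f} kcal/mol from H-A")
```

Because each fit was derived for one H-bond type, the lookup is keyed on both the bond type and the descriptor rather than on the descriptor alone.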
MSE and RMSE Values (in kcal mol⁻¹) between the Set of Expected EHB Values and Set of Predicted EHB Values from Different Single Parameter and Multiparameter Models
| parameter | MSE (weak) | MSE (strong) | MSE (all) | RMSE (weak) | RMSE (strong) | RMSE (all) |
|---|---|---|---|---|---|---|
| ρ | 0.32 | 0.33 | 0.33 | 0.56 | 0.58 | 0.58 |
| ∇²ρ | 0.59 | 0.63 | 0.62 | 0.77 | 0.79 | 0.79 |
| E(2) | 0.20 | 0.25 | 0.23 | 0.45 | 0.50 | 0.49 |
| H–A | 0.44 | 0.49 | 0.47 | 0.66 | 0.7 | 0.69 |
| D–H | 0.17 | 0.16 | 0.17 | 0.41 | 0.4 | 0.41 |
| 3-P model | 0.54 | 0.20 | 0.30 | 0.74 | 0.45 | 0.55 |
| 5-P model | 0.12 | 0.07 | 0.09 | 0.34 | 0.28 | 0.30 |
“Weak” and “strong” correspond to H-bonds with EHB < 4 kcal mol⁻¹ and 4 kcal mol⁻¹ ≤ EHB ≤ 15 kcal mol⁻¹, respectively.
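The error metrics in the table above can be reproduced along the following lines. This is a minimal sketch: the function names and the sample energies are hypothetical, and the weak/strong split follows the thresholds stated in the footnote (classification by the expected EHB value is an assumption of this example).

```python
import math


def mse_rmse(pairs):
    """Return (MSE, RMSE) for a list of (expected, predicted) pairs."""
    if not pairs:
        raise ValueError("no data points")
    mse = sum((pred - exp) ** 2 for exp, pred in pairs) / len(pairs)
    return mse, math.sqrt(mse)


def split_by_strength(pairs, threshold=4.0, upper=15.0):
    """Split pairs into weak (E_HB < 4) and strong (4 <= E_HB <= 15) sets.

    Classification uses the expected E_HB value (an assumption of this sketch).
    """
    weak = [(e, p) for e, p in pairs if e < threshold]
    strong = [(e, p) for e, p in pairs if threshold <= e <= upper]
    return weak, strong


# Hypothetical (expected, predicted) E_HB values in kcal/mol, not study data:
example = [(2.0, 2.5), (3.5, 3.0), (5.0, 5.5), (7.0, 6.0)]
weak, strong = split_by_strength(example)
for label, subset in (("weak", weak), ("strong", strong), ("all", example)):
    mse, rmse = mse_rmse(subset)
    print(f"{label}: MSE = {mse:.2f}, RMSE = {rmse:.2f}")
```

Since RMSE is the square root of MSE, the two halves of the table carry the same information; RMSE is simply expressed in the same units as EHB.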
Figure 4. Comparison of the three-parameter (red circles) and five-parameter (blue squares) models for interbase H-bonds present in modified base pairs. The predicted values of EHB are plotted against the corresponding expected values. To illustrate the performance of the two models, the straight line y = x is shown as a reference.