Daniel Svozil, Jan Kalina, Marek Omelka, Bohdan Schneider.
Abstract
The geometry of the phosphodiester backbone was analyzed for 7739 dinucleotides from 447 selected crystal structures of naked and complexed DNA. Ten torsion angles of a near-dinucleotide unit have been studied by combining Fourier averaging and clustering. Besides the known variants of the A-, B- and Z-DNA forms, we have also identified combined A + B backbone-deformed conformers, e.g. with alpha/gamma switches, and a few conformers with a syn orientation of bases occurring e.g. in G-quadruplex structures. A plethora of A- and B-like conformers show a close relationship between the A- and B-form double helices. A comparison of the populations of the conformers occurring in naked and complexed DNA has revealed a significant broadening of the DNA conformational space in the complexes, but the conformers still remain within the limits defined by the A- and B-forms. Possible sequence preferences, important for sequence-dependent recognition, have been assessed for the main A and B conformers by means of statistical goodness-of-fit tests. The structural properties of the backbone in quadruplexes, junctions and histone-core particles are discussed in further detail.
Year: 2008 PMID: 18477633 PMCID: PMC2441783 DOI: 10.1093/nar/gkn260
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. The datasets of the dinucleotides used in this study
| Dataset | Characterization | Number of structures | Number of dinucleotides |
|---|---|---|---|
| 1 | All dinucleotides analyzed by FT averaging and clustering | 447 | 7739 |
| 2 | Only noncomplexed DNA | 187 | 1861 |
| 3 | Dataset 2 without quadruplexes, Z-DNA, 1DC0 and all dinucleotides forming non-WC pairs | 46 A-form, 72 B-form; 118 in total | 391 in A-form structures, 806 in B-form structures; 1197 in total |
The PDB codes of the structures used in the analysis
| Structure Type | PDB Codes |
|---|---|
| Noncomplexed A-DNA (46) | 118d, 137d, 138d, 160d, 1d78, 1d79, 1dnz, 1kgk, 1m77, 1ma8, 1mlx, 1nzg, 1vj4, 1xjx, 1z7i, 1zex, 1zey, 1zf1, 1zf6, 1zf8, 1zf9, 1zfa, 213d, 243d, 260d, 295d, 2d94, 317d, 338d, 344d, 345d, 348d, 349d, 368d, 369d, 370d, 371d, 395d, 396d, 399d, 414d, 440d, 9dna, dh010, adh012, adh034 |
| Noncomplexed B-DNA (72) | 122d, 123d, 158d, 183d, 196d, 1bd1, 1bna, 1cw9, 1d23, 1d3r, 1d49, 1d56, 1d61, 1d8g, 1d8x, 1dou, 1dpn, 1edr, 1ehv, 1en3, 1en8, 1en9, 1ene, 1enn, 1fq2, 1g75, 1i3t, 1ikk, 1j8l, 1jgr, 1l4j, 1l6b, 1m6g, 1n1o, 1nvn, 1nvy, 1p4y, 1p54, 1s23, 1s2r, 1sgs, 1sk5, 1ub8, 1ve8, 1zf0, 1zf3, 1zf4, 1zf5, 1zf7, 1zfb, 1zff, 1zfg, 232d, 251d, 2d25, 307d, 355d, 3dnb, 403d, 423d, 428d, 431d, 436d, 454d, 455d, 456d, 460d, 463d, 476d, 477d, 5dnb, 9bna |
| DNA/drug and DNA/protein complexes, Z-DNA, quadruplexes (329) | 110d, 115d, 131d, 145d, 151d, 152d, 159d, 181d, 182d, 184d, 190d, 191d, 1a1g, 1a1h, 1a1i, 1a1k, 1a2e, 1a73, 1aay, 1ais, 1azp, 1b94, 1b97, 1bf4, 1bqj, 1brn, 1c8c, 1cdw, 1ckq, 1cl8, 1cn0, 1d02, 1d11, 1d14, 1d15, 1d21, 1d22, 1d2i, 1d32, 1d37, 1d38, 1d40, 1d41, 1d45, 1d48, 1d53, 1d54, 1d58, 1d67, 1d76, 1d90, 1d9r, 1da0, 1da2, 1da9, 1dc0, 1dc1, 1dcg, 1dcr, 1dcw, 1dfm, 1dj6, 1dl8, 1dn4, 1dn5, 1dn8, 1dnf, 1dp7, 1dsz, 1e3o, 1egw, 1em0, 1emh, 1eo4, 1eon, 1esg, 1eyu, 1f0v, 1fd5, 1fdg, 1fhz, 1fiu, 1fms, 1fn1, 1fn2, 1g2f, 1g9z, 1gtw, 1gu4, 1h6f, 1hcr, 1hlv, 1hwt, 1hzs, 1i0t, 1i3w, 1ick, 1ign, 1ih4, 1ih6, 1imr, 1ims, 1j59, 1j75, 1jb7, 1jes, 1jft, 1jh9, 1jk1, 1jk2, 1jpq, 1jtl, 1juc, 1jux, 1jx4, 1k3w, 1k3x, 1k9g, 1kbu, 1kci, 1kx3, 1kx5, 1l1h, 1l1t, 1l1z, 1l3l, 1l3s, 1l3t, 1l3u, 1l3v, 1lat, 1lau, 1ljx, 1llm, 1lmb, 1m07, 1m19, 1m3q, 1m5r, 1m69, 1m6f, 1mf5, 1mj2, 1mjm, 1mjo, 1mjq, 1mnn, 1mus, 1mw8, 1nh2, 1njw, 1njx, 1nk0, 1nk4, 1nk7, 1nk8, 1nk9, 1nkc, 1nke, 1nkp, 1nnj, 1nqs, 1nr8, 1nt8, 1nvp, 1o0k, 1omk, 1orn, 1p20, 1p3i, 1p3l, 1p71, 1per, 1pfe, 1ph4, 1ph6, 1ph8, 1pji, 1pjj, 1puf, 1pup, 1puy, 1q3f, 1qda, 1qn3, 1qn4, 1qn5, 1qn6, 1qn8, 1qn9, 1qna, 1qnb, 1qne, 1qum, 1qyk, 1qyl, 1qzg, 1r2z, 1r3z, 1r41, 1r68, 1rff, 1rh6, 1rnb, 1rpe, 1rqy, 1run, 1s1k, 1s1l, 1s32, 1ssp, 1suz, 1sx5, 1sxq, 1t9i, 1tdz, 1tez, 1tro, 1u1p, 1u1q, 1u1r, 1u4b, 1ue2, 1ue4, 1uhy, 1v3n, 1v3o, 1v3p, 1vzk, 1w0u, 1wd0, 1wte, 1wto, 1wtp, 1wtq, 1wtr, 1wtv, 1xa2, 1xam, 1xc9, 1xjv, 1xo0, 1xuw, 1xux, 1xvn, 1xvr, 1xyi, 1ytb, 1ytf, 1zez, 1zf2, 1zna, 200d, 210d, 211d, 212d, 215d, 221d, 224d, 234d, 235d, 236d, 241d, 242d, 244d, 245d, 254d, 258d, 276d, 277d, 278d, 279d, 284d, 288d, 292d, 293d, 2bdp, 2bop, 2cgp, 2crx, 2dcg, 2des, 2hap, 2hdd, 2nll, 2or1, 2pvi, 304d, 306d, 308d, 313d, 314d, 331d, 334d, 336d, 351d, 352d, 360d, 362d, 366d, 367d, 383d, 385d, 386d, 3bam, 3bdp, 3cro, 3crx, 3hts, 3pvi, 400d, 417d, 427d, 432d, 441d, 442d, 443d, 452d, 453d, 465d, 467d, 473d, 481d, 482d, 4bdp, adh013, zdf013, zdfb03, zdfb06 |
Figure 1. The analyzed unit is defined by ten torsion angles: eight from γ to δ + 1 along the backbone, plus torsions χ and χ + 1 at the glycosidic bonds. B0 and B1 symbolize the bases.
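Each of the ten torsions named in the caption is a dihedral angle over four consecutive atoms. The following is a minimal sketch of how such an angle can be computed from Cartesian coordinates; it is not taken from the paper. The function name `torsion_deg` and the toy coordinates are illustrative assumptions, and the result is mapped to the 0–360° range used in the tables of this record.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def torsion_deg(p0, p1, p2, p3):
    """Dihedral angle p0-p1-p2-p3 in degrees, in the range [0, 360).

    Hypothetical helper: the atom coordinates would come from a
    structure file; here they are plain (x, y, z) tuples.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0

# Toy geometry (not real atoms): a quarter turn about the central bond.
example = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Reporting angles on 0–360° rather than on -180° to 180° keeps values such as ζ near 290° contiguous, which matches how the torsions are tabulated here.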
Counts of 16 dinucleotide steps in conformational families of noncomplexed A- and B-form double helices (Dataset 3)
| Conformation | AA (RR) | AG (RR) | GA (RR) | GG (RR) | AC (RY) | AT (RY) | GC (RY) | GT (RY) | CA (YR) | CG (YR) | TA (YR) | TG (YR) | CC (YY) | CT (YY) | TC (YY) | TT (YY) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AI | 0 | 1 | 1 | 46 | 13 | 6 | 47 | 20 | 8 | 56 | 9 | 10 | 60 | 1 | 2 | 0 |
| AII | 0 | 0 | 1 | 16 | 1 | 0 | 0 | 0 | 0 | 27 | 8 | 0 | 1 | 0 | 0 | 0 |
| RestA | 0 | 2 | 0 | 13 | 4 | 1 | 7 | 0 | 1 | 9 | 0 | 1 | 14 | 2 | 3 | 0 |
| Total in A | 0 | 3 | 2 | 75 | 18 | 7 | 54 | 20 | 9 | 92 | 17 | 11 | 75 | 3 | 5 | 0 |
| BI | 53 | 16 | 39 | 4 | 5 | 29 | 12 | 9 | 1 | 48 | 12 | 8 | 11 | 12 | 31 | 50 |
| BII | 4 | 2 | 10 | 18 | 5 | 1 | 35 | 2 | 31 | 33 | 10 | 18 | 0 | 0 | 0 | 0 |
| A/B | 2 | 1 | 0 | 0 | 0 | 4 | 1 | 0 | 3 | 41 | 8 | 0 | 2 | 4 | 1 | 8 |
| B/A | 17 | 2 | 2 | 0 | 3 | 19 | 37 | 7 | 0 | 1 | 0 | 0 | 2 | 2 | 11 | 8 |
| RestB | 9 | 2 | 10 | 1 | 4 | 5 | 33 | 1 | 4 | 19 | 1 | 0 | 3 | 3 | 10 | 6 |
| Total in B | 85 | 23 | 61 | 23 | 17 | 58 | 118 | 19 | 39 | 142 | 31 | 26 | 18 | 21 | 53 | 72 |
Dinucleotides from RestA and RestB categories were not assigned to any of the conformations; R are purines, Y pyrimidines.
The main DNA conformational classes identified in the present work
| Description | N | γ | δ | ɛ | ζ | α + 1 | β + 1 | γ + 1 | δ + 1 | χ | χ + 1 | Cluster number |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AII, A-form with an α + 1/γ + 1 switch | 44 | 52 | 82 | 195 | 291 | 149 | 194 | 182 | 87 | 204 | 188 | 19 |
| A with δ, δ + 1 close to O4′-endo | 9 | 44 | 101 | 192 | 281 | 297 | 182 | 44 | 99 | 210 | 211 | 25 |
| AI–BI, with δ C3′-, δ + 1 C2′-endo | 32 | 54 | 86 | 194 | 281 | 301 | 179 | 55 | 142 | 214 | 251 | 41 |
| AI–BI, with δ O4′-, δ + 1 C2′-endo | 34 | 54 | 99 | 186 | 274 | 297 | 178 | 51 | 141 | 235 | 264 | 47 |
| BI–AI, with δ + 1 O4′-endo | 100 | 51 | 130 | 183 | 267 | 297 | 171 | 51 | 106 | 250 | 239 | 32 |
| BII–AI, with an α + 1/γ + 1 switch, high β + 1 | 9 | 49 | 146 | 257 | 186 | 60 | 224 | 196 | 90 | 260 | 200 | 110 |
| BI variation in complexes | 412 | 45 | 137 | 178 | 255 | 304 | 187 | 45 | 139 | 252 | 256 | 58 |
| BII variation in complexes | 269 | 43 | 140 | 201 | 216 | 314 | 156 | 46 | 140 | 261 | 253 | 86 |
| BI, with an α + 1/γ + 1 switch | 109 | 46 | 139 | 195 | 245 | 32 | 196 | 296 | 150 | 252 | 253 | 116 |
| BI, 3′-mismatches with χ + 1 syn, α + 1/γ + 1 switch | 8 | 50 | 137 | 196 | 225 | 33 | 187 | 295 | 145 | 257 | 70 | 122 |
| AI–BI, 3′-mismatches with χ + 1 syn | 14 | 58 | 91 | 214 | 280 | 295 | 176 | 56 | 139 | 238 | 67 | 121 |
| Z-form, Y–R step | 21 | 54 | 147 | 264 | 76 | 66 | 186 | 179 | 95 | 205 | 61 | 123 |
| ZI-form, R–Y step | 40 | 177 | 96 | 242 | 292 | 210 | 233 | 54 | 144 | 63 | 205 | 124 |
| ZII-form, R–Y step | 18 | 179 | 95 | 187 | 63 | 169 | 162 | 44 | 144 | 58 | 213 | 126 |
‘Description’ is a short annotation of the conformation, ‘N’ is the number of dinucleotides which define the conformation, ‘Clustered Torsions’ are the arithmetic means calculated for the torsions used in the clustering process, with the torsions being defined in Figure 1. Bold font is used merely to indicate the three most important DNA forms.
Figure 2. Two-dimensional scattergrams of torsion angles in naked DNA (Dataset 2 from Table 1, dark blue) and in DNA from complexes (Dataset 1, cyan). A, B and Z are the respective double-helical forms, r stands for purines and y for pyrimidines. The conformations of almost 4000 RNA dinucleotides are plotted as pink dots for comparison.
Torsion angles [°] in nucleotides of the major A- and B-forms
| Conformer | α | β | γ | δ | ɛ | ζ | χ | N |
|---|---|---|---|---|---|---|---|---|
| Canonical A-form (AI) | 294.8 ± 0.9 | 172.7 ± 1.0 | 54.3 ± 0.9 | 82.1 ± 0.7 | 205.6 ± 1.0 | 285.4 ± 0.7 | 200.5 ± 1.0 | 180 |
| AII | 145.6 ± 2.3 | 191.9 ± 2.0 | 182.8 ± 1.7 | 85.0 ± 1.4 | 197.0 ± 2.0 | 289.2 ± 1.7 | 203.4 ± 1.1 | 49 |
| Canonical B-form (BI) | 299.0 ± 0.9 | 179.3 ± 1.0 | 48.4 ± 0.6 | 132.8 ± 1.0 | 181.7 ± 1.0 | 263.2 ± 0.8 | 250.3 ± 1.1 | 418 |
| BII | 292.6 ± 1.3 | 143.1 ± 1.3 | 46.0 ± 0.9 | 143.0 ± 0.9 | 251.1 ± 2.1 | 168.0 ± 1.4 | 277.8 ± 1.4 | 187 |
The data were obtained through the analysis of 118 naked (noncomplexed) DNA structures from Dataset 3 (Table 1). The 95% confidence intervals for the mean values of the torsion angles were computed under the assumption that the angles are normally distributed. N is the number of observations of each conformer; AI corresponds to Cluster 8, AII to Cluster 19, BI to Cluster 54 and BII to Cluster 98.
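A "mean ± half-width" entry of the kind tabulated above can be sketched as follows. This is only an illustration under stated assumptions: it uses the normal quantile (about 1.96 for 95%), whereas this record does not say whether a normal or a Student t quantile was used; the function name `mean_with_ci` and the sample values are hypothetical, and the angles are assumed not to straddle the 0°/360° wrap-around.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_with_ci(angles_deg, level=0.95):
    """Return (mean, half_width) of a normal-theory confidence interval.

    Assumes the angles are roughly normally distributed and do not
    wrap around 0/360 degrees (otherwise a plain mean is misleading).
    """
    n = len(angles_deg)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    half_width = z * stdev(angles_deg) / sqrt(n)
    return mean(angles_deg), half_width

# Hypothetical sample of a torsion angle, in degrees (not data from the paper).
sample = [290.0, 295.0, 300.0, 305.0]
m, h = mean_with_ci(sample)
```

Because the half-width shrinks with the square root of N, conformers with many observations (such as BI) are expected to show narrower intervals than sparsely populated ones (such as AII) for a comparable spread.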
The violation of the uniformity of dinucleotide representation for purines (R) and pyrimidines (Y) between A, B and combined conformational families as measured by the standardized Pearson residuals
The underrepresented sequences are indicated by a gray background, the overrepresented are in bold, both exceeding the critical value of ±2.50 (test at the 5% significance level).
The violation of the homogeneity of purine (R) and pyrimidine (Y) dinucleotide steps between AI, AII and nonclassified (RestA) conformers in the A-form double helices as measured by the standardized Pearson residuals
The underrepresented sequences are indicated by a gray background, the overrepresented are in bold, both exceeding the critical value of ±2.87 (test at the 5% significance level).
The violation of the dinucleotide homogeneity for sequences between BI, BII, A/B, B/A and unclassified dinucleotides (RestB) in the B-form double helices measured by the standardized Pearson residuals
The underrepresented sequences are indicated by a gray background, the overrepresented are in bold, both exceeding the critical value of ±3.42 (test at the 5% significance level).
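The residual tables themselves are not reproduced in this record, but the B-form counts from Dataset 3 are, so the calculation can be sketched. The code below applies the textbook standardized (adjusted) Pearson residual, (observed − expected) / sqrt(expected · (1 − row share) · (1 − column share)), to those counts and flags cells beyond the ±3.42 threshold quoted in the caption. Whether the authors used exactly this formula is an assumption. The helper `bonferroni_critical` reflects a separate observation, not a statement from the source: a two-sided Bonferroni correction over the 5 × 16 = 80 cells gives a value close to 3.42.

```python
from math import sqrt
from statistics import NormalDist

STEPS = ["AA", "AG", "GA", "GG", "AC", "AT", "GC", "GT",
         "CA", "CG", "TA", "TG", "CC", "CT", "TC", "TT"]

# Counts copied from the B-form rows of the Dataset 3 table in this record.
B_COUNTS = {
    "BI":    [53, 16, 39, 4, 5, 29, 12, 9, 1, 48, 12, 8, 11, 12, 31, 50],
    "BII":   [4, 2, 10, 18, 5, 1, 35, 2, 31, 33, 10, 18, 0, 0, 0, 0],
    "A/B":   [2, 1, 0, 0, 0, 4, 1, 0, 3, 41, 8, 0, 2, 4, 1, 8],
    "B/A":   [17, 2, 2, 0, 3, 19, 37, 7, 0, 1, 0, 0, 2, 2, 11, 8],
    "RestB": [9, 2, 10, 1, 4, 5, 33, 1, 4, 19, 1, 0, 3, 3, 10, 6],
}

def standardized_residuals(table):
    """Standardized Pearson residuals for a rows-by-columns count table."""
    rows = list(table)
    n_cols = len(table[rows[0]])
    row_tot = {r: sum(table[r]) for r in rows}
    col_tot = [sum(table[r][j] for r in rows) for j in range(n_cols)]
    n = sum(row_tot.values())
    result = {}
    for r in rows:
        result[r] = []
        for j in range(n_cols):
            expected = row_tot[r] * col_tot[j] / n
            var = expected * (1 - row_tot[r] / n) * (1 - col_tot[j] / n)
            result[r].append((table[r][j] - expected) / sqrt(var))
    return result

def bonferroni_critical(n_cells, alpha=0.05):
    """Two-sided normal critical value with a Bonferroni correction (assumed)."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_cells))

CRITICAL = 3.42  # threshold quoted in the caption for the B-form table
residuals = standardized_residuals(B_COUNTS)
flagged = [(conf, step, round(value, 2))
           for conf in residuals
           for step, value in zip(STEPS, residuals[conf])
           if abs(value) > CRITICAL]
```

Each entry of `flagged` is a (conformer, step, residual) triple; positive residuals mark overrepresented steps and negative ones underrepresented steps. With these counts the CG step in the A/B conformer is flagged as overrepresented, which agrees with the summary of B-DNA preferences in this record.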
A summary of the conformational preferences of dinucleotide steps in B-DNA helices
| Sequence | AA (RR) | AG (RR) | GA (RR) | GG (RR) | AC (RY) | AT (RY) | GC (RY) | GT (RY) | CA (YR) | CG (YR) | TA (YR) | TG (YR) | CC (YY) | CT (YY) | TC (YY) | TT (YY) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Conformation | BI | BI | BI | BII | – | B/A | B/A, RestB, BII | – | BII | A/B, (BII) | – | BII | (BI) | (BI) | BI | BI |
Some sequences were not assigned any conformational preference because of their low representation in the whole data set.
Figure 3. Dinucleotide conformations in the crystal structure of the histone-core particle 1KX5 (89). Dinucleotides are classified into four conformational families and labeled as follows: 1, BI; 2, BII; 3, BI conformers with an α + 1/γ + 1 switch (Clusters 113–117); 4, unclassifiable conformers. One DNA chain, labeled I in the PDB file, is drawn in blue against the left y axis and traced from the 5′-end to the 3′-end. The other chain, labeled J in the PDB file, is drawn in red against the right y axis and traced from the 3′-end to the 5′-end. Base-paired nucleotides from chains I and J therefore have the same x coordinate.