Michal Růžička, Přemysl Souček, Petr Kulhánek, Lenka Radová, Lenka Fajkusová, Kamila Réblová.
Abstract
Mutations can be induced by environmental factors, but they also arise spontaneously during DNA replication or through deamination of methylated cytosines at CpG dinucleotides. Sites where mutations occur more frequently than expected by chance are termed hotspots, while sites that rarely contain mutations are termed coldspots. Mutations are continuously detected and corrected by repair systems. Among them, mismatch repair targets base pair mismatches, which are discriminated from canonical base pairs by probing the altered elasticity of DNA. Using biased molecular dynamics simulations, we investigated the elasticity of coldspot and hotspot motifs detected in human genes associated with inherited disorders, as well as motifs with Czech population hotspots and de novo mutations. We focused mainly on mutations leading to G/T and A+/C pairs. We observed that hotspots without CpG/CpHpG sequences are less flexible than coldspots, which indicates that flexible sequences are repaired more effectively. In contrast, hotspots with CpG/CpHpG sequences exhibited increased flexibility, similar to coldspots. Their mutability is more likely related to spontaneous deamination of methylated cytosines leading to C > T mutations, which are primarily targeted by base excision repair. We corroborated the conclusions drawn from computer simulations by measuring melting curves of hotspots and coldspots containing a G/T mismatch.
Keywords: DNA bending; MutS protein; free energy calculations; hotspots–coldspots; mutations
Year: 2019 PMID: 31230075 PMCID: PMC6704406 DOI: 10.1093/dnares/dsz013
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
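The abstract separates hotspots that contain CpG or CpHpG sequences from those that do not. The sketch below shows one way such a split could be made for 5-nt motifs. It is an illustration only, not the authors' procedure: the function name `classify_motif` and the rule that CpG takes priority over CpHpG are assumptions. H follows the IUPAC code (A, C or T).

```python
import re

# IUPAC nucleotide code: H stands for A, C or T (any base except G).
CPG = re.compile(r"CG")
CPHPG = re.compile(r"C[ACT]G")


def classify_motif(motif: str) -> str:
    """Label a motif as 'CpG', 'CpHpG' or 'other'.

    Assumption for this sketch: a motif containing CG is labelled 'CpG'
    even if it also contains a CHG triplet.
    """
    seq = motif.upper()
    if CPG.search(seq):
        return "CpG"
    if CPHPG.search(seq):
        return "CpHpG"
    return "other"


if __name__ == "__main__":
    for motif in ["ACGGC", "CTGTG", "CAGTA", "AGGTA", "AAAAA"]:
        print(motif, classify_motif(motif))
```

Scanning one strand is enough here: CG is its own reverse complement, CAG and CTG are reverse complements of each other, and CCG already contains CG.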
Figure 1. Base pair geometries: (A) Canonical WC base pairs A-T and G=C. (B) Mismatches A+/C and G/T used for calculations of bending free energies. (C) The remaining tested mismatches: A/A, C+/C, G/A, G/G, T/C and T/T.
Figure 2. (A) Superimposed bent X-ray DNA structure (from PDB ID 2O8B; green) and relaxed MD structure (blue); the structures are superimposed over the lower part. In the MD structure, the central 5-nt segment, which was either a coldspot or a hotspot, is shown in cyan, and the G/T mismatch is highlighted in sphere representation. (B) Detail of the G/T pair in the X-ray structure and (C) in the MD structure.
Table 2. Bending free energies of hotspots with CpG and CpHpG sequence
| Type | Motif 5’→3’/5’→3’ | Fisher combined | Bending free energy for G/T (kcal/mol) | Bending free energy for A+/C (kcal/mol) | |
|---|---|---|---|---|---|
| CpG | ACGGC/GCCGT | 5.51E-05 | 12.0 | 12.4 | 0.08 |
| CpG | CCGAG/CTCGG | 2.11E-05 | 13.1 | 11.8 | 0.52 |
| CpG | TCGCA/TGCGA | 2.69E-08 | 13.2 | 13.1 | 0.58 |
| CpHpG | CTGTG/CACAG | 2.97E-06 | 13.0 | 11.8 | 0.46 |
| CpHpG | CTGGG/CCCAG | 2.24E-05 | 12.8 | 11.5 | 0.35 |
| CpHpG | CAGTA/TACTG | 2.87E-04 | 12.3 | 12.8 | 0.14 |
| Average values | | | 12.7 | 12.2 | |
Bending free energies smaller than 13.0 kcal/mol are in grey fields.
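The motif column lists both strands in the 5’→3’ direction. The sketch below is a minimal check that each second strand is the reverse complement of the first, assuming the notation describes a Watson-Crick duplex. The helper names `reverse_complement` and `strands_pair` are hypothetical, and the motifs are copied from the CpG/CpHpG table above.

```python
# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strands_pair(entry: str) -> bool:
    """Check that 'TOP/BOTTOM' lists a strand and its reverse complement."""
    top, bottom = entry.split("/")
    return reverse_complement(top) == bottom.upper()


# Motifs copied from the CpG/CpHpG hotspot table.
CPG_TABLE_MOTIFS = [
    "ACGGC/GCCGT",
    "CCGAG/CTCGG",
    "TCGCA/TGCGA",
    "CTGTG/CACAG",
    "CTGGG/CCCAG",
    "CAGTA/TACTG",
]

if __name__ == "__main__":
    for entry in CPG_TABLE_MOTIFS:
        print(entry, strands_pair(entry))
```

The same check can be run on the motif columns of the other tables to catch transcription errors.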
Bending free energies of 10 Czech population hotspots with G/T and A+/C
| Gene | Mutation at protein level, DNA level, transcript number | Number of alleles | Motif 5’→3’/5’→3’ | Bending free energy for G/T (kcal/mol) | Bending free energy for A+/C (kcal/mol) | |
|---|---|---|---|---|---|---|
| | p.Tyr521Cys, c.1562A>G, NM_001139.2 | 6/12 | TTACC/GGTAA | 14.0 | 14.1 | 0.89 |
| | p.Arg234Term, c.700C>T, NM_021628.2 | 6 | | 13.1 | 14.4 | 0.51 |
| | p.Arg142His, c.425G>A, NM_000359.2 | 1/2 | | 12.5 | 13.7 | 0.21 |
| | p.Arg448His, c.1343G>A, NM_000070.2 | 2/4 | | 13.0 | 12.4 | 0.43 |
| | p.Arg758Cys, c.2272C>T, NM_213599.2 | 1/5 | | 11.8 | 12.6 | 0.05 |
| | p.Arg77Cys, c.229C>T, NM_000023.3 | 4/6 | | 11.7 | 11.8 | 0.04 |
| | p.Gly592Glu, c.1775G>A, NM_000527.4 | 104/188 | | 12.4 | 12.7 | 0.17 |
| | p.Arg408Trp, c.1222C>T, NM_000277.2 | 560 | | 13.1 | 11.8 | 0.52 |
| | p.Trp779Term, c.2336G>A, NM_000053.3 | 14 | GTGGC/GCCAC | 13.1 | 12.0 | 0.49 |
| | p.Arg894Term, c.2680C>T, NM_000083.2 | 38 | | 13.1 | 13.9 | 0.52 |
| Average values | | | | 12.8 | 12.9 | |
CpG or CpHpG motifs are in bold, and energies below 13 kcal/mol are in grey fields.
Number of alleles reported in our publications (see above) / number of alleles detected to date, if different.
This motif is already included among the top hotspots with CpG (see Table 2).
Figure 3. Melting curves of oligonucleotide duplexes with the G/T mismatch, acquired using RotorGene software; the derivative of the raw fluorescence with respect to temperature (dF/dT) is shown. (A) Systems with 0 G=C base pairs, (B) systems with 1 G=C base pair and (C) systems with 2 G=C base pairs. Bending free energies of the analysed motifs are indicated. (D) The melting curve fitted by an exponential function in the range of 52–90°C.
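The caption above describes melting curves plotted as the derivative of fluorescence with respect to temperature. The sketch below illustrates that idea on synthetic data, not the authors' measurements or the RotorGene algorithm. It assumes fluorescence falls as the duplex melts and uses the common convention of reading the melting temperature from the peak of -dF/dT. The function `melting_temperature`, the sigmoidal test curve and the value `TM_TRUE` are all hypothetical.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at which -dF/dT is largest.

    The derivative is estimated with central finite differences,
    so the first and last points are skipped.
    """
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        d_f = fluorescence[i + 1] - fluorescence[i - 1]
        d_t = temps[i + 1] - temps[i - 1]
        slope = -d_f / d_t
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Synthetic sigmoidal melting curve (hypothetical values, for illustration).
TM_TRUE = 60.0  # assumed melting temperature in degrees Celsius
TEMPS = [40.0 + 0.5 * i for i in range(101)]  # 40 to 90 degrees Celsius
FLUOR = [1.0 / (1.0 + math.exp((t - TM_TRUE) / 1.5)) for t in TEMPS]

if __name__ == "__main__":
    print(melting_temperature(TEMPS, FLUOR))
```

Real fluorescence traces are noisy, so a smoothing step before differentiation would normally be needed; it is omitted here to keep the sketch short.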
Table 1. Bending free energies of top 10 coldspots and top 10 hotspots with G/T and A+/C pairs
| ID | Motif 5’→3’/5’→3’ | Fisher combined | Bending free energy for G/T (kcal/mol) | Bending free energy for A+/C (kcal/mol) | |
|---|---|---|---|---|---|
| Coldspots | | | | | |
| C01 | AAGAA/TTCTT | 1.27E-06 | 12.4 | 12.4 | 0.17 |
| C02 | CAGTG/CACTG | 9.22E-05 | 12.4 | 12.8 | 0.17 |
| C03 | AAAGA/TCTTT | 9.37E-09 | 11.6 | 12.3 | 0.03 |
| C04 | AAAAT/ATTTT | 1.08E-08 | 13.0 | 14.1 | 0.46 |
| C05 | AAAAA/TTTTT | 1.36E-08 | 11.9 | 12.7 | 0.06 |
| C06 | AGAAA/TTTCT | 1.78E-08 | 13.4 | 13.9 | 0.68 |
| C07 | GAAAA/TTTTC | 4.50E-08 | 12.2 | 13.1 | 0.12 |
| C08 | GAAGA/TCTTC | 6.47E-07 | 11.6 | 12.5 | 0.03 |
| C09 | GGAGA/TCTCC | 1.22E-06 | 13.2 | 12.7 | 0.59 |
| C10 | GGAAA/TTTCC | 3.04E-06 | 13.2 | 14.0 | 0.55 |
| Average values | | | 12.5 | 13.1 | |
| Hotspots | | | | | |
| H01 | AGGTA/TACCT | 1.24E-11 | 13.7 | 14.0 | 0.80 |
| H02 | TGGAA/TTCCA | 6.58E-04 | 14.4 | 14.5 | 0.95 |
| H03 | AGGTG/CACCT | 2.44E-03 | 14.2 | 13.6 | 0.93 |
| H04 | TGAGT/ACTCT | 1.37E-03 | 14.2 | 14.5 | 0.93 |
| H05 | TGGCT/AGCCA | 8.02E-04 | 14.4 | 14.7 | 0.96 |
| H06 | ACATG/CATGT | 5.38E-04 | 12.6 | 13.2 | 0.26 |
| H07 | GGGCA/TGCCC | 4.70E-03 | 13.0 | 13.3 | 0.48 |
| H08 | TTGTA/TACAA | 4.79E-03 | 13.7 | 12.9 | 0.82 |
| H09 | GGATG/CATCC | 4.90E-03 | 13.6 | 13.3 | 0.78 |
| H10 | GCATG/CATGC | 7.46E-03 | 12.5 | 12.5 | 0.22 |
| Average values | | | 13.6 | 13.7 | |
Bending free energies smaller than 13.0 kcal/mol are in grey fields.
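The averages and the 13.0 kcal/mol shading threshold in the table above can be recomputed directly from the listed G/T values. The sketch below does this for the ten coldspots and ten hotspots; the function name `summarize` and the returned dictionary layout are choices made for this example, while the numbers are copied from the table.

```python
from statistics import mean

THRESHOLD = 13.0  # kcal/mol, the cut-off used for the grey fields

# G/T bending free energies (kcal/mol) copied from the table, C01-C10 and H01-H10.
COLDSPOTS_GT = [12.4, 12.4, 11.6, 13.0, 11.9, 13.4, 12.2, 11.6, 13.2, 13.2]
HOTSPOTS_GT = [13.7, 14.4, 14.2, 14.2, 14.4, 12.6, 13.0, 13.7, 13.6, 12.5]


def summarize(values):
    """Return the mean energy and the number of values below the threshold."""
    return {
        "mean": mean(values),
        "below_threshold": sum(v < THRESHOLD for v in values),
    }


if __name__ == "__main__":
    for label, values in [("coldspots", COLDSPOTS_GT), ("hotspots", HOTSPOTS_GT)]:
        stats = summarize(values)
        print(f"{label}: mean {stats['mean']:.1f} kcal/mol, "
              f"{stats['below_threshold']} of {len(values)} below {THRESHOLD}")
```

In this reading, a lower bending free energy corresponds to a more flexible motif, which matches the abstract's statement that hotspots without CpG/CpHpG sequences are less flexible than coldspots.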
Bending free energies of motifs with 15 de novo mutations calculated for G/T and A+/C mismatches
| Gene | Mutation at protein level, DNA level and transcript number | Motif 5’→3’/5’→3’ | Bending free energy for G/T (kcal/mol) | Bending free energy for A+/C (kcal/mol) | |
|---|---|---|---|---|---|
| | p.Val130Met, c.388G>A, NM_000022 | TGGTG/CACCA | 14.2 | 14.2 | 0.93 |
| | p.Gly232Glu, c.695G>A, NM_000206.2 | TGGAA/TTCCA | 14.4 | 14.5 | 0.95 |
| | p.Leu349Pro, c.1046T>C, NM_000062.2 | GCTCT/AGAGC | 13.2 | 13.5 | 0.56 |
| | p.Lys658Glu, c.1972A>G, NM_139276.2 | ATAAG/CTTAT | 14.8 | 13.3 | 0.98 |
| | p.His115Arg, c.344A>G, NM_000377 | CCACA/TGTGG | 12.4 | 13.1 | 0.18 |
| | p.Asp229Gly, c.686A>G, NM_000277.2 | AGACG/CGTCT | 13.3 | 13.4 | 0.62 |
| | p.Phe263Ser, c.788T>C, NM_000277.2 | CTTCC/GGAAG | 13.1 | 13.8 | 0.51 |
| | p.Ile406Met, c.1218A>G, NM_000277.2 | ATACC/GGTAT | 13.8 | 13.3 | 0.83 |
| | p.Met1Ile, c.3G>A, NM_000277.2 | ATGTC/GACAT | 14.1 | 13.6 | 0.91 |
| | p.Ser16Pro, c.46T>C, NM_000277.2 | TCTCT/AGAGA | 12.9 | 12.9 | 0.38 |
| | p.Ser40Leu, c.119C>T, NM_000277.2 | CTCAC/GTGAG | 13.9 | 12.7 | 0.86 |
| | p.Lys42Glu, c.124A>G, NM_000277.2 | TCAAA/TTTGA | 14.1 | 13.1 | 0.92 |
| | p.Val45Ala, c.134T>C, NM_000277.2 | AGTTG/CAACT | 13.2 | 14.2 | 0.58 |
| | p.Tyr302Cys, c.905A>G, NM_000083.2 | CTACT/AGTAG | 13.1 | 12.9 | 0.51 |
| | p.Gln223Term, c.667C>T, NM_173483.3 | GCCAA/TTGGC | 13.6 | 12.8 | 0.78 |
| Average values | | | 13.6 | 13.4 | |
Bending free energies smaller than 13.0 kcal/mol are in grey fields.
The motif TGGAA/TTCCA is already included in Table 1 as H02.