Mona Saad, Aurore Guédin, Souheila Amor, Amina Bedrat, Nicolas J Tourasse, Hussein Fayyad-Kazan, Geneviève Pratviel, Laurent Lacroix, Jean-Louis Mergny.
Abstract
G-quadruplexes (G4) are non-canonical DNA and/or RNA secondary structures formed in guanine-rich regions. Given their over-representation in specific regions of the genome such as promoters and telomeres, they are likely to play important roles in key processes such as transcription, replication or RNA maturation. Putative G4-forming sequences (G4FS) have been reported in humans, yeast, bacteria, viruses and many other organisms. Here we present the first mapping of G-quadruplex sequences in Dictyostelium discoideum, the social amoeba. 'Dicty' is an ameboid protozoan with a small (34 Mb) and extremely AT-rich genome (78%). As a consequence, very few G4-prone motifs are expected. An in silico analysis of the Dictyostelium genome with the G4Hunter software detected 249–1055 G4-prone motifs, depending on the chosen G4Hunter threshold. Interestingly, despite an even lower GC content (as compared to the whole Dicty genome), the density of G4 motifs in Dictyostelium promoters and introns is significantly higher than in the rest of the genome. Fourteen selected sequences located in important genes were characterized by a combination of biophysical and biochemical techniques. Our data show that these sequences form highly stable G4 structures under physiological conditions. Five Dictyostelium genes containing G4-prone motifs in their promoters were studied for the effect of a new G4-binding porphyrin derivative on their expression. Our results demonstrated that the new ligand significantly decreased their expression. Overall, our results constitute the first step towards adopting Dictyostelium discoideum as a 'G4-poor' model for studies on G-quadruplexes.
Year: 2019 PMID: 30923812 PMCID: PMC6511855 DOI: 10.1093/nar/gkz196
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Number, density and relative density of G4 motifs found in D. discoideum genome with G4Hunter. For each genomic feature, we report the number of G4 motifs that overlap the feature coordinates. We did not take into account the strand information for the feature and the G4
| G4Hunter threshold | 1.5 | | | 1.75 | | | 2 | | |
|---|---|---|---|---|---|---|---|---|---|
| Feature | n | density (a) | relative (b) | n | density | relative | n | density | relative |
| Prom500 (c) | 349 | 0.044 | | 160 | 0.020 | | 85 | 0.011 | |
| Prom1kb (c) | 612 | 0.043 | | 274 | 0.020 | | 154 | 0.011 | |
| Trans | 582 | 0.024 | 0.76 | 237 | 0.010 | 0.73 | 122 | 0.005 | 0.67 |
| CDS | 328 | 0.016 | 0.52 | 102 | 0.005 | 0.37 | 41 | 0.002 | 0.28 |
| Exons | 464 | 0.020 | 0.64 | 165 | 0.007 | 0.52 | 68 | 0.003 | 0.39 |
| Introns | 157 | 0.063 | | 94 | 0.037 | | 62 | 0.025 | |
| Intergene | 490 | 0.056 | | 229 | 0.026 | | 131 | 0.015 | |
| Whole genome | | | 1.00 | | | 1.00 | | | 1.00 |

(a) Number of hits per kb: number of hits in the left column divided by the length of the feature.
(b) Relative density as compared to density for the whole D. discoideum genome = 1.00. Higher than average densities are shown with italic digits.
(c) See comment on promoter definition in the Materials and Methods section; these regions may also include 5′ UTR regions.
(d) Total number is not the sum of categories above as a G4 sequence may be listed in several categories.
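The two quantities defined in the density and relative-density footnotes reduce to two divisions. The sketch below is illustrative only: the function names are hypothetical, and the whole-genome inputs (1055 motifs at threshold 1.5 over roughly 34 Mb) are the rounded figures quoted in the abstract rather than the exact feature lengths used for the table, so results may differ from the tabulated values in the last digit.

```python
def density_per_kb(n_hits: int, feature_length_bp: int) -> float:
    """Number of G4 motifs per kilobase of feature."""
    if feature_length_bp <= 0:
        raise ValueError("feature length must be positive")
    return n_hits / (feature_length_bp / 1000)


def relative_density(feature_density: float, genome_density: float) -> float:
    """Feature density divided by the whole-genome density (genome = 1.00)."""
    if genome_density <= 0:
        raise ValueError("genome density must be positive")
    return feature_density / genome_density


# Assumption: 34 Mb is the rounded genome size from the abstract, so this
# is only a rough estimate of the whole-genome density at threshold 1.5.
genome = density_per_kb(1055, 34_000_000)
print(round(genome, 3))  # about 0.031 motifs per kb
```

A relative density above 1.00 then simply means the feature carries more motifs per kb than the genome average.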
Figure 1. (A) The density of G4 motifs (G4FS) was calculated by dividing the number of G4FS found with G4Hunter (threshold = 1.5) overlapping the given genomic feature by the total length of the feature in kilobases. The dotted line indicates the average G4FS density for the whole genome. Prom500 and Prom1kb indicate promoters defined as 500 or 1000 bp upstream of genes, respectively. (B) Distribution of the GC content for the different features represented as a boxplot (computed with default parameters in R, v3.5.1). The dotted line indicates the average GC content of the D. discoideum genome. (C) Fraction of promoters (defined as 500 bp upstream of genes) with at least one G4FS for different thresholds. This fraction was calculated by dividing the number of promoters overlapping at least one G4FS at a given threshold by the number of promoter regions.
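The fraction plotted in panel C is an interval-overlap count. A minimal sketch follows, assuming closed coordinate intervals and ignoring strand (as the table caption above states was done for the feature counts); the type alias, function names and toy coordinates are invented for illustration and are not the authors' pipeline.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), end inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def fraction_with_g4(promoters: List[Interval], motifs: List[Interval]) -> float:
    """Promoters overlapping at least one motif, divided by all promoters."""
    if not promoters:
        raise ValueError("no promoter regions given")
    hits = sum(1 for p in promoters if any(overlaps(p, m) for m in motifs))
    return hits / len(promoters)


# Toy coordinates, invented for illustration only.
promoters = [("chr1", 100, 599), ("chr1", 2000, 2499), ("chr2", 100, 599)]
motifs = [("chr1", 590, 615), ("chr2", 700, 725)]
print(fraction_with_g4(promoters, motifs))  # one promoter out of three
```

The nested loop is quadratic; for genome-scale inputs a sorted sweep or an interval tree would be the usual replacement.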
Selected sequences in D. discoideum genome. The selected G4-prone sequences with their name, function, length, probable distance from annotated ORF and the dictyBase Gene ID
| Sequence name | Gene function and name | Gene ID | G4 sequence (5′ → 3′) | Genomic location | DNA | RNA | Oligo. length (nt) | Distance to TSS (bp) (n) |
|---|---|---|---|---|---|---|---|---|
| | Immunoglobulin E-set domain-containing protein | DDB_G0277627 | AGGAGAGTGGGGGTGTGTGTTTGGGTGGGT | Chr2, 8277511–8277540 | x | | 30 | -183 |
| | 40S ribosomal protein S20 | DDB_G0278429 | AGGATTGGGGCTGGGTGTGCGGGA | Chr3, 872263–872240 | x | | 24 | -75 |
| | Phosphatidylethanolamine-binding protein PEBP | DDB_G0283803 | TGGGGGGAAGAGGGGAGAGACAGGGGGAGGTAGG | Chr4, 1122650–1122683 | x | | 34 | -224 |
| | Small GTPase | DDB_G0279305 | GGGGGTGGGGGTGGGGTAAGGGG | Chr3, 1907066–1907088 | x | | 23 | -359 |
| | Unknown | DDB_G0349518 | UGGGAUGGGGUGACAGUUGGGGGGAGAGGGGUAUGAUG | Chr2, 5616940–5616903 | | x | 38 | (a) |
| | Similar to serologically defined breast cancer antigen 84 (sdbcag84/ERGIC3) | DDB_G0280993 | ATGGGAGGTAGATGGTGGGTGGGTGGTGA | Chr3, 3920891–3920919 | x | x | 29 | (b) |
| | Cell migration and cell-cell adhesion (sma) | DDB_G0281803 | AGGGGATGGTTAGCGGTGCGACAGGGGGGAGGAGACAGGGGGA | Chr3, 4732632–4732674 | x | x | 43 | (c) |
| | Hypothetical protein | | AGGTGGGGGTGAGCAGTGTGGTGTGGGT | Chr1, 717709–717682 | x | x | 28 | (d) |
| | Hypothetical protein | DDB_G0276613 | TGGTGGTGGTAGGAAGTGGT | Chr2, 7074056–7074075 | x | x | 20 | (e) |
| | Hypothetical protein | DDB_G0276707 | TGGTGGAGGGGGGTGGGTGGT | Chr2, 7077389–7077409 | x | x | 21 | (f) |
| p5 | Hypothetical protein | | TGGGGTTGGATGGGTGGGT | Chr1, 1755973–1755955; Chr1, 1756455–1756437 | x | | 19 | (g) |
| | Conserved protein, DNAJ heat shock N-terminal domain-containing protein | DDB_G0288639 | TGGGAGGTGGTATGGGAGGTGGAGGC | Chr5, 1758405–1758430 | x | x | 26 | (h) |
| | Putative Ras effector, chemotaxis (dydA) | DDB_G0287875 | TGGGGGGAGGTGGTGGTGGTGGTGGAT | Chr5, 631724–631750 | x | x | 27 | (i) |
| p10 | Short-chain dehydrogenase | DDB_G0295833/DDB_G0273855 | GGGGGAGGGGTACAGGGGTACAGGGG | Chr2, 2545914–2545889; Chr2, 3485872–3485897 | x | | 26 | (j) |
(n) The localization of G4 sequence is determined based on the data from RNA-Seq and EST alignments available in dictyBase.
(a) G4 is localized in 5′ UTR of DDB_G0349518.
(b) G4 is localized in the promoter of DDB_G0280993 (similar to serologically defined breast cancer antigen 84) or in noncoding RNA gene DDB_G394564.
(c) G4 is localized in the promoter of DDB_G0281803 (sma, which plays a role in cell migration and cell-cell adhesion) or in 3′ UTR of DDB_G0281631 (pseudogene).
(d) G4 is localized in 5′ UTR or promoter of DDB_G026833 or within upstream ncRNA gene DDB_G3967196.
(e) G4 is localized in promoter of DDB_G0276613.
(f) G4 is localized in promoter of DDB_G0276707.
(g) Not enough coverage to determine the exact localization (UTR or promoter). There are two copies of p5 flanking the gene DDB_G0268466 (∼100 bp upstream and downstream). In addition, three and five copies of p5 interspersed by ∼400 bp are found in intergenic regions in chromosome 2 (4788017-4788460) and 5 (849448-850835), respectively.
(h) Located in exon 4 of DDB_G0288639 that codes for a conserved heat shock protein from DnaJ homolog subfamily C member 7.
(i) Located in exon 2 of dydA gene (DDB_G0287875) coding for a Ras effector protein playing an important role in chemotaxis in D. discoideum.
(j) p10 can be mapped at 263 and 86 nucleotides upstream of two divergent genes (DDB_G0295833/DDB_G0273855 and DDB_G0273115/DDB_G0273853), respectively. Gene ID DDB_G0295833/DDB_G0273855 is annotated as a short-chain dehydrogenase/reductase (SDR) family protein whereas no functional annotation is available for Gene ID DDB_G0273115/DDB_G0273853. It is likely to be in the promoter region due to proximity to the genes (33). There are two copies of p10 and its neighboring genes as they are part of a 750-kb region that is duplicated in chromosome 2 of strain AX4.
Figure 2. Characterization of p16, p32, p40, p172, p23R, p220, p309, p220R and p309R by different biophysical methods. (A) CD spectra recorded at 25°C; (B) thermal difference spectra (TDS); (C) isothermal difference spectra (IDS) at 25°C. The DNA samples were folded in 100 mM KCl at 4 μM, except p172, which was folded in 10 mM KCl; the RNA sequences were folded in 50 mM KCl. TDS and IDS spectra were normalized to [0;1].
Melting temperatures of the different G4 sequences calculated according to UV absorbance melting profiles recorded at 295 nm, and their G4 conformation concluded from CD signature. R indicates the corresponding RNA sequence. Tm determinations were calculated based on the heating curves and repeated at least twice (Tm values were reproducible within 2–3°C)
| Sequence name | Probable G4 conformation (a) | Tm (°C) | G4Hunter score |
|---|---|---|---|
| | Parallel | 56 (b) | 1.57 |
| | Mixed | 51 (b) | 1.58 |
| | Parallel | 57 (b) | 2.06 |
| | Mixed | 77 (c) | 3.13 |
| | Parallel | 67 (d) | 1.82 |
| | Parallel | 61 (b) | 1.41 |
| | Parallel | 78 (d) | 1.41 |
| | Parallel | 56 (b) | 1.67 |
| | Parallel | 61 (d) | 1.67 |
| | Antiparallel | 42 (b) | 1.46 |
| | Parallel | 60 (d) | 1.46 |
| | Antiparallel | 36 (b) | 1.05 |
| | Parallel | 45 (d) | 1.05 |
| | Parallel | 52 (b) | 2.14 |
| | Parallel | 72 (d) | 2.14 |
| | Parallel | 55 (b) | 2.00 |
| | Parallel | 68 (b) | 1.42 |
| | Parallel | 65 (b) | 1.42 |
| | Parallel | 60 (b) | 1.78 |
| | Parallel | 78 (b) | 1.78 |
| | Antiparallel | >90 (b) | 2.54 |

(a) As inferred from CD shape.
(b) Determined in 100 mM KCl.
(c) Determined in 10 mM KCl.
(d) Determined in 50 mM KCl.
(e) See (40).
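The G4Hunter scores listed in the melting-temperature table can be recomputed for a whole oligonucleotide with the scoring rule commonly described for G4Hunter: each G in a run of n consecutive Gs scores min(n, 4), each C in a run scores the negative equivalent, every other base scores 0, and the score is the mean over the sequence. The sketch below is an independent re-implementation under that assumption, not the G4Hunter software itself; the function name is hypothetical, and it scores the full sequence instead of sliding a window along a genome.

```python
from itertools import groupby


def g4hunter_score(seq: str) -> float:
    """Mean per-base G4Hunter-style score of a DNA or RNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    total = 0
    for base, run in groupby(seq):
        n = len(list(run))
        if base == "G":
            total += n * min(n, 4)  # every G in the run gets min(n, 4)
        elif base == "C":
            total -= n * min(n, 4)  # C runs count negatively
        # A, T and U contribute 0
    return total / len(seq)


print(round(g4hunter_score("GGGGGTGGGGGTGGGGTAAGGGG"), 2))  # 3.13
```

Applied to the 23-nt and 26-nt sequences from the table of selected sequences, this gives 3.13 and 2.54, both of which appear in the score column.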
Figure 3. UV melting and annealing curves recorded at 295 nm after subtraction of the baseline recorded at 335 nm. The DNA samples were folded in 100 mM KCl at 4 μM.
Figure 4. 1D imino proton spectra in 20 mM potassium phosphate buffer at pH 6.9 with 70 mM KCl at 25°C.
Figure 5. (A) Structure of AuMA, a G4-ligand porphyrin derivative. (B) AuMA inhibits the transcription of G4-prone D. discoideum genes. D. discoideum HMX44A cells were treated with increasing concentrations of AuMA (10, 20 and 30 μM) for 24 h, and the transcript levels of genes carrying the p10, p187, p32, p193 and p40 G4 sequences in their upstream regions were then measured by real-time qRT-PCR. The histograms show the fold change relative to the internal control H3a. The control group is the condition without ligand treatment, for which the variation in relative expression is zero. The data are presented as mean ± SD from two replicate wells and two independent experiments. The data were analyzed in GraphPad Prism and the paired Student's t-test (repeated measures one-way ANOVA) was applied. Asterisks indicate significant differences between treated cells and the control group. (*): (0.001 < P-value < 0.05); (**): (P-value < 0.001).
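The caption does not say which formula was used to derive the relative fold change. One widely used option is the 2^-ΔΔCt method, sketched below purely as an assumption: the function names and Ct values are invented, with the reference gene playing the role of the internal control (H3a in the figure). On a log2 scale the untreated control sits at zero, which would match the caption's description.

```python
def delta_delta_ct(ct_target_treated: float, ct_ref_treated: float,
                   ct_target_control: float, ct_ref_control: float) -> float:
    """ΔΔCt: target normalized to the reference gene, treated minus control."""
    return (ct_target_treated - ct_ref_treated) - (ct_target_control - ct_ref_control)


def fold_change(ddct: float) -> float:
    """Relative expression under the 2^-ΔΔCt assumption."""
    return 2.0 ** (-ddct)


def log2_fold_change(ddct: float) -> float:
    """Log2 of the fold change; equals 0 for the untreated control."""
    return -ddct


# Invented Ct values for illustration only.
ddct = delta_delta_ct(26.0, 18.0, 24.0, 18.0)
print(fold_change(ddct), log2_fold_change(ddct))  # 0.25 -2.0
```

A ΔΔCt of 2 cycles corresponds to a four-fold decrease in transcript level relative to the untreated control.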