Jean-Michel Garant, Mikael J Luce, Michelle S Scott, Jean-Pierre Perreault.
Abstract
G-quadruplexes (G4) are tetrahelical structures formed from planar arrangements of guanines in nucleic acids. A simple, regular motif was originally proposed to describe G4-forming sequences. More recently, however, G4 formation was found to depend, at least in part, on the context of neighboring sequences. Predicting G4 folding is thus becoming more challenging as G4 outlier structures, not described by the originally proposed motif, are increasingly reported. These observations call for a comprehensive tool capable of consolidating the expanding information on tested G4s, so that systematic comparative analyses of G4-promoting sequences can be conducted. The G4RNA Database we propose was designed to help meet the need for easily retrievable data on known RNA G4s. A user-friendly, flexible query system allows retrieval of data on experimentally tested sequences, from many separate genes, to assess G4-folding potential. Query output sorts data according to sequence position, G4 likelihood, experimental outcomes and associated bibliographical references. Given the growing interest in G4s, G4RNA also provides an ideal foundation on which to collect and store additional sequence and experimental data.
Year: 2015 PMID: 26200754 PMCID: PMC5630937 DOI: 10.1093/database/bav059
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic of an RNA G4 structure and the regular expression used to predict this motif, where N refers to any base, including guanine, and x ≥ 3.
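The regular expression itself is not reproduced in the caption text. As a minimal sketch, the commonly cited canonical form can be written as four runs of x ≥ 3 guanines separated by three loops of N. The loop length of 1 to 7 nucleotides is an assumption taken from the widely used canonical motif, not from this caption, and the names `G4_MOTIF` and `find_canonical_g4` are illustrative only.

```python
import re

# Assumption: loops of 1-7 nucleotides (the widely used canonical form).
# The caption only states that N is any base, including guanine, and x >= 3.
G4_MOTIF = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_canonical_g4(sequence: str):
    """Return (start, end, match) tuples for non-overlapping canonical G4 motifs.

    DNA-style input is accepted: T is converted to U before matching.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(rna)]


hits = find_canonical_g4("GGGAGGGAGGGAGGG")
no_hits = find_canonical_g4("GGAGGAGGAGG")  # runs of only two guanines
```

A sequence that fails this pattern is not necessarily unable to fold into a G4; as the abstract notes, outlier structures not described by the regular motif are increasingly reported.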
Figure 2. Cropped screen capture of a G4RNA query. It displays the gene symbol, location in the mRNA, nucleotide length, sequence and reference of wild-type sequences presenting an ‘AAUAAA’ polyadenylation signal, sorted by their location in the mRNA.
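The kind of query shown in Figure 2 can be mimicked on a small in-memory list. This is only an illustrative sketch: it does not use or reflect the actual G4RNA interface or schema, and the `Record` fields, the `query_wild_type` function, and the sample records (gene names, sequences, references) are invented placeholders rather than database entries.

```python
from dataclasses import dataclass


@dataclass
class Record:
    gene: str        # gene symbol (placeholder values below)
    location: str    # location in the mRNA, e.g. "5'UTR" or "3'UTR"
    sequence: str    # RNA sequence
    wild_type: bool  # False for mutated or artificial sequences
    reference: str   # bibliographical reference (placeholder values below)


def query_wild_type(records, signal="AAUAAA"):
    """Select wild-type records containing `signal`, sorted by mRNA location."""
    hits = [r for r in records if r.wild_type and signal in r.sequence]
    hits.sort(key=lambda r: r.location)
    # Columns follow the caption: gene, location, length, sequence, reference.
    return [(r.gene, r.location, len(r.sequence), r.sequence, r.reference)
            for r in hits]


# Invented placeholder records, for illustration only.
SAMPLE = [
    Record("GENE_B", "5'UTR", "GGGAAUAAAGGG", True, "ref-2"),
    Record("GENE_A", "3'UTR", "GGGAGGGAAUAAAGGG", True, "ref-1"),
    Record("GENE_C", "3'UTR", "GGGAGGGAGGG", True, "ref-3"),   # no signal
    Record("GENE_D", "3'UTR", "AAUAAAGGG", False, "ref-4"),    # not wild-type
]

rows = query_wild_type(SAMPLE)
```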
Distribution of sequences in G4RNA database
| | Locations | Total | Confirmed G4s | Denied G4s | Inconclusive results |
|---|---|---|---|---|---|
| Wild-type sequences | 5′UTR | 99 | 79 | 17 | 3 |
| | 3′UTR | 45 | 41 | 0 | 4 |
| | Exonic coding | 12 | 10 | 2 | 0 |
| | Intronic | 8 | 8 | 0 | 0 |
| | TERRA | 1 | 1 | 0 | 0 |
| | Total | 165 | 139 | 19 | 7 |
| All sequences | 5′UTR | 218 | 108 | 106 | 4 |
| | 3′UTR | 72 | 45 | 23 | 4 |
| | Exonic | 21 | 17 | 4 | 0 |
| | Intronic | 12 | 9 | 2 | 1 |
| | Artificial | 10 | 5 | 5 | 0 |
| | TERRA | 1 | 1 | 0 | 0 |
| | Total | 334 | 185 | 140 | 9 |

| | Method | Total | Confirmed G4s | Denied G4s | Inconclusive results |
|---|---|---|---|---|---|
| Discrete tests | Probing and SHAPE | 167 | 87 | 80 | 0 |
| | Circular dichroism | 136 | 92 | 44 | 0 |
| | Expression assay | 115 | 58 | 54 | 3 |
| | Melting temperature | 104 | 69 | 35 | 0 |
| | NMR | 20 | 19 | 1 | 0 |
| | Native gel mobility | 16 | 14 | 2 | 0 |
| | Other | 20 | 13 | 7 | 0 |
| | Total | 578 | 352 | 223 | 3 |

a G4 validation (the Confirmed G4s, Denied G4s and Inconclusive results columns) presents the outcome of experimental tests.
UTR, Untranslated Region; TERRA, Telomeric Repeat-containing RNA; SHAPE, Selective 2'-Hydroxyl Acylation analyzed by Primer Extension; NMR, Nuclear Magnetic Resonance.
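The counts above can be checked arithmetically: in each row the first count equals the sum of the three validation outcomes, and each column sums to its Total row. The sketch below does this for the wild-type rows and derives the fraction of confirmed G4s per location. The numbers are copied from the table; the names `WILD_TYPE`, `rows_are_consistent` and `confirmed_fraction` are illustrative.

```python
# Wild-type rows: location -> (count, confirmed, denied, inconclusive),
# copied from the table above.
WILD_TYPE = {
    "5'UTR": (99, 79, 17, 3),
    "3'UTR": (45, 41, 0, 4),
    "Exonic coding": (12, 10, 2, 0),
    "Intronic": (8, 8, 0, 0),
    "TERRA": (1, 1, 0, 0),
}
WILD_TYPE_TOTAL = (165, 139, 19, 7)


def rows_are_consistent(rows, total):
    """True if each row sums across outcomes and columns sum to `total`."""
    per_row = all(n == c + d + i for n, c, d, i in rows.values())
    column_sums = tuple(sum(col) for col in zip(*rows.values()))
    return per_row and column_sums == total


def confirmed_fraction(row):
    """Fraction of tested sequences in a row that were confirmed as G4s."""
    count, confirmed, _denied, _inconclusive = row
    return confirmed / count


consistent = rows_are_consistent(WILD_TYPE, WILD_TYPE_TOTAL)
fractions = {loc: confirmed_fraction(row) for loc, row in WILD_TYPE.items()}
```

For example, 79 of the 99 wild-type 5′UTR sequences and 41 of the 45 wild-type 3′UTR sequences were confirmed as G4s.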
Sources of data in the literature
| Journals | Publications | Sequences |
|---|---|---|
| NAR | 12 | 160 |
| RNA | 5 | 49 |
| Biochemistry | 5 | 24 |
| Nature Group | 5 | 15 |
| Journal of Biochemical Chemistry | 3 | 8 |
| Other | 16 | 78 |
| Total | 46 | 334 |
a Nature group comprises Nature, Nature Structural & Molecular Biology and Nature Chemical Biology journals.