Federico Fogolari, Haritha Haridas, Alessandra Corazza, Paolo Viglino, Davide Corà, Michele Caselle, Gennaro Esposito, Luigi E Xodo.
Abstract
BACKGROUND: Independent surveys of human gene promoter regions have demonstrated an overrepresentation of G(3)X(n1)G(3)X(n2)G(3)X(n3)G(3) motifs, which are known to be capable of forming intrastrand quadruple helix structures. In spite of the widely recognized importance of G-quadruplex structures in gene regulation and the growing interest in this unusual DNA structure, only a few such structures are at present available in the Nucleic Acid Database. In the present work we generate, by molecular modeling, feasible G-quadruplex structures which may be useful for the interpretation of experimental data.
Year: 2009 PMID: 19811654 PMCID: PMC2768733 DOI: 10.1186/1472-6807-9-64
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
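The G(3)X(n1)G(3)X(n2)G(3)X(n3)G(3) motif named in the abstract can be located in a sequence with a regular expression. The following is a minimal sketch, not the authors' search procedure: the function name `find_g4_motifs` is hypothetical, and the default loop-length range of 1 to 4 nucleotides is an assumption taken from the loop lengths that appear in the table of modeled quadruplexes.

```python
import re


def find_g4_motifs(seq, min_loop=1, max_loop=4):
    """Find G3-loop-G3-loop-G3-loop-G3 motifs in a DNA sequence.

    Returns a list of (start, end, (n1, n2, n3)) tuples, where start/end are
    0-based half-open coordinates and n1..n3 are the loop lengths.
    The 1-4 loop-length default is an assumption, not a value from the abstract.
    """
    # Each loop is any nucleotide, matched non-greedily so that the
    # shortest loop compatible with the next G-run is preferred.
    loop = r"([ACGT]{%d,%d}?)" % (min_loop, max_loop)
    # The lookahead lets overlapping motifs starting at different
    # positions all be reported.
    pattern = re.compile(r"(?=(GGG" + loop + "GGG" + loop + "GGG" + loop + "GGG))")
    hits = []
    for m in pattern.finditer(seq.upper()):
        start = m.start()
        end = start + len(m.group(1))
        loops = tuple(len(m.group(i)) for i in (2, 3, 4))
        hits.append((start, end, loops))
    return hits


if __name__ == "__main__":
    # Human telomeric repeat, with three TTA loops.
    print(find_g4_motifs("AGGGTTAGGGTTAGGGTTAGGG"))
```

Only one match is reported per start position, so a sequence that admits several loop assignments from the same start is represented by its shortest-loop reading.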
Non-redundant features of the fragments selected from the database.

| syn/anti | a/p | | pairing | counts |
|---|---|---|---|---|
| a s a | a | 4 | 111 | 13 |
| a a a | p | 3 | 000 | 10 |
| s a a | a | 3 | 200 | 7 |
| s s a | a | 3 | 002 | 6 |
| s a a | p | 3 | 200 | 6 |
| a a a | p | 1 | 000 | 6 |
| a s a | a | 4 | 020 | 5 |
| s s a | a | 3 | 220 | 2 |
| a s a | a | 3 | 020 | 2 |
| a a a | p | 4 | 111 | 2 |
| s s a | a | 4 | 111 | 1 |
| s a s | a | 3 | 111 | 1 |
| s a a | p | 2 | 200 | 1 |
| s a a | p | 1 | 200 | 1 |
| s a a | a | 4 | 200 | 1 |
| s a a | a | 4 | 111 | 1 |
| a s a | a | 4 | 202 | 1 |
| a s a | a | 3 | 202 | 1 |
| a a a | p | 2 | 000 | 1 |
syn/anti indicates the conformation at the glycosidic bond of the first three G's; a/p indicates antiparallel/parallel arrangement; the pairings 0, 1 and 2 are described in the text; the column "counts" gives the number of fragments that differ in sequence or conformation.
Figure 1. The standard G-tetrad with reference numbering. When the glycosidic bond angle is anti, the chain progresses above the page; when it is syn, the chain progresses below the page. According to Webba da Silva [56], the sign of the loop topology is positive when the first stem is progressing towards the viewer and the second stem is found rotating clockwise, and negative when it is found rotating anti-clockwise. For example, when the glycosidic bond angle is anti, the topology of a loop connecting the stem of base 1 to the stem of base 2 would be marked with a - sign, and the topology of a loop connecting the stem of base 1 to the stem of base 4 would be marked with a + sign.
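Features of modeled intrastrand G-quadruplexes.

| syn/anti | loop topology | strand polarity | loop 1 length | loop 2 length | loop 3 length | models |
|---|---|---|---|---|---|---|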
| a a a | -p-p-p | ppp | 1 | 1 | 1 | 10 |
| a a a | -p-p-p | ppp | 1 | 1 | 2 | 6 |
| a a a | -p-p-p | ppp | 1 | 1 | 3 | 53 |
| a a a | -p-p-p | ppp | 1 | 2 | 1 | 8 |
| a a a | -p-p-p | ppp | 1 | 2 | 2 | 2 |
| a a a | -p-p-p | ppp | 1 | 2 | 3 | 20 |
| a a a | -p-p-l | ppa | 1 | 2 | 3 | 6 |
| a a a | -p-p-l | ppa | 1 | 2 | 4 | 2 |
| a a a | -p-p-p | ppp | 1 | 3 | 1 | 58 |
| a a a | -p-p-p | ppp | 1 | 3 | 2 | 8 |
| a a a | -p-p-p | ppp | 1 | 3 | 3 | 237 |
| a a a | -p-p-l | ppa | 1 | 3 | 3 | 34 |
| a a a | -p-l-l | pap | 1 | 3 | 3 | 3 |
| a a a | -p-p-l | ppa | 1 | 3 | 4 | 12 |
| a a a | -pd+p | paa | 1 | 4 | 1 | 3 |
| a a a | -pd+p | paa | 1 | 4 | 2 | 10 |
| a a a | -pd+l | ppa | 1 | 4 | 3 | 10 |
| a a a | -p-l-l | pap | 1 | 4 | 3 | 1 |
| a a a | -pd+p | paa | 1 | 4 | 3 | 35 |
| a a a | -p-p-p | ppp | 2 | 1 | 1 | 4 |
| a a a | -p-p-p | ppp | 2 | 1 | 2 | 2 |
| a a a | -p-p-p | ppp | 2 | 1 | 3 | 21 |
| a a a | -p-p-p | ppp | 2 | 2 | 1 | 3 |
| a a a | -p-p-p | ppp | 2 | 2 | 2 | 1 |
| a a a | -p-p-p | ppp | 2 | 2 | 3 | 11 |
| a a a | -p-p-l | ppa | 2 | 2 | 3 | 3 |
| a a a | -p-p-l | ppa | 2 | 2 | 4 | 1 |
| a a a | -p-p-p | ppp | 2 | 3 | 1 | 23 |
| a a a | -p-p-p | ppp | 2 | 3 | 2 | 3 |
| a a a | -p-p-l | ppa | 2 | 3 | 3 | 17 |
| a a a | -p-l-l | pap | 2 | 3 | 3 | 2 |
| a a a | -p-p-p | ppp | 2 | 3 | 3 | 74 |
| a a a | -p-p-l | ppa | 2 | 3 | 4 | 5 |
| a a a | -pd+p | paa | 2 | 4 | 1 | 13 |
| a a a | -pd+p | paa | 2 | 4 | 2 | 3 |
| a a a | -pd+p | paa | 2 | 4 | 3 | 16 |
| a a a | -pd+l | ppa | 2 | 4 | 3 | 1 |
| a a a | -p-l-l | pap | 2 | 4 | 3 | 1 |
| a a a | -p-p-p | ppp | 3 | 1 | 1 | 28 |
| s a s | -p-p-p | ppp | 3 | 1 | 1 | 2 |
| s s a | +l+p+p | aaa | 3 | 1 | 1 | 34 |
| s a a | -p-p-p | ppp | 3 | 1 | 1 | 6 |
| a a a | -p-p-p | ppp | 3 | 1 | 2 | 10 |
| s a a | -p-p-p | ppp | 3 | 1 | 2 | 6 |
| s s a | +l+p+p | aaa | 3 | 1 | 2 | 7 |
| s s a | +l+p+p | aaa | 3 | 1 | 3 | 120 |
| s a a | -p-p-l | ppa | 3 | 1 | 3 | 12 |
| a a a | -p-p-p | ppp | 3 | 1 | 3 | 131 |
| s s a | +l+p+l | paa | 3 | 1 | 3 | 27 |
| s a a | -p-p-p | ppp | 3 | 1 | 3 | 49 |
| a a a | -p-p-l | ppa | 3 | 1 | 3 | 4 |
| a a a | -p-p-l | ppa | 3 | 1 | 4 | 1 |
| s a a | -p-p-l | ppa | 3 | 1 | 4 | 3 |
| s s a | +l+p+l | paa | 3 | 1 | 4 | 6 |
| s s a | +l+p+p | aaa | 3 | 2 | 1 | 10 |
| s a a | -p-p-p | ppp | 3 | 2 | 1 | 6 |
| a a a | -p-p-p | ppp | 3 | 2 | 1 | 9 |
| s a a | -p-p-p | ppp | 3 | 2 | 2 | 2 |
| a a a | -p-p-p | ppp | 3 | 2 | 2 | 3 |
| s s a | +l+p+p | aaa | 3 | 2 | 2 | 3 |
| s a a | -p-p-p | ppp | 3 | 2 | 3 | 22 |
| a a a | -p-p-p | ppp | 3 | 2 | 3 | 33 |
| s s a | +l+p+p | aaa | 3 | 2 | 3 | 46 |
| s a a | -p-p-l | ppa | 3 | 2 | 3 | 6 |
| a a a | -p-p-l | ppa | 3 | 2 | 3 | 9 |
| s s a | +l+p+l | paa | 3 | 2 | 3 | 9 |
| s a a | -p-p-l | ppa | 3 | 2 | 4 | 2 |
| a a a | -p-p-l | ppa | 3 | 2 | 4 | 3 |
| s s a | +l+p+l | paa | 3 | 2 | 4 | 3 |
| s a a | -l-l-p | app | 3 | 3 | 1 | 102 |
| s s a | +l+p+p | aaa | 3 | 3 | 1 | 102 |
| a a a | -p-p-p | ppp | 3 | 3 | 1 | 125 |
| s a a | -p-p-p | ppp | 3 | 3 | 1 | 80 |
| a s a | -l-l-p | app | 3 | 3 | 2 | 2 |
| s a a | -l-l-p | app | 3 | 3 | 2 | 70 |
| s a a | -p-p-p | ppp | 3 | 3 | 2 | 8 |
| a a a | -p-p-p | ppp | 3 | 3 | 2 | 9 |
| s s a | +l+p+p | aaa | 3 | 3 | 2 | 9 |
| s a a | -l-l-l | apa | 3 | 3 | 3 | 10 |
| s s a | +l+l+l | apa | 3 | 3 | 3 | 167 |
| s a a | -p-p-p | ppp | 3 | 3 | 3 | 241 |
| a a a | -p-l-l | pap | 3 | 3 | 3 | 30 |
| s s a | +l+p+p | aaa | 3 | 3 | 3 | 367 |
| s a a | -l-l-p | app | 3 | 3 | 3 | 385 |
| a s a | -ld+l | aap | 3 | 3 | 3 | 3 |
| a a a | -p-p-p | ppp | 3 | 3 | 3 | 450 |
| s a a | -p-p-l | ppa | 3 | 3 | 3 | 66 |
| a a a | -p-p-l | ppa | 3 | 3 | 3 | 68 |
| a s a | -l-l-p | app | 3 | 3 | 3 | 8 |
| s a a | -p-l-l | pap | 3 | 3 | 3 | 91 |
| s s a | +l+p+l | paa | 3 | 3 | 3 | 95 |
| s a a | -p-p-l | ppa | 3 | 3 | 4 | 12 |
| s s a | +l+p+l | paa | 3 | 3 | 4 | 15 |
| a a a | -p-p-l | ppa | 3 | 3 | 4 | 19 |
| s a a | -l-l-l | apa | 3 | 3 | 4 | 19 |
| s s a | +ld-p | ppa | 3 | 4 | 1 | 24 |
| s a a | -pd+p | paa | 3 | 4 | 1 | 6 |
| a a a | -pd+p | paa | 3 | 4 | 1 | 9 |
| a a a | -pd+p | paa | 3 | 4 | 2 | 15 |
| s s a | +ld-p | ppa | 3 | 4 | 2 | 3 |
| s a a | -pd+p | paa | 3 | 4 | 2 | 5 |
| s a a | -pd+l | ppa | 3 | 4 | 3 | 12 |
| a a a | -pd+l | ppa | 3 | 4 | 3 | 13 |
| s a a | -p-l-l | pap | 3 | 4 | 3 | 14 |
| a a a | -p-l-l | pap | 3 | 4 | 3 | 3 |
| s s a | +ld-l | paa | 3 | 4 | 3 | 3 |
| s a a | -pd+p | paa | 3 | 4 | 3 | 44 |
| s s a | +ld-p | ppa | 3 | 4 | 3 | 51 |
| a a a | -pd+p | paa | 3 | 4 | 3 | 80 |
| s s a | +l+l+l | apa | 3 | 4 | 3 | 82 |
| s a a | -l-l-p | app | 4 | 3 | 1 | 2 |
| s a a | -l-l-p | app | 4 | 3 | 2 | 8 |
| a s a | -ld+l | aap | 4 | 3 | 3 | 2 |
| s a a | -l-l-p | app | 4 | 3 | 3 | 33 |
| a s a | d+pd | aap | 4 | 3 | 4 | 114 |
| s a a | d+pd | aap | 4 | 3 | 4 | 8 |
The notation here follows Webba da Silva [56]. syn/anti indicates the conformation at the glycosidic bond of the first three G's. The loop topology is indicated by the letters p (parallel), l (lateral) and d (diagonal), preceded by a + or - sign to indicate clockwise or anti-clockwise rotation when the first strand is progressing towards the viewer (see Figure 1). Similarly, the parallel or antiparallel (a/p) strand polarity in column 3 is given with reference to the first strand, and the order follows the position in the quadruplex (rotating anti-clockwise with the first strand progressing towards the viewer), in general not the sequence order. The next three fields indicate loop lengths and the last field indicates the number of built models found with these features.
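The compact topology codes used in the table (for example `-p-p-l`, `-pd+p` or `d+pd`) can be split into one descriptor per loop. The sketch below is an illustration only: the function name `parse_topology` and the output format are hypothetical, and it assumes, as the codes in the table suggest, that each code describes exactly three loops and that a loop may appear without a sign.

```python
import re

# Loop letters as defined in the table notes (p is also called propeller-like).
LOOP_NAMES = {"p": "propeller", "l": "lateral", "d": "diagonal"}
# Sign convention from the notes: + is clockwise, - is anti-clockwise.
ROTATION = {"+": "clockwise", "-": "anti-clockwise", "": None}

_TOKEN = re.compile(r"([+-]?)([pld])")


def parse_topology(code):
    """Split a topology code into three (loop type, rotation) pairs.

    Rotation is None for loops written without a sign.
    Raises ValueError if the code is not made of exactly three loop tokens.
    """
    tokens = _TOKEN.findall(code)
    rebuilt = "".join(sign + kind for sign, kind in tokens)
    if rebuilt != code or len(tokens) != 3:
        raise ValueError("unrecognized topology code: %r" % code)
    return [(LOOP_NAMES[kind], ROTATION[sign]) for sign, kind in tokens]


if __name__ == "__main__":
    for code in ("-p-p-p", "-pd+p", "d+pd"):
        print(code, parse_topology(code))
```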
Figure 2. Model for a human telomeric DNA G-quadruplex structure (PDB ID: 2HY9). In the stereoview the experimental structure is displayed as a ribbon with schematic representations of sugars and bases, and the model is displayed as solid bonds. The RMSD computed on all backbone atoms is 2.2 Å.
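For reference, the RMSD quoted in the caption is the root of the mean squared distance between matched atoms. The sketch below shows that formula only; it assumes the two coordinate sets are already superimposed and listed in the same atom order, since the fitting procedure is not described in this record. The function name `rmsd` and the toy coordinates are illustrative, not data from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched sets of 3D points.

    Assumes both sets are already superimposed and in the same atom order;
    no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms: every atom is displaced by 2 along z.
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(model, reference))
```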
Topology distribution of model G-quadruplexes.

| topology | number of models |
|---|---|
| -p-p-p | 1764 |
| +l+p+p | 698 |
| -l-l-p | 610 |
| -p-p-l | 285 |
| +l+l+l | 249 |
| -pd+p | 239 |
| +l+p+l | 155 |
| -p-l-l | 145 |
| d+pd | 122 |
| +ld-p | 78 |
| -pd+l | 36 |
| -l-l-l | 29 |
| -ld+l | 5 |
| +ld-l | 3 |

| total number of topologies | total number of models |
|---|---|
| 14 | 4418 |
The distribution of topologies (independent of glycosidic bond conformation and loop lengths) of all 4418 models is reported. The notation here follows Webba da Silva [56]. p, l and d stand for propeller-like, lateral or diagonal loop. The - and + signs refer to anti-clockwise or clockwise rotation of the loop around the G-quadruplex stem, respectively, when the first strand is progressing towards the viewer (see Figure 1).
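The totals in the table can be checked, and the counts turned into fractions, with a few lines of code. This is a small illustrative sketch: the counts are copied from the table above, while the names `TOPOLOGY_COUNTS` and `topology_fractions` are hypothetical.

```python
# Number of models per topology, copied from the table above.
TOPOLOGY_COUNTS = {
    "-p-p-p": 1764,
    "+l+p+p": 698,
    "-l-l-p": 610,
    "-p-p-l": 285,
    "+l+l+l": 249,
    "-pd+p": 239,
    "+l+p+l": 155,
    "-p-l-l": 145,
    "d+pd": 122,
    "+ld-p": 78,
    "-pd+l": 36,
    "-l-l-l": 29,
    "-ld+l": 5,
    "+ld-l": 3,
}


def topology_fractions(counts):
    """Return the fraction of all models that falls in each topology."""
    total = sum(counts.values())
    return {topology: n / total for topology, n in counts.items()}


if __name__ == "__main__":
    print(len(TOPOLOGY_COUNTS), sum(TOPOLOGY_COUNTS.values()))
    for topology, fraction in topology_fractions(TOPOLOGY_COUNTS).items():
        print("%-8s %.3f" % (topology, fraction))
```

The counts add up to the 4418 models over 14 topologies stated in the table, and the all-propeller topology -p-p-p accounts for roughly 40% of the models.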
Figure 3. Model of TMPyP4 docking on the model for the RET promoter G-quadruplex structure. In the stereoview TMPyP4 atoms are shown as van der Waals spheres and the DNA backbone is shown as a tube. The bonds of the residues of the G-quadruplex tetrads are shown.
Human cancer genes containing potential all-parallel G-quadruplexes.

| gene symbol | gene name |
|---|---|
| AKT1 | v-akt murine thymoma viral oncogene homolog 1 |
| ASPSCR1 | alveolar soft part sarcoma chromosome region, candidate 1 |
| ATF1 | activating transcription factor 1 |
| BCL3 | B-cell CLL/lymphoma 3 |
| BRCA2 | familial breast/ovarian cancer gene 2 |
| CARD11 | caspase recruitment domain family, member 11 |
| CDH11 | cadherin 11, type 2, OB-cadherin (osteoblast) |
| CLTCL1 | clathrin, heavy polypeptide-like 1 |
| ELN | elastin |
| EPS15 | epidermal growth factor receptor pathway substrate 15 (AF1p) |
| ERCC2 | excision repair cross-complementing rodent repair deficiency complementation group 2 (xeroderma pigmentosum D) |
| ETV6 | ets variant gene 6 (TEL oncogene) |
| FGFR3 | fibroblast growth factor receptor 3 |
| FNBP1 | formin binding protein 1 (FBP17) |
| FOXP1 | forkhead box P1 |
| FSTL3 | follistatin-like 3 (secreted glycoprotein) |
| GATA1 | GATA binding protein 1 (globin transcription factor 1) |
| HIP1 | huntingtin interacting protein 1 |
| HOXA11 | homeo box A11 |
| HOXA13 | homeo box A13 |
| HOXA9 | homeo box A9 |
| IGK@ | immunoglobulin kappa locus |
| IRF4 | interferon regulatory factor 4 |
| JAZF1 | juxtaposed with another zinc finger gene 1 |
| LHFP | lipoma HMGIC fusion partner |
| MLLT6 | myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 6 (AF17) |
| MSI2 | musashi homolog 2 (Drosophila) |
| MSN | moesin |
| MUC1 | mucin 1, transmembrane |
| MYCL1 | v-myc myelocytomatosis viral oncogene homolog 1, lung carcinoma derived (avian) |
| MYCN | v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (avian) |
| MYC | v-myc myelocytomatosis viral oncogene homolog (avian) |
| PIM1 | pim-1 oncogene |
| POU2AF1 | POU domain, class 2, associating factor 1 (OBF1) |
| PTEN | phosphatase and tensin homolog gene |
| RANBP17 | RAN binding protein 17 |
| RAP1GDS1 | RAP1, GTP-GDP dissociation stimulator 1 |
| RET | ret proto-oncogene |
| SEPT6 | septin 6 |
| SFRS3 | splicing factor, arginine/serine-rich 3 |
| SS18L1 | synovial sarcoma translocation gene on chromosome 18-like 1 |
| TAF15 | TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68 kDa |
| TCF12 | transcription factor 12 (HTF4, helix-loop-helix transcription factors 4) |
| TMPRSS2 | transmembrane protease, serine 2 |
| TRIM33 | tripartite motif-containing 33 (PTC7, TIF1G) |
| TSHR | thyroid stimulating hormone receptor |
| ZNFN1A1 | zinc finger protein, subfamily 1A, 1 (Ikaros) |