Anna Sloutskin, Yehuda M Danino, Yaron Orenstein, Yonathan Zehavi, Tirza Doniger, Ron Shamir, Tamar Juven-Gershon.
Abstract
Core promoter elements play a pivotal role in the transcriptional output, yet they are often detected manually within sequences of interest. Here, we present 2 contributions to the detection and curation of core promoter elements within given sequences. First, the Elements Navigation Tool (ElemeNT) is a user-friendly, web-based, interactive tool for prediction and display of putative core promoter elements and their biologically relevant combinations. Second, the CORE database summarizes ElemeNT-predicted core promoter elements near CAGE and RNA-seq-defined Drosophila melanogaster transcription start sites (TSSs). ElemeNT's predictions are based on biologically functional core promoter elements, and can be used to infer core promoter compositions. ElemeNT does not assume prior knowledge of the actual TSS position, and can therefore assist in annotation of any given sequence. These resources, freely accessible at http://lifefaculty.biu.ac.il/gershon-tamar/index.php/resources, facilitate the identification of core promoter elements as active contributors to gene expression.
Keywords: BRE, TFIIB recognition element; BREd, BRE downstream of the TATA box; BREu, BRE upstream of the TATA box; DCE, downstream core element; DPE, downstream core promoter element; Inr, initiator; MTE, motif 10 element; PWM, position weight matrix; RNAP II, RNA Polymerase II; TAFs, TBP-associated factors; TBP, TATA box-binding protein; TSS, transcription start site; TCT; RNAP II transcription; TATA box; computational tool; core promoter elements/motifs; promoter prediction.
Year: 2015 PMID: 26226151 PMCID: PMC4581360 DOI: 10.1080/21541264.2015.1067286
Source DB: PubMed Journal: Transcription ISSN: 2154-1272
The precisely spaced known core promoter elements within focused promoters
| Name | Position (relative to the TSS) | PWM logo representation | Consensus (in IUPAC characters) | References |
|---|---|---|---|---|
| mammalian Initiator | −2 to +5 | | YYANWYY | |
| Drosophila Initiator | −2 to +4 | | TCAKTY | |
| TATA box | −30/−31 to −23/−24 | | TATAWAAR | |
| BREu | Immediately upstream of the TATA box | | SSRCGCC | |
| BREd | Immediately downstream of the TATA box | | RTDKKKK | |
| DPE (Inr dependent) | +28 to +33 | | DSWYVY (functional range set) | |
| MTE (Inr dependent) | +18 to +29 | | CSARCSSAACGS | |
| Bridge (Inr dependent) | Part I: +18 to +22; Part II: +30 to +33 | | Part I: CGANC; Part II: WYGT | |
| Drosophila TCT | −2 to +6 | | YYCTTTYY | |
| Human TCT | −1 to +6 | | YCTYTYY | |
| XCPE1 | −8 to +2 | | DSGYGGRASM | |
| XCPE2 | −9 to +2 | | VCYCRTTRCMY | |
| DCE | +6 to +11, +16 to +21, +30 to +34 | — | Necessary motifs: CTTC, CTGT, AGC | |
The table includes the position (relative to the TSS, +1), motif logo, IUPAC consensus sequence and references for each element.
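The IUPAC consensus sequences in the table can be turned into simple pattern matchers. The following is a minimal sketch of consensus matching only, not ElemeNT's implementation (which also uses PWM scores); the function names, the example sequence, and the TSS index are hypothetical and chosen purely for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string (e.g. 'TATAWAAR') into a regex."""
    return re.compile("".join(IUPAC[c] for c in consensus.upper()))


def find_consensus(sequence, consensus):
    """Return the 0-based start index of every (possibly overlapping) match."""
    pattern = consensus_to_regex(consensus)
    seq = sequence.upper()
    width = len(consensus)
    return [
        i
        for i in range(len(seq) - width + 1)
        if pattern.fullmatch(seq, i, i + width)
    ]


def relative_position(index, tss_index):
    """Convert a 0-based index to a TSS-relative position (TSS = +1, no 0)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical example sequence; the A at index 15 is assumed to be the TSS.
example = "GGTATAAAAGGCGTCAGTCGG"
tata_hits = find_consensus(example, "TATAWAAR")  # TATA box consensus
dinr_hits = find_consensus(example, "TCAKTY")    # Drosophila Initiator consensus
```

In this toy sequence the Drosophila Initiator match starts at index 13, which `relative_position(13, 15)` maps to −2, the start position listed for that element in the table.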
Figure 1. Schematic representation of the major core promoter elements. The region of the core promoter area (−40 to +40 relative to the TSS) is illustrated. The diagram is roughly to scale, and each element is colored according to its color in the output table (see ).
Figure 2. A sample output of the ElemeNT program. (A) The input sequence annotated with the combinations of elements identified in it. ElemeNT detected a TATA box flanked by both a BREu element and a BREd element, Drosophila and mammalian initiator elements, and DPE and Bridge elements. The two possible combinations result from a sequence match to both the Drosophila and mammalian initiators, due to the partial sequence redundancy of the two elements. (B) A table displaying all the elements identified within the input sequence, with their locations, PWM scores and consensus match scores. Note the message displayed for the TATA box, indicating the presence of mammalian and Drosophila initiators, as well as BREu and BREd, at optimal distances for transcriptional synergy.
Figure 3. Distribution of core promoter elements' occurrence at specific positions. The frequency of detected elements (dInr, DPE, TATA and dTCT) at the allowed positions relative to the determined TSS is presented. The +1 position is the predicted TSS location. Black squares depict the frequency of elements discovered using CAGE, whereas red circles depict the frequency of elements discovered using RNA-seq. For both CAGE (black) and RNA-seq (red) data, an enrichment in the frequency of discovered elements is detected at the expected positions (−30 for TATA, −2 for dInr and dTCT, and +28 for DPE).
Figure 4. Average PWM score of different core promoter elements at specific positions. The average PWM score of elements (dInr, DPE, TATA and dTCT) at the allowed positions relative to the determined TSS is presented. The +1 position is the predicted TSS location. Black squares depict the average score of elements discovered using CAGE, whereas red circles depict the average score of elements discovered using RNA-seq. For both CAGE and RNA-seq data, some enrichment of the mean score is detected at the expected positions (−30 for TATA, −2 for dInr and dTCT, and +28 for DPE). Error bars represent the standard error of the mean (SEM).
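The per-position averages and SEM error bars described for Figure 4 could be computed along the following lines. This is a sketch under stated assumptions: the record does not specify ElemeNT's exact PWM scoring formula, so a generic log-odds score against a uniform background is used here as a stand-in, and the two-column `TOY_PWM` is invented for illustration and does not correspond to any real element.

```python
import math


def pwm_log_odds(site, pwm, background=0.25):
    """Generic log-odds PWM score: sum of log2(p / background) per position.

    Assumption: this is a textbook scoring scheme, not necessarily the one
    used by ElemeNT.
    """
    return sum(
        math.log2(pwm[i][base] / background) for i, base in enumerate(site)
    )


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of scores."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, 0.0
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return mean, math.sqrt(variance / n)


# Hypothetical 2-position PWM (each column holds base probabilities).
TOY_PWM = [
    {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125},
    {"A": 0.125, "C": 0.125, "G": 0.25, "T": 0.5},
]

# Hypothetical sites found at the same TSS-relative position in 3 promoters.
scores = [pwm_log_odds(site, TOY_PWM) for site in ("AT", "CG", "GA")]
mean_score, sem = mean_and_sem(scores)
```

Repeating this for each allowed position and each data set (CAGE, RNA-seq) would give one mean and one error bar per plotted point.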
Top enriched GO terms categories associated with the analyzed data sets
| | TATA | Inr | DPE | TCT |
|---|---|---|---|---|
| CAGE peak | • chitin-based cuticle development • cuticle development | • branch fusion, open tracheal system • tube fusion • cardiocyte differentiation • ventral cord development • genital disc development | • heart development • circulatory system development • peripheral nervous system development • digestive system development • digestive tract development • reproductive system development • reproductive structure development | • mitotic spindle elongation • centrosome duplication • spindle elongation • centrosome cycle • centrosome organization • microtubule organizing center organization • translation |
| CAGE broad | • chitin-based cuticle development | • NO ENRICHMENT | • negative regulation of molecular function | • translation • cellular macromolecule biosynthetic process • macromolecule biosynthetic process • gene expression • cellular biosynthetic process • organic substance biosynthetic process • biosynthetic process |
| CAGE unclassified | • chitin-based cuticle development | • stem cell fate commitment • regulation of protein localization to nucleus • female meiosis chromosome segregation • regulation of protein import into nucleus | • renal system development • urogenital system development • pigment metabolic process | • translation • cellular macromolecule biosynthetic process • macromolecule biosynthetic process • gene expression • cellular biosynthetic process • organic substance biosynthetic process • biosynthetic process |
| CAGE all tags | • chitin-based cuticle development • neuropeptide signaling pathway • cuticle development | • NO ENRICHMENT | • cardiocyte differentiation | • translation • cellular macromolecule biosynthetic process • macromolecule biosynthetic process • gene expression • cellular biosynthetic process • organic substance biosynthetic process • biosynthetic process |
| RNA-seq | • cellular modified amino acid metabolic process • glutathione metabolic process • peptide metabolic process • cellular amide metabolic process • sulfur compound metabolic process • cellular amino acid metabolic process • determination of adult lifespan | • NO ENRICHMENT | • heart development • circulatory system development • cardiovascular system development • renal system development • urogenital system development • skeletal muscle organ development • muscle attachment | • translation • mitotic spindle elongation • spindle elongation • cellular macromolecule biosynthetic process • macromolecule biosynthetic process • gene expression |
For each data set, up to 7 categories that showed significant enrichment (P < 0.05 after Bonferroni correction) are listed. Where more than 7 categories were significant, the 7 with the lowest P-values are shown. The different elements are enriched for distinct biological process categories. The full list of categories along with their P-values is presented in file S3.
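The selection rule described above (Bonferroni-corrected P < 0.05, then at most 7 categories ranked by P-value) can be sketched as follows. The function name, the number of tests and the P-values are hypothetical; they are not taken from file S3, and the GO analysis tool actually used is not named in this record.

```python
def top_enriched(raw_pvalues, n_tests, alpha=0.05, top=7):
    """Apply a Bonferroni correction and keep the most significant terms.

    raw_pvalues: dict mapping a GO term to its uncorrected P-value.
    n_tests:     number of GO terms tested (the Bonferroni multiplier).
    """
    adjusted = {term: min(1.0, p * n_tests) for term, p in raw_pvalues.items()}
    significant = [(term, p) for term, p in adjusted.items() if p < alpha]
    return sorted(significant, key=lambda item: item[1])[:top]


# Made-up P-values for illustration only.
toy_pvalues = {
    "translation": 1e-6,
    "gene expression": 4e-4,
    "cuticle development": 0.01,
}
selected = top_enriched(toy_pvalues, n_tests=100)
selected_terms = [term for term, _ in selected]
```

With 100 tests assumed, "cuticle development" (raw P = 0.01) is corrected to 1.0 and dropped, while the other two terms remain below the 0.05 threshold.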