Jonathan McMillan, Zhaolian Lu, Judith S Rodriguez, Tae-Hyuk Ahn, Zhenguo Lin.
Abstract
The transcription initiation landscape of eukaryotic genes is complex and highly dynamic. In eukaryotes, a gene can generate multiple transcript variants that differ in their 5' boundaries owing to the use of alternative transcription start sites (TSSs), and the abundance of these transcript isoforms is highly variable. Because of the large number and complexity of TSSs, it is not feasible to depict the transcription initiation landscape of all genes in detail using text-format genome annotation files. It is therefore necessary to visualize TSSs in order to represent quantitative TSS maps and core promoters (CPs). In addition, the selection and activity of TSSs are influenced by various factors, such as transcription factors, chromatin remodeling and histone modifications. Integrating and visualizing functional genomic data related to these features could therefore provide a better understanding of gene promoter architecture and of the regulatory mechanisms of transcription initiation. Yeast species are important both to research and to human society, yet no database provides visualization and integration of functional genomic data in yeast. Here, we generated quantitative TSS maps for 12 important yeast species, inferred their CPs and built a public database, YeasTSS (www.yeastss.org). YeasTSS was designed as a central portal for the visualization and integration of TSS maps, CPs and functional genomic data related to transcription initiation in yeast. YeasTSS is expected to benefit the research community and public education by supporting improved genome annotation, studies of promoter structure and of the regulation of transcription initiation, and the inference of gene regulatory networks.
Year: 2019 PMID: 31032841 PMCID: PMC6484093 DOI: 10.1093/database/baz048
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Yeast species, genome assembly and CAGE data
| Species | Strain | Assembly | CAGE reads | No. of TSSs | No. of CPs |
|---|---|---|---|---|---|
| | SC5314 | ASM18296v3 | 62 917 157 | 354 135 | 27 068 |
| | NRRL Y-1140 | ASM251v1 | 59 908 113 | 377 665 | 23 163 |
| | NCYC 2644 | ASM16711v1 | 47 907 986 | 312 918 | 24 069 |
| | NRRL Y-12651 | ASM14922v1 | 36 988 291 | 220 879 | 17 732 |
| | CBS 4309 | ASM23734v1 | 58 897 449 | 276 990 | 19 382 |
| | 623-6C | ASM16703v1 | 30 739 852 | 288 198 | 19 617 |
| | S288c | R64-2-1 (SacCer3) | 37 200 564 | 324 133 | 24 033 |
| | IFO 1815 | ASM16697v1 | 63 468 342 | 246 308 | 17 109 |
| | CBS432 | ASM207905v1 | 45 578 830 | 427 801 | 30 419 |
| | yFS275 | SJ5 | 52 907 395 | 318 951 | 25 536 |
| | 972 h- | ASM294v2 | 60 584 140 | 264 007 | 24 093 |
| | CLIB122 | ASM252v1 | 48 428 838 | 356 722 | 28 307 |
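The counts in this table use spaces as thousands separators, which most parsers will not read as integers directly. The following minimal Python sketch shows one way to parse such values and derive the average number of TSSs per CP for a few strains. The helper names `parse_grouped_int` and `tss_per_cp` are illustrative assumptions and are not part of YeasTSS; the three rows are copied from the table above.

```python
def parse_grouped_int(text: str) -> int:
    """Parse an integer written with whitespace-separated digit groups,
    e.g. '62 917 157' -> 62917157."""
    return int("".join(text.split()))


# (strain, No. of TSSs, No. of CPs), copied from three rows of the table.
ROWS = [
    ("SC5314", "354 135", "27 068"),
    ("S288c", "324 133", "24 033"),
    ("972 h-", "264 007", "24 093"),
]


def tss_per_cp(rows):
    """Return {strain: average number of TSSs per core promoter}."""
    return {
        strain: parse_grouped_int(tss) / parse_grouped_int(cp)
        for strain, tss, cp in rows
    }


if __name__ == "__main__":
    for strain, ratio in tss_per_cp(ROWS).items():
        print(f"{strain}: {ratio:.1f} TSSs per CP")
```

This ratio is only a coarse summary derived from the table; it says nothing about how TSSs are distributed among individual CPs.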
Resources and databases used for compilation and curation of data in YeasTSS
| Data type | Data track | Species | Data description | Data format | Data source |
|---|---|---|---|---|---|
| TSS | YPD | All species | TSS maps produced by nAnT-iCAGE from cells grown in YPD | BW | This study |
| | Cell arrest; DNA damage; diauxic shift; galactose (2%); glucose (16%); H2O2; heat shock; NaCl | | TSS maps produced by nAnT-iCAGE from cells grown in eight other conditions | | |
| | Pelechano_2013_YPD; Pelechano_2013_Galactose | | TSS maps generated by TIF-seq from cells grown in YPD and galactose | BW | |
| | Malabat_2015_YPD | | TSS map generated by transcription start site sequencing in wild-type yeast cells grown in YPD | BW | |
| | Arribere_2013_YPD_Plus; Arribere_2013_YPD_Minus | | TSSs identified by TL-seq | BW | |
| | Doris_2018_TSS-seq | | TSS maps obtained by TSS-seq from wild-type and spt6 mutant strains | BW | |
| | Li_2015_CAGE | | TSSs identified by CAGE in YPD | BW | |
| | Thodberg_2019_CAGE | | TSS maps generated by CAGE from different growth environments | BW | |
| CP | YPD | All species | CPs inferred from nAnT-iCAGE TSS maps from cells grown in YPD | BED | This study |
| | Consensus; cell arrest; DNA damage; diauxic shift; galactose (2%); glucose (16%); H2O2; heat shock; NaCl | | CPs inferred from nAnT-iCAGE TSS maps from cells grown in eight other conditions | BW | |
| TATA-box | Rhee_2012 | | TATA-box or TATA-like elements identified from ChIP-exo data | GFF3 | |
| TFBSs | Venters_2011_25°C; Venters_2011_37°C | | TFBSs based on ChIP-chip data with a 5% FDR threshold for cells grown at 25°C and at 37°C | GFF3 | |
| | MacIsaac_2006 | | Refined TFBS map based on re-analysis of ChIP-chip assays | GFF3 | |
| | Wood_2012 | | TFBSs predicted by PomBase | BED | |
| RNA polymerase II binding | Ghavi-Helm_2008_WT_RNA_PolII_YPD_16°C; Ghavi-Helm_2008_WT_RNA_PolII_YPD_30°C | | Genome-wide location of RNA Pol II based on ChIP-chip assays | BW | |
| HMs | Kirmizis_2007_H3K4me1; Kirmizis_2007_H3K4me2; Kirmizis_2007_H3R2me2a; Kirmizis_2007_H3K4me3 | | H3K4me1, H3K4me2, H3R2me2a and H3K4me3 based on ChIP-chip assays | BW | |
| | Pokholok_2005_H3K14ac_vs_H3_H2O2; Pokholok_2005_H3K14ac_vs_H3_YPD; Pokholok_2005_H3K36me3_vs_H3_YPD; Pokholok_2005_H3K4me1_vs_H3_YPD; Pokholok_2005_H3K4me2_vs_H3_YPD; Pokholok_2005_H3K4me3_vs_H3_YPD; Pokholok_2005_H3K9ac_vs_H3_YPD; Pokholok_2005_H4ac_vs_H3_H2O2; Pokholok_2005_H4ac_vs_H3_YPD | | Distribution of methylated and acetylated histones based on ChIP-chip assays | BW | |
| NO | Field_2009_YPD; Field_2009_GAL | | | BW | |
| | Field_2009 | | | BW | |
| | Lantermann_2010 | | | BW | |
| DS features | Zhou_2013 | | Computed DS features, including minor groove width, roll, propeller twist and helix twist | BW | |
| TBs | Waern_2013_AlphaFactor; Waern_2013_Benomyl; Waern_2013_Calcofluor; Waern_2013_CongoRed; Waern_2013_DNADamage; Waern_2013_GrapeJuice; Waern_2013_HeatShock; Waern_2013_HighCalcium; Waern_2013_Hydroxyurea; Waern_2013_LowNitrogen; Waern_2013_LowPhosphate; Waern_2013_OxidativeStress; Waern_2013_Salt; Waern_2013_ScGlycerolMedia; Waern_2013_ScMedia; Waern_2013_Sorbitol; Waern_2013_StationaryPhase | | Transcript coordinates obtained in 18 growth conditions based on RNA-seq | GFF3 | |
Figure 1. The overall design of YeasTSS. (A) Dataset: The datasets used in YeasTSS are summarized in the central table. Currently, YeasTSS includes 12 yeast species. The evolutionary relationships of these species are shown by the phylogenetic tree to the left of the data table. The clade of 10 budding yeast species is shaded in green, and the clade of 2 fission yeast species is shaded in yellow. The CP and TSS data of each species were generated in this study. NO data are integrated for S. cerevisiae, Sch. pombe and C. albicans. TFBSs are available for S. cerevisiae and Sch. pombe. For S. cerevisiae, several other functional genomic data sets are also integrated: TATA-box, DS, HMs, RNA polymerase II binding (PolII) and TBs obtained from 18 different growth conditions. (B) Genome browser: These data are visualized and integrated in a dedicated JBrowse genome browser for each species. (C) Search: The 'Search' utility provides tools to retrieve TSS and CP information through gene-by-gene analysis or global approaches. (D) Download: The 'Download' utility allows users to download all raw data used in this database through the web interface. (E) Help: The 'Help' page provides documentation on the CAGE technique, TSS identification and the inference of CPs, as well as instructions for using the genome browsers.
Figure 2. An example of using the YeasTSS genome browser to explore the transcription initiation landscape. A 1.5 kb region around the CP region of LAS17 on chrXV (674 807–676 348) is shown. The tracks available for S. cerevisiae are listed in the left panel of the genome browser. Two consensus CPs on the forward strand are present within 1000 bp upstream of the LAS17 ORF. The transcription activity of each CP can be visualized with the TSS tracks. Different CP activities can be observed between the YPD and YPGal (galactose 2%) growth conditions in S. cerevisiae. Only a few tracks were selected in this case: (A) genome annotation from SGD; (B) consensus CPs; (C) TSS map on the plus strand under the YPD condition; (D) TSS map on the plus strand under the YPGal (galactose 2%) condition; (E) TSS map on the minus strand under the YPD condition; (F) TSS map on the minus strand under the YPGal condition; (G) nucleosome occupancy under the YPD condition; (H) nucleosome occupancy under the YPGal condition; (I) TATA box; (J) binding sites predicted by position weight matrix (PWM); (K) histone modification H3K4me1; (L) histone modification H3K4me2.
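The observation in Figure 2 that CPs lie within 1000 bp upstream of an ORF can also be checked programmatically once CP intervals have been downloaded in BED format. The sketch below is a minimal Python illustration under stated assumptions: the `Interval` type, the `cps_upstream` function and every coordinate in the example are hypothetical toy values, not real LAS17 or YeasTSS data, and intervals are assumed to follow the BED convention (0-based start, exclusive end).

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def cps_upstream(cps: List[Interval], orf: Interval, window: int = 1000) -> List[Interval]:
    """Return CPs on the ORF's strand that overlap the `window` bp upstream of the ORF."""
    if orf.strand == "+":
        # Upstream of a plus-strand ORF means lower coordinates.
        win_start, win_end = max(0, orf.start - window), orf.start
    else:
        # Upstream of a minus-strand ORF means higher coordinates.
        win_start, win_end = orf.end, orf.end + window
    return [
        cp for cp in cps
        if cp.chrom == orf.chrom
        and cp.strand == orf.strand
        and cp.start < win_end   # half-open interval overlap test
        and cp.end > win_start
    ]


# Toy data only: hypothetical coordinates on a made-up chromosome.
TOY_ORF = Interval("chrToy", 1000, 2000, "+")
TOY_CPS = [
    Interval("chrToy", 200, 260, "+"),    # upstream, same strand
    Interval("chrToy", 700, 750, "+"),    # upstream, same strand
    Interval("chrToy", 800, 850, "-"),    # upstream, opposite strand
    Interval("chrToy", 2500, 2550, "+"),  # downstream of the ORF
]

if __name__ == "__main__":
    for cp in cps_upstream(TOY_CPS, TOY_ORF):
        print(cp)
```

The strand check matters because a CP on the opposite strand within the same window would drive transcription away from the ORF rather than into it.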