Riu Yamashita, Sumio Sugano, Yutaka Suzuki, Kenta Nakai.
Abstract
To support transcriptional regulation studies, we have constructed DBTSS (DataBase of Transcriptional Start Sites), which contains exact positions of transcriptional start sites (TSSs), determined with our own technique named TSS-seq, in the genomes of various species. In its latest version, DBTSS covers the data of the majority of human adult and embryonic tissues: it now contains 418 million TSS tag sequences from 28 tissues/cell cultures. Moreover, we integrated a series of our own transcriptomic data into our original TSS information, such as the RNA-seq data of subcellular-fractionated RNAs as well as the ChIP-seq data of histone modifications and of the binding of RNA polymerase II and several transcription factors in cultured cell lines. We also included several external epigenomic datasets, such as the chromatin map of the ENCODE project. We further associated our TSS information with public or original single-nucleotide variation (SNV) data, in order to identify SNVs in the regulatory regions. These data can be browsed in our new viewer, which supports versatile user-defined search conditions. We believe that our new DBTSS will be an invaluable resource for interpreting the differential uses of TSSs and for identifying human genetic variations that are associated with disordered transcriptional regulation. DBTSS can be accessed at http://dbtss.hgc.jp.
Year: 2011 PMID: 22086958 PMCID: PMC3245115 DOI: 10.1093/nar/gkr1005
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Statistics of TSS-seq data
| Category | No. of cell types or tissues | No. of conditions | No. of TSS-seq tags |
|---|---|---|---|
| Human adult tissues | 16 | 20 | 138 864 978 |
| Human fetal tissues | 5 | 5 | 41 744 136 |
| Human cell lines | 7 | 23 | 237 537 518 |
| Mouse embryo | 1 | 4 | 38 897 846 |
| mouse cell lines | 3 | 3 | 31 488 592 |
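The abstract's figure of 418 million TSS tags from 28 tissues/cell cultures can be cross-checked against the human rows of this table. The snippet below is only a small arithmetic check; the tuple layout and the `totals` helper are illustrative names, not part of DBTSS.

```python
# Values copied from the "Statistics of TSS-seq data" table.
# Each tuple: (category, no. of cell types or tissues, no. of TSS-seq tags).
rows = [
    ("human adult tissues", 16, 138_864_978),
    ("human fetal tissues", 5, 41_744_136),
    ("human cell lines", 7, 237_537_518),
    ("mouse embryo", 1, 38_897_846),
    ("mouse cell lines", 3, 31_488_592),
]


def totals(rows, species):
    """Sum sample counts and tag counts over categories starting with `species`."""
    selected = [r for r in rows if r[0].startswith(species)]
    return sum(r[1] for r in selected), sum(r[2] for r in selected)


human_samples, human_tags = totals(rows, "human")
mouse_samples, mouse_tags = totals(rows, "mouse")

print(human_samples, human_tags)  # 28 418146632
print(mouse_samples, mouse_tags)  # 4 70386438
```

The human rows sum to 28 tissues/cell cultures and about 418 million tags, matching the abstract; the mouse rows are additional.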
TSCs corresponding to NCBI RefSeq genes and SNP information
| Samples | TSS-seq tags | Total TSCs | TSCs in RefSeq | TSC >5 ppm | overlap RefSeq (>5 ppm) | db_SNP (>5 ppm) | JPT/YRI/CHS/CEU (>5 ppm) |
|---|---|---|---|---|---|---|---|
| HEK293 | 20 686 169 | 193 140 | 137 518 | 11 338 | 10 234 | 11 285 | 3930/5471/4391/3476 |
| Ramos | 31 022 974 | 371 759 | 239 308 | 11 455 | 9227 | 11 418 | 4036/5433/4340/3513 |
| BEAS2B | 98 761 770 | 708 912 | 440 302 | 27 628 | 20 386 | 7220 | 2993/3979/3194/2167 |
| DLD1 | 48 580 850 | 462 724 | 272 171 | 19 941 | 16 965 | 19 878 | 7084/9735/7731/6211 |
| MCF7 | 15 785 949 | 172 834 | 120 695 | 11 790 | 10 383 | 11 743 | 4094/5645/4432/3584 |
| TIG3 | 18 780 087 | 198 129 | 144 622 | 11 893 | 10 512 | 11 847 | 4186/5857/4674/3711 |
| HeLa | 3 919 719 | 99 241 | 74 719 | 11 300 | 9710 | 11 262 | 4187/5787/4705/3734 |
| Adult tissue | 138 864 978 | 1 496 409 | 911 872 | 36 718 | 26 023 | 50 674 | 16 874/22 996/18 163/14 927 |
| Fetal tissue | 41 744 136 | 822 577 | 572 941 | 32 533 | 26 773 | 32 386 | 10 400/14 192/11 088/9260 |
‘Samples’: category of samples; ‘TSS-seq tags’: number of tags in each category; ‘Total TSCs’: number of observed TSCs; ‘TSCs in RefSeq’: TSCs overlapping a RefSeq transcribed region (including its 50 kb upstream region); ‘TSC >5 ppm’: number of TSCs whose expression level is higher than 5 ppm; ‘overlap RefSeq (>5 ppm)’: >5 ppm TSCs that overlap a RefSeq transcribed region; ‘db_SNP (>5 ppm)’: number of TSCs that contain SNPs in dbSNP; ‘JPT/YRI/CHS/CEU (>5 ppm)’: number of TSCs that contain ethnic SNPs (JPT: Japanese in Tokyo, CEU: Utah residents with northern and western European ancestry from the CEPH collection, CHS: Chinese in Singapore, and YRI: Yoruba in Ibadan).
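To make the 5 ppm cutoff concrete, the sketch below converts between raw tag counts and ppm. It assumes that ppm means tags in a TSC per million TSS-seq tags in the same sample; the source does not spell out the normalization, so treat this as an illustration rather than the DBTSS pipeline. The function names are hypothetical.

```python
def tags_to_ppm(tag_count, total_tags):
    """Express a tag count as parts per million of the sample's total tags.

    Assumption: ppm = tags in the TSC / total TSS-seq tags in the sample * 1e6.
    """
    return tag_count / total_tags * 1_000_000


def min_tags_above_ppm(threshold_ppm, total_tags):
    """Smallest integer tag count whose ppm is strictly above `threshold_ppm`.

    `threshold_ppm` is expected to be an integer so the arithmetic stays exact.
    """
    return (threshold_ppm * total_tags) // 1_000_000 + 1


# Total tag counts taken from the table above.
hek293_total = 20_686_169
hela_total = 3_919_719

print(min_tags_above_ppm(5, hek293_total))  # 104
print(min_tags_above_ppm(5, hela_total))    # 20
```

Under this assumption, the same 5 ppm threshold corresponds to roughly five times as many raw tags in HEK293 as in HeLa, because the HEK293 library is about five times deeper.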
Figure 1. DBTSS input windows. (A) Users can use a RefSeq ID for the simplest search (red box in the figure). (B) After clicking ‘TSS-seq Detailed Search’, users obtain the ‘search condition’ window. In this example, users search for TSSs that are overexpressed 2-fold after IL4 stimulation, have an expression level higher than 5 ppm, show H3K4me3 signals and have nearby dbSNP data. (C) Users can search for TSSs around a given SNP or any genomic position (upper window). SNPs located near known genes can also be searched (bottom window).
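The combined search condition in panel (B) amounts to a conjunction of four filters. The sketch below expresses that logic over in-memory records; it is not the DBTSS query interface. The `TscRecord` fields, the example values and the choice to apply the 5 ppm cutoff to the IL4-stimulated sample are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TscRecord:
    """Hypothetical per-TSC summary; field names are not from DBTSS."""
    name: str
    ppm_control: float   # expression without IL4 stimulation
    ppm_il4: float       # expression after IL4 stimulation
    has_h3k4me3: bool    # H3K4me3 ChIP-seq signal at the TSC
    nearby_dbsnp: int    # number of dbSNP entries near the TSC


def matches(rec, min_fold=2.0, min_ppm=5.0):
    """Return True if the record satisfies all four conditions from panel (B)."""
    # Compare by multiplication so a zero control value does not divide by zero.
    induced = rec.ppm_il4 >= min_fold * rec.ppm_control
    expressed = rec.ppm_il4 > min_ppm  # assumption: cutoff applies to the IL4 sample
    return induced and expressed and rec.has_h3k4me3 and rec.nearby_dbsnp > 0


# Made-up records, only to exercise the filter.
records = [
    TscRecord("tsc_a", 3.0, 12.0, True, 2),   # passes all conditions
    TscRecord("tsc_b", 8.0, 12.0, True, 1),   # induced less than 2-fold
    TscRecord("tsc_c", 1.0, 4.0, True, 1),    # induced but below 5 ppm
    TscRecord("tsc_d", 2.0, 9.0, False, 3),   # no H3K4me3 signal
]

hits = [r.name for r in records if matches(r)]
print(hits)  # ['tsc_a']
```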
Figure 2. Example of search results (NM_013293: transformer 2 alpha homolog). (A) Overview of TSS-seq and ChIP-seq for NM_013293, transformer 2 alpha homolog. There are three major putative alternative promoters (AP4, AP10 and AP15) in DLD1 cells. The expression of AP10 under the normoxia condition (21%) is relatively low compared with that under the hypoxia condition (1%). Using check boxes, users can also check the TSS-seq and ChIP-seq results in other tissues. (B) Tag counts can be recalculated by specifying desired genomic regions. In this case, TSC expression of 1.5 ppm under normoxia and 10.6 ppm under hypoxia is observed. (C) Users can also recalculate tag counts for ChIP-seq tags. There is a clear difference in the H3K27 states between normoxia and hypoxia. (D) Detailed information on AP4. The green bars indicate our TSS-seq data. The start positions of known genes are displayed with arrows. An ethnic SNP (CHS) and two dbSNPs (rs11523571 and rs41273990) are also found in this region. Searching for ‘rs11523571’ or ‘chr1:159680868’ from the input window (Figure 1C) also leads to a similar result.
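The two query forms mentioned in panel (D), an rs identifier and a `chr:position` string, can be told apart with a simple pattern match. The sketch below only classifies the query string locally and does not contact DBTSS; the function name, the return format and the accepted chromosome names are assumptions.

```python
import re

# Patterns for the two query styles quoted in the caption.
RS_ID = re.compile(r"^rs\d+$")
POSITION = re.compile(r"^(chr(?:\d+|X|Y|M)):(\d+)$")


def parse_query(query):
    """Classify a query as a dbSNP rs ID or a genomic position.

    Returns ("snp", rs_id) or ("position", chromosome, coordinate).
    Raises ValueError for anything else.
    """
    q = query.strip()
    if RS_ID.match(q):
        return ("snp", q)
    m = POSITION.match(q)
    if m:
        return ("position", m.group(1), int(m.group(2)))
    raise ValueError(f"unrecognized query: {query!r}")


print(parse_query("rs11523571"))      # ('snp', 'rs11523571')
print(parse_query("chr1:159680868"))  # ('position', 'chr1', 159680868)
```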