Ravi Gupta, Anirban Bhattacharyya, Francisco J Agosto-Perez, Priyankara Wickramasinghe, Ramana V Davuluri.
Abstract
MPromDb (Mammalian Promoter Database) is a curated database that strives to annotate gene promoters identified from ChIP-seq results, with the goal of providing an integrated resource for mammalian transcriptional regulation and epigenetics. We analyzed 507 million uniquely aligned RNAP-II ChIP-seq reads from 26 different data sets that include six human cell types and 10 distinct mouse cell/tissue types. The updated MPromDb version consists of computationally predicted (novel) and known active RNAP-II promoters (42,893 human and 48,366 mouse promoters) from various data sets freely available at the NCBI GEO database. We found that 36% and 40% of protein-coding genes have alternative promoters in the human and mouse genomes, respectively, and that ∼40% of promoters are tissue/cell specific. The identified RNAP-II promoters were annotated using various known and novel gene models. Additionally, for novel promoters we looked into other lines of evidence: GenBank mRNAs, spliced ESTs, CAGE promoter tags and mRNA-seq reads. Users can search the database by gene ID/symbol or by specific tissue/cell type, and filter results based on any combination of tissue/cell specificity, Known/Novel, CpG/NonCpG, and protein-coding/non-coding gene promoters. We have also integrated the GBrowse genome browser with MPromDb to visualize ChIP-seq profiles and display the annotations. The current release of MPromDb can be accessed at http://bioinformatics.wistar.upenn.edu/MPromDb/.
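The filter combinations described in the abstract can be pictured with a small sketch. The `Promoter` record, its field names, the `filter_promoters` helper and the sample rows are all hypothetical; they are not MPromDb's actual schema, interface or data, and only illustrate how optional criteria might be combined.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Promoter:
    # Hypothetical record layout; not MPromDb's real schema.
    gene_symbol: str
    tissue: str
    known: bool            # True = known promoter, False = novel
    cpg: bool              # True = CpG promoter, False = NonCpG
    protein_coding: bool   # True = protein-coding gene promoter
    tissue_specific: bool  # True = tissue/cell-specific promoter


def filter_promoters(
    records: Iterable[Promoter],
    *,
    gene_symbol: Optional[str] = None,
    tissue: Optional[str] = None,
    known: Optional[bool] = None,
    cpg: Optional[bool] = None,
    protein_coding: Optional[bool] = None,
    tissue_specific: Optional[bool] = None,
) -> List[Promoter]:
    """Keep records matching every criterion that was supplied (None = ignore)."""
    criteria = {
        "gene_symbol": gene_symbol,
        "tissue": tissue,
        "known": known,
        "cpg": cpg,
        "protein_coding": protein_coding,
        "tissue_specific": tissue_specific,
    }
    active = {name: value for name, value in criteria.items() if value is not None}
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in active.items())
    ]


# Made-up rows for illustration only.
records = [
    Promoter("GENE_A", "liver", True, True, True, False),
    Promoter("GENE_A", "brain", False, False, True, True),
    Promoter("GENE_B", "liver", False, True, False, True),
]

# Novel promoters observed in the (hypothetical) liver data set.
novel_liver = filter_promoters(records, tissue="liver", known=False)
```

Leaving a criterion as `None` means "do not filter on this field", which is one simple way to model "any combination" of filters.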
Year: 2010 PMID: 21097880 PMCID: PMC3013732 DOI: 10.1093/nar/gkq1171
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Block diagram and workflow of the updated MPromDb database. Deep sequencing data sets were downloaded from the NCBI GEO server and processed by our analysis and annotation pipeline. The identified promoters are deposited in MPromDb tables. Novel promoters are compared to various existing experimental and predicted gene promoter regions, and the status of each novel promoter is deposited in the relational tables. The database is accessed through a user-friendly webpage. The database is integrated with an open-source genome browser (GBrowse) to visualize the promoters and various ChIP-seq enrichment profiles.
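The comparison of novel promoters against existing promoter regions can be illustrated with a simple interval-overlap check. This is only a sketch under stated assumptions: the source does not describe the actual comparison criteria, and the `Region` layout, the half-open coordinates, the helper names and the example coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end); half-open coordinates are an assumption of this sketch.
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def supported_by(novel: Region, annotated: List[Region]) -> bool:
    """True if the novel region overlaps any previously annotated region."""
    return any(overlaps(novel, ref) for ref in annotated)


# Made-up coordinates for illustration only.
annotated = [("chr1", 1000, 1500), ("chr2", 200, 700)]
novel_regions = [("chr1", 1400, 1900), ("chr1", 5000, 5500), ("chr2", 700, 900)]

# Status per novel region, as might be stored alongside each promoter.
status: Dict[Region, bool] = {r: supported_by(r, annotated) for r in novel_regions}
```

With half-open intervals, a region that starts exactly where another ends does not count as overlapping, which is why the last example region is reported as unsupported.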
Summary of RNAP-II bound promoters identified in various tissues/cell types for human and mouse using ChIP-seq data sets
| Species | Tissue/cell type | No. of known promoters | No. of novel promoters | No. of tissue/cell-specific promoters | No. of CpG promoters | No. of bidirectional promoters | No. of total promoters |
|---|---|---|---|---|---|---|---|
| Mouse | | 15 948 | 5270 | 3978 | 13 864 | 1373 | 21 218 |
| | | 12 319 | 3189 | 1642 | 10 421 | 1250 | 15 508 |
| | | 15 059 | 4632 | 1995 | 12 879 | 1348 | 19 691 |
| | | 9089 | 2121 | 806 | 8273 | 1067 | 11 210 |
| | | 15 373 | 5142 | 1935 | 13 986 | 1374 | 20 515 |
| | | 11 895 | 2880 | 2745 | 12 063 | 1314 | 14 775 |
| | | 10 558 | 2261 | 273 | 10 898 | 1241 | 12 819 |
| | | 11 887 | 2761 | 706 | 10 886 | 1237 | 14 648 |
| | | 13 320 | 3977 | 870 | 12 038 | 1298 | 17 297 |
| | | 12 647 | 3713 | 566 | 11 846 | 1294 | 16 260 |
| | | 13 119 | 4041 | 688 | 11 926 | 1292 | 17 160 |
| | | 8489 | 1373 | 113 | 8597 | 1038 | 9862 |
| | | 8684 | 1626 | 154 | 8803 | 1072 | 9310 |
| | | 8508 | 1593 | 174 | 8415 | 1042 | 10 101 |
| | | 8374 | 1540 | 136 | 8371 | 1035 | 9914 |
| | | 6976 | 1422 | 194 | 6793 | 848 | 8398 |
| | | 4039 | 1443 | 927 | 4030 | 511 | 5482 |
| Human | | 7417 | 1403 | 541 | 7653 | 792 | 8820 |
| | | 16 410 | 8012 | 6422 | 16 918 | 1320 | 24 422 |
| | | 19 617 | 8998 | 6629 | 20 682 | 1311 | 28 615 |
| | | 12 925 | 2944 | 916 | 13 650 | 1156 | 15 869 |
| | | 13 982 | 3502 | 2101 | 14 812 | 1212 | 17 484 |
| | | 14 329 | 4137 | 1336 | 15 740 | 1220 | 18 466 |
| | | 7354 | 1267 | 174 | 7955 | 882 | 8621 |
| | | 11 470 | 2389 | 377 | 12 740 | 1172 | 13 859 |
Alternative promoter usage for active protein-coding genes in mouse and human
| Protein-coding genes | Mouse (%) | Human (%) |
|---|---|---|
| 1-promoter genes | 9290 (60) | 9051 (63.44) |
| 2-promoter genes | 3490 (22.5) | 3192 (22.37) |
| ≥3-promoter genes | 2707 (17.5) | 2023 (14.18) |
| Total | 15 493 | 14 266 |
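The percentages in the human column follow directly from the counts, and the share of genes with two or more promoters corresponds to the roughly 36% alternative-promoter figure quoted in the abstract. A minimal sketch of that arithmetic, using only the human counts from the table (the variable names are illustrative):

```python
# Human protein-coding gene counts, taken from the table above.
human_counts = {"1-promoter": 9051, "2-promoter": 3192, ">=3-promoter": 2023}

total = sum(human_counts.values())

# Share of each class, as a percentage rounded to two decimals.
percentages = {
    label: round(100 * count / total, 2) for label, count in human_counts.items()
}

# Genes with alternative promoters = genes with two or more promoters.
alternative_pct = round(100 * (total - human_counts["1-promoter"]) / total, 2)

print(total)            # 14266
print(percentages)      # {'1-promoter': 63.44, '2-promoter': 22.37, '>=3-promoter': 14.18}
print(alternative_pct)  # 36.56
```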
Figure 2. Screenshots of MPromDb and search results. (A) The MPromDb main search page, where a user can search by either Entrez gene ID/symbol or a specific tissue/cell type; the resulting pages are shown in (B) and (C), respectively. (D) Users can visualize the ChIP-seq profile for any promoter displayed in (B) or (C) by clicking on the promoter position link.