TRED: a transcriptional regulatory element database, new entries and other development.

C Jiang, Z Xuan, F Zhao, M Q Zhang.

Abstract

Transcription factors (TFs) and many of their target genes are involved in gene regulation at the level of transcription. To decipher gene regulatory networks (GRNs) we require comprehensive and accurate knowledge of transcriptional regulatory elements. TRED (http://rulai.cshl.edu/TRED) was designed as a resource for gene regulation and function studies. It collects mammalian cis- and trans-regulatory elements together with experimental evidence. All the regulatory elements were mapped on to the assembled genomes. In this new release, we included a total of 36 TF families involved in cancer. Accordingly, the number of target promoters and genes for TF families has increased dramatically. There are 11,660 target genes (7479 in human, 2691 in mouse and 1490 in rat) and 14,908 target promoters (10,225 in human, 2985 in mouse and 1698 in rat). Additionally, we constructed GRNs for each TF family by connecting the TF-target gene pairs. Such interaction data between TFs and their target genes will assist detailed functional studies and help to obtain a panoramic view of the GRNs for cancer research.

Year:  2007        PMID: 17202159      PMCID: PMC1899102          DOI: 10.1093/nar/gkl1041

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


OVERVIEW

TRED was originally designed as a resource for studies on gene function and regulation. It provides cis-elements, such as promoters and binding motifs, and trans-elements, such as transcription factors (TFs). The promoters of target genes came from two sources: experimental determination and sequence-based computational prediction. These two sources complement each other. In TRED, hand curation was applied as a crucial part of the data collection to ensure data accuracy. Based on the reliability of the supporting evidence for each promoter, a quality level was assigned.

One key feature of TRED is the easy access to interaction data between TFs and the promoters of their target genes, including binding motifs reported in previous studies. Although part of the data was obtained from existing gene regulation resources, most of the data came from our exhaustive literature curation. The TF-binding motifs were mapped on to the promoter sequences of their target genes. Along with the binding motifs, the experimental evidence and other pertinent information were also collected. A binding quality level was assigned based on the definitiveness of the binding evidence, which was determined by the experimental approaches employed to demonstrate the binding and by expert interpretation of the data. For example, we assigned ‘known’ as the binding quality level to a binding that has been demonstrated by gel-shift competition, DNase I footprinting, etc. To provide users with more complete information on the genes, cross-references to other well-known databases such as PubMed, GenBank, GeneCards (1) and TRANSFAC (2) were established in TRED. A comprehensive description of the content and the structure of TRED has been published earlier (3). In addition, many on-the-fly tools were implemented for the analysis of sequences retrieved from TRED as well as imported from other resources. The user interface and software functionality were also described in the previous report (3).
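
The binding quality assignment described in this section might be sketched as a simple lookup. This is only an illustration, not TRED's curation procedure: the function name, the free-text method labels and the `None` return value are assumptions made for the sketch, and the expert interpretation step mentioned in the text is not modelled. Only the level ‘known’ and the two example assays come from the text.

```python
# Assays that the text names as sufficient for the 'known' level.
# The lower-case label strings are an assumption of this sketch.
DEFINITIVE_METHODS = {"gel-shift competition", "dnase i footprinting"}


def binding_quality(methods):
    """Return 'known' if any supporting experiment is a definitive assay.

    Returns None otherwise: quality levels other than 'known' are not
    described in the text, so they are left unassigned here.
    """
    normalized = {m.strip().lower() for m in methods}
    return "known" if normalized & DEFINITIVE_METHODS else None


# "reporter assay" is a hypothetical label used only to show the fall-through.
print(binding_quality(["DNase I footprinting", "reporter assay"]))
print(binding_quality(["reporter assay"]))
```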

RECENT DEVELOPMENTS

New entries

With the emergence of high-throughput technologies, a large amount of large-scale gene expression and regulation profiling data has been made available by microarray and chromatin immunoprecipitation on chip (ChIP-chip) studies. Uncovering GRNs among the identified genes requires knowledge of their promoter sequences. We used our promoter prediction program FirstEF (4) to predict promoters in the genomes of human, mouse and rat. These promoters were then combined with the known promoters extracted from EPD (5), DBTSS (6), GenBank, etc., and were deposited in our database CSHLmpd (7). It should be noted that CSHLmpd also contains genes without any promoter. In this version, the number of genes with promoter(s) in each genome assembly is lower than in the previous version because redundancy was removed. However, the numbers of known promoters and their related genes are similar in both versions. There are more promoters than genes in each species because many genes have alternative transcription start sites (7). Table 1 gives the statistics of promoters and genes in each quality category.
Table 1

Number of promoters and genes in TRED, with gene numbers in parentheses

| Promoter quality | 1 | 2 | 3 | 4 | 5 + 6 | Sum |
|---|---|---|---|---|---|---|
| Human | 1842 (1767) | 13 619 (10 115) | 5311 (5150) | 7305 (6738) | 26 684 (15 222) | 54 761 (27 016) |
| Mouse | 172 (156) | 8407 (6552) | 6551 (6449) | 4250 (4041) | 23 500 (15 185) | 42 880 (25 751) |
| Rat | 91 (82) | 996 (680) | 3374 (3333) | 2917 (2834) | 25 681 (16 346) | 33 059 (21 440) |

Promoter qualities are ranked from high to low: 1, known, curated promoters; 2, known, pipeline collected promoters; 3, predicted promoters with Refseq evidence and putative promoters taking 5′ ends of Refseq as TSS; 4, predicted promoters with mRNA (other than Refseq and EST) evidence; 5, predicted promoters with EST evidence; 6, predicted promoters supported only by gene prediction. Promoters included in a higher ranking are automatically excluded from the lower ranking categories.
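
The ranking rule in the Table 1 footnote, where a promoter counted in a higher category is excluded from all lower ones, amounts to picking the best available evidence type. A minimal sketch follows, assuming each promoter carries a set of boolean evidence flags; the class and field names are hypothetical and do not reflect the TRED schema.

```python
from dataclasses import dataclass


@dataclass
class PromoterEvidence:
    """Evidence flags for one promoter (hypothetical field names)."""
    curated_known: bool = False    # 1: known, curated
    pipeline_known: bool = False   # 2: known, pipeline collected
    refseq: bool = False           # 3: predicted/putative, RefSeq evidence
    mrna: bool = False             # 4: predicted, other mRNA evidence
    est: bool = False              # 5: predicted, EST evidence
    gene_prediction: bool = False  # 6: supported only by gene prediction


def promoter_quality(ev):
    """Return the highest-ranking (lowest-numbered) category that applies.

    Returning at the first match means the promoter is never counted in
    any lower-ranking category. Returns None if no evidence flag is set.
    """
    ranked = [
        (1, ev.curated_known),
        (2, ev.pipeline_known),
        (3, ev.refseq),
        (4, ev.mrna),
        (5, ev.est),
        (6, ev.gene_prediction),
    ]
    for level, present in ranked:
        if present:
            return level
    return None


# A promoter with both RefSeq and EST support is counted once, in category 3.
print(promoter_quality(PromoterEvidence(refseq=True, est=True)))  # prints 3
```

Table 1 reports categories 5 and 6 in a single column, so levels 5 and 6 from this sketch would be pooled when reproducing that layout.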

The human genome codes for ∼1850 TFs, which account for 6.0% of its estimated total number of genes (8). It is a daunting task to collect and curate comprehensive and precise interaction data between the TFs and their target genes. Since cancer is one of the greatest threats to human health and has been a field under extensive study, including a broad interest in understanding cell cycle regulatory networks, we started out by focusing on target genes of cancer-related TFs. Previously, TRED contained mainly the target genes and promoters for TF families E2F and Myc (3). In this new release, we expanded it with 34 new TF families that have been implicated in cancer pathways, including p53, AP1, ER and NFκB/Rel. They are involved in many cellular processes, such as proliferation, differentiation, cell motility and apoptosis. In total, there are 9308 newly collected target genes (5365 in human, 2526 in mouse and 1417 in rat) and 10 251 target promoters (5956 in human, 2736 in mouse and 1559 in rat) for these TF families. The detailed distribution of the target promoters and the target genes is listed in Table 2.
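Table 2

Number of curated target promoters/genes for the 36 TF families

| TF | Human | Mouse | Rat |
|---|---|---|---|
| AP1 | 432/383 | 217/190 | 157/143 |
| AP2 | 338/318 | 123/123 | 90/86 |
| AR | 69/49 | 19/19 | 24/15 |
| ATF | 189/173 | 59/59 | 26/26 |
| BCL | 21/19 | 15/15 | 0/0 |
| BRCA | 20/20 | 4/4 | 0/0 |
| CEBP | 335/325 | 152/134 | 241/179 |
| CREB | 224/220 | 138/133 | 95/93 |
| E2F | 1593/1329 | 141/127 | 11/11 |
| EGR | 120/111 | 67/55 | 33/26 |
| ELK | 47/41 | 15/13 | 6/6 |
| ER | 169/152 | 40/39 | 32/31 |
| ERG | 21/21 | 5/5 | 0/0 |
| ETS | 445/412 | 207/196 | 51/51 |
| FLI1 | 41/41 | 17/16 | 0/0 |
| GLI | 16/16 | 8/8 | 0/0 |
| HIF | 119/112 | 63/60 | 29/29 |
| HLF | 10/10 | 5/5 | 2/2 |
| HOX | 65/57 | 93/81 | 5/5 |
| LEF | 40/33 | 26/23 | 5/5 |
| MYB | 253/239 | 40/40 | 6/6 |
| MYC | 2676/785 | 108/38 | 128/62 |
| NFI | 136/127 | 75/62 | 73/65 |
| OCT | 232/195 | 123/108 | 34/34 |
| p53 | 337/313 | 135/130 | 32/30 |
| PAX | 52/47 | 76/61 | 13/11 |
| PPAR | 149/149 | 125/124 | 88/84 |
| PR | 31/27 | 14/14 | 10/10 |
| RAR | 233/218 | 71/71 | 40/40 |
| REL | 445/396 | 202/181 | 87/87 |
| SMAD | 139/130 | 76/75 | 17/17 |
| SP | 655/515 | 296/263 | 235/220 |
| STAT | 245/218 | 111/106 | 48/46 |
| TAL1 | 15/14 | 9/6 | 0/0 |
| USF | 235/215 | 94/91 | 72/62 |
| WT1 | 78/49 | 16/16 | 8/8 |
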
Although TRANSFAC also provides factor-site interaction data, it contains less information on the TFs and target genes covered by TRED. Its latest version (version 7.0) has collected 1040 factors and 608 genes for human and 765 factors and 417 genes for mouse (2). Therefore, on average each factor has fewer than one target gene. In contrast, the number of target genes per TF in TRED is much higher. For example, there are >200 target genes on average for each TF family for human in TRED (Table 2). This can provide fairly well-resolved gene regulatory networks (GRNs) for the 36 TF families involved in cancer pathways. In addition, TRED contains relatively complete genome-wide promoter annotation for human, mouse and rat. Moreover, the binding sites in TRED were also mapped on to the assembled chromosomes. These absolute genomic positions make it straightforward to associate TRED with other genomic data for various studies. However, it should be noted that TRANSFAC also collects factor–site interaction data for species other than human, mouse and rat. Therefore, although TRED and TRANSFAC overlap to a certain extent, they complement each other in some respects.
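
The per-factor averages quoted in this comparison follow directly from the counts given in the text. The short calculation below reproduces them; the helper name `per_unit` is ours, and the figure of 7479 human target genes is the total stated in the abstract.

```python
def per_unit(genes, units):
    """Average number of target genes per factor (or per TF family)."""
    return genes / units


# TRANSFAC version 7.0 counts quoted in the text: (factors, genes).
transfac = {"human": (1040, 608), "mouse": (765, 417)}
for species, (factors, genes) in transfac.items():
    print(f"TRANSFAC {species}: {per_unit(genes, factors):.2f} genes per factor")
# TRANSFAC human: 0.58 genes per factor
# TRANSFAC mouse: 0.55 genes per factor

# TRED: 7479 human target genes spread over 36 TF families.
print(f"TRED human: {per_unit(7479, 36):.2f} target genes per TF family")
# TRED human: 207.75 target genes per TF family
```

Note that the two ratios are not strictly like for like: TRANSFAC is counted per individual factor, whereas the TRED figure is per TF family.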

Other development

The accurate and comprehensive knowledge of transcriptional regulatory elements in TRED allows one to construct the GRN for a given TF family by bringing all TF–target gene pairs together. In our initial analysis, we found that some TFs are the target genes of other TFs and that often more than one TF controls the expression of the same gene. Furthermore, some target genes affect the expression or stability of TFs through feedback loops. As an example, Figure 1 shows a simplified GRN for TF family GLI (glioma-associated oncogene homolog). TF families with hundreds of target genes, such as AP1, CEBP and ETS, would form more complex networks. There is cross-talk between the networks of different TF families through shared target genes or through direct interactions between the members of different TF families. The experimental evidence for each interaction between a TF and its target gene is available through the references provided in TRED. This is an advantage over networks that are computationally predicted from expression and/or phylogenetic profiles. In this release, GRNs for the TF families have been generated from the collected interaction data and statically stored in TRED. Dynamic links to GRNs will be provided in the query results in the future. Taken together, TRED can facilitate the deciphering of GRNs and help researchers to better understand gene regulatory mechanisms.
Figure 1

Sample pages showing access to the gene regulatory network of TF family GLI (glioma-associated oncogene homolog) in human. Ellipses, TFs; and squares, genes. Arrows indicate interactions between two genes. Red arrows imply that the binding quality level is known. Only official gene symbols are used in the network.
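
Building a GRN by bringing TF–target gene pairs together, as described in this section, can be sketched with a plain adjacency map. The code below is an illustration under stated assumptions rather than the method used to generate the networks stored in TRED: the function names are ours, the identifiers `TF_A`, `TF_B` and `GENE_1` are placeholders and not TRED records, and a feedback loop is simplified to a pair of TFs that each list the other as a target.

```python
from collections import defaultdict


def build_grn(pairs):
    """Map each TF to the set of its target genes (directed edges TF -> gene)."""
    targets = defaultdict(set)
    for tf, gene in pairs:
        targets[tf].add(gene)
    return dict(targets)


def shared_targets(grn):
    """Genes controlled by more than one TF, with their regulators."""
    regulators = defaultdict(set)
    for tf, genes in grn.items():
        for gene in genes:
            regulators[gene].add(tf)
    return {g: tfs for g, tfs in regulators.items() if len(tfs) > 1}


def tf_to_tf_edges(grn):
    """Edges whose target is itself a TF in the network."""
    return {(tf, g) for tf, genes in grn.items() for g in genes if g in grn}


def mutual_regulation(grn):
    """Two-node loops: A targets B and B targets A (a simplification)."""
    return {
        frozenset((a, b))
        for a, genes in grn.items()
        for b in genes
        if a != b and a in grn.get(b, set())
    }


# Placeholder pairs for illustration only.
pairs = [
    ("TF_A", "GENE_1"),
    ("TF_B", "GENE_1"),
    ("TF_A", "TF_B"),
    ("TF_B", "TF_A"),
]
grn = build_grn(pairs)
print(shared_targets(grn))
print(tf_to_tf_edges(grn))
print(mutual_regulation(grn))
```

Cross-talk between TF families would show up in `shared_targets` when the regulators of one gene belong to different families.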

DATA ACCESS

The website (http://rulai.cshl.edu/TRED) offers the following services: easy access to TRED entries through a text-based query interface; search for the target genes of a given TF; retrieval of the promoter sequences and the TF-binding motifs; further analysis of the retrieved sequences of promoters and motifs. The TRED homepage also provides access to the GRNs of the TF families in human, mouse and rat, which were constructed from its collection of interaction data between the TFs and their target genes.

REFERENCES

1.  Human Gene-Centric Databases at the Weizmann Institute of Science: GeneCards, UDB, CroW 21 and HORDE.

Authors:  Marilyn Safran; Vered Chalifa-Caspi; Orit Shmueli; Tsviya Olender; Michal Lapidot; Naomi Rosen; Michael Shmoish; Yakov Peter; Gustavo Glusman; Ester Feldmesser; et al.
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

2.  TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes.

Authors:  V Matys; O V Kel-Margoulis; E Fricke; I Liebich; S Land; A Barre-Dirrie; I Reuter; D Chekmenev; M Krull; K Hornischer; et al.
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

3.  TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies.

Authors:  Fang Zhao; Zhenyu Xuan; Lihua Liu; Michael Q Zhang
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

4.  Computational identification of promoters and first exons in the human genome.

Authors:  R V Davuluri; I Grosse; M Q Zhang
Journal:  Nat Genet       Date:  2001-12       Impact factor: 38.330

5.  EPD in its twentieth year: towards complete promoter coverage of selected model organisms.

Authors:  Christoph D Schmid; Rouaïda Perier; Viviane Praz; Philipp Bucher
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  DBTSS: DataBase of Human Transcription Start Sites, progress report 2006.

Authors:  Riu Yamashita; Yutaka Suzuki; Hiroyuki Wakaguri; Katsuki Tsuritani; Kenta Nakai; Sumio Sugano
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

7.  Genome-wide promoter extraction and analysis in human, mouse, and rat.

Authors:  Zhenyu Xuan; Fang Zhao; Jinhua Wang; Gengxin Chen; Michael Q Zhang
Journal:  Genome Biol       Date:  2005-08-01       Impact factor: 13.583

8.  The sequence of the human genome.

Authors:  J C Venter; M D Adams; E W Myers; P W Li; R J Mural; G G Sutton; H O Smith; M Yandell; C A Evans; R A Holt; et al.
Journal:  Science       Date:  2001-02-16       Impact factor: 47.728