
bZIPDB: a database of regulatory information for human bZIP transcription factors.

Taewoo Ryu, Juhyun Jung, Sunjae Lee, Ho Jung Nam, Sun Woo Hong, Jae Wook Yoo, Dong-ki Lee, Doheon Lee.

Abstract

BACKGROUND: Basic region-leucine zipper (bZIP) proteins are a class of transcription factors (TFs) that play diverse roles in eukaryotes. Malfunctions in these proteins lead to cancer and various other diseases. For detailed characterization of these TFs, further public resources are required.
DESCRIPTION: We constructed a database, designated bZIPDB, containing information on 49 human bZIP TFs, by means of automated literature collection and manual curation. bZIPDB aims to provide public data required for deciphering the gene regulatory network of the human bZIP family, e.g., evaluation or reference information for the identification of regulatory modules. The resources provided by bZIPDB include (1) protein interaction data including direct binding, phosphorylation and functional associations between bZIP TFs and other cellular proteins, along with other types of interactions, (2) bZIP TF-target gene relationships, (3) the cellular network of bZIP TFs in particular cell lines, and (4) gene information and ontology. In the current version of the database, 721 protein interactions and 560 TF-target gene relationships are recorded. bZIPDB is updated annually with newly discovered information.
CONCLUSION: bZIPDB is a repository of detailed regulatory information for human bZIP TFs that is collected and processed from the literature, designed to facilitate analysis of this protein family. bZIPDB is available for public use at http://biosoft.kaist.ac.kr/bzipdb.

Year:  2007        PMID: 17535445      PMCID: PMC1891292          DOI: 10.1186/1471-2164-8-136

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Transcription factors (TFs) regulate gene expression in every living organism. Members of the bZIP family share a basic region and a leucine zipper domain. Homo- and hetero-dimerization between family members occurs through the leucine zipper domain, and the proteins bind target promoters via the basic amino acid-rich region [1]. The bZIP TFs play essential roles in several processes in eukaryotic cells, from early development to tumorigenesis. For example, JUN is an oncogene that affects diverse cellular processes including proliferation, differentiation and apoptosis [2], while CEBPA is a well-known regulator of hepatocyte and adipocyte development [3].

With the assistance of high-throughput technology, such as microarrays, several researchers have attempted to decipher the regulatory networks of bZIP TFs [4-7]. However, evaluating such results depends largely on manual literature search, which is time-consuming and incomplete. While a number of the binding proteins or target genes of bZIP TFs can be retrieved from HPRD or TRANSFAC [8,9], the currently available data are relatively limited and do not necessarily cover the entire cellular network. Gene transcription requires multiple steps, i.e., signaling cascades involving multiple proteins, interactions between TFs and other proteins (such as RNA polymerase) or other TFs, and TF binding to DNA in the proper orientation. Thus, to elucidate the entire regulatory network, extensive data on the above processes must be amassed and processed.

To facilitate our understanding of these proteins, we have generated bZIPDB, a database containing regulatory network information on the human bZIP TF family. In particular, we focus on the signaling protein-TF interactions, TF-TF interactions, and TF-target gene interactions that are important for regulatory network analysis with high-throughput technology.

Construction and content

The aim of bZIPDB is to accumulate known regulatory information on human bZIP TFs, particularly protein-protein and protein-DNA interactions. A list of human bZIP TFs with the appropriate synonyms is documented on our website. For database construction, public literature on human bZIP TFs was first retrieved from PubMed [10] using web queries that covered official symbols and synonyms. The PubMed IDs of 2,498 papers for 49 TFs were stored and arranged in our internal web-based curation system via an automated process. Experts then extracted the regulatory information and saved it in the database in a suitable format. The regulatory network of the bZIP family is grouped into six tables, depending on specific attributes. The system architecture of bZIPDB is depicted in Figure 1, and details of the attributes are recorded on our website.
Figure 1

Schematic diagram of bZIPDB. Underlying relational database schema for bZIPDB.
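The literature collection step described above can be sketched as follows. The paper only states that PubMed was queried via the web using official symbols and synonyms; the use of the NCBI E-utilities ESearch endpoint, the `[Title/Abstract]` field restriction, and the function names `build_pubmed_query` and `build_esearch_url` are assumptions made for illustration, not the authors' pipeline.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint. Using it here is an assumption;
# the paper only says "web queries".
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(official_symbol, synonyms):
    """Combine the official symbol and its synonyms into one OR-ed PubMed term."""
    names = [official_symbol] + list(synonyms)
    # Quote each name and restrict the match to title/abstract (assumed choice).
    clauses = ['"{}"[Title/Abstract]'.format(name) for name in names]
    return "(" + " OR ".join(clauses) + ")"


def build_esearch_url(term, retmax=100):
    """Return the ESearch URL that would list PubMed IDs matching the term."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


# Illustrative subset of JUN synonyms.
term = build_pubmed_query("JUN", ["c-Jun", "AP1"])
url = build_esearch_url(term)
print(term)
print(url)
```

Fetching `url` would return a list of PubMed IDs, which could then be stored for manual curation as the text describes. The sketch stops at building the URL so that it runs without network access.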

• bZIP_TF_INFO: Basic information on human bZIP TFs, such as bZIPDB ID, official symbol, RefSeq ID and transcript variants.
• GENE_INFORMATION: Information on the chromosomal loci and exons of human bZIP TFs.
• PPI: Protein-protein interactions between bZIP TFs and other proteins.
• TF_TARGET: bZIP TF-target gene promoter interactions.
• CELL_LINE: Experimental cell lines and their origin.
• TOI: Types of protein-protein interactions.

For the 49 human bZIP TFs, bZIPDB IDs were assigned on the basis of the distinct mRNA transcript. Since alternatively spliced or transcribed products encoded by the same gene have different biochemical properties [11], we assigned different IDs to each bZIP TF and its transcript variants, as reflected in the PPI and TF_TARGET tables.

In the construction of the protein-protein interaction table, information on interaction types, directions of interactions, and cell lines is collected in addition to the identities of interacting proteins. While several databases have focused on the direct binding of proteins acting as complexes [8,12], cellular protein networks also consist of other interaction types, such as phosphorylation and SUMOylation. Functional association, which means that both proteins are present in the same pathway, is another important interaction type in transcriptome analysis, which basically assumes that coregulated genes share similar roles [13,14]. These interaction types are specified in the TOI table. 'Direction of interaction' indicates which protein affects the activity of the other, i.e., whether it is upstream or downstream in the signaling pathway. The RefSeq ID for each protein is appended as a crosslink to NCBI. The organism from which the protein originates is also added as an attribute, since researchers often use proteins from different sources.
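To make the six-table layout concrete, here is a minimal sketch of such a relational schema in SQLite. Only the table names come from the paper; every column name, the key relationships, and the inserted rows are assumptions for illustration (the actual attributes are documented on the bZIPDB website, not here).

```python
import sqlite3

# Table names follow the paper; all column names below are assumptions.
SCHEMA = """
CREATE TABLE bZIP_TF_INFO (
    bzipdb_id TEXT PRIMARY KEY,      -- one ID per transcript variant
    official_symbol TEXT NOT NULL,
    refseq_id TEXT
);
CREATE TABLE GENE_INFORMATION (
    bzipdb_id TEXT REFERENCES bZIP_TF_INFO(bzipdb_id),
    chromosomal_locus TEXT,
    exons TEXT
);
CREATE TABLE TOI (
    toi_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL               -- e.g. direct binding, phosphorylation
);
CREATE TABLE CELL_LINE (
    cell_line TEXT PRIMARY KEY,
    organism TEXT,
    origin TEXT
);
CREATE TABLE PPI (
    bzipdb_id TEXT REFERENCES bZIP_TF_INFO(bzipdb_id),
    partner_symbol TEXT,
    partner_organism TEXT,
    toi_id INTEGER REFERENCES TOI(toi_id),
    direction TEXT,
    cell_line TEXT REFERENCES CELL_LINE(cell_line),
    pubmed_id TEXT
);
CREATE TABLE TF_TARGET (
    bzipdb_id TEXT REFERENCES bZIP_TF_INFO(bzipdb_id),
    target_symbol TEXT,
    result TEXT,                     -- activation or repression
    binding_sequence TEXT,
    binding_position TEXT,
    dimer_partner TEXT,
    cell_line TEXT REFERENCES CELL_LINE(cell_line),
    pubmed_id TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder rows for illustration only; these are not records from bZIPDB.
conn.execute("INSERT INTO bZIP_TF_INFO VALUES ('EXAMPLE_ID_1', 'JUN', NULL)")
conn.execute("INSERT INTO TOI VALUES (1, 'Direct binding')")
conn.execute("INSERT INTO CELL_LINE VALUES ('HELA', 'human', NULL)")
conn.execute(
    "INSERT INTO PPI VALUES ('EXAMPLE_ID_1', 'PARTNER_A', 'human', 1, NULL, 'HELA', NULL)"
)

# Resolve the interaction type through TOI and the TF name through bZIP_TF_INFO.
rows = conn.execute("""
    SELECT t.official_symbol, p.partner_symbol, i.name, p.cell_line
    FROM PPI AS p
    JOIN bZIP_TF_INFO AS t ON t.bzipdb_id = p.bzipdb_id
    JOIN TOI AS i ON i.toi_id = p.toi_id
    WHERE t.official_symbol = 'JUN'
""").fetchall()
print(rows)
```

Keying PPI and TF_TARGET on the bZIPDB ID rather than the gene symbol mirrors the point made in the text that transcript variants of the same gene receive separate IDs.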
Experimental cell lines are additionally classified as an important attribute, since they originate from different organisms and tissues and therefore have a distinct genomic context, which affects protein-protein interactions (described in the CELL_LINE table).

The target genes of TFs are less well characterized than protein-protein interactions. For transcription, TFs bind to specific DNA sequences in the proper orientation, which is influenced by nearby proteins, such as histones or other TFs. While known TF-target gene relationships from a few databases have been used as positive examples, their number is too limited to constitute a positive dataset. For example, 51 mammalian target genes of human JUN protein are recorded in TRANSFAC 10.4 [9], while we have identified 88 in bZIPDB. Hence, several findings, including TFs, targets, results of transcription (i.e., activation or repression), binding sequences, binding positions, and cell lines, have been incorporated in the database. Moreover, as bZIP TFs often act via homo/hetero-dimerization between family members, the dimerizing partner is included if specified in the literature. The database statistics are summarized in Table 1.
Table 1

The statistics of bZIPDB

Category                        Number of records
TFs                             49
Protein-protein interactions    721
  Direct binding                516
  Phosphorylation               47
  Functional association        143
  Other interactions            15
TF-target gene relations        560
  Activation                    279
  Repression                    57

A list of interaction types with the number of records
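As a quick consistency check on Table 1, the sub-category counts can be summed against the totals. The numbers below are copied from the table; the variable names are arbitrary. Note that Table 1 does not say how TF-target relations outside the activation and repression rows are classified, so the remainder computed here is only a difference, not a labeled category.

```python
# Counts copied from Table 1.
ppi_total = 721
ppi_by_type = {
    "Direct binding": 516,
    "Phosphorylation": 47,
    "Functional association": 143,
    "Other interactions": 15,
}

tf_target_total = 560
tf_target_by_result = {"Activation": 279, "Repression": 57}

# The four interaction types account for every protein-protein interaction.
ppi_sum = sum(ppi_by_type.values())

# Activation and repression together cover only part of the TF-target relations.
tf_target_sum = sum(tf_target_by_result.values())
tf_target_remainder = tf_target_total - tf_target_sum

print(ppi_sum == ppi_total)   # True
print(tf_target_sum)          # 336
print(tf_target_remainder)    # 224
```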

Another unique aspect of bZIPDB is the compilation of regulatory information for particular cell lines. Cell lines originate from different organisms or tissues, which maintain unique genetic and epigenetic compositions and hence affect various cellular interactions. Therefore, careful consideration is required when data from several resources are used in conjunction. To clarify the distinctive cellular networks and the accuracy of interactions, bZIPDB provides the list of regulatory interactions observed in a particular cell type. Table 2 summarizes the popularly used cell lines and the number of interactions listed in bZIPDB.
Table 2

Cell lines and summary of interactions in bZIPDB

Cell line    Organism    # of protein-protein interactions    # of TF-target interactions
COS1monkey1014
COS7monkey239
HEK293human3819
HEK293Thuman4217
HELAhuman4846
HepG2human2389
Jurkathuman1922
MCF7human512
NIH3T3mouse1914
U937human147

Ten popularly used cell lines with the originating organisms and number of interactions (listed in bZIPDB).
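The cell-line view summarized in Table 2 amounts to filtering and counting interaction records by cell line. A minimal sketch follows, assuming each record carries a `kind` ("PPI" or "TF_TARGET") and a `cell_line` field; the record layout, the function names, and all partner and target names are placeholders, not bZIPDB data.

```python
from collections import defaultdict

# Made-up records; partner and target names are placeholders, not bZIPDB data.
RECORDS = [
    {"kind": "PPI", "tf": "JUN", "other": "PARTNER_A", "cell_line": "HELA"},
    {"kind": "PPI", "tf": "JUN", "other": "PARTNER_B", "cell_line": "HepG2"},
    {"kind": "TF_TARGET", "tf": "CEBPA", "other": "GENE_X", "cell_line": "HepG2"},
    {"kind": "TF_TARGET", "tf": "JUN", "other": "GENE_Y", "cell_line": "HELA"},
    {"kind": "TF_TARGET", "tf": "JUN", "other": "GENE_Z", "cell_line": "HELA"},
]


def cellular_network(records, cell_line):
    """Return every interaction reported in the given cell line (case-insensitive)."""
    wanted = cell_line.lower()
    return [r for r in records if r["cell_line"].lower() == wanted]


def summarize_by_cell_line(records):
    """Count protein-protein and TF-target interactions per cell line."""
    counts = defaultdict(lambda: {"PPI": 0, "TF_TARGET": 0})
    for r in records:
        counts[r["cell_line"]][r["kind"]] += 1
    return dict(counts)


hela = cellular_network(RECORDS, "HeLa")
summary = summarize_by_cell_line(RECORDS)
print(len(hela))
print(summary)
```

Keeping the cell line on every record, rather than pooling interactions across sources, is what allows the cell type-specific subnetworks described in the text to be separated again later.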

In addition to protein-protein and protein-DNA interaction data, genomic information, such as chromosomal locus and exon/introns, synonyms and functional annotation, was obtained from Entrez [15] and the Gene ontology consortium [16].

Utility and Discussion

bZIPDB provides a convenient search engine with which users can explore the database either by typing bZIP TF names into the query box or by clicking on the listed names (Figure 2A). The known human bZIP TFs are listed on the 'search bZIPDB' page in alphabetical order of the official symbol. The 'official symbol' is the gene name approved by public databases, such as NCBI and HGNC [17]. Synonyms collected from NCBI and HGNC are recorded next to the official symbol. On the input form, users can type in an individual bZIP TF name (either the official symbol or a synonym). By default, the results pages return all matching records in bZIPDB, regardless of the organism of the TF, target gene or cell line. However, users can restrict the organism category to humans. For convenience, clicking on a bZIP TF name in the list runs the search with the default options.
Figure 2

Queries and results of bZIPDB. (A) The 'Search bZIPDB' menu shows the query box and human bZIP TF list. Typing in or clicking on the names of bZIPs allows search initiation. (B) Protein-protein interactions of JUN. Interaction partners and types, interaction directions and other data are shown. (C) Protein-DNA interactions of JUN. JUN regulates target genes by binding to the TF binding site (TFBS). (D) 'Cellular Network' page. By clicking on the cell name, scientists can access interaction data for the specified cell line.
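The search behavior described above (accept either the official symbol or a synonym, return all records by default, optionally restrict to human) can be sketched as below. This is an illustration under stated assumptions, not the site's implementation: the synonym lists are a small illustrative subset, the single `organism` field per record is a simplification, and the record contents are placeholders.

```python
# Illustrative subset of synonyms; the full lists come from NCBI and HGNC.
SYNONYMS = {
    "JUN": ["AP1", "c-Jun"],
    "CEBPA": ["CEBP", "C/EBP-alpha"],
}

# Placeholder records; partner names and organisms are not bZIPDB data.
RECORDS = [
    {"tf": "JUN", "partner": "PARTNER_A", "organism": "human"},
    {"tf": "JUN", "partner": "PARTNER_B", "organism": "mouse"},
    {"tf": "CEBPA", "partner": "PARTNER_C", "organism": "human"},
]


def build_name_index(synonyms):
    """Map every lower-cased official symbol and synonym to its official symbol."""
    index = {}
    for symbol, aliases in synonyms.items():
        for name in [symbol] + aliases:
            index[name.lower()] = symbol
    return index


def search(query, records, index, human_only=False):
    """Resolve the query to an official symbol, then return its records."""
    symbol = index.get(query.strip().lower())
    if symbol is None:
        return []
    hits = [r for r in records if r["tf"] == symbol]
    if human_only:
        hits = [r for r in hits if r["organism"] == "human"]
    return hits


INDEX = build_name_index(SYNONYMS)
all_hits = search("c-jun", RECORDS, INDEX)
human_hits = search("AP1", RECORDS, INDEX, human_only=True)
print(len(all_hits), len(human_hits))
```

Resolving every query to the official symbol first is what lets results be joined with other sources that also use official symbols, which the text identifies as an advantage over databases that do not.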

The results page returns basic information, such as names, RefSeq ID, chromosomal locus and exon/intron positions of the bZIP TF protein examined. By clicking on the 'Protein interaction' or 'Target genes' menu on the right side of the results page, researchers can retrieve detailed reports on protein-protein or protein-DNA interactions of the bZIP TF, respectively (Figs. 2B and 2C), to facilitate further analysis. These include official symbol, organism, interaction type, TF binding sites and positions, cell lines, and PubMed ID, among other information. An external link to NCBI RefSeq and PubMed is provided for each interaction and gene. If the organism is not specified in the literature, it is impossible to ascertain gene identity (RefSeq ID). In this case, the positions are denoted 'U' (unspecified).

A bZIPDB report of human JUN is shown as an example (Figs. 2B and 2C). In bZIPDB, 148 protein-protein and 88 protein-DNA interactions are accessible, while 110 protein-protein and 51 protein-DNA interactions are retrieved from HPRD and TRANSFAC, respectively. Moreover, these two databases do not use official symbols in the search and result pages, and are therefore difficult to exploit in terms of bZIP TF analysis. The official symbols are very important, since they greatly facilitate integration between various information sources, e.g., microarray and interaction data. In this comparison, bZIPDB contains more information on human bZIP TFs than these databases, and is therefore more useful for the analysis of these proteins.

Interactions within specific cell lines can be viewed on the 'Cellular Network' page (Figure 2D). In total, 12 popular cell lines are listed. By clicking on the name of the cell line, researchers may retrieve associated interactions from the database. The result format is similar to query results of individual bZIP TF proteins.

Data in bZIPDB are available in a tab-delimited format on the 'Download' page. Interaction data subsets (protein-protein and protein-DNA) are also available in either the tab-delimited or the simple interaction format (SIF), supported by Cytoscape [18], a visualization and integration tool.

bZIPDB aims to serve as a portal for researchers studying the human bZIP TF family. To date, the database has focused on amassing the relevant literature data. However, the updated version of bZIPDB will provide other types of data. One data category involves the potential target genes of bZIP TFs, which are computationally predicted using phylogenetic footprinting and motif search algorithms [19,20]. Another is genome-wide mRNA expression profiles, which are accumulated in public databases, such as NCBI GEO [21]. Differential expression patterns of bZIP TFs will be collected along with relevant information, such as experimental conditions and cell lines. Since integration of interaction data from different databases is an important issue, collected data will be converted to the HUPO PSI's molecular interaction format [22]. Finally, the database will be updated annually.
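For readers unfamiliar with the simple interaction format mentioned above, each SIF line holds a source node, a relationship type, and a target node. The sketch below converts interaction records into tab-delimited SIF text. The record layout, the `to_sif` helper, the lower-casing of interaction types, and the node names are assumptions for illustration; the actual bZIPDB download files may label relationships differently.

```python
def to_sif(records):
    """Render records as SIF lines: source<TAB>relationship<TAB>target."""
    lines = []
    for r in records:
        # Replace spaces so the relationship type stays a single token.
        relation = r["type"].strip().lower().replace(" ", "_")
        lines.append("\t".join([r["source"], relation, r["target"]]))
    return "\n".join(lines) + "\n"


# Placeholder records; partner and target names are not bZIPDB data.
RECORDS = [
    {"source": "JUN", "type": "Direct binding", "target": "PARTNER_A"},
    {"source": "JUN", "type": "Activation", "target": "GENE_X"},
]

sif_text = to_sif(RECORDS)
print(sif_text, end="")
```

Writing `sif_text` to a file with a `.sif` extension gives a network that Cytoscape can import, with the relationship type available as an edge attribute for styling protein-protein and TF-target edges differently.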

Conclusion

bZIPDB contains extensive information on human bZIP TFs, such as manually curated protein-protein and protein-DNA interactions, genomic information, synonyms, and gene ontology. Moreover, this novel database provides classified interaction data for popularly used cell lines, leading to a clearer picture of the cell type-specific subnetwork. Thus, bZIPDB constitutes a valuable resource to facilitate comprehensive understanding and analysis of the cellular network of human bZIP TFs.

Availability and requirements

bZIPDB home page:
Operating system(s): Linux
Programming language: Java
License: the database is freely available to academic and non-academic users.

List of abbreviations

Basic region-leucine zipper (bZIP), transcription factors (TF), transcription factor binding site (TFBS), protein-protein interaction (PPI)

Authors' contributions

TR designed the database and drafted the manuscript. JJ, SL, HJN, SWH and JWY participated in data curation. Dong-ki Lee and Doheon Lee supervised the work.

References

1.  The Database of Interacting Proteins: 2004 update.

Authors:  Lukasz Salwinski; Christopher S Miller; Adam J Smith; Frank K Pettit; James U Bowie; David Eisenberg
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

2.  Comprehensive identification of human bZIP interactions with coiled-coil arrays.

Authors:  John R S Newman; Amy E Keating
Journal:  Science       Date:  2003-06-12       Impact factor: 47.728

3.  Identification of promoters bound by c-Jun/ATF2 during rapid large-scale gene activation following genotoxic stress.

Authors:  Jun Hayakawa; Shalu Mittal; Yipeng Wang; Kemal S Korkmaz; Eileen Adamson; Christopher English; Masahide Ohmichi; Masahide Omichi; Michael McClelland; Dan Mercola
Journal:  Mol Cell       Date:  2004-11-19       Impact factor: 17.970

4.  Transcriptional repression by a novel member of the bZIP family of transcription factors.

Authors:  I G Cowell; A Skinner; H C Hurst
Journal:  Mol Cell Biol       Date:  1992-07       Impact factor: 4.272

5.  Multiple functional domains of AML1: PU.1 and C/EBPalpha synergize with different regions of AML1.

Authors:  M S Petrovick; S W Hiebert; A D Friedman; C J Hetherington; D G Tenen; D E Zhang
Journal:  Mol Cell Biol       Date:  1998-07       Impact factor: 4.272

6.  Mect1-Maml2 fusion oncogene linked to the aberrant activation of cyclic AMP/CREB regulated genes.

Authors:  Amy Coxon; Ester Rozenblum; Yoon-Soo Park; Nina Joshi; Junji Tsurutani; Phillip A Dennis; Ilan R Kirsch; Frederic J Kaye
Journal:  Cancer Res       Date:  2005-08-15       Impact factor: 12.701

7.  An isoform of transcription factor CREM expressed during spermatogenesis lacks the phosphorylation domain and represses cAMP-induced transcription.

Authors:  W H Walker; B M Sanborn; J F Habener
Journal:  Proc Natl Acad Sci U S A       Date:  1994-12-20       Impact factor: 11.205

8.  The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

Authors:  Henning Hermjakob; Luisa Montecchi-Palazzi; Gary Bader; Jérôme Wojcik; Lukasz Salwinski; Arnaud Ceol; Susan Moore; Sandra Orchard; Ugis Sarkans; Christian von Mering; Bernd Roechert; Sylvain Poux; Eva Jung; Henning Mersch; Paul Kersey; Michael Lappe; Yixue Li; Rong Zeng; Debashis Rana; Macha Nikolski; Holger Husi; Christine Brun; K Shanker; Seth G N Grant; Chris Sander; Peer Bork; Weimin Zhu; Akhilesh Pandey; Alvis Brazma; Bernard Jacq; Marc Vidal; David Sherman; Pierre Legrain; Gianni Cesareni; Ioannis Xenarios; David Eisenberg; Boris Steipe; Chris Hogue; Rolf Apweiler
Journal:  Nat Biotechnol       Date:  2004-02       Impact factor: 54.908

9.  CisMols Analyzer: identification of compositionally similar cis-element clusters in ortholog conserved regions of coordinately expressed genes.

Authors:  Anil G Jegga; Ashima Gupta; Sivakumar Gowrisankar; Mrunal A Deshmukh; Steven Connolly; Kevin Finley; Bruce J Aronow
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

10.  Entrez Gene: gene-centered information at NCBI.

Authors:  Donna Maglott; Jim Ostell; Kim D Pruitt; Tatiana Tatusova
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971
