Martin Korb, Alistair G Rust, Vesteinn Thorsson, Christophe Battail, Bin Li, Daehee Hwang, Kathleen A Kennedy, Jared C Roach, Carrie M Rosenberger, Mark Gilchrist, Daniel Zak, Carrie Johnson, Bruz Marzolf, Alan Aderem, Ilya Shmulevich, Hamid Bolouri.
Abstract
BACKGROUND: As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site http://www.innateImmunity-systemsbiology.org. Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens.
DESCRIPTION: We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems-biology-oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data are available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser.
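The abstract states that binding sites were predicted from Position Weight Matrices (PWMs). As a rough illustration of the general idea only, the Python sketch below scans a sequence with a log-odds PWM. It does not reproduce any of the six algorithms used for IIDB: the count matrix, the uniform background, the pseudocount, and the raw-score threshold are all assumptions made for this example (IIDB reports p-value thresholds, and the matrix shown is not a TRANSFAC matrix).

```python
import math

BASES = "ACGT"

# Hypothetical 4-position count matrix (rows: positions; columns: A, C, G, T).
# The counts are invented for illustration and are NOT taken from TRANSFAC.
TOY_COUNTS = [
    [0, 0, 0, 10],   # favors T
    [0, 0, 10, 0],   # favors G
    [10, 0, 0, 0],   # favors A
    [0, 10, 0, 0],   # favors C
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts to log2-odds scores.

    Assumes a uniform background frequency for all four bases.
    """
    matrix = []
    for row in counts:
        total = sum(row) + pseudocount * len(row)
        matrix.append(
            [math.log2(((c + pseudocount) / total) / background) for c in row]
        )
    return matrix


def scan(sequence, pwm, threshold):
    """Return (offset, score) for every window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


pwm = log_odds_matrix(TOY_COUNTS)
hits = scan("ccTGACgtTGACnn", pwm, threshold=5.0)
print([offset for offset, _ in hits])
```

In practice a raw score cutoff like the one above would be replaced by a significance estimate, and both strands of the promoter would be scanned; neither refinement is shown here.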
Year: 2008 PMID: 18321385 PMCID: PMC2268913 DOI: 10.1186/1471-2172-9-7
Source DB: PubMed Journal: BMC Immunol ISSN: 1471-2172 Impact factor: 3.615
Figure 1. Snapshot of the IIDB web site entry page. The entry page includes links to user guides ('How to Use IIDB' and 'IIDB Tutorial'), and links to allow the searching and visualization of IIDB content in a variety of ways, as described in the main text. We also provide links to third-party data and software used by IIDB. Clicking the '?' symbol located to the left of each menu item pops open a help page explaining how to use that menu item (Inset).
Figure 2. Exploring an annotated gene sequence. The user can choose the size of 5' upstream promoter and 3' downstream regions to search, and whether to include features identified within a gene's coding region. A link is provided to all available microarray expression profiles. The user can choose one or all features for viewing, including the list of putative regulatory transcription factors and significance thresholds. Each feature name is linked to additional pages with more information about that particular feature.
Macrophage TLR stimuli used in the experiments underlying IIDB
| Stimulus | Description |
| CpG | Bacterial DNA containing unmethylated CpG motifs (cytosine and guanine separated by a phosphate), TLR9-specific stimulant |
| LPS | Lipopolysaccharide (component of the cell membrane of Gram-negative bacteria), TLR4-specific stimulant |
| PAM2CSK | Synthetic diacylated lipoprotein, TLR2/6 stimulant |
| PAM3CSK | Synthetic triacylated lipopeptide, TLR2/1 stimulant |
| Poly(I:C) | Polyriboinosinic polyribocytidylic acid, TLR3-specific stimulant |
| R848 | Synthetic imidazoquinoline resiquimod, TLR7/8 stimulant |
Figure 3. Searching genes for targets of specific transcription factors. IIDB provides a list of 268 unique transcription factors in the selection box on the left. The user can select several transcription factors, the p-value, the length of promoter region to explore, and any number of target genes. The target gene column is automatically populated if the user selects genes from a previous page (shown). Otherwise the user can add a comma-separated list of gene identifiers or chromosomal locations (not shown). Several additional features can be displayed simultaneously on top of the predicted transcription factor binding sites (selection boxes at bottom). A link to a page which details the upload format for a search file for bulk queries is also provided (not shown).
Figure 4. An example of a web-based Argo multigene display of IIDB search results. The following genes (IL6, Il10, IL12a, IL12b) were queried for ATF3 predicted and ChIP-chip hits (red circles and orange boxes labeled ATF3, respectively), and predicted NF-κB binding sites (green rectangles) in the proximal promoter regions (+1 to -3000). Evolutionarily conserved promoter sequences (purple) are also shown. The predictions are in good agreement with the experimental ChIP-chip hits. Inset: Detail of an evolutionarily conserved promoter region. Double-clicking an evolutionarily conserved promoter sequence (purple arrows) opens a new browser window that displays details such as the human ortholog, start and end coordinates, and the human transcription factors associated with this segment (MS Internet Explorer only).
Comparison of IIDB TFBS predictions with ChIP-chip data. Data are presented based on the genome annotations available from both NCBI and ENSEMBL. Note that the annotations differ in the number of predicted genes.
| Promoter length (bp) † | 1000 | 2000 | 1000 | 2000 | 1000 | 2000 | 1000 | 2000 |
| Number of genes | 1151 | 1151 | 1151 | 1151 | 1935 | 1935 | 1935 | 1935 |
| Unique ATF3 ChIP-chip hits | 978 | 1761 | 978 | 1761 | 1494 | 2750 | 1494 | 2750 |
| Conserved promoter regions containing ATF3 TFBS ◇ | 833 | 979 | 833 | 979 | 1329 | 1550 | 1329 | 1550 |
| ATF3-group matrices hits* | 792 | 1187 | 212 | 299 | 1333 | 2029 | 337 | 474 |
| ATF3-group matrices within a ChIP-chip segment | 442 | 664 | 110 | 165 | 710 | 1031 | 196 | 272 |
| % overlap between ChIP-chip data & predictions ◉ | | | | | | | | |
Notes:
◇ Conserved regions were mapped from the human data of Xie et al. [14].
■ Threshold refers to the p-value below which predicted TFBS are considered significant.
† Numbers refer to the length of promoter annotated, in base pairs upstream of the transcription start site.
* Since ATF3 binding sites have a strong overlap with CREB binding sites, we used a combined PWM including three ATF matrices and nine CREB matrices to calculate the ATF3 hits. Overlapping hits were collapsed into one as described in the main text.
◉ ChIP fragments in these experiments were estimated to have an average length of approximately 500 bp to 1 kbp. To determine the coordinates of a ChIP-chip hit, we estimated the center of gravity of a bound region using a moving average filter, then set the start/end coordinates to ±300 bp around that center.
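The table notes describe three steps: smoothing ChIP-chip signal with a moving average to locate the center of a bound region, padding that center by 300 bp on each side, and collapsing overlapping predicted hits before comparing them with the ChIP-chip segments. The Python sketch below shows one way these steps could fit together. The function names, the filter width, the use of an intensity-weighted mean as the "center of gravity", and the example probe data are all assumptions for illustration; the source does not specify them.

```python
def moving_average(values, window=3):
    """Smooth probe intensities with a centered moving average.

    The window is truncated at the ends of the list. A width of 3 is an
    arbitrary choice for this sketch.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def chip_hit_window(positions, intensities, flank=300, window=3):
    """Return (start, end) as the estimated region center minus/plus flank.

    The center is taken as the mean probe position weighted by the smoothed
    intensities (an assumed reading of "center of gravity").
    """
    weights = moving_average(intensities, window)
    total = sum(weights)
    if total == 0:
        raise ValueError("no signal in region")
    center = round(sum(p * w for p, w in zip(positions, weights)) / total)
    return center - flank, center + flank


def collapse(sites):
    """Merge overlapping (start, end) predictions into single intervals."""
    merged = []
    for start, end in sorted(sites):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_in_chip(sites, chip_windows):
    """Fraction of predicted sites overlapping at least one ChIP-chip window."""
    if not sites:
        return 0.0
    inside = sum(
        1
        for s, e in sites
        if any(s <= w_end and e >= w_start for w_start, w_end in chip_windows)
    )
    return inside / len(sites)


# Invented example data: five probes with a symmetric peak at position 1200.
chip_window = chip_hit_window([1000, 1100, 1200, 1300, 1400], [0, 1, 4, 1, 0])
predicted = collapse([(950, 960), (955, 965), (2000, 2010)])
print(chip_window)
print(predicted)
print(fraction_in_chip(predicted, [chip_window]))
```

With these invented inputs the two overlapping predictions near position 950 are counted once, so one of the two collapsed sites falls inside the ChIP-chip window.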