Vincent Ferretti, Christian Poitras, Dominique Bergeron, Benoit Coulombe, François Robert, Mathieu Blanchette.
Abstract
We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factor (TF) binding sites. Unlike other existing databases, PReMod is not restricted to modules located proximal to genes; in fact, it mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs); or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes, or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.
Year: 2006 PMID: 17148480 PMCID: PMC1761432 DOI: 10.1093/nar/gkl879
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Key statistics on the PReMod 1.0 database
| | PReMod 1.0 human | PReMod 1.0 mouse |
|---|---|---|
| Genome assembly | Build 34—May 2004 (hg17) | Build 34—March 2005 (mm6) |
| Transfac version | 7.2 | 7.2 |
| Number of pCRMs | 123 510 | 91 412 |
| Fraction of genome contained in pCRMs | 1.93% | 1.68% |
| Average module length | 481 bp | 479 bp |
| Fraction of modules that are | | |
| proximal (<2 kb from TSS) | 10.8% | 8.4% |
| distal (2–10 kb from TSS) | 7.7% | 8.3% |
| long-range (>10 kb from TSS) | 81.6% | 83.4% |
| Average number of tags per module | 3.3 | 3.5 |
| Average number of different PWMs per module | 30.6 | 26.0 |
| Average number of predicted sites per module | 75.6 | 58.7 |
| Average number of modules containing sites for a given TF | 7842 | 5395 |
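The proximal, distal and long-range categories in the table are defined by distance to a transcription start site (TSS). The minimal sketch below shows how such a tally could be computed for a locally downloaded set of modules. The function names, the treatment of the exact 2 kb and 10 kb boundaries, and the example distances are assumptions made for illustration; they are not taken from PReMod itself.

```python
from collections import Counter

# Thresholds from the table: proximal (<2 kb), distal (2-10 kb), long-range (>10 kb).
PROXIMAL_MAX_BP = 2_000
DISTAL_MAX_BP = 10_000

CATEGORIES = ("proximal", "distal", "long-range")


def classify_by_tss_distance(distance_bp):
    """Return the category for a module given its signed distance (bp) to a TSS.

    Assumption: a distance of exactly 2 kb or exactly 10 kb counts as distal.
    """
    d = abs(distance_bp)  # upstream and downstream are treated alike
    if d < PROXIMAL_MAX_BP:
        return "proximal"
    if d <= DISTAL_MAX_BP:
        return "distal"
    return "long-range"


def category_fractions(distances_bp):
    """Return the fraction of modules falling in each category."""
    counts = Counter(classify_by_tss_distance(d) for d in distances_bp)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in CATEGORIES}
    return {name: counts[name] / total for name in CATEGORIES}


if __name__ == "__main__":
    # Made-up distances, for illustration only.
    example_distances = [150, -1800, 4200, 25_000, -310_000]
    for name, fraction in category_fractions(example_distances).items():
        print(f"{name}: {fraction:.1%}")
```

Which TSS a module is measured against (for example, the closest one on either strand) is left to the caller here, since the table does not specify it.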
Figure 1. Sample screenshots of a query input and its related outputs generated by PReMod. (A) The Advanced Search page. By clicking on ‘Search Predicted Modules’, users access a page that allows searching pCRMs by module name, matrix name, gene name or Entrez gene Id. In the example shown here, the user wants to identify the pCRMs containing tags for both the ER (M00191) and the androgen receptor (M00447). Output can be ordered by name, position or score, and can be displayed in HTML or exported as an Excel file. (B) Query output page. Upon submitting a query, the list of modules satisfying the given constraints is displayed. ‘B’ shows the result of the query shown in ‘A’. For each module reported, the module identifier, genomic position, length, and score are given. Also given are the genes with the closest TSS upstream or downstream of the pCRM. Finally, the list of Transfac matrices selected as tags for the module is shown. (C) Module Information page. By clicking on a module name in the Query output page (B), a details page is obtained that contains all the binding site information about the selected module. The page includes the list of all the matrices that were used to calculate the moduleScore (Tag Matrices) as well as all the other matrices found in that pCRM (Other Matrices). The page also includes a list of the surrounding genes (within a 100 kb window) and the DNA sequence of the module. (D) The Module view. The Module Information page also contains a graphical representation of the position of the predicted binding sites for the TFs selected as tags. Any matrices present in that module (those listed in Tag Matrices and in Other Matrices) can be added to (or removed from) the display by a simple click. (E) The Genomic Context view. The genomic context of the selected module can also be visualized within the Module Information page. In this display, the selected pCRM, together with any other pCRMs and genes present within a 100 kb window are shown. By clicking on any module in that image the user is sent to the appropriate Module Information page.
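The query in panel A (modules carrying tags for both M00191 and M00447, ordered by name, position or score) can be mimicked on a locally downloaded copy of the data. The following is a minimal sketch under stated assumptions: the `Module` record, its field names, the descending score order and the three example modules are hypothetical and do not reflect PReMod's actual download format or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical in-memory record for one predicted module (pCRM)."""
    name: str
    chrom: str
    start: int
    end: int
    score: float
    tag_matrices: frozenset  # Transfac matrix accessions selected as tags


def modules_with_all_tags(modules, required, order_by="score"):
    """Return the modules whose tag set contains every required matrix."""
    required = frozenset(required)
    hits = [m for m in modules if required <= m.tag_matrices]
    if order_by == "score":
        # Assumption: highest-scoring modules are listed first.
        hits.sort(key=lambda m: m.score, reverse=True)
    elif order_by == "name":
        hits.sort(key=lambda m: m.name)
    elif order_by == "position":
        hits.sort(key=lambda m: (m.chrom, m.start))
    else:
        raise ValueError(f"unknown ordering: {order_by!r}")
    return hits


# Made-up modules for illustration; names, coordinates and scores are not real.
EXAMPLE_MODULES = [
    Module("modA", "chr1", 1000, 1450, 12.5, frozenset({"M00191", "M00447"})),
    Module("modB", "chr2", 5000, 5600, 30.1, frozenset({"M00191"})),
    Module("modC", "chr1", 200, 700, 48.0,
           frozenset({"M00191", "M00447", "M_OTHER"})),
]

if __name__ == "__main__":
    for m in modules_with_all_tags(EXAMPLE_MODULES, {"M00191", "M00447"}):
        print(m.name, m.chrom, m.start, m.end, m.score)
```

Storing the tags as a set makes the "contains both matrices" condition a single subset test, and the same function covers a query for one TF by passing a one-element set.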