G Robertson, M Bilenky, K Lin, A He, W Yuen, M Dagpinar, R Varhol, K Teague, O L Griffith, X Zhang, Y Pan, M Hassel, M C Sleumer, W Pan, E D Pleasance, M Chuang, H Hao, Y Y Li, N Robertson, C Fjell, B Li, S B Montgomery, T Astakhova, J Zhou, J Sander, A S Siddiqui, S J M Jones.
Abstract
We describe cisRED, a database for conserved regulatory elements that are identified and ranked by a genome-scale computational system (www.cisred.org). The database and high-throughput predictive pipeline are designed to address diverse target genomes in the context of rapidly evolving data resources and tools. Motifs are predicted in promoter regions using multiple discovery methods applied to sequence sets that include corresponding sequence regions from vertebrates. We estimate motif significance by applying discovery and post-processing methods to randomized sequence sets that are adaptively derived from target sequence sets, retain motifs with p-values below a threshold and identify groups of similar motifs and co-occurring motif patterns. The database offers information on atomic motifs, motif groups and patterns. It is web-accessible, and can be queried directly, downloaded or installed locally.
Year: 2006 PMID: 16381958 PMCID: PMC1347438 DOI: 10.1093/nar/gkj075
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
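The abstract's significance step (score motifs against randomized sequence sets, then retain those with p-values below a threshold) can be illustrated with a minimal sketch. This is not the cisRED pipeline: the abstract does not specify how p-values are computed, so the empirical estimate with a +1 correction, the scalar motif scores, the threshold value and the names `empirical_p_value` and `retain_significant` are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def empirical_p_value(observed: float, null_scores: Sequence[float]) -> float:
    """Estimate a p-value as the fraction of null scores >= the observed score.

    The +1 in numerator and denominator (an assumption, not taken from the
    paper) keeps the estimate from ever being exactly zero.
    """
    if not null_scores:
        raise ValueError("need at least one null score")
    at_least = sum(1 for s in null_scores if s >= observed)
    return (at_least + 1) / (len(null_scores) + 1)


def retain_significant(
    motif_scores: Dict[str, float],
    null_scores: Sequence[float],
    threshold: float,
) -> Dict[str, float]:
    """Return {motif name: p-value} for motifs whose p-value is below threshold."""
    retained = {}
    for name, score in motif_scores.items():
        p = empirical_p_value(score, null_scores)
        if p < threshold:
            retained[name] = p
    return retained


# Toy data: scores of motifs found in randomized sequence sets (hypothetical).
null = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
# Scores of motifs found in the target sequence set (hypothetical).
motifs = {"motif_a": 9.5, "motif_b": 4.5}

kept = retain_significant(motifs, null, threshold=0.2)
print(kept)  # only motif_a passes the threshold
```

In this toy run no null score reaches `motif_a`, so its estimate is 1/10, while five of the nine null scores are at least as high as `motif_b`, giving 6/10, and it is dropped.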
Figure 1. Data processing system for high-throughput motif discovery, clustering, co-occurrence, annotation and performance assessment.
Figure 2. Database contents and high-level links, from a web user perspective, as of cisRED human v1.2. ‘Atomic motifs’ are motifs discovered in target sequence sets. ‘Groups’ are clusters of similar motifs that are identified by large-scale OPTICS (19) clustering. ‘Patterns’ are co-occurring sets of group-labelled motifs.
Contents of coexpression database (22)
| Species | Platform | Experiments | Unique genes |
|---|---|---|---|
| H.sapiens | SAGE (short) | 272 | 20 312 |
| | Oligo microarray | 1640 | 12 452 |
| | cDNA microarray | 2852 | 13 111 |
| M.musculus | SAGE (short) | 85 | 12 715 |
| | Oligo microarray | 1802 | 8 164 |
| | cDNA microarray | 366 | 8 102 |
| C.elegans | SAGE (long/short) | 26 | 15 685 |
| | cDNA microarray | 1059 | 15 956 |
| Total | | 8102 | 54 434 |