Abstract
Functionally associated genes tend to be co-expressed, which suggests that they may also be co-regulated. Since co-regulation is usually governed by transcription factors acting through their specific binding elements, putative regulators can be identified from the promoter sets of co-expressed genes by screening for over-represented nucleotide patterns. Here, we present a program, POCO, which discovers such over-represented patterns from either one or two promoter sets. Typical microarray experiments yield up- and down-regulated gene sets that may represent, for example, distinct defense pathways. Assuming that a functional transcription factor cannot simultaneously up- and down-regulate the gene sets, its binding element should be over-represented in one of the corresponding promoter sets and under-represented in the other. This idea is implemented in POCO, which tests the hypothesis that the distributions of a pattern differ among three sets of promoters: up-regulated, down-regulated and randomly chosen. In the program, pattern discovery is based on explicit enumeration of all possible patterns over the alphabet (A, C, G, T and N). The mean occurrences and SDs of the patterns are estimated using bootstrapping, and their significance is assessed using ANOVA F-statistics, Tukey's honestly significant difference test and P-values. The program is freely available at http://ekhidna.biocenter.helsinki.fi/poco.
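The pipeline described in the abstract (pattern enumeration, bootstrap estimation, ANOVA F-statistic) can be illustrated with a minimal sketch. This is not POCO's code: the function names, the fixed bootstrap `sample_size`, and the choice to count overlapping matches are all assumptions made for illustration, and Tukey's test and the P-value computation are omitted.

```python
import random
from itertools import product

def enumerate_patterns(length, alphabet="ACGTN"):
    """Yield every pattern of the given length over the alphabet."""
    for letters in product(alphabet, repeat=length):
        yield "".join(letters)

def count_occurrences(pattern, sequence):
    """Count matches of pattern in sequence; N matches any base.
    Overlapping matches are counted (an assumption of this sketch)."""
    k = len(pattern)
    return sum(
        all(p == "N" or p == s for p, s in zip(pattern, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    )

def bootstrap_counts(pattern, promoters, sample_size, rounds, rng):
    """Total pattern count in each bootstrap resample of the promoter set.
    Promoters are drawn with replacement; sample_size is hypothetical."""
    per_promoter = [count_occurrences(pattern, p) for p in promoters]
    return [
        sum(rng.choice(per_promoter) for _ in range(sample_size))
        for _ in range(rounds)
    ]

def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups of numbers."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")  # no within-group variation
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

In use, one would call `bootstrap_counts` once per promoter set (up-regulated, down-regulated, background) for a given pattern and pass the three resulting lists to `anova_f`; a large F suggests that the pattern's distribution differs among the sets.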
Year: 2005 PMID: 15980504 PMCID: PMC1160228 DOI: 10.1093/nar/gki467
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. The five top patterns over-represented in the up-regulated and under-represented in the down-regulated WRKY70 promoter set
| Pattern | Up-regulated Occ | Up-regulated Pro | Up-regulated Avg | Up-regulated SD | Down-regulated Occ | Down-regulated Pro | Down-regulated Avg | Down-regulated SD | Background Avg | Background SD | F | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TTTNNACT/AGTNNAAA | 70 | 23 | 58.22 | 5.46 | 7 | 5 | 14.00 | 3.55 | 40.63 | 6.66 | 17120.47 | 3.7 × 10⁻⁷ |
| GACTNNNA/TNNNAGTC | 110 | 24 | 91.82 | 9.53 | 19 | 9 | 38.09 | 4.78 | 47.78 | 7.33 | 14690.51 | 3.5 × 10⁻⁶ |
| TNANNCNT/ANGNNTNA | 424 | 24 | 353.72 | 24.74 | 105 | 10 | 209.88 | 17.88 | 310.68 | 19.96 | 12294.69 | 3.2 × 10⁻⁵ |
| ATNATTC/GAATNAT | 62 | 22 | 51.48 | 6.94 | 6 | 4 | 12.07 | 3.50 | 31.25 | 5.90 | 12244.94 | 3.3 × 10⁻⁵ |
| TNTNNACT/AGTNNANA | 169 | 24 | 140.62 | 10.91 | 36 | 9 | 71.98 | 8.83 | 106.31 | 10.90 | 11192.62 | 8.8 × 10⁻⁵ |
In the table, Occ is the number of pattern occurrences, Pro is the number of promoters containing the pattern, and Avg and SD are the bootstrap mean and standard deviation.
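Each pattern in the table is listed together with its reverse complement (for example, TTTNNACT/AGTNNAAA). The following sketch shows one way the Occ and Pro columns could be computed under that reading. It assumes that a match on either strand counts, that a position matching both orientations is counted once, and that overlapping matches are allowed; the helper names are hypothetical and not taken from POCO.

```python
# Complement table for the pattern alphabet; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(pattern):
    """Return the reverse complement of a pattern over A, C, G, T, N."""
    return pattern.translate(COMPLEMENT)[::-1]

def matches_at(pattern, sequence, start):
    """True if pattern matches sequence at start; N matches any base."""
    window = sequence[start:start + len(pattern)]
    return len(window) == len(pattern) and all(
        p == "N" or p == s for p, s in zip(pattern, window)
    )

def occ_and_pro(pattern, promoters):
    """Return (occ, pro): total matching positions across all promoters,
    and the number of promoters with at least one match."""
    forms = {pattern, reverse_complement(pattern)}  # set handles palindromes
    occ = pro = 0
    for seq in promoters:
        hits = sum(
            any(matches_at(form, seq, i) for form in forms)
            for i in range(len(seq) - len(pattern) + 1)
        )
        occ += hits
        pro += hits > 0
    return occ, pro
```

For instance, `occ_and_pro("TTTNNACT", promoters)` would scan each promoter for both TTTNNACT and AGTNNAAA, so a binding element is found regardless of the strand on which it lies.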
Figure 1. Comparison of known and discovered patterns (alignments were done by hand). (a) Patterns resembling the W-box (15). (b) Patterns resembling the novel chitin binding element (19). (c) Patterns resembling the auxin-responsive element (20). (d) Patterns resembling the AGP-factor element (23).
Table 2. The five top patterns under-represented in the up-regulated and over-represented in the down-regulated WRKY70 promoter set (notation as in Table 1)
| Pattern | Up-regulated Occ | Up-regulated Pro | Up-regulated Avg | Up-regulated SD | Down-regulated Occ | Down-regulated Pro | Down-regulated Avg | Down-regulated SD | Background Avg | Background SD | F | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CANNNCCC/GGGNNNTG | 13 | 9 | 10.82 | 3.61 | 24 | 10 | 48.07 | 5.54 | 21.90 | 5.20 | 15524.94 | 2.9 × 10⁻⁵ |
| GCNCNGA/TCNGNGC | 7 | 7 | 5.84 | 1.99 | 17 | 9 | 33.91 | 4.91 | 13.43 | 4.04 | 14230.12 | 7.6 × 10⁻⁵ |
| AGANCNCA/TGNGNTCT | 7 | 6 | 5.83 | 2.44 | 16 | 8 | 32.00 | 4.51 | 12.43 | 3.73 | 13837.37 | 1.0 × 10⁻⁴ |
| TAGCNCNG/CNGNGCTA | 0 | 0 | 0.00 | 0.00 | 6 | 6 | 11.94 | 2.25 | 3.30 | 1.95 | 12829.72 | 2.1 × 10⁻⁴ |
| GCCNNNC/GNNNGGC | 48 | 21 | 39.86 | 5.80 | 44 | 10 | 87.80 | 5.52 | 49.31 | 9.44 | 12629.53 | 2.5 × 10⁻⁴ |