Je-Keun Rhee, Je-Gun Joung, Jeong-Ho Chang, Zhangjun Fei, Byoung-Tak Zhang.
Abstract
BACKGROUND: Gene regulation is a key mechanism in higher eukaryotic cellular processes. One of the major challenges in gene regulation studies is to identify the regulators that affect the expression of their target genes in specific biological processes. Despite their importance, the regulators involved in diverse biological processes remain largely unidentified. In the present study, we propose a kernel-based approach to efficiently identify core regulatory elements involved in specific biological processes using gene expression profiles.
Year: 2009 PMID: 19958493 PMCID: PMC2788382 DOI: 10.1186/1471-2164-10-S3-S29
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Basic scheme of the kernel CCA. The sequence and expression data are mapped into a Hilbert space by the function φ. Taking inner products in that space yields a pair of projections u, one for the upstream sequences and one for the expression profiles, chosen to maximize the correlation between the two.
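The scheme in Figure 1 can be illustrated with a minimal regularized kernel CCA sketch. This is not the authors' implementation: the linear kernel, the regularization value `reg`, the toy data, and the names `kernel_cca`, `u_seq`, and `u_expr` are all assumptions made for illustration. It uses one common regularized formulation, in which the dual weights solve an eigenproblem built from the two centered kernel matrices.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space (H K H)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_cca(K_seq, K_expr, reg=0.1):
    """First canonical pair of a regularized kernel CCA (illustrative sketch).

    Solves (K_seq + reg*I)^-1 K_expr (K_expr + reg*I)^-1 K_seq alpha = rho^2 alpha
    and returns the dual weights, the two projections, and their correlation.
    """
    n = K_seq.shape[0]
    K_seq, K_expr = center_kernel(K_seq), center_kernel(K_expr)
    R_seq = K_seq + reg * np.eye(n)
    R_expr = K_expr + reg * np.eye(n)
    M = np.linalg.solve(R_seq, K_expr) @ np.linalg.solve(R_expr, K_seq)
    eigvals, eigvecs = np.linalg.eig(M)
    top = np.argmax(eigvals.real)              # leading eigenvalue
    alpha = eigvecs[:, top].real               # dual weights, sequence side
    beta = np.linalg.solve(R_expr, K_seq @ alpha)  # dual weights, expression side
    u_seq = K_seq @ alpha                      # projection of each gene (sequence view)
    u_expr = K_expr @ beta                     # projection of each gene (expression view)
    rho = np.corrcoef(u_seq, u_expr)[0, 1]
    return alpha, beta, u_seq, u_expr, rho

# Toy data (hypothetical): two views of 60 "genes" sharing one latent signal.
rng = np.random.default_rng(0)
n_genes = 60
shared = rng.normal(size=n_genes)
X_seq = np.column_stack([shared + 0.1 * rng.normal(size=n_genes),
                         rng.normal(size=n_genes)])
X_expr = np.column_stack([rng.normal(size=n_genes),
                          -shared + 0.1 * rng.normal(size=n_genes)])

# Linear kernels are an assumption; any positive semidefinite kernel could be used.
alpha, beta, u_seq, u_expr, rho = kernel_cca(X_seq @ X_seq.T, X_expr @ X_expr.T)
print("correlation of the first canonical pair:", round(float(rho), 3))
```

Plotting `u_seq` against `u_expr`, one point per gene, gives the kind of scatter plot described for the correlation figures. The regularization term matters with expressive kernels: without it, a full-rank kernel matrix can reach a perfect but meaningless correlation.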
Known regulatory motifs in yeast (Saccharomyces cerevisiae)
| Motif | | | |
|---|---|---|---|
| RAP1 | RPN4 | GCN4 | MCB |
| HAP234 | MIG1 | AFT1 | STRE' |
| CCA | CSRE | PHO4 | STE12 |
| HSE | ABF1 | ATRepeat | GAL |
| Leu3 | LYS14 | MET31-32 | OAF1 |
| PAC | PDR | PHO | REB1 |
| STRE | ECB | ndt80 (MSE) | Yap1 |
| SCB | Gcr1 | zap1 | MCM1' |
| MCM1 | SFF | SFF' | BAS1 |
| Ume6 (URS1) | SWI5 | ALPHA1' | ALPHA1 |
| ALPHA2' | ALPHA2 | | |
Figure 2. Relationship between gene expression profiles and regulatory sequence motifs. (a) Correlation between the gene expression profiles and the regulatory sequence motifs. Each dot represents one gene in the dataset; the x-axis and y-axis show the two projections u obtained by the kernel CCA. (b) Close-up view of the boxed area in (a).
Top-ranked motifs according to the weights assigned by the kernel CCA
| Motif | Weight | Function | Reference |
|---|---|---|---|
| SWI5 | 0.89026 | Transcription activation in G1 phase | [ |
| SFF' | 0.45399 | FKH1 binding site that regulates the cell cycle | [ |
| MCB | 0.29633 | MBF binding site that activates in late G1 phase | [ |
| LYS14 | 0.21796 | Lysine biosynthesis pathway | |
| ALPHA2 | 0.16532 | Encoding a homeobox-domain | [ |
Figure 3. Weight distributions for the MCB, SFF' and SWI5 motifs derived from cell cycle and non-cell cycle datasets. The dotted line indicates the weight distribution from the non-cell cycle datasets and the solid line the distribution from the cell cycle datasets.
Figure 4. Correlation between expression profiles and motifs derived from the raw upstream sequence data. Panel (b) is an enlargement of the boxed area in (a).
High-scoring motifs in the first and second components using 5-mers from raw upstream sequences
| Sequence | Motif Description | Weight | Component | Rank |
|---|---|---|---|---|
| GCGTG | MCB (ACGCGT) | 0.079567 | 1 | 1 |
| CGTGT | MATalpha2 (CRTGTWWWW) | 0.075340 | 1 | 2 |
| CATGT | MATalpha2 (CRTGTWWWW) | 0.046299 | 1 | 12 |
| CCACG | SCB (CACGAAA) | 0.018992 | 2 | 4 |
| CGCGT | MCB (ACGCGT) | 0.017870 | 2 | 5 |
| GTGTT | MATalpha2 (CRTGTWWWW) | 0.016595 | 2 | 9 |
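To make the 5-mer table above easier to read, the following sketch counts 5-mers in a sequence and checks where a 5-mer fits inside an IUPAC consensus such as CRTGTWWWW. The helper names `kmer_counts` and `match_offsets` and the example sequence are hypothetical, and the source does not describe how its 5-mers were annotated, so this is only one plausible way to relate a 5-mer to a known motif.

```python
from collections import Counter

# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def kmer_counts(sequence, k=5):
    """Count every overlapping k-mer in a DNA sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def match_offsets(kmer, consensus):
    """Offsets at which the k-mer fits entirely inside the IUPAC consensus."""
    hits = []
    for i in range(len(consensus) - len(kmer) + 1):
        window = consensus[i:i + len(kmer)]
        if all(base in IUPAC[code] for base, code in zip(kmer, window)):
            hits.append(i)
    return hits

# Hypothetical upstream fragment, used only to show the counting step.
counts = kmer_counts("ACGCGTG", k=5)

# 5-mers from the table against the consensus sequences listed beside them.
alpha2_hits = {kmer: match_offsets(kmer, "CRTGTWWWW")
               for kmer in ("CGTGT", "CATGT", "GTGTT")}
mcb_hits = {kmer: match_offsets(kmer, "ACGCGT") for kmer in ("CGCGT", "GCGTG")}
print(counts, alpha2_hits, mcb_hits)
```

Under this strict full-containment rule, a 5-mer such as GCGTG finds no position inside ACGCGT because it overlaps the consensus only partially (GCGT), so an annotation scheme that accepts such entries would need a looser criterion than the one sketched here.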
The top 10 ranked motif pairs and their ECRScores
| Weight | Motif Pair | | ECRScore | # of ORFs | Reference |
|---|---|---|---|---|---|
| 2.5368 | MCB | MCM1 | 0.390 | 15 | [ |
| 2.5018 | MCB | ECB | 0.439 | 12 | |
| 2.0177 | PHO | MCM1' | 0.088 | 17 | |
| 1.848 | ECB | ALPHA2 | 0.088 | 14 | |
| 1.7535 | MCM1 | ALPHA2 | 0.074 | 17 | [ |
| 1.7263 | ATRepeat | MCM1 | 0.076 | 12 | |
| 1.6995 | PHO | ECB | 0.127 | 11 | |
| 1.6823 | REB1 | SWI5 | 0.099 | 14 | [ |
| 1.6476 | REB1 | MCM1' | 0.115 | 13 | [ |
| 1.4256 | REB1 | ALPHA1 | 0.067 | 15 | [ |