Changqing Zhang, Jin Wang, Xu Hua, Jinggui Fang, Huaiqiu Zhu, Xiang Gao.
Abstract
BACKGROUND: Current approaches to identifying transcriptional regulatory elements mainly combine two properties: evolutionary conservation and the overrepresentation of functional elements in the promoters of co-regulated genes. Despite the development of many motif detection algorithms, discovering conserved motifs across a wide range of phylogenetically related promoters remains a challenge, especially for short motifs embedded in distantly related or very closely related promoters, or when too few orthologous genes are available.
Year: 2011 PMID: 21708002 PMCID: PMC3228546 DOI: 10.1186/1471-2105-12-262
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Performance comparisons of different tools on simulated data. The predictions shown in the histogram are from AlignACE, GLAM2 and Weeder, three tools based on over-represented word detection. The predictions shown in the line chart are from OCW, PhyloGibbs, PhyloCon and WeederH, which incorporate phylogenetic information into their algorithms. The extent of convergence of the artificial orthologous sequences used by these tools is expressed as sequence identity.
The functional elements detected by 7 tools*
| Dataset | Binding site | OCW | WeederH | PhyloGibbs | PhyloCon | AlignACE | GLAM2 | Weeder |
|---|---|---|---|---|---|---|---|---|
| 1 | ABRE(ACGTGKC) | + | + | + | - | + | + | + |
| | DRE(TACCGACAT) | + | - | - | + | + | + | - |
| 2 | ARF(TGTCTC) | + | + | + | + | + | + | + |
| 3 | ARF(TGTCTC) | + | + | + | + | + | - | + |
| 4 | XBP1BS/P-UPRE/ERSEI(CCACGTCAT) | + | + | + | + | + | - | + |
| | P-UPRE/ERSEI(ATTGGN9CCACG) | + | + | + | + | + | + | + |
| 5 | G-box(CACGTG) | - | + | + | + | + | + | + |
| 6 | SAUR(CATATG) | + | + | - | - | - | + | + |
| 7 | ABRE3(CAACGTG) | + | + | + | - | - | + | - |
| | extA(AACGTGT) | + | + | - | - | - | + | - |
* Only elements already reported in the literature are listed. '+' indicates successful detection; '-' indicates failed detection.
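The binding sites in the table are written as consensus strings that mix plain bases with IUPAC ambiguity codes (for example K in ACGTGKC) and a repeat count (N9 in ATTGGN9CCACG). As a minimal sketch of how such a consensus could be located in a promoter sequence, the Python snippet below converts it to a regular expression. The function names `motif_to_regex` and `find_sites` are hypothetical, reading "N9" as nine consecutive unspecified bases is an assumption about the table's notation, and only the forward strand is scanned. It is not part of any of the seven tools.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a consensus such as 'ACGTGKC' or 'ATTGGN9CCACG' to a regex.

    Assumption: digits following a symbol are a repeat count (N9 = nine Ns).
    """
    parts = []
    for symbol, count in re.findall(r"([A-Z])(\d*)", motif.upper()):
        parts.append(IUPAC[symbol] + (f"{{{count}}}" if count else ""))
    return re.compile("".join(parts))


def find_sites(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches, overlapping ones included."""
    # A lookahead lets the scan report matches that overlap each other.
    lookahead = re.compile(f"(?=({motif_to_regex(motif).pattern}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]
```

Calling `find_sites("ACGTGKC", promoter)` on a promoter string returns the start positions of every forward-strand ABRE-like site; scanning the reverse complement as well would require an extra pass that is left out here.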
Figure 2. Performance of OCW, PhyloCon, PhyloGibbs and WeederH on noisy data. The extent of noise was adjusted by introducing an increasing number (k) of random promoters into the phylogenetic sets.
Figure 3. Illustrations of the mutation degree model and the OCW method. (A) Illustration of the mutation degree model. The phylogenetic promoter sequences of Gene#1, Gene#2, Gene#3, etc. are highlighted in light blue. Mutation degrees between the promoter of Species1 and its phylogenetically related promoters are denoted as a1%, b1%, etc. The data in the result column are for demonstration only. The co-expressed gene set highlighted in lavender belongs to Species1. (B) Flow chart of OCW. Step 1: All oligonucleotides present in the co-expressed genes are enumerated; Step 2: Fisher's exact test is applied to assess the over-representation significance of the enumerated oligonucleotides; Step 3: The conservation score S of the elements resulting from Step 2 is calculated, and the elements with S>1 are reported; Step 4: Functional elements that meet the user-specified criteria are reported.
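Steps 1 and 2 of the flow chart can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the OCW implementation: the record does not say which counts enter the test, so the sketch counts promoters that contain an oligonucleotide at least once and compares the co-expressed set against a separate background set. The function names, the background set, and the `alpha` threshold are all hypothetical. The one-sided Fisher's exact test is computed as the upper tail of the hypergeometric distribution. The conservation score of Step 3 is not defined in this record and is omitted, as is any multiple-testing correction.

```python
from math import comb


def enumerate_oligos(promoters, k):
    """Step 1: map each k-mer to the set of promoter indices containing it."""
    occurrences = {}
    for idx, seq in enumerate(promoters):
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            occurrences.setdefault(seq[i:i + k], set()).add(idx)
    return occurrences


def fisher_over_representation(hits_fg, n_fg, hits_bg, n_bg):
    """Step 2: one-sided Fisher's exact test p-value.

    Probability of seeing at least hits_fg promoters with the oligo among
    the n_fg foreground promoters, given the pooled counts.
    """
    total = n_fg + n_bg
    hits = hits_fg + hits_bg
    denom = comb(total, n_fg)
    upper = min(hits, n_fg)
    tail = sum(comb(hits, x) * comb(total - hits, n_fg - x)
               for x in range(hits_fg, upper + 1))
    return tail / denom


def over_represented(coexpressed, background, k=6, alpha=0.01):
    """Return (oligo, p-value) pairs with p <= alpha, most significant first."""
    fg = enumerate_oligos(coexpressed, k)
    bg = enumerate_oligos(background, k)  # assumed background promoter set
    results = []
    for oligo, members in fg.items():
        p = fisher_over_representation(len(members), len(coexpressed),
                                       len(bg.get(oligo, ())), len(background))
        if p <= alpha:
            results.append((oligo, p))
    return sorted(results, key=lambda item: item[1])
```

Counting promoters rather than raw occurrences keeps a single repeat-rich promoter from dominating the test; a real run would also need far more sequences than a toy example for any p-value to be meaningful.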