Bogdan Tokovenko, Rostyslav Golda, Oleksiy Protas, Maria Obolenskaya, Anna El'skaya.
Abstract
COTRASIF is a web-based tool for the genome-wide search of evolutionarily conserved regulatory regions (transcription factor-binding sites, TFBS) in eukaryotic gene promoters. Predictions are made using either a position-weight matrix search method or a hidden Markov model search method, depending on the availability of the matrix and actual sequences of the target TFBS. COTRASIF is a fully integrated solution incorporating a gene promoter database (based on the regular Ensembl genome annotation releases) together with the JASPAR and TRANSFAC databases of TFBS matrices. To decrease the false-positive rate, an integrated evolutionary conservation filter is available, which retains only those predicted TFBS that are also present in the promoters of orthologous genes in related species. COTRASIF is easy to use, maintains a regularly updated database of promoters and is a powerful solution for genome-wide TFBS searching. COTRASIF is freely available at http://biomed.org.ua/COTRASIF/.
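The two ideas named in the abstract, a position-weight matrix scan followed by a conservation filter, can be illustrated with a minimal sketch. This is a generic log-odds scan and not COTRASIF's actual implementation: the scoring formula, the pseudocount, the threshold, the motif counts, the promoter sequences, the gene names and the ortholog mapping are all hypothetical, and sequences are assumed to contain only uppercase A, C, G and T.

```python
import math

NUCLEOTIDES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position nucleotide counts into log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            n: math.log2((column.get(n, 0) + pseudocount) / total / background)
            for n in NUCLEOTIDES
        })
    return pwm

def scan(promoter, pwm, threshold):
    """Return 0-based start offsets whose window score reaches the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(promoter) - width + 1):
        window = promoter[start:start + width]
        score = sum(col[base] for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append(start)
    return hits

def conserved_genes(hits_by_gene, ortholog_of, ortholog_hits_by_gene):
    """Keep genes with a hit whose ortholog's promoter also has a hit."""
    return sorted(
        gene for gene, hits in hits_by_gene.items()
        if hits and ortholog_hits_by_gene.get(ortholog_of.get(gene))
    )

# Hypothetical 4-bp motif counts (consensus TTTC) and toy promoters.
motif_counts = [{"T": 8}, {"T": 8}, {"T": 8}, {"C": 8}]
pwm = log_odds_pwm(motif_counts)

rat_promoters = {"geneA": "ACGTTTCAGG", "geneB": "GGTTTCAACC", "geneC": "ACACACACAC"}
mouse_promoters = {"mA": "CCTTTCGGAA", "mB": "ACGTACGTAC", "mC": "GGGGGGGGGG"}
ortholog_of = {"geneA": "mA", "geneB": "mB", "geneC": "mC"}

THRESHOLD = 6.0  # assumed cut-off; only exact matches pass with these counts
rat_hits = {g: scan(s, pwm, THRESHOLD) for g, s in rat_promoters.items()}
mouse_hits = {g: scan(s, pwm, THRESHOLD) for g, s in mouse_promoters.items()}

conserved = conserved_genes(rat_hits, ortholog_of, mouse_hits)
print(conserved)
```

In this toy setup the filter works at the gene level: a prediction survives only when the orthologous promoter carries at least one hit for the same matrix, which is one simple reading of the conservation criterion described above.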
Year: 2009 PMID: 19264796 PMCID: PMC2673430 DOI: 10.1093/nar/gkp084
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. COTRASIF organization and data flow scheme.
Figure 2. An example of calculating a single co-occurrence matrix for positions 8 and 9, using the aligned known sequences of the ISRE (interferon-stimulated response element). In the co-occurrence matrix, the number at each row-column intersection is the observed frequency of the corresponding nucleotide pair, e.g. the AT pair is observed twice, while CA, CC, CG and CT are each observed 0 times. (A) Eight aligned sequences; (B) the resulting co-occurrence matrix for the highlighted pair of positions 8 and 9.
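The counting step described in the Figure 2 caption can be sketched in a few lines. The eight sequences below are a hypothetical toy alignment, not the ISRE sequences shown in the figure, and the function name `cooccurrence_matrix` is introduced here only for illustration. Rows correspond to the nucleotide at the first position and columns to the nucleotide at the second, both in A, C, G, T order.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def cooccurrence_matrix(aligned, pos_a, pos_b):
    """Count nucleotide pairs observed at two 1-based alignment columns."""
    counts = Counter((seq[pos_a - 1], seq[pos_b - 1]) for seq in aligned)
    # Rows: nucleotide at pos_a; columns: nucleotide at pos_b.
    return [[counts[(row, col)] for col in NUCLEOTIDES] for row in NUCLEOTIDES]

# Hypothetical toy alignment of eight equal-length sites (NOT the Figure 2 data).
toy_sites = [
    "AGTTTCGATT",
    "AGTTTCAATT",
    "GGTTTCGGTT",
    "AGTTTCTTTC",
    "AGTTTCGGAA",
    "AGTTTCAGTT",
    "TGTTTCGTTT",
    "AGTTTCAAAT",
]

matrix = cooccurrence_matrix(toy_sites, 8, 9)
for nucleotide, row in zip(NUCLEOTIDES, matrix):
    print(nucleotide, row)
```

For this toy input the AT cell holds 2 and the entire C row is zero, mirroring the pattern the caption describes; the entries of the matrix always sum to the number of aligned sequences.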
GO over-represented terms analysis results for 707 rat genes with putative ISREs found using COTRASIF's HMM-PWM search method (list #1) versus all rat protein-coding genes (list #2)
| Index | Term | List #1 versus #2 | P-value | Adjusted P-value |
|---|---|---|---|---|
| GO molecular function at level 6 | | | | |
| 10 | Serine-type endopeptidase activity (GO:0004252) | 75.27%, 24.73% | 6.2 × 10⁻⁵ | 0.0382 |
| GO molecular function at level 7 | | | | |
| 11 | Tissue kallikrein activity (GO:0004293) | 93.62%, 6.38% | 1.4 × 10⁻³ | 0.0402 |
| GO cellular component at level 7 | | | | |
| 18 | MHC protein complex (GO:0042611) | 83.59%, 16.41% | 3.76 × 10⁻⁴ | 0.0177 |
| GO cellular component at level 8 | | | | |
| 19 | MHC class I protein complex (GO:0042612) | 86.64%, 13.36% | 8.52 × 10⁻⁵ | 0.00554 |
| GO molecular function at level 5 | | | | |
| 9 | Serine-type peptidase activity (GO:0008236) | 74.91%, 25.09% | 5.05 × 10⁻⁵ | 0.0211 |