Fabian A Buske, Mikael Bodén, Denis C Bauer, Timothy L Bailey.
Abstract
MOTIVATION: Transcription factors (TFs) are crucial during the lifetime of the cell. Their functional roles are defined by the genes they regulate. Uncovering these roles not only sheds light on the TF at hand but puts it into the context of the complete regulatory network.
Year: 2010 PMID: 20147307 PMCID: PMC2844991 DOI: 10.1093/bioinformatics/btq049
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 2. Multiple-species Gomo prediction accuracy. Each point shows the average AUC50 of TF–GO term association predictions made by Gomo in the key species E.coli (a) or S.cerevisiae (b). Points labeled ‘multiple-species’ use promoter sequences from the key species and three related species; Monkey results use Monkey (Moses et al., 2004) minimum P-value scores instead of Ama scores (Supplementary Material 1). Points labeled ‘single-species’ use promoter sequences from the key species only, and are shown for comparison. The AUC50 is computed using a single TF, then averaged over TFs. The X-axis shows the upstream extent of promoter sequences (‘full’), or the maximum upstream extent when they are truncated at the first ORF (‘intergenic’). For clarity, standard error bars are shown for the ‘full’ promoter sequence set only; standard error bars for the ‘intergenic’ promoter set are similar.
Fig. 1. Single-species Gomo prediction accuracy using transferred GO maps. Each point shows the average AUC50 of TF–GO term associations predicted by Gomo using the E.coli (a) or S.cerevisiae (b) GO map and TFs, and promoter sequences from the single given species. The AUC50 is computed using a single TF, then averaged over TFs. The X-axis shows the maximum upstream extent of promoter sequences, which are truncated at the first ORF. The inset shows the phylogenetic tree of the corresponding species. Branch lengths denote average substitutions per site.
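The captions do not define AUC50. As a minimal sketch, assuming the common ROC50 convention (the area under the ROC curve up to the first 50 false positives, normalized to lie between 0 and 1), the per-TF score and its average over TFs could be computed as below. The function names, the input layout, the assumption that a higher score means a stronger predicted TF–GO term association, and the handling of TFs with fewer than 50 negatives are all illustrative choices, not Gomo's implementation.

```python
def roc_n(scores, labels, n=50):
    """Normalized area under the ROC curve up to the first n false positives.

    scores: one number per GO term (assumed: higher = more confident).
    labels: 1 if the TF-GO term association is a known positive, else 0.
    """
    # Rank GO terms from most to least confident prediction.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(1 for _, is_pos in ranked if is_pos)
    if total_pos == 0:
        return 0.0  # assumption: a TF without known positives scores 0

    true_pos = 0
    false_pos = 0
    area = 0
    for _, is_pos in ranked:
        if is_pos:
            true_pos += 1
        else:
            # Each false positive adds the number of positives ranked above it.
            false_pos += 1
            area += true_pos
            if false_pos == n:
                break

    if false_pos == 0:
        return 1.0  # every item is a positive, so the ranking is trivially perfect
    # Assumption: with fewer than n negatives, normalize by those actually seen.
    return area / (false_pos * total_pos)


def mean_auc50(per_tf):
    """Average the per-TF AUC50 over all TFs, as the captions describe."""
    values = [roc_n(scores, labels) for scores, labels in per_tf.values()]
    return sum(values) / len(values)
```

For example, a TF whose ranked GO terms are positive, negative, positive, negative scores 0.75, because three of the four positive/negative pairs are ordered correctly.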
Improvement in TF role prediction using comparative genomics
| | Single species | Multiple species | Increase (%) | Single species | Multiple species | Increase (%) | Single species | Multiple species | Increase (%) |
|---|---|---|---|---|---|---|---|---|---|
| Significant TF–GO term pairs | 0 | 14 | NA | 420 | 733 | 75 | 371 | 1112 | 200 |
| GO terms per TF tested | 0 | 0.16 | NA | 3.4 | 5.9 | 75 | 6.6 | 19.8 | 200 |
| Covered TFs | 0 | 9 | NA | 99 | 113 | 14 | 48 | 56 | 17 |
| Term specificity | 0 | 4.0 | NA | 4.5 | 4.6 | 2 | 3.8 | 4.2 | 11 |
| TFs tested | 85 | | | 124 | | | 56 | | |
The table shows FDR (q≤0.05) results for single- and multiple-species Gomo. The results shown are the total number of most-specific significant pairs (‘significant TF–GO term pairs’), the average number of most-specific GO terms per TF tested (‘GO terms per TF tested’), the number of TFs with at least one significant TF–GO term pair (‘covered TFs’), the average depth in the GO hierarchy of significant GO terms (‘term specificity’), and the total number of TFs in each experiment (‘TFs tested’). All results are for GC-compensated Ama scores and ‘full’ promoters of 500, 750 and 1000 bp for enterobacter, yeast and mammals, respectively. NA, not applicable.
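The derived rows of the table follow from the pair counts and the number of TFs tested. A minimal sketch of that arithmetic, using only numbers copied from the table, is given below; the function and variable names are illustrative. The results agree with the tabulated values to within rounding in the last digit.

```python
# Counts copied from the table, one entry per group of three columns,
# in the order the groups appear (TFs tested: 85, 124, 56).
groups = [
    {"tfs_tested": 85, "single_pairs": 0, "multi_pairs": 14},
    {"tfs_tested": 124, "single_pairs": 420, "multi_pairs": 733},
    {"tfs_tested": 56, "single_pairs": 371, "multi_pairs": 1112},
]


def terms_per_tf(pairs, tfs_tested):
    """Average number of most-specific significant GO terms per TF tested."""
    return pairs / tfs_tested


def percent_increase(single, multiple):
    """Relative gain of multiple-species over single-species, in percent."""
    if single == 0:
        return None  # undefined when the baseline is zero; the table reports NA
    return 100.0 * (multiple - single) / single


for g in groups:
    per_tf = terms_per_tf(g["multi_pairs"], g["tfs_tested"])
    gain = percent_increase(g["single_pairs"], g["multi_pairs"])
    print(g["tfs_tested"], round(per_tf, 2), gain if gain is None else round(gain))
```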