Markus Friberg, Peter von Rohr, Gaston Gonnet.
Abstract
BACKGROUND: Transcription factor binding site (TFBS) prediction is a difficult problem, which requires a good scoring function to discriminate between real binding sites and background noise. Many scoring functions have been proposed in the literature, but it is difficult to assess their relative performance, because they are implemented in different software tools using different search methods and different TFBS representations.
Year: 2005 PMID: 15807889 PMCID: PMC1140076 DOI: 10.1186/1471-2105-6-84
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Comparison of scoring functions on eight different data sets (lower rank is better).
Figure 2. Comparison of scoring functions on the reb1 data set with different amounts of added noise. The average of ten independent runs is shown (lower average rank is better).
Figure 3. Comparison of scoring functions on the mig1 data set with different amounts of added noise. The average of ten independent runs is shown (lower average rank is better).
Schematic interpretation of the results for the reb1 and mig1 data sets with added noise (good: top 3; ok: top 10; bad: worse than top 10)
| | MAP | GroupSpec | PosBias | LocPosBias | LLBG | LLBG LocPosBias |
| reb1 | good | good | good | ok | good | good |
| reb1+10 | good | good | good | bad | good | good |
| reb1+20 | bad | ok | good | bad | good | ok |
| reb1+30 | bad | ok | good | bad | ok | ok |
| mig1 | good | good | bad | bad | good | good |
| mig1+10 | ok | bad | bad | bad | good | ok |
| mig1+20 | bad | bad | bad | bad | ok | bad |
| mig1+30 | bad | bad | bad | bad | bad | bad |
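The good/ok/bad labels in the table above follow a simple threshold rule on the rank. A minimal sketch of that rule is given below; the function name `classify_rank` is hypothetical, and applying the thresholds directly to fractional average ranks (as arise from averaging ten runs) is an assumption, since the source does not say how averages were rounded.

```python
def classify_rank(rank):
    """Map a rank to the table's labels: good (top 3), ok (top 10), bad (worse)."""
    if rank < 1:
        raise ValueError("ranks start at 1")
    if rank <= 3:
        return "good"
    if rank <= 10:
        return "ok"
    return "bad"


# Illustrative ranks only; these are not values reported in the source.
labels = [classify_rank(r) for r in (1, 3, 3.4, 10, 27)]
```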
Figure 4. Example of Local Positional Bias calculation: A promoter sequence of 1000 bp is split into windows of 100 bp each. 16 motif occurrences are distributed over the 10 windows.
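The windowing step described in the Figure 4 caption can be sketched as follows. This covers only the binning of motif occurrences into windows; the caption does not give the formula that turns the counts into a Local Positional Bias score, so none is attempted here. The function name `window_counts` and the 16 start positions are hypothetical, chosen only to match the numbers in the caption (1000 bp promoter, 100 bp windows, 16 occurrences).

```python
def window_counts(positions, promoter_length=1000, window_size=100):
    """Count motif occurrences in consecutive fixed-size windows.

    positions: 0-based start positions of motif occurrences in the promoter.
    """
    n_windows = -(-promoter_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < promoter_length:
            raise ValueError(f"position {pos} lies outside the promoter")
        counts[pos // window_size] += 1
    return counts


# Hypothetical positions for 16 occurrences (not data from the paper).
positions = [12, 45, 130, 310, 322, 347, 355, 368,
             391, 402, 560, 610, 733, 805, 870, 990]
counts = window_counts(positions)
```

With these made-up positions, six of the 16 occurrences fall in the fourth window (300 to 399 bp), which is the kind of local clustering a positional bias score is meant to detect.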
Correlation coefficients between scores
| | GroupSpec | PosBias | LocPosBias | LLBG |
| MAP | 0.36 | 0.06 | 0.02 | 0.43 |
| GroupSpec | | -0.19 | -0.02 | 0.26 |
| PosBias | | | 0.01 | 0.02 |
| LocPosBias | | | | |
χ² independence test (standard deviations)
| | GroupSpec | PosBias | LocPosBias | LLBG |
| MAP | 8.48 | 0.66 | 1.25 | 9.00 |
| GroupSpec | | 3.86 | 1.09 | 5.62 |
| PosBias | | | 0.47 | 1.27 |
| LocPosBias | | | | |
Data sets of promoter sequences of genes regulated by different transcription factors. By 'molecular biology approaches' we mean methods like DNase footprinting and methylation interference. 'AlignACE' stands for functional group data from the AlignACE web server. '#seqs' stands for number of promoter sequences in the data set.
| TF | TFBS consensus motif | #seqs | source and type of evidence |
| abf1 | CGTNNNNNNTGA | 20 | molecular biology approaches [20] |
| gal4 | CGGNNNNNNNNNNNCCG | 10 | molecular biology approaches [20] and AlignACE [6] |
| mac1 | TTTGCTCA | 6 | microarray [21] |
| mcm1 | TTTCCCAAANNGGAAA | 24 | molecular biology approaches [20] |
| mig1 | AAAAATCTGGGG | 11 | molecular biology approaches [22] |
| pdr | TCCGCGGA | 11 | AlignACE [6] |
| rap1 | TACACCCATACATT | 44 | molecular biology approaches [23] [24] |
| reb1 | TTACCCG | 13 | molecular biology approaches [20] |
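To make the consensus notation in the table concrete, the sketch below reads `N` as "any base" and scans a sequence for exact matches on the forward strand. This is only an illustration of the notation: it is not one of the scoring functions compared in the paper, and the helper names `consensus_to_pattern` and `find_sites`, as well as the test sequence, are hypothetical.

```python
import re


def consensus_to_pattern(consensus):
    """Translate a consensus string over A, C, G, T, N into a regex pattern."""
    parts = []
    for ch in consensus.upper():
        if ch == "N":
            parts.append("[ACGT]")  # N matches any single base
        elif ch in "ACGT":
            parts.append(ch)
        else:
            raise ValueError(f"unsupported symbol: {ch!r}")
    return "".join(parts)


def find_sites(consensus, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A zero-width lookahead lets overlapping matches be reported.
    lookahead = re.compile("(?=" + consensus_to_pattern(consensus) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# abf1 consensus from the table against a made-up sequence.
sites = find_sites("CGTNNNNNNTGA", "AACGTAAAAAATGACC")
```

Exact consensus matching of this kind ignores degenerate sites and the reverse strand, which is part of why the scoring functions compared in the figures are needed.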