Abstract
BACKGROUND: Discrimination of transcription factor binding sites (TFBS) from background sequences plays a key role in computational motif discovery. Current clustering-based algorithms employ a homogeneous model, which assumes that motifs and background signals can be equivalently characterized. This assumption is limiting because the two kinds of sequence signal have distinct properties.
Year: 2011 PMID: 21342545 PMCID: PMC3044270 DOI: 10.1186/1471-2105-12-S1-S16
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Evaluation results with comparisons
| SOMEA | SOMBRERO | MEME | ALIGNACE | WEEDER | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R | P | F | R | P | F | R | P | F | R | P | F | R | P | F | |
| CRP | 0.89 | 0.83 | 0.43 | 0.56 | 0.59 | 0.88 | 0.69 | 0.83 | 0.98 | 0.75 | 0.83 | 0.79 | |||
| GCN4 | 0.69 | 0.45 | 0.54 | 0.41 | 0.53 | 0.52 | 0.52 | 0.52 | 0.61 | 0.62 | 0.6 | 0.64 | |||
| ERE | 0.74 | 0.58 | 0.65 | 0.8 | 0.59 | 0.67 | 0.72 | 0.77 | 0.75 | 0.77 | 0.76 | 0.54 | 0.63 | ||
| MEF2 | 0.81 | 0.35 | 0.22 | 0.27 | 0.8 | 0.85 | 0.86 | 0.87 | 0.86 | 0.88 | 0.88 | 0.88 | |||
| SRF | 0.84 | 0.74 | 0.67 | 0.74 | 0.72 | 0.83 | 0.71 | 0.77 | 0.83 | 0.71 | 0.76 | ||||
| CREB | 0.67 | 0.83 | 0.43 | 0.56 | 0.59 | 0.69 | 0.52 | 0.66 | 0.57 | 0.79 | 0.71 | 0.75 | |||
| E2F | 0.64 | 0.71 | 0.76 | 0.67 | 0.71 | 0.68 | 0.64 | 0.65 | 0.75 | 0.89 | 0.67 | ||||
| MYOD | 0.39 | 0.5 | 0.32 | 0.39 | 0.23 | 0.38 | 0.27 | 0.34 | 0.31 | 0.32 | 0.43 | 0.46 | |||
| Average | 0.67 | 0.69 | 0.49 | 0.55 | 0.64 | 0.65 | 0.69 | 0.70 | 0.69 | 0.75 | 0.72 | ||||
Evaluation results with comparisons for multiple motifs datasets
| SOMEA | SOMBRERO | MEME | WEEDER | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R | P | F | R | P | F | R | P | F | R | P | F | ||
| Dataset1 | CREB | 0.43 | 0.26 | 0.26 | 0.20 | 0.00 | 0.00 | 0.00 | |||||
| MYOD | 0.20 | 0.08 | 0.11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ||||
| TBP | 0.21 | 0.20 | 0.12 | 0.15 | 0.07 | 0.12 | 0.00 | 0.00 | 0.00 | ||||
| Avg | 0.23 | 0.28 | 0.15 | 0.20 | 0.09 | 0.15 | 0.00 | 0.00 | 0.00 | ||||
| Dataset2 | NFAT | 0.39 | 0.27 | 0.31 | 0.36 | 0.21 | 0.26 | 0.00 | 0.00 | 0.00 | |||
| HNF4 | 0.57 | 0.40 | 0.47 | 0.39 | 0.48 | 0.60 | 0.82 | 0.40 | 0.57 | ||||
| SP1 | 0.50 | 0.53 | 0.35 | 0.42 | 0.38 | 0.44 | 0.00 | 0.00 | 0.00 | ||||
| Avg | 0.49 | 0.40 | 0.43 | 0.32 | 0.39 | 0.47 | 0.13 | 0.33 | 0.19 | ||||
| Dataset3 | CAAT | 0.21 | 0.25 | 0.32 | 0.17 | 0.22 | 0.29 | 0.00 | 0.00 | 0.00 | |||
| SRF | 0.40 | 0.59 | 0.28 | 0.38 | 0.29 | 0.38 | 0.00 | 0.00 | 0.00 | ||||
| MEF2 | 0.79 | 0.45 | 0.57 | 0.65 | 0.31 | 0.27 | 0.57 | 0.27 | 0.42 | ||||
| Avg | 0.35 | 0.44 | 0.52 | 0.25 | 0.29 | 0.46 | 0.09 | 0.33 | 0.14 | ||||
| Dataset4 | USF | 0.68 | 0.39 | 0.48 | 0.48 | 0.41 | 0.56 | 0.00 | 0.00 | 0.00 | |||
| HNF3B | 0.25 | 0.26 | 0.13 | 0.17 | 0.15 | 0.27 | 0.00 | 0.00 | 0.00 | ||||
| NFKB | 0.71 | 0.47 | 0.56 | 0.66 | 0.46 | 0.54 | 0.57 | 0.33 | 0.50 | ||||
| Avg | 0.37 | 0.45 | 0.55 | 0.36 | 0.43 | 0.45 | 0.11 | 0.33 | 0.17 | ||||
| Dataset5 | GATA3 | 0.37 | 0.46 | 0.49 | 0.33 | 0.36 | 0.40 | 0.75 | 0.52 | 0.40 | |||
| CMYC | 0.74 | 0.47 | 0.57 | 0.70 | 0.84 | 0.75 | 0.19 | 0.75 | 0.30 | ||||
| EGR1 | 0.36 | 0.47 | 0.47 | 0.26 | 0.33 | 0.64 | 0.00 | 0.00 | 0.00 | ||||
| Avg | 0.40 | 0.50 | 0.62 | 0.43 | 0.51 | 0.60 | 0.20 | 0.58 | 0.29 | ||||
Comparisons of performance with different map sizes
| Dataset | SOMEA (10 × 10) | SOMBRERO (10 × 10) | SOMEA (15 × 15) | SOMBRERO (15 × 15) | SOMEA (20 × 20) | SOMBRERO (20 × 20) |
|---|---|---|---|---|---|---|
| CREB | 0.70 | 0.41 | 0.76 | 0.67 | 0.72 | 0.67 |
| CRP | 0.81 | 0.71 | 0.66 | 0.71 | 0.58 | 0.52 |
| E2F | 0.58 | 0.73 | 0.69 | 0.63 | 0.72 | 0.67 |
| ERE | 0.53 | 0.42 | 0.66 | 0.60 | 0.61 | 0.74 |
| GCN | 0.41 | 0.44 | 0.51 | 0.52 | 0.58 | 0.60 |
| MEF | 0.68 | 0.92 | 0.91 | 0.80 | 0.82 | 0.44 |
| MYOD | 0.32 | 0.23 | 0.49 | 0.42 | 0.47 | 0.49 |
| SRF | 0.70 | 0.67 | 0.77 | 0.72 | 0.71 | 0.71 |
Shown is the average F-measure over five runs on eight real datasets for three map sizes: 10 × 10, 15 × 15 and 20 × 20.
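
The R, P and F columns in the tables above are read here as recall, precision and F-measure. The record does not state the formula, so the sketch below assumes the common definition of the F-measure as the harmonic mean of recall and precision. The triple 0.74, 0.58, 0.65 in the ERE row of the first table is consistent with that assumption. The function names `f_measure` and `average_f` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (assumed definition of F)."""
    if recall + precision == 0:
        # No true positives at all: define F as 0 to avoid dividing by zero.
        return 0.0
    return 2 * recall * precision / (recall + precision)


def average_f(runs):
    """Average F over several runs, each given as a (recall, precision) pair."""
    return mean(f_measure(r, p) for r, p in runs)


# Recall 0.74 and precision 0.58 give an F-measure that rounds to 0.65.
print(round(f_measure(0.74, 0.58), 2))
```

Averaging the per-run F values, as `average_f` does, is one plausible reading of "average F-measure over five runs". Computing F from averaged recall and precision would give slightly different numbers, which may explain small rounding mismatches in some rows.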