Fernando Garcia, Francisco J Lopez, Carlos Cano, Armando Blanco.
Abstract
BACKGROUND: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources.
Year: 2009 PMID: 19615102 PMCID: PMC2722654 DOI: 10.1186/1471-2105-10-224
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Random Motifs. Power of the methods to recognize random PFMs generated by the same distribution.
Figure 2. Case study. Ratio of distances. To facilitate the visual comparison of the non-conserved positions, fraction-based logos are used. We do not show results for the measures proposed by Pape et al. [13] or Gupta et al. [12], since they need a background dataset to work properly.
Figure 3. Related motifs. Three dissimilar positions are observed between the reference motif and both the close and the distant motifs. Again, fraction-based logos are used to ease the visual comparison of the non-conserved positions.
Figure 4. ROC curves. ROC curves for the case of three different columns. FISim provides a more consistent classification than the rest of the methods.
JASPAR family distribution
| Family | Number of motifs | Family | Number of motifs |
| ETS | 7 | TRP | 5 |
| FORKHEAD | 8 | HMG | 6 |
| bHLH | 10 | HOMEO | 8 |
| bZIP EBP | 4 | NUCLEAR | 8 |
| MADS | 5 | bZIP CREB | 4 |
| REL | 6 | | |
Summary of the JASPAR classification. There exist 71 motifs divided into 11 families.
Figure 5. REL group retrieved by kcmeans. The FBP is computed from the multiple alignment of the TFs Dorsal_1 and RELA.
Figure 6. FISim pseudocode. Pseudocode of the algorithm used to compute FISim.
FISim example
| 0.95 | 0.1 | |||
| 0.9 | 0.1 | |||
| 0.8 | 0.841 | |||
| 0.9 | 1 |
Summary of the computation of the fuzzy integral for the given example (λ = -0.979). The minimum of h(i) and μ(A) in each row is shown in bold. The fuzzy integral value is the maximum of these minima, i.e. 0.2.
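The max-of-min rule described in the caption above matches what is usually called a Sugeno fuzzy integral over a Sugeno λ-measure. The following Python sketch illustrates that rule under stated assumptions; it is not the authors' FISim implementation. The function names `lambda_measure_chain` and `sugeno_integral`, the evidence values `h` and the densities are all hypothetical and are not the values of the example table, and λ is supplied by the caller rather than solved for.

```python
from typing import List, Sequence


def lambda_measure_chain(densities: Sequence[float], lam: float) -> List[float]:
    """Measures of the nested sets A_1, A_2, ..., where A_i holds the first i sources.

    Uses the lambda-measure recursion
        mu(A_i) = g_i + mu(A_{i-1}) + lam * g_i * mu(A_{i-1}),  mu(A_0) = 0.
    """
    chain = []
    mu_prev = 0.0
    for g_i in densities:
        mu_prev = g_i + mu_prev + lam * g_i * mu_prev
        chain.append(mu_prev)
    return chain


def sugeno_integral(h: Sequence[float], densities: Sequence[float], lam: float) -> float:
    """Max over i of min(h_(i), mu(A_i)), with sources sorted by decreasing h."""
    if len(h) != len(densities) or len(h) == 0:
        raise ValueError("h and densities must be non-empty and of equal length")
    # Sort source indices so that the strongest evidence comes first.
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    h_sorted = [h[i] for i in order]
    g_sorted = [densities[i] for i in order]
    chain = lambda_measure_chain(g_sorted, lam)
    # Take the minimum per row, then the maximum of those minima.
    return max(min(h_i, mu_i) for h_i, mu_i in zip(h_sorted, chain))


if __name__ == "__main__":
    # Hypothetical values: four sources with equal densities and lam = 0
    # (an additive measure), chosen only to keep the arithmetic easy to follow.
    print(sugeno_integral([0.9, 0.6, 0.3, 0.1], [0.25, 0.25, 0.25, 0.25], 0.0))
```

With these made-up inputs the nested measures are 0.25, 0.5, 0.75 and 1.0, the row minima are 0.25, 0.5, 0.3 and 0.1, and the integral is therefore 0.5. In practice λ has to be consistent with the densities, so that the measure of the full set equals 1; that root-finding step is deliberately left out of this sketch.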