Szymon M Kiełbasa, Holger Klein, Helge G Roider, Martin Vingron, Nils Blüthgen.
Abstract
Analysing putative transcription factor binding sites in the promoter regions of coregulated genes makes it possible to infer the transcription factors that underlie observed changes in gene expression. While such analyses constitute a central component of the in-silico characterization of transcriptional regulatory networks, there is still a lack of simple-to-use web servers that combine state-of-the-art prediction methods with phylogenetic analysis and appropriate multiple-testing-corrected statistics, and that return results within a short time. With these aims in mind, we developed TransFind, which is freely available at http://transfind.sys-bio.net/.
Year: 2010 PMID: 20511592 PMCID: PMC2896106 DOI: 10.1093/nar/gkq438
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. General overview of the algorithm.
Figure 2. TransFind identifies transcription factors with significantly enriched numbers of predicted targets in the regulated gene set with respect to an unregulated set. (A) The putative targets of factor 1 distribute randomly among the regulated genes (positive set) and non-regulated genes (negative set). (B) In contrast, the top targets of factor 2 are strongly enriched in the regulated genes.
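The contrast between factor 1 and factor 2 can be made concrete with a small enrichment calculation. The sketch below assumes a one-sided hypergeometric test, which is a common choice for this kind of comparison; the text here does not state which test TransFind applies. The function name and all gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(pos_hits, pos_size, total_hits, total_size):
    """Return P(X >= pos_hits) under a hypergeometric null model.

    total_size : number of genes in positive and negative set together
    pos_size   : number of genes in the positive (regulated) set
    total_hits : number of top predicted targets of the factor
    pos_hits   : predicted targets that fall into the positive set
    """
    upper = min(pos_size, total_hits)
    denom = comb(total_size, total_hits)
    # Sum the probability mass of every outcome at least as extreme.
    tail = sum(
        comb(pos_size, k) * comb(total_size - pos_size, total_hits - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


# Hypothetical toy numbers: 1000 genes, 50 of them regulated,
# 100 top predicted targets per factor (5 expected in the positive set).
p_factor1 = hypergeom_enrichment_p(5, 50, 100, 1000)   # targets spread randomly
p_factor2 = hypergeom_enrichment_p(20, 50, 100, 1000)  # targets enriched
```

With these toy numbers, factor 1 lands close to the 5 hits expected by chance, while factor 2 has four times as many, so only factor 2 would receive a small P-value.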
An example of a TransFind result showing all significant transcription factor matrices predicted to regulate a known set of 51 c-myc targets
| Rank | TF matrix | P-value | FDR | FP | Hits in positive set (%) | Hits in negative set (%) |
|---|---|---|---|---|---|---|
| 1 | V$NMYC_01 / N-Myc | 0.000001 | 0.000001 | 0.000001 | 11 (22.45) | 489 (0.98) |
The top 500 predicted targets of each transcription factor were analysed. The set of matrices was limited to the most informative matrix per Transfac factor. Following the matrix and factor names, the P-value, FDR and expected number of false positives (FP) are reported. In addition, the numbers of high-affinity matches for the factor in the positive set and in the negative set are shown, together with their relative abundance.
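Because one P-value is computed per matrix, the FDR column reflects a correction for multiple testing. The sketch below shows the Benjamini-Hochberg procedure as one widely used way to turn raw P-values into FDR estimates; it is an assumption for illustration, since the text here does not say which procedure TransFind uses. The function name and the input P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four matrices.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in fdr]
```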
Figure 3. Sensitivity of TransFind for different sizes of the positive gene set. We ran TransFind on random subgroups of literature-derived E2F, c-myc and NFκB target genes to determine the minimum number of genes sufficient for correct identification of the regulating transcription factor. We defined sensitivity as the fraction of randomly selected subgroups that resulted in a significant prediction of the respective transcription factor. We used default TransFind settings (500 top affinities, subset of Transfac matrices with highest information content).
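The sensitivity definition in this caption amounts to a simple subsampling loop. The following is a minimal sketch under stated assumptions: `detects_factor` is a hypothetical callback standing in for one TransFind run that reports whether the expected factor came out as significant, and the gene names, subset size and trial count are placeholders rather than values from the study.

```python
import random


def estimate_sensitivity(target_genes, subset_size, n_trials, detects_factor, seed=0):
    """Fraction of random subgroups for which the expected factor is recovered."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    recovered = 0
    for _ in range(n_trials):
        # Draw a random subgroup of known targets without replacement.
        subset = rng.sample(target_genes, subset_size)
        if detects_factor(subset):
            recovered += 1
    return recovered / n_trials


# Hypothetical stand-in for the real analysis: the factor counts as detected
# whenever the subgroup contains at least 3 of 10 "strong" target genes.
genes = [f"gene{i}" for i in range(50)]
strong = set(genes[:10])


def toy_detector(subset):
    return len(strong.intersection(subset)) >= 3


sensitivity = estimate_sensitivity(genes, subset_size=15, n_trials=200,
                                   detects_factor=toy_detector)
```

Repeating the call for a range of `subset_size` values would give a sensitivity curve of the kind the caption describes.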
Figure 4. Performance of TransFind. We defined 397 sets of genes which were annotated with top Gene Ontology terms and searched for enriched putative transcription factor targets using different parameters. Results for original data (dark grey) are compared to results for shuffled matrices or promoter sequences (medium grey) or to random gene sets of the same size distribution (light grey). Panel A shows the fraction of GO sets with at least one significant factor and panel B shows the average number of discovered factors in a GO set.
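The two quantities plotted in panels A and B can be computed from a list holding the number of significant factors found for each GO set. The sketch below illustrates that bookkeeping only; the function name and the counts are hypothetical and are not data from the study.

```python
def summarise_go_sets(factors_per_set):
    """Return (fraction of sets with >= 1 significant factor, mean factors per set)."""
    n_sets = len(factors_per_set)
    fraction_with_hit = sum(1 for n in factors_per_set if n > 0) / n_sets
    mean_factors = sum(factors_per_set) / n_sets
    return fraction_with_hit, mean_factors


# Hypothetical counts of significant factors for four gene sets,
# once for the original data and once for a shuffled control.
original_counts = [0, 2, 1, 0]
shuffled_counts = [0, 0, 1, 0]

panel_a_original, panel_b_original = summarise_go_sets(original_counts)
panel_a_shuffled, panel_b_shuffled = summarise_go_sets(shuffled_counts)
```

Comparing the summaries for original and control data mirrors the dark grey versus medium or light grey bars in the figure.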