Linden J Gearing, Helen E Cumming, Ross Chapman, Alexander M Finkel, Isaac B Woodhouse, Kevin Luu, Jodee A Gould, Samuel C Forster, Paul J Hertzog.
Abstract
The availability of large amounts of high-throughput genomic, transcriptomic and epigenomic data has provided an opportunity to understand regulation of the cellular transcriptome with an unprecedented level of detail. As a result, research has advanced from identifying gene expression patterns associated with particular conditions to elucidating signalling pathways that regulate expression. There are over 1,000 transcription factors (TFs) in vertebrates that play a role in this regulation. Determining which of these are likely to be controlling a set of genes can be assisted by computational prediction, utilising experimentally verified binding site motifs. Here we present CiiiDER, an integrated computational toolkit for transcription factor binding analysis, written in the Java programming language to make it independent of the computer operating system. It is operated through an intuitive graphical user interface with interactive, high-quality visual outputs, making it accessible to all researchers. CiiiDER predicts transcription factor binding sites (TFBSs) across regulatory regions of interest, such as promoters and enhancers derived from any species. It can perform an enrichment analysis to identify TFs that are significantly over- or under-represented in comparison to a bespoke background set and thereby elucidate pathways regulating sets of genes of pathophysiological importance.
Year: 2019 PMID: 31483836 PMCID: PMC6726224 DOI: 10.1371/journal.pone.0215495
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. CiiiDER workflow.
A typical analysis involves submitting gene sets for scanning against known TF models, followed by the identification of sites that are statistically enriched relative to a submitted background gene list.
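The scanning step of this workflow can be illustrated with a minimal Python sketch. It is not CiiiDER's implementation (which is written in Java): the toy frequency matrix, the normalised scoring scheme, the `min_score` cut-off and all function names are assumptions made for illustration only.

```python
# Hypothetical toy TF model: per-position base frequencies for a 4-bp motif
# whose consensus is AGCT. Real models would come from a motif database.
TOY_MODEL = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]


def window_score(model, window):
    """Score a window on a scale where 1.0 is a perfect consensus match.

    Bases missing from the model (for example N) contribute zero.
    """
    raw = sum(column.get(base, 0.0) for column, base in zip(model, window))
    best = sum(max(column.values()) for column in model)
    worst = sum(min(column.values()) for column in model)
    return (raw - worst) / (best - worst)


def scan(model, sequence, min_score=0.85):
    """Return (start, matched_text, score) for each window at or above min_score."""
    width = len(model)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        score = window_score(model, window)
        if score >= min_score:
            hits.append((start, window, round(score, 3)))
    return hits


def regions_bound(model, regions, min_score=0.85):
    """Return the names of regions that contain at least one predicted site."""
    return {name for name, seq in regions.items() if scan(model, seq, min_score)}
```

Running `regions_bound` once on the query regions and once on the background regions gives the bound-region counts that an enrichment comparison would start from.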
Fig 2. CiiiDER interactive site map.
The scan and enrichment algorithms produce a graphical display of the TFBS locations on the sequences. There are many options to edit the images, including adjusting the deficit and P-value thresholds for displaying TFBSs, selecting or removing TFs to be viewed, editing the colour scheme for TFs and rearranging the order of the sequences.
Fig 3. CiiiDER enrichment results for the breast cancer metastasis dataset.
The data are derived from the proportion of regions bound for each TF, which is the number of bound regions divided by the total number of regions. The plot shows the enrichment (ratio of proportion bound) and average log proportion bound. Size and colour show the significance score, ∓log10(P-value); the score is greater than zero if the TF is over-represented and less than zero if under-represented. Underlying data are provided in S1 Data.
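The quantities named in this caption can be sketched in Python as follows. The direction of the ratio (query over background), the use of base-2 logarithms, and all function and field names are assumptions. The P-value is taken as an input because the caption does not state which statistical test produces it.

```python
import math


def proportion_bound(n_bound, n_total):
    """Number of bound regions divided by the total number of regions."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_bound / n_total


def enrichment_summary(query_bound, query_total,
                       background_bound, background_total, p_value):
    """Compute the plotted quantities for one TF (illustrative sketch only)."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must lie in (0, 1]")
    q = proportion_bound(query_bound, query_total)
    b = proportion_bound(background_bound, background_total)
    if q == 0 or b == 0:
        raise ValueError("log-based values need a bound region in each set")

    enrichment = q / b  # assumed direction: query over background
    average_log_proportion = (math.log2(q) + math.log2(b)) / 2  # base 2 assumed

    # Signed significance score: positive when over-represented,
    # negative when under-represented.
    magnitude = -math.log10(p_value)
    score = magnitude if enrichment >= 1 else -magnitude

    return {
        "enrichment": enrichment,
        "average_log_proportion": average_log_proportion,
        "significance_score": score,
    }
```

For example, a TF bound in 40 of 100 query regions and 20 of 100 background regions has an enrichment of 2, and a P-value of 0.001 then gives a significance score of +3.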
Fig 4. Phylogenetic conservation of TF binding sites in the IFNβ promoter.
The results of the enrichment algorithm, displaying the ten most significantly enriched TFs present in at least half of the promoters. Underlying data are provided in S3 Data.
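The selection described in this caption (the ten most significant enriched TFs found in at least half of the promoters) could be expressed as a simple filter and sort. The record fields `tf`, `p_value`, `enrichment` and `promoters_bound` are hypothetical names, and treating "enriched" as an enrichment ratio above 1 is an assumption.

```python
def top_enriched(results, n_promoters, top_n=10, min_fraction=0.5):
    """Return the most significant over-represented TFs that are widely present.

    results: list of dicts with hypothetical keys
             'tf', 'p_value', 'enrichment', 'promoters_bound'.
    n_promoters: total number of promoters analysed.
    """
    if n_promoters <= 0:
        raise ValueError("n_promoters must be positive")
    kept = [
        r for r in results
        if r["enrichment"] > 1  # over-represented only (assumption)
        and r["promoters_bound"] / n_promoters >= min_fraction
    ]
    # The smallest P-value is the most significant.
    kept.sort(key=lambda r: r["p_value"])
    return kept[:top_n]
```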