Martin Nettling1, Hendrik Treutler2, Jesus Cerquides3, Ivo Grosse4,5.
Abstract
BACKGROUND: Transcriptional gene regulation is a fundamental process in nature, and the experimental and computational investigation of DNA binding motifs and their binding sites is a prerequisite for elucidating this process. Approaches for de-novo motif discovery can be subdivided into phylogenetic footprinting, which takes into account phylogenetic dependencies in aligned sequences of more than one species, and non-phylogenetic approaches, which are based on sequences from only one species and typically take into account intra-motif dependencies. It has been shown that modeling (i) phylogenetic dependencies and (ii) intra-motif dependencies separately improves de-novo motif discovery, but there is no approach capable of modeling both (i) and (ii) simultaneously.
Keywords: ChIP-Seq; Evolution; Gene regulation; Phylogenetic footprinting; Transcription factor binding sites
Year: 2017 PMID: 28249564 PMCID: PMC5333389 DOI: 10.1186/s12859-017-1495-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Sequence logos and intra-motif dependencies for the TFs a CJUN and b Nrf. We depict for both TFs (i) the sequence logo inferred by the PFM(2) from all species in the first row and (ii) the MI profiles of orders 1 and 2 inferred by the PFM(2) in the second row. The MI profiles of order 2 are larger than the MI profiles of order 1. Please see Additional file 3 for the MI profiles of all 35 TFs and Additional file 5 for all sequence logos of all 35 TFs for the PFMs of orders 0, 1, and 2
Fig. 2 Maximum and average MIs of MI profiles inferred by the PFM(2) for all 35 TFs. In panel a we show the maximum MI of the MI profiles of orders 1 and 2. In panel b we show the average MI of the MI profiles of orders 1 and 2. The dashed lines indicate the mean of the maximum MIs and the mean of the average MIs for both MI profiles, respectively. The degree of intra-motif dependencies depends on the TF and is always larger for dependencies of order 2 than for dependencies of order 1. Please see Additional file 3 for the MI profiles of all 35 TFs
Fig. 3 Classification performance for PFMs with base dependencies of orders 0, 1, and 2. a We show the mean and standard error of the ROC AUC for PFMs of orders 0, 1, and 2 averaged over 25-fold stratified repeated random subsampling. b We plot the mean and standard error of the relative increase of the ROC AUC for the PFMs of orders 1 and 2 relative to the PFM of order 0 for each of the 35 TFs. Taking into account base dependencies of order 1 increases the classification performance for 31 TFs. Taking into account base dependencies of order 2 increases the classification performance in all cases, and this increase is larger than the one obtained with base dependencies of order 1 in all cases. See Additional file 6 for detailed ROC and PR curves for the PFMs of order 2
Fig. 4 Classification performance averaged for all 35 TFs. a We show the ROC AUC for PFMs of orders 0, 1, and 2 in percent averaged over 25-fold stratified repeated random subsampling and averaged over all 35 TFs. The overall classification performance increases with the order of the PFM. b We show the improvement of the ROC AUC for the PFMs of orders 1 and 2 relative to the PFM of order 0 averaged over 25-fold stratified repeated random subsampling and averaged over all 35 TFs
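
The captions for Figs. 1 and 2 refer to MI (mutual information) profiles as a measure of intra-motif dependencies. This record does not give the exact definition used by the authors, so the following is only a minimal sketch under stated assumptions: it computes a plug-in estimate of the mutual information between adjacent positions of equal-length aligned binding sites, which is one plausible reading of an order-1 profile. The function names `mutual_information` and `mi_profile_order1` and the toy sequences are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Plug-in estimate of the mutual information (in bits) between
    columns i and j of a list of equal-length aligned sequences."""
    n = len(sites)
    pair_counts = Counter((s[i], s[j]) for s in sites)
    left_counts = Counter(s[i] for s in sites)
    right_counts = Counter(s[j] for s in sites)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to limit rounding
        ratio = p_ab * n * n / (left_counts[a] * right_counts[b])
        mi += p_ab * math.log2(ratio)
    return mi


def mi_profile_order1(sites):
    """Assumed order-1 profile: MI between each pair of adjacent positions."""
    width = len(sites[0])
    return [mutual_information(sites, k, k + 1) for k in range(width - 1)]


# Toy data (not from the paper): positions 0 and 1 are perfectly coupled,
# position 2 is constant, position 3 varies independently of position 2.
sites = ["ACGT", "ACGA", "TGGT", "TGGA"]
profile = mi_profile_order1(sites)
print(profile)
```

In this toy alignment only the first pair of positions carries a dependency, so the first profile entry is the only nonzero one. A higher-order profile would condition on more than one neighboring position; how the authors define it is not stated here, so it is left out of the sketch.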