MotifHyades: expectation maximization for de novo DNA motif pair discovery on paired sequences.

Ka-Chun Wong

Abstract

MOTIVATION: In higher eukaryotes, protein-DNA binding interactions are the central activities in gene regulation. In particular, DNA motifs such as transcription factor binding sites are the key components in gene transcription. With chromatin interaction data recently available, computational methods are needed to systematically identify the coupling DNA motif pairs enriched on long-range chromatin-interacting sequence pairs (e.g. promoter-enhancer pairs).
RESULTS: To fill this void, a novel probabilistic model, MotifHyades, is proposed and developed for de novo DNA motif pair discovery on paired sequences. In particular, two expectation maximization algorithms are derived for efficient model training with linear computational complexity. Under diverse scenarios, MotifHyades is demonstrated to be faster and more accurate than the existing ad hoc computational pipeline. In addition, MotifHyades is applied to the human K562 cell line, where it discovers thousands of DNA motif pairs with a higher gold standard motif matching ratio, higher DNase accessibility and higher evolutionary conservation than the previously reported ones. Lastly, it has been run on five other human cell lines (i.e. GM12878, HeLa-S3, HUVEC, IMR90 and NHEK), revealing thousands more novel DNA motif pairs, which are characterized across a broad spectrum of genomic features on long-range promoter-enhancer pairs.
AVAILABILITY AND IMPLEMENTATION: The matrix-algebra-optimized versions of MotifHyades and the discovered DNA motif pairs are available at http://bioinfo.cs.cityu.edu.hk/MotifHyades.
CONTACT: kc.w@cityu.edu.hk.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Year:  2017        PMID: 28633280     DOI: 10.1093/bioinformatics/btx381

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  4 in total

1.  Heterodimeric DNA motif synthesis and validations.

Authors:  Ka-Chun Wong; Jiecong Lin; Xiangtao Li; Qiuzhen Lin; Cheng Liang; You-Qiang Song
Journal:  Nucleic Acids Res       Date:  2019-02-28       Impact factor: 16.971

2.  SamSelect: a sample sequence selection algorithm for quorum planted motif search on large DNA datasets.

Authors:  Qiang Yu; Dingbang Wei; Hongwei Huo
Journal:  BMC Bioinformatics       Date:  2018-06-18       Impact factor: 3.169

3.  A Clustering Approach for Motif Discovery in ChIP-Seq Dataset.

Authors:  Chun-Xiao Sun; Yu Yang; Hua Wang; Wen-Hu Wang
Journal:  Entropy (Basel)       Date:  2019-08-16       Impact factor: 2.524

4.  GLNMDA: a novel method for miRNA-disease association prediction based on global linear neighborhoods.

Authors:  Sheng-Peng Yu; Cheng Liang; Qiu Xiao; Guang-Hui Li; Ping-Jian Ding; Jia-Wei Luo
Journal:  RNA Biol       Date:  2018-09-23       Impact factor: 4.652
