Discriminative motif analysis of high-throughput dataset.

Zizhen Yao, Kyle L Macquarrie, Abraham P Fong, Stephen J Tapscott, Walter L Ruzzo, Robert C Gentleman.

Abstract

MOTIVATION: High-throughput ChIP-seq studies typically identify thousands of peaks for a single transcription factor (TF). It is common for traditional motif discovery tools to predict motifs that are statistically significant against a naïve background distribution but are of questionable biological relevance.
RESULTS: We describe a simple yet effective algorithm for discovering differential motifs between two sequence datasets that is effective in eliminating systematic biases and scalable to large datasets. Tested on 207 ENCODE ChIP-seq datasets, our method identifies correct motifs in 78% of the datasets with known motifs, demonstrating improvement in both accuracy and efficiency compared with DREME, another state-of-the-art discriminative motif discovery tool. More interestingly, on the remaining more challenging datasets, we identify common technical or biological factors that compromise the motif search results and use advanced features of our tool to control for these factors. We also present case studies demonstrating the ability of our method to detect single base pair differences in DNA specificity of two similar TFs. Lastly, we demonstrate discovery of key TF motifs involved in tissue specification by examination of high-throughput DNase accessibility data.
AVAILABILITY: The motifRG package is publicly available via the Bioconductor repository.
CONTACT: yzizhen@fhcrc.org
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
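
To make the idea of a "differential motif between two sequence datasets" concrete, the following is a minimal illustrative sketch in Python. It is not the motifRG algorithm or its API; it assumes a much simpler stand-in technique: count how many sequences in a foreground set and a background set contain each k-mer (treating a k-mer and its reverse complement as one), then rank k-mers by a pseudocount-smoothed log odds ratio. The function names (`canonical`, `kmer_presence`, `rank_kmers`) and the 0.5 pseudocount are hypothetical choices made for this example.

```python
import math
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_DNA = set("ACGT")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmer_presence(seqs, k):
    """Count, for each canonical k-mer, the number of sequences that contain it."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        seen = set()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= _DNA:  # skip windows with N or other symbols
                seen.add(canonical(kmer))
        counts.update(seen)  # each sequence contributes at most once per k-mer
    return counts


def rank_kmers(foreground, background, k=6):
    """Rank k-mers by smoothed log odds of presence in foreground vs background.

    Assumption for this sketch: a 0.5 pseudocount keeps the odds finite
    when a k-mer is absent from one of the two sets.
    """
    fg_counts = kmer_presence(foreground, k)
    bg_counts = kmer_presence(background, k)
    n_fg, n_bg = len(foreground), len(background)
    scored = []
    for kmer in set(fg_counts) | set(bg_counts):
        p_fg = (fg_counts[kmer] + 0.5) / (n_fg + 1)
        p_bg = (bg_counts[kmer] + 0.5) / (n_bg + 1)
        log_odds = math.log(p_fg / (1 - p_fg)) - math.log(p_bg / (1 - p_bg))
        scored.append((log_odds, kmer))
    scored.sort(reverse=True)
    return scored
```

Called as `rank_kmers(peak_sequences, control_sequences, k=6)`, the function returns `(score, kmer)` pairs with the most foreground-enriched k-mers first. Scoring against a matched background set rather than a naïve base-composition model is the point of the discriminative setup: biases shared by both sets tend to cancel in the ratio.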

Year:  2013        PMID: 24162561      PMCID: PMC3957073          DOI: 10.1093/bioinformatics/btt615

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


