
Computational modeling of in vivo and in vitro protein-DNA interactions by multiple instance learning.

Zhen Gao, Jianhua Ruan.

Abstract

MOTIVATION: The study of transcriptional regulation is still difficult yet fundamental in molecular biology research. While the development of both in vivo and in vitro profiling techniques has significantly enhanced our knowledge of transcription factor (TF)-DNA interactions, computational models of TF-DNA interactions are relatively simple and may not reveal sufficient biological insight. In particular, supervised-learning-based models for TF-DNA interactions attempt to map sequence-level features (k-mers) to binding events but usually ignore the location of the k-mers, which can cause data fragmentation and consequently inferior model performance.
RESULTS: Here, we propose a novel algorithm based on the multiple-instance learning (MIL) paradigm. MIL breaks each DNA sequence into multiple overlapping subsequences and models each subsequence separately; it therefore implicitly takes binding site locations into consideration, resulting in both higher accuracy and better interpretability of the models. Results from both in vivo and in vitro TF-DNA interaction data show that our approach significantly outperforms conventional single-instance-learning-based algorithms. Importantly, the models learned from in vitro data using our approach can predict in vivo binding with very good accuracy. In addition, the location information obtained by our method provides additional insight for motif finding results from ChIP-Seq data. Finally, our approach can be easily combined with other state-of-the-art TF-DNA interaction modeling methods.
AVAILABILITY AND IMPLEMENTATION: http://www.cs.utsa.edu/~jruan/MIL/.
CONTACT: jianhua.ruan@utsa.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
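
The contrast drawn in the abstract between position-agnostic k-mer features and the multiple-instance view can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `window` and `step` parameters, the toy k-mer weights, and the use of a maximum over instances as the bag-level score (a common MIL assumption) are all hypothetical choices made here for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def kmer_counts(seq: str, k: int) -> Counter:
    """Count k-mers in a sequence, ignoring where they occur
    (the single-instance, position-agnostic view)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def make_bag(seq: str, window: int, step: int = 1) -> List[Tuple[int, str]]:
    """Split a sequence into overlapping windows.
    Each (start, subsequence) pair is one instance of the bag."""
    if len(seq) <= window:
        return [(0, seq)]
    return [(i, seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def score_instance(sub: str, weights: Dict[str, float], k: int) -> float:
    """Linear score of one subsequence under hypothetical k-mer weights."""
    counts = kmer_counts(sub, k)
    return sum(weights.get(kmer, 0.0) * n for kmer, n in counts.items())


def score_bag(seq: str, weights: Dict[str, float], k: int,
              window: int, step: int = 1) -> Tuple[float, int]:
    """Score every window separately and keep the best one.
    Assumption: the bag score is the maximum instance score; the
    start of that window serves as a putative binding-site location."""
    scored = [(score_instance(sub, weights, k), start)
              for start, sub in make_bag(seq, window, step)]
    return max(scored, key=lambda pair: pair[0])


# Toy example: made-up weights favouring the 4-mers of "TGACTCA".
weights = {"TGAC": 1.0, "GACT": 1.0, "ACTC": 1.0, "CTCA": 1.0}
seq = "AAAAATGACTCAAAAA"
best_score, best_start = score_bag(seq, weights, k=4, window=7)
print(best_score, best_start)  # 4.0 5
```

In this toy case the highest-scoring window starts at position 5, which is where the embedded site begins; a whole-sequence k-mer count would yield a score but no location.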

Year:  2017        PMID: 28334224      PMCID: PMC5870851          DOI: 10.1093/bioinformatics/btx115

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


