Learning position weight matrices from sequence and expression data.

Xin Chen, Lingqiong Guo, Zhaocheng Fan, Tao Jiang.

Abstract

Position weight matrices (PWMs) are widely used to depict the DNA binding preferences of transcription factors (TFs) in computational molecular biology and regulatory genomics. Thus, learning an accurate PWM to characterize the binding sites of a specific TF is a fundamental problem that plays an important role in modeling regulatory motifs and discovering the binding targets of TFs. Given a set of binding sites bound by a TF, the learning problem can be formulated as a straightforward maximum likelihood problem, namely, finding a PWM such that the likelihood of the observed binding sites is maximized, and is usually solved by counting the base frequencies at each position of the aligned binding sequences. In this paper, we study the question of accurately learning a PWM from both binding site sequences and gene expression (or ChIP-chip) data. We revise the above maximum likelihood framework by taking into account the given gene expression or ChIP-chip data. More specifically, we attempt to find a PWM such that the likelihood of simultaneously observing both the binding sequences and the associated gene expression (or ChIP-chip) values is maximized, by using the sequence weighting scheme introduced in our recent work. We have incorporated this new approach for estimating PWMs into the popular motif finding program AlignACE. The modified program, called W-AlignACE, is compared with three other programs (AlignACE, MDscan, and MotifRegressor) on a variety of datasets, including simulated data, publicly available mRNA expression data, and ChIP-chip data. These large-scale tests demonstrate that W-AlignACE is an effective tool for discovering TF binding sites from gene expression or ChIP-chip data and, in particular, has the ability to find very weak motifs.
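The counting-based maximum likelihood estimate described above, and a weighted variant of it, can be sketched in a few lines of Python. This is only an illustration: the abstract does not spell out the sequence weighting scheme, so the sketch assumes that each aligned site carries a non-negative weight (for example, one derived from its expression or ChIP-chip value) that replaces its unit count. The function name `estimate_pwm`, the `weights` and `pseudocount` parameters, and the toy sites are hypothetical and are not taken from W-AlignACE.

```python
BASES = "ACGT"


def estimate_pwm(sites, weights=None, pseudocount=0.0):
    """Estimate a PWM from aligned sites as per-position base frequencies.

    sites: equal-length strings over A, C, G, T (uppercase assumed).
    weights: optional non-negative weight per site (an assumption made for
        this sketch; with weights=None every site counts once, which is the
        plain counting estimate).
    Returns a list with one dict per position mapping base -> probability.
    """
    if not sites:
        raise ValueError("need at least one binding site")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("sites must be aligned to the same width")
    if weights is None:
        weights = [1.0] * len(sites)
    if len(weights) != len(sites):
        raise ValueError("need exactly one weight per site")

    pwm = []
    for j in range(width):
        # Accumulate (weighted) counts of each base in column j.
        counts = {base: float(pseudocount) for base in BASES}
        for site, weight in zip(sites, weights):
            counts[site[j]] += weight
        total = sum(counts.values())
        if total <= 0:
            raise ValueError("weights and pseudocount sum to zero")
        # Normalise the column so that it sums to 1.
        pwm.append({base: counts[base] / total for base in BASES})
    return pwm


# Toy aligned sites (made up for illustration).
sites = ["ACGT", "ACGA", "TCGT", "ACGT"]
unweighted = estimate_pwm(sites)
# Hypothetical weights: the third site counts double, the fourth is ignored.
weighted = estimate_pwm(sites, weights=[1.0, 1.0, 2.0, 0.0])
```

With unit weights this reduces to the usual frequency count. With non-uniform weights, sites with larger weights pull each column toward their own bases, which is the general idea behind weighting sequences by expression or ChIP-chip evidence.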

Year: 2007    PMID: 17951829

Source DB: PubMed    Journal: Comput Syst Bioinformatics Conf    ISSN: 1752-7791


