Mathematical Modeling of Avidity Distribution and Estimating General Binding Properties of Transcription Factors from Genome-Wide Binding Profiles.

Vladimir A Kuznetsov.

Abstract

The experimental frequency distributions (EFDs) of diverse molecular interaction events that quantify genome-wide binding are often skewed, with a long tail of rare events. Such distributions deviate systematically from the standard power-law functions proposed by scale-free network models, suggesting that more explanatory and predictive probabilistic models are needed. The goal of this analytical survey is to identify mechanism-based, data-driven statistical distributions that allow the binding properties of transcription factors to be estimated and predicted from genome-wide binding profiles. Here, we review and develop an analytical framework for modeling, analysis, and prediction of transcription factor (TF) DNA binding properties detected at the genome scale. We introduce a mixture probabilistic model of the binding avidity function that includes nonspecific and specific binding events. A method for decomposing specific and nonspecific TF-DNA binding events is proposed. We show that the Kolmogorov-Waring (KW) probability function (PF), which models the steady state of the stochastic TF binding-dissociation process, fits the EFD well for diverse TF-DNA binding datasets. Furthermore, this distribution predicts the total number of TF-DNA binding sites (BSs) and provides estimates of specificity and sensitivity, as well as other basic statistical features of TF-DNA binding, when the experimental datasets are noise-rich and essentially incomplete. The KW distribution fits TF-DNA binding activity equally well for different TFs, including ERE, CREB, STAT1, Nanog, and Oct4. Our analysis reveals that the KW distribution and its generalized form provide a family of power-law-like distributions given in terms of hypergeometric series functions, including standard and generalized Pareto and Waring distributions, which offer flexible and common skewed forms of the transcription factor binding site (TFBS) avidity distribution function.
We suggest that the skewed binding events may be due to a wide range of evolutionary processes that create weak-avidity TFBSs through random mutations, while high-avidity binding sites (i.e., high-avidity, evolutionarily conserved canonical E-boxes) occur rarely. These, however, may be positively selected in microevolution.
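
The abstract does not give the functional form of the KW probability function, so the following is only a minimal sketch of the general idea: a skewed, power-law-like distribution arising as the steady state of a birth-death (binding-dissociation) process. It assumes linear rates, with binding rate proportional to `a + k` and dissociation rate proportional to `b + k` in state `k`, and a rate ratio `theta`. The function name, the parameter names, and the truncation at `k_max` are illustrative assumptions, not taken from the chapter.

```python
def birth_death_steady_state(a, b, theta, k_max):
    """Steady-state probabilities p_0..p_{k_max} of a birth-death chain.

    Assumed (hypothetical) rates in state k:
        birth_k = lam * (a + k),  death_k = mu * (b + k),  theta = lam / mu.
    Detailed balance p_k * birth_k = p_{k+1} * death_{k+1} gives
        p_{k+1} = p_k * theta * (a + k) / (b + k + 1).
    The support is truncated at k_max and renormalized.
    """
    weights = [1.0]  # unnormalized weight of state 0
    for k in range(k_max):
        weights.append(weights[-1] * theta * (a + k) / (b + k + 1))
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # With theta = 1 and b > a the tail decays like a power law, k**(a - b - 1).
    probs = birth_death_steady_state(a=1.0, b=4.0, theta=1.0, k_max=2000)
    for k in (0, 1, 2, 5, 10, 100):
        print(k, probs[k])
```

Under these assumptions, `theta = 1` with `b > a` yields a Waring-type heavy tail, while `theta < 1` adds a geometric cutoff; fitting such parameters to an observed frequency distribution would be a separate step not shown here.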

Keywords:  Avidity; Binding site; Birth–death process; ChIP-PET; ChIP-Seq; Hypergeometric function; Kemp distribution; Kolmogorov–Waring distribution; Mixture probability; Sample size; Scale dependence; Scale-free; Sensitivity; Skewed distribution; Specificity; Transcription factor

Year:  2017        PMID: 28849563     DOI: 10.1007/978-1-4939-7027-8_9

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


  2 in total

1.  Toward predictive R-loop computational biology: genome-scale prediction of R-loops reveals their association with complex promoter structures, G-quadruplexes and transcriptionally active enhancers.

Authors:  Vladimir A Kuznetsov; Vladyslav Bondarenko; Thidathip Wongsurawat; Surya P Yenamandra; Piroon Jenjaroenpun
Journal:  Nucleic Acids Res       Date:  2018-09-06       Impact factor: 16.971

2.  Effect of promoter, promoter mutation and enhancer on transgene expression mediated by episomal vectors in transfected HEK293, Chang liver and primary cells.

Authors:  Zhong-Jie Xu; Yan-Long Jia; Meng Wang; Dan-Dan Yi; Wei-Li Zhang; Xiao-Yin Wang; Jun-He Zhang
Journal:  Bioengineered       Date:  2019-12       Impact factor: 3.269
