Discovering approximate-associated sequence patterns for protein-DNA interactions.

Tak-Ming Chan, Ka-Chun Wong, Kin-Hong Lee, Man-Hon Wong, Chi-Kong Lau, Stephen Kwok-Wing Tsui, Kwong-Sak Leung.

Abstract

MOTIVATION: The binding between transcription factors (TFs) and transcription factor binding sites (TFBSs) is a fundamental protein-DNA interaction in transcriptional regulation. Extensive efforts have been made to better understand protein-DNA interactions. Recent mining of exact TF-TFBS-associated sequence patterns (rules) has shown great potential and achieved very promising results. However, exact rules cannot handle the variations in real data, which limits the number of informative rules. In this article, we generalize the exact rules to approximate ones for both TFs and TFBSs, which is essential for capturing biological variations.
RESULTS: A progressive approach is proposed to handle the approximation while alleviating the computational requirements. First, similar TFBSs are grouped from the available TF-TFBS data (TRANSFAC database). Second, approximate and highly conserved binding cores are discovered from the TF sequences corresponding to each TFBS group, using a customized algorithm developed for this specific objective. The approximate TF-TFBS rules are then discovered by associating the grouped TFBS consensuses with the TF cores. The discovered rules are evaluated by matching them against (verifying them with) the actual protein-DNA binding pairs from Protein Data Bank (PDB) 3D structures. The approximate results exhibit many more verified rules and up to 300% better verification ratios than the exact ones. The customized algorithm achieves over 73% better verification ratios than traditional methods. Approximate rules (64-79%) are shown to be statistically significant. Detailed variation analysis and conservation verification on NCBI records demonstrate that the approximate rules accurately reveal both the flexible and the specific protein-DNA interactions. The approximate TF-TFBS rules discovered show a strong capability to generalize and to uncover more informative binding rules.
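
The progressive idea described in RESULTS (group similar TFBSs, take a consensus per group, then look for an approximately conserved core in the associated TF sequences) can be illustrated with a minimal sketch. This is not the authors' customized algorithm, which the abstract does not specify. The greedy grouping by Hamming distance to a group's first member, the `max_mismatches` threshold, the restriction to equal-length sites, and all function names are assumptions made purely for illustration.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def group_similar(sites, max_mismatches=1):
    """Greedy grouping (an assumed criterion): a site joins the first group
    whose first member is within max_mismatches substitutions of it."""
    groups = []
    for site in sites:
        for group in groups:
            if hamming(site, group[0]) <= max_mismatches:
                group.append(site)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([site])
    return groups


def consensus(group):
    """Most common letter at each position of an equal-length group."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*group))


def approx_occurs(core, protein, max_mismatches=1):
    """True if core matches some window of protein with at most
    max_mismatches substitutions (no insertions or deletions)."""
    k = len(core)
    return any(
        hamming(core, protein[i:i + k]) <= max_mismatches
        for i in range(len(protein) - k + 1)
    )
```

With hypothetical toy input such as `["TGACGT", "TGACGC", "CACGTG"]`, `group_similar` places the first two sites in one group and the third in its own, and `consensus` summarizes each group as a single string. Pairing such a consensus with a core that `approx_occurs` finds in the group's TF sequences gives the general shape of an approximate rule. Real TFBS data would also need to handle sites of differing lengths, which this sketch deliberately ignores.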

Year: 2010    PMID: 21193520    DOI: 10.1093/bioinformatics/btq682

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  5 in total

1.  Characterization and prediction of the binding site in DNA-binding proteins: improvement of accuracy by combining residue composition, evolutionary conservation and structural parameters.

Authors:  Sucharita Dey; Arumay Pal; Mainak Guharoy; Shrihari Sonavane; Pinak Chakrabarti
Journal:  Nucleic Acids Res       Date:  2012-05-27       Impact factor: 16.971

2.  Imbalanced target prediction with pattern discovery on clinical data repositories.

Authors:  Tak-Ming Chan; Yuxi Li; Choo-Chiap Chiau; Jane Zhu; Jie Jiang; Yong Huo
Journal:  BMC Med Inform Decis Mak       Date:  2017-04-20       Impact factor: 2.796

3.  Subtypes of associated protein-DNA (Transcription Factor-Transcription Factor Binding Site) patterns.

Authors:  Tak-Ming Chan; Kwong-Sak Leung; Kin-Hong Lee; Man-Hon Wong; Terrence Chi-Kong Lau; Stephen Kwok-Wing Tsui
Journal:  Nucleic Acids Res       Date:  2012-08-16       Impact factor: 16.971

4.  Computational learning on specificity-determining residue-nucleotide interactions.

Authors:  Ka-Chun Wong; Yue Li; Chengbin Peng; Alan M Moses; Zhaolei Zhang
Journal:  Nucleic Acids Res       Date:  2015-11-02       Impact factor: 16.971

5.  DNA motif elucidation using belief propagation.

Authors:  Ka-Chun Wong; Tak-Ming Chan; Chengbin Peng; Yue Li; Zhaolei Zhang
Journal:  Nucleic Acids Res       Date:  2013-06-29       Impact factor: 16.971
