Shandar Ahmad (1), M Michael Gromiha, Akinori Sarai. (1) Department of Biochemical Science and Engineering, Kyushu Institute of Technology, Fukuoka, Iizuka 820-8502, Japan. shandar@bse.kyutech.ac.jp
Abstract
MOTIVATION: Though vitally important to cell function, the mechanism of protein-DNA binding is not yet completely understood. We therefore analysed the relationship between DNA binding and protein sequence composition, solvent accessibility and secondary structure. Using non-redundant databases of transcription factors and protein-DNA complexes, we developed neural network models that use the information present in this relationship to predict DNA-binding proteins and their binding residues. RESULTS: Sequence composition alone was found to provide sufficient information to predict the probability of a protein binding to DNA with nearly 69% sensitivity at 64% accuracy for the considered proteins; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.
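As an illustration of the quantities named in the abstract, the following minimal Python sketch computes an amino acid composition vector for a sequence and the sensitivity and accuracy of a set of binary predictions. It is not the authors' implementation: the function names, the 20-letter alphabet ordering, and the use of the standard textbook definitions of sensitivity (TP / (TP + FN)) and accuracy ((TP + TN) / total) are assumptions made here, and the paper may define these measures differently.

```python
from collections import Counter

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Characters outside the 20-letter alphabet are ignored.
    """
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def sensitivity_and_accuracy(predicted, actual):
    """Return (sensitivity, accuracy) for two equal-length boolean lists.

    Assumes the standard definitions; these may differ from the paper's.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one example is required")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(actual)
    return sensitivity, accuracy
```

A composition vector of this kind could serve as the fixed-length input to a classifier such as a neural network, regardless of the length of the original sequence.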