
On the Accuracy of Sequence-Based Computational Inference of Protein Residues Involved in Interactions with DNA.

Zhenkun Gou, Igor B Kuznetsov.

Abstract

Methods for computational inference of DNA-binding residues in DNA-binding proteins are usually developed using classification techniques trained to distinguish between binding and non-binding residues on the basis of known examples observed in experimentally determined high-resolution structures of protein-DNA complexes. What degree of accuracy can be expected when a computational method is applied to a particular novel protein remains largely unknown. We test the utility of classification methods on the example of Kernel Logistic Regression (KLR) predictors of DNA-binding residues. We show that predictors that utilize sequence properties of proteins can successfully predict DNA-binding residues in proteins from a novel structural class. We use Multiple Linear Regression (MLR) to establish a quantitative relationship between protein properties and the expected accuracy of KLR predictors. The present results indicate that, in the case of novel proteins, the expected accuracy provided by an MLR model is close to the actual accuracy and can be used to assess the overall confidence of the prediction.
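
As a rough illustration of the MLR step described in the abstract, the sketch below fits an ordinary least-squares model that maps per-protein properties to an expected predictor accuracy. It is not the authors' implementation: the two features, the helper names (`fit_mlr`, `predict_mlr`, `solve_linear_system`), and all numbers are hypothetical, and the toy data are synthetic, generated from a known linear rule purely so the fit can be checked.

```python
# Illustrative sketch only: ordinary least-squares multiple linear regression
# relating per-protein properties to an expected predictor accuracy.
# Feature meanings and every number below are assumptions, not data from the paper.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Return [intercept, coef_1, ..., coef_k] minimizing squared error."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coefs, feature_row):
    """Expected accuracy for one protein given its property values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


# Synthetic training set: two hypothetical protein properties per protein and
# an "observed accuracy" generated from a known linear rule.
toy_features = [(0.10, 0.30), (0.20, 0.10), (0.30, 0.40), (0.40, 0.20), (0.50, 0.50)]
toy_accuracy = [0.5 + 0.4 * a - 0.2 * b for a, b in toy_features]

coefs = fit_mlr(toy_features, toy_accuracy)
expected = predict_mlr(coefs, (0.25, 0.25))  # estimate for an unseen protein
```

In this setup the fitted model would be applied to the properties of a novel protein to obtain a confidence estimate before trusting the residue-level predictions. Solving the normal equations directly is adequate for a handful of features; a QR or SVD based solver would be the more robust choice for larger or nearly collinear feature sets.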


Year:  2008        PMID: 20209034      PMCID: PMC2832327          DOI: 10.3923/tasr.2008.285.291

Source DB:  PubMed          Journal:  Trends Appl Sci Res        ISSN: 1819-3579


