Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions.

Abstract

Protein-DNA interactions are crucial to many cellular activities, such as expression control and DNA repair. These interactions between amino acids and nucleotides are highly specific, and any aberration at the binding site can render the interaction completely incompetent. In this study, we have three aims focusing on DNA-binding residues on the protein surface: to develop an automated approach for fast and reliable recognition of DNA-binding sites; to improve the prediction by distance-dependent refinement; and to use these predictions to identify DNA-binding proteins. We use a support vector machine (SVM)-based approach that exploits the features of DNA-binding residues to distinguish them from non-binding residues. Features used for this distinction include the residue's identity, charge, solvent accessibility, average potential, the secondary structure it is embedded in, neighboring residues, and location in a cationic patch. These features, collected from 50 proteins, are used to train the SVM. Testing is then performed on another set of 37 proteins, much larger than any testing set used in previous studies. The testing set has no more than 20% sequence identity, both among its own members and with the proteins in the training set, thus removing any undesired redundancy due to homology. This set also includes proteins from a DNA-binding structural class not present in the training set. With the above features, an accuracy of 66% with balanced sensitivity and specificity is achieved without relying on homology or evolutionary information. We then develop a post-processing scheme to improve the prediction using the relative location of the predicted residues. Balanced success is then achieved, with average sensitivity, specificity and accuracy of 71.3%, 69.3% and 70.5%, respectively. Average net prediction is also around 70%. Finally, we show that the number of predicted DNA-binding residues can be used to differentiate DNA-binding proteins from non-DNA-binding proteins with an accuracy of 78%. The results presented here demonstrate that machine learning can be applied to the automated identification of DNA-binding residues and that the success rate can be improved as more features are added. Such functional site prediction protocols can be useful in guiding subsequent work such as site-directed mutagenesis and macromolecular docking.
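
The performance measures quoted in the abstract can be illustrated with a minimal sketch. It assumes per-residue labels encoded as 1 (DNA-binding) and 0 (non-binding), and it assumes that "net prediction" means the mean of sensitivity and specificity, which the abstract does not define. The function name `residue_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ResidueMetrics:
    sensitivity: float     # fraction of true binding residues that were predicted
    specificity: float     # fraction of true non-binding residues that were rejected
    accuracy: float        # fraction of all residues labeled correctly
    net_prediction: float  # ASSUMPTION: mean of sensitivity and specificity


def residue_metrics(true_labels, predicted_labels):
    """Compare per-residue labels (1 = DNA-binding, 0 = non-binding)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binding and one non-binding residue")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    net_prediction = (sensitivity + specificity) / 2
    return ResidueMetrics(sensitivity, specificity, accuracy, net_prediction)


# Toy protein with 8 residues (hypothetical labels, for illustration only).
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = residue_metrics(true_labels, predicted_labels)
print(metrics)
```

Reporting sensitivity and specificity side by side matters here because binding residues are usually a minority of a protein's residues, so accuracy alone can look high for a predictor that labels almost everything as non-binding.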

Year: 2007 PMID: 17316627 PMCID: PMC1993824 DOI: 10.1016/j.febslet.2007.01.086

Source DB: PubMed Journal: FEBS Lett ISSN: 0014-5793 Impact factor: 4.124
