
Learning to translate sequence and structure to function: identifying DNA binding and membrane binding proteins.

Robert E Langlois, Matthew B Carson, Nitin Bhardwaj, Hui Lu.

Abstract

A protein's function depends in large part on its interactions with other molecules. As the number of available protein structures grows every year, a structural annotation approach that identifies such interactions becomes increasingly useful. At the same time, machine learning has gained popularity in bioinformatics, providing robust annotation of genes and proteins without relying on sequence homology. Here we have developed a general machine learning protocol to identify proteins that bind DNA and membrane. In general, there is no theory or even rule of thumb for picking the best machine learning algorithm. Thus, we systematically compare several classification algorithms known to perform well. The boosted tree classifier is found to give the best performance, achieving 93% and 88% accuracy in discriminating non-homologous proteins that bind membrane and DNA, respectively, significantly outperforming all previously published works. We also address the importance of the attributes in function prediction and the relationships between relevant attributes. A graphical model based on boosted trees is applied to study the important features in discriminating DNA-binding proteins. In summary, the current protocol identifies physical features important in DNA and membrane binding, rather than annotating function through sequence similarity.
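
To make the idea of a boosted classifier concrete, the following is a minimal sketch of AdaBoost over one-feature threshold rules (decision stumps, the simplest form of boosted tree). It is not the authors' protocol: the data are synthetic, the two feature columns (imagined here as a surface charge score and a hydrophobic fraction) are hypothetical stand-ins for physical descriptors, and the labeling rule and all function names are assumptions made only for illustration.

```python
import math
import random


def make_data(n, seed):
    """Synthetic stand-in data: two hypothetical features in [0, 1].
    Label +1 ("binds") when both features exceed 0.3 (an assumed toy rule)."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if (a > 0.3 and b > 0.3) else -1 for a, b in X]
    return X, y


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign


def adaboost(X, y, rounds=20):
    """Fit a weighted vote of stumps, reweighting misclassified examples."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        model.append((alpha, stump))
        # Increase the weight of examples this stump got wrong, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


def accuracy(model, X, y):
    return sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)


X_train, y_train = make_data(200, seed=0)
X_test, y_test = make_data(200, seed=1)   # held-out examples
model = adaboost(X_train, y_train, rounds=20)
print("train accuracy:", accuracy(model, X_train, y_train))
print("held-out accuracy:", accuracy(model, X_test, y_test))
```

Because each stump splits on a single feature, the stumps chosen and their vote weights give a rough view of which features drive the prediction, which is the kind of interpretability the abstract attributes to boosted trees. A real comparison of classifiers would also need non-homologous training and test sets, which this toy example does not model.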


Year:  2007        PMID: 17436108      PMCID: PMC2706547          DOI: 10.1007/s10439-007-9312-z

Source DB:  PubMed          Journal:  Ann Biomed Eng        ISSN: 0090-6964            Impact factor:   3.934


