Majid Masso, Iosif I Vaisman.
Abstract
A computational mutagenesis methodology founded upon a structure-dependent, knowledge-based four-body statistical potential is used to generate feature vectors that characterize over 8500 individual amino acid substitutions occurring in seven proteins, each mutant having been experimentally characterized for its effect on activity relative to the native protein. The proteins are diverse with respect to host organism (viral, bacterial, human) and function (enzymatic, nucleic acid binding, signaling), the structures span all four major SCOP classifications, and the mutations occur at positions well distributed throughout the seven structures. A random forest classifier, trained to label mutant activity as either unaffected or affected relative to the native protein, yields 84% accuracy under tenfold cross-validation. An online server for obtaining predictions with the trained model, which also displays 84% accuracy on an independent test set of mutants, is freely available at http://proteins.gmu.edu/automute/AUTO-MUTE_Activity.html.
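The tenfold cross-validation estimate mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the authors' mutants or feature vectors), the four-feature layout is arbitrary, and the nearest-centroid classifier is only a stand-in so the example needs nothing beyond the Python standard library. It is not a random forest and not the authors' implementation. The helper names `k_fold_indices`, `fit_centroids`, `predict`, and `cross_val_accuracy` are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (NOT a random forest): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(X)


# Synthetic two-class data; labels mirror the abstract's two activity classes.
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = rng.choice(["unaffected", "affected"])
    shift = 0.0 if label == "unaffected" else 2.0
    X.append([rng.gauss(shift, 1.0) for _ in range(4)])
    y.append(label)

accuracy = cross_val_accuracy(X, y)
print(f"tenfold CV accuracy on synthetic data: {accuracy:.2f}")
```

Each mutant is held out exactly once, so the pooled accuracy is computed only from predictions on examples the model did not see during training. To approximate the setup described in the abstract, the stand-in classifier would be replaced by a random forest implementation.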
Year: 2011 PMID: 22255025 DOI: 10.1109/IEMBS.2011.6090876
Source DB: PubMed Journal: Conf Proc IEEE Eng Med Biol Soc ISSN: 1557-170X