| Literature DB >> 17177203 |
Abstract
An integrated system of neural networks, called SPINE, is established and optimized for predicting structural properties of proteins. SPINE is applied to three-state secondary-structure and residue-solvent-accessibility (RSA) prediction in this paper. The integrated neural networks are carefully trained with a large dataset of 2640 chains, sequence profiles generated from multiple sequence alignment, representative amino acid properties, a slow learning rate, overfitting protection, and an optimized sliding-widow size. More than 200,000 weights in SPINE are optimized by maximizing the accuracy measured by Q(3) (the percentage of correctly classified residues). SPINE yields a 10-fold cross-validated accuracy of 79.5% (80.0% for chains of length between 50 and 300) in secondary-structure prediction after one-month (CPU time) training on 22 processors. An accuracy of 87.5% is achieved for exposed residues (RSA >95%). The latter approaches the theoretical upper limit of 88-90% accuracy in assigning secondary structures. An accuracy of 73% for three-state solvent-accessibility prediction (25%/75% cutoff) and 79.3% for two-state prediction (25% cutoff) is also obtained. (c) 2006 Wiley-Liss, Inc.Mesh:
Substances:
Year: 2007 PMID: 17177203 DOI: 10.1002/prot.21298
Source DB: PubMed Journal: Proteins ISSN: 0887-3585