Son P Nguyen, Yi Shang, Dong Xu.
Abstract
Computational protein structure prediction is important for many applications in bioinformatics. In the process of predicting protein structures, it is essential to accurately assess the quality of generated models. Although many single-model quality assessment (QA) methods have been developed, their accuracy is not high enough for most real applications. In this paper, a new approach based on the C-α atom distance matrix and machine learning methods is proposed for single-model QA and the identification of native-like models. Unlike existing energy/scoring functions and consensus approaches, this new approach is purely geometry based. Furthermore, a novel algorithm based on deep learning techniques, called DL-Pro, is proposed. For a protein model, DL-Pro uses its distance matrix, which contains the pairwise distances between the C-α atoms of the residues in the model and is sometimes also called a contact map, as an orientation-independent representation. From training examples of distance matrices corresponding to good and bad models, DL-Pro learns a stacked autoencoder network as a classifier. In experiments on selected targets from the Critical Assessment of Structure Prediction (CASP) competition, DL-Pro obtained promising results, outperforming state-of-the-art energy/scoring functions, including OPUS-CA, DOPE, DFIRE, and RW.
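The orientation-independent representation described in the abstract can be illustrated with a small sketch. This is not the DL-Pro implementation: the function names `ca_distance_matrix` and `random_orthogonal` are hypothetical, and the coordinates are random toy values rather than real protein data. It only shows that a matrix of pairwise C-α distances is unchanged when the whole model is rotated and translated.

```python
import numpy as np


def ca_distance_matrix(coords):
    """Return the (n, n) matrix of Euclidean distances between C-alpha atoms.

    coords: array-like of shape (n_residues, 3), one C-alpha position per residue.
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcast to all pairwise coordinate differences, shape (n, n, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def random_orthogonal(rng):
    """Return a random 3x3 rotation matrix (hypothetical helper for the demo)."""
    # The Q factor of a QR decomposition is orthogonal.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    # Flip one column if needed so the determinant is +1 (a proper rotation).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


rng = np.random.default_rng(0)

# Toy stand-in for the C-alpha coordinates of a 5-residue model (assumption).
coords = rng.normal(size=(5, 3)) * 10.0

# Apply an arbitrary rigid-body motion: rotate, then translate.
rotation = random_orthogonal(rng)
translation = rng.normal(size=3) * 50.0
moved = coords @ rotation.T + translation

D = ca_distance_matrix(coords)
D_moved = ca_distance_matrix(moved)

# The two matrices agree up to floating-point error.
same = np.allclose(D, D_moved)
print("distance matrix unchanged by rigid motion:", same)
```

Because the matrix depends only on inter-residue distances, two copies of the same model in different poses yield the same input, which is what makes such a matrix usable as a classifier input without first aligning the structures.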
Keywords: Critical Assessment of Structure Prediction (CASP); classification; deep learning; energy and scoring function; protein model quality assessment; stacked autoencoder
Year: 2014 PMID: 25392745 PMCID: PMC4226404 DOI: 10.1109/IJCNN.2014.6889891
Source DB: PubMed Journal: Proc Int Jt Conf Neural Netw ISSN: 2161-4407