Abstract
Success in protein tertiary-structure prediction is generally considered to be a function of the coverage and similarity/identity of a query sequence with suitable templates in the structural databases. However, this measure of the modelability of a protein sequence into its structure may be misleading. To address this limitation, we propose a 'structural difficulty (SD)' index, which is derived from the secondary structures, homology and physicochemical features of protein sequences. The SD index reflects the capability of predicting accurate structures and helps to assess the potential for developing proteome-level structural databases for various organisms with some of the best methodologies currently available. For instance, the plausibility of populating the structural database of the human proteome with structures of reliable quality (under 3 Å root mean square deviation from the corresponding natives) is found to be ∼37% of a total of 11 084 manually curated soluble proteins and ∼64% for all annotated and reviewed unique soluble proteins (344 661 sequences) of UniProtKB. Also, for 77 human pathogenic viruses comprising 2365 globular viral proteins, of which only 162 structures have been solved experimentally, the SD index scores 1336 proteins in the modelable zone. The availability of reliable protein structures may prove a crucial aid in developing species-wise structural proteomic databases for accelerating function annotation and for drug development endeavors.
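To put the quoted percentages on a common footing, the short sketch below converts them into approximate sequence counts and expresses the viral figures as fractions of the 2365 globular viral proteins. It uses only numbers stated in the abstract; the variable and function names are hypothetical, and because the abstract gives the percentages as approximate (∼), the derived counts are rough estimates rather than reported results.

```python
# Figures quoted in the abstract. The two fractions are approximate ("~").
human_curated_total = 11_084      # manually curated soluble proteins
human_curated_fraction = 0.37     # ~37% judged modelable under 3 Å RMSD
uniprot_total = 344_661           # annotated and reviewed unique soluble proteins
uniprot_fraction = 0.64           # ~64% judged modelable
viral_total = 2_365               # globular proteins from 77 human pathogenic viruses
viral_solved = 162                # structures solved experimentally
viral_modelable = 1_336           # proteins the SD index places in the modelable zone


def implied_count(total: int, fraction: float) -> int:
    """Approximate number of sequences implied by a quoted fraction (hypothetical helper)."""
    return round(total * fraction)


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


human_curated_modelable = implied_count(human_curated_total, human_curated_fraction)
uniprot_modelable = implied_count(uniprot_total, uniprot_fraction)
viral_solved_pct = percent(viral_solved, viral_total)
viral_modelable_pct = percent(viral_modelable, viral_total)

print(f"Curated human set, implied modelable count: {human_curated_modelable}")
print(f"UniProtKB reviewed set, implied modelable count: {uniprot_modelable}")
print(f"Viral proteins solved experimentally: {viral_solved_pct:.1f}%")
print(f"Viral proteins in the modelable zone: {viral_modelable_pct:.1f}%")
```

Read this way, the viral example contrasts the small share of proteins with experimentally solved structures against the much larger share that the SD index scores as modelable.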
Keywords: human proteome modelability; proteomes modelability; structural difficulty; structural modelability
Year: 2016 PMID: 27334454 DOI: 10.1093/protein/gzw025
Source DB: PubMed Journal: Protein Eng Des Sel ISSN: 1741-0126 Impact factor: 1.650