Gayatri Kumar, Richa Mudgal, Narayanaswamy Srinivasan, Sankaran Sandhya.
Abstract
BACKGROUND: Knowledge of protein structure is a prerequisite for an improved understanding of molecular function. The gap between sequence space and structure space has widened in the post-genomic era. Grouping related protein sequences into families can help narrow this gap. In the Pfam database, a structural description is available for partial or full-length proteins of 7726 families. For the remaining 52% of families, no information on 3-D structure is yet available. We use computationally designed sequences that are intermediately related to two protein domain families already known to share the same fold. These strategically designed sequences enable the detection of distant relationships, and here we employ them for structure recognition of protein families whose structure is as yet unknown.
Keywords: Fold-assignment; Function annotation; Homology detection; Sequence-structure gap; Structural domain assignment; Structure recognition
Year: 2018 PMID: 29776380 PMCID: PMC5960202 DOI: 10.1186/s13062-018-0209-6
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Fig. 1 Schematic outline of the workflow: Protocol adopted for structure recognition of families of unknown structure. A consensus was drawn from the structural mapping for the sequence families provided by Xu and Dunbrack [34] and PDB to Pfam mapping available in Pfam [30]
Fig. 2 Evaluation of the approach: The precision, sensitivity and specificity of results from the assessment dataset are represented and the median value for the respective distributions is indicated above each boxplot, in percentage. The inset figure shows the histogram representing the frequency of each interval for the assessment dataset. Query coverage cutoffs of greater than 60% and E-value thresholds of better than 10⁻⁴ were used
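The cutoffs and measures named in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes the textbook definitions of precision, sensitivity and specificity; this record does not state how true and false positives were counted, and the `Hit` fields and function names below are hypothetical, not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One search hit linking a query family to a structural fold (hypothetical record)."""
    query_family: str
    fold: str
    evalue: float
    query_coverage: float  # percentage of the query aligned, 0-100


def passes_cutoffs(hit: Hit, max_evalue: float = 1e-4, min_coverage: float = 60.0) -> bool:
    """Keep hits with an E-value better than 1e-4 and query coverage above 60%."""
    return hit.evalue < max_evalue and hit.query_coverage > min_coverage


def evaluation_metrics(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, sensitivity, specificity) from confusion-matrix counts.

    A ratio with an empty denominator is reported as 0.0 (an assumption of this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return precision, sensitivity, specificity
```

Both comparisons are strict here, so a hit at exactly 60% coverage or an E-value of exactly 10⁻⁴ is rejected; whether the original analysis treated the boundaries this way is not stated.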
Fig. 3 The frequency of structural fold associations for sequence families as a function of the coverage achieved in the searches: (a) families with structure and fold information available; (b) families with no prior structure information that were associated with a structural fold by our approach