Defining and predicting structurally conserved regions in protein superfamilies.

Ivan K Huang, Jimin Pei, Nick V Grishin.

Abstract

MOTIVATION: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment.
RESULTS: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions.
AVAILABILITY: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR.
CONTACT: 91huangi@gmail.com or grishin@chop.swmed.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Online.
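
NOTE: The two figures of merit quoted in the RESULTS, accuracy and Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The following is a minimal Python sketch of those standard formulas only; the function names and the toy per-residue labels are illustrative assumptions and are not taken from the paper or its server.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count (TP, TN, FP, FN) for binary labels: 1 = SCR, 0 = non-SCR."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of positions labelled correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def matthews_cc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical labels for ten residues (not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", matthews_cc(*counts))
```

Unlike accuracy, MCC stays informative when SCR and non-SCR positions are unevenly represented, which is one reason the two values are commonly reported together.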

Substances: Proteins

Year: 2012 PMID: 23193223 PMCID: PMC3546793 DOI: 10.1093/bioinformatics/bts682

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
