
Manual classification strategies in the ECOD database.

Hua Cheng, Yuxing Liao, R Dustin Schaeffer, Nick V Grishin.

Abstract

ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties in assessing homology from scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide an accurate and up-to-date view of the protein structure world as a result of combined computational and expert-driven analysis.
© 2015 Wiley Periodicals, Inc.

Keywords:  classification; database; domain; evolution; homology; protein; sequence; structure

Year: 2015; PMID: 25917548; PMCID: PMC4624060; DOI: 10.1002/prot.24818

Source DB: PubMed; Journal: Proteins; ISSN: 0887-3585


