
A sequence family database built on ECOD structural domains.

Yuxing Liao, R Dustin Schaeffer, Jimin Pei, Nick V Grishin.

Abstract

Motivation: The ECOD database classifies protein domains by their evolutionary relationships, considering both remote and close homology. The family group in ECOD classifies domains that are closely related to each other based on sequence similarity. Because ECOD and sequence-based resources define domains from different perspectives, directly applying existing sequence domain databases, such as Pfam, to ECOD has several shortcomings.
Results: We created multiple sequence alignments and profiles from ECOD domains, using structural information to guide alignment building and boundary delineation. We validated alignment quality by scoring structure superpositions, demonstrating that the alignments are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27% and 16% of ECOD families, respectively, are new; these new families are dominated by small families, likely because of sampling bias in the PDB. Families whose boundaries are modified compared to their counterparts account for 35% (Pfam) and 48% (CDD).
Availability and implementation: The new families are now integrated into the ECOD website. The aggregate HMMER profile library and alignments are available for download from the ECOD website (http://prodata.swmed.edu/ecod).
Supplementary information: Supplementary data are available at Bioinformatics online.
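As an illustration of how a downloaded HMMER profile library might be used, the sketch below parses the per-domain table that `hmmscan` writes with its `--domtblout` option and reports which family matches which region of a query. This is a minimal sketch, not the authors' pipeline: the file names in the comment, the family names (`fam_A`, `fam_B`, `fam_C`), the sample rows, and the E-value cutoff are all hypothetical, chosen only to show the table layout.

```python
from typing import List, NamedTuple

# Assumed workflow (file names are hypothetical placeholders):
#   hmmpress ecod_library.hmm
#   hmmscan --domtblout hits.domtbl ecod_library.hmm query.fasta


class DomainHit(NamedTuple):
    family: str      # profile (target) name, column 1 of domtblout
    query: str       # query sequence name, column 4
    i_evalue: float  # independent E-value of this domain, column 13
    ali_from: int    # first aligned query residue (1-based), column 18
    ali_to: int      # last aligned query residue (inclusive), column 19


def parse_domtblout(text: str, max_evalue: float = 1e-3) -> List[DomainHit]:
    """Return domain hits at or below max_evalue, ordered along the query."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split()
        if len(cols) < 22:  # a domtblout row has 22 fields plus a description
            continue
        hit = DomainHit(cols[0], cols[3], float(cols[12]),
                        int(cols[17]), int(cols[18]))
        if hit.i_evalue <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h.ali_from)


# Hypothetical rows in domtblout layout, for demonstration only.
SAMPLE = """\
# target name accession tlen query name accession qlen E-value score bias # of c-Evalue i-Evalue score bias from to from to from to acc description
fam_B - 120 query1 - 300 1.2e-30 100.5 0.1 1 1 2.0e-33 1.5e-30 99.8 0.1 3 118 150 268 148 270 0.95 hypothetical family B
fam_A - 90 query1 - 300 4.0e-20 70.2 0.0 1 1 1.0e-23 5.0e-20 69.9 0.0 1 90 10 101 8 103 0.97 hypothetical family A
fam_C - 80 query1 - 300 0.5 5.0 0.0 1 1 0.1 0.6 4.8 0.0 10 40 200 230 198 232 0.60 weak hit
"""

if __name__ == "__main__":
    for h in parse_domtblout(SAMPLE):
        print(f"{h.query}\t{h.family}\t{h.ali_from}-{h.ali_to}\t{h.i_evalue:.1e}")
```

The cutoff of 1e-3 is an arbitrary illustrative choice; an appropriate threshold depends on the library and the intended use.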

Year:  2018        PMID: 29659718      PMCID: PMC6129306          DOI: 10.1093/bioinformatics/bty214

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


