The CATH extended protein-family database: providing structural annotations for genome sequences.

Frances M G Pearl, David Lee, James E Bray, Daniel W A Buchan, Adrian J Shepherd, Christine A Orengo.

Abstract

An automatic sequence search and analysis protocol (DomainFinder) based on PSI-BLAST and IMPALA, and using conservative thresholds, has been developed for reliably integrating gene sequences from GenBank into their respective structural families within the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath_new). DomainFinder assigns a new gene sequence to a CATH homologous superfamily provided that PSI-BLAST identifies a clear relationship to at least one other Protein Data Bank sequence within that superfamily. This has resulted in an expansion of the CATH protein family database (CATH-PFDB v1.6) from 19,563 domain structures to 176,597 domain sequences. A further 50,000 putative homologous relationships can be identified using less stringent cut-offs and these relationships are maintained within neighbour tables in the CATH Oracle database, pending further evidence of their suggested evolutionary relationship. Analysis of the CATH-PFDB has shown that only 15% of the sequence families are close enough to a known structure for reliable homology modeling. IMPALA/PSI-BLAST profiles have been generated for each of the sequence families in the expanded CATH-PFDB and a web server has been provided so that new sequences may be scanned against the profile library and be assigned to a structure and homologous superfamily.
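The two-tier rule described in the abstract (a confident assignment under conservative thresholds, with weaker matches held back as putative neighbours) can be sketched roughly as follows. This is only an illustration and not the DomainFinder implementation: the E-value cut-offs, the `Hit` record and the `classify_hits` function are hypothetical, since the abstract gives no numeric thresholds or data structures, and the superfamily codes in the example are placeholders.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the abstract states no numeric thresholds.
STRICT_EVALUE = 1e-4   # assumed stand-in for the "conservative" threshold
RELAXED_EVALUE = 1e-1  # assumed stand-in for the "less stringent" cut-off


@dataclass(frozen=True)
class Hit:
    """One search hit from a gene sequence to a PDB-derived domain sequence."""
    query_id: str
    superfamily: str  # homologous superfamily of the matched structure
    evalue: float


def classify_hits(hits):
    """Split hits into confident assignments and putative neighbours.

    Returns (assigned, neighbours); each maps a query id to a set of
    superfamily codes. One hit under the strict cut-off is enough to
    assign the query to that superfamily.
    """
    assigned, neighbours = {}, {}
    for hit in hits:
        if hit.evalue <= STRICT_EVALUE:
            assigned.setdefault(hit.query_id, set()).add(hit.superfamily)
        elif hit.evalue <= RELAXED_EVALUE:
            neighbours.setdefault(hit.query_id, set()).add(hit.superfamily)
    # A superfamily that is already assigned is not also listed as a neighbour.
    for query_id, superfamilies in assigned.items():
        if query_id in neighbours:
            neighbours[query_id] -= superfamilies
            if not neighbours[query_id]:
                del neighbours[query_id]
    return assigned, neighbours


# Illustrative hits only; identifiers and scores are made up.
hits = [
    Hit("geneA", "3.40.50.300", 1e-20),  # clear relationship: assigned
    Hit("geneA", "3.40.50.300", 5e-2),   # weaker hit to the same superfamily
    Hit("geneB", "1.10.10.10", 3e-2),    # relaxed cut-off only: neighbour
    Hit("geneC", "2.60.40.10", 5.0),     # too weak: ignored
]
assigned, neighbours = classify_hits(hits)
```

Keeping the weaker matches in a separate mapping mirrors the neighbour tables mentioned in the abstract, where putative relationships are retained pending further evidence rather than merged into the families.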

Year:  2002        PMID: 11790833      PMCID: PMC2373435          DOI: 10.1110/ps.16802

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


