A structural census of the current population of protein sequences.

M Gerstein, M Levitt.

Abstract

We examine the occurrence of the approximately 300 known protein folds in different groups of organisms. To do this, we characterize a large fraction of the currently known protein sequences (approximately 140,000) in structural terms, by matching them to known structures via sequence comparison (or by secondary-structure class prediction for those without structural homologues). Overall, we find that an appreciable fraction of the known folds are present in each of the major groups of organisms (e.g., bacteria and eukaryotes share 156 of 275 folds), and most of the common folds are associated with many families of nonhomologous sequences (i.e., >10 sequence families for each common fold). However, different groups of organisms have characteristically distinct distributions of folds. So, for instance, some of the most common folds in vertebrates, such as globins or zinc fingers, are rare or absent in bacteria. Many of these differences in fold usage are biologically reasonable, such as the folds of metabolic enzymes being common in bacteria and those associated with extracellular transport and communication being common in animals. They also have important implications for database-based methods for fold recognition, suggesting that an unknown sequence from a plant is more likely to have a certain fold (e.g., a TIM barrel) than an unknown sequence from an animal.
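
The kind of tally the abstract describes (how often each fold occurs in a group of organisms, and which folds two groups share) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `assignments` list is hypothetical toy data, the function names `fold_census` and `shared_folds` are invented here, and taking the union of folds seen in the two groups as the denominator is an assumption, not necessarily how the paper defines its "156 of 275" figure.

```python
from collections import Counter

# Hypothetical toy assignments of (organism group, fold); not data from the paper.
assignments = [
    ("bacteria", "TIM barrel"),
    ("bacteria", "TIM barrel"),
    ("bacteria", "Rossmann fold"),
    ("eukaryotes", "TIM barrel"),
    ("eukaryotes", "globin"),
    ("eukaryotes", "zinc finger"),
    ("eukaryotes", "zinc finger"),
]


def fold_census(pairs):
    """Count how often each fold occurs within each organism group."""
    census = {}
    for group, fold in pairs:
        census.setdefault(group, Counter())[fold] += 1
    return census


def shared_folds(census, group_a, group_b):
    """Return the folds observed at least once in both groups."""
    return set(census[group_a]) & set(census[group_b])


census = fold_census(assignments)
shared = shared_folds(census, "bacteria", "eukaryotes")
# Assumed denominator: every fold seen in either of the two groups.
all_folds = set(census["bacteria"]) | set(census["eukaryotes"])

print(f"shared {len(shared)} of {len(all_folds)} folds: {sorted(shared)}")
print("most common fold in eukaryotes:", census["eukaryotes"].most_common(1))
```

In a real census each pair would come from matching a database sequence to a known structure, as the abstract describes; the per-group `Counter` objects then give the fold distributions that differ between groups.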

Year:  1997        PMID: 9342336      PMCID: PMC23653          DOI: 10.1073/pnas.94.22.11911

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205

