
The number of protein folds and their distribution over families in nature.

Xinsheng Liu, Ke Fan, Wei Wang.

Abstract

Currently, of the 10^6 known protein sequences, only about 10^4 structures have been solved. Based on homologies and similarities, proteins are grouped into different families in which each has a structural prototype, namely, the fold, and some share the same folds. However, the total number of folds and families, and furthermore, the distribution of folds over families in nature, are still an enigma. Here, we report a study on the distribution of folds over families and the total number of folds in nature, using a maximum probability principle and the moment method of estimation. A quadratic relation between the numbers of families and folds is found for the number of families in an interval from 6000 to 30,000. For example, about 2700 folds for 23,100 families are obtained, among them about 33 superfolds, including more than 100 families each, and the largest superfold comprises about 800 families. Our results suggest that although the majority of folds have only a single family per fold, a considerably larger number of folds include many more families each than in the database, and the distribution of folds over families in nature differs markedly from the sampled distribution. The long tail of fold distribution is first estimated in this article. The results fit the data for different versions of the structural classification of proteins (SCOP) excellently, and the goodness-of-fit tests strongly support the results. In addition, the method of directly "enlarging" the sample to the population may be useful in inferring distributions of species in different fields. Copyright 2003 Wiley-Liss, Inc.

Year:  2004        PMID: 14747997     DOI: 10.1002/prot.10514

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585



