Pfam 10 years on: 10,000 families and still growing.

Stephen John Sammut, Robert D Finn, Alex Bateman.

Abstract

Classifications of proteins into groups of related sequences are in some respects like a periodic table for biology, allowing us to understand the underlying molecular biology of any organism. Pfam is a large collection of protein domains and families. Its scientific goal is to provide a complete and accurate classification of protein families and domains. The next release of the database will contain over 10,000 entries, which leads us to reflect on how far we are from completing this work. Currently Pfam matches 72% of known protein sequences, but for proteins with known structure Pfam matches 95%, which we believe represents the likely upper bound. Based on our analysis a further 28,000 families would be required to achieve this level of coverage for the current sequence database. We also show that as more sequences are added to the sequence databases the fraction of sequences that Pfam matches is reduced, suggesting that continued addition of new families is essential to maintain its relevance.

Year:  2008        PMID: 18344544     DOI: 10.1093/bib/bbn010

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  54 in total

1.  The phage lambda major tail protein structure reveals a common evolution for long-tailed phages and the type VI bacterial secretion system.

Authors:  Lisa G Pell; Voula Kanelis; Logan W Donaldson; P Lynne Howell; Alan R Davidson
Journal:  Proc Natl Acad Sci U S A       Date:  2009-02-27       Impact factor: 11.205

Review 2.  FINDSITE: a combined evolution/structure-based approach to protein function prediction.

Authors:  Jeffrey Skolnick; Michal Brylinski
Journal:  Brief Bioinform       Date:  2009-03-26       Impact factor: 11.622

3.  Exploiting genomic patterns to discover new supramolecular protein assemblies.

Authors:  Morgan Beeby; Thomas A Bobik; Todd O Yeates
Journal:  Protein Sci       Date:  2009-01       Impact factor: 6.725

4.  Nature of the protein universe.

Authors:  Michael Levitt
Journal:  Proc Natl Acad Sci U S A       Date:  2009-06-18       Impact factor: 11.205

5.  Genome reduction by deletion of paralogs in the marine cyanobacterium Prochlorococcus.

Authors:  Haiwei Luo; Robert Friedman; Jijun Tang; Austin L Hughes
Journal:  Mol Biol Evol       Date:  2011-04-29       Impact factor: 16.240

Review 6.  The evolutionary origin of orphan genes.

Authors:  Diethard Tautz; Tomislav Domazet-Lošo
Journal:  Nat Rev Genet       Date:  2011-08-31       Impact factor: 53.242

7.  Identification of MDP (muramyl dipeptide)-binding key domains in NOD2 (nucleotide-binding and oligomerization domain-2) receptor of Labeo rohita.

Authors:  Jitendra Maharana; Banikalyan Swain; Bikash R Sahoo; Manas R Dikhit; Madhubanti Basu; Abhijit S Mahapatra; Pallipuram Jayasankar; Mrinal Samanta
Journal:  Fish Physiol Biochem       Date:  2012-12-20       Impact factor: 2.794

8.  Intrinsic structural disorder confers cellular viability on oncogenic fusion proteins.

Authors:  Hedi Hegyi; László Buday; Peter Tompa
Journal:  PLoS Comput Biol       Date:  2009-10-30       Impact factor: 4.475

9.  In silico analysis of transcription factor repertoire and prediction of stress responsive transcription factors in soybean.

Authors:  Keiichi Mochida; Takuhiro Yoshida; Tetsuya Sakurai; Kazuko Yamaguchi-Shinozaki; Kazuo Shinozaki; Lam-Son Phan Tran
Journal:  DNA Res       Date:  2009-11-02       Impact factor: 4.458

10.  Sequence analysis of GerM and SpoVS, uncharacterized bacterial 'sporulation' proteins with widespread phylogenetic distribution.

Authors:  Daniel J Rigden; Michael Y Galperin
Journal:  Bioinformatics       Date:  2008-06-17       Impact factor: 6.937
