Robert D Finn, Jaina Mistry, Benjamin Schuster-Böckler, Sam Griffiths-Jones, Volker Hollich, Timo Lassmann, Simon Moxon, Mhairi Marshall, Ajay Khanna, Richard Durbin, Sean R Eddy, Erik L L Sonnhammer, Alex Bateman.
Abstract
Pfam is a database of protein families that currently contains 7973 entries (release 18.0). A recent development in Pfam has enabled the grouping of related families into clans. Pfam clans are described in detail, together with the new associated web pages. Improvements to the range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the database and associated tools are also presented. Pfam is available on the web in the UK (http://www.sanger.ac.uk/Software/Pfam/), the USA (http://pfam.wustl.edu/), France (http://pfam.jouy.inra.fr/) and Sweden (http://pfam.cgb.ki.se/).
Year: 2006 PMID: 16381856 PMCID: PMC1347511 DOI: 10.1093/nar/gkj149
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Increase in coverage of 6 representative proteomes over the past 9 years of Pfam
| Measure | Release (date) | No. models | E. coli K12 | R. prowazekii | M. jannaschii | S. cerevisiae | C. elegans | H. sapiens |
|---|---|---|---|---|---|---|---|---|
| Protein coverage (%) | 18.0 (07/2005) | 7973 | 84 | 81 | 72 | 64 | 60 | 64 |
| Protein coverage (%) | 10.0 (07/2003) | 6190 | 76 | 77 | 68 | 60 | 57 | 60 |
| Protein coverage (%) | 5.5 (09/2000) | 2478 | 55 | 62 | 52 | 47 | 47 | 52 |
| Residue coverage (%) | 18.0 (07/2005) | 7973 | 65 | 61 | 55 | 36 | 35 | 37 |
| Residue coverage (%) | 10.0 (07/2003) | 6190 | 61 | 58 | 53 | 35 | 34 | 35 |
| Residue coverage (%) | 5.5 (09/2000) | 2478 | 42 | 44 | 40 | 25 | 26 | 29 |
The models from releases 5.5, 10.0 and 18.0 were searched against each proteome, downloaded from Integr8 (16). Every protein domain satisfying the curated Pfam gathering threshold cut-off was scored as a hit. Two coverage measures are reported: protein coverage and residue coverage.
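The two measures can be illustrated with a minimal sketch. It assumes that protein coverage is the percentage of proteins with at least one hit and that residue coverage is the percentage of residues falling inside at least one hit; these definitions are assumptions, since the text above does not spell them out. The `Hit` record, its field names and the `coverage` function are hypothetical and do not correspond to any Pfam tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One domain match on a protein (hypothetical record layout)."""
    protein_id: str
    start: int                  # 1-based, inclusive
    end: int                    # 1-based, inclusive
    bit_score: float
    gathering_threshold: float  # curated per-family cut-off


def coverage(protein_lengths, hits):
    """Return (protein coverage %, residue coverage %) for one proteome.

    Assumed definitions: a protein is covered if it has at least one hit
    at or above the gathering threshold; a residue is covered if it lies
    inside at least one such hit. Overlapping hits are counted once.
    """
    covered = {pid: set() for pid in protein_lengths}
    for hit in hits:
        if hit.protein_id not in covered:
            continue  # hit on a protein outside this proteome
        if hit.bit_score >= hit.gathering_threshold:
            covered[hit.protein_id].update(range(hit.start, hit.end + 1))

    total_proteins = len(protein_lengths)
    total_residues = sum(protein_lengths.values())
    proteins_hit = sum(1 for residues in covered.values() if residues)
    residues_hit = sum(len(residues) for residues in covered.values())
    return (100.0 * proteins_hit / total_proteins,
            100.0 * residues_hit / total_residues)


# Toy proteome: three proteins, two overlapping hits on P1, one failed hit on P2.
lengths = {"P1": 100, "P2": 50, "P3": 50}
toy_hits = [
    Hit("P1", 1, 40, 30.0, 25.0),
    Hit("P1", 31, 60, 28.0, 25.0),
    Hit("P2", 1, 50, 10.0, 20.0),  # below threshold, not scored as a hit
]
protein_cov, residue_cov = coverage(lengths, toy_hits)
```

Under these assumed definitions residue coverage can never exceed protein coverage for the same proteome, which is consistent with the figures in the table.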
Figure 1. Clan pages in Pfam. (A) A screen shot of a clan summary page, containing the description, annotation and membership of the clan. From this page, the user can view the family relationship diagram (B). Each family in the clan is represented by a blue box, and its relationship to other families is represented by solid lines (significant profile–profile comparison score) or dashed lines (non-significant profile–profile comparison score). The profile–profile comparison E-value is shown beside each line. This score is also linked to a visualization of the profile–profile comparison alignment (C). The clan summary page also provides a link to the clan alignment (D) (for more details see text). The clan alignment is a multiple sequence alignment of the seed alignments of all clan members (each set of seed sequences is separated by alternating background shading). The alignments are coloured using Jalview.
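The line styles in the relationship diagram (B) can be sketched as a simple classification of pairwise E-values. This is only an illustration: the significance cut-off of 1e-3, the function name `clan_edges` and the family names are all hypothetical, and the caption does not state which threshold Pfam applies.

```python
def clan_edges(pair_evalues, cutoff=1e-3):
    """Turn pairwise profile-profile E-values into labelled diagram edges.

    pair_evalues maps (family_a, family_b) to an E-value. An edge is drawn
    solid when the E-value is at or below the cutoff (treated here as
    significant) and dashed otherwise. The cutoff is an assumed value.
    """
    edges = []
    for (fam_a, fam_b), evalue in sorted(pair_evalues.items()):
        style = "solid" if evalue <= cutoff else "dashed"
        label = f"{evalue:.1e}"  # text shown beside the line
        edges.append((fam_a, fam_b, style, label))
    return edges


# Hypothetical clan with three member families.
example = {
    ("FamA", "FamB"): 1e-10,
    ("FamA", "FamC"): 0.5,
}
edges = clan_edges(example)
```

Each returned tuple carries the two family names, the line style and the E-value label, which is the information the caption describes for every line in the diagram.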
Summary of the new website features and web services, including server location
| Feature | Mirror site | Specific URL |
|---|---|---|
| Clan summaries | UK, Sweden | Follow links from: |
| Clan alignments/relationship diagrams | UK | Example URLs: |
| Coloured alignments | Sweden | Example: |
| Domain images/XML upload | UK | |
| HMM logo | UK | |
| Domain query tools | Sweden | |
| Core web services | UK | |
| Web service Perl client | UK | |
| DQL web service | Sweden | |
| PfamAlyzer | Sweden | |
Figure 2. (A) Graphical representation of domains on the sequence ADA19_HUMAN. The sequence is represented as a grey bar. As of release 18.0, Pfam identifies four domains: Pep_M12B_propep (PF01562, coloured green), Reprolysin (PF01421, red), Disintegrin (PF00200, yellow) and EGF_2 (PF07974, magenta). The black domain is the ACR domain from SMART (15). The striped boxes represent PfamB families, while the small blue and red boxes represent low-complexity and transmembrane regions, respectively. Above the domain images, the dashed lines represent disulphide bridges found within the sequence. The red diamond below the Reprolysin domain indicates an active site position. (B) The seed alignment of SH2 (PF00017) marked up according to the Belvu colouring system, using the new multiple sequence alignment viewer on the Swedish site.