Ivo Pedruzzi, Catherine Rivoire, Andrea H Auchincloss, Elisabeth Coudert, Guillaume Keller, Edouard de Castro, Delphine Baratin, Béatrice A Cuche, Lydie Bougueleret, Sylvain Poux, Nicole Redaschi, Ioannis Xenarios, Alan Bridge.
Abstract
HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.
Year: 2012 PMID: 23193261 PMCID: PMC3531088 DOI: 10.1093/nar/gks1157
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A sample HAMAP profile page. The page provides information such as a family name and description, taxonomic range of the hits, associated annotation rule(s), cross-references to InterPro and access to matching proteins in UniProtKB. Additionally, links on the page provide access to (a) the actual family classification profile, (b) the seed alignment that was used to generate the profile with highlighted features from the annotation rule, (c) an interactive, graphical view of the score distribution of matching proteins, including those that fall below the trusted cutoff, and (d) an expandable view of the taxonomic distribution of matching proteins in UniProtKB.
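The abstract and the Figure 1 caption describe two linked pieces: a family profile with a trusted cutoff, and an annotation rule that applies to family members. The following is a minimal illustrative sketch of how those pieces could fit together, not HAMAP's actual implementation. The class names, field names, accession, score values, and the "at or above the cutoff" comparison are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FamilyProfile:
    """Hypothetical stand-in for a family classification profile."""
    accession: str         # made-up identifier, not a real HAMAP accession
    trusted_cutoff: float  # assumed: scores at or above this count as members


@dataclass
class AnnotationRule:
    """Hypothetical stand-in for an annotation rule tied to one profile."""
    profile_accession: str
    annotations: Dict[str, str]


def annotate(score: float, profile: FamilyProfile, rule: AnnotationRule) -> Dict[str, str]:
    """Return the rule's annotations if the match score reaches the
    profile's trusted cutoff; otherwise return an empty dict."""
    if rule.profile_accession != profile.accession:
        raise ValueError("rule is not associated with this profile")
    if score >= profile.trusted_cutoff:
        # The sequence is treated as a family member: copy the annotations.
        return dict(rule.annotations)
    # Below the trusted cutoff: no annotation is propagated.
    return {}


# Illustrative values only.
profile = FamilyProfile(accession="EXAMPLE_0001", trusted_cutoff=20.0)
rule = AnnotationRule(
    profile_accession="EXAMPLE_0001",
    annotations={"protein_name": "Example family protein"},
)

member = annotate(25.0, profile, rule)
non_member = annotate(10.0, profile, rule)
```

In this sketch, a match that falls below the trusted cutoff receives no annotation. Real annotation rules may carry further conditions that are not modelled here.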