Carol J Bult, Harold J Drabkin, Alexei Evsikov, Darren Natale, Cecilia Arighi, Natalia Roberts, Alan Ruttenberg, Peter D'Eustachio, Barry Smith, Judith A Blake, Cathy Wu.
Abstract
BACKGROUND: Representing species-specific proteins and protein complexes in ontologies that are both human- and machine-readable facilitates the retrieval, analysis, and interpretation of genome-scale data sets. Although existing protein-centric informatics resources provide the biomedical research community with well-curated compendia of protein sequence and structure, these resources lack formal ontological representations of the relationships among the proteins themselves. The Protein Ontology (PRO) Consortium is filling this informatics resource gap by developing ontological representations and relationships among proteins and their variants and modified forms. Because proteins are often functional only as members of stable protein complexes, the PRO Consortium, in collaboration with existing protein and pathway databases, has launched a new initiative to implement logical and consistent representation of protein complexes. DESCRIPTION: We describe here how the PRO Consortium is meeting the challenge of representing species-specific protein complexes, how protein complex representation in PRO supports annotation of protein complexes and comparative biology, and how PRO is being integrated into existing community bioinformatics resources. The PRO resource is accessible at http://pir.georgetown.edu/pro/.
Year: 2011 PMID: 21929785 PMCID: PMC3189193 DOI: 10.1186/1471-2105-12-371
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematic showing the relationships between UniProtKB, ProForm, GO, and ProComp. Arrows between ProForm and UniProtKB are xref, those between ProComp and ProForm are has_part, and between ProComp and Gene Ontology the arrows depict is_a relationships. In ProForm each protein form is assigned a unique identifier and is cross-referenced to protein entries in UniProtKB. Protein isoforms and modified forms are described in UniProtKB records, but in contrast to PRO, each protein form is not represented as a separate, uniquely accessioned entity in UniProtKB. For example, for the alpha subunit of IDH in mouse there is a PRO entry for the protein (PR:000025358) and for each of the alpha protein isoforms (PR:000025355 and PR:000025356). In UniProt the IDH alpha subunit and its isoforms are all represented in the same record (UniProt: Q9D6R2). In ProComp, accessioned, species-specific protein complex entities are described using protein entries from ProForm. The protein complexes in ProComp are cross-referenced to species-independent complex representations in the Gene Ontology (GO).
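The relationships in Figure 1 can be pictured as a small data model. The following Python sketch is illustrative only and is not the PRO Consortium's implementation: the class names, the helper functions, and the complex and GO identifiers (`PR:EXAMPLE_COMPLEX`, `GO:EXAMPLE`) are hypothetical placeholders, and listing only the alpha subunit as a part is a simplification. Only the three PR identifiers and the UniProt accession come from the caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProteinForm:
    """A ProForm entry: one uniquely accessioned protein form."""
    pro_id: str
    label: str
    uniprot_xref: str  # xref arrow: ProForm -> UniProtKB


@dataclass
class ProteinComplex:
    """A ProComp entry: a species-specific complex built from ProForm entries."""
    pro_id: str
    label: str
    go_parent: str  # is_a arrow: ProComp -> GO (placeholder id below)
    has_part: List[str] = field(default_factory=list)  # has_part arrows


# Identifiers taken from the Figure 1 caption (mouse IDH alpha subunit).
FORMS: Dict[str, ProteinForm] = {
    f.pro_id: f
    for f in [
        ProteinForm("PR:000025358", "IDH alpha subunit (mouse)", "Q9D6R2"),
        ProteinForm("PR:000025355", "IDH alpha isoform", "Q9D6R2"),
        ProteinForm("PR:000025356", "IDH alpha isoform", "Q9D6R2"),
    ]
}

# Hypothetical complex entry; both identifiers are placeholders, not real ids.
COMPLEX = ProteinComplex(
    pro_id="PR:EXAMPLE_COMPLEX",
    label="example mouse IDH complex",
    go_parent="GO:EXAMPLE",
    has_part=["PR:000025358"],
)


def forms_for_uniprot(forms: Dict[str, ProteinForm], accession: str) -> List[str]:
    """Return every PRO id cross-referenced to one UniProtKB record."""
    return sorted(f.pro_id for f in forms.values() if f.uniprot_xref == accession)


def uniprot_parts(cpx: ProteinComplex, forms: Dict[str, ProteinForm]) -> Set[str]:
    """Resolve a complex's has_part entries to UniProtKB accessions."""
    return {forms[part].uniprot_xref for part in cpx.has_part}
```

Calling `forms_for_uniprot(FORMS, "Q9D6R2")` returns all three PRO identifiers, which mirrors the caption's point that a single UniProtKB record corresponds to several separately accessioned PRO entries.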
Figure 2. PRO hierarchy depicting relationships for the human and bacterial SPT complex protein subunits, and for the SPT complexes. The blue arrows and "I" icons represent is_a relationships among the entities. The protein components have PR ids and the complex concepts have Gene Ontology (GO) ids.
Figure 3. Screenshot of the PRO entry for human serine palmitoyltransferase complex core 1 (SPT) as displayed on the PRO web site.
Figure 4. Example of a Protein Annotation File (PAF) showing biological annotations for MCCC1 (MCCA) and MCCC2 (MCCB) human variants. Only columns with information are shown and short names of the variants are used. This PAF file is available via the PRO ftp site: ftp://ftp.pir.georgetown.edu/databases/ontology/pro_obo/.
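A view like the one in Figure 4, where only columns with information are shown, could be produced along the following lines. This is a rough sketch that assumes the PAF is tab-delimited text with a header row; the function name, column names, and sample rows are hypothetical placeholders and are not taken from the actual PAF.

```python
import csv
import io
from typing import Dict, List


def non_empty_columns(paf_text: str) -> List[Dict[str, str]]:
    """Parse tab-delimited text and drop columns that are empty in every row."""
    reader = csv.DictReader(io.StringIO(paf_text), delimiter="\t")
    rows = list(reader)
    header = reader.fieldnames or []
    # Keep a column only if at least one row has a non-blank value in it.
    keep = [c for c in header if any((r.get(c) or "").strip() for r in rows)]
    return [{c: (r.get(c) or "") for c in keep} for r in rows]


# Hypothetical sample: the "evidence" column is empty in every row.
SAMPLE = (
    "id\tname\tevidence\trelation\n"
    "PR:EXAMPLE1\tvariant A\t\thypothetical_relation\n"
    "PR:EXAMPLE2\tvariant B\t\t\n"
)

ROWS = non_empty_columns(SAMPLE)
```

With this sample, `ROWS` keeps the `id`, `name`, and `relation` columns and drops `evidence`, because no row carries a value in that column.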