Literature DB >> 17597866

On gene ontology and function annotation.

Debnath Pal1.   

Abstract

The effort of function annotation does not merely involve associating a gene with some structured vocabulary that describes action. Rather the details of the actions, the components of the actions, the larger context of the actions are important issues that are of direct relevance, because they help understand the biological system to which the gene/protein belongs. Currently Gene Ontology (GO) Consortium offers the most comprehensive sets of relationships to describe gene/protein activity. However, its choice to segregate gene ontology to subdomains of molecular function, biological process and cellular component is creating significant limitations in terms of future scope of use. If we are to understand biology in its total complexity, comprehensive ontologies in larger biological domains are essential. A vigorous discussion on this topic is necessary for the larger benefit of the biological community. I highlight this point because larger-bio-domain ontologies cannot be simply created by integrating subdomain ontologies. Relationships in larger bio-domain-ontologies are more complex due to larger size of the system and are therefore more labor intensive to create. The current limitations of GO will be a handicap in derivation of more complex relationships from the high throughput biology data.

Entities:  

Year:  2006        PMID: 17597866      PMCID: PMC1891657          DOI: 10.6026/97320630001097

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

The protein function annotation problem is central to understanding of biological systems. A major challenge that is currently facing the biology community is to precisely define what action(s) constitutes a given function. The challenge lies in the fact that the same action is interpretable in different ways to different people if the context in which the action is taking place is not known. For example in proteins, the action “bind” describes association between two proteins. But left to itself, the action is open to further interpretation. For example: bind for catalysis, bind for transport or bind for regulation? Similar things can be said about actions like fusion, docking, porting, sensing, silencing and so on. It is apparent that some actions share parent child relationships and these natural relationships must be recognized for assisting us in the function annotation efforts. Another degree of complexity in the definition of function comes from the description of functional levels. As is evident, function can be defined as any of a group of related actions contributing to a larger action in a living organism (Webster Dictionary definition). This means lower level functions will interact to generate higher-level function and at the same time, a lower level function will be part of many higher-level functions. These relationships among individual functions at a each level and among many levels must be established properly to identify a function and its order (order of a function is the number of component functions that makes it). The relationship between functions at different levels and also among themselves must also be identified.

Description

Gene Ontology (GO, http://www.geneontology.org) currently houses the most extensive set of description of molecular functions, biological processes and cellular components using controlled vocabulary. [1] Each description in these sets can be called a term, which are related to another by a directed edge. The graph that is thus formed is called a directed acyclic graph where in the detail of description improves as we go further away from the root term. The GO takes into account the dependencies of terms and parent child relationship. For example, in biological processes if either X biosynthesis or X catabolism exists, then the parent X metabolism is included. Similarly, if regulation of X exists, then the process X must also exist. However, all such logically related terms are not included in a defined manner. For example, in molecular function, binding activity is segregated from related activities like transport and catalysis. The current argument of GO is that binding terms should only be used in cases where a stable binding interaction occurs. It has been argued that if actions were indeed subdivided, splitting the catalysis of a reaction into steps such as “substrate binding”, “formation of unstable intermediate” or “attraction of electrons to positive charge” - will lead to saying that a reaction was actually a series of functions ­ i.e., a process. There is certain paradox in the arguments here, because ontologies in GO are developed at three segregated levels: molecular function, biological process and cellular component. In reality, molecular functions are also processes involving atoms, and this fits nicely within our prescribed definition of function. There should naturally exist relationships that link molecular function to biological process, or any level that involves entities such as atoms, tissues, organs and so on (Figure 1).
Figure 1

A proposed outline of hierarchy in an unified biological ontology describing function. Each species may have different organization of the hierarchies depending on its structural organization. While for unicellular organisms the hierarchy is restricted to cell; in multicellular organism this hierarchy can extend to tissues, glands, bones, organs, appendage, body and so on. In all these cases, the functional relationships must conform to the definitions proposed on the right panel. In our scheme, although we have restricted our hierarchy to molecular function, in principle it can be extended to molecular subdomains, atoms and electrons. The graph formed from the relationships is expected to be directed and acyclic to make it amenable to computations.

Therefore a critical challenge remains in understanding function terms in their proper perspective so that all functions at different levels can be related to each other. Certain effort is already ongoing [2 ], how we can develop Gene Ontology with clarity of logical relationships, which are also mathematically amenable to interpretation and useful for functional annotation. Open Biomedical Ontologies (OBO, http://obo.sourceforge.net/) are another effort in this direction. Although effort to build ontologies in a given domain is useful, its utility will be limited by lack of cross domain link, which cannot be overcome by simply integrating individual domain ontologies. Such larger bio-domain ontologies therefore need to be built almost from scratches, since the complexities of relationships are diverse, an essential requirement for understanding the complexity of biological actions. Therefore efforts to build ontologies in larger biological domains must be started so that the inherent biosystem complexities are truly resolved. Because complexities of relations in large domains are manifold, only experts are properly qualified to define clear relations. With a worldwide community initiative, however, the possibilities of having these ontologies are not impractical. Voluntary contribution of these ontologies derived from experimental work of the authors, who are self-experts in the area, will benefit the effort. Research journals can encourage such contributions by making ontology submissions mandatory to a public database. It is arguably one of the most challenging frontiers facing the biology community today.
  2 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

Review 2.  Ontologies for molecular biology and bioinformatics.

Authors:  Steffen Schulze-Kremer
Journal:  In Silico Biol       Date:  2002
  2 in total
  7 in total

Review 1.  Bioinformatics and cancer research: building bridges for translational research.

Authors:  Gonzalo Gómez-López; Alfonso Valencia
Journal:  Clin Transl Oncol       Date:  2008-02       Impact factor: 3.405

2.  In silico gene expression analysis in Codonopsis lanceolata root.

Authors:  Subramaniyam Sathiyamoorthy; Jun-Gyo In; Ok Ran Lee; Bum-Soo Lee; Sri Renuka Devi; Deok-Chun Yang
Journal:  Mol Biol Rep       Date:  2010-11-19       Impact factor: 2.316

3.  Generation and gene ontology based analysis of expressed sequence tags (EST) from a Panax ginseng C. A. Meyer roots.

Authors:  Subramaniyam Sathiyamoorthy; Jun-Gyo In; Sathiyaraj Gayathri; Yeon-Ju Kim; Deok-Chun Yang
Journal:  Mol Biol Rep       Date:  2009-11-27       Impact factor: 2.316

4.  An ontological analysis of some biological ontologies.

Authors:  Briti Deb
Journal:  Front Genet       Date:  2012-11-26       Impact factor: 4.599

5.  The Mechanism Research of Qishen Yiqi Formula by Module-Network Analysis.

Authors:  Shichao Zheng; Yanling Zhang; Yanjiang Qiao
Journal:  Evid Based Complement Alternat Med       Date:  2015-08-24       Impact factor: 2.629

6.  PDON: Parkinson's disease ontology for representation and modeling of the Parkinson's disease knowledge domain.

Authors:  Erfan Younesi; Ashutosh Malhotra; Michaela Gündel; Phil Scordis; Alpha Tom Kodamullil; Matt Page; Bernd Müller; Stephan Springstubbe; Ullrich Wüllner; Dieter Scheller; Martin Hofmann-Apitius
Journal:  Theor Biol Med Model       Date:  2015-09-22       Impact factor: 2.432

7.  Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology.

Authors:  Alessandro Botton; Giulio Galla; Ana Conesa; Christian Bachem; Angelo Ramina; Gianni Barcaccia
Journal:  BMC Genomics       Date:  2008-07-24       Impact factor: 3.969

  7 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.