Ontologies for programs, not people.

Lawrence Hunter

Abstract

A response to "Life sentences: Ontology recapitulates philology" by Sydney Brenner, Genome Biology 2002, 3:comment1006.1-1006.2.

Year:  2002        PMID: 12093368      PMCID: PMC139366          DOI: 10.1186/gb-2002-3-6-interactions1002

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


In a recent column [1], the wry and erudite Sydney Brenner expressed his disdain for current efforts to create computational ontologies of molecular biology. In essence, Brenner argued that building a network of names of biological entities is a waste of time. It is the nucleotide sequences or amino-acid conformations of these objects, not their names, "that create the processes that produce outcomes for cells, organs and organism," he says. "Very simply, the network we should be interested in is not the network of names but the network of the objects themselves."
Brenner's article misses the point - several points, actually. First, the essence of the Gene Ontology project, of which he is specifically critical, and of other knowledge-bases of molecular biology, such as EcoCyc [2] or the Unified Medical Language System (UMLS) [3], is not in the list of names they embody, but in the relationships they represent. The names are convenient symbols to which more complex statements can be attached. Without the names, it is impossible to specifically represent relationships such as 'activates' or 'binds to'. Surely that sort of information must be the kind of thing that Brenner means when he says we are interested in the interactions between the objects themselves, rather than their names.
Second, if we are to build useful databases of the interactions that Brenner suggests ought to hold our interest, then there are significant advantages to being able to make statements about various groupings of genes and gene products together, using the terminology that is familiar to molecular biologists. For example, representation of the statement 'the balance between pro- and anti-apoptotic members of the bcl2 family of genes determines whether apoptosis proceeds' is straightforward if we use an ontology that contains the appropriate abstractions, and painfully difficult if we are limited to expressions of direct interactions between pairs of genes and proteins.
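To make the grouping argument concrete, here is a minimal sketch of relationships attached to names, with statements made once at the level of a grouping and inherited by its members. The term names, the `promotes`/`inhibits` relation labels, and the functions `ancestors` and `relations_for` are all hypothetical choices for illustration; none of this is the Gene Ontology's actual data model or content.

```python
# Toy ontology (illustrative only): each term lists the groupings it belongs to.
IS_A = {
    "BCL2": ["anti-apoptotic bcl2 family member"],
    "BCL2L1": ["anti-apoptotic bcl2 family member"],
    "BAX": ["pro-apoptotic bcl2 family member"],
    "BAK1": ["pro-apoptotic bcl2 family member"],
    "anti-apoptotic bcl2 family member": ["bcl2 family member"],
    "pro-apoptotic bcl2 family member": ["bcl2 family member"],
}

# Relationships attached to names as (subject, relation, object) triples.
# Each is stated once for a grouping instead of once per gene.
ASSERTIONS = [
    ("pro-apoptotic bcl2 family member", "promotes", "apoptosis"),
    ("anti-apoptotic bcl2 family member", "inhibits", "apoptosis"),
]


def ancestors(term):
    """Return the term plus every grouping reachable through is_a links."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(IS_A.get(current, []))
    return seen


def relations_for(gene):
    """Collect the assertions a gene inherits from any of its groupings."""
    groups = set(ancestors(gene))
    return [(rel, obj) for subj, rel, obj in ASSERTIONS if subj in groups]
```

With this structure, `relations_for("BAX")` returns `[("promotes", "apoptosis")]` without any statement having been written about BAX itself. A representation restricted to pairwise gene-to-gene interactions would have to enumerate each family member separately.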
The third important point to consider is to whom Brenner is referring when he uses "we" in his argument. Knowledge-bases are not generally used directly by an end user, but instead by computer programs in order to accomplish complex inference tasks. Many productive and promising approaches to bioinformatics require a computationally manipulable representation of existing biological understanding - incomplete and incorrect as it may be - as a vital prerequisite. For example, inference from gene-expression data using Bayesian networks [4] can take advantage of online sources of information about the likely probabilistic dependencies among expression levels of various genes. Knowledge-bases built from textbooks, review articles, or even the Oxford Dictionary of Molecular Biology can provide precisely this sort of computationally useful information.
The fourth issue is that if bioinformaticians are to build useful tools for managing the ever-growing onslaught of research publications resulting from high-throughput instrumentation and exacerbated by the collapse of subdisciplinary distinctions, then they must first create computer programs that recognize references to genes, proteins and other biological entities in texts. Automatically linking references to molecular entities and processes in texts (such as Medline abstracts) to the appropriate entries in molecular databases (such as GenBank) can save enormous amounts of researcher time and facilitate the kind of biology that Brenner holds dear. Such a mapping, however, requires the presence of a well-represented knowledge-base of molecular biological entities - perhaps like the Gene Ontology.
Brenner is, of course, entitled to his opinion about the utility of efforts like the Gene Ontology and the UMLS. Perhaps he doesn't need any of the computational tools for analyzing high-throughput data in light of prior knowledge, or managing the vast scientific literature, either. For those of us who use bioinformatics software to advance scientific understanding, however, broad community efforts at knowledge representation - like the effort of the Gene Ontology Consortium - are invaluable.
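The text-linking task raised in the fourth point can be illustrated with a minimal dictionary-lookup sketch: a lexicon of names and synonyms is matched against a sentence, and each mention is mapped to an entry identifier. The lexicon contents, the `ENTRY:`/`PROCESS:` identifiers, and the function `link_mentions` are placeholders invented for this example; they are not GenBank accessions or Gene Ontology IDs, and real systems need far more than exact matching (for instance, resolving ambiguous names).

```python
import re

# Hypothetical lexicon: surface names and synonyms mapped to placeholder IDs.
LEXICON = {
    "bcl2": "ENTRY:0001",
    "bcl-2": "ENTRY:0001",
    "bax": "ENTRY:0002",
    "apoptosis": "PROCESS:0001",
}


def link_mentions(text):
    """Return (start, end, surface, entry_id) for each lexicon name in text."""
    hits = []
    for name, entry_id in LEXICON.items():
        # Require non-alphanumeric boundaries so "bax" does not match "Baxter".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), entry_id))
    return sorted(hits)
```

Calling `link_mentions("Bcl-2 and Bax regulate apoptosis.")` yields three mentions, in text order, linked to `ENTRY:0001`, `ENTRY:0002` and `PROCESS:0001`. The quality of such a linker depends almost entirely on the lexicon behind it, which is where a curated knowledge-base comes in.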
References

1.  Life sentences: Ontology recapitulates philology.

Authors:  Sydney Brenner
Journal:  Genome Biol       Date:  2002-03-19       Impact factor: 13.583

2.  Pathway databases: a case study in computational symbolic theories.

Authors:  P D Karp
Journal:  Science       Date:  2001-09-14       Impact factor: 47.728

3.  The Unified Medical Language System: an informatics research collaboration.

Authors:  B L Humphreys; D A Lindberg; H M Schoolman; G O Barnett
Journal:  J Am Med Inform Assoc       Date:  1998 Jan-Feb       Impact factor: 4.497

4.  Rich probabilistic models for gene expression.

Authors:  E Segal; B Taskar; A Gasch; N Friedman; D Koller
Journal:  Bioinformatics       Date:  2001       Impact factor: 6.937
