Heiko Dietze1, Tanya Z Berardini2, Rebecca E Foulger3, David P Hill4, Jane Lomax3, David Osumi-Sutherland3, Paola Roncaglia2, Christopher J Mungall1.
Abstract
BACKGROUND: Biological ontologies are continually growing and improving through requests for new classes (terms) from biocurators. These requests frequently create bottlenecks in the biocuration process, because ontology developers struggle to keep up while manually processing the requests and creating the classes.
Keywords: Class generation; Ontology
Year: 2014 PMID: 25937883 PMCID: PMC4417543 DOI: 10.1186/2041-1480-5-48
Source DB: PubMed Journal: J Biomed Semantics
Figure 1. Conventional ontology class request workflow. General workflow for ontology class requests using a traditional issue tracker. A simple class request may take several days, for complex cases even longer.
Figure 2. Overview of TermGenie Components and Workflow. (1) Retrieve existing templates for user selection; (2) Term generation processing and validation; (2a) Generate textual data and OWL axioms; (2b) Use reasoning to check for existing classes and new or changed relations; (3) Review of generated classes by the user in the web interface; (4) After review, assign permanent identifiers to the new classes; (5) Add the new classes into the queue for review; (6) Senior ontology developers review the classes: accept, modify, obsolete; (7) Commit the changes to the ontology; (8) Send confirmation e-mail to the user.
Figure 3. Example template and configuration for TermGenie. (top-left) XML-based example template configuration for the Gene Ontology template chemical_export, including declarations for required and optional input fields and the corresponding JavaScript file; (top-right) Snippet from that JavaScript file for generating a class and OWL axioms; (bottom) Screenshot of the generated TermGenie input fields, showing autocompletion on ChEBI classes.
Figure 4. Inferences for a class using a standard OWL reasoner. Reasoning example for a genus + differentia pattern for camptothecin catabolism in the Gene Ontology. The class is defined by its genus ‘catabolic process’ (GO:0009056) and differentia ‘has_input camptothecin’ (CHEBI:27656). Following that definition, the class is a subclass of catabolic process. Using the additional axioms from ChEBI and the Gene Ontology, a standard OWL reasoner can infer the more specific superclass ‘alkaloid catabolic process’ (GO:0009822).
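The inference described in the Figure 4 caption can be illustrated with a toy sketch. This is not an OWL reasoner and not TermGenie's code: it only handles classes that share a genus and a single ‘has_input’ filler, and the chemical hierarchy is a hypothetical, shortened stand-in for the real ChEBI classification.

```python
# Toy chemical hierarchy (assumption: simplified; the real ChEBI path is longer).
CHEM_IS_A = {"camptothecin": "alkaloid", "alkaloid": "chemical entity"}

# Logical definitions: class label -> (genus, filler of 'has_input').
DEFINITIONS = {
    "alkaloid catabolic process": ("catabolic process", "alkaloid"),
    "camptothecin catabolic process": ("catabolic process", "camptothecin"),
}


def ancestors(chem):
    """Return chem followed by all of its superclasses in the toy hierarchy."""
    chain = [chem]
    while chem in CHEM_IS_A:
        chem = CHEM_IS_A[chem]
        chain.append(chem)
    return chain


def inferred_superclasses(cls):
    """Find defined classes with the same genus and a more general filler."""
    genus, filler = DEFINITIONS[cls]
    more_general = ancestors(filler)
    return [
        other
        for other, (other_genus, other_filler) in DEFINITIONS.items()
        if other != cls and other_genus == genus and other_filler in more_general
    ]


print(inferred_superclasses("camptothecin catabolic process"))
```

A real reasoner derives the same placement from the full set of OWL axioms rather than from a hand-written lookup, which is why it also catches cases this sketch would miss.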
Figure 5. TermGenie user workflow. To create a class in TermGenie, biocurators go to the TermGenie website and select the relevant template for their request. The template consists of a set of required and optional input fields. TermGenie provides autocompletion for appropriate input fields. After passing some quick checks, the request is sent to the server, where generation and reasoning are executed. The results are sent back, and the users have the chance to review the proposed classes. The next step is the submission of the generated classes for review. As part of this process, a new permanent identifier is generated using a customizable identifier pattern and range. Furthermore, the request is added to the review queue for final approval by the ontology developers.
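The identifier step mentioned in the Figure 5 caption (a customizable pattern plus a range) might look roughly like the sketch below. The function name, the pattern string, and the numeric range are all hypothetical choices for illustration; the source does not state the values TermGenie uses.

```python
def next_identifier(used, pattern="GO:{:07d}", id_range=(1900000, 2000000)):
    """Return the first identifier in id_range that is not already used.

    pattern and id_range are illustrative assumptions, not TermGenie settings.
    The range is half-open: start is included, end is excluded.
    """
    start, end = id_range
    for number in range(start, end):
        candidate = pattern.format(number)
        if candidate not in used:
            return candidate
    raise RuntimeError("identifier range exhausted")


already_assigned = {"GO:1900000", "GO:1900001"}
print(next_identifier(already_assigned))
```

Restricting generated identifiers to a dedicated range keeps them from colliding with identifiers assigned by other tools or by hand.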
Figure 6. TermGenie workflow during a submitted class review by an ontology developer. After a user has submitted their generated class requests, the requests are put into a queue for review by an ontology developer. During the review, the ontology developer has three choices: approve, modify, or obsolete. For the commit, the server uses the version control adapter to create a clean checkout. From there, TermGenie loads the ontology as a separate instance and applies the relevant changes. After writing the changed ontology to a file, TermGenie tries to commit the updated file to the version control system. After a successful commit, the queue is updated and a confirmation e-mail is sent to the requester.
TermGenie generated class counts in GO over time
| Quarter | 2010-III | 2010-IV | 2011-I | 2011-II | 2011-III | 2011-IV | 2012-I | 2012-II |
|---|---|---|---|---|---|---|---|---|
| TermGenie | 139 | 154 | 236 | 254 | 307 | 175 | 255 | 806 |
| Manual | 575 | 413 | 332 | 295 | 313 | 364 | 462 | 324 |
| Fraction | 19.47% | 27.16% | 41.55% | 46.27% | 49.52% | 32.47% | 35.56% | 71.33% |

| Quarter | 2012-III | 2012-IV | 2013-I | 2013-II | 2013-III | 2013-IV | 2014-I | 2014-II | Total |
|---|---|---|---|---|---|---|---|---|---|
| TermGenie | 303 | 352 | 357 | 285 | 218 | 231 | 301 | 342 | 4715 |
| Manual | 371 | 283 | 62 | 92 | 170 | 164 | 109 | 110 | 4439 |
| Fraction | 44.96% | 55.43% | 85.20% | 75.60% | 56.19% | 58.48% | 73.41% | 75.66% | 51.51% |
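The Fraction row appears to be the TermGenie count divided by the sum of the TermGenie and Manual counts. A minimal check of that reading, using the first eight quarters and the totals from the table (the function name is ours):

```python
# Counts copied from the table above (2010-III through 2012-II).
termgenie = [139, 154, 236, 254, 307, 175, 255, 806]
manual = [575, 413, 332, 295, 313, 364, 462, 324]


def termgenie_fraction(tg, man):
    """Share of new classes created via TermGenie, as a percentage."""
    return 100.0 * tg / (tg + man)


fractions = [round(termgenie_fraction(t, m), 2) for t, m in zip(termgenie, manual)]
print(fractions)

# Overall share, from the totals in the last column of the table.
print(round(termgenie_fraction(4715, 4439), 2))
```

The computed values agree with the percentages reported in the table, including the overall 51.51%.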
Available templates for the Gene Ontology TermGenie instance
| Template | Input fields | Equivalent class statement |
|---|---|---|
| regulation | X:BP | GO:0065007 and ‘regulates’ some ?X |
| negative_regulation | X:BP | GO:0065007 and ‘negatively regulates’ some ?X |
| positive_regulation | X:BP | GO:0065007 and ‘positively regulates’ some ?X |
| regulation | X:MF | GO:0065007 and ‘regulates’ some ?X |
| negative_regulation | X:MF | GO:0065007 and ‘negatively regulates’ some ?X |
| positive_regulation | X:MF | GO:0065007 and ‘positively regulates’ some ?X |
| involved_in | P:BP, W:BP | ?P and ‘part_of’ some ?W |
| involved_in_mf_bp | P:MF, W:BP | ?P and ‘part_of’ some ?W |
| occurs_in | P:BP, C:CC | ?P and ‘occurs in’ some ?C |
| regulation_by | R:GO:0050789, P:BP | ?R and ‘results_in’ some ?P |
| part_of_cell_component | P:CC, W:CC | ?P and ‘part_of’ some ?W |
| chemical_transport | X:chebi | GO:0006810 and ‘transports or maintains localization of’ some ?X |
| chemical_transporter_activity | X:chebi | GO:0005215 and ‘transports or maintains localization of’ some ?X |
| chemical_binding | X:chebi | GO:0005488 and ‘has input’ some ?X |
| metabolism | X:chebi | GO:0008152 and ‘has participant’ some ?X |
| catabolism | X:chebi | GO:0009056 and ‘has input’ some ?X |
| biosynthesis | X:chebi | GO:0009058 and ‘has output’ some ?X |
| chemical_transmembrane_transport | X:chebi | GO:0055085 and ‘transports or maintains localization of’ some ?X |
| transmembrane transporter activity | X:chebi | GO:0022857 and ‘transports or maintains localization of’ some ?X |
| secondary active transmembrane transporter activity | X:chebi | GO:0015291 and ‘transports or maintains localization of’ some ?X |
| uptake transmembrane transporter activity | X:chebi | GO:0015563 and ‘transports or maintains localization of’ some ?X |
| transmembrane-transporting ATPase activity | X:chebi | GO:0042626 and ‘transports or maintains localization of’ some ?X |
| response to | X:chebi | GO:0050896 and ‘has input’ some ?X |
| cellular response to | X:chebi | GO:0070887 and ‘has input’ some ?X |
| chemical homeostasis | X:chebi | GO:0048878 and ‘regulates level of’ some ?X |
| cellular chemical homeostasis | X:chebi | GO:0055082 and ‘regulates level of’ some ?X |
| chemical_import | X:chebi | GO:0006810 and ‘imports’ some ?X |
| chemical_export | X:chebi | GO:0006810 and ‘exports’ some ?X |
| chemical_import_into | S:chebi, T:CC | GO:0006810 and ‘has target end location’ some ?T and ‘imports’ some ?S |
| transport | F:CC, T:CC | GO:0006810 and ‘has target start location’ some ?F and ‘has target end location’ some ?T |
| vesicle-mediated transport | F:CC, T:CC | GO:0016192 and ‘has target start location’ some ?F and ‘has target end location’ some ?T |
| transport | C:CC | GO:0006810 and ‘transports or maintains localization of’ some ?C |
| vesicle-mediated transport | C:CC | GO:0016192 and ‘transports or maintains localization of’ some ?C |
| transport | X:chebi, [F:CC], [T:CC] | GO:0006810 and ‘transports or maintains localization of’ some ?X [and ‘has target start location’ some ?F] [and ‘has target end location’ some ?T] |
| vesicle-mediated transport | X:chebi, [F:CC], [T:CC] | GO:0016192 and ‘transports or maintains localization of’ some ?X [and ‘has target start location’ some ?F] [and ‘has target end location’ some ?T] |
| assembly | C:CC | GO:0022607 and ‘results_in_assembly_of’ some ?C |
| disassembly | C:CC | GO:0022411 and ‘results_in_disassembly_of’ some ?C |
| plant_development | P:plant | ‘anatomical structure development’ and ‘results in development of’ some ?P |
| plant_formation | X:plant | ‘anatomical structure formation involved in morphogenesis’ and ‘results in formation of’ some ?X |
| plant_maturation | X:plant | ‘developmental maturation’ and ‘results in developmental progression of’ some ?X |
| plant_morphogenesis | X:plant | ‘anatomical structure morphogenesis’ and ‘results in morphogenesis of’ some ?X |
| plant_structural_organization | X:plant | ‘anatomical structure arrangement’ and ‘results in structural organization of’ some ?X |
| cell_apoptotic_process | C:cell | ‘cell-type specific apoptotic process’ and ‘occurs in’ some ?C |
| cell_differentiation | C:cell | GO:0030154 and ‘results in acquisition of features of’ some ?C |
| cell_migration | C:cell | ‘cell migration’ and ‘alters location of’ some ?C |
| protein localization | C:CC | GO:0008104 and ‘has target end location’ some ?C |
| establishment of protein localization | C:CC | GO:0045184 and ‘has target end location’ some ?C |
| protein_complex_by_activity | A:MF | GO:0043234 and ‘capable_of’ some ?A |
| single-organism | P:BP | ?P and ‘bearer of’ some PATO:0002487 |
| multi-organism | P:BP | ?P and ‘bearer of’ some PATO:0002486 |
| biosynthesis_from | T:chebi, F:chebi | GO:0009058 and ‘has output’ some ?T and ‘has input’ some ?F |
| biosynthesis_via | T:chebi, V:chebi | GO:0009058 and ‘has output’ some ?T and ‘has intermediate’ some ?V |
| catabolism_to | S:chebi, R:chebi | GO:0009056 and ‘has input’ some ?S and ‘has output’ some ?R |
| catabolism_via | X:chebi, V:chebi | GO:0009056 and ‘has input’ some ?X and ‘has intermediate’ some ?V |
| metazoan_development | X:Uberon | ‘anatomical structure development’ and ‘results in development of’ some ?X |
The first column contains the template names and the available template variations. The second column lists the expected ontology inputs for the equivalent class statement in the third column, with BP = GO:biological_process, MF = GO:molecular_function, CC = GO:cellular_component, chebi = ‘chemical entity’ (CHEBI:24431), plant = ‘plant anatomical entity’ (PO:0025131), cell = ‘native cell’ (CL:0000003), Uberon = ‘anatomical entity’ (UBERON:0001062).
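To make the relationship between input fields and equivalent class statements concrete, the following sketch fills the `?X`-style placeholders of two patterns from the table with user-supplied identifiers. It is an illustration only: the dictionary, the function name, and the plain string substitution are our assumptions, whereas TermGenie itself generates OWL axioms from its XML and JavaScript template configuration. Optional bracketed inputs are not handled.

```python
import re

# Patterns copied from the table above; ?X, ?T and ?F are input placeholders.
TEMPLATES = {
    "catabolism": "GO:0009056 and 'has input' some ?X",
    "biosynthesis_from": "GO:0009058 and 'has output' some ?T and 'has input' some ?F",
}


def instantiate(template_name, **inputs):
    """Replace each ?VAR placeholder with the identifier supplied for VAR."""
    pattern = TEMPLATES[template_name]

    def substitute(match):
        var = match.group(1)
        if var not in inputs:
            raise KeyError(f"missing input for ?{var}")
        return inputs[var]

    return re.sub(r"\?([A-Z])", substitute, pattern)


# Camptothecin (CHEBI:27656) as the input of the catabolism template.
print(instantiate("catabolism", X="CHEBI:27656"))
```

For the catabolism template with camptothecin as input, this yields the genus + differentia expression `GO:0009056 and 'has input' some CHEBI:27656`.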