
Saint: a lightweight integration environment for model annotation.

Allyson L Lister, Matthew Pocock, Morgan Taschuk, Anil Wipat.

Abstract

SUMMARY: Saint is a web application that provides a lightweight annotation integration environment for quantitative biological models. The system enables modellers to rapidly mark up models with biological information derived from a range of data sources.
AVAILABILITY AND IMPLEMENTATION: Saint is freely available for use on the web at http://www.cisban.ac.uk/saint. The web application is implemented in Google Web Toolkit and Tomcat, with all major browsers supported. The Java source code is freely available for download at http://saint-annotate.sourceforge.net. The Saint web server requires an installation of libSBML and has been tested on Linux (32-bit Ubuntu 8.10 and 9.04).

Year: 2009 PMID: 19734151 PMCID: PMC2773255 DOI: 10.1093/bioinformatics/btp523

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

Quantitative modelling is at the heart of systems biology. Model description languages such as the Systems Biology Markup Language (SBML; Hucka et al., 2003) or CellML (Lloyd et al., 2008) allow the relationships between biological entities to be captured and the dynamics of these interactions to be described mathematically. Currently, however, many dynamic models include only the mathematical information required to run simulations, and do not explicitly contain the full biological context. The efficient exchange, reuse and integration of models is aided by the presence of biological information in the model. Model annotations are necessary to describe how the model has been generated and to define the meaning of the components that make up the model in a computationally accessible fashion. If biological information about a model is added consistently and thoroughly, the model becomes useful not just for simulation, but also as an input in other computational tasks and as a reference for researchers.
Without a biological context, models are only easily understandable by their creators. Interested third parties must rely on additional documentation such as publications, or on possibly incomplete descriptions of species and reactions included in an SBML element's name or identifier. Without computationally amenable biological annotations, it is difficult to create software tools that determine the biological pathway or reaction being modelled. The addition of biological annotations to a model is usually a manual, time-consuming process; there is no single resource that encompasses all suitable data sources. A modeller usually has to visit many web sites, applications and interfaces in order to identify relevant information, and may not be aware of all potentially useful databases. Because modellers add information manually, it is very difficult to annotate exhaustively.
Whilst annotations are vital for model sharing and reuse, they do not contribute to the mathematical content of a model and are not critical to its successful functioning. The addition of biological knowledge must be performed quickly and easily in order to make annotation worthwhile to a modeller. A large number of tools are available for the construction, manipulation and simulation of models, but there is currently a lack of tools to facilitate rapid and systematic model annotation. While web sites and applications specializing in integrating disparate data sources exist, such as BioMart (Smedley et al., 2009) and Pathway Commons, none are designed to put information directly into a model. In this article, we describe a lightweight SBML model annotation tool called Saint, specifically designed to identify and integrate biological information relevant to computational models. Saint is an application which supports the addition of basic annotation to SBML entities and identifies new reactions which may be valuable for extending a model. Whilst the addition of biological annotation does not modify the behaviour of the model, the incorporation of new reactions or species adds new features that can later be built upon to potentially change the model's output.

2 IMPLEMENTATION

Saint is a web application whose client side is implemented in Google Web Toolkit and which is hosted on a Tomcat server; on the server side, a query translation service connects to a number of external web services. New annotation is presented to the user in a single integrated view after retrieval by the server-side queries. Reactions and associated species are added directly to the SBML model, whereas the majority of the remaining biological annotation is added to Annotation elements according to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) specification (Le Novère et al., 2005). MIRIAM annotations are resource annotations added to SBML in a standardized way; they link external resources such as ontologies and data sources to a model. MIRIAM, among other things, defines an annotation scheme accessible via web services which specifies the format and set of standard data types which should be used for these URIs (Laibe and Le Novère, 2007). The use of the MIRIAM format provides a standard structure for explicit links between the mathematical and biological aspects of an SBML model.
Saint facilitates the biological annotation of SBML models by using query translation to present an integrated view of data sources and suggested ontological terms. Data sources include UniProtKB (The UniProt Consortium, 2008), STRING (Jensen et al., 2008) and Pathway Commons. Supported ontologies and standards include MIRIAM, the Systems Biology Ontology (SBO; Le Novère, 2006) and Gene Ontology (GO; Ashburner et al., 2000). Query translation within Saint occurs when the query for each species is translated into a set of queries over these resources' web services. Data are matched to a species through syntactic equivalence between the query term and the external data source. The combined query results are then displayed in the web browser. If a model is valid, Saint displays the parts of the model available for annotation.
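The query translation step described above can be illustrated with a minimal Python sketch. It is not Saint's implementation (Saint is written in Java, and its internals are not given here). The two lookup functions are hypothetical stand-ins for remote web services, the `Hit` record and the `ExampleInteractions` source are invented for illustration, and the matching rule (names compared after trimming and case-folding) is an assumed reading of "syntactic equivalence". Only the accession P32797 comes from the worked example in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hit:
    source: str      # resource that produced the hit, kept for provenance
    name: str        # entity name as recorded by that resource
    identifier: str  # accession or identifier within that resource


def normalise(term: str) -> str:
    """Assumed matching rule: compare names ignoring case and outer spaces."""
    return term.strip().casefold()


def uniprot_stub(term: str) -> List[Hit]:
    # Stand-in for a UniProtKB web-service call. P32797 is the accession
    # quoted for 'cdc13' in the text; the CDC14 entry is a made-up decoy.
    return [
        Hit("UniProtKB", "CDC13", "P32797"),
        Hit("UniProtKB", "CDC14", "hypothetical-accession"),
    ]


def interaction_stub(term: str) -> List[Hit]:
    # Stand-in for an interaction resource; the identifier is invented.
    return [Hit("ExampleInteractions", "cdc13", "hypothetical-record-1")]


SOURCES: Dict[str, Callable[[str], List[Hit]]] = {
    "UniProtKB": uniprot_stub,
    "ExampleInteractions": interaction_stub,
}


def translate_query(term: str, sources=SOURCES) -> List[Hit]:
    """Fan one species query out to every source and merge the matching hits."""
    wanted = normalise(term)
    merged: List[Hit] = []
    for lookup in sources.values():
        merged.extend(h for h in lookup(term) if normalise(h.name) == wanted)
    return merged


hits = translate_query("cdc13")
for hit in hits:
    print(hit.source, hit.identifier)
```

Each hit keeps the name of the resource it came from, so a combined view can show the modeller where a suggestion originated.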
The display is organized around species, which are the main target of annotation. Saint makes use of the Google Web Toolkit to provide both asynchronous calls to external resources and cross-browser compatibility. New annotation can be viewed by the modeller, even if the other species are not annotated yet. The modeller can select or delete annotations as it suits their model, or hide entire species from consideration. When the modeller is satisfied with the new state of the model, it can be converted back to SBML and saved. Parsing and validation of the models are handled with libSBML (Bornstein et al., 2008). As an example, a Saccharomyces cerevisiae model containing a species with a single, simple identifier of ‘cdc13’ is loaded into Saint. Saint suggests the SBO term ‘macromolecule’ (SBO:0000245), which is added as an sboTerm attribute of that species element, as the best SBO match to a protein. This term was suggested both because ‘cdc13’ was found within UniProtKB and because the Pathway Commons interaction set identified the species as a protein. Saint also suggests the UniProtKB accession P32797, and GO terms including ‘nuclear telomere cap complex’ (GO:0000783) and ‘single-stranded telomeric DNA binding’ (GO:0043047) as retrieved from UniProtKB. This information is stored within the model via MIRIAM annotations. Extensions to the model are also suggested. For each species, new reactions and their associated species and species references are retrieved from both Pathway Commons and STRING. More examples and comparisons are available in the Supplementary Material.
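To make the ‘cdc13’ example concrete, the following Python sketch builds a species fragment that carries the suggested sboTerm and MIRIAM-style resource links. It is an illustration under stated assumptions, not Saint's output: the choice of biology qualifiers (`is`, `isPartOf`, `hasProperty`), the `urn:miriam:` URI forms, the `meta_` prefix for the metaid and the helper name `annotate_species` are all assumptions, and the fragment omits the SBML namespace and the enclosing model.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("bqbiol", BQBIOL)


def annotate_species(species_id, sbo_term, resources):
    """Return a <species> element with an sboTerm and RDF resource links.

    `resources` maps a biology qualifier name (for example "is") to a list
    of resource URIs. Qualifier names and URIs are supplied by the caller.
    """
    meta_id = "meta_" + species_id  # assumed metaid naming scheme
    species = ET.Element(
        "species", {"id": species_id, "metaid": meta_id, "sboTerm": sbo_term}
    )
    annotation = ET.SubElement(species, "annotation")
    rdf = ET.SubElement(annotation, f"{{{RDF}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": "#" + meta_id}
    )
    for qualifier, uris in resources.items():
        relation = ET.SubElement(description, f"{{{BQBIOL}}}{qualifier}")
        bag = ET.SubElement(relation, f"{{{RDF}}}Bag")
        for uri in uris:
            ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": uri})
    return species


# Identifiers are those quoted in the text; qualifiers and URI syntax are assumed.
cdc13 = annotate_species(
    "cdc13",
    "SBO:0000245",
    {
        "is": ["urn:miriam:uniprot:P32797"],
        "isPartOf": ["urn:miriam:obo.go:GO%3A0000783"],
        "hasProperty": ["urn:miriam:obo.go:GO%3A0043047"],
    },
)
print(ET.tostring(cdc13, encoding="unicode"))
```

The SBO term sits on the species element itself as an attribute, while the database and ontology links live inside the annotation element, which mirrors the split described in the text.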

3 DISCUSSION

To date, there are few tools available for automating the retrieval and integration of data for the annotation of SBML models. The Saint application was developed as an interactive web tool to annotate models with new MIRIAM resources and reactions, keeping track of data provenance so that the modeller can make an informed decision about the quality of the suggested annotation. The system makes it easy for modellers to add explicit biological knowledge to their models, increasing a model's usefulness both as a reference for other researchers and as an input for further computational analysis. A small number of similar tools are available. SemanticSBML, as part of a larger application, provides MIRIAM annotations through a combination of data warehousing and query translation over web services. The Java library libAnnotationSBML (Swainston and Mendes, 2009) uses query translation to provide annotation functionality with a minimal user interface. Unlike libAnnotationSBML, Saint is accessible through an easy-to-use web interface and, unlike both tools, it can add new reactions and associated species. Saint is under active development. Future enhancements will include the addition of new data sources and ontologies, annotation of elements other than species and reactions, and support for other modelling formalisms such as CellML.

REFERENCES

1. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors: M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal: Nat Genet Date: 2000-05 Impact factor: 38.330

2. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models.

Authors: M Hucka; A Finney; H M Sauro; H Bolouri; J C Doyle; H Kitano; A P Arkin; B J Bornstein; D Bray; A Cornish-Bowden; A A Cuellar; S Dronov; E D Gilles; M Ginkel; V Gor; I I Goryanin; W J Hedley; T C Hodgman; J-H Hofmeyr; P J Hunter; N S Juty; J L Kasberger; A Kremling; U Kummer; N Le Novère; L M Loew; D Lucio; P Mendes; E Minch; E D Mjolsness; Y Nakayama; M R Nelson; P F Nielsen; T Sakurada; J C Schaff; B E Shapiro; T S Shimizu; H D Spence; J Stelling; K Takahashi; M Tomita; J Wagner; J Wang
Journal: Bioinformatics Date: 2003-03-01 Impact factor: 6.937

3. Minimum information requested in the annotation of biochemical models (MIRIAM).

Authors: Nicolas Le Novère; Andrew Finney; Michael Hucka; Upinder S Bhalla; Fabien Campagne; Julio Collado-Vides; Edmund J Crampin; Matt Halstead; Edda Klipp; Pedro Mendes; Poul Nielsen; Herbert Sauro; Bruce Shapiro; Jacky L Snoep; Hugh D Spence; Barry L Wanner
Journal: Nat Biotechnol Date: 2005-12 Impact factor: 54.908

4. The CellML Model Repository.

Authors: Catherine M Lloyd; James R Lawson; Peter J Hunter; Poul F Nielsen
Journal: Bioinformatics Date: 2008-07-25 Impact factor: 6.937

5. libAnnotationSBML: a library for exploiting SBML annotations.

Authors: Neil Swainston; Pedro Mendes
Journal: Bioinformatics Date: 2009-06-26 Impact factor: 6.937

6. STRING 8--a global view on proteins and their functional interactions in 630 organisms.

Authors: Lars J Jensen; Michael Kuhn; Manuel Stark; Samuel Chaffron; Chris Creevey; Jean Muller; Tobias Doerks; Philippe Julien; Alexander Roth; Milan Simonovic; Peer Bork; Christian von Mering
Journal: Nucleic Acids Res Date: 2008-10-21 Impact factor: 16.971

7. Model storage, exchange and integration.

Authors: Nicolas Le Novère
Journal: BMC Neurosci Date: 2006-10-30 Impact factor: 3.288

8. BioMart--biological queries made easy.

Authors: Damian Smedley; Syed Haider; Benoit Ballester; Richard Holland; Darin London; Gudmundur Thorisson; Arek Kasprzyk
Journal: BMC Genomics Date: 2009-01-14 Impact factor: 3.969

9. The universal protein resource (UniProt).

Authors: The UniProt Consortium
Journal: Nucleic Acids Res Date: 2007-11-27 Impact factor: 16.971

10. MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology.

Authors: Camille Laibe; Nicolas Le Novère
Journal: BMC Syst Biol Date: 2007-12-13
