MOTIVATION: Data collection in spreadsheets is ubiquitous, but current solutions lack support for collaborative semantic annotation that would promote shared and interdisciplinary annotation practices, supporting geographically distributed players. RESULTS: OntoMaton is an open source solution that brings ontology lookup and tagging capabilities into a cloud-based collaborative editing environment, harnessing Google Spreadsheets and the NCBO Web services. It is a general purpose, format-agnostic tool that may serve as a component of the ISA software suite. OntoMaton can also be used to assist the ontology development process. AVAILABILITY: OntoMaton is freely available from Google widgets under the CPAL open source license; documentation and examples at: https://github.com/ISA-tools/OntoMaton.
1 INTRODUCTION
Well-annotated and shared bioscience research data offer new discovery opportunities and drive the science of the future. Several data management plans and sharing policies have emerged, along with a growing number of community-developed guidelines and ontologies that harmonize the reporting of experiments from different domains so that these are comprehensible and, in turn, reproducible and reusable. In many research projects, however, the generation and collection of experimental data occur in a multicentric, distributed fashion, and a variety of data types are often generated in a single experiment. Spreadsheets and related editors, such as Microsoft Excel, are widely used by researchers for collecting experimental descriptions owing to their flexibility, low learning curve and, above all, the ubiquity of tooling. However, misalignment, conflicting versions, the heterogeneity of free text and silent, unwanted ‘auto-corrections’ are major shortcomings to be addressed (Zeeberg et al., 2004). This scenario and the current budgetary restrictions require ‘invest to save’ solutions that promote consistent annotation and collaborative editing of bioscience experiments, assisting researchers in complying with reporting policies and community standards. OntoMaton is an open source tool that leverages the collaborative environment and editing functionalities of Google Spreadsheets, and provides access to the ontology look-up and tagging functionalities served by the NCBO BioPortal and Annotator web services (Jonquet et al.; Whetzel et al., 2011).
2 OntoMaton DESIGN AND USE CASES
Four main use cases drove the development of OntoMaton: (i) to allow collaborative, distributed and coordinated annotation while enabling configurations and restrictions to be defined; (ii) to reduce free text description in metadata tracking of experimental data; (iii) to assist design pattern-based ontology development by facilitating interaction with domain experts; and (iv) to ease mapping between models and semantic representations. Two use cases are discussed in more detail in the next sections: one emphasizing the free form of the widget and its ability to integrate into any layout, agnostic of any framework; the other aligning with a standardization effort, the ISA syntax (Sansone et al., 2012). While ontology-enabled standalone tools exist (Rocca-Serra et al.; Wolstencroft et al., 2011), they lack collaborative features. OntoMaton, with the aid of the Google Spreadsheet environment, delivers these. OntoMaton is implemented in JavaScript on top of the Google Apps Script API and accesses the NCBO RESTful web services. A webcast tutorial on how to use it is available at http://goo.gl/FjghA.
3 COLLABORATIVE SEMANTIC ANNOTATION
The OntoMaton Google widget can be installed and invoked from any Google Spreadsheet document or embedded in Google Templates. It provides a facility for searching ontologies hosted at NCBO BioPortal, or for calling a tagging functionality that relies on NCBO’s Annotator services (Jonquet et al.). An OntoMaton-enabled Google spreadsheet can also be configured to restrict the ontological search space to specific resources. While OntoMaton is syntax neutral, its usefulness is demonstrated when it is exploited by a data management infrastructure, for instance to support the creation of ISA-Tab compatible templates. The ultimate goal is to foster the adoption of spreadsheets that conform to reporting standards for managing descriptions of biological experimental data. Figure 1 details alternative ways of using OntoMaton and shows a snippet of an experiment being marked up. Several data management projects with a large existing user base are currently using OntoMaton-based templates to assist with their data collection and management needs. These include: the Earth Microbiome Project (http://goo.gl/JLG5d); Bioplatforms Australia (http://goo.gl/uXLve), with a focus on soil metagenomics sample collection; and MetaboLights (Steinbeck et al., 2012), a repository of metabolite profiling data at the European Bioinformatics Institute.
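The term lookup and tagging calls described above, including the restriction of the search space to specific ontologies, can be sketched as plain URL construction. This is a minimal illustration and not OntoMaton's own code: the base URL `https://data.bioontology.org` and the `q`, `text`, `ontologies` and `apikey` parameters are those of the present-day public BioPortal REST API, which may differ from the endpoints the widget called at the time of writing. The function names, the ontology acronyms and the `YOUR_API_KEY` placeholder are assumptions made for the example.

```javascript
// Sketch only: builds request URLs for the public NCBO BioPortal REST API.
// Base URL and parameter names follow the present-day service and are an
// assumption with respect to what OntoMaton itself calls.
const BIOPORTAL_BASE = "https://data.bioontology.org";

// Add the optional ontology restriction and the API key to a URL.
function addCommonParams(url, ontologyAcronyms, apiKey) {
  if (ontologyAcronyms.length > 0) {
    // A comma-separated list of acronyms restricts the search space.
    url.searchParams.set("ontologies", ontologyAcronyms.join(","));
  }
  url.searchParams.set("apikey", apiKey);
  return url.toString();
}

// Build a term-search URL for a keyword typed into a cell.
function buildSearchUrl(term, ontologyAcronyms = [], apiKey = "YOUR_API_KEY") {
  const url = new URL("/search", BIOPORTAL_BASE);
  url.searchParams.set("q", term);
  return addCommonParams(url, ontologyAcronyms, apiKey);
}

// Build an Annotator URL that tags a free-text passage with ontology terms.
function buildAnnotatorUrl(text, ontologyAcronyms = [], apiKey = "YOUR_API_KEY") {
  const url = new URL("/annotator", BIOPORTAL_BASE);
  url.searchParams.set("text", text);
  return addCommonParams(url, ontologyAcronyms, apiKey);
}

// Example: search for "soil" in two (illustrative) ontologies only.
console.log(buildSearchUrl("soil", ["ENVO", "OBI"]));
```

Inside Google Apps Script, a URL built this way could be passed to `UrlFetchApp.fetch(url)` and the JSON response parsed with `JSON.parse`; outside that environment, any HTTP client would do.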
Fig. 1.
Uses of OntoMaton: (1) open the OntoMaton-enabled Google Spreadsheet template from the online gallery; (2) create a standard Google Spreadsheet and install OntoMaton within that; or (3) as part of the ISA suite, export an Excel template from ISAconfigurator and upload it into Google Spreadsheets
4 COLLABORATIVE ONTOLOGY ENGINEERING
Developing ontologies and knowledge representation artefacts requires the interaction of domain experts and computer scientists. The core interaction consists of converting representations vetted by domain experts (a.k.a. design patterns) into OWL representations through the intervention of knowledge engineers. Tools such as Populous (Jupp et al., 2012) and Protege Mapping Master (http://protege.stanford.edu) have been developed to support these activities. The developers of the Ontology of Biomedical Investigations (OBI) (Brinkman et al.) currently rely on the Quick Term Template approach (Rocca-Serra et al.) to quickly add defined classes based on a template, with the Manchester OWL Syntax used for the mapping. However, owing to the collaborative nature of OBI development, the approach has been hindered by the lack of tools that support collaboration. OntoMaton closes this gap, and several templates have now been documented to support different design patterns. Those templates unfold the restrictions of a class model in a table: fields correspond to facet fillers, and cell values should be class names or URIs. By enabling in situ resource lookups, OntoMaton simplifies development, review and curation by the pool of OBI editors.
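To make the idea of a template that "unfolds" a class model more concrete, the following sketch turns one spreadsheet row into a defined class written in the Manchester OWL Syntax. It is an illustration under stated assumptions rather than the Quick Term Template implementation: the row layout (`label`, `parent`, `restrictions`), the function names and the example terms are hypothetical, every restriction is assumed to be existential (`some`), and labels containing apostrophes are not handled.

```javascript
// Sketch only: the row structure and term labels below are hypothetical
// and are not taken from the actual OBI Quick Term Templates.

// Wrap a label in single quotes, as Manchester OWL Syntax expects for
// names that contain spaces. Apostrophes inside labels are not handled.
function quoteLabel(label) {
  return "'" + label + "'";
}

// Convert one template row into a Manchester OWL Syntax class frame.
// The parent class and each (property, filler) pair are combined into a
// single intersection used as the EquivalentTo definition.
function rowToManchester(row) {
  const parts = [quoteLabel(row.parent)];
  for (const r of row.restrictions) {
    parts.push(
      "(" + quoteLabel(r.property) + " some " + quoteLabel(r.filler) + ")"
    );
  }
  return (
    "Class: " + quoteLabel(row.label) + "\n" +
    "    EquivalentTo: " + parts.join(" and ")
  );
}

// Example row: each cell value would normally be picked via term lookup.
const exampleRow = {
  label: "glucose assay",
  parent: "assay",
  restrictions: [{ property: "has specified input", filler: "glucose" }],
};

console.log(rowToManchester(exampleRow));
```

In a collaborative setting, the benefit of in situ lookup is that the `parent`, `property` and `filler` cells can hold terms resolved against existing ontologies rather than free text, so a conversion step like this one has unambiguous inputs.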
5 DISCUSSION
Developed by harnessing the Google Spreadsheet environment and the term lookup and annotation power of the NCBO Web services, OntoMaton is an effective tool that assists both the collaborative semantic annotation of experiments and the ontology development process. Several annotation tools exist (Nelson et al., 2011), but not all support community-driven guidelines and ontologies, and none of them allows collaborative annotation. Moreover, Excel-based tools (Jones and Côté, 2008) tend to be platform and version dependent. Google Spreadsheets, on the other hand, work across all platforms. A comparison of tools attempting to mix spreadsheets with access to vocabulary servers is available at http://goo.gl/NV3lZ. Ongoing development of OntoMaton focuses on: (i) transformation of data into the Resource Description Framework and Linked Data; (ii) support for cell-level vocabulary drop-down lists as soon as the Google API supports them; and (iii) further integration with the ISA software suite as requested by users.
Funding: This work was supported by the Biotechnology and Biological Sciences Research Council [grants BB/I025840/1, BB/I000771/1 and BB/I000917/1 to S.A.S.] and the National Institutes of Health [grant U54 HG004028 supporting T.W.].
Conflict of Interest: none declared.
Authors: Katy Wolstencroft; Stuart Owen; Matthew Horridge; Olga Krebs; Wolfgang Mueller; Jacky L Snoep; Franco du Preez; Carole Goble Journal: Bioinformatics Date: 2011-05-26 Impact factor: 6.937
Authors: Susanna-Assunta Sansone; Philippe Rocca-Serra; Dawn Field; Eamonn Maguire; Chris Taylor; Oliver Hofmann; Hong Fang; Steffen Neumann; Weida Tong; Linda Amaral-Zettler; Kimberly Begley; Tim Booth; Lydie Bougueleret; Gully Burns; Brad Chapman; Tim Clark; Lee-Ann Coleman; Jay Copeland; Sudeshna Das; Antoine de Daruvar; Paula de Matos; Ian Dix; Scott Edmunds; Chris T Evelo; Mark J Forster; Pascale Gaudet; Jack Gilbert; Carole Goble; Julian L Griffin; Daniel Jacob; Jos Kleinjans; Lee Harland; Kenneth Haug; Henning Hermjakob; Shannan J Ho Sui; Alain Laederach; Shaoguang Liang; Stephen Marshall; Annette McGrath; Emily Merrill; Dorothy Reilly; Magali Roux; Caroline E Shamu; Catherine A Shang; Christoph Steinbeck; Anne Trefethen; Bryn Williams-Jones; Katherine Wolstencroft; Ioannis Xenarios; Winston Hide Journal: Nat Genet Date: 2012-01-27 Impact factor: 38.330
Authors: Simon Jupp; Matthew Horridge; Luigi Iannone; Julie Klein; Stuart Owen; Joost Schanstra; Katy Wolstencroft; Robert Stevens Journal: BMC Bioinformatics Date: 2012-01-25 Impact factor: 3.169
Authors: Elizabeth K Nelson; Britt Piehler; Josh Eckels; Adam Rauch; Matthew Bellew; Peter Hussey; Sarah Ramsay; Cory Nathe; Karl Lum; Kevin Krouse; David Stearns; Brian Connolly; Tom Skillman; Mark Igra Journal: BMC Bioinformatics Date: 2011-03-09 Impact factor: 3.307
Authors: Patricia L Whetzel; Natalya F Noy; Nigam H Shah; Paul R Alexander; Csongor Nyulas; Tania Tudorache; Mark A Musen Journal: Nucleic Acids Res Date: 2011-06-14 Impact factor: 16.971
Authors: Christoph Steinbeck; Pablo Conesa; Kenneth Haug; Tejasvi Mahendraker; Mark Williams; Eamonn Maguire; Philippe Rocca-Serra; Susanna-Assunta Sansone; Reza M Salek; Julian L Griffin Journal: Metabolomics Date: 2012-09-25 Impact factor: 4.290
Authors: Barry R Zeeberg; Joseph Riss; David W Kane; Kimberly J Bussey; Edward Uchio; W Marston Linehan; J Carl Barrett; John N Weinstein Journal: BMC Bioinformatics Date: 2004-06-23 Impact factor: 3.169