
OntoMaton: a bioportal powered ontology widget for Google Spreadsheets.

Eamonn Maguire, Alejandra González-Beltrán, Patricia L Whetzel, Susanna-Assunta Sansone, Philippe Rocca-Serra.

Abstract

MOTIVATION: Data collection in spreadsheets is ubiquitous, but current solutions lack support for collaborative semantic annotation that would promote shared and interdisciplinary annotation practices, supporting geographically distributed players.
RESULTS: OntoMaton is an open source solution that brings ontology lookup and tagging capabilities into a cloud-based collaborative editing environment, harnessing Google Spreadsheets and the NCBO Web services. It is a general purpose, format-agnostic tool that may serve as a component of the ISA software suite. OntoMaton can also be used to assist the ontology development process.
AVAILABILITY: OntoMaton is freely available from Google widgets under the CPAL open source license; documentation and examples at: https://github.com/ISA-tools/OntoMaton.

Year:  2012        PMID: 23267176      PMCID: PMC3570217          DOI: 10.1093/bioinformatics/bts718

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Well-annotated and shared bioscience research data offer new discovery opportunities and drive the science of the future. Several data management plans and sharing policies have emerged, along with a growing number of community-developed guidelines and ontologies that harmonize the reporting of experiments from different domains so that these are comprehensible and, in turn, reproducible and reusable. In many research projects, however, the generation and collection of experimental data occur in a multicentric, distributed fashion, and a variety of data types are often generated in a single experiment. The use of spreadsheets and related editors, such as Microsoft Excel, for collecting experimental descriptions is widespread among researchers owing to their flexibility, shallow learning curve and, above all, ubiquity. However, misalignment, conflicting versions, the heterogeneity of free text and silent, unwanted ‘auto-corrections’ are major shortcomings to be addressed (Zeeberg et al., 2004). This scenario and the current budgetary restrictions require ‘invest to save’ solutions that promote consistent annotation and collaborative editing of bioscience experiments, assisting researchers in complying with reporting policies and community standards. OntoMaton is an open source tool that leverages the collaborative environment and editing functionalities of Google Spreadsheets, and provides access to the ontology look-up and tagging functionalities served by the NCBO BioPortal and Annotator web services (Jonquet et al., 2010; Whetzel et al., 2011).

2 OntoMaton DESIGN AND USE CASES

Four main use cases drove the development of OntoMaton: (i) to allow collaborative, distributed and coordinated annotation while enabling configurations and restrictions to be defined; (ii) to reduce free text description in metadata tracking of experimental data; (iii) to assist design pattern-based ontology development by facilitating interaction with domain experts; and (iv) to ease mapping between models and semantic representations. Two use cases are discussed in more detail in the next sections: one emphasizing the free form of the widget and its ability to integrate into any layout, agnostic of any framework; the other aligning with a standardization effort, the ISA syntax (Sansone et al., 2012). While ontology-enabled standalone tools exist (Rocca-Serra et al., 2010; Wolstencroft et al., 2011), they lack collaborative features. OntoMaton, with the aid of the Google Spreadsheet environment, delivers these. OntoMaton is implemented in JavaScript on top of the Google Apps Script API and accesses the NCBO RESTful web services. A webcast tutorial on how to use it is available at http://goo.gl/FjghA.
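As an illustration of the kind of RESTful call involved, the following minimal sketch builds a term-search request. It is written in Python for brevity rather than in the JavaScript that OntoMaton uses. It assumes the present-day public BioPortal endpoint (`https://data.bioontology.org/search`) with its `q`, `ontologies` and `apikey` parameters; the paper does not state which URL OntoMaton calls, and the function name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: the current public NCBO BioPortal REST API.
# The paper does not state the URL that OntoMaton itself calls.
BIOPORTAL_SEARCH = "https://data.bioontology.org/search"


def build_search_url(term, api_key, ontologies=()):
    """Return a term-search URL, optionally limited to some ontologies."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Ontology acronyms are sent as a single comma-separated value.
        params["ontologies"] = ",".join(ontologies)
    return BIOPORTAL_SEARCH + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder; a real key is issued per BioPortal account.
url = build_search_url("soil", "YOUR_API_KEY", ["ENVO", "OBI"])
print(url)
```

Only the URL is constructed here, so the sketch runs without network access; fetching and parsing the response is left out.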

3 COLLABORATIVE SEMANTIC ANNOTATION

The OntoMaton Google widget can be installed and invoked from any Google Spreadsheet document or embedded in Google Templates. It provides a facility for searching ontologies hosted at NCBO BioPortal, or for calling a tagging functionality that relies on NCBO’s Annotator services (Jonquet et al., 2010). An OntoMaton-enabled Google spreadsheet can also be configured to restrict the ontological search space to specific resources. While OntoMaton is syntax neutral, its usefulness is best demonstrated when it is exploited by a data management infrastructure, for instance to support the creation of ISA-Tab compatible templates. The ultimate goal is to foster the adoption of spreadsheets that conform to reporting standards for managing descriptions of biological experimental data. Figure 1 details alternative uses of OntoMaton and shows a snippet of an experiment being marked up. Several data management projects with a large existing user base are currently using OntoMaton-based templates to assist with their data collection and management needs. These include: the Earth Microbiome Project (http://goo.gl/JLG5d); Bioplatforms Australia (http://goo.gl/uXLve), with a focus on soil metagenomics sample collection; and MetaboLights (Steinbeck et al., 2012), a repository of metabolite profiling data at the European Bioinformatics Institute.
Fig. 1. Uses of OntoMaton: (1) open the OntoMaton-enabled Google Spreadsheet template from the online gallery; (2) create a standard Google Spreadsheet and install OntoMaton within that; or (3) as part of the ISA suite, export an Excel template from ISAconfigurator and upload it into Google Spreadsheets
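
The idea of restricting the ontological search space for a spreadsheet can be sketched as a lookup from column header to permitted ontologies. This is a hedged illustration only: the paper does not describe how OntoMaton stores its restrictions, so the `RESTRICTIONS` table, the column headers, the function names and the sample hit records below are all hypothetical.

```python
# Hypothetical restriction table: column header -> permitted ontology acronyms.
RESTRICTIONS = {
    "Characteristics[organism]": ["NCBITAXON"],
    "Characteristics[environment]": ["ENVO"],
}


def search_space(column_header, restrictions=RESTRICTIONS):
    """Return the ontologies allowed for a column; None means unrestricted."""
    return restrictions.get(column_header)


def filter_hits(hits, allowed):
    """Keep only the hits whose source ontology is permitted for the column."""
    if allowed is None:
        return list(hits)
    return [hit for hit in hits if hit["ontology"] in allowed]


# Placeholder records standing in for lookup results; not real service output.
hits = [
    {"label": "soil", "ontology": "ENVO"},
    {"label": "soil", "ontology": "NCIT"},
]
kept = filter_hits(hits, search_space("Characteristics[environment]"))
print(kept)
```

A column with no entry in the table is treated as unrestricted, so free-form layouts keep working while configured templates narrow the choice.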

4 COLLABORATIVE ONTOLOGY ENGINEERING

Developing ontologies and knowledge representation artefacts requires the interaction of domain experts and computer scientists. The core interaction consists of converting representations vetted by domain experts (a.k.a. design patterns) into OWL representations through the intervention of knowledge engineers. Tools such as Populous (Jupp et al., 2012) and Protege Mapping Master (http://protege.stanford.edu) have been developed to support these activities. The developers of the Ontology for Biomedical Investigations (OBI) (Brinkman et al.) currently rely on the Quick Term Template approach (Rocca-Serra et al.) to quickly add defined classes based on a template, with the Manchester OWL Syntax used for the mapping. However, owing to the collaborative nature of OBI development, the approach has been hindered by the lack of tools. OntoMaton closes this gap, and several templates have now been documented to support different design patterns. These templates unfold the restrictions of a class model in a table: fields correspond to facet fillers, and cell values should be class names or URIs. By enabling in situ resource lookups, OntoMaton simplifies development, review and curation by the pool of OBI editors.
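
To make the table-to-class mapping concrete, the sketch below renders one template row as a Manchester OWL Syntax class frame, turning each filled cell into an existential (`some`) restriction. It is an assumption-laden illustration, not the Quick Term Template implementation: the row layout, the helper `row_to_manchester`, and the class and filler names are hypothetical, and real templates may use other restriction types.

```python
def row_to_manchester(row):
    """Render one template row as a Manchester OWL Syntax class frame."""
    axioms = [row["parent"]]
    for prop, filler in row["restrictions"]:
        # Each filled cell becomes an existential restriction on the class.
        axioms.append(f"{prop} some {filler}")
    body = ",\n        ".join(axioms)
    return f"Class: {row['class']}\n    SubClassOf:\n        {body}"


# Hypothetical row: one class per row, one column per restricted property.
row = {
    "class": "example_assay",
    "parent": "assay",
    "restrictions": [
        ("has_specified_input", "example_material"),
        ("has_specified_output", "example_data_item"),
    ],
}
frame = row_to_manchester(row)
print(frame)
```

Because cell values are expected to be class names or URIs, an in situ lookup that fills them from an ontology keeps the generated frames free of unresolvable free text.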

5 DISCUSSION

Developed by harnessing the Google Spreadsheet environment and the term lookup and annotation power of the NCBO Web services, OntoMaton is an effective tool assisting both collaborative semantic annotation of experiments and the ontology development process. Several annotation tools exist (Nelson et al., 2011), but not all support community-driven guidelines and ontologies, and none of them allows collaborative annotation. Moreover, Excel-based tools (Jones and Côté, 2008) tend to be platform and version dependent. Google Spreadsheets, on the other hand, work across all platforms. A comparison of tools attempting to mix spreadsheets with access to vocabulary servers is available at http://goo.gl/NV3lZ. Ongoing development of OntoMaton focuses on: (i) transformation of data into the Resource Description Framework and Linked Data; (ii) support for cell-level vocabulary drop-down lists as soon as the Google API supports them; and (iii) further integration with the ISA software suite as requested by users.

Funding: This work was supported by the Biotechnology and Biological Sciences Research Council [grants BB/I025840/1, BB/I000771/1 and BB/I000917/1 to S.A.S.] and the National Institutes of Health [grant U54 HG004028 supporting TW].

Conflict of Interest: none declared.
REFERENCES

1. Philip Jones; Richard Côté. The PRIDE proteomics identifications database: data submission, query, and dataset comparison. Methods Mol Biol, 2008.
2. Katy Wolstencroft; Stuart Owen; Matthew Horridge; Olga Krebs; Wolfgang Mueller; Jacky L Snoep; Franco du Preez; Carole Goble. RightField: embedding ontology annotation in spreadsheets. Bioinformatics, 2011-05-26.
3. Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris; Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level. Bioinformatics, 2010-08-02.
4. Susanna-Assunta Sansone; Philippe Rocca-Serra; Dawn Field; Eamonn Maguire; Chris Taylor; Oliver Hofmann; Hong Fang; Steffen Neumann; Weida Tong; Linda Amaral-Zettler; Kimberly Begley; Tim Booth; Lydie Bougueleret; Gully Burns; Brad Chapman; Tim Clark; Lee-Ann Coleman; Jay Copeland; Sudeshna Das; Antoine de Daruvar; Paula de Matos; Ian Dix; Scott Edmunds; Chris T Evelo; Mark J Forster; Pascale Gaudet; Jack Gilbert; Carole Goble; Julian L Griffin; Daniel Jacob; Jos Kleinjans; Lee Harland; Kenneth Haug; Henning Hermjakob; Shannan J Ho Sui; Alain Laederach; Shaoguang Liang; Stephen Marshall; Annette McGrath; Emily Merrill; Dorothy Reilly; Magali Roux; Caroline E Shamu; Catherine A Shang; Christoph Steinbeck; Anne Trefethen; Bryn Williams-Jones; Katherine Wolstencroft; Ioannis Xenarios; Winston Hide. Toward interoperable bioscience data. Nat Genet, 2012-01-27.
5. Simon Jupp; Matthew Horridge; Luigi Iannone; Julie Klein; Stuart Owen; Joost Schanstra; Katy Wolstencroft; Robert Stevens. Populous: a tool for building OWL ontologies from templates. BMC Bioinformatics, 2012-01-25.
6. Clement Jonquet; Mark A Musen; Nigam H Shah. Building a biomedical ontology recommender web service. J Biomed Semantics, 2010-06-22.
7. Elizabeth K Nelson; Britt Piehler; Josh Eckels; Adam Rauch; Matthew Bellew; Peter Hussey; Sarah Ramsay; Cory Nathe; Karl Lum; Kevin Krouse; David Stearns; Brian Connolly; Tom Skillman; Mark Igra. LabKey Server: an open source platform for scientific data integration, analysis and collaboration. BMC Bioinformatics, 2011-03-09.
8. Patricia L Whetzel; Natalya F Noy; Nigam H Shah; Paul R Alexander; Csongor Nyulas; Tania Tudorache; Mark A Musen. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res, 2011-06-14.
9. Christoph Steinbeck; Pablo Conesa; Kenneth Haug; Tejasvi Mahendraker; Mark Williams; Eamonn Maguire; Philippe Rocca-Serra; Susanna-Assunta Sansone; Reza M Salek; Julian L Griffin. MetaboLights: towards a new COSMOS of metabolomics data management. Metabolomics, 2012-09-25.
10. Barry R Zeeberg; Joseph Riss; David W Kane; Kimberly J Bussey; Edward Uchio; W Marston Linehan; J Carl Barrett; John N Weinstein. Mistaken identifiers: gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics, 2004-06-23.
