Kristina M Hettne1, Harish Dharuri1, Jun Zhao2, Katherine Wolstencroft3, Khalid Belhajjame4, Stian Soiland-Reyes4, Eleni Mina1, Mark Thompson1, Don Cruickshank2, Lourdes Verdes-Montenegro5, Julian Garrido5, David de Roure2, Oscar Corcho6, Graham Klyne2, Reinout van Schouwen1, Peter A C 't Hoen1, Sean Bechhofer4, Carole Goble4, Marco Roos1.
Abstract
BACKGROUND: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the metadata a scientist needs to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows.
Keywords: Digital libraries; Genome wide association study; Scientific workflows; Semantic web models
Year: 2014 PMID: 25276335 PMCID: PMC4177597 DOI: 10.1186/2041-1480-5-41
Source DB: PubMed Journal: J Biomed Semantics
Figure 1. An overview of the Minim model. An overview of the four components: a constraint, a model, a requirement, and a rule.
Figure 2. Screenshots from myExperiment illustrating the process of creating a Research Object placeholder. Before pressing the “create” button the user can enter a title and description (A), while pressing the “create” button will result in a placeholder Research Object with an identifier (B).
Figure 3. Workflow sketch. A workflow sketch showing that our experiment follows two paths to interpret genome wide association study results: matching with concept profiles and matching with KEGG pathways.
Figure 4. Screenshot of the results from the second check with the checklist evaluation service. The results from the checklist evaluation service show that the Research Object satisfies the defined checklist for a Research Object.
Figure 5. Screenshot of the relationships in the RO in myExperiment. The relationships between example inputs and workflows in the Research Object have been defined in myExperiment.
Figure 6. Taverna workflow diagram for the KEGG workflow. Blue boxes are workflow inputs, brown boxes are scripts, grey boxes are constant values, green boxes are Web services, purple boxes are Taverna internal services, and pink boxes are nested workflows.
Figure 7. Taverna workflow diagram for the concept profile mining workflow. Blue boxes are workflow inputs, purple boxes are Taverna internal services, and pink boxes are nested workflows.
Figure 8. Taverna workflow annotation example. An example of an annotation of the purpose of a nested workflow in Taverna.
Figure 9. Simplified diagram showing part of the Research Object for our experiment. The Research Object contains the items that were aggregated by the “Research Object-enabled” version of myExperiment. Shown is the part of the RDF graph that aggregates and annotates the KEGG pathway mining workflow.
RO items checklist
| Research object item | Requirement | RO ontology term |
|---|---|---|
| Hypothesis or research question | Should | roterms:Hypothesis / roterms:ResearchQuestion |
| Workflow sketch | Should | roterms:Sketch |
| One or more workflows | Must | wfdesc:Workflow |
| Web services of the workflow | Must | wfdesc:Process |
| Example input data | Must | roterms:exampleValue |
| Provenance of workflow runs | Must | wfprov:WorkflowRun |
| Example results | Must | roterms:Result |
| Conclusions | Must | roterms:Conclusion |
RO items for a workflow-based experiment annotated with the appropriate term from the Minim vocabulary.
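The Must/Should distinction in the checklist above can be illustrated with a minimal Python sketch. It is not the Minim model or the checklist evaluation service described in the paper: the `ChecklistItem` class and the `evaluate` function are hypothetical names introduced here, and the sketch assumes that the terms annotating an RO have already been collected into a plain set of strings. Term spellings follow the table, with whitespace inside term names removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One row of the checklist table (hypothetical helper type)."""
    label: str
    level: str    # "Must" or "Should", as in the table
    terms: tuple  # any one of these terms satisfies the item


# Rows copied from the "RO items checklist" table.
CHECKLIST = (
    ChecklistItem("Hypothesis or research question", "Should",
                  ("roterms:Hypothesis", "roterms:ResearchQuestion")),
    ChecklistItem("Workflow sketch", "Should", ("roterms:Sketch",)),
    ChecklistItem("One or more workflows", "Must", ("wfdesc:Workflow",)),
    ChecklistItem("Web services of the workflow", "Must", ("wfdesc:Process",)),
    ChecklistItem("Example input data", "Must", ("roterms:exampleValue",)),
    ChecklistItem("Provenance of workflow runs", "Must", ("wfprov:WorkflowRun",)),
    ChecklistItem("Example results", "Must", ("roterms:Result",)),
    ChecklistItem("Conclusions", "Must", ("roterms:Conclusion",)),
)


def evaluate(found_terms):
    """Report which checklist items are not covered by the given terms.

    Assumption: `found_terms` is an iterable of term strings already
    extracted from the RO annotations; extraction itself is out of scope.
    """
    found = set(found_terms)

    def missing(level):
        return [item.label for item in CHECKLIST
                if item.level == level and not found.intersection(item.terms)]

    missing_must = missing("Must")
    return {
        # Only unmet "Must" items make the RO fail in this sketch;
        # unmet "Should" items are reported but do not cause failure.
        "satisfied": not missing_must,
        "missing_must": missing_must,
        "missing_should": missing("Should"),
    }
```

Calling `evaluate` with the six Must terms and neither Should term reports the RO as satisfied while listing the hypothesis and the workflow sketch as missing recommendations. Treating Should items as non-blocking is a design choice of this sketch, not a statement about how the paper's service scores them.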
Figure 10. Screenshot showing a SPARQL query and its results. Query to obtain a reference to the data that was used as input to our workflows and the conclusions that we drew from evaluating the workflow results.