Sheeba Samuel, Birgitta König-Ries.
Abstract
BACKGROUND: Advances in science and technology play an immense role in the way scientific experiments are conducted. Understanding how experiments are performed and how results are derived has become significantly more complex with the recent explosive growth of heterogeneous research data and methods. It is therefore important that the provenance of results is tracked, described, and managed throughout the research lifecycle, from the beginning of an experiment to its end, to ensure the reproducibility of results described in publications. However, there is a lack of interoperable representations of the end-to-end provenance of scientific experiments that interlink data, processing steps, and results from an experiment's computational and non-computational processes.
Keywords: Experiments; Jupyter notebooks; Ontology; Provenance; Reproducibility; Semantic web
Year: 2022 PMID: 34991705 PMCID: PMC8734275 DOI: 10.1186/s13326-021-00253-1
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 An overall view of the scientific experiments and practices
Fig. 2 The expanded view of the REPRODUCE-ME data model used to represent a scientific experiment
Fig. 3 A scientific experiment depicted using the REPRODUCE-ME ontology [56]
Overview of the ontology terms to model script and computational notebooks provenance
| Component (ontology term) | Description | Provenance | Remarks |
|---|---|---|---|
| Script | Program or code that is used in a scientific experiment | Prospective | Subclass of |
| Function | A programming language code snippet | Prospective | Subclass of |
| Module | A part of a computer program or software which provides declarations and functions | Prospective | Subclass of |
| Module Version | The version of a module | Retrospective | Subclass of |
| Argument | The parameter taken as an input, or declared/used in a script | Retrospective | Subclass of |
| Input | The variable used as an input to a script or a function | Retrospective | Subclass of |
| Output | The variable generated as an output of a script or a function | Retrospective | Subclass of |
| Programming Language | The programming language in which a script is written | Prospective | Subclass of |
| Programming Language Version | The version of the programming language in which a script is written | Retrospective | Subclass of |
| Operating System | The operating system where the script is run | Retrospective | Subclass of |
| Operating System Version | The version of the operating system where the script is run | Retrospective | Subclass of |
| Author | The person who is the author of the script | Prospective | Subclass of |
| Function Activation | Denotes when a function is activated or run | Retrospective | Subclass of |
| Trial | Denotes a run or execution of a script | Retrospective | Subclass of |
| Start Time | Denotes the time when the script starts executing | Retrospective | Data property |
| Finish Time | Denotes the time when the script finishes its execution | Retrospective | Data property |
| Experimenter | Denotes the person who is executing the script | Retrospective | Subclass of |
| Location | Denotes the location where the script is executed | Retrospective | Using |
| Accessed File | Denotes the files that are accessed during the script execution | Retrospective | Subclass of |
| Order of execution | Denotes how the functions are executed inside a script | Retrospective | Object property |
| Experiment | Denotes the scientific experiment in which the script was used to perform data computation to produce results | Prospective | Subclass of |
| Notebook | A computational notebook used in an experiment | Prospective | Subclass of |
| Cell | A multiline text input field in a computational notebook | Prospective | Subclass of |
| Source | The input of each cell | Retrospective | Subclass of |
| CellExecution | Denotes an execution of a cell | Retrospective | Subclass of |
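The distinction the table draws between prospective provenance (what is planned to run) and retrospective provenance (what actually happened during a run) can be illustrated with a minimal Python sketch. This is not the REPRODUCE-ME ontology itself: the record types `Script` and `Trial`, the helper `terms_of_kind`, and all sample values (`analysis.py`, `input.csv`, the person names, the timestamps) are hypothetical and chosen only for illustration. The provenance kinds in `PROVENANCE_KIND` are copied from a few rows of the table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Provenance kind for a handful of terms, copied from the table above.
PROVENANCE_KIND = {
    "Script": "prospective",
    "Function": "prospective",
    "Notebook": "prospective",
    "Cell": "prospective",
    "Trial": "retrospective",
    "CellExecution": "retrospective",
    "Accessed File": "retrospective",
}


@dataclass
class Script:
    """Prospective provenance: the code that is meant to be run (hypothetical type)."""
    name: str
    programming_language: str
    author: str


@dataclass
class Trial:
    """Retrospective provenance: one execution of a script (hypothetical type)."""
    script: Script
    experimenter: str
    start_time: datetime
    finish_time: datetime
    accessed_files: List[str] = field(default_factory=list)

    def duration_seconds(self) -> float:
        # Elapsed time between the recorded start and finish of the run.
        return (self.finish_time - self.start_time).total_seconds()


def terms_of_kind(kind: str) -> List[str]:
    """Return the table terms of the given provenance kind, sorted by name."""
    return sorted(term for term, k in PROVENANCE_KIND.items() if k == kind)


# Illustrative values only; none of these come from the paper.
script = Script(name="analysis.py", programming_language="Python", author="A. Author")
trial = Trial(
    script=script,
    experimenter="E. Experimenter",
    start_time=datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    finish_time=datetime(2022, 1, 1, 12, 0, 30, tzinfo=timezone.utc),
    accessed_files=["input.csv"],
)
```

Keeping the script description separate from each run mirrors the table: one `Script` can be linked to many `Trial` records, each with its own experimenter, times, and accessed files.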
Fig. 4 The semantic representation of a computational notebook [59]
Fig. 5 The steps involved in an experiment which used the Plasmid ‘pCherry-RAD54’
Fig. 6 Complete path taken by a scientist for a computational notebook experiment: The corresponding SPARQL query and a part of results
Fig. 7 Complete path taken by a scientist for an experiment: The corresponding SPARQL query and a part of results