David Johnson, Dominique Batista, Keeva Cochrane, Robert P Davey, Anthony Etuk, Alejandra Gonzalez-Beltran, Kenneth Haug, Massimiliano Izzo, Martin Larralde, Thomas N Lawson, Alice Minotto, Pablo Moreno, Venkata Chandrasekhar Nainala, Claire O'Donovan, Luca Pireddu, Pierrick Roger, Felix Shaw, Christoph Steinbeck, Ralf J M Weber, Susanna-Assunta Sansone, Philippe Rocca-Serra.
Abstract
BACKGROUND: The Investigation/Study/Assay (ISA) Metadata Framework is an established and widely used set of open source community specifications and software tools for enabling discovery, exchange, and publication of metadata from experiments in the life sciences. The original ISA software suite provided a set of user-facing Java tools for creating and manipulating the information structured in ISA-Tab, a now widely used tabular format. To make the ISA framework more accessible to machines and enable programmatic manipulation of experiment metadata, the JSON serialization ISA-JSON was developed.
Keywords: API; life science; metadata; open-source software; reproducibility; standards
Year: 2021 PMID: 34528664 PMCID: PMC8444265 DOI: 10.1093/gigascience/giab060
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: An overview of the ISA data model showing its key constituent classes and their relationships with one another. The model is structured around the concepts of investigation, study, and assay (in red). Other model elements exist to qualify these core elements (in green), e.g., relating investigations and sub-studies with relevant contact persons or related publications; or information about the study design used and any experimental protocols implemented. Experimental workflows are modelled as sequences of process events with inputs and outputs that correspond to biological materials and data objects (in blue). Values can be made explicit by using term annotations or unit declarations that reference published ontologies (in orange).
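To make the last point of the caption concrete, the following is a minimal sketch of how a value might be qualified by an ontology term annotation or a unit declaration. The classes `OntologyTerm` and `AnnotatedValue` are hypothetical stand-ins written for illustration only; they are not the class model shipped with the ISA API, and the accessions shown are given purely as examples.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class OntologyTerm:
    """Hypothetical term annotation: a label plus where it is defined."""
    term: str        # human-readable label
    source: str      # name of the ontology that publishes the term
    accession: str   # identifier of the term within that ontology


@dataclass(frozen=True)
class AnnotatedValue:
    """Hypothetical value that is either an ontology term or a number with a unit."""
    category: str
    value: Union[str, float, OntologyTerm]
    unit: Optional[OntologyTerm] = None

    def render(self) -> str:
        # Use the term label when the value itself is an ontology term.
        if isinstance(self.value, OntologyTerm):
            text = self.value.term
        else:
            text = str(self.value)
        # Append the unit label when a unit declaration is present.
        if self.unit is not None:
            text = f"{text} {self.unit.term}"
        return f"{self.category}: {text}"


# A categorical value made explicit through a term annotation.
organism = AnnotatedValue(
    "organism", OntologyTerm("Homo sapiens", "NCBITaxon", "NCBITaxon:9606")
)
# A numeric value made explicit through a unit declaration.
age = AnnotatedValue("age", 42, unit=OntologyTerm("year", "UO", "UO:0000036"))

print(organism.render())
print(age.render())
```

Keeping the ontology source and accession alongside the label is what lets a consumer resolve the value unambiguously rather than relying on free text.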
Figure 2: The ISA API and its interactions with other software ecosystems and data formats. Apart from running in the Python interpreter, the ISA API can also be accessed through IPython, Jupyter, and Docker. It also supports interoperation with other systems through standardized machine-readable data formats.
Figure 3: Example of a Python script using the ISA API's class model objects to programmatically construct metadata about a study and serialize it to ISA-Tab. This script first creates the Investigation and Study structures that store general metadata about the experiment being described. Next, a source material object and 3 sample material objects are created. These are connected as inputs and outputs, respectively, to a sample collection process, forming a workflow from source to samples. This process is then added to the study's “process sequence”: a container for experimental process event descriptions. Finally, the composition of the model objects is serialized as ISA-Tab to the standard output. Scripts such as these can form part of a larger Python software program, or be executed directly from the command line, to automate the construction of ISA metadata descriptors.
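The figure itself is not reproduced here, so the steps in the caption can be sketched with standard-library stand-ins. This is an illustrative sketch only: the classes below (`Material`, `Process`, `Study`, `Investigation`) and the helper `to_table` are hypothetical and deliberately simplified; they are not the ISA API's own classes, and the tab-separated output is a reduced three-column table rather than complete ISA-Tab.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Material:
    """Hypothetical stand-in for a source or sample material."""
    name: str


@dataclass
class Process:
    """One experimental process event linking inputs to outputs."""
    protocol: str
    inputs: List[Material] = field(default_factory=list)
    outputs: List[Material] = field(default_factory=list)


@dataclass
class Study:
    identifier: str
    # Container for experimental process event descriptions.
    process_sequence: List[Process] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    studies: List[Study] = field(default_factory=list)


def to_table(study: Study) -> str:
    """Flatten every input/output pair of each process into one row."""
    rows = ["Source Name\tProtocol REF\tSample Name"]
    for process in study.process_sequence:
        for src in process.inputs:
            for out in process.outputs:
                rows.append(f"{src.name}\t{process.protocol}\t{out.name}")
    return "\n".join(rows)


# 1. Create the Investigation and Study structures.
investigation = Investigation(identifier="i1")
study = Study(identifier="s1")
investigation.studies.append(study)

# 2. Create one source material and three sample materials.
source = Material("source_material")
samples = [Material(f"sample_material-{i}") for i in range(3)]

# 3. Connect them through a sample collection process.
collection = Process(protocol="sample collection", inputs=[source], outputs=samples)

# 4. Add the process to the study's process sequence.
study.process_sequence.append(collection)

# 5. Serialize the simplified table to standard output.
table = to_table(study)
print(table)
```

Modelling the workflow as processes with explicit inputs and outputs means the tabular view is derived from the graph, so each sample row can be traced back to the source it came from.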
Figure 4: Example Jupyter Notebook that uses ISA API class objects to programmatically construct metadata about a study, using similar code to that shown in Fig. 3. Being Python-based, the ISA API can integrate with any notebook environment that supports Python kernels, including Jupyter Notebook, JupyterLab, and JupyterHub, and proprietary notebook environments such as Google Colab [51], Microsoft Azure Notebooks [52], and Amazon's SageMaker [53]. A set of Jupyter notebooks detailing how to use key ISA API functionalities is available on GitHub and the isa-cookbook [54,55].
Figure 5: The ISA Create-Validate-Upload workflow, published as part of the PhenoMeNal platform Dalcotidine release. The workflow takes a user-configured study plan and creates an ISA-Tab template ready for the experimentalist to use in their study. The ISA-Tab then goes through 2 paths of processing: (i) a summary of study factors according to the study design is extracted and then visualized as a parallel sets plot; and (ii) the ISA-Tab is validated and, if valid, a pre-submission request is made by uploading the ISA-Tab to MetaboLights Labs. A preparatory study accession ID is then issued by MetaboLights if the request is accepted. Parts of the workflow that directly use the ISA API are highlighted in green along with the packages used.
Figure 6: Download statistics for the isatools Python package from PyPI from 21 February 2016 (first release of ISA API, v0.1) to 24 October 2020 (after release v0.11). Note that the total number of downloads for 2020 was incomplete at the time the data were collated. A, Bar chart depicting the annual total downloads of isatools. B, Line chart depicting the monthly total number of downloads of isatools with major releases of the ISA API indicated with vertical dashed lines.