Maxwell Adam Levinson, Justin Niestroy, Sadnan Al Manir, Karen Fairchild, Douglas E Lake, J Randall Moorman, Timothy Clark.
Abstract
Results of computational analyses require transparent disclosure of their supporting resources, yet the analyses themselves can often be very large in scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis should include not only a textual description but also a formal record of the computations that produced the result, including accessible data and software with runtime parameters, environment, and personnel involved. This article describes FAIRSCAPE, a reusable computational framework that provides simplified access to modern, scalable, cloud-based components. FAIRSCAPE fully implements the FAIR data principles and extends them to provide fully FAIR Evidence, including machine-interpretable provenance of datasets, software, and computations, as metadata for all computed results. The FAIRSCAPE microservices framework creates a complete Evidence Graph for every computational result, including persistent identifiers with metadata that resolve to the software, computations, and datasets used in the computation, and stores a URI to the root of the graph in the result's metadata. An ontology for Evidence Graphs, EVI (https://w3id.org/EVI), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves provenance across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects, including software, are assigned persistent IDs. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software.
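To make the Evidence Graph idea concrete, the following is a minimal sketch rather than FAIRSCAPE's actual data model or API. It assumes a nested, JSON-LD-like dictionary in which a result points to the computation that generated it, and each computation points to the software and dataset it used. The identifiers (`ark:/99999/...`) are placeholders, and the property names `generatedBy`, `usedSoftware`, and `usedDataset` are illustrative assumptions, not terms confirmed from the EVI ontology.

```python
# Hypothetical evidence graph for one result; all identifiers and
# property names are illustrative assumptions, not FAIRSCAPE output.
evidence_graph = {
    "@id": "ark:/99999/example-heatmap",
    "@type": "Dataset",
    "generatedBy": {
        "@id": "ark:/99999/example-clustering-run",
        "@type": "Computation",
        "usedSoftware": {
            "@id": "ark:/99999/example-clustering-script",
            "@type": "Software",
        },
        "usedDataset": {
            "@id": "ark:/99999/example-features",
            "@type": "Dataset",
            "generatedBy": {
                "@id": "ark:/99999/example-feature-run",
                "@type": "Computation",
                "usedSoftware": {
                    "@id": "ark:/99999/example-feature-script",
                    "@type": "Software",
                },
                "usedDataset": {
                    "@id": "ark:/99999/example-vital-signs",
                    "@type": "Dataset",
                },
            },
        },
    },
}

# Assumed link properties that lead from a node to its supporting evidence.
LINKS = ("generatedBy", "usedSoftware", "usedDataset")


def collect_provenance(node, found=None):
    """Depth-first walk from the root; returns (id, type) for every node reached."""
    if found is None:
        found = []
    found.append((node["@id"], node["@type"]))
    for key in LINKS:
        child = node.get(key)
        if child is not None:
            collect_provenance(child, found)
    return found


provenance = collect_provenance(evidence_graph)
for identifier, kind in provenance:
    print(f"{kind:12s} {identifier}")
```

Starting from the root identifier stored with a result, a walk of this kind reaches every dataset, computation, and piece of software that supports it, which is the property the abstract describes for a complete Evidence Graph.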
Keywords: Argumentation; Digital Commons; Evidence graph; FAIR data; FAIR software; Provenance; Reproducibility
Year: 2021 PMID: 34264488 PMCID: PMC8760356 DOI: 10.1007/s12021-021-09529-4
Source DB: PubMed Journal: Neuroinformatics ISSN: 1539-2791
Fig. 1 FAIRSCAPE architectural layers and components
Fig. 2 NICU HCTSA clustering heatmap. The X and Y axes are operations (algorithms using specific parameter sets); color is the correlation between algorithms. The large white squares are clusters of highly correlated operations, which suggest that the dimension of the data may be greatly reduced by selecting “representative” algorithms from these clusters
Fig. 3 Simplified Evidence Graph for one patient’s computations. Vital signs = dark blue box bottom right; computations = yellow boxes; processed data = dark blue box in middle; green box = heatmap of correlations
Fig. 4 JSON-LD Evidence Graph for patient computation as illustrated in Fig. 3
Fig. 5 Evidence Graph visualization for the neuroimaging workflow execution