Denis Yuen1, Louise Cabansay2, Andrew Duncan1, Gary Luu1, Gregory Hogue1, Charles Overbeck2, Natalie Perez2, Walt Shands2, David Steinberg2, Chaz Reid2, Nneka Olunwa2, Richard Hansen2, Elizabeth Sheets2, Ash O'Farrell2, Kim Cullion1, Brian D O'Connor3, Benedict Paten2, Lincoln Stein1.
Abstract
Dockstore (https://dockstore.org/) is an open source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute's GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.
Year: 2021 PMID: 33978761 PMCID: PMC8218198 DOI: 10.1093/nar/gkab346
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Dockstore makes computational analysis accessible and reproducible by combining containers, descriptor languages, and test parameter files to simplify software reuse and dependency management.
Dockstore's support for FAIR principles
| Findable | Accessible | Interoperable | Reusable |
|---|---|---|---|
| All runtime needs and metadata are packaged together, parsed, and indexed for robust searching, with the option to generate DOIs. | Dockstore never requires a user to log in to search and inspect the contents of workflows and tools. Links to source repositories are always provided. | Standardized APIs and agnostic support of multiple languages and repositories enable the simple launching of workflows on a variety of compute platforms. | Ready-to-use, version-controlled portability using containers and human-readable workflow languages, with provided test files and documentation to simplify reproducibility. |
For the WDL (10) workflow language, Dockstore offers Launch with DNAstack (https://www.dnastack.com/), DNAnexus (https://www.dnanexus.com/), Terra (https://terra.bio/), FireCloud (http://firecloud.terra.bio) through Terra's integration, NHLBI BioData Catalyst (https://biodatacatalyst.nhlbi.nih.gov/), and AnVIL (https://anvilproject.org/). For the CWL (11) workflow language, Dockstore offers Launch with the Cancer Genomics Cloud (https://www.cancergenomicscloud.org/), Cavatica (https://cavatica.squarespace.com/) powered by Seven Bridges Genomics, and NHLBI BioData Catalyst.
| Cloud platform | Languages | Academic or commercial | Browser launch from Dockstore | Launch from within Cloud Platform |
|---|---|---|---|---|
| DNAstack | WDL | Commercial | Yes | Yes |
| DNAnexus | WDL | Commercial | Yes | |
| Terra | WDL | Academic | Yes | |
| FireCloud | WDL | Academic | Yes, via Terra | |
| Cancer Genomics Cloud (CGC) | CWL | Partnership | Yes | |
| AnVIL | WDL | Academic | Yes | |
| NHLBI BioData Catalyst | CWL, WDL | Partnership | Yes | |
| Cavatica | CWL | Partnership | Yes | |
| Galaxy Project | Galaxy | Academic | | Yes |
Figure 2. Dockstore can register workflows in three main ways: from source control, stored directly on dockstore.org, or as tools with descriptors found via quay.io.
Figure 3. Dockstore contributed to the development of GA4GH TRS, an API that it uses for distributing workflows to Launch with partners and tools such as CWL’s cwltool. (A) Currently, cloud analysis environments use proprietary APIs or custom scripts to access tools. This makes it difficult to publish tools in one place and use them in different cloud analysis environments. (B) The TRS (Tool Registry Service) API provides a standard way to retrieve standard workflows from multiple cloud environments. TRS also provides a channel for different groups to share tools. Courtesy of Stephanie Li, GA4GH.
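
To make the TRS idea in Figure 3 concrete, the sketch below shows how a client might build the address of a workflow descriptor under the GA4GH TRS v2 path layout. It is illustrative only and makes no network request. The base URL, the helper name `trs_descriptor_url`, and the workflow identifier and version are assumptions chosen for the example and do not come from the text above.

```python
from urllib.parse import quote

# Assumed base URL of Dockstore's TRS v2 endpoint (not stated in the text above).
TRS_BASE = "https://dockstore.org/api/ga4gh/trs/v2"


def trs_descriptor_url(tool_id: str, version: str, descriptor_type: str,
                       base: str = TRS_BASE) -> str:
    """Build the TRS v2 URL for the primary descriptor of one tool version.

    Hypothetical helper: it follows the TRS v2 path pattern
    /tools/{id}/versions/{version_id}/{type}/descriptor.
    """
    # TRS ids can contain '#' and '/', so each path segment is percent-encoded.
    encoded_id = quote(tool_id, safe="")
    encoded_version = quote(version, safe="")
    return (
        f"{base}/tools/{encoded_id}"
        f"/versions/{encoded_version}/{descriptor_type}/descriptor"
    )


# Hypothetical workflow id and version, for illustration only.
url = trs_descriptor_url(
    "#workflow/github.com/example-org/example-workflow", "v1.0.0", "WDL"
)
print(url)
```

Because every registry that implements TRS exposes the same path layout, a platform that can consume this URL from one registry can, in principle, consume it from another by changing only the `base` argument.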