Kenzo-Hugo Hillion, Ivan Kuzmin, Anton Khodak, Eric Rasche, Michael Crusoe, Hedi Peterson, Jon Ison, Hervé Ménager.
Abstract
Workbench and workflow systems such as Galaxy, Taverna, Chipster, or Common Workflow Language (CWL)-based frameworks facilitate access to bioinformatics tools in a user-friendly, scalable and reproducible way. Still, the integration of tools in such environments remains a cumbersome, time-consuming and error-prone process. A major consequence is incomplete or outdated tool descriptions, which often lack important information, including parameters and metadata such as publications or links to documentation. ToolDog (Tool DescriptiOn Generator) facilitates the integration of tools that have been registered in the ELIXIR tools registry (https://bio.tools) into workbench environments by generating tool description templates. ToolDog includes two modules. The first module analyses the source code of the bioinformatics software with language-specific plugins and generates a skeleton for a Galaxy XML or CWL tool description. The second module is dedicated to the enrichment of the generated tool description, using metadata provided by bio.tools. This second module can also be used on its own to complete or correct existing tool descriptions with missing metadata.
Keywords: bioinformatics; common workflow language; galaxy; interoperability; registry; tool integration
Year: 2017 PMID: 29333231 PMCID: PMC5747335 DOI: 10.12688/f1000research.12974.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Workbench Integration Enabler overview.
The objective is to integrate the bio.tools registry with workbench environments in two ways: (1) “ReGaTE”, a utility for en masse registration of services from Galaxy instances; (2) the “ToolDog” utility, to translate the description of any tool or service that is registered in bio.tools, into the format required by the existing major workbench environments.
Figure 2. Metadata coverage for Galaxy tool descriptions from (A) the main Galaxy instance (https://usegalaxy.org) and (B) the Institut Pasteur Galaxy instance (https://galaxy.pasteur.fr).
The graphs show the percentage of tools possessing various metadata types: Help: usage instructions; Description: description of the tool to be displayed in the tool menu; Citations: tool citation information using either a DOI or a BibTeX entry; H+D+C: contains a help, description and citations section; Operations: description of the EDAM operation(s) performed; Topics: description of the EDAM topics covered. The total number of tools includes those which were successfully retrieved and analyzed (672 out of 1209 on Galaxy main, 351 out of 526 on Pasteur); not all available tools were retrieved, some because they are not available in a ToolShed, and some because we chose to retrieve only the latest version of each tool and discarded the earlier ones.
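The coverage figures described in this caption could be computed along the following lines. This is a minimal sketch, not the authors' analysis script: it assumes each metadata type corresponds to the standard Galaxy tool XML element of the same purpose (`help`, `description`, `citations`, `edam_operations`, `edam_topics`), and the function names `metadata_present` and `coverage`, as well as the two toy tool descriptions, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the metadata types in the caption to Galaxy tool XML paths.
CHECKS = {
    "Help": "help",
    "Description": "description",
    "Citations": "citations/citation",
    "Operations": "edam_operations/edam_operation",
    "Topics": "edam_topics/edam_topic",
}
LABELS = list(CHECKS) + ["H+D+C"]


def metadata_present(tool_xml):
    """Return, for one tool description, which metadata types are present."""
    root = ET.fromstring(tool_xml)
    found = {}
    for label, path in CHECKS.items():
        element = root.find(path)
        # Count an element only if it exists and carries non-empty text.
        found[label] = element is not None and bool((element.text or "").strip())
    found["H+D+C"] = found["Help"] and found["Description"] and found["Citations"]
    return found


def coverage(tool_xmls):
    """Percentage of tool descriptions possessing each metadata type."""
    results = [metadata_present(xml_text) for xml_text in tool_xmls]
    if not results:
        return {label: 0.0 for label in LABELS}
    return {
        label: 100.0 * sum(r[label] for r in results) / len(results)
        for label in LABELS
    }


# Two toy tool descriptions (hypothetical; the DOI is a placeholder).
FULL_TOOL = (
    '<tool id="a" name="a" version="1">'
    "<description>does something</description>"
    "<help>usage text</help>"
    '<citations><citation type="doi">10.0000/example</citation></citations>'
    "</tool>"
)
BARE_TOOL = (
    '<tool id="b" name="b" version="1">'
    "<description>does something else</description>"
    "</tool>"
)

if __name__ == "__main__":
    for label, percent in coverage([FULL_TOOL, BARE_TOOL]).items():
        print(f"{label}: {percent:.1f}%")
```

The sketch treats an empty element as missing, which is one possible reading of "possessing" a metadata type; a stricter or looser criterion would change the percentages.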
Figure 3. ToolDog generates tool descriptors from bio.tools resources descriptions.
Figure 4. Output of the run of ToolDog using the bio.tools entry of IntegronFinder to generate the corresponding CWL and Galaxy tool descriptions.
Figure 5. Tool descriptions automated mapping and enrichment.
Out of 665 retrieved tool descriptions, 399 have a DOI and 224 of these descriptions could be mapped to a bio.tools entry. 217 tool descriptions have been successfully annotated using ToolDog (Citations: presence of tool citation information; DOI: tool citation information described using a DOI; Corresponding bio.tools: tool descriptions with a corresponding bio.tools entry retrieved using the DOI; Annotated tools: tool descriptions successfully annotated with ToolDog).
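The DOI-based mapping and enrichment summarised in this caption can be illustrated with a small sketch. It is not ToolDog's implementation: the registry entry below is a hand-written dictionary loosely modelled on a bio.tools record (with `publication`, `topic` and `function` fields), the DOI is a placeholder, and the helper names `index_by_doi`, `tool_dois`, `edam_id` and `enrich` are hypothetical. No network access is involved.

```python
import xml.etree.ElementTree as ET


def index_by_doi(entries):
    """Build a lookup from lower-cased publication DOI to registry entry."""
    index = {}
    for entry in entries:
        for publication in entry.get("publication", []):
            doi = publication.get("doi")
            if doi:
                index[doi.lower()] = entry
    return index


def tool_dois(root):
    """Collect the DOIs cited in a Galaxy tool description."""
    return [
        citation.text.strip().lower()
        for citation in root.findall("citations/citation")
        if citation.get("type") == "doi" and citation.text
    ]


def edam_id(uri):
    """Reduce an EDAM URI to its trailing identifier, e.g. 'topic_0091'."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def enrich(tool_xml, index):
    """Add EDAM topics and operations from the matching entry, if any.

    Returns the (possibly unchanged) XML text and whether a match was found.
    New elements are appended at the end of <tool>; element ordering is
    not considered in this sketch.
    """
    root = ET.fromstring(tool_xml)
    entry = next((index[d] for d in tool_dois(root) if d in index), None)
    if entry is None:
        return tool_xml, False
    if root.find("edam_topics") is None and entry.get("topic"):
        topics = ET.SubElement(root, "edam_topics")
        for topic in entry["topic"]:
            ET.SubElement(topics, "edam_topic").text = edam_id(topic["uri"])
    operations = [
        op for fn in entry.get("function", []) for op in fn.get("operation", [])
    ]
    if root.find("edam_operations") is None and operations:
        ops_element = ET.SubElement(root, "edam_operations")
        for op in operations:
            ET.SubElement(ops_element, "edam_operation").text = edam_id(op["uri"])
    return ET.tostring(root, encoding="unicode"), True


# Toy registry entry and tool description (hypothetical values throughout).
ENTRY = {
    "publication": [{"doi": "10.0000/example.tool"}],
    "topic": [{"uri": "http://edamontology.org/topic_0091"}],
    "function": [{"operation": [{"uri": "http://edamontology.org/operation_0004"}]}],
}
TOOL = (
    '<tool id="example" name="example" version="1">'
    '<citations><citation type="doi">10.0000/Example.Tool</citation></citations>'
    "</tool>"
)

if __name__ == "__main__":
    enriched, matched = enrich(TOOL, index_by_doi([ENTRY]))
    print(matched)
    print(enriched)
```

DOIs are compared case-insensitively here because the same DOI may be written with different capitalisation in a tool description and in the registry; tool descriptions that cite only a BibTeX entry would not be matched by this approach.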