Anthony Bretaudeau, Cyril Monjeaud, Yvan Le Bras, Fabrice Legeai, Olivier Collin.
Abstract
BACKGROUND: Many bioinformatics tools use reference data, such as genome assemblies or sequence databanks. Galaxy offers multiple ways to give access to this data through its web interface. However, the process of adding new reference data has customarily been manual and time-consuming, even more so when the data needs to be indexed in a variety of formats (e.g. Blast, Bowtie, BWA, or 2bit). BioMAJ is a widely used, stable software tool designed to automate the download and transformation of data from various sources. This data can be used directly from the command line, in more complex systems such as Mobyle, or through a REST API.
Keywords: BioMAJ; Data libraries; Data manager; Galaxy; Reference data
Year: 2015 PMID: 25960870 PMCID: PMC4425870 DOI: 10.1186/s13742-015-0063-8
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1. BioMAJ2Galaxy architecture. BioMAJ fetches data from external online repositories. Post-processes are in charge of formatting the data in various formats. They then invoke Galaxy data managers using the Galaxy API via the BioBlend Python library. Alternatively, it is possible to invoke a post-process that adds reference data to the Galaxy data libraries. The injected reference data can then be used directly in tools or visualizations, and/or can be accessed in data libraries. When an obsolete databank version is removed, BioMAJ remove-processes are launched to remove any reference to the corresponding reference data in the Galaxy data tables or data libraries.
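The two injection paths in Figure 1 can be sketched with BioBlend roughly as follows. This is a minimal illustration, not the BioMAJ2Galaxy code itself: the function names, the library name, the data manager tool ID, and its input parameters are hypothetical and depend on the Galaxy instance, and the sketch assumes the files are already readable on the Galaxy server's filesystem.

```python
from typing import Any, Dict, List


def connect(url: str, api_key: str) -> Any:
    """Return a BioBlend client for a Galaxy server (needs the bioblend package)."""
    # Imported lazily so the rest of the sketch can be read and tested without it.
    from bioblend.galaxy import GalaxyInstance
    return GalaxyInstance(url=url, key=api_key)


def add_to_data_library(gi: Any, library_name: str, paths: List[str],
                        dbkey: str = "?") -> Any:
    """Link files from the Galaxy server's filesystem into a data library."""
    # Reuse a library with this name if one exists, otherwise create it.
    matches = gi.libraries.get_libraries(name=library_name)
    library = matches[0] if matches else gi.libraries.create_library(library_name)
    # BioBlend expects the server-side paths as one newline-separated string.
    return gi.libraries.upload_from_galaxy_filesystem(
        library["id"],
        "\n".join(paths),
        file_type="auto",
        dbkey=dbkey,
        link_data_only="link_to_files",  # link rather than copy the files
    )


def run_data_manager(gi: Any, tool_id: str, tool_inputs: Dict[str, Any]) -> Any:
    """Run a data manager tool so that it registers data in a Galaxy data table.

    Assumption: tool_id and tool_inputs are specific to the data manager
    installed on the target Galaxy instance; nothing here is a fixed API name.
    """
    # Data managers run like ordinary tools, so they need a history to run in.
    history = gi.histories.create_history(name="reference data import")
    return gi.tools.run_tool(history["id"], tool_id, tool_inputs)
```

A BioMAJ post-process script could call `add_to_data_library(connect(url, key), ...)` once the databank files are in place; a matching remove-process would need to undo the same registration when the databank version is retired.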
Figure 2. BioMAJ databank configuration examples, taken from a databank configuration file. Only the configuration specific to BioMAJ2Galaxy is shown here; more complete example files are available in the BioMAJ archive. A: Post-process and remove-process for a databank populating Galaxy data libraries. B: Post-process and remove-process for a databank populating Galaxy data tables.