Alex R Hardisty, Finn Bacall, Niall Beard, Maria-Paula Balcázar-Vargas, Bachir Balech, Zoltán Barcza, Sarah J Bourlat, Renato De Giovanni, Yde de Jong, Francesca De Leo, Laura Dobor, Giacinto Donvito, Donal Fellows, Antonio Fernandez Guerra, Nuno Ferreira, Yuliya Fetyukova, Bruno Fosso, Jonathan Giddy, Carole Goble, Anton Güntsch, Robert Haines, Vera Hernández Ernst, Hannes Hettling, Dóra Hidy, Ferenc Horváth, Dóra Ittzés, Péter Ittzés, Andrew Jones, Renzo Kottmann, Robert Kulawik, Sonja Leidenberger, Päivi Lyytikäinen-Saarenmaa, Cherian Mathew, Norman Morrison, Aleksandra Nenadic, Abraham Nieva de la Hidalga, Matthias Obst, Gerard Oostermeijer, Elisabeth Paymal, Graziano Pesole, Salvatore Pinto, Axel Poigné, Francisco Quevedo Fernandez, Monica Santamaria, Hannu Saarenmaa, Gergely Sipos, Karl-Heinz Sylla, Marko Tähtinen, Saverio Vicario, Rutger Aldo Vos, Alan R Williams, Pelin Yilmaz.
Abstract
BACKGROUND: Making forecasts about biodiversity and giving support to policy relies increasingly on large collections of data held electronically, and on substantial computational capability and capacity to analyse, model, simulate and predict using such data. However, the physically distributed nature of data resources and of expertise in advanced analytical tools creates many challenges for the modern scientist. Across the wider biological sciences, presenting such capabilities on the Internet (as "Web services") and using scientific workflow systems to compose them for particular tasks is a practical way to carry out robust "in silico" science. However, use of this approach in biodiversity science and ecology has thus far been quite limited.
Keywords: Analysis; Automation; Biodiversity science; Biodiversity virtual e-laboratory; Computing software; Data processing; Ecology; Informatics; Virtual laboratory; Workflows
Year: 2016 PMID: 27765035 PMCID: PMC5073428 DOI: 10.1186/s12898-016-0103-y
Source DB: PubMed Journal: BMC Ecol ISSN: 1472-6785 Impact factor: 2.964
Fig. 1 Biodiversity virtual laboratory (BioVeL) is a software environment that assists scientists in collecting, organising, and sharing data processing and analysis tasks in biodiversity and ecological research. The main components of the platform are: (A) the Biodiversity Catalogue (a library with well-annotated data and analysis services); (B) the environment, such as RStudio, for creating R programs; (C) the workbench for assembling data access and analysis pipelines; (D) the myExperiment workflow library that stores existing workflows; (E) the BioVeL Portal that allows researchers and collaborators to execute and share workflows; and (F) the documentation wiki. Infrastructure is indicated in bold, while processes related to research activities are indicated in italics. Components A–F are referred to from the text, where they are described in detail. See also ‘how-to’ guidelines in the Additional information.
Services for data processing and analysis (Additional file 2)
| Service group | Capabilities (web services) |
|---|---|
| General purpose, including mapping and visualization | General-purpose capabilities needed in many situations, such as for: |
| Ecological niche modelling | Built up from the existing openModeller web service [ |
| Ecosystem modelling | A basic toolbox for studies of carbon sequestration and ecosystem function. It includes data-model integration and calibration services, model testing and Monte Carlo Experiment services, ecosystem valuation services, and bioclimatic services |
| Metagenomics | A basic set of services for studying community structure and function from metagenomic ecological datasets. It includes services for geo-referenced annotation, metadata services, taxonomic binning and classification services, metagenomic traits services, and services for multivariate analysis |
| Phylogenetics | Services to enable DNA sequence mining and alignment, core phylogenetic inference, tree visualization, and phylogenetic community structure, for broad use in evolutionary and ecological studies |
| Population modelling | Services for demographic data and their integration into matrix projection models and integral projection models (MPM, IPM) |
| Taxonomy | Services for taxonomic name resolution, checklists and classification, and species occurrence data retrieval |
Workflows for biodiversity science (Additional file 3)
| Workflow | Description |
|---|---|
| Data refinement | The data refinement workflow (DRW) is for preparing taxonomically accurate species lists and observational data sets for use in scientific analyses such as: species distribution analysis, species richness and diversity studies, and analyses of community structure |
| Ecological niche modelling (ENM) | The generic ENM workflow creates, tests, and projects ecological niche models (ENM), choosing from a wide range of algorithms, environmental layers and geographical masks |
| ENM statistical difference (ESW) | Statistical post-processing of results from ecological niche modelling |
| Population modelling | Matrix population model construction and analysis workflows provide a complete environment for creating a stage matrix with no density dependence and then performing several analyses on it. Each of the workflows in the collection is also available separately. The expanded version of this table, available as Additional information, contains a link |
| Ecosystem modelling | Based around the Biome-BGC biogeochemical model, a collection of five workflows for calibrating and using Biome-BGC to model ecosystems and calculate a range of ecosystem service indicators. The Biome-BGC projects database and management system provides a user interface for setting model parameters and for sharing and reusing datasets and parameter settings |
| Metagenomics | The microbial metagenomic trait calculation and statistical analysis (MMT) workflow calculates key ecological traits of bacterial communities as observed by high-throughput metagenomic DNA sequencing. Typical use is in the analysis of environmental sequencing information from natural and disturbed habitats as a routine part of monitoring programs |
| Phylogenetics | Bayesian phylogenetic inference workflows are for performing phylogenetic inference for systematics and diversity research. Bayesian methods guide selection of the evolutionary model and a post hoc validation of the inference is also made. Phylogenetic partitioning of the diversity across samples allows study of mutual information between phylogeny and environmental variables |
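
The population modelling row in the table above refers to building a stage matrix with no density dependence and then analysing it. As a rough illustration of what such an analysis involves, the following is a minimal Python sketch, not the BioVeL workflow itself. The three-stage matrix, its values, and the function names `project` and `growth_rate` are all hypothetical and chosen only for demonstration. It projects a population one time step and estimates the asymptotic growth rate (the dominant eigenvalue of the matrix) by power iteration.

```python
# Illustrative sketch only: a density-independent stage-matrix projection.
# STAGE_MATRIX and its values are hypothetical, not taken from the paper.

def project(matrix, population):
    """Advance the population one time step: n(t+1) = A * n(t)."""
    return [sum(a * n for a, n in zip(row, population)) for row in matrix]


def growth_rate(matrix, steps=500):
    """Estimate the asymptotic growth rate and stable stage distribution.

    Uses power iteration, which converges to the dominant eigenvalue
    for a primitive (irreducible, aperiodic) non-negative matrix.
    """
    n = [1.0] * len(matrix)
    rate = 0.0
    for _ in range(steps):
        nxt = project(matrix, n)
        total = sum(nxt)
        rate = total / sum(n)          # per-step growth of total abundance
        n = [x / total for x in nxt]   # renormalise so proportions sum to 1
    return rate, n


# Rows and columns: juvenile, subadult, adult (assumed stages).
# Top row holds fecundities; the other entries are survival/transition rates.
STAGE_MATRIX = [
    [0.0, 0.5, 2.0],
    [0.3, 0.0, 0.0],
    [0.0, 0.6, 0.8],
]

if __name__ == "__main__":
    rate, stable = growth_rate(STAGE_MATRIX)
    print("estimated growth rate:", round(rate, 4))
    print("stable stage distribution:", [round(p, 4) for p in stable])
```

A growth rate above 1 indicates a growing population under the assumed rates, and a value below 1 a declining one. Further analyses of the kind the table mentions would build on the same matrix.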