
Computational workflow to study the seasonal variation of secondary metabolites in nine different bryophytes.

Kristian Peters1, Karin Gorzolka1, Helge Bruelheide2,3, Steffen Neumann1,3.   

Abstract

Eco-Metabolomics studies the interactions of non-model organisms in their natural environment and relates biochemistry to ecological function. Current challenges in processing such metabolomics data include complex experimental designs, often carried out in large field campaigns involving multiple study factors; the choice of peak detection parameter settings; the high variation of metabolite profiles; and the analysis of non-model species with scarcely characterised metabolomes. Here, we present a dataset generated from 108 samples of nine bryophyte species obtained in four seasons using an untargeted liquid chromatography coupled with mass spectrometry (LC/MS) acquisition method. Using this dataset, we address the current challenges in processing Eco-Metabolomics data. We also present a reproducible and reusable computational workflow implemented in Galaxy, focusing on standard formats, data import, technical validation, feature detection, diversity analysis and multivariate statistics. We expect that the representative dataset and the reusable processing pipeline will facilitate future studies in the research field of Eco-Metabolomics.

Year:  2018        PMID: 30152810      PMCID: PMC6111888          DOI: 10.1038/sdata.2018.179

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

In Ecological Metabolomics (“Eco-Metabolomics” for short), metabolite profiles of organisms are studied in order to describe ecological processes such as biotic interactions or the impact of environmental changes on various biological species[1-3]. In contrast to biochemistry, ecology typically studies wild non-model species in their natural environment. This often involves different individuals of one or more species from populations growing under quite heterogeneous conditions compared to the controlled conditions in greenhouses or growth chambers. As a result, metabolite profiles vary strongly between samples. Moreover, profiles of non-model species contain a large number of novel compounds (so-called “unknown unknowns”) that are difficult to identify because reference compounds are lacking; these have so far been mostly elucidated in model organisms[3,4]. Furthermore, designing ecological experiments is often complex and involves multiple factors[5]. Thus, the metabolomics data processing pipeline needs to be adapted in order to deal with the particular hypotheses and idiosyncrasies of ecological experiments. Here, we present a descriptor for a dataset that we consider representative for the research field of Eco-Metabolomics. Our study makes use of a field campaign with a two-factorial design (seasons and species), which includes non-model species of bryophytes (with the exception of Marchantia polymorpha). In order to facilitate subsequent analysis, we kept the experimental design as simple as possible. The sampling was conducted on-site at the Botanical Garden of Martin Luther University Halle-Wittenberg once in each season over a period of one year (see below). Metabolite profiles were acquired using untargeted liquid chromatography coupled with mass spectrometry (LC/MS). Raw metabolite profiles are available in the metabolomics data repository MetaboLights[6] (Data Citation 1).
In biochemistry there are strict laboratory protocols that ensure reproducibility of the analytical methods, while in bioinformatics this function is accomplished by implementing reusable computational workflows[7,8]. Thus, in addition to the dataset, we also address the typical bioinformatic challenges that come with Eco-Metabolomics experiments by implementing a reproducible and reusable computational workflow (Fig. 1). While the analysis and ecological interpretation of the study are described in Peters et al.[9], here we focus on the analytical and bioinformatic work that is required to create a computational processing pipeline that is reproducible and that can be reused by subsequent studies.
Figure 1

Computational workflow of the whole study (Data Citation 1) running in the Galaxy Workflow Management system.

Each of the modules represents a particular step in the study of Peters et al.[9]. The modules have defined inputs, outputs and sets of parameters. The modules are connected to each other to give the resulting workflow. The function of the modules is explained in Table 1 (available online only).

We describe in detail the experimental methodology that was used to create the dataset as well as the methodology used to make the computational workflow reproducible (that is, to give identical results in different computational environments). By formalizing and validating the processes that led to the results[10,11], we expect that this approach can serve as a model for subsequent studies. We further expect that Eco-Metabolomics studies will use our dataset and the computational workflow to foster reuse and improve future data processing pipelines.

Methods

These methods describe in detail the steps in producing the data, including full descriptions of the experimental design in our related work[9], data acquisition, computational processing, diversity analysis, biostatistics and bioinformatics procedures.

Sampling campaign

Samples of the nine moss species Brachythecium rutabulum (Hedw.) Schimp., Calliergonella cuspidata (Hedw.) Loeske, Fissidens taxifolius Hedw., Grimmia pulvinata (Hedw.) Sm., Hypnum cupressiforme Hedw. (H. lacunosum was not differentiated), Marchantia polymorpha L., Plagiomnium undulatum (Hedw.) T.J. Kop., Polytrichum strictum Menzies ex Brid. and Rhytidiadelphus squarrosus (Hedw.) Warnst. were collected in the Botanical Garden of Martin Luther University Halle-Wittenberg, Germany. Sampling was performed in summer (2016/08/08), autumn (2016/11/09), winter (2017/01/27) and spring (2017/05/11) under relatively stable weather conditions, because short-term climatic fluctuations and rainfall are known to influence the secondary metabolite content and ammonium uptake of bryophytes[12]. Thus, the bryophytes were only collected when there had been sunshine for at least two days prior to and during sampling. Furthermore, sampling was performed after midday, between 13:00 and 15:00.

Sampling protocol

In each season, three composite samples of different individuals of each species were taken, leading to a total of 3 × 9 × 4 = 108 samples. Only above-ground parts of the moss gametophytes such as leaves, branches, stems or thalloid parts were taken for sampling. From dioecious species such as M. polymorpha, P. strictum and P. undulatum, female, male and sterile gametophytes were collected in a composite sample. Before sampling, visible archegonial and antheridial heads and any below-ground parts such as rhizoids and rooting stems were removed with sterile tweezers. The gametophytic moss parts were put in Eppendorf tubes, frozen instantly on dry ice and later, in the lab, in liquid nitrogen.
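The sample count follows directly from the full two-factorial design with three replicates. A minimal sketch in Python, using the species codes listed in the Peak detection section; the function name `build_design` and the replicate numbering are illustrative assumptions, not part of the published workflow:

```python
from itertools import product

# Species codes as used for the grouping factor in this study.
SPECIES = ["Brarut", "Calcus", "Fistax", "Gripul", "Hypcup",
           "Marpol", "Plaund", "Polstr", "Rhysqu"]
SEASONS = ["summer", "autumn", "winter", "spring"]
REPLICATES = [1, 2, 3]  # three composite samples per species and season


def build_design():
    """Return one (season, species, replicate) tuple per sample."""
    return list(product(SEASONS, SPECIES, REPLICATES))


if __name__ == "__main__":
    design = build_design()
    print(len(design))  # 4 seasons x 9 species x 3 replicates = 108
```

Enumerating the design this way also gives a convenient check that every season and species combination is represented by exactly three samples.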

Collecting ecological characteristics

In order to relate metabolomes of the bryophytes to ecology, several ecological characteristics were recorded on-site and compiled from literature. The following characteristics were recorded when taking the samples in the field: type of substrate with the nominal/categorical levels “soil”, “rock with lean soil cover” and “rock”; light conditions with the ordinal levels “sunny”, “half-shade” and “shade”; moisture of the substrate with the ordinal levels “dry”, “fresh”, “damp” and “wet”; and exposition with the nominal levels “North”, “East”, “South”, “West”, “Northeast”, “Northwest”, “Southeast” and “Southwest”. The nominal characteristics growth form, habitat type, substrate and life strategy, the ordinal life-history characteristics spore size, gametangia distribution and sexual reproduction frequency, as well as the ordinal Ellenberg indicator values (indices for light, temperature, continentality, moisture, reaction, nitrogen and life-form) were collected from the literature[13-17]. For an overview, please refer to Table 1 (available online only) in Peters et al.[9] or to the file m_characteristics.csv in the dataset (see Data Citation 1 and Table 1 (available online only)).
Table 1

Names, types, descriptions and locations of primary data and additional scripts referenced in this article

File Name | Description | Type | Location | URL | Data Record Accession
Dockerfile | Source code for building the container image | Text | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
galaxy/mtbls520_workflow.ga | Galaxy workflow | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
galaxy/mtbls520_workflow.jpg | Screenshot of the workflow running in Galaxy | JPG | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
korseby/mtbls520 | Container image | Container | Dockerhub | https://hub.docker.com/r/korseby/mtbls520/ | korseby/mtbls520
m_bryos_metabolite_*_negative_mode.maf | Peak table matrix of bryophyte samples (negative mode) | Tabular Separated | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
m_bryos_metabolite_*_positive_mode.maf | Peak table matrix of bryophyte samples (positive mode) | Tabular Separated | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
m_bryos_quality_*_negative_mode.maf | Peak table matrix of QC samples (negative mode) | Tabular Separated | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
m_bryos_quality_*_positive_mode.maf | Peak table matrix of QC samples (positive mode) | Tabular Separated | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
m_characteristics.csv | Ecological characteristics of the bryophyte species compiled from literature | Comma Separated | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520, doi:<NEW_PHYTOL>
m_moss_phylo.tre | Phylogenetic distances of bryophyte species | Tree | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
mtbls520_01_mtbls_download.sh | Script: Download the whole study from MetaboLights | Shell script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_01_mtbls_download.xml | Galaxy module: Download the whole study from MetaboLights | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_02_extract.sh | Script: Extract files from a downloaded MetaboLights archive | Shell script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_02_extract.xml | Galaxy module: Extract files from the downloaded MetaboLights archive | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_03_qc_perform.r | Script: Perform Quality Control | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_03_qc_preparations.sh | Script: Make preparations for the QC | Shell script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_03_quality_control.xml | Galaxy module: Process the quality control pipeline | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_04_preparations.r | Script: Preparations and settings for files and the R environment | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_04_preparations.sh | Script: Import peak table matrix | Shell script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_04_preparations.xml | Galaxy module: Preparations and settings for files and the R environment | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_05a_import_maf.r | Script: Generating matrix for diversity calculations | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_05a_import_maf.xml | Galaxy module: Import peak table matrix | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_06_import_traits.r | Script: Import ecological characteristics | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_06_import_traits.xml | Galaxy module: Import ecological characteristics | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_07_species_diversity.r | Script: Create species unique features plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_07_species_diversity.xml | Galaxy module: Generating matrix for diversity calculations | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08a_species_shannon.r | Script: Create species Shannon diversity plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08a_species_shannon.xml | Galaxy module: Create species Shannon diversity plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08b_species_unique.r | Script: Create species unique features plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08b_species_unique.xml | Galaxy module: Create species unique features plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08c_species_variability.r | Script: Create species variability plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08c_species_variability.xml | Galaxy module: Create species variability plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08d_concentration.xml | Galaxy module: Create species metabolite concentration plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_08d_species_concentration.r | Script: Create species concentration plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_09_species_venn.r | Script: Create Venn diagram plots | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_09_species_venn.xml | Galaxy module: Create species Venn plots | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_10_species_varpart.r | Script: Create variation partitioning plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_10_species_varpart.xml | Galaxy module: Create variation partitioning plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_12_species_marchantia.r | Script: Create annotation of Marchantia polymorpha profile | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_12_species_marchantia.xml | Galaxy module: Create annotation of Marchantia polymorpha profile plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_16_ecology_rda.r | Script: Create ecology dbRDA plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_16_ecology_rda.xml | Galaxy module: Create ecology dbRDA plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_18_phylogeny.r | Script: Create phylogeny and chemotaxonomy plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_18_phylogeny.xml | Galaxy module: Create phylogeny and chemotaxonomy plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19a_seasons_shannon.r | Script: Create seasons Shannon diversity plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19a_seasons_shannon.xml | Galaxy module: Create seasons Shannon diversity plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19b_seasons_unique.r | Script: Create seasons unique features plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19b_seasons_unique.xml | Galaxy module: Create seasons unique features plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19c_seasons_variability.r | Script: Create seasons variability plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19c_seasons_variability.xml | Galaxy module: Create seasons variability plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19d_seasons_concentration.r | Script: Create seasons concentration plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_19d_seasons_concentration.xml | Galaxy module: Create seasons metabolite concentration plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_23_seasons_rda.r | Script: Create seasons dbRDA plot | R script | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
mtbls520_23_seasons_rda.xml | Galaxy module: Create seasons RDA plot | XML | GitHub | https://github.com/korseby/container-mtbls520 | korseby/container-mtbls520
neg_[01-27]_*.mzML | Metabolite profiles of the bryophyte species measured in negative mode | Raw | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
neg_MM8_*.mzML | Quality control profiles measured in negative mode | Raw | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
pos_[01-27]_*.mzML | Metabolite profiles of the bryophyte species measured in positive mode | Raw | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520
pos_MM8_*.mzML | Quality control profiles measured in positive mode | Raw | MetaboLights 520 (Data Citation 1) | https://www.ebi.ac.uk/metabolights/MTBLS520 | MTBLS520

Extraction protocol and LC/MS analysis

Frozen moss samples were homogenized by adding 200 mg ceramic beads (0.5 mm diameter, Roth) and ribolysing (Precellys 24, 2×20 s at 6500 r.p.m., 5 min pause in liquid nitrogen). Then, 1 ml of ice-cold 80/20 (v/v) methanol/water spiked with the internal standards 5 μM biochanin A (Sigma-Aldrich), 5 μM kinetin (Sigma-Aldrich) and 5 μM N-(3-indolylacetyl)-l-valine (Sigma-Aldrich) was added. Samples were vortexed and thawed while shaking for 15 min at 1,000 r.p.m. at room temperature, followed by ultrasonication for 15 min and another 15 min of shaking. After 15 min of centrifugation at 13,000 r.p.m., 500 μl of supernatant was dried in a vacuum centrifuge at 40 °C and reconstituted in 80/20 (v/v) methanol/water, with the volume adjusted to the initial fresh weight of the sample to give a final concentration of 10 mg fresh weight per 100 μl extract. Chromatographic separations were performed at 40 °C on an Acquity UPLC system (Waters) equipped with an HSS T3 column (100×1 mm, particle size 1.8 μm; Waters) applying the following binary gradient at a flow rate of 150 μL min−1: 0 to 1 min, isocratic 95% A (water:formic acid: 99.9:0.1 [v/v]), 5% B (acetonitrile:formic acid: 99.9:0.1 [v/v]); 1 to 18 min, linear from 5 to 95% B; 18 to 20 min, isocratic 95% B. The injection volume was 2.0 μL (full loop injection). Ultra-performance liquid chromatography coupled to electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC/ESI-QTOF-MS) was performed using a high resolution MicrOTOF-Q II hybrid quadrupole time-of-flight mass spectrometer[18].
Data were acquired with the following MS instrument settings: nebulizer gas: nitrogen, 1.4 bar; dry gas: nitrogen, 6 L min−1, 190 °C; capillary: 5000 V (+4000 V for negative mode); end plate offset: −500 V; funnel 1 radio frequency (RF): 200 Volts peak-to-peak (Vpp); funnel 2 RF: 200 Vpp; in-source collision-induced dissociation (CID) energy: 10 eV; hexapole RF: 100 Vpp; quadrupole ion energy: 3 eV (−5 eV for negative mode); collision gas: nitrogen; collision energy: 7 eV (−7 eV for negative mode); collision cell RF: 250 Vpp (150 Vpp for negative mode); transfer time: 70 μs; prepulse storage: 5 μs; pulser frequency: 10 kHz; and spectra rate: 3 Hz. Mass spectra were acquired in centroid mode. Calibration of the m/z scale was performed for individual raw data files on lithium formate cluster ions obtained by automatic infusion of 20 μL of 10 mM lithium hydroxide in isopropanol:water:formic acid, 49.9:49.9:0.2 (v/v/v) at the end of the gradient.
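The binary gradient described above is a piecewise-linear function of time. The following Python sketch returns the percentage of eluent B at a given time point; the function name `percent_b` is a hypothetical helper for illustration and is not taken from the instrument software:

```python
def percent_b(t_min: float) -> float:
    """Return the percentage of eluent B at time t_min (in minutes).

    Gradient as stated in the text: 0 to 1 min isocratic 5% B,
    1 to 18 min linear from 5 to 95% B, 18 to 20 min isocratic 95% B.
    """
    if not 0.0 <= t_min <= 20.0:
        raise ValueError("the gradient is only defined from 0 to 20 min")
    if t_min <= 1.0:
        return 5.0
    if t_min <= 18.0:
        # Linear interpolation between (1 min, 5% B) and (18 min, 95% B).
        return 5.0 + (95.0 - 5.0) * (t_min - 1.0) / (18.0 - 1.0)
    return 95.0


if __name__ == "__main__":
    for t in (0.5, 9.5, 19.0):
        print(t, percent_b(t))
```

The percentage of eluent A at any time point is 100 minus the returned value.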

Quality control

In order to validate the instrument performance and to detect batch effects between the instrument runs, the following quality control (QC) protocol was implemented. Samples with a lab-internal standard mix (MM8) were run before and after each set of 7 bryophyte samples in the MicrOTOF[18]. The following substances were used in the MM8: 2-Phenylglycine (Fluka), Kinetin (Roth), Rutin (Acros Organics), O-Methylsalicylic acid (Sigma), Phlorizin dihydrate (Sigma), N-(3-Indolylacetyl)-L-valine (Sigma), 3-Indolylacetonitrile (Fluka) and Biochanin A (Sigma). Substances in the MM8 were selected based on their ionization properties (ionization in both positive and negative mode and differential adduct formation) and a wide coverage of known retention times throughout the gradient with our instrumental setup. The known ionization properties were used to detect shifts and effects in mass-to-charge ratios (m/z) and retention times (RT) of the respective batches and to validate the RT correction made by XCMS (see below).
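The interleaving of QC and bryophyte samples can be sketched as follows. This is a simplified illustration in Python: it assumes exactly one MM8 injection between consecutive blocks of seven samples, and the function name `injection_sequence` and the sample labels are hypothetical. The actual instrument sequence may have contained additional injections.

```python
def injection_sequence(samples, block_size=7, qc_name="MM8"):
    """Place a QC injection before and after every block of samples."""
    sequence = [qc_name]
    for start in range(0, len(samples), block_size):
        sequence.extend(samples[start:start + block_size])
        sequence.append(qc_name)
    return sequence


if __name__ == "__main__":
    # 27 bryophyte samples per season (9 species x 3 replicates).
    samples = [f"sample_{i:02d}" for i in range(1, 28)]
    seq = injection_sequence(samples)
    print(len(seq), seq.count("MM8"))
```

With 27 samples per season, the last block holds only six samples but is still bracketed by QC injections.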

Raw data acquisition

Raw LC/MS data were converted to the open data format mzML[19] with the software CompassXPort 3.0.9 from Bruker Daltonics (available at http://www.bruker.com/service/support-upgrades/software-downloads.html). In compliance with the minimum information guidelines for metabolomics studies[20], metadata were recorded in ISA-Tab format[21] using ISAcreator 1.7.10 (ref. 22) (available at https://github.com/ISA-tools/ISAcreator/releases) and uploaded together with the raw data to the metabolomics repository MetaboLights[6] (Data Citation 1). Positive-mode profiles were used for the data analyses because many important and known secondary metabolite classes in bryophytes such as flavonoids, phenylpropanoids, anthocyans and glycosides, and previously characterized compounds such as Marchantins, Communins and Ohioensins, ionize well in positive mode with our instrumental setup.

Peak detection

Chromatographic peak picking was performed in R 3.4.2 (available at https://cran.r-project.org) with the package XCMS 1.52.0 (ref. 23) using the centWave algorithm and the following parameters: ppm=35, peakwidth=c(4,21), snthresh=10, prefilter=c(5,50), fitgauss=TRUE, verbose.columns=TRUE. Grouping of chromatographic peaks was performed with two factors (in XCMS called “phenoData”): seasons with the levels summer, autumn, winter and spring; and species with the levels Brarut, Calcus, Fistax, Gripul, Hypcup, Marpol, Plaund, Polstr and Rhysqu. The following parameters were used for grouping: mzwid=0.01, minfrac=0.5, bw=4. To improve subsequent data analyses, intensities in the peak table were log transformed before grouping. For further analysis, only features with retention times between 20 s and 1020 s were kept. Retention time correction was performed with the function retcor in XCMS and the parameters method=loess, family=gaussian, missing=10, extra=1, span=2. The parameters were additionally optimized using the R package IPO 1.3.3 (ref. 24), but better alignment precision was achieved with manual control and knowledge of the instrument settings[25].
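Two of the settings above are easy to illustrate numerically: the size of a 35 ppm tolerance in absolute m/z units, and the retention time window applied to the feature list. The Python sketch below is not the XCMS implementation; the helper names, the dictionary layout of a feature and the inclusive window bounds are assumptions made for illustration:

```python
def mz_window(mz, ppm=35.0):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def keep_by_rt(features, rt_min=20.0, rt_max=1020.0):
    """Keep features whose retention time (in seconds) lies in the window.

    Bounds are treated as inclusive here, which is an assumption.
    """
    return [f for f in features if rt_min <= f["rt"] <= rt_max]


if __name__ == "__main__":
    low, high = mz_window(500.0)
    print(round(high - low, 4))  # full width of the 35 ppm window at m/z 500

    # Hypothetical features, each with an m/z value and a retention time.
    features = [{"mz": 301.07, "rt": 12.0},
                {"mz": 285.08, "rt": 640.5},
                {"mz": 611.16, "rt": 1100.0}]
    print(keep_by_rt(features))
```

At m/z 500, a tolerance of 35 ppm corresponds to ±0.0175 m/z units, which shows how the absolute tolerance scales with mass.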

Peak annotation

Adduct annotation was performed with the R package CAMERA 1.33.3 (ref. 26) by using the following functions: xsAnnotate, groupFWHM, findIsotopes, groupCorr, findAdducts; with the following parameters: perfwhm=0.6, ppm=5, mzabs=0.005, calcIso=TRUE, calcCiS=TRUE, calcCaS=TRUE, graphMethod=lpc, pval=0.05, cor_eic_th=0.75. To improve subsequent statistical analyses, the function getReducedPeaklist was written and used instead of the CAMERA function getPeaklist; it aggregates the adducts of putative compounds into a feature list with singular components (see the pull request on GitHub: https://github.com/sneumann/CAMERA/pull/16). Since version 1.33.3, getReducedPeaklist has officially been part of CAMERA. The parameter method=median was chosen for the study.

Exemplary compound annotation

Compounds were putatively annotated for the follow-up validation and biochemical interpretation with the software Bruker Compass IsotopePattern 4.4. Annotation was performed by calculating accurate masses (mass-to-charge values) from known compounds in M. polymorpha and other liverworts found in PubChem, the KNApSAcK database and Asakawa et al.[27,28]. In the software Bruker Compass DataAnalysis 4.4, the mass-to-charge values were matched to device-specific retention times in the metabolite profile. To validate whether a known compound was present in the profile, Extracted Ion Chromatograms (EIC) and area-under-curve values (integrated intensities) were checked manually.

Diversity analysis

Statistical analyses were performed using the additional R packages multtest, RColorBrewer, vegan, multcomp, nlme, ape, pvclust, dendextend, phangorn, Hmisc, gplots and VennDiagram. A presence-absence matrix was generated from the feature matrix to determine the differences in metabolite features between the experimental factors species and season. In accordance with the minfrac parameter in the alignment step in XCMS (see above), a feature was considered present when it was detected in at least two out of three replicates. The presence-absence matrix was used for measuring the metabolite richness for each species and season by calculating the Shannon diversity index (H’) for each sample i using the function diversity in vegan with the parameter index=shannon[29]. The following equation was used for calculation: where t represents the number of samples in the particular group. The total number of features and the number of unique features were calculated from the presence-absence matrix accordingly. To test factor levels for significant differences, Tukey's HSD test was performed post-hoc on a one-way ANOVA using the multcomp package. Variability was calculated with the Pearson Correlation Coefficient (PCC, Pearson’s r) using the function rcorr in the package Hmisc. Venn diagrams were created for each species separately using the package VennDiagram. Each set in the Venn diagram represents one season and shows distinct and shared features in all possible combinations between the sets.
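The presence rule and the Shannon index can be illustrated with a short Python sketch. It uses the general form of the index, H' = -Σ p_j ln p_j with natural logarithm, where p_j is the proportion contributed by feature j; this is the standard definition and not necessarily the exact notation of the equation used in the study. The helper names and the encoding of a non-detected feature as 0 are assumptions:

```python
import math


def is_present(replicate_intensities, min_replicates=2):
    """A feature counts as present when detected in at least
    min_replicates replicates (non-detection is encoded as 0 here)."""
    detected = sum(1 for x in replicate_intensities if x > 0)
    return detected >= min_replicates


def shannon(values):
    """Shannon index H' = -sum(p * ln p) over the non-zero entries."""
    total = sum(values)
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log(v / total) for v in values if v > 0)


if __name__ == "__main__":
    print(is_present([1520.0, 0.0, 980.0]))  # detected in 2 of 3 replicates
    print(is_present([0.0, 0.0, 450.0]))     # detected in 1 of 3 replicates

    # One row of a presence-absence matrix with three present features.
    row = [1, 1, 0, 1]
    print(shannon(row), math.log(3))
```

Applied to a presence-absence row, all present features have equal proportions, so the index reduces to the natural logarithm of the number of present features.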

Multivariate statistical analysis

Variation partitioning was performed using the function varpart in the package vegan to analyze the influence of the factors species and seasons on the metabolite profiles. Distance-based redundancy analysis (dbRDA) using the function capscale with Bray-Curtis distance and multidimensional scaling in the package vegan was chosen to analyse the relation of the ecological characteristics with the species metabolite profiles[30,31]. Ordinal and categorical ecological characteristics were transformed to presence-absence matrices for the ordination. The optimal model for the dbRDA was chosen with forward and backward selection using the function ordistep in the package vegan. Ecological characteristics were added to the plots as post-hoc variables using the function envfit in the package vegan.
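The Bray-Curtis distance underlying the dbRDA is defined for two profiles x and y as Σ|x_j − y_j| / Σ(x_j + y_j). A minimal Python sketch of this definition follows; the function name `bray_curtis` and the example profiles are illustrative and the code is not a substitute for the vegan implementation:

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum(|x_j - y_j|) / sum(x_j + y_j)."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of features")
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        raise ValueError("dissimilarity is undefined for two empty profiles")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


if __name__ == "__main__":
    # Hypothetical intensities of three features in two samples.
    profile_a = [1.0, 0.0, 3.0]
    profile_b = [1.0, 2.0, 1.0]
    print(bray_curtis(profile_a, profile_b))  # (0 + 2 + 2) / (2 + 2 + 4) = 0.5
```

The value ranges from 0 for identical profiles to 1 for profiles that share no features.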

Chemotaxonomic comparison to phylogeny

Relationships between metabolite profiles and phylogeny were analysed by calculating dissimilarities for phylogeny and the feature matrix using Bray-Curtis distance (function vegdist in vegan) followed by hierarchical clustering using the function hclust and the complete linkage method. In order to improve the visual comparison between the two trees, the chemotaxonomic plot was reordered using the function order.optimal (package cba) and leaves of Polstr and Plaund were swapped using the function reorder in vegan. The similarity of the two trees was determined with the normalized Robinson-Foulds metric (function RF.dist in package phangorn). The similarity of the distance matrices was determined with the Mantel statistic (function mantel in vegan).
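The Mantel statistic compares two distance matrices by correlating their corresponding off-diagonal entries. The Python sketch below computes only the statistic itself, using the Pearson correlation, and omits the permutation test that is normally used to assess its significance. The function names and example matrices are illustrative assumptions:

```python
import math


def lower_triangle(matrix):
    """Return the entries below the diagonal of a square matrix."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mantel_statistic(d1, d2):
    """Correlation between the lower triangles of two distance matrices."""
    if len(d1) != len(d2):
        raise ValueError("distance matrices must have the same size")
    return pearson(lower_triangle(d1), lower_triangle(d2))


if __name__ == "__main__":
    # Hypothetical symmetric distance matrices for three species.
    phylo = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
    chem = [[0, 3, 2], [3, 0, 1], [2, 1, 0]]
    print(mantel_statistic(phylo, phylo))
    print(mantel_statistic(phylo, chem))
```

A value close to 1 indicates that species that are distant in one matrix are also distant in the other, while a value close to -1 indicates the opposite ordering.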

Computational workflow

For the computational workflow, the required software tools, their dependencies, as well as software libraries and R packages were containerized using Docker technology[32]. The container was based on Ubuntu Linux 16.04 and included R version 3.4.2 from the R apt repository. The commands for building the container can be found in the Dockerfile (Table 1 (available online only)). The resulting container image was made available at DockerHub (https://hub.docker.com/r/korseby/mtbls520/). The computational workflow was constructed with the Galaxy workflow management system[33]. It consists of 20 modules, and each individual module represents one or more dedicated steps in the Peters et al. study[9], e.g. data retrieval, feature detection, alignment or statistical analysis (Fig. 1). For the workflow, individual Galaxy modules were written in XML format. Each Galaxy module executes a shell or R script with defined inputs and outputs. Scripts are only executed inside the software container. Thus, code execution is encapsulated and all required software dependencies are resolved within the software container. In order to comply with the Interoperability criterion in the FAIR guidelines[34], the PhenoMeNal cloud e-infrastructure (https://phenomenal-h2020.eu) was used to test the workflow in different computational environments. To ensure that the workflow generates the same results in different computational environments, continuous automatic workflow testing was implemented with wft4galaxy[35].

Data Records

The primary access site for the dataset is MetaboLights (Data Citation 1), which includes the 108 metabolite profiles of the bryophytes in positive and negative mode, QC profiles, ecological data and metadata (see Table 2 (available online only) for an overview of sample names and associated factor levels). Table 1 (available online only) provides an overview of data files, formats and functions in the computational workflow.
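Sample names of the bryophyte profiles start with the ionization mode, a running number and the six-letter species code, so these factor levels can be recovered from the file name. The following Python sketch parses that prefix; the regular expression and the function name `parse_sample_name` are assumptions for illustration, and the remaining part of the name is kept unparsed because its fields are not described here:

```python
import re

# Prefix pattern: mode, two-digit running number, species code, remainder.
SAMPLE_NAME = re.compile(
    r"^(?P<mode>pos|neg)_(?P<index>\d{2})_(?P<species>[A-Za-z0-9]+)_(?P<rest>.+)$"
)


def parse_sample_name(name):
    """Split a bryophyte sample name into mode, index, species and rest."""
    match = SAMPLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected sample name: {name!r}")
    fields = match.groupdict()
    fields["index"] = int(fields["index"])
    return fields


if __name__ == "__main__":
    print(parse_sample_name("pos_01_Fistax_1-A.2_01_5675"))
```

QC profiles follow a different naming scheme (pos_MM8_* and neg_MM8_*) and are deliberately rejected by this pattern.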
Table 2

Overview of sample names and associated factor levels of the study

Sample Name | Type | Species | Season | Replicate
pos_01_Fistax_1-A.2_01_5675 | Bryophyte species metabolite profile | Fissidens taxifolius | Summer | 01
pos_02_Fistax_1-A.3_01_5676 | Bryophyte species metabolite profile | Fissidens taxifolius | Summer | 02
pos_03_Fistax_1-A.4_01_5677 | Bryophyte species metabolite profile | Fissidens taxifolius | Summer | 03
pos_04_Polstr_1-A.5_01_5678 | Bryophyte species metabolite profile | Polytrichum strictum | Summer | 01
pos_05_Polstr_1-A.6_01_5679 | Bryophyte species metabolite profile | Polytrichum strictum | Summer | 02
pos_06_Polstr_1-A.7_01_5680 | Bryophyte species metabolite profile | Polytrichum strictum | Summer | 03
pos_07_Hypcup_1-A.8_01_5681 | Bryophyte species metabolite profile | Hypnum cupressiforme | Summer | 01
pos_08_Gripul_1-B.1_01_5684 | Bryophyte species metabolite profile | Grimmia pulvinata | Summer | 01
pos_09_Plaund_1-B.2_01_5685 | Bryophyte species metabolite profile | Plagiomnium undulatum | Summer | 01
pos_10_Plaund_1-B.3_01_5686 | Bryophyte species metabolite profile | Plagiomnium undulatum | Summer | 02
pos_11_Plaund_1-B.4_01_5687 | Bryophyte species metabolite profile | Plagiomnium undulatum | Summer | 03
pos_12_Rhysqu_1-B.5_01_5688 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Summer | 01
pos_13_Rhysqu_1-B.6_01_5689 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Summer | 02
pos_14_Calcus_1-B.7_01_5690 | Bryophyte species metabolite profile | Calliergonella cuspidata | Summer | 01
pos_15_Calcus_1-C.1_01_5693 | Bryophyte species metabolite profile | Calliergonella cuspidata | Summer | 02
pos_16_Rhysqu_1-C.2_01_5694 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Summer | 03
pos_17_Calcus_1-C.3_01_5695 | Bryophyte species metabolite profile | Calliergonella cuspidata | Summer | 03
pos_18_Brarut_1-C.4_01_5696 | Bryophyte species metabolite profile | Brachythecium rutabulum | Summer | 01
pos_19_Brarut_1-C.5_01_5697 | Bryophyte species metabolite profile | Brachythecium rutabulum | Summer | 02
pos_20_Hypcup_1-C.6_01_5698 | Bryophyte species metabolite profile | Hypnum cupressiforme | Summer | 02
pos_21_Hypcup_1-C.7_01_5699 | Bryophyte species metabolite profile | Hypnum cupressiforme | Summer | 03
pos_22_Gripul_1-D.1_01_5702 | Bryophyte species metabolite profile | Grimmia pulvinata | Summer | 02
pos_23_Gripul_1-D.2_01_5703 | Bryophyte species metabolite profile | Grimmia pulvinata | Summer | 03
pos_24_Brarut_1-D.3_01_5704 | Bryophyte species metabolite profile | Brachythecium rutabulum | Summer | 03
pos_25_Marpol_1-D.4_01_5705 | Bryophyte species metabolite profile | Marchantia polymorpha | Summer | 01
pos_26_Marpol_1-D.5_01_5706 | Bryophyte species metabolite profile | Marchantia polymorpha | Summer | 02
pos_27_Marpol_1-D.6_01_5707 | Bryophyte species metabolite profile | Marchantia polymorpha | Summer | 03
pos_01_Fistax_1-A.2_01_6578 | Bryophyte species metabolite profile | Fissidens taxifolius | Autumn | 01
pos_02_Fistax_1-A.3_01_6579 | Bryophyte species metabolite profile | Fissidens taxifolius | Autumn | 02
pos_03_Fistax_1-A.4_01_6580 | Bryophyte species metabolite profile | Fissidens taxifolius | Autumn | 03
pos_04_Hypcup_1-A.5_01_6581 | Bryophyte species metabolite profile | Hypnum cupressiforme | Autumn | 01
pos_05_Gripul_1-A.6_01_6582 | Bryophyte species metabolite profile | Grimmia pulvinata | Autumn | 01
pos_06_Brarut_1-A.7_01_6583 | Bryophyte species metabolite profile | Brachythecium rutabulum | Autumn | 01
pos_07_Polstr_1-A.8_01_6584 | Bryophyte species metabolite profile | Polytrichum strictum | Autumn | 01
pos_08_Polstr_1-B.1_01_6587 | Bryophyte species metabolite profile | Polytrichum strictum | Autumn | 02
pos_09_Polstr_1-B.2_01_6588 | Bryophyte species metabolite profile | Polytrichum strictum | Autumn | 03
pos_10_Hypcup_1-B.3_01_6589 | Bryophyte species metabolite profile | Hypnum cupressiforme | Autumn | 02
pos_11_Gripul_1-B.4_01_6590 | Bryophyte species metabolite profile | Grimmia pulvinata | Autumn | 02
pos_12_Brarut_1-B.5_01_6591 | Bryophyte species metabolite profile | Brachythecium rutabulum | Autumn | 02
pos_13_Plaund_1-B.6_01_6592 | Bryophyte species metabolite profile | Plagiomnium undulatum | Autumn | 01
pos_14_Plaund_1-B.7_01_6593 | Bryophyte species metabolite profile | Plagiomnium undulatum | Autumn | 02
pos_15_Plaund_1-C.1_01_6596 | Bryophyte species metabolite profile | Plagiomnium undulatum | Autumn | 03
pos_16_Calcus_1-C.2_01_6597 | Bryophyte species metabolite profile | Calliergonella cuspidata | Autumn | 01
pos_17_Calcus_1-C.3_01_6598 | Bryophyte species metabolite profile | Calliergonella cuspidata | Autumn | 02
pos_18_Rhysqu_1-C.4_01_6599 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Autumn | 01
pos_19_Rhysqu_1-C.5_01_6600 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Autumn | 02
pos_20_Calcus_1-C.6_01_6601 | Bryophyte species metabolite profile | Calliergonella cuspidata | Autumn | 03
pos_21_Rhysqu_1-C.7_01_6602 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Autumn | 03
pos_22_Hypcup_1-D.1_01_6605 | Bryophyte species metabolite profile | Hypnum cupressiforme | Autumn | 03
pos_23_Gripul_1-D.2_01_6609 | Bryophyte species metabolite profile | Grimmia pulvinata | Autumn | 03
pos_24_Brarut_1-D.3_01_6610 | Bryophyte species metabolite profile | Brachythecium rutabulum | Autumn | 03
pos_25_Marpol_1-D.4_01_6611 | Bryophyte species metabolite profile | Marchantia polymorpha | Autumn | 01
pos_26_Marpol_1-D.5_01_6612 | Bryophyte species metabolite profile | Marchantia polymorpha | Autumn | 02
pos_27_Marpol_1-D.6_01_6613 | Bryophyte species metabolite profile | Marchantia polymorpha | Autumn | 03
pos_01_Fistax_1-A.2_01_7105 | Bryophyte species metabolite profile | Fissidens taxifolius | Winter | 01
pos_02_Fistax_1-A.3_01_7106 | Bryophyte species metabolite profile | Fissidens taxifolius | Winter | 02
pos_03_Fistax_1-A.4_01_7107 | Bryophyte species metabolite profile | Fissidens taxifolius | Winter | 03
pos_04_Hypcup_1-A.5_01_7108 | Bryophyte species metabolite profile | Hypnum cupressiforme | Winter | 01
pos_05_Brarut_1-A.6_01_7109 | Bryophyte species metabolite profile | Brachythecium rutabulum | Winter | 01
pos_06_Polstr_1-A.7_01_7110 | Bryophyte species metabolite profile | Polytrichum strictum | Winter | 01
pos_07_Polstr_1-A.8_01_7111 | Bryophyte species metabolite profile | Polytrichum strictum | Winter | 02
pos_08_Polstr_1-A.1_01_7114 | Bryophyte species metabolite profile | Polytrichum strictum | Winter | 03
pos_09_Gripul_1-A.2_01_7115 | Bryophyte species metabolite profile | Grimmia pulvinata | Winter | 01
pos_10_Hypcup_1-A.3_01_7116 | Bryophyte species metabolite profile | Hypnum cupressiforme | Winter | 02
pos_11_Plaund_1-A.4_01_7117 | Bryophyte species metabolite profile | Plagiomnium undulatum | Winter | 01
pos_12_Plaund_1-A.5_01_7118 | Bryophyte species metabolite profile | Plagiomnium undulatum | Winter | 02
pos_13_Plaund_1-A.6_01_7119 | Bryophyte species metabolite profile | Plagiomnium undulatum | Winter | 03
pos_14_Rhysqu_1-A.7_01_7120 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Winter | 01
pos_15_Rhysqu_1-A.8_01_7138 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Winter | 02
pos_16_Rhysqu_1-A.1_01_7139 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Winter | 03
pos_17_Calcus_1-A.2_01_7140 | Bryophyte species metabolite profile | Calliergonella cuspidata | Winter | 01
pos_18_Calcus_1-A.3_01_7141 | Bryophyte species metabolite profile | Calliergonella cuspidata | Winter | 02
pos_19_Calcus_1-A.4_01_7142 | Bryophyte species metabolite profile | Calliergonella cuspidata | Winter | 03
pos_20_Brarut_1-A.5_01_7143 | Bryophyte species metabolite profile | Brachythecium rutabulum | Winter | 02
pos_21_Hypcup_1-A.6_01_7144 | Bryophyte species metabolite profile | Hypnum cupressiforme | Winter | 03
pos_22_Brarut_1-A.7_01_7146 | Bryophyte species metabolite profile | Brachythecium rutabulum | Winter | 03
pos_23_Gripul_1-A.8_01_7147 | Bryophyte species metabolite profile | Grimmia pulvinata | Winter | 02
pos_24_Gripul_1-A.1_01_7148 | Bryophyte species metabolite profile | Grimmia pulvinata | Winter | 03
pos_25_Marpol_1-A.2_01_7149 | Bryophyte species metabolite profile | Marchantia polymorpha | Winter | 01
pos_26_Marpol_1-A.3_01_7150 | Bryophyte species metabolite profile | Marchantia polymorpha | Winter | 02
pos_27_Marpol_1-A.4_01_7151 | Bryophyte species metabolite profile | Marchantia polymorpha | Winter | 03
pos_01_Fistax_1-A.2_01_7396 | Bryophyte species metabolite profile | Fissidens taxifolius | Spring | 01
pos_02_Fistax_1-A.3_01_7398 | Bryophyte species metabolite profile | Fissidens taxifolius | Spring | 02
pos_03_Fistax_1-A.4_01_7399 | Bryophyte species metabolite profile | Fissidens taxifolius | Spring | 03
pos_04_Gripul_1-A.5_01_7400 | Bryophyte species metabolite profile | Grimmia pulvinata | Spring | 01
pos_05_Brarut_1-A.6_01_7401 | Bryophyte species metabolite profile | Brachythecium rutabulum | Spring | 01
pos_06_Polstr_1-A.7_01_7402 | Bryophyte species metabolite profile | Polytrichum strictum | Spring | 01
pos_07_Polstr_1-A.8_01_7403 | Bryophyte species metabolite profile | Polytrichum strictum | Spring | 02
pos_08_Polstr_1-B.1_01_7406 | Bryophyte species metabolite profile | Polytrichum strictum | Spring | 03
pos_09_Hypcup_1-B.2_01_7407 | Bryophyte species metabolite profile | Hypnum cupressiforme | Spring | 01
pos_10_Brarut_1-B.3_01_7408 | Bryophyte species metabolite profile | Brachythecium rutabulum | Spring | 02
pos_11_Brarut_1-B.4_01_7409 | Bryophyte species metabolite profile | Brachythecium rutabulum | Spring | 03
pos_12_Plaund_1-B.5_01_7410 | Bryophyte species metabolite profile | Plagiomnium undulatum | Spring | 01
pos_13_Plaund_1-B.6_01_7411 | Bryophyte species metabolite profile | Plagiomnium undulatum | Spring | 02
pos_14_Plaund_1-B.7_01_7412 | Bryophyte species metabolite profile | Plagiomnium undulatum | Spring | 03
pos_15_Calcus_1-B.8_01_7415 | Bryophyte species metabolite profile | Calliergonella cuspidata | Spring | 01
pos_16_Rhysqu_1-C.1_01_7416 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Spring | 01
pos_17_Calcus_1-C.2_01_7417 | Bryophyte species metabolite profile | Calliergonella cuspidata | Spring | 02
pos_18_Rhysqu_1-C.3_01_7418 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Spring | 02
pos_19_Rhysqu_1-C.4_01_7419 | Bryophyte species metabolite profile | Rhytidiadelphus squarrosus | Spring | 03
pos_20_Calcus_1-C.5_01_7420 | Bryophyte species metabolite profile | Calliergonella cuspidata | Spring | 03
pos_21_Hypcup_1-C.6_01_7421 | Bryophyte species metabolite profile | Hypnum cupressiforme | Spring | 02
pos_22_Hypcup_1-C.7_01_7424 | Bryophyte species metabolite profile | Hypnum cupressiforme | Spring | 03
pos_23_Gripul_1-C.8_01_7425 | Bryophyte species metabolite profile | Grimmia pulvinata | Spring | 02
pos_24_Gripul_1-D.1_01_7426 | Bryophyte species metabolite profile | Grimmia pulvinata | Spring | 03
pos_25_Marpol_1-D.2_01_7427 | Bryophyte species metabolite profile | Marchantia polymorpha | Spring | 01
pos_26_Marpol_1-D.3_01_7428 | Bryophyte species metabolite profile | Marchantia polymorpha | Spring | 02
pos_27_Marpol_1-D.4_01_7429 | Bryophyte species metabolite profile | Marchantia polymorpha | Spring | 03
pos_MM8_1-F.8_01_5674 | QC profile | - | Summer | 01
pos_MM8_1-F.8_01_5683 | QC profile | - | Summer | 02
pos_MM8_1-F.8_01_5692 | QC profile | - | Summer | 03
pos_MM8_1-F.8_01_5701 | QC profile | - | Summer | 04
pos_MM8_1-F.8_01_5710 | QC profile | - | Summer | 05
pos_MM8_1-F.8_01_6577 | QC profile | - | Autumn | 01
pos_MM8_1-F.8_01_6586 | QC profile | - | Autumn | 02
pos_MM8_1-F.8_01_6595 | QC profile | - | Autumn | 03
pos_MM8_1-F.8_01_6604 | QC profile | - | Autumn | 04
pos_MM8_1-F.8_01_6615 | QC profile | - | Autumn | 05
pos_MM8_1-F.8_01_7103 | QC profile | - | Winter | 01
pos_MM8_1-F.8_01_7104 | QC profile | - | Winter | 02
pos_MM8_1-F.8_01_7113 | QC profile | - | Winter | 03
pos_MM8_1-F.8_01_7122 | QC profile | - | Winter | 04
pos_MM8_1-F.8_01_7131 | QC profile | - | Winter | 05
pos_MM8_1-F.8_01_7145 | QC profile | - | Winter | 06
pos_MM8_1-F.8_01_7153 | QC profile | - | Winter | 07
pos_MM8_1-F.8_01_7360 | QC profile | - | Spring | 01
pos_MM8_1-F.8_01_7361 | QC profile | - | Spring | 02
pos_MM8_1-F.8_01_7362 | QC profile | - | Spring | 03
pos_MM8_1-F.8_01_7365 | QC profile | - | Spring | 04
pos_MM8_1-F.8_01_7366 | QC profile | - | Spring | 05
pos_MM8_1-F.8_01_7394 | QC profile | - | Spring | 06
pos_MM8_1-F.8_01_7397 | QC profile | - | Spring | 07
pos_MM8_1-F.8_01_7405 | QC profile | - | Spring | 08
pos_MM8_1-F.8_01_7414 | QC profile | - | Spring | 09
pos_MM8_1-F.8_01_7423 | QC profile | - | Spring | 10
pos_MM8_1-F.8_01_7433 | QC profile | - | Spring | 11
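The sample names in Table 2 follow a regular pattern, so they can be split into their parts programmatically. The following is a minimal sketch, not part of the published workflow. The species abbreviations and their full names are taken from Table 2; the field names `order`, `position`, `injection` and `run_id`, as well as the acceptance of a `neg` prefix for negative-mode files, are assumptions made for illustration. Note that the season is not encoded in the name and still has to be looked up in the table.

```python
import re

# Species abbreviations as they appear in the sample names of Table 2.
SPECIES_BY_CODE = {
    "Fistax": "Fissidens taxifolius",
    "Polstr": "Polytrichum strictum",
    "Hypcup": "Hypnum cupressiforme",
    "Gripul": "Grimmia pulvinata",
    "Plaund": "Plagiomnium undulatum",
    "Rhysqu": "Rhytidiadelphus squarrosus",
    "Calcus": "Calliergonella cuspidata",
    "Brarut": "Brachythecium rutabulum",
    "Marpol": "Marchantia polymorpha",
}

# Assumed layout: <mode>_<order>_<code>_<position>_<injection>_<run id>.
# QC names (code "MM8") have no <order> field. Field names are hypothetical.
NAME_PATTERN = re.compile(
    r"^(?P<mode>pos|neg)_"
    r"(?:(?P<order>\d+)_)?"
    r"(?P<code>[A-Za-z0-9]+)_"
    r"(?P<position>\d+-[A-Z]\.\d+)_"
    r"(?P<injection>\d+)_"
    r"(?P<run_id>\d+)$"
)


def parse_sample_name(name):
    """Split a Table 2 sample name into its (assumed) components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised sample name: {name!r}")
    fields = match.groupdict()
    is_qc = fields["code"] == "MM8"
    return {
        "mode": fields["mode"],
        "order": int(fields["order"]) if fields["order"] else None,
        "code": fields["code"],
        "species": None if is_qc else SPECIES_BY_CODE.get(fields["code"]),
        "position": fields["position"],
        "run_id": int(fields["run_id"]),
        "is_qc": is_qc,
    }


if __name__ == "__main__":
    for name in ("pos_27_Marpol_1-D.6_01_5707", "pos_MM8_1-F.8_01_5674"):
        print(parse_sample_name(name))
```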

Code Availability

The source code (also deposited at https://github.com/korseby/container-mtbls520/) was published[36] and made available under the terms of the Apache License 2.0. Please refer to Table 1 (available online only) for an overview of the function of each file of the source code. Code for building the software container image and the workflow, including Galaxy modules and scripts that are executed inside the container, was published under Open Access[36]. A pre-built binary software container image was made available at DockerHub (retrievable at https://hub.docker.com/r/korseby/mtbls520/).

Technical Validation

Four sets of 27 bryophyte samples were generated in the experiment. One set for each season was analyzed with UPLC/ESI-QTOF-MS (see methods below), which resulted in a total of 108 bryophyte metabolite profiles. In order to validate the instrument performance and to detect batch effects between the four instrument runs, a quality control (QC) protocol was implemented. Within each set of 27 species samples, samples of a lab-internal standard mix (MM8) were measured before and after each block of 7 bryophyte samples. Peak detection in these MM8 profiles was performed with the same parameters as for the bryophyte samples. The four sets containing the MM8 metabolite profiles were checked visually for differences by plotting them against each other (Fig. 2a) and stacked next to each other (Fig. 2b). The density distribution of the intensities within the sets of MM8 profiles was also checked and compared between sets with a density plot (histogram) (Fig. 2c).
Figure 2

Plots of sets of the MM8 profiles to assess the performance of the technical setup.

Green=spring, yellow=summer, red=autumn, blue=winter. n=28. (a) Plot of the four sets of MM8 profiles against each other. X axis: Retention time [s]. Y axis: Logarithmic total ion current. (b) Stacked plot of the sets of MM8 profiles next to each other. X axis: Retention time [s]. Y axis: Logarithmic total ion current. (c) Density plot (histogram) of log intensities of the sets of MM8 profiles. X axis: Sample size. Y axis: Estimated kernel density.

Deviations in mass-to-charge ratio and retention time (in seconds), as well as the correction applied by XCMS, were checked with diagnostic plots generated by XCMS (Fig. 3). We found maximum retention time deviations within 2 s (Fig. 3a and b), which is in the expected range of the analytical setup[18]. The determined mass-to-charge deviations (Fig. 3c and d) are within instrument specification as well[18].
Figure 3

Quality control plots to assess shifts in retention time (RT) and mass-to-charge ratio (m/z) in the four sets of MM8 profiles.

Green=spring, yellow=summer, red=autumn, blue=winter. n=28. (a) Median retention time deviation for the sets of MM8 profiles. X axis: Name of MM8 profile. Y axis: Retention time deviation [s]. (b) Retention time deviation plotted against retention time. X axis: Retention time [s]. Y axis: Retention time deviation [s] per profile. (c) Median mass-to-charge deviation for each profile. X axis: MM8 profile. Y axis: m/z deviation. (d) Mass-to-charge deviation plotted against retention time. X axis: Retention time [s]. Y axis: m/z deviation per profile.

The variation in the intensities of the internal lab standards was also checked for each reference compound individually as shown in Figs. 4 and 5. In general, the variation for each reference compound and the deviations between MM8 profiles are both well within the typical range of 10 to 15% (ref. 18).
Figure 4

The eight compounds used for the internal lab standard mix (MM8) plotted next to each other.

Shown are the regions of the respective compounds in the raw chromatograms before the alignment of XCMS. Green=spring, yellow=summer, red=autumn, blue=winter. X axis: Retention time [s]. Y axis: Logarithmic total ion current. n=28.

Figure 5

Boxplots of the variation in the intensities of the eight compounds used in the internal lab standard mix of all the MM8 profiles.

X axis: Compound. Y axis: Logarithmic total ion current. n=28 for each box.

We conclude that there are no significant batch effects in the technical replicates that overlap with the factor season of the experiment. Thus, the automatic retention time correction made by XCMS is validated for the parameters used in the peak detection process.
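The per-compound check of the MM8 standards described above can be sketched in a few lines. This is an illustrative example only, not the authors' implementation: it assumes that the reported variation is expressed as a coefficient of variation (CV, sample standard deviation divided by the mean), and the function names, the 15% default threshold and the compound labels in the demo are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample coefficient of variation in percent."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    average = mean(values)
    if average == 0:
        raise ValueError("mean intensity is zero, CV is undefined")
    return 100.0 * stdev(values) / average


def flag_unstable_compounds(intensities, threshold_pct=15.0):
    """Return the compounds whose CV across QC profiles exceeds the threshold.

    `intensities` maps a compound label to its intensities in all QC profiles.
    """
    flagged = {}
    for compound, values in intensities.items():
        cv = coefficient_of_variation(values)
        if cv > threshold_pct:
            flagged[compound] = cv
    return flagged


if __name__ == "__main__":
    # Made-up intensities for two hypothetical reference compounds.
    demo = {
        "compound_1": [90.0, 100.0, 110.0],
        "compound_2": [50.0, 100.0, 150.0],
    }
    print(flag_unstable_compounds(demo))
```

With real data, the input would hold one list of 28 intensities per reference compound, one value for each MM8 profile.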

Exemplary annotation of Marchantia polymorpha profile

With known accurate masses (m/z values) and calculated retention time values (see methods), we confirm the annotation of many known compounds which are described in literature for the model species Marchantia polymorpha[27,28] (Fig. 6). Many of these known compounds also constitute the most abundant features in the profile of M. polymorpha (Fig. 6).
Figure 6

Total Ion Count (TIC) chromatogram obtained from the extracts of Marchantia polymorpha.

This exemplary chromatogram was obtained from the third sample of summer. Values of the retention times (RT values), accurate masses and sum formulas are available in Table 3 (available online only).

We have implemented the computational workflow in the Galaxy workflow management system[33] and have made the workflow and underlying code available as Open Source[36]. The Galaxy workflow represents the entire computational processing pipeline that is used in the Peters et al. study[9] (Fig. 1). Each individual module represents a particular step in the workflow and has defined inputs (e.g. a pre-processed peak table data matrix) and outputs (e.g. a PDF containing the plot of a particular statistical method) (Fig. 1). We used data standards and minimum information criteria for constructing the modules of the workflow[20,22]. Continuous automatic testing of the workflow was performed with wft4galaxy[35] in the PhenoMeNal e-infrastructure (https://phenomenal-h2020.eu) to ensure that the workflow generates the same results in different computational environments.

We followed the FAIR guiding principles[34] in order to implement a reusable computational workflow. The acronym FAIR stands for Findable, Accessible, Interoperable and Reusable and encompasses several criteria to support the reuse of scholarly data. So far, the FAIR guidelines have only been aimed at making data reusable. However, as the conceptual formulations within FAIR are quite general[37], these principles can also be applied to computational workflows. Nonetheless, there are some computational challenges involved. For example, software runs in different software environments and software dependencies need to be resolved. We tackle this by creating software containers which can be run on multiple systems and contain the software tools, all required libraries and R packages[32,38]. As dependencies in the container have already been resolved, sharing the container image makes it much easier to run the software in multiple environments.

We have chosen the Galaxy workflow management system[33,39] to implement the whole data processing pipeline (Fig. 1) as it is already known to facilitate reproducible results[40]. Several processing modules were constructed that represent the individual steps of the Peters et al. study[9]. Software tools are invoked from the Galaxy modules and are executed inside the container, which adds a level of encapsulation and eliminates the need for the user to install additional software[41]. Galaxy has a graphical user interface that hides the technical complexity from the end user, so that running the particular modules and workflows does not require intensive bioinformatics background knowledge. This greatly contributes to the adoption by the end users (biochemists and ecologists) and facilitates future studies in the research field of Eco-Metabolomics.

Statistical analyses

With untargeted metabolomics analysis in ecology, diversity analysis is typically used to characterize the richness and the abundance of biochemical features in the metabolite profiles of biological species[42]. Metabolite richness is a simple measure that counts the individual biochemical features in the metabolite profiles of the species[43]. The abundance of features in the metabolite profiles is usually summarized by diversity indices such as the Shannon diversity index (H’) in order to characterize simple relationships with regard to the study factors[44]. Ordination methods such as Redundancy Analysis (RDA) and distance-based Redundancy Analysis (dbRDA) are frequently used in ecology[30]. They allow correlations of specific variables to be derived between the matrix of predictors containing the measurements (X matrix) and the response matrix with the ecological traits (Y matrix)[30,45]. These methods are also suitable for Eco-Metabolomics data as they allow the use of multiple (non-categorical) variables in a single model and allow the amount of variance explained by the model to be calculated. We have chosen dbRDA, which can also be regarded as a constrained version of metric multidimensional scaling (MDS)[46,47]. We have implemented dedicated modules for these statistical operations in our computational workflow (see Methods section and Fig. 1).
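To make the two diversity measures concrete, the sketch below computes metabolite richness and the Shannon diversity index H’ = -Σ p_i ln(p_i) for a single profile, where p_i is the share of feature i in the total intensity. It is a plain-Python illustration and not the code of the workflow modules, which are based on R; the function names and the use of the natural logarithm are assumptions.

```python
import math


def richness(intensities, threshold=0.0):
    """Count the features whose intensity lies above the threshold."""
    return sum(1 for value in intensities if value > threshold)


def shannon_index(intensities):
    """Shannon diversity H' of one profile, using the natural logarithm.

    Features with zero intensity are treated as absent and are skipped.
    """
    present = [value for value in intensities if value > 0]
    total = sum(present)
    if total == 0:
        return 0.0
    return -sum((value / total) * math.log(value / total) for value in present)


if __name__ == "__main__":
    profile = [1200.0, 0.0, 300.0, 4500.0, 0.0, 800.0]  # made-up intensities
    print("richness:", richness(profile))
    print("Shannon H':", round(shannon_index(profile), 4))
```

For a profile in which all present features have the same intensity, H’ equals the natural logarithm of the richness, which is the maximum value for that number of features.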

Usage Notes

Building the container image

The following instructions describe how to build the container image manually. The file Dockerfile in Table 1 (available online only) contains the ruleset. The container has been built using Docker version 17.05-ce under Linux Ubuntu 16.04. The following commands were run to generate the image:

sudo apt-get install apt-transport-https ca-certificates git
sudo echo "deb http://apt.dockerproject.org/repo ubuntu-xenial main" >>/etc/apt/sources.list
sudo apt-key adv --keyserver hkp://ha.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
sudo apt-get update && sudo apt-get install docker
git clone https://github.com/korseby/container-mtbls520
cd container-mtbls520
docker build -t korseby/mtbls520 .

Installing and using Galaxy to run the workflow

The workflow was tested with Galaxy version 17.09. Instructions on how to install Galaxy can be found in the training material of the Galaxy project (accessible at https://galaxyproject.github.io/training-material/). However, it is recommended that an official Galaxy server is used, such as those from the PhenoMeNal infrastructure (available at https://public.phenomenal-h2020.eu/). After logging into Galaxy, click on “Workflow” in the menu bar at the top and then on the “Upload” button, which opens a new page. In the field “Galaxy workflow URL:”, enter the address “https://raw.githubusercontent.com/korseby/container-mtbls520/develop/galaxy/mtbls520_workflow.ga” or upload the .ga file from the GitHub repository (Table 1 (available online only)), and then click on the button “Import”. This imports the workflow of the study into Galaxy, where it becomes available under Workflows as “Metabolights 520 Eco-Metabolomics Workflow”. The drop-down menu of the workflow offers the options “Edit” (view the complete workflow in the Galaxy workflow editor) and “Run”. Required data can be downloaded from MetaboLights with the Galaxy module “mtbls520_01_mtbls_download” (Table 1 (available online only)). Once the download has been completed, data can be extracted with the Galaxy module “mtbls520_02_extract” (Table 1 (available online only)). The workflow can be run directly once the inputs have been assigned to the extracted data files. Processing takes approx. 40 min, depending on the workload of the computational infrastructure.
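Before importing the .ga file it can be useful to see which steps it contains. Galaxy workflow files in the native format are JSON documents with a top-level "steps" object keyed by step index. The sketch below lists the steps of such a document. The function name is hypothetical, and the embedded document is a hand-written toy example that only borrows the two module names mentioned above; it is not the content of the published workflow file.

```python
import json


def list_workflow_steps(ga_text):
    """Return (index, name, tool_id) for every step of a Galaxy .ga document."""
    workflow = json.loads(ga_text)
    steps = workflow.get("steps", {})
    ordered = sorted(steps.items(), key=lambda item: int(item[0]))
    return [
        (int(index), step.get("name") or "", step.get("tool_id"))
        for index, step in ordered
    ]


# Toy document for illustration only; the tool_id values are placeholders.
EXAMPLE_GA = """
{
  "a_galaxy_workflow": "true",
  "name": "toy example",
  "steps": {
    "1": {"name": "mtbls520_02_extract", "tool_id": "placeholder_extract"},
    "0": {"name": "mtbls520_01_mtbls_download", "tool_id": "placeholder_download"}
  }
}
"""

if __name__ == "__main__":
    for index, name, tool_id in list_workflow_steps(EXAMPLE_GA):
        print(index, name, tool_id)
```

To inspect the real workflow, the text of the downloaded mtbls520_workflow.ga file would be passed to the function instead of the toy document.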

Additional information

How to cite this article: Peters, K. et al. Computational workflow to study the seasonal variation of secondary metabolites in nine different bryophytes. Sci. Data 5:180179 doi: 10.1038/sdata.2018.179 (2018). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table 3

Some putatively annotated compounds in the metabolite profile “pos_27_Marpol_1-D.6_01_5707” of Marchantia polymorpha in the summer season

Columns are given for the compound name, sum formula, mass-to-charge ratio (m/z), retention time (RT) in seconds and area under curve (integrated intensity values).

Compound | Sum formula | m/z | RT [s] | Area (Integrated Intensity)
(-)-delta-Cuparenol | C15H22O | 219.1747 | 661.0 | 642907.19
(-)-delta-Cuparenol | C15H22O | 241.1559 | 655.3 | 27828.00
(-)-delta-Cuparenol | C15H22O | 241.1559 | 644.1 | 167619.28
(-)-delta-Cuparenol | C15H22O | 219.1745 | 637.5 | 808057.69
(-)-delta-Cuparenol | C15H22O | 219.1743 | 690.7 | 537473.44
(-)-delta-Cuparenol | C15H22O | 203.1794 | 672.8 | 167255.34
(-)-delta-Cuparenol | C15H22O | 219.1743 | 706.7 | 44150.91
(-)-delta-Cuprenene | C15H24 | 205.1952 | 712.0 | 1508130.50
(-)-delta-Cuprenene | C15H24 | 205.1949 | 764.1 | 63622.58
(-)-delta-Cuprenene | C15H24 | 205.1950 | 757.5 | 58668.16
(-)-delta-Cuprenene | C15H24 | 205.1949 | 743.7 | 819416.63
(-)-delta-Cuprenene | C15H24 | 205.1952 | 735.2 | 1419703.50
(-)-delta-Cuprenene | C15H24 | 205.1952 | 726.2 | 1152596.13
(-)-delta-Cuprenene | C15H24 | 205.1951 | 693.4 | 78010.95
(-)-delta-Cuprenene | C15H24 | 205.1950 | 687.3 | 318694.16
(-)-delta-Cuprenene | C15H24 | 205.1949 | 679.5 | 128213.59
(-)-delta-Cuprenene | C15H24 | 205.1949 | 672.0 | 223135.42
2,7-Dihydroxy-3-methoxyphenanthrene | C15H12O3 | 241.0859 | 442.5 | 1805637.88
2-Hydroxy-3,6-dimethoxyphenanthrene | C16H14O3 | 479.0818 | 266.3 | 6064.95
2-Hydroxy-3,6-dimethoxyphenanthrene | C16H14O3 | 463.0869 | 277.9 | 81793.41
3-Hydroxy-2,5-dimethoxyphenanthrene | C16H14O3 | 479.0818 | 266.3 | 6064.95
3-Hydroxy-2,5-dimethoxyphenanthrene | C16H14O3 | 463.0869 | 277.9 | 81793.41
8-Hydroxyluteolin 8-glucuronide | C21H18O13 | 479.0816 | 267.1 | 70267.63
Apigenin 7-galacturonide | C21H18O11 | 447.0919 | 298.3 | 77469.98
Apigenin 7-galacturonide | C21H18O11 | 447.0919 | 306.8 | 1109165.63
Aureusidin 6-glucuronide | C21H18O12 | 463.0872 | 318.1 | 695087.38
Aureusidin 6-glucuronide | C21H18O12 | 463.0862 | 313.2 | 18427.22
Aureusidin 6-glucuronide | C21H18O12 | 463.0868 | 304.4 | 22474.41
Aureusidin 6-glucuronide | C21H18O12 | 463.0865 | 295.0 | 180278.44
Aureusidin 6-glucuronide | C21H18O12 | 463.0870 | 277.4 | 1091277.38
Isoscutellarein 8-glucuronide | C21H18O12 | 463.0872 | 318.1 | 695087.38
Isoscutellarein 8-glucuronide | C21H18O12 | 463.0868 | 304.4 | 22474.41
Isoscutellarein 8-glucuronide | C21H18O12 | 463.0865 | 295.0 | 180278.44
Isoscutellarein 8-glucuronide | C21H18O12 | 463.0862 | 313.2 | 18427.22
Isoscutellarein 8-glucuronide | C21H18O12 | 463.0870 | 277.4 | 1091277.38
Lunularic Acid | C15H14O4 | 241.0859 | 442.5 | 807506.38
Lunularin | C16H18O2 | 356.2576 | 447.1 | 39223.60
Lunularin | C16H18O2 | 434.1947 | 454.5 | 844.54
Lunularin | C16H18O2 | 241.0859 | 442.0 | 690.39
Luteolin 7,3-diglucuronide | C27H26O18 | 639.1183 | 232.1 | 98583.70
Luteolin 7,3-diglucuronide | C27H26O18 | 639.1185 | 253.6 | 50941.54
Luteolin 7,3-diglucuronide | C27H26O18 | 639.1181 | 263.0 | 424450.06
Luteolin 7,3-diglucuronide | C27H26O18 | 624.3007 | 226.1 | 7404.98
Marchantin A | C28H24O5 | 441.1694 | 603.5 | 253393.97
Marchantin A | C28H24O5 | 441.1700 | 592.9 | 2877328.75
Marchantin B | C28H24O6 | 457.1643 | 533.1 | 687680.19
Marchantin C | C28H24O4 | 425.1747 | 663.1 | 281065.75
Marchantin G | C28H22O6 | 455.1486 | 548.4 | 47325.94
Marchantin G | C28H22O6 | 356.2577 | 446.7 | 13174.81
Marchantin G | C28H22O6 | 340.2637 | 483.8 | 15443.49
Marchantin G | C28H22O6 | 312.1601 | 516.7 | 82309.63
Marchantin G | C28H22O6 | 457.1643 | 533.5 | 26383.32
Marchantin K | C29H26O7 | 487.1737 | 553.7 | 29407.29
Paleatin A | C29H28O6 | 473.1951 | 562.8 | 69794.13
Perrottetin E | C28H26O4 | 427.1900 | 631.1 | 145353.53
Pheophorbide A | C35H36N4O5 | 593.2746 | 944.1 | 338536.13
Pheophorbide A | C35H36N4O5 | 593.2753 | 919.3 | 2613247.25
Thujopsenone | C15H22O | 219.1743 | 690.7 | 536620.25
Thujopsenone | C15H22O | 219.1747 | 661.0 | 641821.44
Thujopsenone | C15H22O | 219.1743 | 706.7 | 44027.77
Thujopsenone | C15H22O | 219.1745 | 637.5 | 806036.38
Thujopsenone | C15H22O | 241.1559 | 644.1 | 166482.11
Thujopsenone | C15H22O | 241.1559 | 655.3 | 27266.09
Thujopsenone | C15H22O | 203.1794 | 672.8 | 166351.58
Thunberginol A | C15H10O5 | 271.0598 | 409.0 | 469594.28
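A putative annotation by accurate mass can be checked with simple arithmetic: compute the monoisotopic mass from the sum formula, add the mass of a proton for the [M+H]+ ion and express the difference from the observed m/z in parts per million (ppm). The sketch below does this for one row of Table 3. It assumes a singly protonated ion, which the table does not state; several rows clearly correspond to other ion species such as adducts or in-source fragments and will not match under this assumption. Function names are hypothetical and only the elements C, H, N and O are covered.

```python
import re

# Monoisotopic masses of the most abundant isotopes (in u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727646  # mass of a proton (hydrogen atom minus one electron)


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple sum formula such as 'C21H18O11'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in MONOISOTOPIC:
            raise ValueError(f"element not covered by this sketch: {element}")
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def protonated_mz(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion (an assumption)."""
    return monoisotopic_mass(formula) + PROTON


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error of an observed m/z in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    # Apigenin 7-galacturonide, C21H18O11, observed at m/z 447.0919 (Table 3).
    theoretical = protonated_mz("C21H18O11")
    print(round(theoretical, 4), round(ppm_error(447.0919, theoretical), 2))
```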
  29 in total

1.  Reproducible research in computational science.

Authors:  Roger D Peng
Journal:  Science       Date:  2011-12-02       Impact factor: 47.728

2.  CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets.

Authors:  Carsten Kuhl; Ralf Tautenhahn; Christoph Böttcher; Tony R Larson; Steffen Neumann
Journal:  Anal Chem       Date:  2011-12-12       Impact factor: 6.986

Review 3.  Comparative cryptogam ecology: a review of bryophyte and lichen traits that drive biogeochemistry.

Authors:  Johannes H C Cornelissen; Simone I Lang; Nadejda A Soudzilovskaia; Heinjo J During
Journal:  Ann Bot       Date:  2007-03-12       Impact factor: 4.357

4.  Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences.

Authors:  Jeremy Goecks; Anton Nekrutenko; James Taylor
Journal:  Genome Biol       Date:  2010-08-25       Impact factor: 13.583

5.  IPO: a tool for automated optimization of XCMS parameters.

Authors:  Gunnar Libiseller; Michaela Dvorzak; Ulrike Kleb; Edgar Gander; Tobias Eisenberg; Frank Madeo; Steffen Neumann; Gert Trausinger; Frank Sinner; Thomas Pieber; Christoph Magnes
Journal:  BMC Bioinformatics       Date:  2015-04-16       Impact factor: 3.169

6.  Are the metabolomic responses to folivory of closely related plant species linked to macroevolutionary and plant-folivore coevolutionary processes?

Authors:  Albert Rivas-Ubach; José A Hódar; Jordi Sardans; Jennifer E Kyle; Young-Mo Kim; Michal Oravec; Otmar Urban; Alex Guenther; Josep Peñuelas
Journal:  Ecol Evol       Date:  2016-06-02       Impact factor: 2.912

7.  wft4galaxy: a workflow testing tool for galaxy.

Authors:  Marco Enrico Piras; Luca Pireddu; Gianluigi Zanetti
Journal:  Bioinformatics       Date:  2017-12-01       Impact factor: 6.937

Review 8.  Current Challenges in Plant Eco-Metabolomics.

Authors:  Kristian Peters; Anja Worrich; Alexander Weinhold; Oliver Alka; Gerd Balcke; Claudia Birkemeyer; Helge Bruelheide; Onno W Calf; Sophie Dietz; Kai Dührkop; Emmanuel Gaquerel; Uwe Heinig; Marlen Kücklich; Mirka Macel; Caroline Müller; Yvonne Poeschl; Georg Pohnert; Christian Ristok; Victor Manuel Rodríguez; Christoph Ruttkies; Meredith Schuman; Rabea Schweiger; Nir Shahaf; Christoph Steinbeck; Maria Tortosa; Hendrik Treutler; Nico Ueberschaar; Pablo Velasco; Brigitte M Weiß; Anja Widdig; Steffen Neumann; Nicole M van Dam
Journal:  Int J Mol Sci       Date:  2018-05-06       Impact factor: 5.923

9.  Highly sensitive feature detection for high resolution LC/MS.

Authors:  Ralf Tautenhahn; Christoph Böttcher; Steffen Neumann
Journal:  BMC Bioinformatics       Date:  2008-11-28       Impact factor: 3.169

10.  The FAIR Guiding Principles for scientific data management and stewardship.

Authors:  Mark D Wilkinson; Michel Dumontier; I Jsbrand Jan Aalbersberg; Gabrielle Appleton; Myles Axton; Arie Baak; Niklas Blomberg; Jan-Willem Boiten; Luiz Bonino da Silva Santos; Philip E Bourne; Jildau Bouwman; Anthony J Brookes; Tim Clark; Mercè Crosas; Ingrid Dillo; Olivier Dumon; Scott Edmunds; Chris T Evelo; Richard Finkers; Alejandra Gonzalez-Beltran; Alasdair J G Gray; Paul Groth; Carole Goble; Jeffrey S Grethe; Jaap Heringa; Peter A C 't Hoen; Rob Hooft; Tobias Kuhn; Ruben Kok; Joost Kok; Scott J Lusher; Maryann E Martone; Albert Mons; Abel L Packer; Bengt Persson; Philippe Rocca-Serra; Marco Roos; Rene van Schaik; Susanna-Assunta Sansone; Erik Schultes; Thierry Sengstag; Ted Slater; George Strawn; Morris A Swertz; Mark Thompson; Johan van der Lei; Erik van Mulligen; Jan Velterop; Andra Waagmeester; Peter Wittenburg; Katherine Wolstencroft; Jun Zhao; Barend Mons
Journal:  Sci Data       Date:  2016-03-15       Impact factor: 6.444
