Literature DB >> 25977786

Metabolic differences in ripening of Solanum lycopersicum 'Ailsa Craig' and three monogenic mutants.

Stephan Beisken¹, Mark Earll², Charles Baxter², David Portwood², Zsuzsanna Ament², Aniko Kende², Charlie Hodgman³, Graham Seymour³, Rebecca Smith³, Paul Fraser⁴, Mark Seymour², Reza M Salek¹, Christoph Steinbeck¹.

Abstract

Application of mass spectrometry enables the detection of metabolic differences between groups of related organisms. Differences in the metabolic fingerprints of wild-type Solanum lycopersicum and three monogenic mutants, ripening inhibitor (rin), non-ripening (nor) and Colourless non-ripening (Cnr), of tomato are captured with regard to ripening behaviour. A high-resolution tandem mass spectrometry system coupled to liquid chromatography produced a time series of the ripening behaviour at discrete intervals with a focus on changes post-anthesis. Internal standards and quality controls were used to ensure system stability. The raw data of the samples and reference compounds including study protocols have been deposited in the open metabolomics database MetaboLights via the metadata annotation tool Isatab to enable efficient re-use of the datasets, such as in metabolomics cross-study comparisons or data fusion exercises.

Entities: Chemical

Mesh：

Year: 2014 PMID： 25977786 PMCID： PMC4322568 DOI： 10.1038/sdata.2014.29

Source DB: PubMed Journal: Sci Data ISSN： 2052-4463 Impact factor: 6.444

Background & Summary

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) enables capturing of metabolic snapshots at discrete time intervals[1]. Models can be built inspecting changes in metabolic trajectories over time for a group of closely related organisms distinguished by the characteristic properties under investigation[2]. The ripening behaviour of Solanum lycopersicum and three ripening-inhibited mutants was explored using a LC-MS/MS assay (Table 1). The plant material was harvested in five to ten day intervals up to the point of ripening and daily after. Each genotype was grown in triplicate. Each block of 52 plant samples was measured using a group-randomization setup with interspersed quality controls: blank injections (solvent), pooled samples (mix, see table below), and aliquots of a pooled in-house standard tomato reference (see Methods). Randomization groups were defined by day of harvest.

Table 1

Summary of study samples.

Name	Group	Description
The table contains the sample name, its group, and description.
AC⁺⁺	Wild type (WT)	Ailsa Craig variety
NOR	Monogenic mutant	Ailsa Craig near isogenic lines containing the Non-ripening mutation
RIN	Monogenic mutant	Ailsa Craig near isogenic line containing the ripening inhibitor mutation
CNR	Monogenic mutant	Ailsa Craig near isogenic line containing the Colourless non-ripe mutation
Mix	n/a	Pooled sample of AC⁺⁺, NOR, RIN, CNR, and TomQC
TomQC	n/a	Standard in-house tomato aliquots
Blank	n/a	Blank sample with solvent

The study resulted in a metabolic data set about the ripening behaviour of wild-type and mutant tomato, which completely abolish the normal ripening process (Data Citation 1)[3]. A total of 219 and 225 samples (including controls) were acquired in positive and negative ion mode respectively. Quality measures including positive and negative control samples as well as representative internal standards were taken into account during study design, facilitating data analysis and enabling filtering of unintended biological variation in the data. A total of 58 distinct reference standards relevant to plant metabolism were measured on the same instrumental setup to aid metabolite identification (Data Citation 2) and consequently model validation via biological interpretation. Tomato is a model system to study ripening in fleshy fruits[4]. Many single gene mutants are well described, making it a suitable target to investigate various biological processes, including ripening, and integrate these[5]. Understanding metabolic changes underlying factors of ripening are essential to exploit regulatory mechanisms and improve fruit crop. Building on research of ethylene regulation in ripening, changes in metabolites in this climacteric fruit are of particular interest because of the link between ethylene biosynthesis and central metabolic pathways[6]. By making this data set publically available, it lends itself to applications in the biochemistry of fruit ripening, metabolic cross-study comparisons or investigations into reproducibility of data analysis in metabolomics. The presence of standard reference files and complex biological data files—acquired on the same LC-MS/MS system under identical conditions—also make it highly useful for exercises around data analysis and interpretation. Other studies about the ripening behaviour of tomato include the multi-platform metabolomics analysis by Carrari et al. [7], Perez-Fons et al. [8] as well as others[9-129-129-129-12] that could be used for combined analysis.

Methods

Plant material

Wild-type Solanum lycopersicum (Ailsa Craig, AC++) and three ripening inhibited AC++ mutants were used in this study: ripening inhibitor (rin), non-ripening (nor) and Colourless non-ripening (Cnr) mutations. The plants were grown in 24 cm-diameter pots in M3 compost (Levington Horticulture, Ipswich, and Suffolk, UK) and watered daily under standard greenhouse conditions. Developing fruit were sampled in five to ten day intervals (10, 15, 20, 30, 40 days post-anthesis) and daily from Breaker (49, 50, 51, 52, 53, 54, 55, 56 days post-anthesis). Breaker fruit were defined as those showing the first signs of ripening-associated colour change from green to orange. Non-ripe mutants were taken at day 49 as equivalents to breaker WT fruits. All plant samples were taken at the same time each day, frozen in liquid nitrogen, and stored at −70 °C until required.

Sample preparation

Stock standard solutions were prepared for the analytical reference standards at a concentration of 1,000 μg/ml in 20/80 HPLC analytical grade Ethanol/Water and then diluted 10× for injection. Tomato samples were subjected to an untargeted metabolite analysis by LC-MS/MS of polar extracts. Approximately 30 mg of dried tomato tissue was extracted with HPLC analytical grade Ethanol/Water 20:80. The polar extracts were diluted 10:1 with water and injected underivatised. Samples were acquired in three batches on 17, 20, and 21 September 2010 in positive ion mode and on 23, 24, and 27 September 2010 in negative ion mode.

Chromatography

All samples were run on a Waters Acquity® UPLC system (Waters Corporation, USA), HSS T3 150×2 mm, 1.7 μm particles, UPLC column at 30 °C oven temperature. The solvents used for the assay consisted of 0.2% Formic Acid (Solvent A) and 98/2/0.2 Acetonitrile/Water/Formic Acid (Solvent B). Gradient [time (min)/%B] starting at flow rate 0.25 ml/min: 2.5/0, 7.5/10 (flow rate to 0.4 ml/min), 10.0/100, 12.0/100, 18.0/0, 25.0/0. Aliquots of 2 μl were injected.

Mass spectrometry

The compounds were detected using a Thermo LTQ Velos Orbitrap mass spectrometer operating in positive and negative Electrospray ionization (ESI) mode at a resolution of 30,000 with a scan range from 85–900 m/z and 95–900 m/z respectively. MS/MS spectra were obtained in a data dependent manner through higher-energy collisional dissociation (HCD, normalized collision energy: 50.0) at a resolution of 7,500: The two most intense mass spectral peaks detected in each scan were fragmented to give MS2 spectra (100–900 m/z). Full scan data was acquired in FT (accurate mass) mode, MS/MS spectra were acquired in centroid mode. The LTQ Velos Orbitrap used the Xcalibur control software version 2.1.0 for data acquisition. Reference standards were acquired using the same protocol and experimental setting.

Reference and internal standards

Reference standards were commercially purchased from Fluka Analytical, Sigma-Aldrich, and C/D/N Isotopes or prepared in-house. Internal standards for reference tomato aliquots comprised the following (final concentration): Citric acid-d4 (1,000 μg/ml), L-Alanine-d4 (200 μg/ml), Glutamic acid-d5 (200 μg/ml), and L-Phenyl-alanine-d5 (100 μg/ml). Pooled in-house standard tomato reference was prepared from the shop bought Angelle variety: mashed up in bulk and aliquoted out following the protocol outlined above.

Data processing and transformation

Non-targeted LC-MS/MS vendor raw data files were converted to the open source format mzML using the program ProteoWizard[13]. Vendor-based peak picking was enabled for MS1. The resulting mzML files were subsequently processed with MassCascade[14] in KNIME[15], the open source Konstanz Information Miner environment: features were extracted with ±5 p.p.m. mass accuracy, smoothed using a third order polynomial and deconvoluted using a modified Bieman algorithm[16]. Noise reduction was firstly carried out by removing ion signals that are not consistent across six adjacent scans[17] and secondly via a Durbin-Watson criterion set to a threshold of 2.8[18]. The workflow including all settings is available at MyExperiment[19] under this article’s title. Obiwarp[20] was used for cross-sample alignment before signals below an intensity threshold of 10,000 were filtered out. The maximum percentage of missing values per group and feature was set to 10% to account for limits in detection while ensuring that features used for statistical analysis are at least consistently present in 90% of the samples. Lower and higher percentages of missing values were found to decrease and increase the number of features respectively, without significantly affecting subsequent data validation. Overall, differences between groups were less pronounced with an allowed missingness of 0% and above a missingness value of 20%, most likely due to an increase in the sparsity of the feature matrix. Gaps in the resulting sample by feature matrix were either backfilled or—if missing—imputed using a readily available PCA-based approach[21]. In contrast to naïve methods (e.g., mean replacement), PCA-based gap filling takes the natural variance of the data into account and has been shown to give good results[22]. Statistical analysis was carried out in the statistical programming environment R, version 3.0.2. Total signal intensity normalization and Pareto scaling were used for data pre-treatment.

Data Records

All samples used in this study have been submitted to MetaboLights[23] at the European Bioinformatics Institute (EMBL-EBI). Each MetaboLights entry contains protocols about sample collection, extraction, chromatography, mass spectrometry, metabolite identification, and data transformation. The study was metadata tagged using the Investigation/Study/Assay (ISA) suite[24]. ISA uses tab-separated text files to store the experimental information. In addition, identified metabolites were stored in mzTab[25] compatible tab separated files provided by the MetaboLights’s Isacreator plug-in extension.

Data record 1

Data Citation 1 contains the study samples: 442 LC-MS/MS files (.mzML, 64-bit) acquired in continuous mode: 219 in positive and 223 in negative ion mode. Consistent file names are composed of: ___.

Data record 2

Data Citation 2 contains the reference standards: 71 LC-MS/MS files (.mzML, 64-bit) acquired in continuous mode: 43 in positive and 28 in negative ion mode. Chemical names are used as file names and linked to the ChEBI database[26]. Additionally, putative in silico generated fragment structures for MS2 are collated in individual mzTab files for 34 out of 43 samples acquired in positive ion mode to aid interpretation. No MS2 spectra are available for the remaining 9 samples.

Technical Validation

In mass spectrometry experiments, acquired data needs to be checked for unintended systematic and random factors that may distort analysis. These factors can result from the biological or experimental level and can be detected through good experimental design. Quality controls consisting of blanks, mixed samples, and standard tomato samples were used to assess and to validate the experimental design. The stability of the experimental system was inspected using principal component analysis (PCA) on the processed sample by feature matrices of the two data sets, acquired in positive and negative ion modes (Fig. 1). PCA is a well-established technique that can be used to provide a first overview of the data with regard to questions related to high variance, e.g., sample clusters, trends, and outliers. In all of the models the total signal is normalized and aligned signals Pareto scaled. As shown in Fig. 1, quality control samples cluster distinctively, indicating absence of significant non-biological induced variation in the study. Nota bene, PCA of the batches in the absence of quality controls shows that samples acquired on 17 September show strong batch variation compared to samples acquired on 20 and 21 September; this is very common and expected[27,28].

Figure 1

Principal Component Analysis of the data sets acquired in positive (a) and negative (b) ion mode. The first two principal components are shown. The data are coloured by group. Total signal normalization and Pareto scaling was applied to both data sets. Quality controls—blanks, mixed, and aliquots of reference standard tomato samples—cluster distinctively outside the main group. Blanks are well outside Hotelling’s T2 confidence region (95%, indicated as grey circle).

Ion traces from internal standards were extracted from reference tomato aliquots after data processing and used to validate the consistency of the data set. Tables 2 and 3 summarize the results of the ‘positive’ and ‘negative’ data set. L-Alanine-d4 was not ionized in negative ion mode and consequently not detected in the negative mode.

Table 2

Summary of Internal standards in the ‘positive’ data set.

Name	PubChem ID	[M+H] ⁺	Retention time	Average abundance	Standard deviation
The chemical name, PubChem compound identifier, main adduct mass-to-charge ratio, retention time, averaged abundance, and standard deviation is shown for the four deuterated standards.
Citric acid-d4	16213286	197.0594	244 s	0.3172	0.061
L-Alanine-d4	12205373	94.0804	84 s	0.7830	0.087
Glutamic acid-d5	56845948	153.0919	91 s	0.4229	0.368
L-Phenyl-alanine-d5	13000995	171.1175	414 s	7.0474	0.543

Table 3

Summary of Internal standards in the ‘negative’ data set.

Name	PubChem ID	[M+H] ⁺	Retention time	Average abundance	Standard deviation
The chemical name, PubChem compound identifier, main adduct mass-to-charge ratio, retention time, averaged abundance, and standard deviation is shown for the three deuterated standards. L-Alanine-d4 was not detected in negative ion mode.
Citric acid-d4	16213286	195.0449	260 s	0.0183	0.005
Glutamic acid-d5	56845948	151.0771	91 s	0.4402	0.028
L-Phenyl-alanine-d5	13000995	169.1033	417 s	0.0613	0.022

The consistency of the retention times and abundances of the internal standards across samples—together with the clusters observed in the PCA—indicate little technical sample-to-sample variation[29].

Usage Notes

This data set can be used for data-driven cross-tool comparisons in metabolomics, investigating the effects of data processing, e.g., peak picking and deconvolution, and analysis of statistical models and subsequent biological interpretation. Open source tools and databases can be used to support and demonstrate efforts within the metabolomics community, driven by COSMOS[30] as well as recent initiatives in the metabolomics society. Such a comparison can include feature extraction and cross-sample alignment by tools such as MzMine2[31], MassCascade, and XCMS[32] and subsequent exploration of interpretability and predictability of multivariate models build on different data processing approaches and tools. Additionally, effects of data pre-treatment methods such as scaling and normalization could be investigated with regard to metabolite ranking on biological importance according to developed models used. With regard to ripening, the monogenic mutants (nor, rin and Cnr) can be compared to the wild-type AC++ to identify and investigate metabolites correlated to differences between the genotypes. This comparative analysis can involve Partial Least Squares Discriminant Analysis (PLS-DA) and S-plots to highlight statistically reliable features that are strongly correlated with the models between the individual mutants and wild-type. Knowledge of identified features, i.e. metabolites, is essential for metabolic modelling and exploration of regulatory mechanisms. The data set can also contribute to the generation of a standard metabolome of tomatoes: submission of high quality data sets of key organisms such as Solanum lycopersicum into public databases, enable the community to gradually collect and subsequently discover the metabolome of a species. This study features metabolomics data from four different varieties of tomato that are semantically annotated, complete, and fully publicly available.

Additional information

How to cite this article: Beisken, S. et al. Metabolic differences in ripening of Solanum lycopersicum ‘Ailsa Craig’ and three monogenic mutants. Sci. Data 1:140029 doi: 10.1038/sdata.2014.29 (2014).

27 in total

1. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification.

Authors: Colin A Smith; Elizabeth J Want; Grace O'Maille; Ruben Abagyan; Gary Siuzdak
Journal: Anal Chem Date: 2006-02-01 Impact factor: 6.986

Review 2. Mass spectrometry-based metabolomics towards understanding of gene functions with a diversity of biological contexts.

Authors: Haitao Lv
Journal: Mass Spectrom Rev Date: 2012-08-13 Impact factor: 10.946

3. Analytical error reduction using single point calibration for accurate and precise metabolomic phenotyping.

Authors: Frans M van der Kloet; Ivana Bobeldijk; Elwin R Verheij; Renger H Jellema
Journal: J Proteome Res Date: 2009-11 Impact factor: 4.466

Review 4. Liquid chromatography-mass spectrometry in metabolomics research: mass analyzers in ultra high pressure liquid chromatography coupling.

Authors: Sara Forcisi; Franco Moritz; Basem Kanawati; Dimitrios Tziotis; Rainer Lehmann; Philippe Schmitt-Kopplin
Journal: J Chromatogr A Date: 2013-04-11 Impact factor: 4.759

5. Targeted systems biology profiling of tomato fruit reveals coordination of the Yang cycle and a distinct regulation of ethylene biosynthesis during postclimacteric ripening.

Authors: Bram Van de Poel; Inge Bulens; Aikaterina Markoula; Maarten L A T M Hertog; Rozemarijn Dreesen; Markus Wirtz; Sandy Vandoninck; Yasmin Oppermann; Johan Keulemans; Ruediger Hell; Etienne Waelkens; Maurice P De Proft; Margret Sauter; Bart M Nicolai; Annemie H Geeraerd
Journal: Plant Physiol Date: 2012-09-13 Impact factor: 8.340

Review 6. Tackling the widespread and critical impact of batch effects in high-throughput data.

Authors: Jeffrey T Leek; Robert B Scharpf; Héctor Corrada Bravo; David Simcha; Benjamin Langmead; W Evan Johnson; Donald Geman; Keith Baggerly; Rafael A Irizarry
Journal: Nat Rev Genet Date: 2010-09-14 Impact factor: 53.242

7. The tomato FRUITFULL homologs TDR4/FUL1 and MBP7/FUL2 regulate ethylene-independent aspects of fruit ripening.

Authors: Marian Bemer; Rumyana Karlova; Ana Rosa Ballester; Yury M Tikunov; Arnaud G Bovy; Mieke Wolters-Arts; Priscilla de Barros Rossetto; Gerco C Angenent; Ruud A de Maagd
Journal: Plant Cell Date: 2012-11-06 Impact factor: 11.277

8. MassCascade: Visual Programming for LC-MS Data Processing in Metabolomics.

Authors: Stephan Beisken; Mark Earll; David Portwood; Mark Seymour; Christoph Steinbeck
Journal: Mol Inform Date: 2014-04-22 Impact factor: 3.353

9. A genome-wide metabolomic resource for tomato fruit from Solanum pennellii.

Authors: Laura Perez-Fons; Tom Wells; Delia I Corol; Jane L Ward; Christopher Gerrish; Michael H Beale; Graham B Seymour; Peter M Bramley; Paul D Fraser
Journal: Sci Rep Date: 2014-01-24 Impact factor: 4.379

10. KNIME-CDK: Workflow-driven cheminformatics.

Authors: Stephan Beisken; Thorsten Meinl; Bernd Wiswedel; Luis F de Figueiredo; Michael Berthold; Christoph Steinbeck
Journal: BMC Bioinformatics Date: 2013-08-22 Impact factor: 3.169

4 in total

1. Laser microdissection of tomato fruit cell and tissue types for transcriptome profiling.

Authors: Laetitia B B Martin; Philippe Nicolas; Antonio J Matas; Yoshihito Shinozaki; Carmen Catalá; Jocelyn K C Rose
Journal: Nat Protoc Date: 2016-11-03 Impact factor: 13.491

2. Metabolic profilings of rat INS-1 β-cells under changing levels of essential amino acids.

Authors: Lianbin Xu; Xueyan Lin; Xiuli Li; Zhiyong Hu; Qiuling Hou; Yun Wang; Zhonghua Wang
Journal: Sci Data Date: 2022-06-14 Impact factor: 8.501

3. TOMATOMET: A metabolome database consists of 7118 accurate mass values detected in mature fruits of 25 tomato cultivars.

Authors: Takeshi Ara; Nozomu Sakurai; Shingo Takahashi; Naoko Waki; Hiroyuki Suganuma; Koichi Aizawa; Yasuki Matsumura; Teruo Kawada; Daisuke Shibata
Journal: Plant Direct Date: 2021-04-29

Review 4. The value of universally available raw NMR data for transparency, reproducibility, and integrity in natural product research.

Authors: James B McAlpine; Shao-Nong Chen; Andrei Kutateladze; John B MacMillan; Giovanni Appendino; Andersson Barison; Mehdi A Beniddir; Maique W Biavatti; Stefan Bluml; Asmaa Boufridi; Mark S Butler; Robert J Capon; Young H Choi; David Coppage; Phillip Crews; Michael T Crimmins; Marie Csete; Pradeep Dewapriya; Joseph M Egan; Mary J Garson; Grégory Genta-Jouve; William H Gerwick; Harald Gross; Mary Kay Harper; Precilia Hermanto; James M Hook; Luke Hunter; Damien Jeannerat; Nai-Yun Ji; Tyler A Johnson; David G I Kingston; Hiroyuki Koshino; Hsiau-Wei Lee; Guy Lewin; Jie Li; Roger G Linington; Miaomiao Liu; Kerry L McPhail; Tadeusz F Molinski; Bradley S Moore; Joo-Won Nam; Ram P Neupane; Matthias Niemitz; Jean-Marc Nuzillard; Nicholas H Oberlies; Fernanda M M Ocampos; Guohui Pan; Ronald J Quinn; D Sai Reddy; Jean-Hugues Renault; José Rivera-Chávez; Wolfgang Robien; Carla M Saunders; Thomas J Schmidt; Christoph Seger; Ben Shen; Christoph Steinbeck; Hermann Stuppner; Sonja Sturm; Orazio Taglialatela-Scafati; Dean J Tantillo; Robert Verpoorte; Bin-Gui Wang; Craig M Williams; Philip G Williams; Julien Wist; Jian-Min Yue; Chen Zhang; Zhengren Xu; Charlotte Simmler; David C Lankin; Jonathan Bisson; Guido F Pauli
Journal: Nat Prod Rep Date: 2018-07-13 Impact factor: 13.423

4 in total