
Metabolite annotations based on the integration of mass spectral information.

Yoko Iijima, Yukiko Nakamura, Yoshiyuki Ogata, Ken'ichi Tanaka, Nozomu Sakurai, Kunihiro Suda, Tatsuya Suzuki, Hideyuki Suzuki, Koei Okazaki, Masahiko Kitayama, Shigehiko Kanaya, Koh Aoki, Daisuke Shibata.

Abstract

A large number of metabolites are found in each plant, most of which have not yet been identified. Development of a methodology is required to deal systematically with unknown metabolites, and to elucidate their biological roles in an integrated 'omics' framework. Here we report the development of a 'metabolite annotation' procedure. The metabolite annotation is a process by which structures and functions are inferred for metabolites. Tomato (Solanum lycopersicum cv. Micro-Tom) was used as a model for this study using LC-FTICR-MS. Collected mass spectral features, together with predicted molecular formulae and putative structures, were provided as metabolite annotations for 869 metabolites. Comparison with public databases suggests that 494 metabolites are novel. A grading system was introduced to describe the evidence supporting the annotations. Based on the comprehensive characterization of tomato fruit metabolites, we identified chemical building blocks that are frequently found in tomato fruit tissues, and predicted novel metabolic pathways for flavonoids and glycoalkaloids. These results demonstrate that metabolite annotation facilitates the systematic analysis of unknown metabolites and biological interpretation of their relationships, which provide a basis for integrating metabolite information into the system-level study of plant biology.

Year:  2008        PMID: 18266924      PMCID: PMC2440531          DOI: 10.1111/j.1365-313X.2008.03434.x

Source DB:  PubMed          Journal:  Plant J        ISSN: 0960-7412            Impact factor:   6.417


Introduction

Large-scale biology studies supported by high-throughput data acquisition technologies require a method to bridge the gap between the data obtained and their biological interpretation. In genomics, without an analytical method to define genes, the nucleotide sequence of a whole genome is merely a series of letters (Ashburner, 2000). Using the process of annotation, by which information about the location and the number of genes and the functions of encoded proteins is inferred, researchers obtain biological meaning from the genome sequence (Stein, 2001). Metabolomics researchers are currently experiencing a similar situation to that which faced early genomics researchers. Recent progress in data acquisition technologies such as chromatography-coupled mass spectrometry has facilitated simultaneous detection and quantification of a large number of metabolite-derived peaks (Hall, 2006). However, the data obtained by high-throughput MS are merely a series of peaks without metabolite assignment. At this stage in metabolomics research, most of the peaks detected using MS cannot be assigned to identified metabolites. Such peaks are labeled as ‘unknown’ and usually are not characterized further. Thus the limited capability for metabolite identification has been one of the major obstacles in metabolomics (Kind and Fiehn, 2006; Wagner ). One approach to overcoming this obstacle is to quantify all detected peaks and compile them as un-annotated variables (Bino ; Roessner ; Schauer ). This approach, non-targeted metabolic profiling, is frequently combined with statistical correlation analysis to hypothesize biological roles for the detected metabolites (Carrari ; Schauer ).
Another approach to overcoming the obstacle is to create a comprehensive dataset of plant metabolites by compiling various pieces of chemical information, as has been done for human metabolites (Smith ), and to provide annotations for the metabolites. FTICR-MS is a promising candidate technology to achieve this goal. FTICR-MS measurement provides mass values with very high accuracy and resolution. This technology has been employed for non-targeted analyses of metabolites, and has demonstrated its advantage in detecting differentially expressed metabolites (Aharoni ; Murch ; Oikawa ). However, despite many technical advantages, FTICR-MS has a drawback in that it is incapable of separating isomers that have the same elemental compositions. It has been demonstrated recently that coupling of liquid chromatography to FTICR-MS facilitates the effective separation of isomers (Suzuki ). However, a comprehensive metabolite dataset using chromatography-coupled FTICR-MS has not yet been produced.
In the present study, we propose a procedure for metabolite annotation using the data obtained by high-performance LC-FTICR-MS. Tomato (Solanum lycopersicum cv. Micro-Tom) fruit was analyzed as a model plant for two reasons. First, tomato contains a number of secondary metabolites that are not present in other model plants such as Arabidopsis and rice. Second, a tomato genome sequencing project is currently underway (Mueller ) that will allow interpretation of metabolite data in conjunction with annotated gene functions. Tomato metabolite data were collected in a non-targeted manner. We then compiled a dataset comprising mass spectral features including retention time, UV/visible absorption spectrum, m/z value, m/z value of the MS/MS fragment, and relative intensity of the MS/MS fragment. These mass spectral features were attached as annotations to individual metabolites. This information allowed us to provide annotations of predicted molecular formulae for 869 metabolites. Comparison with public databases suggests that 494 of the metabolites are novel. Additionally, MS/MS fragmentation profile data allowed provision of annotations for a number of secondary metabolites with known chemical structures.
We constructed a web-based database compiling the metabolite annotations (http://webs2.kazusa.or.jp/komics/). Based on comprehensive characterization of tomato fruit metabolites, we identified chemical building blocks that appear frequently in the tomato fruit tissues. We also assigned several unknown flavonoids and glycoalkaloids to novel metabolic pathways based on the annotations of putative structures. These results demonstrate that metabolite annotation allows us to systematically analyze unknown metabolites and facilitates biological interpretation of their roles in metabolic processes.

Results

Procedure of metabolite annotation

We developed a procedure to organize MS data in a metabolite-oriented manner, which hereafter is referred to as a metabolite annotation procedure. The procedure comprises eight sequential steps. First, the whole raw data set comprising data from successive mass scans was exported as a text file (Figure 1a). Second, the observed m/z values of mass signals were calibrated with those of internal standards detected in the same scan (Oikawa ) (Figure 1b). After internal standard calibration, errors in m/z values decreased to less than 1 ppm (Table S1). Third, we grouped mass signals with the same m/z value detected in consecutive scans into what is hereafter referred to as a ‘peak group’ (Figure 1c). An accurate m/z value for each peak group was calculated as the mean of the m/z values for the mass signals with the highest intensities (for details, see Experimental procedures). Fourth, we searched for pairs of peak groups that had m/z intervals (Δ) of 1.0033 and 1.9958 to identify 12C/13C1 isotopic peak pairs and 32S/34S1 isotopic peak pairs, respectively (Figure 1d). A peak group for the quasi-molecular ion accompanied by isotopic peaks was regarded as an individual ‘metabolite’. Fifth, molecular formulae were predicted from the accurate m/z values of the metabolites (Figure 1e). To avoid obtaining obviously unnatural formulae, we surveyed elemental compositions in the DNP database (Dictionary of Natural Products). 
Although the results for such a survey have been reported previously (Kind and Fiehn, 2007), we checked the maximum element numbers within our mass scan range (50–1500 Da). Our survey demonstrated that 95.65% of the DNP compounds (186 788 compounds in a range 50–1500 Da) consist of C, H, N, O, P and S within the ranges C 1–95, H 1–182, N 0–10, O 1–45, P 0–6 and S 0–5. Thus, we set these as upper limits for elemental compositions in the molecular formula calculations. Sixth, we narrowed down the number of candidate formulae using the relative intensity of the 13C1 and 34S1 isotopic ions (Figure 1f). A particular advantage of LC-FTICR-MS is that the resolution is high enough to separate the 34S1 isotopic ion from the 13C2 isotopic ion. Thus, we could use the relative intensity of the 34S1 isotopic ion as a constraint for the number of sulfur atoms. Seventh, we manually performed the isotopic peak group assignment and in-source fragment peak group assignment (Figure 1g). Assignment of the peak groups composed of adduct ions was also performed manually in this step. After these manual curation processes, metabolites were finally designated as ‘annotated metabolites’. In the eighth step, the mass spectral features (including retention time, m/z value, m/z value of the MS/MS fragment, relative intensity of the MS/MS fragment and UV/visible absorption spectrum) and database search results were attached to each metabolite as annotations (Figure 1h). All of the steps, except the manual curation process, are computerized. The annotated metabolites were classified using an annotation grading system (Figure 2, see Experimental procedures).
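
The formula prediction in the fifth step can be illustrated with a minimal sketch. It assumes the input is a neutral monoisotopic mass (converting an ion m/z to a neutral mass depends on the adduct and is not shown), uses standard monoisotopic atomic masses, and applies only the element ranges and the 1 ppm error figure quoted above. The function names are hypothetical, and no chemical plausibility rules or isotope-intensity filtering are applied, so this is not the authors' software.

```python
# Monoisotopic masses in Da (standard values, not taken from the paper).
MONO_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
             "O": 15.9949146, "P": 30.9737620, "S": 31.9720707}

# Element-count ranges quoted in the text (from the DNP survey).
LIMITS = {"C": (1, 95), "H": (1, 182), "N": (0, 10),
          "O": (1, 45), "P": (0, 6), "S": (0, 5)}


def formula_mass(counts):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO_MASS[el] * n for el, n in counts.items())


def candidate_formulae(neutral_mass, tol_ppm=1.0):
    """Enumerate CHNOPS formulae within tol_ppm of a neutral mass."""
    tol = neutral_mass * tol_ppm * 1e-6
    upper = neutral_mass + tol
    hits = []
    for c in range(LIMITS["C"][0], LIMITS["C"][1] + 1):
        m_c = c * MONO_MASS["C"]
        if m_c > upper:
            break
        for n in range(LIMITS["N"][0], LIMITS["N"][1] + 1):
            m_n = m_c + n * MONO_MASS["N"]
            if m_n > upper:
                break
            for o in range(LIMITS["O"][0], LIMITS["O"][1] + 1):
                m_o = m_n + o * MONO_MASS["O"]
                if m_o > upper:
                    break
                for p in range(LIMITS["P"][0], LIMITS["P"][1] + 1):
                    m_p = m_o + p * MONO_MASS["P"]
                    if m_p > upper:
                        break
                    for s in range(LIMITS["S"][0], LIMITS["S"][1] + 1):
                        m_s = m_p + s * MONO_MASS["S"]
                        if m_s > upper:
                            break
                        # Fill the remaining mass with the nearest whole
                        # number of hydrogens, then check the tolerance.
                        h = round((neutral_mass - m_s) / MONO_MASS["H"])
                        if not LIMITS["H"][0] <= h <= LIMITS["H"][1]:
                            continue
                        total = m_s + h * MONO_MASS["H"]
                        if abs(total - neutral_mass) <= tol:
                            hits.append({"C": c, "H": h, "N": n,
                                         "O": o, "P": p, "S": s})
    return hits
```

Heavier targets generally return more candidates, which is where the isotope-intensity screening of the sixth step would narrow the list.
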
Figure 1

Schematic flow of the metabolite annotation procedure. (a) Raw data acquisition. (b) m/z calibration with internal standards. (c) Extraction of peak groups. (d) Isotopic ion assignment. (e) Molecular formula calculation. (f) Molecular formula screening using the relative intensity of isotopic ions. (g) Manual curation of isotopic, fragment and adduct peak assignment. (h) Provision of metabolite annotations. This procedure aims to identify a putative ‘metabolite’, which is defined as a group of mass signals that are detected in consecutive scans to form a peak group, accompanied by isotopic ions.
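
Steps (d) and (f) can be sketched as follows, under stated assumptions. The m/z intervals are the ones quoted in the text. The matching tolerance of 0.001, the 13C natural abundance of about 1.07% and the function names are illustrative choices not given in the paper. The sketch also ignores retention time, whereas real isotopic partners would need to co-elute.

```python
from bisect import bisect_left, bisect_right

# m/z intervals quoted in the text for singly charged ions.
ISOTOPE_DELTAS = {"13C1": 1.0033, "34S1": 1.9958}

# Approximate natural abundance of 13C (standard value, an assumption here).
C13_ABUNDANCE = 0.0107


def find_isotope_pairs(peak_mzs, tol=0.001):
    """Return (monoisotopic m/z, isotopic m/z, label) for each matching pair."""
    mzs = sorted(peak_mzs)
    pairs = []
    for mz in mzs:
        for label, delta in ISOTOPE_DELTAS.items():
            # Binary search for peaks inside the expected m/z window.
            lo = bisect_left(mzs, mz + delta - tol)
            hi = bisect_right(mzs, mz + delta + tol)
            pairs.extend((mz, partner, label) for partner in mzs[lo:hi])
    return pairs


def estimate_carbon_count(ratio_m1_to_m0):
    """Estimate the carbon count from the 13C1 / monoisotopic intensity ratio."""
    per_carbon = C13_ABUNDANCE / (1.0 - C13_ABUNDANCE)
    return round(ratio_m1_to_m0 / per_carbon)
```

A carbon count estimated this way could then be compared with each candidate formula, which is the idea behind screening by the relative intensity of isotopic ions.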

Figure 2

Annotation grading system. Metabolite annotations were classified according to the evidence that supports the annotations. Grade A consists of metabolites with annotations supported by comparison with authentic compounds. Grade B consists of metabolites with a single molecular formula. Grade C consists of metabolites with multiple molecular formulae. Grades B and C were divided into eight sub-grades according to the availability of MS/MS, λmax and reference information.
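
The grade assignment described in this caption can be written as a small decision function. This is a sketch with hypothetical names. It also assumes the eight sub-grades correspond to the eight combinations of three yes/no evidence types, which the caption suggests but does not spell out, and it does not reproduce the sub-grade labels.

```python
def annotation_grade(confirmed_with_standard, n_candidate_formulae):
    """Return 'A', 'B' or 'C' following the caption, or None if no formula."""
    if confirmed_with_standard:
        return "A"  # supported by comparison with an authentic compound
    if n_candidate_formulae == 1:
        return "B"  # a single molecular formula
    if n_candidate_formulae > 1:
        return "C"  # multiple molecular formulae
    return None     # no formula, so not an annotated metabolite


def evidence_flags(has_msms, has_lambda_max, has_reference):
    """Three yes/no evidence types give 2**3 = 8 possible combinations."""
    return (bool(has_msms), bool(has_lambda_max), bool(has_reference))
```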


Number of annotated metabolites in tomato fruit

We applied the metabolite annotation procedure to the MS data obtained from eight different tomato fruit tissues, comprising peel and flesh at the mature green, breaker, turning and red stages. The number of detected mass signals ranged from 12 498 to 70 278 (Table 1). On average, 14.0 ± 3.6 mass signals were combined into one peak group. In both positive- and negative-ionization modes, 21 ± 1.7% of the peak groups were consistently assigned with the isotopic ions and recognized as metabolites. After manual curation, 57 ± 7.9% of the metabolites were provided with molecular formula annotations and designated as annotated metabolites. After removing the redundancy across samples, the total number of annotated metabolites was 869 (Table S2).
Table 1

The numbers of mass signals, peak groups, metabolites and annotated metabolites in tomato fruits

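Stage | Tissue | Ionization mode | Number of mass signals (a) | Number of peak groups (a) | Number of metabolites (a) | Number of annotated metabolites | Total number of annotated metabolites in each tissue (b) | Annotation grade A | Annotation grade B | Annotation grade C
Mature green | Flesh | Positive | 30 412 ± 3069 | 1470 ± 155 | 306 ± 35 | 154 | 267 | 13 | 146 | 108
Mature green | Flesh | Negative | 17 292 ± 1483 | 1673 ± 102 | 305 ± 22 | 167 | | | |
Mature green | Peel | Positive | 42 734 ± 5067 | 2311 ± 260 | 479 ± 69 | 228 | 368 | 18 | 184 | 166
Mature green | Peel | Negative | 20 769 ± 2938 | 1925 ± 226 | 397 ± 51 | 228 | | | |
Breaker | Flesh | Positive | 28 782 ± 8835 | 1729 ± 271 | 357 ± 96 | 182 | 291 | 15 | 166 | 110
Breaker | Flesh | Negative | 15 853 ± 4078 | 1604 ± 311 | 308 ± 66 | 168 | | | |
Breaker | Peel | Positive | 43 462 ± 9540 | 2621 ± 379 | 636 ± 119 | 250 | 440 | 23 | 236 | 181
Breaker | Peel | Negative | 32 675 ± 4440 | 2733 ± 376 | 602 ± 85 | 295 | | | |
Turning | Flesh | Positive | 24 353 ± 6111 | 1680 ± 58 | 352 ± 26 | 188 | 284 | 15 | 158 | 111
Turning | Flesh | Negative | 12 498 ± 4924 | 1239 ± 460 | 251 ± 134 | 156 | | | |
Turning | Peel | Positive | 63 258 ± 6645 | 3495 ± 348 | 784 ± 112 | 358 | 611 | 26 | 329 | 256
Turning | Peel | Negative | 39 274 ± 3449 | 3187 ± 364 | 676 ± 79 | 402 | | | |
Red | Flesh | Positive | 28 109 ± 1791 | 1700 ± 132 | 353 ± 42 | 179 | 263 | 18 | 147 | 98
Red | Flesh | Negative | 13 808 ± 4403 | 1444 ± 414 | 266 ± 64 | 144 | | | |
Red | Peel | Positive | 70 278 ± 3619 | 4305 ± 288 | 1039 ± 77 | 445 | 696 | 29 | 372 | 295
Red | Peel | Negative | 55 429 ± 2452 | 4723 ± 301 | 1026 ± 68 | 428 | | | |

The total and the grade counts are given once per tissue and cover both ionization modes.

(a) Numbers indicate means ± SD of three measurements.

(b) Total numbers of non-redundant annotated metabolites detected in positive- and negative-ionization modes.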

Only 3.6% of the metabolites were identified by comparison with authentic compounds (grade A, Table 1). Database searches in the DNP, KNApSAcK (Oikawa ), Kyoto Encyclopedia of Genes and Genomes (KEGG) (Goto ) and MotoDB (Moco ) revealed that 494 of the annotated metabolites were not present in the databases, suggesting that they are novel metabolites. The complete set of LC-FTICR-MS data and metabolite annotations is accessible at http://webs2.kazusa.or.jp/komics/.

Qualitative analysis of metabolite composition

Based on the metabolite annotations (Table S2), we investigated the distribution of mass differences between metabolites. Given that a metabolite is generated from a pre-existing metabolite by substitution of chemical building blocks, mass differences may provide insights into the types of reactions that have occurred between two metabolites. The distribution of Δ[m/z] values showed ‘spikes’, demonstrating that certain Δ[m/z] values occurred more frequently than others (Figure 3; the threshold probability to identify Δ[m/z] spikes was determined as described in Figure S1). The Δ[m/z] spike profiles seen in tomato fruit samples were different from those of 10 743 compounds containing C, H and O listed in KEGG (Goto ) (Figure 3c; for a complete list of the compounds, see Table S3). This demonstrates that the Δ[m/z] spikes have a sample-specific profile. The Δ[m/z] spikes that occurred in the tomato samples are listed in Table S4.
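
The Δ[m/z] distribution can be sketched as a histogram of all pairwise mass differences. The 0.001 Da resolution and the 500 Da range follow the Figure 3 legend. The function name is hypothetical, and normalizing by the number of pairs is only one possible reading of P(Δ[m/z]). The spike threshold of Figure S1 is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def delta_mz_distribution(mzs, decimals=3, max_delta=500.0):
    """Map each rounded pairwise m/z difference to its relative frequency."""
    counts = Counter()
    for low, high in combinations(sorted(mzs), 2):
        delta = high - low
        if delta <= max_delta:
            # Rounding to 3 decimals bins the differences at 0.001 Da.
            counts[round(delta, decimals)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {delta: n / total for delta, n in counts.items()}
```

With a few hundred metabolites per sample this all-pairs loop is cheap; a difference that recurs far more often than its neighbours would be a candidate spike.
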
Figure 3

Examples of the distribution of Δ[m/z] values in the 0–200 Da range at 0.001 Da intervals. Actual calculation of Δ[m/z] values was performed in the 500 Da range. Δ[m/z] values were calculated to obtain insights into the chemical building blocks that occur frequently in a set of metabolites. Δ[m/z] values calculated from m/z values detected in positive-ionization mode from (a) peel at the turning stage (TP) and (b) flesh at the turning stage (TF), and (c) from the theoretical molecular weight of KEGG CHO compounds (KEGG-CHO). Closed arrowheads indicate Δ[m/z] spikes that were detected in all three sample types (TP, TF and KEGG-CHO). Open arrowheads indicate Δ[m/z] spikes that were observed specifically in tomato samples TP and TF. Arrows indicate Δ[m/z] spikes that were observed specifically in KEGG-CHO. P(Δ[m/z]) indicates the probability of the occurrence of Δ[m/z] values.

We then checked whether Δ[m/z] spikes were generated from biologically relevant metabolite pairs, i.e. that Δ[m/z] values were produced in combinations that reflect reaction relationships. This was achieved by inspecting the MS/MS fragmentation data (available at http://webs2.kazusa.or.jp/komics/). Biologically relevant metabolite pairs were screened according to two criteria. First, the Δ[m/z] value observed between the metabolites must be observed in more than one pair of MS/MS fragments. Second, metabolite pairs must have more than one identical MS/MS fragment. The relative intensity of the MS/MS fragment ions was not taken into account. For example, Figure 4 shows the MS/MS spectra of a pair of metabolites with m/z values of 1372.5 (Figure 4a) and 1210.5 (Figure 4b), with a Δ[m/z] value of 162.053 between the fragments. In addition, several common fragments were detected in the MS/MS spectra of these two metabolites. Thus, the pair is regarded as biologically relevant. 
We manually inspected the MS/MS spectra of all 2722 metabolite pairs that contributed to the formation of Δ[m/z] spikes, and found that approximately 37% were biologically relevant (Table S4). Further screening for biologically relevant metabolite pairs was performed by inspecting annotations of putative structures and database hits to determine whether occurrence of a Δ[m/z] value was possible based on knowledge of the biochemical reactions. The Δ[m/z] values with the highest percentages of relevant metabolite pairs include those corresponding to the chemical building blocks C3H7NO2S (121.020), caffeic acid (162.032), hexose (162.053 and 162.054), malonic acid (86.001) and the amino group (17.027) (Table 2). The Δ[m/z] spike profiles show tissue- and ripening stage-dependent differences (Figure S2). To confirm the ripening stage-dependent changes, Δ[m/z] values between metabolites in two consecutive stages were analyzed (for details, see Experimental procedures). The analysis indicated that addition of chemical building blocks such as an amino group, caffeic acid, a C3H7NO2S moiety or hexose occurred frequently during ripening. According to the annotations of putative structure and database hits, these chemical building blocks are frequently associated with secondary metabolism.
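
The two MS/MS screening criteria described above were applied by manual inspection in the study; the sketch below only illustrates how they might be expressed in code. The function names, the 0.005 matching tolerance and the treatment of Δ[m/z] as an absolute difference are assumptions.

```python
def count_shared_fragments(frags_a, frags_b, tol=0.005):
    """Number of fragments in spectrum A that also occur in spectrum B."""
    return sum(any(abs(a - b) <= tol for b in frags_b) for a in frags_a)


def count_delta_pairs(frags_a, frags_b, delta, tol=0.005):
    """Number of cross-spectrum fragment pairs separated by delta."""
    return sum(1 for a in frags_a for b in frags_b
               if abs(abs(a - b) - delta) <= tol)


def is_biologically_relevant(frags_a, frags_b, delta, tol=0.005):
    """Apply both criteria: more than one delta pair and more than one shared fragment."""
    return (count_delta_pairs(frags_a, frags_b, delta, tol) > 1
            and count_shared_fragments(frags_a, frags_b, tol) > 1)
```

Fragment intensities are ignored here, as they were in the screening described above.
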
Figure 4

An example of the MS/MS spectra comparison to confirm biological relevance of Δ[m/z] values. MS/MS spectra of metabolite ID 275 (a) and metabolite ID 379 (b). The MS/MS spectral data for metabolite ID 275 and metabolite ID 379 are provided at http://webs2.kazusa.or.jp/komics/. Comparison of (a) and (b) demonstrates that a Δ[m/z] value between the two metabolites was observed in a pair of MS/MS fragments (m/z 1372.5 and m/z 1210.5), and that there are several MS/MS fragments with identical m/z values, suggesting that the Δ[m/z] observed between metabolite ID 275 and ID 379 is biologically relevant.

Table 2

Biologically relevant Δ[m/z] spikes estimated by inspection of MS/MS spectra, putative structures and database hits

Δ[m/z] value | MS/MS inspection: relevant (%) | MS/MS inspection: not relevant (%) | MS/MS inspection: no MS/MS (%) | Elemental composition difference (c) | Putative chemical building blocks
121.020 | 97.3 | 0.0 | 2.7 | C3H7NO2S | Addition of C3H7NO2S
456.149 (a) | 93.8 | 0.0 | 6.2 | C17H28O14 | NS (d)
162.032 | 63.9 | 0.0 | 36.1 | C9H6O3 | Addition of caffeic acid; hydroxylation and addition of coumaric acid
104.048 | 26.7 | 0.0 | 73.3 | C4H8O3 | NS (d)
143.277 | 76.5 | 2.9 | 20.6 | Addition of C12H33N, and deletion of O3 | NS (d)
162.053 (b) | 57.9 | 5.8 | 36.3 | C6H10O5 | Addition of hexose; hydroxylation and addition of deoxyhexose
86.001 | 67.9 | 10.3 | 21.8 | C3H2O3 | Addition of malonic acid
162.054 (b) | 60.0 | 9.1 | 30.9 | C6H10O5 | Addition of hexose; hydroxylation and addition of deoxyhexose
456.148 (a) | 50.0 | 9.1 | 40.9 | C17H28O14 | NS (d)
440.153 | 47.1 | 11.8 | 41.2 | C17H28O13 | NS (d)
17.027 | 45.2 | 14.3 | 40.5 | H3N | Addition of an amino group
42.011 | 33.3 | 14.8 | 51.9 | C2H2O | NS (d)

(a) Assigned to the same elemental composition, respectively.

(b) Assigned to the same elemental composition, respectively.

(c) Elemental composition difference with the highest percentage in all molecular formula combinations.

(d) Not suggested. Known chemical blocks were not suggested by putative structures or database hits.
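
The elemental composition differences in Table 2 can be checked against the Δ[m/z] values with a short calculation. This sketch assumes standard monoisotopic atomic masses and a simple formula syntax (element symbol followed by an optional count, no brackets or charges). The function name is hypothetical.

```python
import re

# Monoisotopic masses in Da (standard values, not taken from the paper).
MONO_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
             "O": 15.9949146, "S": 31.9720707}


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula string such as 'C6H10O5'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means a single atom of that element.
        total += MONO_MASS[element] * (int(count) if count else 1)
    return total


for block in ("C3H7NO2S", "C9H6O3", "C6H10O5", "C3H2O3", "H3N"):
    print(block, round(monoisotopic_mass(block), 3))
```

C9H6O3 and C6H10O5 share the nominal mass 162 but differ by about 0.021 Da, which is why caffeic acid and hexose additions appear as separate spikes. The calculated value for C3H2O3 is about 86.000, one bin away from the tabulated spike at 86.001.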


Secondary metabolites in tomato

In addition to the frequently occurring mass differences, the tomato fruit metabolites analyzed using LC-FTICR-MS include diverse flavonoids and glycoalkaloids. Of the 869 annotated metabolites, 70 and 93 were assigned to the flavonoid and glycoalkaloid groups, respectively. The number of flavonoids increased during ripening (Table S5). In addition, peel tissues contained a larger number of flavonoids than flesh. Four chalcone and flavanone aglycones [naringenin chalcone (NGC), naringenin (NG), eriodictyol (ED) and eriodictyol chalcone (EDC)] and two flavonol aglycones [kaempferol (Kae) and quercetin (Que)] were identified by MS/MS and MS3 fragmentation patterns combined with UV/visible absorption spectra, as reported previously (Bino ; Iijima ). Dehydrokaempferol glycosides, previously identified in other cultivars of tomato (Le Gall ; Moco ), were not detected in the Micro-Tom samples. MS/MS fragmentation patterns of the flavonoids demonstrated the occurrence of various glycosylations and acylations. Flavonoids in the chalcone/flavanone and flavonol groups showed different conjugation patterns. Conjugate moieties of NH3 (m/z 17.027) and C3H7NO2S (m/z 121.020) were associated exclusively with chalcones and flavanones. On the other hand, deoxyhexose, p-coumaroyl hexose and feruloyl hexose were associated exclusively with Kae and Que. Possible pathway relationships for the flavonoids are illustrated based on the putative structures (Figure 5a). The modification pattern observed in the NGC pathway is quite similar to that in the EDC pathway. 
Likewise, the modification patterns observed in pathways starting from Kae and Que are similar to each other. The apparent similarities suggest that regulation of modification reactions may be similar between the NGC and EDC pathways and between the Kae and Que pathways. To test this, we investigated flavonoid levels in fruits of transgenic Micro-Tom lines over-expressing PAP1, an Arabidopsis transcription factor that up-regulates flavonoid pathway genes (Borevitz ). We focused on comparison of the pairs of NGC and EDC derivatives and the pairs of Kae and Que derivatives, each of which has an identical conjugate moiety (numbered metabolites in Figure 5a). The accumulation levels of three pairs of metabolites in the NGC and EDC pathways changed in a highly correlated manner (correlation coefficient >0.6) in PAP1 over-expressing lines (Figure 5b), as did those of six pairs of metabolites in the Kae and Que pathways (Figure 5c). This suggests that pairs of genes responsible for the same modification reactions are coordinately regulated by the over-expression of PAP1. Alternatively, each pair of modifications may be catalyzed by an identical enzyme.
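
The comparison of paired derivatives across the PAP1 lines can be illustrated with a correlation calculation. The text gives only the threshold (correlation coefficient >0.6); the use of Pearson's coefficient and the function names below are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def coordinately_changed(levels_a, levels_b, threshold=0.6):
    """True if two metabolites' levels across lines correlate above threshold."""
    return pearson(levels_a, levels_b) > threshold
```

Each input would hold the relative accumulation level of one metabolite across the control and the over-expressing lines, in the same line order.
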
Figure 5

Reaction and pathway relationships of Micro-Tom flavonoids. (a) Putative metabolic pathway for the flavonoids. Underlined letters indicate metabolites that were not detected in this study. Solid arrows indicate the occurrence of modification between the detected metabolites. Broken arrows indicate possible reactions between detected and non-detected metabolites. Hex, hexose; dHex, deoxyhexose; Glc, glucose; Rut, rutinose; Pen, pentose. (b, c) Correlations between the relative accumulation levels of (b) chalcone/flavanone metabolites and (c) flavonol metabolites in Arabidopsis PAP1-over-expressing tomato fruits (gray bars) in comparison with control fruit (black bars). Lines: C, control; 1–9, independent lines of PAP1-over-expressing Micro-Tom. Metabolites: numbers indicate the metabolites shown in (a) (highlighted by gray shading). CC, correlation coefficient. Means ± SD of three biological repeats are indicated.

Most of the glycoalkaloids annotated in this study (Table S6) appear to be novel, as they were not found in the literature or public databases. The composition of glycoalkaloids showed tissue-dependent differences. Peel contained a larger number of glycoalkaloids than flesh. The composition of glycoalkaloids also appeared to change with ripening. The intensity of the mass peak of tomatine (m/z 1034.55303 [M+H]+) was high in fruits at the mature green and breaker stages, but very weak at the red stage, suggesting that levels of tomatine decreased during ripening. On the other hand, a number of glycoalkaloids that are larger than tomatine were detected at the red stage. 
According to MS data, some of these were assigned as putative intermediate metabolites in the metabolic pathway between tomatine and esculeoside A, the major glycoalkaloid at the red stage (Fujiwara ) (Figure 6). To test whether this pathway is regulated by ripening, we investigated the accumulation levels of the intermediates in fruit tissues (containing both peel and flesh) of non-ripening (nor) and ripening-inhibitor (rin) mutants that do not exhibit ripening-associated ethylene production. The levels of metabolites upstream of C52H85NO24 increased in nor and rin fruits in comparison with wild-type Rutgers, but the level of esculeoside A decreased remarkably (Figure 6). This indicates that the final step of esculeoside A biosynthesis is associated with developmentally regulated ripening events.
Figure 6

Putative metabolic pathway from α-tomatine to esculeoside A. Graphs show the relative abundance of indicated metabolites (gray arrows) in nor and rin mutant fruits (containing both peel and flesh) in comparison with wild-type Rutgers (WT), the background line of the mutants. Means ± SD of three biological repeats are indicated. Esculeoside A was almost absent in the fruits of nor and rin mutants. However, other intermediate glycoalkaloids accumulated at higher levels in nor and rin than WT. The result suggests that the final step of esculeoside A biosynthesis (glycosylation of C52H85NO24) is controlled by developmentally regulated ethylene production.


Discussion

Concept of metabolite annotation

We established a metabolite annotation procedure and constructed a comprehensive metabolite annotation database to organize experimental information obtained by LC-FTICR-MS, using tomato as a model plant species. The term ‘metabolite annotation’ has been proposed previously to describe the process of labeling experiments with biological metadata (such as a description of actual experimental conditions) in order to help unravel the biological role of metabolites based on changes in their levels in response to genetic and environmental perturbation (Fiehn ; Scholz and Fiehn, 2007). Their concept of ‘metabolite annotation’ comprises (i) mass spectral annotation and (ii) biological metadata annotation. In this study, we used the term ‘metabolite annotation’ to describe a procedure by which mass spectral information is provided to individual metabolites, thus our annotation procedure can be classified as mass spectral annotation. The metabolite annotation procedure reported in this study is based on four novel concepts. First, we provided annotations to individual ‘metabolites’. We identified metabolite-representing peaks systematically based on the following criteria: (i) that mass signals were detected in consecutive scans to form a peak group, and (ii) that quasi-molecular ions were accompanied by isotopic ions. Second, we aimed to establish a data-driven annotation protocol for LC-MS-derived data as only a few metabolic profiling methods for LC-MS-derived data have been reported (De Vos ; Smith ). 
This is in contrast to the well-established metabolic profiling methods for GC-MS-derived data (Duran ; Fiehn ; Tikunov ). Third, we provided annotations for non-volatile secondary metabolites that are difficult to detect by GC-MS, which allowed us to explore a diverse range of secondary metabolites. Fourth, we introduced a grading system to describe the experimental evidence by which the annotation was supported. It should be mentioned that the metabolite annotations provided in this study are open to future curation. For example, heuristic rules for filtering molecular formulae have been proposed recently (Kind and Fiehn, 2007). In the current study, we implemented procedures equivalent to element number filtering, LEWIS and SENIOR checks, and isotopic pattern filtering, but did not implement element ratio checks or element probability checks. Thus, curation of molecular formula annotations will be feasible by applying these rules.

Limitation in complete coverage and quantification of metabolites

In this study, tomato fruit tissues were extracted using 75% w/v methanol. This method was suitable for extracting a wide range of secondary metabolites, amino acids, sugars, nucleotides and organic acids, but did not extract non-polar metabolites such as lycopene. This demonstrates that the metabolite composition detected is inevitably biased by the choice of extraction method. Thus, an appropriate combination of multiple extraction methods is needed for complete coverage of metabolites. For comprehensive profiling of the annotated metabolites, quantification depends on the measurement of mass signal intensity. However, differences in the mass signal intensity may be caused by a different degree of ion suppression, a phenomenon by which the intensity of a certain ion is suppressed by the presence of other ions. Even with LC separation prior to MS, several peaks co-eluted in single m/z scans. We performed semi-quantitative analyses of flavonoids and glycoalkaloids based on comparison of the relative mass signal intensities of an identical metabolite across samples (Figures 5b,c and 6). 
To minimize the possibility that mass signal intensity was affected by different degrees of ion suppression, we checked (i) whether the mass signal intensity is proportional to the UV/visible absorbance, (ii) whether the profile of ions co-eluted with the target ion is similar, and (iii) whether ion suppression is observed in the intensity of co-injected internal calibration standards. Further study is needed to estimate the extent to which ion suppression affects the quantification.

Novel metabolites in tomato fruit

Comparison of 869 annotated metabolites with compounds registered in public databases revealed that 494 of the annotated metabolites appear to be novel. Putative structures for the novel metabolites can be predicted from the annotations of MS/MS fragmentation data. This was particularly effective in predicting putative structures of novel flavonoids and glycoalkaloids. In the flavonoid group, an unknown moiety, C3H7NO2S (m/z 121.020), was found as conjugates with NGC, NG and ED. Its predicted molecular formula matched that of cysteine. It has been reported that cysteine forms a conjugate with epicatechin when procyanidins depolymerize in the presence of cysteine (Torres ). However, cysteine conjugates of chalcones and flavanones have not been reported. Structural identification of the moiety will be required to understand the biosynthesis of C3H7NO2S conjugates. Modification of flavonoids has been attracting attention as the biological effects of flavonoid conjugates depend on the nature of the conjugate moieties. The tomato flavonoids found in the present study provide an experimental basis to search for novel functional flavonoids, and to elucidate unknown mechanisms of flavonoid modification. In the glycoalkaloid group, our results indicated the presence of novel glycoalkaloids with m/z values larger than the maximum molecular mass (1271 Da) of tomato glycoalkaloid reported so far (Ono ) (Table S6). Most of these novel glycoalkaloids appeared after the onset of ripening. This suggests that glycoalkaloid metabolism is active during fruit ripening, and that glycoalkaloids play unidentified physiological roles in the ripening fruit. Carotenoids, another major secondary metabolite group in tomato, were not detected under our experimental conditions. 
Development of a metabolite annotation method for MS data obtained in atmospheric pressure photo-ionization mode, which efficiently ionizes non-polar metabolites including carotenoids, is currently underway.

Reaction and pathway relationships

Metabolite annotations aid our understanding of mechanisms controlling metabolism from chemical and biological points of view. From a chemical point of view, metabolite annotations provide detailed chemical information for each metabolite, which will serve as a basis for identifying unknown metabolites. From a biological point of view, metabolite annotations provide a basis for elucidating biological relationships between metabolites, such as reaction and pathway relationships. To obtain insights into reaction relationships between metabolites, we performed mass difference analysis. Several Δ[m/z] values occur frequently in metabolites from tomato fruit, suggesting that chemical building blocks corresponding to those Δ[m/z] values appear frequently in tomato fruit metabolites. It should be emphasized that signal intensities were not taken into consideration in this analysis. Thus, when we state that certain Δ[m/z] values occur frequently, this does not mean that the accumulation levels of these metabolites are high. Nevertheless, mass difference analysis combined with inspection of MS/MS spectra annotations provides an efficient way to study metabolites relating to a reaction of interest. To understand the metabolic pathway relationships between annotated metabolites, we arranged flavonoids detected in this study into metabolic diagrams (Figure 5a). These demonstrate that the modification patterns between the NGC and EDC pathways and between the Kae and Que pathways, respectively, are similar to each other. 
When the flavonoid pathway was up-regulated by over-expression of PAP1, changes in the relative accumulation levels of several pairs of metabolites with identical conjugation patterns were highly correlated (Figure 5b,c). This result demonstrates that genes responsible for each pair of modification reactions are coordinately regulated by PAP1. Alternatively, identical enzymes may use both Kae and Que derivatives as substrates, as reported previously for flavonol glycosyltransferases (Jones ; Yonekura-Sakakibara ). For glycoalkaloids, a biosynthetic pathway from tomatine to esculeoside A (Fujiwara ) was illustrated (Figure 6). By analyzing fruits of nor and rin mutants, we have demonstrated that the reaction step between C52H85NO24 and esculeoside A is regulated by the occurrence of ripening, which is developmentally controlled by NOR and LeMADS-RIN (Giovannoni, 2004). These results demonstrate that the metabolite annotation procedure is a powerful approach for producing hypotheses with respect to unknown metabolic pathways.

Possible link between metabolite annotations and integrated ‘omics’ study

Further insights into the regulation of metabolite biosynthesis will be obtained by the integration of metabolomics data with other ‘omics’ data. A parallel analysis of metabolites and transcripts is a promising approach to achieve this goal (Hirai ; Nikiforova ; Tohge ; Urbanczyk-Wochniak ). Another promising approach involves combination of metabolite analysis with genetic analysis such as quantitative trait loci (QTL) analysis (Keurentjes ; Morreel ; Schauer ). In such approaches, the metabolite annotation plays a complementary role to the metabolic profiling in linking metabolite information to other ‘omics’ information. By contrast to quantitative metabolic profiling, annotations of mass spectral features facilitate qualitative characterization with respect to identity, structural similarity and biochemical relationships between the metabolites. This assists in inference of biological meanings from metabolic profiling combined with other ‘omics’ data. Additionally, new metabolites predicted by the metabolite annotations will be included in multi-‘omics’ pathway tools (Thimm ; Tokimatsu ; Zhang ), and expand our knowledge about unknown metabolic pathways. Metabolite annotations provide firm foundations for integrating chemical information regarding metabolites into a system-level study of plant metabolism.

Experimental procedures

Plant materials

Seeds of cultivated tomato (S. lycopersicum cv. Micro-Tom) were sown in pots (500 ml) filled with a mixture of vermiculite and Powersoil (mix ratio 1:1, Kureha Chemical Industries, http://www.kureha.co.jp/ and Kanto Hiryou Industries, http://www.okumurashoji.co.jp/). Until germination, seeds were covered with plastic film and kept in the dark at 25°C. After 4 days in the dark, they were grown with a photoperiod of 16 h light (80 μmol m−2 s−1)/8 h dark at 25°C. Hyponex® (Hyponex Ltd, http://www.scotts.com/) at 1000-fold dilution was applied to plants once a week. Fruits at the mature green (G, approximately 30 days after anthesis), breaker (B, approximately 35 days after anthesis), turning (T, approximately 38–40 days after anthesis) and red (R, approximately 45–48 days after anthesis) stages were harvested. A vector construct expressing Arabidopsis PAP1 under the control of the CaMV 35S promoter (Tohge ) was provided by K. Saito (Chiba University, Japan). Transformation of Micro-Tom was performed according to the protocol reported previously (Sun ). Seeds of wild-type Rutgers (LA1090) and the nor (LA3013) and rin (LA3012) mutants were obtained from the C.M. Rick Tomato Genetic Resource Center (University of California, Davis, CA, USA).

Metabolite extraction

The peel and the flesh of tomato fruit were separated using a razor blade. Each sample was sliced, immediately frozen in liquid nitrogen and ground to powder using a Shake Master homogenizer (Biomedical Science, http://www.bmsci.com). Powdered samples (50–70 mg) were extracted with three volumes of methanol containing formononetin (20 μg ml−1) as an internal standard. After homogenization using a Mixer Mill MM 300 (Qiagen, http://www.qiagen.com/) at 27 Hz for 2 min twice, homogenates were centrifuged (12 000 , 10 min, 4°C). The supernatant was filtered through 0.2 μm PVDF membrane (Whatman, http://www.whatman.com), and the filtrate was used for LC-FTICR-MS analysis.

LC-FTICR-MS analysis

An Agilent 1100 system (Agilent, http://www.agilent.com) coupled to a Finnigan LTQ-FT (Thermo Fisher Scientific; http://www.thermofisher.com) was used for LC-FTICR-MS analysis. The data were acquired and browsed using Xcalibur software version 2.0 (Thermo Fisher Scientific). Methanol extract was applied to a TSKgel column ODS-100V (4.6 × 250 mm, 5 μm; TOSOH Corporation, http://www.tosoh.com). Water (HPLC grade; solvent A) and acetonitrile (HPLC grade; solvent B) were used as the mobile phase with 0.1% v/v formic acid added to both solvents. The gradient program was as follows: 10% B to 50% B (50 min), 50% B to 90% B (20 min), 90% B (5 min) and 10% B (10 min). The flow rate was set to 0.5 ml min−1, and the column oven temperature was set at 40°C; 20 μl of each sample were injected. To monitor HPLC elution, a photodiode array detector was used in the wavelength range 200–650 nm. The ESI setting was as follows: spray voltage 4.0 kV and capillary temperature 300°C for both positive- and negative-ionization modes. Nitrogen sheath gas and auxiliary gas were set at 40 and 15 arbitrary units, respectively. A full MS scan with internal standards was performed in the m/z range 100–1500 at a resolution of 100 000 (at m/z 400). A mixture of internal calibration standards dissolved in 50% v/v acetonitrile was introduced by a post-column method at a flow rate of 20 μl min−1. 
The concentration of each standard in the mixture was as follows: for positive mode: 10 μM lidocaine (m/z 235.18049 [M+H]+; Sigma-Aldrich, http://www.sigmaaldrich.com/), 5 μM prochloraz (m/z 376.03809 [M+H]+; AccuStandard Inc., http://www.accustandard.com), 1.2 μM reserpine (m/z 609.28066 [M+2H]2+; Sigma-Aldrich), 0.8 μM bombesin (m/z 810.41481 [M+H]+; Sigma-Aldrich), 0.4 μM aureobasidin A (m/z 1123.67778 [M+Na]+; Takara Bio Inc., http://www.takara-bio.com), 22 μM vancomycin (m/z 1448.43747 [M+H]+; MP Biomedicals Inc., http://www.mpbio.com); for negative mode: 11.2 μM 2,4-dichlorophenoxyacetic acid (m/z 218.96212 [M-H]−; Sigma-Aldrich), 3.1 μM ampicillin (m/z 348.10235 [M-H]−, Sigma-Aldrich), 0.25 μM CHAPS (m/z 659.39468 [M+HCOO]−; Sigma-Aldrich), 1.0 μM tetra-N-acetylchitotetraose (m/z 875.32626 [M+HCOO]−; Toronto Research Chemicals, Inc., http://www.trc-canada.com), 0.6 μM aureobasidin A (m/z 1145.68676 [M+HCOO]−, Takara Bio Inc.). MS/MS and MS3 fragmentation were carried out at a normalized collision energy of 35.0% and an isolation width of 4.0 (m/z), and were obtained by ion trap mode. Relative accumulation levels of flavonoids and glycoalkaloids were estimated by dividing the peak area of the metabolite by that of internal standard (formononetin).
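As an illustration of how the calibration m/z values and the relative accumulation levels described above can be computed, the following minimal Python sketch derives the theoretical [M+H]+ value for lidocaine, a ppm mass error, and a peak-area ratio against the internal standard. The function names and the elemental formula of lidocaine (C14H22N2O) are assumptions introduced for illustration and are not taken from the original software; the isotope masses are standard reference values.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton (u)


def protonated_mz(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion.

    formula: dict mapping element symbol to atom count (hypothetical input format).
    """
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_error(observed, theoretical):
    """Mass error of an observed m/z relative to the theoretical m/z, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


def relative_level(peak_area, internal_standard_area):
    """Relative accumulation level: metabolite peak area / internal standard peak area."""
    return peak_area / internal_standard_area


# Lidocaine, C14H22N2O (formula assumed here; the text lists m/z 235.18049 for [M+H]+).
lidocaine = {"C": 14, "H": 22, "N": 2, "O": 1}
lidocaine_mz = protonated_mz(lidocaine)
print(round(lidocaine_mz, 5))
```

The same `ppm_error` helper can be applied to any calibrant in the list to judge how far an observed signal lies from its expected position.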

Chemicals

Authentic naringenin chalcone was generously provided by the Kikkoman Corporation (http://www.kikkoman.com). Esculeosides A and B were kindly provided by T. Nohara and Y. Fujiwara (Kumamoto University, Japan). Other authentic compounds were purchased from EXTRASYNTHESE (http://www.extrasynthese.com), Funakoshi Co. Ltd (http://www.funakoshi.co.jp), Sigma-Aldrich, Tokyo Chemical Industry (http://www.tci-asiapacific.com) and Wako Pure Chemical Industries Ltd (http://www.wako-chem.co.jp/).

Metabolite annotation procedure

A program written in Microsoft VC++ was used to export the raw data (XRAW) file of each single run as a text file. The output file includes retention times, scan numbers, m/z values and their intensities. To discriminate mass signals from baseline noise, mass signals whose intensities were more than three times the baseline level of each scan were selected. Next, m/z values of all ions in each scan were bulk-calibrated with observed m/z values of internal calibration compounds in the same scan using the computational tool DrDMASS (http://kanaya.naist.ac.jp/DrDMASS/, Oikawa ). Using the internally calibrated m/z values, mass signals whose m/z values were obtained in more than 30% of the total mass scans were regarded as artificial noise and excluded from further analyses. After removing noise, all data were collected as a Microsoft Excel file. The quasi-molecular ions detected with a 13C isotopic ion in the scan at an m/z value that was 1.003 greater were selected. After sorting mass signals by scan number, those detected in more than three consecutive scans were selected and grouped. If a peak group consisted of three or four mass signals, an accurate m/z value for the group was obtained as the mean m/z value for the three or four mass signals. If a peak group consisted of five or more mass signals, an accurate m/z value was obtained as the mean m/z value for the five most intense signals. 
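The grouping and averaging steps described above can be sketched as follows. This is a simplified Python illustration, not the authors' VC++ or Excel workflow: it assumes the signals passed in already belong to a single ion trace with one signal per scan, and it takes three consecutive scans as the minimum group size (the `min_scans` parameter), because the text refers to groups of three or four signals. All names and example values are hypothetical.

```python
from statistics import mean


def group_consecutive(signals, min_scans=3):
    """Split one ion trace into peak groups of consecutive scans.

    signals: list of (scan_number, mz, intensity) tuples, one signal per scan.
    Runs shorter than min_scans are discarded.
    """
    groups, current = [], []
    for signal in sorted(signals):
        # A gap in scan numbers closes the current run.
        if current and signal[0] != current[-1][0] + 1:
            if len(current) >= min_scans:
                groups.append(current)
            current = []
        current.append(signal)
    if len(current) >= min_scans:
        groups.append(current)
    return groups


def accurate_mz(group):
    """Mean m/z of a peak group.

    Groups of five or more signals use only the five most intense signals;
    smaller groups use every signal.
    """
    if len(group) >= 5:
        selected = sorted(group, key=lambda s: s[2], reverse=True)[:5]
    else:
        selected = group
    return mean(s[1] for s in selected)


# Hypothetical trace: scans 1-3 form a peak group, scans 7-8 are too short a run.
trace = [(1, 300.0001, 10.0), (2, 300.0002, 50.0), (3, 300.0003, 20.0),
         (7, 300.0000, 5.0), (8, 300.0000, 5.0)]
peak_groups = group_consecutive(trace)
print(len(peak_groups), accurate_mz(peak_groups[0]))
```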
For the peak group whose intensity was more than 1 000 000, m/z values for the highest intensity signals were not used for the mean value calculation. Instead, a mean value was calculated using the m/z values of mass peaks whose intensities were just below 1 000 000. Molecular formulae that matched a given accurate m/z value were determined as follows. A library of molecular formulae with all possible elemental combinations whose theoretical m/z matched the input m/z with 1 ppm tolerance was generated using elements C, H, N, O, P and S. To screen the library for chemically possible molecular formulae, all formulae were tested for whether they met the following criteria (Senior, 1951): (i) the sum of valences is an even number, and (ii) the sum of valences is greater than or equal to twice the number of atoms minus 1. The accurate m/z was used for molecular formula calculation. Upper limits of 95 for C, 182 for H, 10 for N, 45 for O, 6 for P and 5 for S were used for calculation of formulae. In addition, the relative intensity of the 13C1 isotopic ion was calculated. The number of carbons in the molecular formula was estimated using the following equation: where n represents the number of carbons. 
The tolerance for relative intensity was set at 5%. Chemically possible molecular formulae and the relative intensities of the isotope ions were calculated by programs written in Java. The library of molecular formulae was constructed using MySQL. A Java program was developed to search the molecular formula library for molecular formulae matching the criteria described above. Any peak group that is selected based on these criteria is defined as a metabolite. The analysis was repeated three times for each tomato fruit tissue. When a metabolite was detected in two or more repeats, it was regarded as ‘present’ in that tissue. Computational assignment of peak groups of isotopic ions to the parental metabolite was re-checked manually. Assignments of fragment ions and adduct ions to the parental metabolite were performed manually. Peak groups composed of adduct ions produced during ionization were assigned using two criteria as follows. First, it was checked whether the m/z values of ions matched theoretical m/z values of adducts ([M+Na]+, [M+K]+, [M+NH3+H]+, [M+CH3CN+H]+ (Svatos ) and [2M+H]+). Second, retention time was checked to determine whether the adduct ions co-eluted with the proton adduct ion. In negative-ionization ESI mode, formic acid adduct ions ([M+HCOO]−) were frequently produced together with [M-H]− ions, and were assigned using the same criteria. Metabolite annotations were provided for the adduct ion species with the highest intensity, i.e. [M+H]+ and [M-H]− in positive- and negative-ionization ESI modes, respectively, for the majority of the metabolites detected in the present study (Table S2). After these manual curation processes, metabolites were designated as ‘annotated metabolites’.
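The formula-screening criteria described in this section (1 ppm mass tolerance, element upper limits, and the two valence conditions) can be illustrated with a short Python sketch. It is not the authors' Java/MySQL implementation. The valence table is an assumption of this sketch, criterion (ii) is interpreted here as "sum of valences ≥ 2 × (number of atoms − 1)", and the carbon-number estimate uses the generic approximation that the 13C1/monoisotopic intensity ratio is about n × p/(1 − p) with p ≈ 0.0107 (natural 13C abundance), which is not necessarily the equation used in the study.

```python
# Monoisotopic masses (u); standard values for the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620, "S": 31.9720707}
# Valences used for the Senior-type check (assumed for this sketch).
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "P": 3, "S": 2}
# Upper element limits quoted in the text.
LIMITS = {"C": 95, "H": 182, "N": 10, "O": 45, "P": 6, "S": 5}
C13_ABUNDANCE = 0.0107  # approximate natural abundance of 13C


def neutral_mass(formula):
    """Monoisotopic mass of a neutral formula given as {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())


def within_ppm(observed, theoretical, tol_ppm=1.0):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def within_limits(formula):
    """True if no element count exceeds its upper limit."""
    return all(count <= LIMITS[element] for element, count in formula.items())


def passes_valence_rules(formula):
    """Criteria (i) and (ii): even valence sum, and valence sum >= 2 * (atoms - 1)."""
    valence_sum = sum(VALENCE[element] * count for element, count in formula.items())
    atoms = sum(formula.values())
    return valence_sum % 2 == 0 and valence_sum >= 2 * (atoms - 1)


def estimate_carbons(ratio_m1_to_m):
    """Approximate carbon count from the 13C1 / monoisotopic intensity ratio."""
    return round(ratio_m1_to_m * (1 - C13_ABUNDANCE) / C13_ABUNDANCE)


# C3H7NO2S, the moiety discussed in the text (predicted formula matching cysteine).
cysteine = {"C": 3, "H": 7, "N": 1, "O": 2, "S": 1}
print(round(neutral_mass(cysteine), 4), passes_valence_rules(cysteine))
```

A candidate formula would be retained only if `within_ppm`, `within_limits` and `passes_valence_rules` all hold, and if its carbon count agrees with the isotope-based estimate within the chosen intensity tolerance.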

Database construction

For database construction, a dataset comprised of accurate m/z values, predicted molecular formula, retention time, MS/MS data and λmax of the UV/visible absorption spectra was compiled. As MS/MS data, the m/z value, raw intensity and relative intensity of the 20 highest-intensity MS/MS fragment ions were retrieved. References for each annotated metabolite were searched for in the public databases PubChem (http://pubchem.ncbi.nlm.nih.gov/), the Dictionary of Natural Products (http://www.chemnetbase.com/scripts/dnpweb.exe?welcome-main), KNApSAcK (http://kanaya.naist.jp/KNApSAcK/), KEGG (http://www.genome.jp/kegg/kegg2.html) and MotoDB (http://appliedbioinformatics.wur.nl/moto/). To browse and search the annotation information, a web-based database (http://webs2.kazusa.or.jp/komics/) was constructed using MySQL and PHP.

Annotation grading system

To each metabolite, an annotation grade was added to describe the evidence supporting the annotations for that metabolite (Figure 2). First, annotations were classified into two grades (A/B versus C) according to whether a single molecular formula was obtained or not. Grades A and B were further classified according to whether the mass spectral attributes of the metabolites matched those of standard chemicals or not. In grade A, annotations were verified by comparison with standard chemicals. In grade B, annotations were assigned with single molecular formulae but lacked verification by standard chemicals. Annotations in grade B were classified into eight sub-grades according to the availability of MS/MS, λmax and reference information. In grade C, multiple molecular formulae were assigned to each metabolite. Annotations in grade C were classified into eight sub-grades according to the availability of MS/MS and λmax information.
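The top-level decision described above (grade C for multiple candidate formulae, grade A or B for a single formula depending on verification against standard chemicals) can be written as a small Python function. This is a minimal sketch with hypothetical names; the sub-grades are omitted because their labels are not specified in this section.

```python
def annotation_grade(n_formulae, matches_standard):
    """Return the top-level annotation grade 'A', 'B' or 'C'.

    n_formulae: number of molecular formulae assigned to the metabolite (>= 1).
    matches_standard: True if the mass spectral attributes matched those of a
        standard chemical.
    """
    if n_formulae < 1:
        raise ValueError("an annotated metabolite needs at least one candidate formula")
    if n_formulae > 1:
        return "C"  # multiple molecular formulae assigned
    return "A" if matches_standard else "B"


print(annotation_grade(1, True), annotation_grade(1, False), annotation_grade(3, False))
```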

Mass difference analysis

Mass difference values (Δ[m/z]) were calculated for pairwise combinations of m/z values shown in Table S2 at the 0.001 Da interval. Δ[m/z] values were calculated separately for m/z datasets of tomato tissue samples and for m/z datasets obtained in positive- and negative-ionization ESI modes. Δ[m/z] values were calculated in the 500 Da range. To identify Δ[m/z] values that occurred more frequently than others, a threshold probability was determined based on the standard deviation of the probability distribution within each sample. Probabilities of 10-, 20-, 30-, 40-, 50-, 60- and 70-fold standard deviation levels were tested, and the 40-fold standard deviation level was used as the threshold (Figure S1). MS/MS data inspection was performed manually using m/z values for the 20 fragment ions with highest intensity. To match MS/MS fragments between a pair of metabolites, the m/z tolerance was set to 0.1% as MS/MS spectra were obtained by the ion-trap mode, which is less accurate than the FTICR mode. A pairwise difference in elemental composition was calculated based on the molecular formula annotation provided in Table S2. Δ[m/z] spikes between stages were identified using the following criteria: (i) the probability was above the 40-fold standard deviation level, (ii) the frequency of the Δ[m/z] value increased in the later stages, and (iii) the probability of the Δ[m/z] value increased in the later stages. To obtain the chemical information of KEGG compounds, compound files were first retrieved from the KEGG ftp site (ftp://ftp.genome.jp/pub/kegg/ligand/compound/, 9 March 2007), and then compounds containing C, H and O in the molecular formula were selected. Finally, compounds with a non-redundant compound ID were chosen. 
The theoretical molecular weights of KEGG compounds were calculated using accurate masses of the elements C, H, N, O, P and S. Programs for calculating Δ[m/z] and elemental composition difference were written in Perl. The program for the selection of KEGG compounds was written in Java.
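The mass difference analysis described in this section can be approximated by the Python sketch below (the original programs were written in Perl and are not reproduced here). It bins pairwise differences at 0.001 Da within a 500 Da range and flags bins whose probability exceeds a multiple of the standard deviation. Two points are assumptions of this sketch: the standard deviation is taken over all bins in the range, including empty ones, and the threshold is `fold` times that standard deviation. The example m/z values are hypothetical and are spaced by 162.053, close to the mass of a hexose unit (C6H10O5).

```python
import math
from collections import Counter
from itertools import combinations


def delta_mz_counts(mz_values, max_delta=500.0, step=0.001):
    """Count pairwise m/z differences, keyed by integer bin index (delta / step)."""
    counts = Counter()
    for low, high in combinations(sorted(mz_values), 2):
        delta = high - low
        if delta <= max_delta:
            counts[round(delta / step)] += 1
    return counts


def frequent_deltas(counts, max_delta=500.0, step=0.001, fold=40.0):
    """Return {delta: probability} for bins above fold x SD of the distribution."""
    total = sum(counts.values())
    if total == 0:
        return {}
    n_bins = int(round(max_delta / step)) + 1  # all bins in range, including empty ones
    mean_p = 1.0 / n_bins                      # probabilities sum to 1
    sum_sq = sum((c / total) ** 2 for c in counts.values())
    sd = math.sqrt(max(0.0, sum_sq / n_bins - mean_p ** 2))
    threshold = fold * sd
    return {index * step: c / total for index, c in counts.items() if c / total > threshold}


# Hypothetical m/z values forming a series separated by one hexose-like unit.
example_mz = [300.000, 462.053, 624.106, 786.159]
counts = delta_mz_counts(example_mz)
print(counts.most_common(1))
```

With a dataset this small every observed difference exceeds the threshold; the 40-fold level only becomes discriminating when hundreds of annotated m/z values are compared, as in the study.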
  42 in total

1.  Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems.

Authors:  U Roessner; A Luedemann; D Brust; O Fiehn; T Linke; L Willmitzer; A Fernie
Journal:  Plant Cell       Date:  2001-01       Impact factor: 11.277

Review 2.  A biologist's view of the Drosophila genome annotation assessment project.

Authors:  M Ashburner
Journal:  Genome Res       Date:  2000-04       Impact factor: 9.043

Review 3.  Genome annotation: from sequence to biology.

Authors:  L Stein
Journal:  Nat Rev Genet       Date:  2001-07       Impact factor: 53.242

4.  LIGAND: database of chemical compounds and reactions in biological pathways.

Authors:  Susumu Goto; Yasushi Okuno; Masahiro Hattori; Takaaki Nishioka; Minoru Kanehisa
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

5.  Parallel analysis of transcript and metabolic profiles: a new approach in systems biology.

Authors:  Ewa Urbanczyk-Wochniak; Alexander Luedemann; Joachim Kopka; Joachim Selbig; Ute Roessner-Tunali; Lothar Willmitzer; Alisdair R Fernie
Journal:  EMBO Rep       Date:  2003-09-12       Impact factor: 8.807

6.  Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry.

Authors:  Asaph Aharoni; C H Ric de Vos; Harrie A Verhoeven; Chris A Maliepaard; Gary Kruppa; Raoul Bino; Dayan B Goodenowe
Journal:  OMICS       Date:  2002

7.  Construction and application of a mass spectral and retention time index database generated from plant GC/EI-TOF-MS metabolite profiles.

Authors:  Cornelia Wagner; Michael Sefkow; Joachim Kopka
Journal:  Phytochemistry       Date:  2003-03       Impact factor: 4.072

8.  Characterization and content of flavonoid glycosides in genetically modified tomato (Lycopersicon esculentum) fruits.

Authors:  Gwénaëlle Le Gall; M Susan DuPont; Fred A Mellon; Adrienne L Davis; Geoff J Collins; Martine E Verhoeyen; Ian J Colquhoun
Journal:  J Agric Food Chem       Date:  2003-04-23       Impact factor: 5.279

9.  Cysteinyl-flavan-3-ol conjugates from grape procyanidins. Antioxidant and antiproliferative properties.

Authors:  J L Torres; C Lozano; L Julià; F J Sánchez-Baeza; J M Anglada; J J Centelles; M Cascante
Journal:  Bioorg Med Chem       Date:  2002-08       Impact factor: 3.641

10.  Metabolic profiling of flavonoids in Lotus japonicus using liquid chromatography Fourier transform ion cyclotron resonance mass spectrometry.

Authors:  Hideyuki Suzuki; Ryosuke Sasaki; Yoshiyuki Ogata; Yukiko Nakamura; Nozomu Sakurai; Mariko Kitajima; Hiromitsu Takayama; Shigehiko Kanaya; Koh Aoki; Daisuke Shibata; Kazuki Saito
Journal:  Phytochemistry       Date:  2007-07-31       Impact factor: 4.072
