Nayumi Akimoto, Takeshi Ara, Daisuke Nakajima, Kunihiro Suda, Chiaki Ikeda, Shingo Takahashi, Reiko Muneto, Manabu Yamada, Hideyuki Suzuki, Daisuke Shibata, Nozomu Sakurai.
Abstract
Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.
Year: 2017 PMID: 28455528 PMCID: PMC5430893 DOI: 10.1038/s41598-017-01390-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Construction of the probable mass fragment database of flavonoids (FsDatabase) for use in the FlavonoidSearch system. (a) Schematic outline of the construction of the FsDatabase. We analysed 139 authentic flavonoids and used the data to develop the following three rules: (1) the MSMS-aglycone rule to define the base structural unit of fragmentation (MSMS-aglycone); (2) the MSMS-category rule to define fragmentation patterns that were characteristic of flavonoid classes (Fig. 1b); and (3) the fragment prediction rule to identify probable fragments. These rules were used to create virtual fragments from 6,867 known flavonoids based on their chemical structures. References to related figures and tables are given in red. (b) The MSMS-aglycone rule (in the grey box) and an example of an MSMS-aglycone (right). According to the rule, the O-glucosyl group in the original structure (left) was replaced with a hydroxyl group. The C6-C3-C6 backbone structure is highlighted in pale blue.
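As a rough illustration of the MSMS-aglycone rule in (b): replacing an O-glucosyl group with a hydroxyl group amounts to removing C6H10O5 from the molecular formula. The sketch below applies this to apigenin 7-O-glucoside (C21H20O10), an example chosen here for illustration and not taken from the figure. The function names are hypothetical and are not part of FsTool.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}

# Net formula change when an O-glucosyl group is replaced by a hydroxyl group.
HEXOSYL_LOSS = "C6H10O5"


def parse_formula(formula):
    """Turn a formula string such as 'C21H20O10' into {'C': 21, 'H': 20, 'O': 10}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def monoisotopic_mass(counts):
    """Sum the monoisotopic masses for an element-count dictionary."""
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items())


def strip_o_glucosyl(counts, n_groups=1):
    """Replace n_groups O-glucosyl groups with hydroxyl groups (formula level only)."""
    loss = parse_formula(HEXOSYL_LOSS)
    return {el: n - n_groups * loss.get(el, 0) for el, n in counts.items()}


glycoside = parse_formula("C21H20O10")   # apigenin 7-O-glucoside
aglycone = strip_o_glucosyl(glycoside)   # expected to match apigenin, C15H10O5
print(aglycone, round(monoisotopic_mass(aglycone), 4))
```

This works only at the level of elemental composition; the actual rule operates on chemical structures, which this sketch does not model.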
Figure 2. Coverage of flavonoids predictable by FlavonoidSearch. For the 6,867 known flavonoids, those with probable mass fragments generated in FsDatabase are shown in blue.
Figure 3. Accuracies and discrimination power of the FlavonoidSearch system. (a) The accuracy of each search tool was evaluated using the area under the cumulative curve (AUCc) for a plot of cumulative ratio of queries (Y-axis) to the efficiency of narrowing down to the correct answer (X-axis). The cumulative curve will move closer to the upper left-hand corner of the figure when highly narrowed-down results are obtained for a high number of queries. Therefore, a high AUCc is indicative of high accuracy. Results are shown for data obtained in-house for flavonoid aglycones using ion trap (IT) and Fourier transform (FT) mass spectrometry (MS) of LTQ-FT and IT of LTQ-Orbitrap and spectra retrieved from MassBank and NIST14 searched with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. (b) Differences in the frequency distributions of Jaccard indices for flavonoid aglycones (red bars) and other compounds (grey bars). The frequency represents the ratio of records with a range of Jaccard indices to all records that had a Jaccard index >0. (c) A receiver operating characteristic curve of the discrimination test by a binary classification in (b), yielding an area under the curve (AUC) of 0.91. Results obtained with ITFT data from NIST14 are shown in (b) and (c).
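The two quantities in (b) and (c) can be sketched as follows. This is a minimal illustration, assuming the Jaccard index is taken between the set of query peak m/z values and the set of probable fragment m/z values, with peaks counted as shared when they agree within a mass tolerance. The tolerance value, the matching scheme and the function names are assumptions made here, not the documented behaviour of FsTool. The AUC is computed as the probability that a randomly chosen flavonoid aglycone record scores higher than a randomly chosen record of another compound, which equals the area under the ROC curve.

```python
def count_matches(query, reference, tol):
    """Count one-to-one m/z matches within tol using a two-pointer sweep."""
    q, r = sorted(query), sorted(reference)
    i = j = matched = 0
    while i < len(q) and j < len(r):
        if abs(q[i] - r[j]) <= tol:
            matched += 1
            i += 1
            j += 1
        elif q[i] < r[j]:
            i += 1
        else:
            j += 1
    return matched


def jaccard_index(query, reference, tol=0.01):
    """Jaccard index = shared peaks / (all distinct peaks in either list)."""
    matched = count_matches(query, reference, tol)
    union = len(query) + len(reference) - matched
    return matched / union if union else 0.0


def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical peak lists and scores, for illustration only.
query_peaks = [121.028, 153.018, 271.060]
db_fragments = [119.049, 153.0182, 243.065, 271.0601]
score = jaccard_index(query_peaks, db_fragments)

auc = roc_auc([0.6, 0.4, 0.35], [0.1, 0.4, 0.05])
print(score, auc)
```

In this toy case two of the five distinct peaks are shared, so a threshold such as Jaccard index >0.3 would accept the match.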
Figure 4. Flavonoids in parsley samples. (a) Flavonoids in parsley annotated by FlavonoidSearch (Jaccard index >0.3) and manual curation. Peaks of derivatives of characteristic aglycones in parsley (apigenin derivatives, red squares; kaempferol derivatives, blue triangles; luteolin derivatives, green inverted triangles; diosmetin derivatives, purple diamonds; and isorhamnetin derivatives, yellow circles) and two unknown aglycones (C16H13O7+, plus symbol; and C17H15O6+, asterisk) are represented on a two-dimensional (2D) mass chromatogram. The peak positions on an overall view of the 2D mass chromatogram are shown in Supplementary Fig. S4. (b) Apigenin-related compounds annotated in parsley. Peaks identified using authentic compounds (black squares), known derivatives (green circles), unknown derivatives with combinations of known O-substituents (blue triangles) and unknown derivatives with unknown substituents (red diamonds) are represented on a 2D mass chromatogram. The numbers correspond to the peak numbers shown in Supplementary Table S17.
Figure 5. The diversity of flavonoid aglycones in 16 plant samples. A hierarchical cluster analysis for annotated flavonoid aglycones in 16 plant samples was performed. MS3 spectra were obtained semi-comprehensively by a data-dependent acquisition from the samples. Numbers of peaks annotated to the symbolised name of the MSMS-aglycones by FlavonoidSearch (Jaccard index >0.3) are represented as ‘peak frequency’ (see Methods).
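The 'peak frequency' described above can be pictured as a simple count per sample and MSMS-aglycone. The sketch below assumes each annotation is available as a (sample, aglycone, Jaccard index) record. That record layout, the function name and the example values are hypothetical; only the >0.3 threshold comes from the caption.

```python
from collections import Counter


def peak_frequency(annotations, threshold=0.3):
    """Count annotated peaks per (sample, aglycone) whose Jaccard index exceeds threshold."""
    counts = Counter()
    for sample, aglycone, jaccard in annotations:
        if jaccard > threshold:  # strict inequality, as in "Jaccard index >0.3"
            counts[(sample, aglycone)] += 1
    return counts


# Hypothetical annotation records, for illustration only.
annotations = [
    ("parsley", "apigenin", 0.62),
    ("parsley", "apigenin", 0.41),
    ("parsley", "luteolin", 0.30),   # not counted: equal to the threshold
    ("sample_B", "kaempferol", 0.55),
]
freq = peak_frequency(annotations)
print(freq)
```

A table of such counts, with samples as rows and aglycones as columns, is the kind of input a hierarchical cluster analysis would take; the distance measure and linkage used in the study are not stated in the caption and are not assumed here.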