Vincent Colantonio, Luis Felipe V Ferrão, Denise M Tieman, Nikolay Bliznyuk, Charles Sims, Harry J Klee, Patricio Munoz, Marcio F R Resende.
Abstract
Although they are staple foods in cuisines globally, many commercial fruit varieties have become progressively less flavorful over time. Due to the cost and difficulty associated with flavor phenotyping, breeding programs have long been challenged in selecting for this complex trait. To address this issue, we leveraged targeted metabolomics of diverse tomato and blueberry accessions and their corresponding consumer panel ratings to create statistical and machine learning models that can predict sensory perceptions of fruit flavor. Using these models, a breeding program can assess flavor ratings for a large number of genotypes, previously limited by the low throughput of consumer sensory panels. The ability to predict consumer ratings of liking, sweet, sour, umami, and flavor intensity was evaluated by 10-fold cross-validation, and the accuracies of 18 different models were assessed. The prediction accuracies were high for most attributes and ranged from 0.87 for sourness intensity in blueberry using XGBoost to 0.46 for overall liking in tomato using linear regression. Further, the best-performing models were used to infer the flavor compounds (sugars, acids, and volatiles) that contribute most to each flavor attribute. Variance decomposition of the overall liking score estimated that volatile organic compounds explained 42% and 56% of the variance in tomato and blueberry, respectively. We expect that these models will enable an earlier incorporation of flavor as a breeding target and encourage selection and release of more flavorful fruit varieties.
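As an illustration of the evaluation scheme described in the abstract, the following is a minimal sketch of 10-fold cross-validation for one sensory attribute. It is not the authors' pipeline: the data are synthetic, the single predictor (`sugar`) and response (`sweet`) are hypothetical names, the model is a one-variable least-squares line rather than any of the 18 models assessed, and accuracy is assumed here to be the Pearson correlation between held-out predictions and observed ratings.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def kfold_accuracy(x, y, k=10, seed=0):
    """Predict every sample from a model trained without its fold,
    then correlate the held-out predictions with the observations."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
    preds = [0.0] * len(x)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = intercept + slope * x[i]
    return pearson(preds, y)


# Synthetic stand-in data (assumption): 100 accessions, one metabolite, one rating.
rng = random.Random(1)
sugar = [rng.uniform(2, 8) for _ in range(100)]
sweet = [0.5 * s + rng.gauss(0, 0.5) for s in sugar]

accuracy = kfold_accuracy(sugar, sweet, k=10)
print(f"10-fold cross-validated accuracy: {accuracy:.2f}")
```

Because every prediction comes from a model that never saw that sample, the resulting correlation estimates how well ratings would be predicted for new genotypes rather than how well the model fits its training data.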
Keywords: artificial intelligence; flavor; fruit quality
Year: 2022 PMID: 35131943 PMCID: PMC8860002 DOI: 10.1073/pnas.2115865119
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. (A) Weighted correlation network analysis of tomato metabolites and their assigned clusters based on their known biochemical classification. The size of each metabolite node indicates betweenness centrality, a measure of how often a node exists on the shortest path between other nodes. The thickness of the lines connecting metabolites is scaled relative to the correlation between the metabolites. The identity of each metabolite is denoted by number in the legend. (B) Distribution of metabolite concentrations for each volatile group across the tomato population. Volatile concentrations are reported in nanograms per gram fresh weight per hour (ng/gfw/h) on a log10 scale.
Fig. 2. (A) Weighted correlation network analysis of blueberry metabolites and their assigned clusters based on their known biochemical classification. The size of each metabolite node indicates betweenness centrality. The thickness of the lines connecting metabolites is scaled relative to the correlation between the metabolites. The identity of each metabolite is denoted by number in the legend. (B) Distribution of metabolite concentrations for each volatile group across the blueberry population. Volatile concentrations are reported in nanograms per gram fresh weight per hour (ng/gfw/h) on a log10 scale.
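The betweenness centrality used to size nodes in Figs. 1 and 2 can be sketched with Brandes' algorithm. This is a simplified illustration under stated assumptions: the graph is treated as unweighted and undirected (the networks in the figures are weighted by correlation), and the node names are hypothetical placeholders rather than metabolites from the study.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness centrality for an unweighted, undirected
    graph given as {node: [neighbors]} (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search: count shortest paths from the source.
        order = []                              # nodes in order of discovery
        parents = {v: [] for v in graph}        # predecessors on shortest paths
        n_paths = {v: 0 for v in graph}         # number of shortest paths
        dist = {v: -1 for v in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                 # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    parents[w].append(v)
        # Back-propagate path dependencies from the farthest nodes inward.
        dependency = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in parents[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical four-node chain: m1 - m2 - m3 - m4
network = {
    "m1": ["m2"],
    "m2": ["m1", "m3"],
    "m3": ["m2", "m4"],
    "m4": ["m3"],
}
scores = betweenness(network)
print(scores)
```

In this chain the two interior nodes each sit on the shortest paths of two node pairs, while the end nodes sit on none, which is why interior "bridge" metabolites would be drawn larger in such a network.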
Fig. 3. Variation in sensory panel ratings explained by sugars/acids and volatiles overall, and by groups of metabolites of known biochemical classification in tomato and blueberry.
Fig. 4. (A) Accuracy of predicting flavor ratings from metabolome data across a range of statistical and machine learning models for tomato and blueberry. Averages and model rankings are inclusive of umami accuracies depicted in and Datasets S4 and S5. (B) Accuracy of predicting perception traits for tomato using 70 individuals with genomic and metabolomic data. (C) Accuracies for tomato flavor prediction with a consistent test set of 39 samples and increasing training set sizes ranging from 50 to 170 samples.
Fig. 5. Relative importance estimated with gradient-boost machine (x axis) and β coefficients estimated with Bayes A (y axis) of each metabolite in predicting ratings of flavor attributes in blueberry and tomato. Colors indicate the type of metabolite. TA, titratable acidity.
Fig. 6. Schematic representation of how the use of metabolomic selection could be applied in earlier stages of a breeding program, when compared to sensory panels.