Cristian Rojas1,2, Roberto Todeschini3, Davide Ballabio3, Andrea Mauri4, Viviana Consonni3, Piercosimo Tripaldi2, Francesca Grisoni3.
Abstract
This work describes a novel approach based on advanced molecular similarity to predict the sweetness of chemicals. The proposed Quantitative Structure-Taste Relationship (QSTR) model is an expert system developed with the five principles defined by the Organization for Economic Co-operation and Development (OECD) for the validation of (Q)SARs in mind. The 649 sweet and non-sweet molecules were described by both conformation-independent extended-connectivity fingerprints (ECFPs) and molecular descriptors. In particular, molecular similarity in the ECFP space showed a clear association with molecular taste and was exploited for model development. Molecules lying in the subspaces where taste assignment was more difficult were modeled through a consensus between linear and local approaches (Partial Least Squares-Discriminant Analysis and N-nearest-neighbor classifier). The expert system, which was thoroughly validated through a Monte Carlo procedure and an external set, gave satisfactory results in comparison with state-of-the-art models. Moreover, the QSTR model can be leveraged to gain a greater understanding of the relationship between molecular structure and sweetness and to design novel sweeteners.
Keywords: QSAR; classification; expert system; molecular descriptors; sweetness
Year: 2017 PMID: 28791285 PMCID: PMC5524730 DOI: 10.3389/fchem.2017.00053
Source DB: PubMed Journal: Front Chem ISSN: 2296-2646 Impact factor: 5.221
Summary of the performances of the QSTR classification models reported in the literature for predicting sweet taste of molecules.
| Reference | Taste classes | No. of classes | Method | d | Training set size | Test set size | Performance (training) | Performance (cross-validation) | Performance (test) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Iwamura, | Sweet and bitter | 2 | SLR | 3 | 49 | – | – | – | – |
| Kier, | Sweet and bitter | 2 | LDA | 2 | 20 | 9 | 0.850 | – | 0.775 |
| Spillane and McGlinchey, | Sweet and non-sweet | 2 | Plot | 2 | 35 | 12 | 0.914 | – | 0.917 |
| Takahashi et al., | Sweet and bitter | 2 | LLA | 3 | 22 | – | 1 | – | – |
| | | | | 6 | 22 | – | 0.909 | – | – |
| Spillane et al., | Sweet and bitter | 2 | LDA | 3 | 33 | – | 0.807 | – | – |
| Takahashi et al., | Sweet and bitter | 2 | LDA | 3 | 22 | 9 | 1 | – | 0.775 |
| | | | | 2 | | | 0.955 | – | 0.775 |
| Miyashita et al., | Sweet and non-sweet | 2 | SIMCA | 4 | 50 | – | 0.798 | – | – |
| Miyashita et al., | Sweet and bitter | 3 | SIMCA | 5 | 91 | – | 0.840 | – | – |
| Okuyama et al., | Sweet and non-sweet | 2 | SIMCA | 1 | 25 | – | 0.868 | – | – |
| | | | | | 20 | – | 0.808 | – | – |
| Spillane and Sheahan, | Sweet and non-sweet | 2 | LDA | 3 | 23 | – | 0.642 | – | – |
| Spillane and Sheahan, | Sweet and non-sweet | 3 | Plot | 2 | 57 | – | 0.860 | – | – |
| | | 2 | LDA | 3 | 33 | – | 0.848 | – | – |
| | | | | | 23 | – | 0.870 | – | – |
| Spillane et al., | Sweet and non-sweet (bitter, bitter followed by sweet aftertaste, sour and aniline- or hydrocarbon-like taste) | 2 | Plot | 2 | 40 | – | 0.833 | – | – |
| Drew et al., | Sweet and bitter | 3 | DA | 11 | 50 | – | 1 | – | – |
| Spillane et al., | Sweet and non-sweet | 2 | LDA | 4 | 101 | – | 0.665 | – | – |
| | | | QDA | | | – | 0.801 | – | – |
| | | | CART | 3 | | – | 0.650 | – | – |
| Spillane et al., | Sweet and bitter | 2 | Plot | 2 | 23 | – | 0.862 | – | – |
| | | | LDA | 4 | | – | 0.850 | – | – |
| | | | QDA | | | – | 0.900 | – | – |
| Spillane et al., | Sweet and non-sweet | 2 | LDA | 4 | 132 | – | 0.693 | – | – |
| | | | QDA | | | – | 0.683 | – | – |
| | | | CART | 3 | | – | 0.815 | – | – |
| Kelly et al., | Sweet | 3 | LDA | 8 | 75 | 8 | 0.547 | 0.413 | 0.500 |
| | | | QDA | | | | 0.773 | 0.493 | 0.250 |
| | | | CART classification | | | | 0.773 | – | – |
| | | | CART regression (R² = 0.792) | 7 | | | 0.813 | – | 0.750 |
| Spillane et al., | Sweet | 3 | CART classification | 6 | 82 | – | 0.753 | – | – |
| | | | | 7 | 82 | – | 0.580 | – | – |
| | | | | 6 | 70 | 12 | 0.810 | – | 0.583 |
| | | | CART regression ( | 7 | 70 | 12 | 0.807 | – | 0.909 |
| Spillane et al., | Sweet and non-sweet (bitterness, blandness or tastelessness) | 2 | LDA | 2 | 58 | – | 0.655 | 0.603 | – |
| | | 2 | QDA | 3 | 58 | – | 0.759 | 0.603 | – |
| | | 2 | CART | 6 | 48 | 10 | 0.950 | – | 0.700 |
| | | 3 | CART | 6 | 48 | 10 | 0.908 | – | 0.611 |
| Rojas et al., | Sweet and tasteless | 2 | | 9 | 396 | 170 | 0.866 | 0.874 | 0.753 |
| | Sweet and bitter | | | 4 | 356 | 152 | 0.927 | 0.921 | 0.901 |
| Chéron et al., | Sweet and bitter | 2 | RF | 5 | 796 | 191 | 0.997 | – | 0.902 |
CART, classification and regression tree; d, number of descriptors; DA, discriminant analysis; kNN, k-nearest neighbors; LDA, linear discriminant analysis; LLA, linear learning machine; N.
Not available.
Calculated as the ratio of correctly classified molecules to the total number of molecules (Accuracy).
Number of components for SIMCA analysis.
Number of components considered for the DA analysis.
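The accuracy definition in the footnote above (correctly classified molecules divided by the total number of molecules) can be illustrated with a minimal Python sketch. The function name `accuracy` and the toy labels are hypothetical and are not taken from any of the cited studies.

```python
def accuracy(y_true, y_pred):
    """Fraction of molecules whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy data: 20 molecules, 3 of them misclassified.
y_true = ["sweet"] * 10 + ["bitter"] * 10
y_pred = ["sweet"] * 8 + ["bitter"] * 2 + ["bitter"] * 9 + ["sweet"]

print(round(accuracy(y_true, y_pred), 3))
```

Accuracy weights every molecule equally, so on strongly unbalanced data sets it can be dominated by the larger class.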
Figure 1. MDS plot of the first two coordinates (explained variance equal to 69.85%) for the training set molecules. Sweet molecules are marked with blue circles, and non-sweet molecules are marked with cyan circles.
Figure 2. Common chemical scaffold of sweeteners grouped in cluster S1.
Details of the conformation-independent Dragon molecular descriptors included in the N3 and PLSDA models in cluster C3.
| Descriptor | Description | Descriptor block | Model |
| --- | --- | --- | --- |
| F03[N-O] | Frequency of N-O at topological distance 3 | 2D Atom Pairs | N3 |
| Uindex | Balaban U index | Information indices | |
| CATS2D_04_AL | CATS2D Acceptor-Lipophilic at lag 04 | CATS 2D | |
| CATS2D_05_AL | CATS2D Acceptor-Lipophilic at lag 05 | | |
| C-026 | R–CX–R | Atom-centred fragments | |
| nCconj | Number of non-aromatic conjugated C(sp2) | Functional group counts | |
| F03[C-S] | Frequency of C-S at topological distance 3 | 2D Atom Pairs | PLSDA |
| MATS1s | Moran autocorrelation of lag 1 weighted by I-state | 2D autocorrelations | |
| CATS2D_02_DN | CATS2D Donor-Negative at lag 02 | CATS 2D | |
| CATS2D_04_AP | CATS2D Acceptor-Positive at lag 04 | | |
| ARR | Aromatic ratio | Ring descriptors | |
| D/Dtr07 | Distance/detour ring index of order 7 | | |
Figure 3. Coefficients for training descriptors in the PLSDA model for the sweet class.
Figure 4. Workflow of the basic steps of the QSTR-based expert system for predicting the sweetness of chemicals.
Figure 5. Histogram of the average Jaccard-Tanimoto similarity of the training molecules to the molecules grouped in cluster S1 (A) and cluster S2 (B).
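The quantity plotted in Figure 5, the average Jaccard-Tanimoto similarity between a molecule and the members of a cluster, can be sketched in Python as follows. This is only an illustration under stated assumptions: fingerprints are represented as sets of "on" bit indices, the toy fingerprints are invented, and the function names `tanimoto` and `average_similarity` are hypothetical rather than part of the published workflow.

```python
def tanimoto(fp_a, fp_b):
    """Jaccard-Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def average_similarity(query_fp, cluster_fps):
    """Mean Jaccard-Tanimoto similarity of one molecule to every member of a cluster."""
    if not cluster_fps:
        raise ValueError("cluster must contain at least one fingerprint")
    return sum(tanimoto(query_fp, fp) for fp in cluster_fps) / len(cluster_fps)


# Hypothetical toy fingerprints (real ECFPs have far more bits).
cluster = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 6}]
query = {2, 3, 4}

print(round(average_similarity(query, cluster), 3))
```

Collecting this average for every training molecule and binning the values would give a histogram of the kind described in the caption.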
Performance of the QSTR-based expert system based on the “strict” consensus.
| Fitting | 0.892 | 0.929 | 0.855 | 19.7 |
| Monte Carlo | 0.887 | 0.927 | 0.848 | 20.5 |
| Test set | 0.848 | 0.880 | 0.816 | 19.3 |
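The idea of a "strict" consensus and the associated statistics can be sketched as below. Two assumptions are made and are not confirmed by this excerpt: that "strict" means a molecule is assigned a class only when both models agree and is otherwise left unassigned, and that the table columns are, in order, non-error rate, sensitivity, specificity, and percentage of unassigned molecules. The second assumption is at least consistent with the arithmetic, since the first value in the fitting row equals the mean of the next two, (0.929 + 0.855) / 2 = 0.892. All names and toy labels are hypothetical.

```python
def strict_consensus(pred_a, pred_b, unassigned=None):
    """Keep a prediction only where both models agree (assumed meaning of 'strict')."""
    return [a if a == b else unassigned for a, b in zip(pred_a, pred_b)]


def consensus_stats(y_true, y_pred, positive="sweet"):
    """Sensitivity, specificity, non-error rate, and percentage of unassigned molecules."""
    assigned = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    tp = sum(1 for t, p in assigned if t == positive and p == positive)
    fn = sum(1 for t, p in assigned if t == positive and p != positive)
    tn = sum(1 for t, p in assigned if t != positive and p != positive)
    fp = sum(1 for t, p in assigned if t != positive and p == positive)
    sn = tp / (tp + fn) if tp + fn else float("nan")
    sp = tn / (tn + fp) if tn + fp else float("nan")
    return {
        "Sn": sn,
        "Sp": sp,
        "NER": (sn + sp) / 2,  # non-error rate as the mean of the class rates
        "not_assigned_pct": 100 * (len(y_true) - len(assigned)) / len(y_true),
    }


# Hypothetical toy labels: s = sweet, n = non-sweet.
s, n = "sweet", "non-sweet"
y_true = [s, s, s, s, n, n, n, n, s, n]
model_a = [s, s, s, n, n, n, n, s, s, n]  # e.g. a local model
model_b = [s, s, n, n, n, n, s, s, n, n]  # e.g. a linear model

consensus = strict_consensus(model_a, model_b)
stats = consensus_stats(y_true, consensus)
print({k: round(v, 3) for k, v in stats.items()})
```

Requiring agreement trades coverage for reliability: the statistics are computed only on molecules both models agree on, which is why the share of unassigned molecules is reported alongside them.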