Mihyun Seo1, Changwon Lim2, Hoonjeong Kwon1,3.
Abstract
Systematic toxicity tests are often waived for synthetic flavors because they are added to foods in very small amounts. However, their safety for some endpoints, such as endocrine disruption, remains a concern because they are likely to be active at low levels. In such cases, structure-activity relationship (SAR) models are a good alternative. In this study, therefore, binary, ternary, and quaternary prediction models were designed using simple or complex machine-learning methods. Overall, hard-voting classifiers outperformed the other methods. The test scores for the best binary, ternary, and quaternary models were 0.6635, 0.5083, and 0.5217, respectively. During model development, several substructures, including primary aromatic amine, (enol)ether, phenol, heterocyclic sulfur, and heterocyclic nitrogen, were found to occur predominantly in the most highly active compounds. The best-performing models were applied to synthetic flavors, and 22 agents appeared to have strong inhibitory potential toward thyroid peroxidase (TPO) activity.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10068-022-01041-y.
Keywords: Machine learning; Quantitative structure–activity relationship (QSAR); Synthetic flavor; Thyroid peroxidase inhibitor (TPO); Toxicity prediction
Year: 2022 PMID: 35464247 PMCID: PMC8994803 DOI: 10.1007/s10068-022-01041-y
Source DB: PubMed Journal: Food Sci Biotechnol ISSN: 1226-7708 Impact factor: 2.391
Number of compounds in each subset and their classification criteria
| Max inhibition | IC50 | Binary class | Train | Test | Ternary class | Train | Test | Quaternary class | Train | Test |
|---|---|---|---|---|---|---|---|---|---|---|
| Higher than 50% | ≤ 10 μM | A | 388 | 97 | A | 283 | 71 | A1 | 100 | 25 |
| Higher than 50% | > 10 μM | A | | | A | | | A2 | 183 | 46 |
| Between 20 and 50% | | A | | | B | 105 | 26 | B | 105 | 26 |
| Less than 20% | | C | 81 | 21 | C | 81 | 21 | C | 81 | 21 |

Counts for a class that spans several rows (binary A, ternary A) are listed once, on the first row of that class.
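The grouping criteria in this table can be written as a small labelling function. The following is a minimal sketch, not the authors' code: the function name `assign_classes` is hypothetical, and the handling of values at exactly 20% and 50% is an assumption, since the table does not say which side those boundaries fall on. The merging of group B into binary class A is read from the counts (388 = 283 + 105 for training, 97 = 71 + 26 for testing).

```python
from typing import Optional


def assign_classes(max_inhibition: float, ic50_um: Optional[float] = None) -> dict:
    """Map one compound to its binary, ternary, and quaternary labels.

    max_inhibition is in percent. ic50_um is in micromolar and is only
    consulted for compounds with more than 50% maximum inhibition.
    Assumption: values of exactly 20% and 50% are placed in group B.
    """
    if max_inhibition > 50:
        if ic50_um is None:
            raise ValueError("IC50 is needed to split group A into A1 and A2")
        quaternary = "A1" if ic50_um <= 10 else "A2"
        return {"binary": "A", "ternary": "A", "quaternary": quaternary}
    if max_inhibition >= 20:
        # Group B exists only in the ternary and quaternary schemes;
        # the binary scheme folds it into class A.
        return {"binary": "A", "ternary": "B", "quaternary": "B"}
    return {"binary": "C", "ternary": "C", "quaternary": "C"}


# Hypothetical compounds, for illustration only.
print(assign_classes(85.0, ic50_um=3.2))
print(assign_classes(35.0))
print(assign_classes(5.0))
```

Keeping all three labels in one function makes it harder for the binary, ternary, and quaternary data sets to drift apart when the thresholds are revisited.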
Fig. 1 Cross-validation (CV) and test scores for each feature, learning method, and grouping method (binary, ternary, or quaternary). For each grouping method, the black bar marked above with an asterisk indicates the test scores of the best-performing model. Each label is shown in “model name_feature extraction method (or voting methods in voting classifier)” form. [RF: Random forest; SVM: Support vector machine; ANN: Artificial neural network; AdaB: Adaptive boosting; XGB: Extreme gradient boosting; PCA: Principal component analysis; LDA: Linear discriminant analysis]
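Hard voting, the ensemble scheme reported as performing best, takes the class predicted by the majority of the base models for each compound. The sketch below illustrates that rule in plain Python under stated assumptions: the function name `hard_vote`, the example predictions, and the tie-breaking rule (alphabetically first label) are all illustrative choices, not details taken from the study.

```python
from collections import Counter
from typing import Sequence


def hard_vote(predictions: Sequence[Sequence[str]]) -> list:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigns to compound i.
    Ties are broken by taking the alphabetically first label, which is
    an arbitrary choice made for this sketch.
    """
    n_compounds = len(predictions[0])
    if any(len(p) != n_compounds for p in predictions):
        raise ValueError("every model must predict every compound")
    voted = []
    for i in range(n_compounds):
        counts = Counter(model[i] for model in predictions)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Hypothetical outputs of three base models for four compounds.
rf_pred = ["A", "B", "C", "A"]
svm_pred = ["A", "C", "C", "B"]
ann_pred = ["B", "C", "A", "C"]
print(hard_vote([rf_pred, svm_pred, ann_pred]))  # ['A', 'C', 'C', 'A']
```

scikit-learn's `VotingClassifier` with `voting="hard"` implements the same idea for fitted estimators; whether the authors used that class is not stated here.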
Fig. 2 Confusion matrices for the best-performing models of each grouping. The color intensity of each cell represents the proportion of the number of compounds predicted to be a class with respect to the actual number of compounds in a class
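The proportions described in this caption correspond to a confusion matrix normalized over each actual class, so that every row sums to 1. A minimal sketch of that calculation follows; the function name `row_normalized_confusion` and the example labels are hypothetical and do not reproduce the study's data.

```python
def row_normalized_confusion(actual, predicted, labels):
    """Return a matrix whose entry [i][j] is the fraction of compounds
    with actual class labels[i] that were predicted as labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        counts[index[a]][index[p]] += 1
    proportions = []
    for row in counts:
        total = sum(row)
        # A class with no actual members yields a row of zeros.
        proportions.append([c / total if total else 0.0 for c in row])
    return proportions


# Hypothetical labels for seven compounds.
actual = ["A", "A", "A", "A", "B", "B", "C"]
predicted = ["A", "A", "A", "B", "B", "C", "C"]
for label, row in zip("ABC", row_normalized_confusion(actual, predicted, "ABC")):
    print(label, row)
```

Normalizing by row keeps small classes visible in the plot: a class with 21 test compounds is colored on the same 0 to 1 scale as a class with 97.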
Examples of compounds in the highest activity group (group A1) shown with active substructures studied by means of substructure frequency analysis (depicted in red)
| Name of compounds | Structures of compounds and | Purpose of use or | Ref |
|---|---|---|---|
| 4,4′-methylenedianiline | | Food contact material | (U.S. Food and Drug Administration, |
| Quercetin | | A flavonoid (flavonol) in edible plants (e.g., onion) | (Chandra, 2010) |
| Genistein | | A flavonoid (isoflavone) in edible plants (e.g., soybean) | (Chandra, 2010) |
| L-ascorbic acid | | Natural food substance; food additive | (U.S. Food and Drug Administration, |
| L-Tryptophan | | Natural food substance (amino acid) | (U.S. Food and Drug Administration, |
| Isoproterenol | | Drug; β-adrenergic receptor agonist (analogue of epinephrine) | (U.S. National Library of Medicine, 2020) |
| Indole | | Food additive (flavoring agent) | (Food and Agriculture Organization of the United Nations (FAO), |
| 2-Mercaptobenzothiazole | | Pesticide | (United States Environmental Protection Agency (EPA), |
| Phenmedipham | | Pesticide | (European Commission, |
| Azinphos-methyl | | Pesticide | (European Commission, |
| Dimethoate | | Pesticide | (European Commission, |
| Malathion | | Pesticide | (European Commission, |
Fig. 3 Chemical space analysis between the input data for model development and synthetic flavors (red dots: synthetic flavors; green dots: input data for model development)
Synthetic flavors found to be active in the predictive models and their corresponding Cramer classes predicted by the Toxtree software
| Name of compound | Structure | Cramer classᵃ | ||
|---|---|---|---|---|
| Class I | Class II | Class III | ||
| Indole |
| ✓ | ||
| Pyrrole |
| ✓ | ||
| Butyl anthranilate |
| ✓ | ||
| Linalyl anthranilate |
| ✓ | ||
| (+)-Cedrol |
| ✓ | ||
| Geranyl tiglate |
| ✓ | ||
| Allyl Ionone |
| ✓ | ||
| Bis(2-methyl-3-furyl)disulfide |
| ✓ | ||
| (3aR)-(+)-Sclareolide |
| ✓ | ||
| Maltol isobutyrate |
| ✓ | ||
| Cedr-8(15)-en-4-ol |
| ✓ | ||
| 2’-Aminoacetophenone |
| ✓ | ||
| Patchouli alcohol |
| ✓ | ||
| Elemol |
| ✓ | ||
| Dibenzyl disulfide |
| ✓ | ||
| Difurfuryl disulfide |
| ✓ | ||
| 2-Furfurylthio-3-methylpyrazine |
| ✓ | ||
| Furfuryl-2-methyl-3-furyl disulfide |
| ✓ | ||
| 8,8-diethoxy-2,6-dimethyl-2-Octanol |
| ✓ | ||
| Viridiflorol |
| ✓ | ||
| Caryolan-1-ol |
| ✓ | ||
| L-Methionylglycine |
| ✓ | ||
ᵃ Cramer classes are defined as follows:
Class I: Chemical structures with low potential for oral toxicity (low class)
Class II: Chemical structures that are more potentially toxic than Class I substances, but without typical structures suggestive of toxicity (intermediate class)
Class III: Chemical structures that have no solid evidence of safety or may even have functional groups that suggest strong toxicity or reactivity (high class)