Davor Oršolić1, Vesna Pehar2, Tomislav Šmuc1, Višnja Stepanić3.
Abstract
Widespread use of herbicides has led to a global increase in weed resistance. Rotating herbicides according to their modes of action (MoAs) and discovering novel phytotoxic molecules are the two strategies used against weed resistance. Herein, Random Forest modeling was used to build predictive models and to comprehensively characterize the structure-activity relationships underlying herbicide classification by MoA and weed selectivity. By combining the predictive models with herbicide-likeness rules defined by selected molecular features (numbers of H-bond acceptors and donors, logP, topological and relative polar surface area, and net charge), a stepwise virtual screening platform is proposed for characterizing the phytotoxic properties of low-molecular-weight molecules. The screening cascade was applied to a data set of phytotoxic natural products. The results may be valuable for refining herbicide rotation programs as well as for discovering novel herbicides, primarily among natural products, which are a source of molecules with novel structures, modes of action and translocation profiles compared with synthetic compounds.
Year: 2021 PMID: 34075109 PMCID: PMC8169684 DOI: 10.1038/s41598-021-90690-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. HRAC classification and division of herbicides from the HRAC2020 and extended data sets across the MoA classes^a.
| Legacy HRAC code | HRAC2020 & WSSA code | Number of compounds in HRAC2020/extended set | General mode of action: targeted biological process | Mode of action: targeted molecular function |
|---|---|---|---|---|
| A | 1 | 21/29 | Fatty acid biosynthesis | Inhibition of acetyl-CoA carboxylase (ACCase) |
| B | 2 | 58/61 | Amino acid synthesis (Leu, Ile, Val) | Inhibition of acetohydroxyacid synthase/acetolactate synthase (AHAS/ALS) |
| C1 | 5 | 43/53 | Photosynthesis (electron transfer) | Inhibition of photosystem (PS) II protein D1 (C1/C2 Ser264; C3 His215) |
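| C2 | 5 | 30/37 | Photosynthesis (electron transfer) | Inhibition of photosystem (PS) II protein D1 (C1/C2 Ser264; C3 His215) |
| C3 | 6 | 5/9 | Photosynthesis (electron transfer) | Inhibition of photosystem (PS) II protein D1 (C1/C2 Ser264; C3 His215) |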
| D | 22 | 4/5 | Photosynthesis (electron transfer) | Inhibition of diversion of the electrons transferred by the PS I ferredoxin |
| E | 14 | 29/43 | Photosynthesis (heme synthesis for chlorophyll) | Inhibition of protoporphyrinogen oxidase (PPO) |
| F1 | 12 | 7/9 | Photosynthesis (carotenoid synthesis) | Inhibition of phytoene desaturase (PDS) |
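| F2 | 27 | 14/16 | Photosynthesis (carotenoid synthesis) | Inhibition of 4-hydroxyphenylpyruvate dioxygenase (4-HPPD) |
| F3 | 34 | 1/2 | Photosynthesis (carotenoid synthesis) | Inhibition of lycopene cyclase |
| F4 | 13 | 2/1 | Photosynthesis (carotenoid synthesis) | Inhibition of 1-deoxy-D-xylulose-5-phosphate (DOXP) synthase |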
| G | 9 | 1/2 | Amino acid synthesis (Phe, Trp, Tyr) | Inhibition of 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase |
| H | 10 | 2/4 | Amino acid synthesis (Gln) | Inhibition of glutamine synthetase |
| I | 18 | 1/3 | Tetrahydrofolate synthesis | Inhibition of dihydropteroate (DHP) synthase |
| K1 | 3 | 18/25 | Microtubule polymerization | Inhibition of microtubule assembly |
| K2 | 23 | 6/9 | Microtubule polymerization | Inhibition of microtubule organisation |
| K3 | 15 | 43/39^a | Fatty acid synthesis | Inhibition of VLCFAs |
| L | 29^b | 6/6 | Cell wall synthesis | Inhibition of cellulose synthase |
| M | 24 | 6/8 | ATP synthesis | Uncoupling of oxidative phosphorylation |
| N | NA^b | NA^b/23 | Fatty acid synthesis | Inhibition of fatty acid elongase |
| O | 4 | 25/37 | Regulation of auxin-responsive genes | Synthetic auxin mimics: stimulation of transport inhibitor response protein 1 (TIR1) |
| P | 19 | 2/3 | Long-range hormone signaling | Auxin transport inhibitors |
^a In the HRAC2020 classification there are additional classes Q (3), R (31), S (32) and T (33), all with up to 2 members [5].
^b The majority of herbicides from class N are merged into the K3 (15) class. Treating the 23 herbicides of the legacy N class separately does not affect the results, since this subgroup is structurally distinct from the other K3 herbicides.
Figure 1. Comparison of the performance of the four ML classifiers for MoA prediction. (a, d) Accuracy density plots. (b, e) Box plots of the distributions of resampled accuracies and kappa values. (c, f) Probability density plots of the accuracy differences between the RF and SVM classifiers. Plots (a)–(c) describe the MoA classifiers (Table 2) and plots (d)–(f) compare the weed selectivity models built with nine descriptors including logP (Table 3). The RF and SVM MoA classifiers are largely equivalent, since 75.7% of the posterior probability distribution lies inside the region of practical equivalence (ROPE; accuracy differences of less than 1%).
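The ROPE statement in the caption can be read as the share of the posterior distribution of the accuracy difference that falls within ±1%. The following is a minimal sketch of that bookkeeping only, assuming posterior draws of the difference are already available as a list; the function name `rope_fraction` and the sample values are hypothetical and are not data from the study.

```python
from typing import Sequence


def rope_fraction(diffs: Sequence[float], rope: float = 0.01) -> float:
    """Fraction of posterior draws of (acc_RF - acc_SVM) inside [-rope, +rope]."""
    if not diffs:
        raise ValueError("need at least one posterior draw")
    inside = sum(1 for d in diffs if -rope <= d <= rope)
    return inside / len(diffs)


# Hypothetical posterior draws, for illustration only (not data from the study).
draws = [-0.020, -0.008, -0.004, 0.000, 0.003, 0.006, 0.009, 0.015]
share = rope_fraction(draws)  # six of the eight draws lie inside the ROPE
```

A share well above one half, such as the 75.7% reported in the caption, is what supports calling two classifiers practically equivalent.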
Table 2. Comparison of classification performance on the test and HRAC_REST case sets of the four optimized 16-class MoA ML models built in terms of 141 MACCS fp keys^a. Accuracy and kappa are overall values^b; sensitivity, specificity, precision, F1 and balanced accuracy are averaged across classes.
| Case set | Classifier | Accuracy | Kappa | Sensitivity | Specificity | Precision | F1 | Balanced accuracy |
|---|---|---|---|---|---|---|---|---|
| Test | RF | 0.895 | 0.883 | 0.821 | 0.993 | 0.896 | 0.900 | 0.907 |
| Test | XGBoost | 0.895 | 0.883 | 0.821 | 0.993 | 0.899 | 0.899 | 0.907 |
| Test | SVM | 0.912 | 0.902 | 0.838 | 0.994 | 0.935 | 0.936 | 0.916 |
| Test | NB | 0.561 | 0.500 | 0.332 | 0.969 | 0.663 | 0.604 | 0.651 |
| HRAC_REST | RF | 0.674 | 0.646 | 0.641 | 0.979 | 0.728 | 0.796 | 0.814 |
| HRAC_REST | XGBoost | 0.663 | 0.633 | 0.594 | 0.978 | 0.670 | 0.771 | 0.790 |
| HRAC_REST | SVM | 0.696 | 0.667 | 0.631 | 0.980 | 0.673 | 0.797 | 0.809 |
| HRAC_REST | NB | 0.413 | 0.362 | 0.310 | 0.961 | 0.509 | 0.605 | 0.638 |
^a Optimal values of the classifiers' hyperparameters are listed in Table S4.
^b The overall accuracy and kappa values are averaged over 10 × 10-fold CV resamplings.
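Overall accuracy and Cohen's kappa, as reported in the table above, are both derived from a confusion matrix; kappa corrects the observed agreement for the agreement expected by chance. The sketch below shows the standard definitions on a small hypothetical two-class matrix (rows are true classes, columns are predicted classes). It does not reproduce the authors' evaluation pipeline.

```python
from typing import List, Tuple


def accuracy_and_kappa(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals per class.
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 2-class confusion matrix, for illustration only.
acc, kappa = accuracy_and_kappa([[8, 2], [1, 9]])
```

For this matrix the accuracy is 0.85 and kappa is 0.70. With 16 classes the chance agreement is much smaller, which is why kappa sits only slightly below accuracy in the table.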
Table 3. Comparison of performance metrics on the test set of the 3-class RF and SVM models built to predict the BL, G or NS weed selectivity of herbicides in terms of a subset of nine simple molecular and physicochemical descriptors including the lipophilicity coefficient logP, or 141 MACCS keys^a.
| RF/SVM^b, 9 descriptors with logP | Sensitivity | Specificity | Precision | F1 | Balanced accuracy |
|---|---|---|---|---|---|
| Class: BL | 0.944/0.917 | 0.690/0.690 | 0.791/0.786 | 0.861/0.846 | 0.817/0.803 |
| Class: G | 0.739/0.696 | 0.952/0.929 | 0.895/0.842 | 0.810/0.762 | 0.846/0.812 |
| Class: NS | 0.500/0.667 | 1.000/1.000 | 1.000/1.000 | 0.667/0.800 | 0.750/0.883 |
^a The nine descriptors are logDiff, logSw, Shapeindex, Cat, sp3At, TPSA, HBA and HBD, plus logP.
^b The RF and SVM models with nine descriptors including logP / 141 MACCS keys correspond to models 1 and 7 / 3 and 9, respectively, in Table S6. The models were trained and applied using tuned hyperparameter values (Figures S2–S4).
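The per-class F1 and balanced accuracy columns follow directly from sensitivity, specificity and precision. As a worked check using the standard definitions, the sketch below recomputes both values for the RF model on class BL from the rounded table entries; the helper names are illustrative.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2.0 * precision * sensitivity / (precision + sensitivity)


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity for one class."""
    return (sensitivity + specificity) / 2.0


# RF model, class BL, values taken from the table above.
bl_f1 = f1_score(precision=0.791, sensitivity=0.944)
bl_ba = balanced_accuracy(sensitivity=0.944, specificity=0.690)
```

Rounded to three decimals these give 0.861 and 0.817, matching the tabulated RF entries for class BL.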
Figure 2. Heat maps and evaluation metrics for the distributions of (a) HRAC2020 + HRAC_REST (411) and (b) HRAC2020 (314) herbicides, expressed as fractions (%) of MoA classes in clusters generated by the agglomerative algorithm with MACCS fingerprints.
Figure 3. Heat maps of structural dissimilarity quantified by the Jaccard distance (1 − TC, where TC is the Tanimoto coefficient) calculated for all pairs of 509 synthetic herbicides (a) arranged into MoA classes and (b) divided into the subsets HRAC2020, HRAC_REST and the Z compounds, with the addition of the set of NPs originating from bacteria, fungi and plants. The extended, HRAC2020 and HRAC_REST compounds are ordered according to the classes A-P. Bluer/redder values correspond to more structurally similar/diverse compounds. (c) Definition of the AD for the RF MoA model (Table 2): for a given compound, the model's prediction is considered reliable if the compound is similar to at least one training herbicide with TC greater than 0.6 and the estimated class probability is greater than 0.6.
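The applicability-domain rule in panel (c) combines a nearest-neighbour similarity check with a probability check. The following is a minimal sketch, assuming each fingerprint is represented as the set of its "on" bit positions; the function names and example bit sets are hypothetical, and only the two 0.6 cut-offs come from the caption.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def in_moa_domain(
    query_bits: Set[int],
    training_bits: Iterable[Set[int]],
    class_probability: float,
    tc_cutoff: float = 0.6,
    prob_cutoff: float = 0.6,
) -> bool:
    """True if the query is within the AD: TC > cutoff to at least one
    training compound AND predicted class probability > cutoff."""
    best_tc = max((tanimoto(query_bits, t) for t in training_bits), default=0.0)
    return best_tc > tc_cutoff and class_probability > prob_cutoff


# Hypothetical on-bit sets, for illustration only.
training = [{1, 2, 3, 4, 5}, {7, 8}]
reliable = in_moa_domain({1, 2, 3, 4}, training, class_probability=0.7)
```

Here the best TC is 4/5 = 0.8 and the probability is 0.7, so the prediction would be treated as reliable. Both conditions use strict inequalities, following the "greater than" wording of the caption.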
Figure 4. (a) The AD for the RF weed selectivity model (model 1 in Tables 3 and S6). For a given compound, the prediction can be considered credible if the class probability is above 0.6 and the Euclidean distance is less than 2.0. (b) The most distinguishing molecular features of the broad-leaved-selective, grass-selective and non-selective herbicides.
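The credibility rule in panel (a) can be sketched in the same way as a distance check plus a probability check. The caption does not state what the Euclidean distance is measured to; the sketch below assumes, purely for illustration, the distance to the nearest training compound in a scaled descriptor space. The function name and example vectors are hypothetical.

```python
import math
from typing import Sequence


def in_selectivity_domain(
    query: Sequence[float],
    training: Sequence[Sequence[float]],
    class_probability: float,
    max_distance: float = 2.0,
    min_probability: float = 0.6,
) -> bool:
    """True if class probability > 0.6 and the Euclidean distance to the
    nearest training compound (an assumption of this sketch) is < 2.0."""
    if not training:
        return False
    nearest = min(math.dist(query, t) for t in training)
    return nearest < max_distance and class_probability > min_probability


# Hypothetical scaled descriptor vectors, for illustration only.
train_vectors = [(1.0, 1.0, 1.0), (3.0, 0.0, 0.0)]
credible = in_selectivity_domain((0.0, 0.0, 0.0), train_vectors, 0.7)
```

In this example the nearest training vector lies at a distance of about 1.73, so the prediction passes both checks.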
Figure 5. (a) Comparison of six subgroups of phytotoxic molecules according to selected molecular properties. Herbicide-like boundaries (Table 4) are denoted by red dashed lines. (b) Virtual screening platform proposed for preselecting phytotoxic compounds. Its proof of concept should be established by in vivo testing.
Table 4. Herbicide-like chemical space defined in terms of common molecular descriptors (Fig. 5).
| Descriptor | Range | % of 509 synthetic herbicides | % of 131 NPs |
|---|---|---|---|
| HBD (OH/NH) | ≤ 2 | 95 | 51.9 |
| HBA (O/N) | ≤ 6; ≤ 7 | 66.7; 80.0 | 58.8; 65.6 |
| clogP^a | 0.5 < clogP ≤ 3.5; 0.5 < clogP ≤ 4.5 | 66.7; 80.0 | 47.3; 53.4 |
| TPSA | 20 Å² < TPSA ≤ 120 Å² | 80 | 63.4 |
| Relative PSA | 0.1 < RelPSA ≤ 0.5 | 80 | 81.7 |
| Net charge^b | ≤ 0 | 95 | 65.6 |
^a Regardless of whether logP values were calculated by ADMET Predictor or DataWarrior.
^b More than 95% of synthetic organic herbicides are either neutral molecules (around 2/3) or anions (30%) (Figure S7).
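The ranges in the table above can be applied as a simple rule-based filter. The sketch below is one possible encoding, assuming the six descriptors have already been computed by some external tool. The `Descriptors` class, the `wide` flag (which switches to the broader HBA ≤ 7 and clogP ≤ 4.5 bounds) and the example values are hypothetical; only the numeric ranges are taken from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    hbd: int          # H-bond donors (OH/NH)
    hba: int          # H-bond acceptors (O/N)
    clogp: float
    tpsa: float       # in square angstroms
    rel_psa: float    # relative polar surface area
    net_charge: int


def herbicide_like_violations(d: Descriptors, wide: bool = False) -> List[str]:
    """Return the names of the herbicide-likeness rules that the compound breaks."""
    hba_max = 7 if wide else 6
    clogp_max = 4.5 if wide else 3.5
    checks = {
        "HBD": d.hbd <= 2,
        "HBA": d.hba <= hba_max,
        "clogP": 0.5 < d.clogp <= clogp_max,
        "TPSA": 20.0 < d.tpsa <= 120.0,
        "RelPSA": 0.1 < d.rel_psa <= 0.5,
        "Net charge": d.net_charge <= 0,
    }
    return [name for name, ok in checks.items() if not ok]


# Hypothetical compound, for illustration only.
example = Descriptors(hbd=1, hba=7, clogp=4.0, tpsa=75.0, rel_psa=0.25, net_charge=0)
narrow_result = herbicide_like_violations(example)
wide_result = herbicide_like_violations(example, wide=True)
```

With the narrower bounds this example breaks the HBA and clogP rules; with the broader bounds it passes all six. Returning the list of violated rules rather than a single boolean makes it easier to see why a natural product falls outside the herbicide-like space.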