Anastasia Weyrich1, Madeleine Joel2, Geertje Lewin2, Thomas Hofmann1, Markus Frericks3.
Abstract
BACKGROUND: The use of in silico methods for toxicity prediction has increased significantly in recent years, driven by the 3Rs principle. This also applies to the prediction of reproductive toxicity, which is one of the most critical factors in pesticide approval. The widely used quantitative structure‐activity relationship (QSAR) models are built from experimental toxicity data: they relate experimentally observed toxicity to molecular structure in order to predict the toxicity of untested compounds. The aim of the study was to evaluate the available prediction models for developmental and reproductive toxicity with regard to their strengths and weaknesses on a pesticide database.
Keywords: QSAR; in silico predictions; in silico protocols; pesticide; reproductive toxicology
Year: 2022 PMID: 35748219 PMCID: PMC9545887 DOI: 10.1002/bdr2.2062
Source DB: PubMed Journal: Birth Defects Res Impact factor: 2.661
Short description of all tested models for the prediction of reprotoxicity
| Platform | Model | Functional principle | Database | References |
|---|---|---|---|---|
| VEGA | Developmental Toxicity model (CAESAR, v.2.1.7) | QSAR classification model; 13 EPA descriptors were used to describe molecular properties; Classifier: Random Forest | 292 chemical compounds (mainly drug data); 201 developmental toxicants/91 nondevelopmental toxicants; Developmental toxicity was defined by FDA categories: A or B ➔ non‐toxicant, C, D or X ➔ toxicant | Cassano & Benfenati ( |
| | Developmental/Reproductive Toxicity library (PG, v.1.1.0) | Empirically based decision tree; Expert rule based structural features; Molecules could be classified into 25 different categories with known DART; No categories for nontoxic chemicals; Detailed description of the categories could be found in appendix II of Wu et al. ( | Decision tree is based on a data set of 716 chemicals (664 toxic, 16 non‐toxic); Detailed information about the chemicals and the references, on the basis of which they were classified, could be found in appendix I of Wu et al. ( | Benfenati ( |
| OECD (Q)SAR Toolbox | Expert‐based DART scheme | Is also based on the categories from Wu et al. ( ; Further development of the categories | Same data base as PG model | OECD (Q)SAR Toolbox ( |
| Leadscope model applier | Repro Female Rat (RFR) v2 | Statistical based model (QSAR); Three QSAR models were built with a balance of positive and negative compounds ➔ prediction is the average of all three model results; Negative and positive features were identified; The predicted positive probability is based on individual contributions from the model features; Threshold in predicted positive probability is used to assign a positive or negative prediction | Includes adverse effects to female reproductive organs (cervix, fallopian tube, ovary, uterus, and vagina) and fertility; Based on ICSAS database described by Matthews, Kruhlak, Cimino, Benz, and Contrera ( ; 894 training compounds (14.8% positives) | Leadscope ( |
| | Repro Male Rat (RMR) v2 | | Includes adverse effects to male reproductive organs (Cowper's gland, epididymis, prostate, seminal vesicles, and testes) and fertility; Based on ICSAS database described by Matthews et al. ( ; 714 training compounds (30.07% positives) | |
| CASE Ultra | Foetal Dysmorphogenesis (FDYSM) Rabbit | Statistical based SA model; Local QSAR for each alert with physicochemical descriptors; Outcome of a SAR prediction is given as the probability of being reprotoxic on a scale of 0 to 1; Specific classification threshold for each model; Activating and deactivating alerts were detected | Pre‐processing of data and endpoints are described by Matthews et al. ( ; 128 active/129 inactive | Chakravarti, Saiakhov, and Klopman ( |
| | FDYSM Rat | | 436 active/457 inactive | |
| | Female fertility (FFERT) Rat | | 113 active/113 inactive | |
| | Male fertility (MFERT) Rat | | 180 active/180 inactive | |
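The labelling rule that the table gives for the CAESAR training data (FDA categories A or B as non‐toxicant; C, D or X as toxicant) can be written as a small lookup. This is a minimal sketch; the names `FDA_TO_CAESAR_LABEL` and `label_from_fda` are hypothetical and not part of VEGA or CAESAR.

```python
# Mapping of FDA pregnancy categories to the binary developmental
# toxicity label described in the table above.
# All names here are illustrative assumptions, not VEGA/CAESAR identifiers.
FDA_TO_CAESAR_LABEL = {
    "A": "non-toxicant",
    "B": "non-toxicant",
    "C": "toxicant",
    "D": "toxicant",
    "X": "toxicant",
}


def label_from_fda(category: str) -> str:
    """Return the binary label for an FDA category letter."""
    key = category.strip().upper()
    if key not in FDA_TO_CAESAR_LABEL:
        raise ValueError(f"unknown FDA category: {category!r}")
    return FDA_TO_CAESAR_LABEL[key]


print(label_from_fda("B"))  # non-toxicant
print(label_from_fda("x"))  # toxicant
```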
FIGURE 1 Pie charts of the distribution of (a) reprotoxicity categories by ECHA and (b) selected reprotoxicity endpoints within the pesticide DB. Pesticides classified as “NO Reprotox” do not have a Repr. 1A/B or 2 category classification but may have a classification for any other toxicity. The various reprotoxicity endpoints in chart (b) are based on the definition by Matthews et al. and were classified based on the studies relevant to the classification by ECHA (Matthews, Kruhlak, Daniel Benz, & Contrera, 2007).
The most common chemical groups within the pesticide DB with their corresponding modes of action according to the RAC posters
| Group | Pesticide type | Mode of action based on IRAC/FRAC/HRAC | # |
|---|---|---|---|
| Triazole | Fungicide | G1: Inhibition of sterol biosynthesis in membranes via C14‐demethylase (19/21) | 21 |
| Sulfonylurea | Herbicide | 2: Inhibition of acetolactate synthase (14/14) | 14 |
| Carbamate | Insecticide | 1A: Acetylcholine esterase inhibitor (9/13) | 13 |
| Organothiophosphate | Insecticide | 1B: Acetylcholine esterase inhibitor (10/10) | 10 |
| Pyrethroid | Insecticide | 3A: Sodium channel modulator (9/9) | 9 |
| Aryloxyphenoxypropionate (FOPs) | Herbicide | 1: Inhibition of acetyl CoA carboxylase (7/7) | 7 |
| Phenoxycarboxylate | Herbicide | 4: Auxin mimics (7/7) | 7 |
| Pyrazolecarboxamide | Fungicide | C2: Inhibition of succinate‐dehydrogenase (7/7) | 7 |
| Strobilurin | Fungicide | C3: Inhibition of cytochrome bc1 (ubiquinol oxidase) at Qo site (cyt b gene) (7/7) | 7 |
| Phenylurea | Herbicide | 5: Inhibition of photosynthesis at PSII - serine 264 binders (5/7) | 7 |
| Chloroacetamide | Herbicide | 15: Inhibition of very long‐chain fatty acid synthesis (6/6) | 6 |
| Dinitroaniline | Herbicide | 3: Inhibition of microtubule assembly (5/6) | 6 |
Note: The numbers in brackets indicate how many of the categorized pesticides can be assigned to the named mode of action. A list of all chemical groups can be found in Table S4
Abbreviations: FRAC, Fungicide Resistance Action Committee; HRAC, Herbicide Resistance Action Committee; IRAC, Insecticide Resistance Action Committee.
The properties of the evaluated CASE Ultra models
| Model | Description | Species | # Active/inactive | # Descriptors | Classification threshold |
|---|---|---|---|---|---|
| FDYSM | Fetal Dysmorphogenesis | Rabbit | 128/129 | 19 | 0.5 |
| FDYSM | Fetal Dysmorphogenesis | Rat | 436/457 | 111 | 0.45 |
| FFERT | Female fertility | Rat | 113/113 | 47 | 0.55 |
| MFERT | Male fertility | Rat | 180/180 | 47 | 0.5 |
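The model‐specific classification thresholds in the table above can be applied to a predicted probability as in the following minimal sketch. It assumes that a probability at or above the threshold counts as positive and ignores any inconclusive or out‐of‐domain handling; the dictionary keys and the function name are hypothetical and do not reflect the CASE Ultra interface.

```python
# Classification thresholds taken from the table above.
# Keys and function name are illustrative assumptions, not CASE Ultra identifiers.
THRESHOLDS = {
    "FDYSM_RABBIT": 0.5,
    "FDYSM_RAT": 0.45,
    "FFERT_RAT": 0.55,
    "MFERT_RAT": 0.5,
}


def classify(model: str, probability: float) -> str:
    """Assign Positive/Negative from a probability on a 0-1 scale.

    Assumption: probabilities at or above the threshold are positive.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return "Positive" if probability >= THRESHOLDS[model] else "Negative"


# The same probability can lead to different calls in different models.
print(classify("FDYSM_RAT", 0.50))  # Positive
print(classify("FFERT_RAT", 0.50))  # Negative
```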
List of possible predictions of all models and the resulting evaluations
| Model | Prediction | Evaluation |
|---|---|---|
| VEGA_CAESAR | NON‐toxicant (experimental value, good/moderate reliability) | TN, FN |
| | Toxicant (experimental value, good/moderate reliability) | TP, FP |
| | NON‐toxicant/toxicant (low reliability) | UNKNOWN |
| VEGA_PG | NON‐toxicant | TN, FN |
| | Toxicant | TP, FP |
| OECD (Q)SAR Toolbox (OQTB) | Not known precedent reproductive and developmental toxic potential | TN, FN |
| | Known precedent reproductive and developmental toxic potential | TP, FP |
| | Not covered by current version of the decision tree | UNKNOWN |
| Leadscope (LS) | Negative/Negative_EV | TN, FN |
| | Positive/Positive_EV | TP, FP |
| | Missing descriptors/not in domain | UNKNOWN |
| CASE Ultra (CU) | Negative/known negative | TN, FN |
| | Positive/known positive | TP, FP |
| | Inconclusive/out of domain | UNKNOWN |
Abbreviations: FN, false negative; FP, false positive; TN, true negative; TP, true positive.
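The table above pairs each model output with the experimentally based classification to obtain one of five evaluations. A minimal sketch of that logic follows, assuming simplified, lower‐cased prediction labels; the label sets and the function name are hypothetical and would need to be adapted to the actual report format of each tool.

```python
# Simplified label sets; these strings are illustrative assumptions,
# not the exact output of any of the evaluated tools.
POSITIVE_LABELS = {"toxicant", "positive", "positive_ev", "known positive"}
NEGATIVE_LABELS = {"non-toxicant", "negative", "negative_ev", "known negative"}


def evaluate_prediction(prediction: str, is_reprotoxic: bool) -> str:
    """Return TP, FP, TN, FN or UNKNOWN for a single pesticide."""
    label = prediction.strip().lower()
    if label in POSITIVE_LABELS:
        return "TP" if is_reprotoxic else "FP"
    if label in NEGATIVE_LABELS:
        return "FN" if is_reprotoxic else "TN"
    # Everything else (low reliability, out of domain, inconclusive,
    # missing descriptors, not covered by the decision tree) is UNKNOWN.
    return "UNKNOWN"


print(evaluate_prediction("Positive", True))        # TP
print(evaluate_prediction("NON-toxicant", True))    # FN
print(evaluate_prediction("Out of domain", False))  # UNKNOWN
```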
The formulas for calculating the typical parameters to evaluate prediction models
| Value | Name | Definition |
|---|---|---|
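| SEN | Sensitivity | $\mathrm{SEN} = \frac{TP}{TP + FN}$ |
| SPC | Specificity | $\mathrm{SPC} = \frac{TN}{TN + FP}$ |
| ACC | Accuracy | $\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$ |
| BA | Balanced accuracy | $\mathrm{BA} = \frac{\mathrm{SEN} + \mathrm{SPC}}{2}$ |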
Abbreviations: FN, false negative; FP, false positive; TN, true negative; TP, true positive.
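The four parameters can be computed directly from the confusion counts. The sketch below assumes the standard definitions (sensitivity as TP/(TP + FN), specificity as TN/(TN + FP), accuracy over all evaluated predictions, balanced accuracy as the mean of sensitivity and specificity) and excludes UNKNOWN predictions from every denominator; the class and function names are hypothetical. The example uses the counts reported for the CAESAR model over all pesticides (17 TP, 42 FP, 8 TN, 2 FN).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Counts:
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return the ratio, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def evaluate(c: Counts) -> dict:
    """Compute SEN, SPC, ACC and BA; UNKNOWN predictions are not counted."""
    sen = _ratio(c.tp, c.tp + c.fn)
    spc = _ratio(c.tn, c.tn + c.fp)
    acc = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    ba = (sen + spc) / 2 if sen is not None and spc is not None else None
    return {"SEN": sen, "SPC": spc, "ACC": acc, "BA": ba}


caesar_all = evaluate(Counts(tp=17, fp=42, tn=8, fn=2))
print({name: round(value, 2) for name, value in caesar_all.items()})
# {'SEN': 0.89, 'SPC': 0.16, 'ACC': 0.36, 'BA': 0.53}
```

Returning `None` for an undefined ratio mirrors the "‐" entries in the result tables, for example when a subset contains no reprotoxic pesticides.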
The distribution of reliability of the developmental toxicity prediction of 310 pesticides using the CAESAR model provided by VEGA
| Reliability | Applicability domain | # | # [%] |
|---|---|---|---|
| Experimental value | The predicted compound | 1 | 0.32 |
| Good reliability | The predicted compound | 28 | 9.03 |
| Moderate reliability | The predicted compound | 40 | 12.90 |
| Low reliability | The predicted compound | 241 | 77.74 |
The results of evaluation of the CAESAR model via typical parameters
| | # FN | # FP | # TN | # TP | # UNKNOWN | SEN | SPC | BA | ACC |
|---|---|---|---|---|---|---|---|---|---|
| ALL | 2 | 42 | 8 | 17 | 241 | 0.89 | 0.16 | 0.53 | 0.36 |
| Experimental value | 0 | 0 | 1 | 0 | ‐ | ‐ | 1.00 | ‐ | 1.00 |
| Good reliability | 1 | 23 | 1 | 3 | ‐ | 0.75 | 0.04 | 0.40 | 0.14 |
| Moderate reliability | 1 | 19 | 6 | 14 | ‐ | 0.93 | 0.24 | 0.59 | 0.50 |
Note: In addition to the evaluation for all pesticides, the following lines contain the evaluation related to the prediction reliability.
Abbreviations: ACC, accuracy; BA, balanced accuracy; FN, false negative; FP, false positive; SEN, sensitivity; SPC, specificity; TN, true negative; TP, true positive.
Example for a false positive CAESAR prediction despite good reliability
| | Name | Structure | Developmental toxicant? | CAESAR prediction | Similarity by VEGA | Avg. Tanimoto (PubChem) | Avg. Tanimoto (RDKit) | Avg. Tanimoto (Morgan) | Avg. Tanimoto (Feat Morgan) |
|---|---|---|---|---|---|---|---|---|---|
| Tested pesticide | Napropamide | | NO (ECHA) | Toxicant | 1 | 1 | 1 | 1 | |
| Similar compound 1 | Phenyltoloxamine | | YES (CAESAR) | Toxicant | 0.855 | 0.828 | 0.345 | 0.208 | 0.205 |
| Similar compound 2 | Naproxen | | YES (CAESAR) | Toxicant | 0.83 | 0.671 | 0.331 | 0.212 | 0.282 |
Note: Napropamide was the tested pesticide and phenyltoloxamine and naproxen were the most similar compounds of the data set of the CAESAR model. To show the variability of similarity depending on the selected descriptors, the similarity score provided by the VEGA platform was compared with the average Tanimoto similarity coefficient based on different fingerprints.
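The dependence of the similarity score on the chosen descriptors follows from the definition of the Tanimoto coefficient, which is the number of shared fingerprint features divided by the number of features present in either molecule. The sketch below uses plain Python sets of feature indices rather than a cheminformatics library; the toy fingerprints are invented for illustration and do not correspond to napropamide or any real compound.

```python
from typing import Iterable


def tanimoto(features_a: Iterable[int], features_b: Iterable[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of 'on' bits."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Two empty fingerprints: returning 0.0 is a convention chosen
        # for this sketch only.
        return 0.0
    return len(a & b) / len(union)


# Invented toy fingerprints for the same pair of molecules under two
# hypothetical descriptor schemes.
coarse_query, coarse_analog = {1, 2, 3, 4}, {1, 2, 3, 5}
fine_query, fine_analog = {10, 11, 12, 13, 14, 15}, {10, 20, 21, 22, 23, 24}

print(round(tanimoto(coarse_query, coarse_analog), 3))  # 0.6
print(round(tanimoto(fine_query, fine_analog), 3))      # 0.091
```

A coarse fingerprint that encodes only a few generic features tends to produce high scores, while a finer one that encodes atom environments separates the same two molecules much more strongly, which is the pattern seen in the table.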
The results of evaluation of the PG model via typical parameters
| | # FN | # FP | # TN | # TP | SEN | SPC | BA | ACC |
|---|---|---|---|---|---|---|---|---|
| ALL | 34 | 77 | 182 | 17 | 0.33 | 0.70 | 0.52 | 0.64 |
| Experimental value | 0 | 23 | 3 | 13 | 1.00 | 0.12 | 0.56 | 0.41 |
| Categorized | 0 | 77 | 0 | 17 | 1.00 | 0 | 0.50 | 0.18 |
| Uncategorized | 34 | 0 | 182 | 0 | 0 | 1.00 | 0.50 | 0.84 |
Note: In addition to the evaluation for all pesticides, the following lines differentiate between experimental value and categorized or uncategorized pesticides.
Abbreviations: ACC, accuracy; BA, balanced accuracy; FN, false negative; FP, false positive; SEN, sensitivity; SPC, specificity; TN, true negative; TP, true positive.
FIGURE 2 The pie charts show the distribution of pesticides in the chemical categories defined by Cassano et al. (2010) predicted by the PG model (a) or DART scheme of the OECD (Q)SAR Toolbox (b). The structural description of the categories can be found in Table S7
FIGURE 3 The bar plots show the evaluation of the predictions divided by the predicted categories for the PG and DART model by OECD (Q)SAR Toolbox. The aim of the depiction is to analyze whether the prediction for some categories is more reliable than for others. FP, false positive; TP, true positive
Structure of the tested pesticide 2,4‐D, the matching rule/virtual compound and the two most similar compounds, as well as their predicted categories by the PG model
| | Name | Structure | Predicted category |
|---|---|---|---|
| Tested pesticide | 2,4‐D | | 8c |
| Matching rule/virtual compound | ‐ | | ‐ |
| Similar compound 2 | 2,4,5‐trichlorophenoxyacetic acid | | 9c |
| Similar compound 3 | 2,4‐D isopropyl ester | | 9c |
The results of evaluation of the DART scheme of the OECD (Q)SAR Toolbox via typical parameters
| | # FN | # FP | # TN | # TP | # UNKNOWN | SEN [%] | SPC [%] | BA [%] | ACC [%] |
|---|---|---|---|---|---|---|---|---|---|
| ALL | 22 | 88 | 152 | 29 | 19 | 57 | 63 | 60 | 62 |
| Categorized | 0 | 88 | 0 | 29 | 0 | 100 | 0 | 50 | 25 |
| Uncategorized | 22 | 0 | 152 | 0 | 0 | 0 | 100 | 50 | 87 |
Note: In addition to the evaluation for all pesticides, the following lines differentiate between categorized and uncategorized pesticides.
Abbreviations: ACC, accuracy; BA, balanced accuracy; FN, false negative; FP, false positive; SEN, sensitivity; SPC, specificity; TN, true negative; TP, true positive.
FIGURE 4 The structural scope of “Toluene and small alkyl toluene derivatives (8a).” R = H, Me, nBu, iPropyl, tBu
A selection of pesticides that were incorrectly classified in subcategory 8a
| Name | 1,4‐dimethyl‐naphthalene | Bifenthrin | Cyazofamid | Iprovalicarb | Metrafenone |
|---|---|---|---|---|---|
| CAS no. | 571‐58‐4 | 82657‐04‐3 | 120116‐88‐3 | 140923‐17‐7 | 220899‐03‐6 |
| Structure | | | | | |
The results of evaluation of the two Leadscope models Repro Female Rat (RFR) and Repro Male Rat (RMR) via typical parameters
| Model | # FN | # FP | # TN | # TP | # UNKNOWN | SEN | SPC | BA | ACC |
|---|---|---|---|---|---|---|---|---|---|
| RFR | 3 | 3 | 156 | 1 | 147 | 0.25 | 0.98 | 0.62 | 0.96 |
| RMR | 4 | 26 | 94 | 3 | 183 | 0.43 | 0.78 | 0.61 | 0.76 |
Abbreviations: ACC, accuracy; BA, balanced accuracy; FN, false negative; FP, false positive; SEN, sensitivity; SPC, specificity; TN, true negative; TP, true positive.
Detected structural features and selected training set analogs of triadimenol, which is reprotoxic in female and male rats
| Predicted pesticide | Model | Evaluation | Detected structural features | Selected relevant analog structures |
|---|---|---|---|---|
| CAS no. 55219‐65‐3 | RFR v2 | FN | Benzene, 1‐alkoxy‐, 4‐chloro‐ | |
| | RMR v2 | UNKNOWN | Chlorophenol‐ | No analog structures reported. |
Detected structural features and selected training set analogs of epoxiconazole, which is reprotoxic in female rats
| Predicted pesticide | Model | Evaluation | Detected structural features | Selected relevant analog structures |
|---|---|---|---|---|
| CAS no. 135319‐73‐2 | RFR v2 | TP | Benzene, 1‐halo‐, 4‐oxymethyl‐ | |
| | RMR v2 | FP | Benzene, 1‐fluoro‐ | For structure, see above |
| | | | Benzene, 1‐alkyl‐, 2‐halo‐ | |
| | | | Benzene, 1‐alkyl‐, 2‐chloro‐ | |
| | | | Benzene, 1‐alkyl‐, 4‐halo‐ | |
The results of evaluation of the four tested CASE Ultra models FDYSM_Rabbit, FDYSM_Rat, FFERT_Rat and MFERT_Rat via typical parameters
| Model | # FN | # FP | # TN | # TP | # UNKNOWN | SEN | SPC | BA | ACC |
|---|---|---|---|---|---|---|---|---|---|
| FDYSM_RABBIT | 4 | 18 | 75 | 5 | 208 | 0.56 | 0.81 | 0.68 | 0.78 |
| FDYSM_RAT | 19 | 75 | 115 | 10 | 91 | 0.34 | 0.61 | 0.48 | 0.57 |
| FFERT_RAT | 2 | 18 | 132 | 1 | 157 | 0.33 | 0.88 | 0.61 | 0.87 |
| MFERT_RAT | 2 | 62 | 106 | 3 | 137 | 0.60 | 0.63 | 0.62 | 0.63 |
Abbreviations: ACC, accuracy; BA, balanced accuracy; FN, false negative; FP, false positive; SEN, sensitivity; SPC, specificity; TN, true negative; TP, true positive.
All triazoles of the pesticide DB that showed fetal dysmorphogenesis in ECHA classification‐relevant studies in rats and their prediction by the FDYSM_Rat model from CASE Ultra
| Name | CAS no. | Structure | Prediction/probability/alert |
|---|---|---|---|
| Ipconazole | 125225‐28‐7 | | Negative/30.3/no alert |
| Metconazole | 125116‐23‐6 | | Negative/30.3/no alert |
| Paclobutrazol | 76738‐62‐0 | | Negative/30.3/no alert |
| Penconazole | 66246‐88‐6 | | Negative/30.3/no alert |
| Tebuconazole | 107534‐96‐3 | | Negative/30.3/no alert |
| Triadimenol | 55219‐65‐3 | | Negative/30.3/no alert |
| Epoxiconazole | 133855‐98‐8 | | Out of domain/30.3/no alert |
| Bromuconazole | 116255‐48‐2 | | Positive/56/alert ID 105: C3H2‐C3‐c:cH:cH:c:cH |
| Cyproconazole | 94361‐06‐5 | | Positive/56/alert ID 105: C3H2‐C3‐c:cH:cH:c:cH |
| Propiconazole | 60207‐90‐1 | | Positive/56/alert ID 105: C3H2‐C3‐c:cH:cH:c:cH |
Note: When an alert was found, the relevant structure in the molecular pesticide structure is highlighted in green.
FIGURE 5 Plot of the accuracy against the balanced accuracy (a) and the FPR against the TPR (b) per model. The size of the points depends on the percentage of pesticides predicted. The black line shows the diagonal of the plot (TPR = FPR). The closer the points are to the diagonal, the more the model's prediction resembles a random process
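The quantities plotted in panel (b) can be derived from the confusion counts in the result tables. The following minimal sketch assumes that TPR equals sensitivity, that FPR equals FP/(FP + TN), and that the percentage of pesticides predicted is the share of the 310 pesticides without an UNKNOWN result; the model keys and the function name are hypothetical.

```python
TOTAL_PESTICIDES = 310

# (FN, FP, TN, TP, UNKNOWN) copied from the Leadscope and CASE Ultra result tables.
# The keys are illustrative labels, not identifiers used by the tools.
COUNTS = {
    "LS_RFR": (3, 3, 156, 1, 147),
    "LS_RMR": (4, 26, 94, 3, 183),
    "CU_FDYSM_RABBIT": (4, 18, 75, 5, 208),
    "CU_FDYSM_RAT": (19, 75, 115, 10, 91),
}


def roc_point(fn, fp, tn, tp, unknown, total=TOTAL_PESTICIDES):
    """Return (TPR, FPR, coverage) for one model."""
    tpr = tp / (tp + fn)   # true positive rate (sensitivity)
    fpr = fp / (fp + tn)   # false positive rate (1 - specificity)
    coverage = (total - unknown) / total  # assumed share of pesticides predicted
    return tpr, fpr, coverage


for name, counts in COUNTS.items():
    tpr, fpr, coverage = roc_point(*counts)
    side = "above" if tpr > fpr else "on or below"
    print(f"{name}: TPR={tpr:.2f} FPR={fpr:.2f} "
          f"coverage={coverage:.0%} ({side} the diagonal)")
```

With these counts, the FDYSM rat model has a TPR below its FPR and therefore falls under the diagonal, while the other three models listed here lie above it.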
FIGURE 6 Each bar plot shows the distribution of FN, FP, TN, TP, and UNKNOWN per chemical group for one prediction model. A more detailed description of the chemical groups can be found in Table 2
Possible additional information on the prediction, which is made available in the reports
| Information about… | Important questions |
|---|---|
| Structural alert/feature/predicted category | Does the selected structural fragment match the key functional groups of the pesticide? |
| Analog structures/similar compounds from training set | How similar are these compounds? |
| Data sources | Which source is the classification based on? Which effects are described in this source? |