Abstract
Currently, more than 100,000 industrial chemical substances are produced and present in our living environments. Some of them may have adverse effects on human health. Given the rapid expansion in the number of industrial chemicals, international organizations and regulatory authorities have expressed the need for effective screening tools that promptly and accurately identify chemical substances with potential adverse effects without conducting actual toxicological studies. (Quantitative) Structure-Activity Relationship ((Q)SAR) is a promising approach for predicting the potential adverse effects of a chemical on the basis of its chemical structure. Significant effort has been devoted to developing (Q)SAR models for predicting Ames mutagenicity, among other toxicological endpoints, because a large amount of the necessary Ames test data has already been accumulated. The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) M7 guideline for the assessment and control of mutagenic impurities in pharmaceuticals was established in 2014. It is the first international guideline that addresses the use of (Q)SAR instead of actual toxicological studies for human health assessment. Therefore, (Q)SAR models for Ames mutagenicity now require higher predictive power for identifying mutagenic chemicals. This review introduces the advantages and features of (Q)SAR. Several (Q)SAR tools for predicting Ames mutagenicity and approaches to improving (Q)SAR models are also reviewed. Finally, I discuss the future of (Q)SAR and other advanced in silico technologies in genetic toxicology.
Keywords: (quantitative) structure–activity relationship ((Q)SAR); AI; Ames test; Benchmark data set; Deep learning; ICH-M7; Prediction power; Rule-based; Statistics-based
Year: 2020 PMID: 32626544 PMCID: PMC7330942 DOI: 10.1186/s41021-020-00163-1
Source DB: PubMed Journal: Genes Environ ISSN: 1880-7046
(Q)SAR tools used under the Chemical Substance Control Law in Japan
| Ministry | Evaluation item | QSAR tools |
|---|---|---|
| Ministry of Economy, Trade and Industry | Degradability | BIOWIN5 |
| | | BIOWIN6 |
| | | CATALOGIC |
| | Accumulation | BCFWIN |
| | | Arnot-Gobas Model |
| | | Baseline Model |
| Ministry of the Environment | Ecological effects | TIMES |
| | | ECOSAR |
| | | KATE |
| Ministry of Health, Labor and Welfare | Human health effects (Ames mutagenicity) | DerekNexus |
| | | CASE Ultra |
| | | TIMES_Ames |
Fig. 1 Rule-based QSAR and statistical-based QSAR
OECD Principles for the validation, for regulatory purposes, of QSAR models
| Principle | Description |
|---|---|
| 1. A defined endpoint | Clarify the endpoint of the test system that the model predicts (the model predicts Ames test results or chromosomal aberration test results; it does not predict genotoxicity or mutagenicity in general). |
| 2. An unambiguous algorithm | Clarify the type of model (rule-based or statistical-based) and the methods (algorithms, descriptors, etc.) used to build it, and ensure their transparency. For commercial models, however, this information is often not fully disclosed. |
| 3. A defined domain of applicability | Because the predictivity of a QSAR model depends on the training set used to build it, the types of chemicals for which highly accurate predictions can be made are limited. Therefore, clarify the limits of the chemical structures to which the QSAR model can be applied (clarification of Out of Domain). |
| 4. Appropriate measures of goodness-of-fit, robustness and predictivity | The goodness of fit and robustness of the predictive model should be evaluated using the internal training set. Its predictivity should also be determined using an appropriate external dataset. |
| 5. A mechanistic interpretation, if possible | If possible, show the mechanistic association between the model descriptors and the predicted endpoint. If the model can be interpreted mechanistically, this can be part of the scope of Principle 3. |
2 × 2 prediction matrix for Ames mutagenicity classification
| | | Experimental Ames mutagenicity class | |
|---|---|---|---|
| | | Positive | Negative |
| QSAR prediction class | Positive | True Positive (TP) | False Positive (FP) |
| | Negative | False Negative (FN) | True Negative (TN) |
| | Unpredictable (OODᵃ) | – | – |

ᵃ Out of Domain
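
The cells of the prediction matrix above can be tallied from paired experimental and predicted labels. The following is a minimal sketch, not taken from the review: the function name `tally_predictions`, the label strings, and the use of `None` for an out-of-domain result are all assumptions made for illustration, and the data are a hypothetical toy set.

```python
def tally_predictions(experimental, predicted):
    """Count TP, FP, FN, TN and OOD from paired labels.

    experimental: iterable of "positive"/"negative" (Ames test result)
    predicted:    iterable of "positive"/"negative"/None
                  (None is an assumed marker for an out-of-domain compound)
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0, "OOD": 0}
    for exp, pred in zip(experimental, predicted):
        if pred is None:
            # The tool declined to predict: counted as out of domain.
            counts["OOD"] += 1
        elif pred == "positive":
            counts["TP" if exp == "positive" else "FP"] += 1
        else:
            counts["FN" if exp == "positive" else "TN"] += 1
    return counts


# Hypothetical toy data, for illustration only.
experimental = ["positive", "positive", "negative", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "positive", None, None]

matrix = tally_predictions(experimental, predicted)
print(matrix)
```

Out-of-domain compounds are kept in a separate count rather than being forced into the 2 × 2 cells, so they affect coverage but not sensitivity or specificity.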
Performance metrics used to evaluate classifiers
| Performance metric | Calculation | Description |
|---|---|---|
| Sensitivity (SENS) | $\dfrac{TP}{TP + FN}$ | Measures the ability of a QSAR tool to detect Ames-positive compounds correctly. |
| Specificity (SPEC) | $\dfrac{TN}{TN + FP}$ | Measures the ability of a QSAR tool to detect Ames-negative compounds correctly. |
| Accuracy (ACC) | $\dfrac{TP + TN}{TP + TN + FP + FN}$ | Assesses a QSAR tool's overall performance by returning the fraction of compounds that were correctly predicted. |
| Balanced Accuracy (BA) | $\dfrac{SENS + SPEC}{2}$ | Assesses the overall model performance, giving each class equal weight. |
| Positive Prediction Value (PPV) | $\dfrac{TP}{TP + FP}$ | Indicates how frequently positive predictions are correct. |
| Negative Prediction Value (NPV) | $\dfrac{TN}{TN + FN}$ | Indicates how frequently negative predictions are correct. |
| Matthews Correlation Coefficient (MCC) | $\dfrac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$ | Assesses the overall performance of the model. Values range from −1 to 1, in contrast to the other metrics in this table, which range from 0 to 1. |
| Coverage (COV) | $\dfrac{TP + TN + FP + FN}{TP + TN + FP + FN + OOD}$ | Evaluates the proportion of compounds for which the model can make a positive or negative prediction. |
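
The metrics listed above can all be derived from the four cells of the prediction matrix plus the out-of-domain count. The sketch below uses the standard textbook definitions of each metric; it is an illustration rather than the review's own code, the function name `classification_metrics` is hypothetical, and the counts passed to it are invented numbers. It assumes that every denominator other than the MCC one is non-zero.

```python
import math


def classification_metrics(tp, fp, fn, tn, ood=0):
    """Return the eight metrics as a dict, using standard definitions."""
    predicted = tp + fp + fn + tn  # compounds that received a prediction
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SENS": sens,
        "SPEC": spec,
        "ACC": (tp + tn) / predicted,
        "BA": (sens + spec) / 2,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        # By convention MCC is set to 0 when its denominator is 0.
        "MCC": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
        "COV": predicted / (predicted + ood),
    }


# Hypothetical counts, chosen only to illustrate an imbalanced data set.
m = classification_metrics(tp=60, fp=150, fn=40, tn=700, ood=50)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

In this invented example accuracy is 0.80 while PPV is below 0.30, which illustrates why balanced accuracy and MCC are reported alongside accuracy when negatives greatly outnumber positives.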
Participants in the Ames/QSAR international challenge project
| QSAR Vendor | QSAR Tool |
|---|---|
| 1. Lhasa Limited (UK) | ① Derek Nexus |
| | ② Sarah Nexus |
| 2. MultiCASE Inc. (USA) | ③ CASE Ultra statistical-based |
| | ④ CASE Ultra rule-based |
| 3. Leadscope Inc. (USA) | ⑤ Leadscope statistical-based |
| | ⑥ Leadscope rule-based |
| 4. IRCCS - Istituto di Ricerche Farmacologiche Mario Negri (Italy) | ⑦ CAESAR |
| | ⑧ SARPY |
| | ⑨ KNN |
| 5. LMC - Bourgas University (Bulgaria) | ⑩ TIMES_AMES |
| 6. Istituto Superiore di Sanità (Italy) | ⑪ Toxtree |
| 7. Prous Institute (Spain) | ⑫ Symmetry |
| 8. Swedish Toxicology Science Research Center (Sweden) | ⑬ AZAMES |
| 9. FUJITSU KYUSHU SYSTEMS LIMITED (Japan) | ⑭ ADMEWORKS |
| 10. IdeaConsult Ltd. (Bulgaria) | ⑮ AMBIT |
| 11. Molecular Networks GmbH and Altamira LLC (USA) | ⑯ ChemTune•ToxGPS |
| 12. Simulations Plus, Inc. (USA) | ⑰ MUT_Risk |
Number of chemicals in the Ames/QSAR international challenge project
| Class | Phase I (2014–2015) | Phase II (2015–2016) | Phase III (2016–2017) | Total (2014–2017) |
|---|---|---|---|---|
| Positive | 566 (14.5%) | 562 (14.7%) | 629 (14.3%) | 1757 (14.4%) |
| Negative | 3336 (85.5%) | 3267 (85.3%) | 3780 (85.7%) | 10,383 (85.6%) |
| Total | 3902 | 3829 | 4409 | 12,140 |
Averages and ranges of the performance metrics of QSAR tools in the Ames/QSAR challenge project
| Performance metric | Phase I | Phase II | Phase III |
|---|---|---|---|
| Sensitivity (%) | 56.7 (38.6–70.0) | 58.0 (41.6–72.1) | 57.1 (31.7–67.6) |
| Specificity (%) | 77.7 (62.5–91.5) | 84.2 (64.9–92.8) | 79.9 (60.7–93.0) |
| Accuracy (%) | 74.7 (63.6–83.9) | 80.3 (65.8–87.7) | 76.7 (68.0–87.3) |
| Balanced Accuracy (%) | 67.2 (62.1–72.5) | 71.1 (64.0–78.9) | 68.5 (62.0–74.4) |
| Positive Prediction Value (%) | 31.2 (24.8–43.1) | 41.2 (27.4–56.3) | 34.8 (21.1–51.0) |
| Negative Prediction Value (%) | 91.5 (89.4–92.5) | 91.9 (88.1–94.2) | 92.0 (89.1–93.6) |
| MCC | 0.28 (0.20–0.39) | 0.37 (0.25–0.50) | 0.31 (0.17–0.44) |
| Coverage (%) | 91.4 (57.7–100) | 89.1 (22.7–100) | 92.3 (74.5–100) |
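
The combination of a low positive prediction value and a high negative prediction value in the table above is what class imbalance would lead one to expect. As a rough check, PPV and NPV can be estimated from sensitivity, specificity and the fraction of positives in the data set. The sketch below is an illustration only: the function name is hypothetical, and plugging in the Phase I averages (sensitivity 56.7%, specificity 77.7%) with the Phase I positive fraction (14.5%) is an approximation, because the reported values are averages across tools and coverage was below 100% for some of them.

```python
def ppv_npv_from_rates(sensitivity, specificity, prevalence):
    """Estimate PPV and NPV from sensitivity, specificity and prevalence."""
    # Expected fraction of all compounds falling in each matrix cell.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Phase I averages from the table, used as approximate inputs.
ppv, npv = ppv_npv_from_rates(sensitivity=0.567, specificity=0.777, prevalence=0.145)
print(f"estimated PPV: {ppv:.3f}, estimated NPV: {npv:.3f}")
```

The estimates come out at roughly 0.30 and 0.91, close to the reported Phase I averages of 31.2% and 91.5%, which suggests that much of the gap between PPV and NPV reflects the low proportion of Ames-positive chemicals rather than a specific weakness of the tools.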
Fig. 2 Convolutional Neural Network (CNN) from SMILES text
Fig. 3 Evaluation of mutagenicity of chemicals by Integrated Approach