Cosimo Toma, Claudia I Cappelli, Alberto Manganaro, Anna Lombardo, Jürgen Arning, Emilio Benfenati.
Abstract
To assess the impact of chemicals on an aquatic environment, toxicological data for three trophic levels are needed to address both chronic and acute toxicity. Non-testing methods, such as predictive computational models, have been proposed to avoid or reduce the need for animal models and to speed up the process when many substances have to be tested. We developed predictive models of acute and chronic toxicity for Raphidocelis subcapitata, Daphnia magna, and fish. The random forest machine learning approach gave the best results. The models showed good statistical quality for all endpoints. These models are freely available for use as individual models in the VEGA platform and for prioritization in the JANUS software.
Keywords: Daphnia magna; Raphidocelis subcapitata; applicability domain; biological databases; fish; quantitative structure-activity relationship (QSAR); random forest
Year: 2021; PMID: 34834075; PMCID: PMC8618112; DOI: 10.3390/molecules26226983
Source DB: PubMed; Journal: Molecules; ISSN: 1420-3049; Impact factor: 4.411
The statistical parameters of the QSAR models on aquatic toxicity. Box-Cox transformation of millimoles per liter was used in place of the logarithm of millimoles per liter.
| Set | Statistic | R. subcapitata EC50 a | R. subcapitata NOEC b | D. magna EC50 a | D. magna NOEC b | Fish LC50 a | Fish NOEC b |
|---|---|---|---|---|---|---|---|
| Training set | R2 c | 0.96 | 0.95 | 0.94 | 0.95 | 0.95 | 0.96 |
| | MAE d | 0.41 | 0.56 | 0.49 | 0.42 | 0.27 | 0.54 |
| | RMSE e | 0.52 | 0.74 | 0.64 | 0.56 | 0.37 | 0.68 |
| Validation set without AD f | R2 c | 0.59 | 0.58 | 0.56 | 0.74 | 0.65 | 0.74 |
| | MAE d | 0.96 | 1.36 | 0.99 | 0.83 | 0.68 | 1.75 |
| | RMSE e | 1.25 | 1.73 | 1.31 | 1.13 | 0.87 | 2.45 |
| Validation set with AD f | R2 c | 0.6 | 0.63 | 0.69 | 0.78 | 0.65 | 0.76 |
| | MAE d | 0.97 | 1.29 | 0.84 | 0.8 | 0.64 | 1.79 |
| | RMSE e | 1.26 | 1.66 | 1.09 | 1.07 | 0.83 | 2.54 |
| | Coverage | 0.89 | 0.93 | 0.84 | 0.81 | 0.81 | 0.89 |
| Details of the model | Feature selection | VSURF | GASELECT | GASELECT | VSURF | VSURF | GASELECT |
| | No. of descriptors | 13 | 40 | 12 | 17 | 13 | 12 |
| | Distance mode | Euclidean-5 | Euclidean-5 | Euclidean-1 | Euclidean-1 | Euclidean-5 | Euclidean-5 |
| | Distance threshold | 0.9 | 0.975 | 0.9 | 0.975 | 0.9 | 0.975 |
| | Error percentile | 0.9 | 1 | 0.75 | 0.75 | 1 | 1 |
a E(L)C50 is the concentration that causes the effect (death) in 50% of the exposed population. b NOEC is the no observed effect concentration. c R2 is the determination coefficient. d MAE is the mean absolute error. e RMSE is the root mean squared error. f AD is the applicability domain of the model.
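As a reading aid for the statistics in the table above, the following minimal Python sketch shows how such figures are commonly computed. It assumes the usual textbook definitions: R2 as 1 − SS_res/SS_tot, MAE and RMSE over paired observed and predicted values, coverage as the fraction of validation chemicals falling inside the applicability domain, and the standard one-parameter Box-Cox transform. The record does not give the authors' exact formulas, Box-Cox parameter, or data, so the function names and the numbers used here are illustrative assumptions only.

```python
import math

def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a positive value.
    The lambda actually used in the paper is not stated in this record."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def coverage(in_domain):
    """Fraction of chemicals flagged as inside the applicability domain."""
    return sum(1 for flag in in_domain if flag) / len(in_domain)

# Made-up values for illustration; these are NOT data from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
in_domain = [True, True, False, True]

print("MAE:", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
print("Coverage:", coverage(in_domain))

# "With AD" statistics: keep only the in-domain chemicals, then recompute.
kept = [(t, p) for t, p, ok in zip(y_true, y_pred, in_domain) if ok]
print("MAE with AD:", mae([t for t, _ in kept], [p for _, p in kept]))
```

Under these assumptions, the "with AD" rows of the table correspond to recomputing the statistics on the in-domain subset, while coverage reports how large that subset is relative to the full validation set.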
Number of chemicals for each trophic level from acute short-term and chronic toxicity tests available in the Japanese Ministry of Environment’s database after pruning of chemical structures and experimental values. The numbers of substances before pruning are given in parentheses.
| | Acute Toxicity Test | Chronic Toxicity Test |
|---|---|---|
| Number of chemicals for Raphidocelis subcapitata | 315 (372) | 408 (577) |
| Number of chemicals for Daphnia magna | 428 (509) | 306 (372) |
| Number of chemicals for fish | 331 (393) | 35 (37) |