Thomas Ferrari, Giuseppina Gini.
Abstract
BACKGROUND: Mutagenicity is the capability of a substance to cause genetic mutations. This property is of high public concern because it is closely related to carcinogenicity and potentially to reproductive toxicity. Experimentally, mutagenicity can be assessed by the Ames test on Salmonella, with an estimated experimental reproducibility of 85%. This intrinsic limitation of the in vitro test, along with the need for faster and cheaper alternatives, opens the road to other types of assessment methods, such as in silico structure-activity prediction models.

A widely used method checks for the presence of known structural alerts for mutagenicity. However, the presence of such alerts alone does not definitively prove the mutagenicity of a compound towards Salmonella, since other parts of the molecule can influence and potentially change the classification. Hence, statistically based methods are proposed, with the final objective of obtaining a cascade of modeling steps with custom-made properties, such as the reduction of false negatives.
Year: 2010 PMID: 20678181 PMCID: PMC2913329 DOI: 10.1186/1752-153X-4-S1-S2
Source DB: PubMed Journal: Chem Cent J ISSN: 1752-153X Impact factor: 4.215
Figure 1. The architecture of the integrated mutagenicity model: cascading filters.
Table 1. Confusion matrix of the mutagenicity integrated model on the test set (837 chemical compounds).
| Test set | Mutagenic predictions | Non-mutagenic predictions | Suspicious predictions |
|---|---|---|---|
| Mutagens | 403 | 48 | 14 |
| Non-mutagens | 88 | 268 | 16 |
The "suspicious" label marks potential mutagens picked out from the "non-mutagenic" predictions.
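The cascading-filter idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `cascade_predict`, the three stage callables, their ordering, and the toy compound names are all hypothetical. The only behaviour taken from the text is that the "suspicious" label is given to compounds that would otherwise be predicted non-mutagenic.

```python
from typing import Callable

# Hypothetical stage signature: takes a compound identifier, returns True on a hit.
Stage = Callable[[str], bool]


def cascade_predict(compound: str,
                    classifier: Stage,
                    alert_filter: Stage,
                    suspicious_filter: Stage) -> str:
    """Run one compound through cascading filters (stage order is an assumption)."""
    if classifier(compound):         # stage 1: statistical model flags a mutagen
        return "mutagenic"
    if alert_filter(compound):       # stage 2: a structural alert fires on a "safe" call
        return "mutagenic"
    if suspicious_filter(compound):  # stage 3: weaker rules mark what is left
        return "suspicious"
    return "non-mutagenic"


# Toy stand-ins for the real stages; these sets are made up for the demo.
svm_hits = {"cmpd_A"}
alert_hits = {"cmpd_B"}
suspicious_hits = {"cmpd_C"}

labels = {
    c: cascade_predict(c,
                       svm_hits.__contains__,
                       alert_hits.__contains__,
                       suspicious_hits.__contains__)
    for c in ["cmpd_A", "cmpd_B", "cmpd_C", "cmpd_D"]
}
print(labels)
```

Because each later stage only sees compounds that earlier stages left as non-mutagenic, adding a stage can only move predictions away from the "non-mutagenic" label, which is how a cascade of this shape trades specificity for fewer false negatives.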
Confusion matrix of the mutagenicity integrated model on the training set (3367 chemical compounds).
| Training set | Mutagenic predictions | Non-mutagenic predictions | Suspicious predictions | Unpredicted compounds |
|---|---|---|---|---|
| Mutagens | 1798 | 69 | 15 | 1 |
| Non-mutagens | 169 | 1239 | 76 | 0 |
The low number of true positives in the suspicious set, compared with the test set confusion matrix (cf. Table 1), is due to the very small number of real mutagens among the "non-mutagenic" predictions on the training set. The single unpredicted structure could not be processed by the CDK library.
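The note above can be checked with simple arithmetic on the two confusion matrices. The sketch below assumes that the compounds reaching the suspicious step are those in the "non-mutagenic" and "suspicious" columns combined; the helper name `mutagen_rates` is hypothetical.

```python
# Counts copied from the test-set and training-set confusion matrices.
test = {"mutagens": {"non_mutagenic": 48, "suspicious": 14},
        "non_mutagens": {"non_mutagenic": 268, "suspicious": 16}}
train = {"mutagens": {"non_mutagenic": 69, "suspicious": 15},
         "non_mutagens": {"non_mutagenic": 1239, "suspicious": 76}}


def mutagen_rates(matrix):
    """Return (share of real mutagens among compounds not predicted mutagenic,
    share of real mutagens among compounds labelled suspicious)."""
    mut, non = matrix["mutagens"], matrix["non_mutagens"]
    pool_mutagens = mut["non_mutagenic"] + mut["suspicious"]
    pool_total = pool_mutagens + non["non_mutagenic"] + non["suspicious"]
    suspicious_total = mut["suspicious"] + non["suspicious"]
    return pool_mutagens / pool_total, mut["suspicious"] / suspicious_total


for name, matrix in [("test", test), ("training", train)]:
    base, in_suspicious = mutagen_rates(matrix)
    print(f"{name}: {base:.1%} mutagens before filtering, "
          f"{in_suspicious:.1%} within the suspicious set")
# test: 17.9% mutagens before filtering, 46.7% within the suspicious set
# training: 6.0% mutagens before filtering, 16.5% within the suspicious set
```

Under this reading, real mutagens make up about 6% of the training-set compounds left after the "mutagenic" predictions, against about 18% on the test set, which is consistent with the lower count of true positives in the training-set suspicious column.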
Comparison of statistics on the test set between the integrated model and its single components: the SVM statistical model and the Benigni/Bossa structural alerts set for mutagenicity.
| Test set | Benigni/Bossa | SVM classifier | Integrated model (suspicious counted as non-mutagenic) | Integrated model (suspicious counted as mutagenic) |
|---|---|---|---|---|
| Accuracy | 78.3% | 81.2% | 82.1% | 81.8% |
| Sensitivity | 86% | 84.1% | 86.7% | 89.7% |
| Specificity | 69.6% | 77.7% | 76.3% | 72% |
To evaluate the structural alerts, the "official" (JRC-commissioned) Toxtree v. 1.60 implementation of the Benigni/Bossa rulebase was used.
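The two integrated-model columns can be reproduced from the test-set confusion matrix. The sketch below assumes that the first column counts "suspicious" predictions as non-mutagenic and the second counts them as mutagenic; that reading is inferred from the numbers rather than stated in the table, and the helper name `metrics` is hypothetical.

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from binary confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Test-set counts: true mutagens were predicted 403 mutagenic, 48 non-mutagenic,
# 14 suspicious; true non-mutagens 88 mutagenic, 268 non-mutagenic, 16 suspicious.
suspicious_as_negative = metrics(tp=403, fn=48 + 14, tn=268 + 16, fp=88)
suspicious_as_positive = metrics(tp=403 + 14, fn=48, tn=268, fp=88 + 16)

for name, m in [("suspicious as non-mutagenic", suspicious_as_negative),
                ("suspicious as mutagenic", suspicious_as_positive)]:
    print(name, {k: f"{v:.1%}" for k, v in m.items()})
# suspicious as non-mutagenic {'accuracy': '82.1%', 'sensitivity': '86.7%', 'specificity': '76.3%'}
# suspicious as mutagenic {'accuracy': '81.8%', 'sensitivity': '89.7%', 'specificity': '72.0%'}
```

Counting the suspicious compounds as mutagenic raises sensitivity by about three points at the cost of about four points of specificity, which matches the stated goal of reducing false negatives.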
Figure 2. Graph view of the final model prediction on the test set (837 chemicals). This representation highlights how the suspicious rule set extracts the most suspect compounds from the safe predictions with good accuracy, given the very low number of real mutagens still present.