Antonio Cassano, Alberto Manganaro, Todd Martin, Douglas Young, Nadège Piclin, Marco Pintore, Davide Bigoni, Emilio Benfenati.
Abstract
BACKGROUND: The new REACH legislation requires the assessment of a large number of chemicals on the European market for several endpoints. Developmental toxicity is one of the most difficult endpoints to assess because the experiments are complex, lengthy, and costly. Following the encouragement of QSAR (in silico) methods in REACH itself, the CAESAR project has developed several models.
Year: 2010 PMID: 20678183 PMCID: PMC2913331 DOI: 10.1186/1752-153X-4-S1-S4
Source DB: PubMed Journal: Chem Cent J ISSN: 1752-153X Impact factor: 4.215
Validation statistics from the RF model for developmental toxicity
| Statistical parameters* | Fitting on the training set | Prediction on the test set |
|---|---|---|
| Accuracy | 100% | 84% |
| FP% | 0% | 41% |
| FN% | 0% | 5% |
| PPV | 100% | 85% |
| NPV | 100% | 83% |
| Sensitivity | 100% | 95% |
| Specificity | 100% | 59% |
* The definition of these parameters is in Table 3.
Validation statistics from the AFP model for developmental toxicity
| Statistical parameters* | Fitting on the training set | Prediction on the test set |
|---|---|---|
| Accuracy | 87% | 88% |
| FP% | 26% | 18% |
| FN% | 7% | 10% |
| PPV | 89% | 92% |
| NPV | 83% | 78% |
| Sensitivity | 93% | 90% |
| Specificity | 74% | 82% |
* The definition of these parameters is in Table 3.
Statistical variables of the performance of a binary classification test
| Acronym | Full name | Definition/Formula |
|---|---|---|
| TP | True positive | Toxic compounds predicted as toxic |
| TN | True negative | Non toxic compounds predicted as non toxic |
| FP | False positive | Non toxic compounds predicted as toxic |
| FN | False negative | Toxic compounds predicted as non toxic |
| FP% | False positive rate | Ratio of non toxic compounds incorrectly classified as toxic FP/(FP + TN) |
| FN% | False negative rate | Ratio of toxic compounds incorrectly classified as non toxic FN/(FN + TP) |
| Sensitivity | Sensitivity or true positive rate | Ratio of toxic compounds correctly classified as toxic TP/(TP + FN) |
| Specificity | Specificity or true negative rate | Ratio of non toxic compounds correctly classified as non toxic TN/(FP + TN) |
| Accuracy | Accuracy or concordance | Proportion of the total number of predictions that were correct (TP + TN)/(TP + TN + FP + FN) |
| PPV | Positive predictive value | Ratio of the predicted toxic compounds that were correct TP/(TP + FP) |
| NPV | Negative predictive value | Ratio of the predicted non toxic compounds that were correct TN/(TN + FN) |
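As a worked check, the formulas above can be applied directly to confusion-matrix counts. The sketch below uses the test-set counts listed in the similarity table (TP = 39, TN = 10, FP = 7, FN = 2). That these counts belong to the RF model is an assumption, inferred from the fact that they reproduce the RF test-set column. The function name `binary_metrics` is illustrative and does not come from the paper.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Compute the binary-classification statistics defined in the table above."""
    return {
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
        "FP%": fp / (fp + tn),          # false positive rate
        "FN%": fn / (fn + tp),          # false negative rate
        "PPV": tp / (tp + fp),          # positive predictive value
        "NPV": tn / (tn + fn),          # negative predictive value
        "Sensitivity": tp / (tp + fn),  # true positive rate
        "Specificity": tn / (fp + tn),  # true negative rate
    }


# Test-set counts from the similarity table (assumed to be the RF model's).
metrics = binary_metrics(tp=39, tn=10, fp=7, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Rounded to whole percentages, this gives 84%, 41%, 5%, 85%, 83%, 95% and 59%, in agreement with the RF test-set column. The sketch does not guard against empty classes, so a zero denominator would raise `ZeroDivisionError`.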
Similarity values of the compounds in the test set
| Prediction concordance | Similarity score¹ | Number of compounds |
|---|---|---|
| True negative (TN) | 0.828 +/- 0.075 | 10 |
| True positive (TP) | 0.829 +/- 0.089 | 39 |
| False positive (FP) | 0.717 +/- 0.062 | 7 |
| False negative (FN) | 0.716 +/- 0.087 | 2 |
| Correctly predicted | 0.829 +/- 0.085 | 49 |
| Incorrectly predicted | 0.717 +/- 0.062 | 9 |
1. Average of the averages for similarity values of the three most similar compounds (mean +/- standard deviation).
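The two summary rows can be cross-checked against the four outcome rows. The sketch below assumes that the "Correctly predicted" and "Incorrectly predicted" means are count-weighted averages of the TN/TP and FP/FN rows respectively. The helper name `pooled_mean` is illustrative. The standard deviations are not recomputed here.

```python
# (mean similarity, number of compounds) per outcome, copied from the table.
groups = {
    "TN": (0.828, 10),
    "TP": (0.829, 39),
    "FP": (0.717, 7),
    "FN": (0.716, 2),
}


def pooled_mean(keys: list[str]) -> float:
    """Count-weighted mean similarity over the given outcome groups."""
    total = sum(groups[k][1] for k in keys)
    return sum(groups[k][0] * groups[k][1] for k in keys) / total


correct = pooled_mean(["TN", "TP"])    # 49 compounds
incorrect = pooled_mean(["FP", "FN"])  # 9 compounds
print(f"correct: {correct:.3f}, incorrect: {incorrect:.3f}")
```

Under this assumption the weighted means round to 0.829 and 0.717, which matches the two summary rows.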
The list of descriptors used in the RF model
| Symbol | Definition |
|---|---|
| Icycem | Mean information on the vertex cycle matrix equality |
| BEHm1 | Highest eigenvalue n. 1 of Burden matrix/weighted by atomic masses |
| BELp3 | Lowest eigenvalue n. 3 of Burden matrix/weighted by atomic polarizabilities |
| BELv1 | Lowest eigenvalue n. 1 of Burden matrix/weighted by atomic van der Waals volumes |
| BELv8 | Lowest eigenvalue n. 8 of Burden matrix/weighted by atomic van der Waals volumes |
| GATS1p | Geary autocorrelation - lag 1/weighted by atomic polarizabilities |
| GATS2m | Geary autocorrelation - lag 2/weighted by atomic masses |
| GATS3v | Geary autocorrelation - lag 3/weighted by atomic van der Waals volumes |
| MATS1p | Moran autocorrelation - lag 1/weighted by atomic polarizabilities |
| MATS4p | Moran autocorrelation - lag 4/weighted by atomic polarizabilities |
| MATS4v | Moran autocorrelation - lag 4/weighted by atomic van der Waals volumes |
| SdssC | Sum of all (=C<) E-State values in molecule |
| ShssNH | Sum of all [-- NH -- ] E-State values in molecule |
The list of the descriptors used in the AFP model
| Symbol | Definition |
|---|---|
| SsOH | Sum of all (-- OH) E-State values in molecule |
| Gmin | Smallest atom E-State value in molecule |
| BEHv1 | Highest eigenvalue n. 1 of Burden matrix/weighted by atomic van der Waals volumes |
| BELe1 | Lowest eigenvalue n. 1 of Burden matrix/weighted by atomic Sanderson electronegativities |
| BELp2 | Lowest eigenvalue n. 2 of Burden matrix/weighted by atomic polarizabilities |
| ATS8m | Broto-Moreau autocorrelation of a topological structure - lag 8/weighted by atomic masses |
Division of the developmental toxicity data set according to the FDA Guidelines and CAESAR binary classes
| FDA classes | Definition | CAESAR Binary class | Total compounds |
|---|---|---|---|
| Category A | Negative human studies | Non-developmental toxicant | 91 (Categories A and B combined) |
| Category B | Negative animal studies, no human studies executed; OR positive animal studies, negative human studies | Non-developmental toxicant | |
| Category C | Positive animal studies, no human studies executed; OR no studies at all | Developmental toxicant | 201 (Categories C, D and X combined) |
| Category D | Positive human studies | Developmental toxicant | |
| Category X | Animal OR human studies show abnormalities; AND/OR evidence of fetal risk based on human experience | Developmental toxicant | |
| Total | | | 292 |
Splitting of the developmental toxicity compounds (classes as in Table 7) into training and test sets
| Classes | Total compounds | Training Set | Test Set |
|---|---|---|---|
| Non-developmental toxicant | 91 | 74 | 17 |
| Developmental toxicant | 201 | 160 | 41 |
| Total number of compounds | 292 | 234 | 58 |
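The split above can be checked with a few lines of arithmetic. The sketch below recomputes the totals and the share of each class assigned to the test set; the variable names are illustrative.

```python
# (training set, test set) counts per class, copied from the table above.
split = {
    "Non-developmental toxicant": (74, 17),
    "Developmental toxicant": (160, 41),
}

for label, (train, test) in split.items():
    total = train + test
    print(f"{label}: {total} compounds, {test / total:.1%} in the test set")

train_total = sum(train for train, _ in split.values())
test_total = sum(test for _, test in split.values())
overall = train_total + test_total
print(f"Total: {overall} compounds, {test_total / overall:.1%} in the test set")
```

The test-set share comes out at 18.7% for non-developmental toxicants, 20.4% for developmental toxicants, and 19.9% overall, so the split is roughly 80/20 within each class.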