Fabiola Pizzo, Anna Lombardo, Alberto Manganaro, Emilio Benfenati.
Abstract
The prompt identification of chemicals with potential effects on the liver may help in drug discovery and in raising the level of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR) model for evaluating hepatotoxicity. After compiling a data set of 950 compounds from the literature, we randomly split it into training (80%) and test (20%) sets. We also compiled an external validation set (101 compounds) for evaluating the performance of the model. To extract structural alerts (SAs) related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach to identify other SAs manually. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC) on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81, 63, and 68%, specificity 67, 33, and 33%, sensitivity 93, 88, and 80% and MCC 0.64, 0.27, and 0.13. Since it is preferable to overestimate hepatotoxicity rather than to miss unsafe compounds, the model's architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform.
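The random 80/20 split mentioned in the abstract can be sketched as follows. This is only an illustration: the abstract does not state the tool, the random seed, or whether the split was stratified by class, so the function name `split_dataset`, the seed, and the placeholder compound identifiers are all assumptions.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and cut it into (training, test) lists.

    Hypothetical helper: plain, non-stratified random split.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 950 compiled compounds.
compound_ids = [f"cmpd_{i:03d}" for i in range(950)]
train, test = split_dataset(compound_ids)
print(len(train), len(test))
```

With 950 compounds and an 80% training fraction this yields 760 training and 190 test compounds, which matches the set sizes reported for the model.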
Keywords: chemical clustering; drugs; hepatotoxicity; structural alerts; structure-activity relationship
Year: 2016 PMID: 27920722 PMCID: PMC5118449 DOI: 10.3389/fphar.2016.00442
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
Figure 1. Identification and validation of SAs for hepatotoxicity.
Manually extracted structural alerts (SAs) with the total number of occurrences and the number and percentage of true positives (TP) in the training set. For the non-hepatotoxic alerts (12 and 13) the counts are true negatives (TN) and false negatives (FN).
| No. | SMARTS | Activity | Chemical class | Occurrences | TP, n (%) | FP, n (%) |
|---|---|---|---|---|---|---|
| 1 | [n,c]1ccn[n,c]c1 | Hepatotoxic | N-containing heterocyclic aromatic compounds (pyridine, pyrazine, pyrimidine) | 57 | 41 (71.40) | 16 (28.60) |
| 2 | NS(=O)(=O)c1ccccc1 | Hepatotoxic | Sulphonamides | 31 | 22 (70.96) | 9 (29.04) |
| 3 | OC(=O)C1[C,S][S,O,C]C2CC(=O)N12 | Hepatotoxic | β-lactam antibiotics (penicillin) | 12 | 8 (66.66) | 4 (33.34) |
| 4 | O=C1N~CC=C[N,C]1C2C~[S,C]CO2 | Hepatotoxic | Nucleoside analogs | 11 | 9 (81.80) | 2 (18.20) |
| 5 | C1[S,C,N,O]c2ccccc2[N,C,S,O]c3ccccc13 | Hepatotoxic | Tricyclic antidepressants (TCAs) | 11 | 9 (81.80) | 2 (18.20) |
| 6 | [N;!$([N+]);!$(NC=O);!$(N=[N,C,O])][a] | Hepatotoxic | Aromatic amines | 10 | 6 (60.00) | 4 (40.00) |
| 7 | O=C1CCCCCCCCCCCCO1 | Hepatotoxic | Macrolide antibiotics | 7 | 5 (71.40) | 2 (28.60) |
| 8 | Nc1[n,c]cc2C(=O)C(=CNc2[c,n]1)C(O)=O | Hepatotoxic | Anti-bacterial agents (fluoroquinolone) | 6 | 4 (66.66) | 2 (33.34) |
| 9 | *N(*)CCC(c1cccc[n,c]1)c2cccc[n,c]2 | Hepatotoxic | Cationic amphiphilic drugs (CADs) | 6 | 5 (83.33) | 1 (16.67) |
| 10 | CC=C(C)C=CC=C(C)C=C[R,a] | Hepatotoxic | Retinoids | 4 | 3 (75.00) | 1 (25.00) |
| 11 | CNC(=O)N(CCCl)N=O | Hepatotoxic | Nitrosourea compounds | 2 | 2 (100) | 0 (0) |
| 12 | C1CC2CCC3C(CC[C,c]4[C,c][C,c][C,c][C,c][C,c]34)C2C1 | Non-hepatotoxic | Steroids | 23 | 16 TN (69.56) | 7 FN (30.44) |
| 13 | CC(=O)NC1C2[S,O]CC=C(N2C1=O)C(O)=O | Non-hepatotoxic | β-lactam antibiotics (cephalosporins) | 16 | 11 TN (68.75) | 5 FN (31.25) |
We used Marvin for drawing and displaying chemical structures and substructures (Marvin 5.11.5, 2013, ChemAxon) (http://www.chemaxon.com).
Figure 2. Decision tree developed for the hepatotoxicity model. Hep stands for “hepatotoxic” and non-hep for “non-hepatotoxic.”
Figure 3. Percentages of correctly predicted, wrongly predicted and non-predicted (unknown) compounds in the training, test and external validation sets.
Performance of the model in the training, test and external validation sets.
| Metric | Training set | Test set | External validation set |
|---|---|---|---|
| Number of compounds | 760 | 190 | 101 |
| Number of TP | 263 | 48 | 35 |
| Number of FP | 72 | 30 | 10 |
| Number of TN | 144 | 15 | 5 |
| Number of FN | 18 | 6 | 9 |
| Number predicted | 497 | 99 | 59 |
| Number unknown | 263 | 91 | 42 |
| Accuracy (%) | 81 | 63 | 68 |
| Sensitivity (%) | 93 | 88 | 80 |
| Specificity (%) | 67 | 33 | 33 |
| MCC | 0.64 | 0.27 | 0.13 |
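As a cross-check, the reported metrics can be recomputed from the TP, FP, TN and FN counts above using the standard definitions. The sketch below assumes, as the counts suggest (TP + FP + TN + FN equals "Number predicted" in every set), that the metrics are computed only on compounds that received a prediction and that "unknown" compounds are excluded. The function name `classification_metrics` and the `coverage` field are illustrative, not taken from the source.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn, total=None):
    """Standard binary-classification metrics from confusion-matrix counts."""
    predicted = tp + fp + tn + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    metrics = {
        "accuracy": (tp + tn) / predicted,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
    if total is not None:
        # Fraction of compounds for which the model gave a prediction.
        metrics["coverage"] = predicted / total
    return metrics


# Counts copied from the performance table.
counts = {
    "training": dict(tp=263, fp=72, tn=144, fn=18, total=760),
    "test": dict(tp=48, fp=30, tn=15, fn=6, total=190),
    "external": dict(tp=35, fp=10, tn=5, fn=9, total=101),
}

results = {name: classification_metrics(**c) for name, c in counts.items()}
for name, m in results.items():
    print(name, {key: round(value, 3) for key, value in m.items()})
```

The recomputed percentages agree with the table to within about one percentage point, and the MCC values to within 0.01; the small differences appear to come from how the published values were rounded or truncated.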