Lei Chen, Jing Lu, Jian Zhang, Kai-Rui Feng, Ming-Yue Zheng, Yu-Dong Cai.
Abstract
Toxicity is a major contributor to the high attrition rate of new chemical entities in drug discovery. In this study, an order-classifier was built to predict a series of toxic effects from chemical-chemical interaction data, under the assumption that interactive compounds are more likely to share similar toxicity profiles. For each compound, the toxicities were ordered from most likely to least likely according to the interaction confidence scores. Ten test groups, each containing one training dataset and one test dataset, were constructed from a benchmark dataset of 17,233 compounds. In a jackknife test on each of these test groups, the 1st order prediction accuracies of the training dataset and the test dataset were all approximately 79.50%, substantially higher than the 25.43% achieved by random guessing. Encouraged by these promising results, we expect that our method will become a useful tool for screening out drugs with high toxicity.
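The ordering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how confidence scores are aggregated per toxicity class, so this sketch assumes each class is scored by the largest confidence score among the query compound's interactive training compounds in that class. The function name, the compound IDs `A`, `B`, `C`, and the toy labels are hypothetical.

```python
from collections import defaultdict

# Toxicity categories as listed in the distribution table.
CLASSES = [
    "Acute Toxicity", "Mutagenicity", "Tumorigenicity",
    "Skin and Eye Irritation", "Reproductive Effects",
    "Multiple Dose Effects", "Non-toxicity",
]

def rank_toxicities(interactions, training_labels, all_classes=CLASSES):
    """Order toxicity classes for one query compound, most likely first.

    interactions: {neighbor_id: confidence_score} for the query compound.
    training_labels: {compound_id: set of toxicity classes}.
    Assumption: a class is scored by the maximum confidence score over
    interactive training compounds that carry that class.
    """
    best = defaultdict(int)  # class -> highest confidence score seen so far
    for neighbor, score in interactions.items():
        for cls in training_labels.get(neighbor, ()):
            best[cls] = max(best[cls], score)
    # Stable sort: classes with equal scores keep their order in all_classes.
    return sorted(all_classes, key=lambda c: -best[c])

# Hypothetical toy data, not taken from the benchmark dataset.
toy_interactions = {"A": 679, "B": 670, "C": 172}
toy_labels = {
    "A": {"Acute Toxicity"},
    "B": {"Acute Toxicity", "Mutagenicity"},
    "C": {"Tumorigenicity"},
}
order = rank_toxicities(toy_interactions, toy_labels)
print(order[:3])
```

Classes with no supporting interaction receive a score of 0 and fall to the end of the order; how the original method breaks such ties is not stated in the abstract.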
Year: 2013 PMID: 23457578 PMCID: PMC3574107 DOI: 10.1371/journal.pone.0056517
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Distribution of compounds in each category of compound toxicity.

| Tag | Toxicity | Total |
|---|---|---|
| | Acute Toxicity | 12,633 |
| | Mutagenicity | 6,110 |
| | Tumorigenicity | 2,293 |
| | Skin and Eye Irritation | 2,353 |
| | Reproductive Effects | 2,501 |
| | Multiple Dose Effects | 4,198 |
| | Non-toxicity | 646 |
| Total | – | 30,734 |
Figure 1. The number of compounds plotted against the number of categories in the benchmark dataset.
Distribution of compounds in training and test datasets of each test group.

| Toxicity | Training | Test | Training | Test | Training | Test | Training | Test | Training | Test |
|---|---|---|---|---|---|---|---|---|---|---|
| Acute Toxicity | 11,382 | 1,251 | 11,387 | 1,246 | 11,351 | 1,282 | 11,364 | 1,269 | 11,385 | 1,248 |
| Mutagenicity | 5,475 | 635 | 5,476 | 634 | 5,529 | 581 | 5,492 | 618 | 5,491 | 619 |
| Tumorigenicity | 2,065 | 228 | 2,065 | 228 | 2,063 | 230 | 2,063 | 230 | 2,056 | 237 |
| Skin and Eye Irritation | 2,102 | 251 | 2,102 | 251 | 2,115 | 238 | 2,112 | 241 | 2,093 | 260 |
| Reproductive Effects | 2,235 | 266 | 2,235 | 266 | 2,260 | 241 | 2,255 | 246 | 2,235 | 266 |
| Multiple Dose Effects | 3,747 | 451 | 3,749 | 449 | 3,777 | 421 | 3,784 | 414 | 3,799 | 399 |
| Non-toxicity | 582 | 64 | 577 | 69 | 586 | 60 | 582 | 64 | 583 | 63 |
| Total | 27,588 | 3,146 | 27,591 | 3,143 | 27,681 | 3,053 | 27,652 | 3,082 | 27,642 | 3,092 |

| Toxicity | Training | Test | Training | Test | Training | Test | Training | Test | Training | Test |
|---|---|---|---|---|---|---|---|---|---|---|
| Acute Toxicity | 11,367 | 1,266 | 11,395 | 1,238 | 11,369 | 1,264 | 11,374 | 1,259 | 11,353 | 1,280 |
| Mutagenicity | 5,489 | 621 | 5,500 | 610 | 5,492 | 618 | 5,497 | 613 | 5,506 | 604 |
| Tumorigenicity | 2,075 | 218 | 2,067 | 226 | 2,070 | 223 | 2,043 | 250 | 2,070 | 223 |
| Skin and Eye Irritation | 2,123 | 230 | 2,125 | 228 | 2,135 | 218 | 2,102 | 251 | 2,133 | 220 |
| Reproductive Effects | 2,244 | 257 | 2,243 | 258 | 2,236 | 265 | 2,258 | 243 | 2,234 | 267 |
| Multiple Dose Effects | 3,762 | 436 | 3,750 | 448 | 3,772 | 426 | 3,777 | 421 | 3,755 | 443 |
| Non-toxicity | 583 | 63 | 587 | 59 | 579 | 67 | 569 | 77 | 584 | 62 |
| Total | 27,643 | 3,091 | 27,667 | 3,067 | 27,653 | 3,081 | 27,620 | 3,114 | 27,635 | 3,099 |
Prediction accuracies obtained by the method as applied to training and test datasets of each test group.

| Prediction Order | Training | Test | Training | Test | Training | Test | Training | Test | Training | Test |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 79.40% | 79.69% | 79.45% | 79.28% | 79.23% | 80.62% | 79.28% | 79.45% | 79.30% | 79.34% |
| 2 | 37.16% | 38.42% | 37.14% | 38.24% | 37.54% | 37.20% | 37.17% | 38.31% | 37.40% | 36.16% |
| 3 | 22.18% | 23.16% | 22.20% | 22.87% | 22.32% | 21.65% | 22.29% | 22.63% | 22.53% | 22.87% |
| 4 | 15.45% | 16.66% | 15.49% | 16.77% | 16.35% | 14.86% | 15.46% | 16.13% | 15.41% | 15.55% |
| 5 | 11.06% | 11.61% | 11.04% | 11.49% | 11.00% | 10.85% | 10.88% | 10.16% | 10.95% | 11.20% |
| 6 | 6.92% | 7.25% | 6.84% | 7.89% | 7.23% | 5.86% | 6.99% | 6.56% | 6.85% | 7.84% |
| 7 | 1.21% | 1.33% | 1.22% | 1.04% | 1.27% | 1.51% | 1.39% | 1.45% | 1.26% | 1.68% |

| Prediction Order | Training | Test | Training | Test | Training | Test | Training | Test | Training | Test |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 79.57% | 80.15% | 79.36% | 79.98% | 79.45% | 79.05% | 79.52% | 79.80% | 79.46% | 79.34% |
| 2 | 37.11% | 37.72% | 37.57% | 36.10% | 37.21% | 38.65% | 37.32% | 35.98% | 37.44% | 37.20% |
| 3 | 22.57% | 22.29% | 22.30% | 23.39% | 22.23% | 24.03% | 22.46% | 23.33% | 22.42% | 22.93% |
| 4 | 15.31% | 15.90% | 15.36% | 15.55% | 15.52% | 14.74% | 15.40% | 16.25% | 15.36% | 16.37% |
| 5 | 10.93% | 10.45% | 10.95% | 11.55% | 11.08% | 10.10% | 10.74% | 11.55% | 10.87% | 10.74% |
| 6 | 7.00% | 6.56% | 7.00% | 6.62% | 7.16% | 5.86% | 6.76% | 7.78% | 6.97% | 7.25% |
| 7 | 1.25% | 1.57% | 1.32% | 0.99% | 1.32% | 1.45% | 1.27% | 1.57% | 1.30% | 1.33% |
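To make the per-order accuracies concrete, the following is a small sketch. It assumes the k-th order prediction accuracy is the fraction of compounds whose k-th ranked toxicity is one of their true toxicities; this page does not spell out the definition, so treat it as an assumption. The function name and the toy compounds `c1`, `c2` with classes `A`, `B`, `C` are hypothetical.

```python
def order_accuracy(rankings, true_labels, k):
    """Fraction of compounds whose k-th ranked class (1-based) is a true class.

    rankings: {compound_id: list of classes, most likely first}.
    true_labels: {compound_id: set of true classes}.
    """
    hits = sum(
        1 for cid, ranked in rankings.items()
        if ranked[k - 1] in true_labels[cid]
    )
    return hits / len(rankings)

# Hypothetical toy data with three classes instead of seven.
toy_rankings = {"c1": ["A", "B", "C"], "c2": ["B", "A", "C"]}
toy_truth = {"c1": {"A"}, "c2": {"A", "B"}}

for k in (1, 2, 3):
    print(k, order_accuracy(toy_rankings, toy_truth, k))
```

Under this definition, a good ordering shows accuracy falling as the order increases, which is the pattern the table displays from order 1 down to order 7.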
Proportions of true toxicities covered by the first two predictions for training and test datasets of each test group.

| Test group | Training dataset | Test dataset |
|---|---|---|
| | 65.52% | 64.69% |
| | 65.54% | 64.52% |
| | 65.43% | 66.49% |
| | 65.32% | 65.83% |
| | 65.48% | 64.36% |
| | 65.46% | 65.71% |
| | 65.55% | 65.21% |
| | 65.43% | 65.82% |
| | 65.61% | 64.07% |
| | 65.61% | 64.79% |
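One way to read "proportion of true toxicities covered by the first two predictions" is sketched below. It assumes the measure is the number of true toxicities that appear among each compound's top two predictions, summed over compounds and divided by the total number of true toxicities; the exact formula is not given on this page. The function name and toy data are hypothetical.

```python
def top_m_coverage(rankings, true_labels, m=2):
    """Share of all true labels that appear in each compound's top-m predictions."""
    covered = sum(
        len(set(ranked[:m]) & true_labels[cid])
        for cid, ranked in rankings.items()
    )
    total = sum(len(true_labels[cid]) for cid in rankings)
    return covered / total

# Hypothetical toy data: three compounds, three classes.
toy_rankings = {
    "c1": ["A", "B", "C"],
    "c2": ["B", "A", "C"],
    "c3": ["C", "A", "B"],
}
toy_truth = {"c1": {"A"}, "c2": {"A", "B"}, "c3": {"B"}}

print(top_m_coverage(toy_rankings, toy_truth))  # 3 of 4 true labels covered -> 0.75
```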
Figure 2. The structures of the alkyl N-nitroso group and the primary aromatic amine group.
Details of Tasosartan’s interactive compounds in the training dataset.

| Compound ID | Tag of toxicity class | Its interactive compound ID | Tag of toxicity class | Confidence score |
|---|---|---|---|---|
| CID000060919 | | CID000003749 | | 679 |
| CID000060919 | | CID000002541 | | 670 |
| CID000060919 | | CID000060921 | | 669 |
| CID000060919 | | CID000003961 | | 667 |
| CID000060919 | | CID000060846 | | 658 |
| CID000060919 | | CID000065999 | | 643 |
| CID000060919 | | CID000054738 | | 172 |
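The details of common compounds belonging to two categories.

| Toxicity class | Acute Toxicity | Mutagenicity | Tumorigenicity | Skin and Eye Irritation | Reproductive Effects | Multiple Dose Effects |
|---|---|---|---|---|---|---|
| Acute Toxicity | 12,633 | 3,483 (22.8%) | 1,485 (11.0%) | 2,027 (15.6%) | 2,075 (15.9%) | 3,446 (25.7%) |
| Mutagenicity | | 6,110 | 1,720 (25.7%) | 1,213 (16.7%) | 1,336 (18.4%) | 1,723 (20.1%) |
| Tumorigenicity | | | 2,293 | 570 (14.0%) | 753 (18.6%) | 781 (13.7%) |
| Skin and Eye Irritation | | | | 2,353 | 731 (17.7%) | 897 (15.9%) |
| Reproductive Effects | | | | | 2,501 | 1,409 (26.6%) |
| Multiple Dose Effects | | | | | | 4,198 |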
Entries give the number of common compounds belonging to two categories.
The number in parentheses is the ratio of the number of common compounds to the number of distinct compounds in the two categories combined (their union).
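The percentages in parentheses are consistent with dividing the number of common compounds by the size of the union of the two categories (sum of both category totals minus the common count). The short check below reproduces two of the reported values from the category totals; the function name is a hypothetical label for this illustration.

```python
def overlap_ratio(size_a, size_b, common):
    """Common compounds divided by distinct compounds in the two categories combined."""
    return common / (size_a + size_b - common)

# Acute Toxicity (12,633) vs. Mutagenicity (6,110), 3,483 common compounds.
acute_vs_mutagenicity = overlap_ratio(12_633, 6_110, 3_483)
print(f"{acute_vs_mutagenicity:.1%}")  # 22.8%

# Reproductive Effects (2,501) vs. Multiple Dose Effects (4,198), 1,409 common.
repro_vs_multidose = overlap_ratio(2_501, 4_198, 1_409)
print(f"{repro_vs_multidose:.1%}")  # 26.6%
```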
Figure 3. Nongeneric SAs (Benigni) and some carcinogens matching these SAs.