Shuaibing He, Xuelian Zhang, Shan Lu, Ting Zhu, Guibo Sun, Xiaobo Sun.
Abstract
In recent years, liver injury induced by Traditional Chinese Medicines (TCMs) has gained increasing attention worldwide. Assessing the hepatotoxicity of compounds in TCMs is essential for both doctors and regulatory agencies. However, no effective method for screening the hepatotoxic ingredients in TCMs has been available until now. In the present study, we first built a large-scale dataset of drug-induced liver injuries (DILIs). Then, 13 types of molecular fingerprints/descriptors and eight machine learning algorithms were used to develop single classifiers for DILI, which resulted in 5416 single classifiers. Next, the NaiveBayes algorithm was adopted to integrate the best single classifier of each machine learning algorithm into a combined classifier. The accuracy, sensitivity, specificity, and area under the curve of the combined classifier were 72.798%, 0.732, 0.724, and 0.793, respectively. Compared to several prior studies, the combined classifier provided better performance in both cross validation and external validation. In our prior study, we developed a herb-hepatotoxic ingredient network and a herb-induced liver injury (HILI) dataset based on pre-clinical evidence published in the scientific literature. Herein, by combining those resources with the combined classifier developed in this work, we propose the first computational toxicology approach for screening the hepatotoxic ingredients in TCMs. Polygonum multiflorum Thunb (PmT) was then used as a case study to investigate the reliability of the proposed approach. A total of 25 ingredients in PmT were identified as hepatotoxicants. The results were highly consistent with records in the literature, indicating that our computational toxicology approach is reliable and effective for screening hepatotoxic ingredients in PmT.
The combined classifier developed in this work can be used to assess the hepatotoxic risk of both natural compounds and synthetic drugs. The computational toxicology approach presented in this work will assist with screening the hepatotoxic ingredients in TCMs, which will further lay the foundation for exploring the hepatotoxic mechanisms of TCMs. In addition, the method proposed in this work can be applied to research focused on other adverse effects of TCMs and synthetic drugs.
Keywords: DILI; Polygonum multiflorum Thunb; TCMs; Traditional Chinese Medicines; computational toxicology; drug-induced liver injury; hepatotoxicity
Year: 2019 PMID: 31591318 PMCID: PMC6843577 DOI: 10.3390/biom9100577
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Feature selection results of 13 types of molecular fingerprints/descriptors.
| ID | Type | Number of Features (PaDEL-Descriptor) | Number of Features (Boruta) | Number of Features (Find Correlation) |
|---|---|---|---|---|
| 1 | FP | 1024 | 117 | 117 |
| 2 | ExtFP | 1024 | 111 | 111 |
| 3 | EStateFP | 79 | 13 | 12 |
| 4 | GraphFP | 1024 | 83 | 73 |
| 5 | MACCSFP | 166 | 59 | 52 |
| 6 | PubchemFP | 881 | 77 | 58 |
| 7 | SubFP | 307 | 25 | 22 |
| 8 | SubFPC | 307 | 18 | 15 |
| 9 | KRFP | 4860 | 61 | 49 |
| 10 | KRFPC | 4860 | 38 | 26 |
| 11 | AP2D | 780 | 38 | 34 |
| 12 | APC2D | 780 | 33 | 17 |
| 13 | 2D Descriptor | 1444 | 138 | 91 |
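The last column of the table above suggests a correlation-based filter applied after Boruta. The source does not state the cutoff or the exact procedure, so the following is only a minimal sketch of the general idea: drop a feature when it is highly correlated with one already kept. The greedy rule, the `cutoff` value of 0.9, and the toy feature names are all assumptions made for illustration, and this is a simplified stand-in rather than the routine the authors used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)

def drop_correlated(columns, cutoff=0.9):
    """Greedy filter (assumed rule): keep a feature only if its absolute
    correlation with every already-kept feature is at most `cutoff`.

    columns: dict mapping feature name -> list of values.
    Returns the kept feature names in their original order.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical fingerprint bits for six compounds.
features = {
    "bit_a": [1, 0, 1, 1, 0, 0],
    "bit_b": [1, 0, 1, 1, 0, 0],  # exact duplicate of bit_a, so it is dropped
    "bit_c": [0, 1, 1, 0, 1, 0],
}
kept = drop_correlated(features, cutoff=0.9)
print(kept)
```

The result of a greedy filter like this depends on feature order, which is one reason a production pipeline would normally rely on an established implementation instead.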
Figure 1. Workflow illustrating the combined classifier framework for predicting drug-induced liver injury.
Parameter optimization.
| Algorithm | Parameter | Parameter Optimization Method |
|---|---|---|
| RandomForest | | CVParameterSelection |
| Bagging | | CVParameterSelection |
| AdaBoostM1 | | CVParameterSelection |
| IBk | | CVParameterSelection |
| LibSVM | | GridSearch |
| KStar | | CVParameterSelection |
| J48 | | CVParameterSelection |
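The grid-search idea named in the table for LibSVM can be illustrated with a short, library-free sketch. Everything specific here is an assumption: the powers-of-two grid, the function names, and above all `toy_score`, which is a stand-in for a cross-validated score and is constructed by hand to peak at one grid point. It does not reproduce any result from the study.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid and
    return (best_params, best_score). Higher scores are better."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search grid: powers of two for both SVM parameters.
grid = {
    "C": [2.0 ** e for e in range(-3, 4)],
    "gamma": [2.0 ** e for e in range(-5, 2)],
}

def toy_score(params):
    # Stand-in for cross-validated accuracy (NOT a real measurement).
    # Built so that the maximum sits at C = 0.5, gamma = 0.125.
    return -abs(params["C"] - 0.5) - abs(params["gamma"] - 0.125)

best, score = grid_search(grid, toy_score)
print(best)
```

In practice `score_fn` would train the classifier with the given parameters and return a cross-validated metric, which makes the cost of the search grow with the product of the grid sizes.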
Performance of the eight best single classifiers.
| Classifiers | Type (Feature Number) | Parameter | ACC | AUC | SE | SP | BACC |
|---|---|---|---|---|---|---|---|
| RandomForest | 2D Descriptor (30) | Depth = 0 | 72.250 | 0.794 | 0.725 | 0.720 | 0.723 |
| Bagging | 2D Descriptor (70) | K = 5 | 69.557 | 0.762 | 0.663 | 0.726 | 0.695 |
| AdaBoostM1 | 2D Descriptor (69) | C = 0.25 | 68.736 | 0.743 | 0.651 | 0.721 | 0.686 |
| IBk | 2D Descriptor (76) | K = 5 | 70.196 | 0.758 | 0.673 | 0.729 | 0.701 |
| LibSVM | ExtFP (80) | C = 0.5, γ = 0.125 | 69.786 | 0.699 | 0.718 | 0.680 | 0.699 |
| KStar | AP2D (14) | B = 1 | 66.819 | 0.704 | 0.691 | 0.647 | 0.669 |
| J48 | SubFP (19) | C = 0.25 | 68.188 | 0.712 | 0.694 | 0.671 | 0.683 |
| NaiveBayes | ExtFP (42) | Default set | 64.993 | 0.704 | 0.677 | 0.625 | 0.651 |
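The abstract states that NaiveBayes was used to integrate the best single classifiers into a combined classifier. The source does not say whether the base classifiers passed class labels or probabilities to the combiner, so the sketch below assumes binary 0/1 votes and fits a Bernoulli naive Bayes model with Laplace smoothing on them. The function names and the toy votes are hypothetical, and this is not the implementation used in the study.

```python
from math import log

def fit_nb(votes, labels, alpha=1.0):
    """Fit Bernoulli naive Bayes on 0/1 base-classifier votes.

    votes: list of rows, one 0/1 prediction per base classifier.
    labels: list of true 0/1 classes.
    alpha: Laplace smoothing constant.
    Returns {class: (prior, [P(vote_j = 1 | class), ...])}.
    """
    n_feat = len(votes[0])
    model = {}
    for c in (0, 1):
        rows = [v for v, y in zip(votes, labels) if y == c]
        prior = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        p1 = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
              for j in range(n_feat)]
        model[c] = (prior, p1)
    return model

def predict_nb(model, vote_row):
    """Return the class with the highest log posterior for one vote row."""
    scores = {}
    for c, (prior, p1) in model.items():
        s = log(prior)
        for v, p in zip(vote_row, p1):
            s += log(p if v == 1 else 1.0 - p)
        scores[c] = s
    return max(scores, key=scores.get)

# Toy training votes from three hypothetical base classifiers.
# The first classifier always agrees with the true label; the others are noisy.
votes = [[1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 1]]
labels = [1, 1, 1, 0, 0, 0]
model = fit_nb(votes, labels)

print(predict_nb(model, [1, 0, 0]))  # follows the reliable first classifier
print(predict_nb(model, [0, 1, 1]))  # differs from a simple majority vote
```

The second prediction shows why a learned combiner can differ from majority voting: it weights each base classifier by how informative its votes were on the training data.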
Figure 2. Receiver operating characteristic curves of the eight best single classifiers and the combined classifier.
Performance of the eight best single classifiers and the combined classifier on the integrated external validation set.
| Classifier | ACC | AUC | SE | SP | BACC |
|---|---|---|---|---|---|
| RandomForest | 76.961 | 0.810 | 0.789 | 0.737 | 0.763 |
| Bagging | 69.118 | 0.764 | 0.711 | 0.658 | 0.685 |
| AdaBoostM1 | 67.647 | 0.713 | 0.672 | 0.684 | 0.678 |
| IBk | 68.137 | 0.752 | 0.711 | 0.711 | 0.711 |
| LibSVM | 78.431 | 0.756 | | 0.645 | 0.756 |
| KStar | 65.686 | 0.673 | 0.703 | 0.579 | 0.641 |
| J48 | 64.216 | 0.632 | 0.711 | 0.526 | 0.619 |
| NaiveBayes | 61.275 | 0.603 | 0.719 | 0.434 | 0.577 |
| Combined classifier | | | 0.813 | | |
The maximum value of each index is highlighted in bold.
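The indices in the table above follow the usual confusion-matrix definitions, with ACC apparently reported as a percentage and the others as fractions. The sketch below shows those definitions, assuming that SE is sensitivity, SP is specificity, and BACC is balanced accuracy. The counts passed to the function are hypothetical; only the final check uses values from the table (the RandomForest row).

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary metrics from confusion-matrix counts.

    ACC is returned as a percentage to match the table's apparent convention;
    SE (sensitivity), SP (specificity), and BACC are fractions.
    """
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    bacc = (se + sp) / 2  # balanced accuracy: mean of SE and SP
    return {"ACC": acc, "SE": se, "SP": sp, "BACC": bacc}

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=40, fn=10, tn=30, fp=20)
print(m)

# Consistency check against the RandomForest row (SE = 0.789, SP = 0.737).
rf_bacc = round((0.789 + 0.737) / 2, 3)
print(rf_bacc)  # 0.763, matching the reported BACC
```

The check confirms that the reported BACC for that row equals the mean of its SE and SP.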
Figure 3. Comparisons between the combined classifier and prior studies within cross validation.
Figure 4. Comparisons between the combined classifier and prior studies on external validation sets. (A) The combined classifier versus Ai’s model; (B) the combined classifier versus Zhang’s model; (C) the combined classifier versus Kotsampasakou’s model.
Figure 5. Diagram of the computational toxicology approach for identifying the hepatotoxic ingredients in Traditional Chinese Medicines (TCMs).
Figure 6. Hierarchical cluster analysis of the 98 ingredients in Polygonum multiflorum Thunb (PmT). The compounds predicted to be hepatotoxic by the combined classifier are highlighted with red solid circles. Molecule ID corresponds to ID in Supplementary File 1 (PmT).
Hepatotoxic ingredients in PmT identified by the computational toxicology approach.
| ID | Ingredient | Liver Toxicity | Source: Subgroup 1 | Source: Subgroup 2 | Source: Subgroup 3 |
|---|---|---|---|---|---|
| 1 | Emodin | [ | + | + | + |
| 2 | Chrysophanol | [ | + | + | + |
| 3 | Chrysarobin | [ | none | none | |
| 4 | Rhein | [ | + | + | + |
| 5 | Danthron | [ | + | none | + |
| 6 | Polygonumnolide C2 | [ | none | none | |
| 7 | Emodin dianthrone | [ | none | none | |
| 8 | Aloe emodin | [ | + | + | none |
| 9 | Luteolin | [ | − | | none |
| 10 | Physcion | [ | + | + | none |
| 11 | Apigenin | [ | | none | none |
| 12 | Emodin-8-methyl ether | No report | | none | none |
| 13 | Citreorosein | No report | | none | none |
| 14 | Emodin-3-methyl ether | No report | | none | none |
| 15 | Fallacinol | No report | | none | none |
| 16 | 2-Acetylemodin | No report | | none | none |
| 17 | Hexadecanoic acid methyl ester | No report | | none | none |
| 18 | Octadecanoic acid methyl ester | No report | | none | none |
| 19 | Docosanoic acid methyl ester | No report | | none | none |
| 20 | 4-Hydroxybenzaldehyde | No report | | none | none |
| 21 | 2,5-dimethyl-7-hydroxychromone | No report | | none | none |
| 22 | Hydroxymaltol | No report | | none | none |
| 23 | Butanedioic acid | No report | | none | none |
| 24 | Emodin-6,8-dimethylether | No report | | none | none |
| 25 | Hexanoic acid | No report | | none | none |
“+” and “−” indicate hepatotoxic and non-hepatotoxic, respectively. The hepatotoxic ingredients unique to each subgroup are highlighted in red. “none” in the Subgroup 1 column means that the ingredient was not among the 98 compounds of PmT; “none” in the Subgroup 2 or Subgroup 3 column means that the ingredient was not found in the HILI dataset or the herb-hepatotoxic ingredient network, respectively.
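The way the three subgroups combine into one list can be sketched with plain set operations. The sets below contain only the six ingredients whose three subgroup cells are all legible in the table (Emodin, Chrysophanol, Rhein, Danthron, Aloe emodin, Physcion), so this is an illustration on a subset, not a reproduction of the full 25-ingredient result. The variable and function names are hypothetical, and taking the union as the final list is an assumption based on how the table is laid out.

```python
# Ingredients marked "+" in each subgroup, restricted to fully legible rows.
subgroup1 = {"Emodin", "Chrysophanol", "Rhein", "Danthron", "Aloe emodin", "Physcion"}
subgroup2 = {"Emodin", "Chrysophanol", "Rhein", "Aloe emodin", "Physcion"}
subgroup3 = {"Emodin", "Chrysophanol", "Rhein", "Danthron"}

def unique_to(target, *others):
    """Ingredients flagged in `target` and in none of the other subgroups."""
    return target.difference(*others)

# Assumed combination rule: an ingredient is listed if any subgroup flags it.
flagged = subgroup1 | subgroup2 | subgroup3

# Ingredients supported by all three evidence sources.
in_all_three = subgroup1 & subgroup2 & subgroup3

print(sorted(flagged))
print(sorted(in_all_three))
print(unique_to(subgroup1, subgroup2, subgroup3))  # empty for this subset
```

In this subset no ingredient is unique to a single subgroup; in the full table, the rows reported as "No report" in the literature appear to be the ones contributed by one subgroup alone.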
Figure 7. Venn diagram comparing the computational toxicology approach with the prior study. Present “+” and Present “−” indicate that the compound was predicted to be hepatotoxic or non-hepatotoxic, respectively, by our computational toxicology approach. Wang “+” and Wang “−” indicate that the compound was identified as hepatotoxic or non-hepatotoxic, respectively, by Wang et al. Ingredients included in each module are available in Supplementary File 1 (PmT).