Shilun Yang, Yanjia Shen, Wendan Lu, Yinglin Yang, Haigang Wang, Li Li, Chunfu Wu, Guanhua Du.
Abstract
Xiaoxuming decoction (XXMD), a classic traditional Chinese medicine (TCM) prescription, has been used clinically to treat stroke for over 1200 years. However, its pharmacological mechanisms have not yet been elucidated. The purpose of this study was to develop models for identifying compounds in XXMD that protect against hypoxia-induced and H2O2-induced brain cell damage. A phenotype-based classification method was designed using machine learning to identify neuroprotective compounds and to clarify the compatibility of XXMD components. Four different single classifiers (AB, kNN, CT, and RF) and molecular fingerprint descriptors were used to construct stacked naïve Bayesian models. Among the single classifiers, the RF algorithm performed better than the others, with average Matthews correlation coefficient (MCC) values of 0.725 ± 0.014 in 5-fold cross-validation and 0.774 ± 0.042 on the test set. The probability values calculated by the four models were then integrated into a stacked Bayesian model. In total, two optimal models, s-NB-1-LPFP6 and s-NB-2-LPFP6, were obtained. These two models achieved MCC values of 0.968 and 0.993 in 5-fold cross-validation and of 0.874 and 0.959 on the test set, respectively. The two models were then used for virtual screening to identify neuroprotective compounds in XXMD. Ten representative compounds with potential therapeutic effects against the two phenotypes were selected for further cell-based assays. Two of the selected compounds significantly inhibited both H2O2-induced and Na2S2O4-induced neurotoxicity. Together, our findings suggest that machine learning algorithms such as combined Bayesian models can feasibly predict neuroprotective compounds and provide preliminary insight into the pharmacological mechanisms of TCM.
Year: 2019 PMID: 31360720 PMCID: PMC6652039 DOI: 10.1155/2019/6847685
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
The total amount of each Chinese medicine compound obtained from the database.
| No. | Chinese name | English name | Latin name | Number of compounds |
|---|---|---|---|---|
| 1 | Bai Shao | White Peony Root | Paeoniae Radix Alba | 41 |
| 2 | Chuanxiong | Sichuan lovage rhizome | Chuanxiong Rhizoma | 242 |
| 3 | Fangfeng | Divaricate Saposhnikovia Root | Saposhnikovia Radix | 107 |
| 4 | Fang Ji | Fourstamen Stephania Root | Stephaniae Tetrandrae Radix | 85 |
| 5 | Fuzi | Prepared Common Monkshood Daughter Root | Aconiti Lateralis Radix Praeparata | 99 |
| 6 | Gan Cao | Liquorice root | Glycyrrhizae Radix et Rhizoma | 393 |
| 7 | Guizhi | Cassia twig | Cinnamomi Ramulus | 130 |
| 8 | Huang Qin | Baikal Skullcap Root | Scutellariae Radix | 128 |
| 9 | Kuxingren | Bitter Apricot Seed | Armeniacae Semen Amarum | 119 |
| 10 | Ma Huang | Chinese Ephedra Herb | Ephedrae Herba | 74 |
| 11 | Renshen | Ginseng | Ginseng Radix et Rhizoma | 272 |
| 12 | Shengjiang | Fresh ginger | Zingiberis Rhizoma Recens | 168 |
| Total | | | | 1858 |
| After removing duplicates | | | | 1484 |
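The totals in the table can be checked with simple arithmetic. The short sketch below sums the per-herb counts listed above and derives how many entries the duplicate-removal step dropped; the variable names are illustrative and not taken from the study.

```python
# Per-herb compound counts copied from the table above.
compound_counts = {
    "Bai Shao": 41, "Chuanxiong": 242, "Fangfeng": 107, "Fang Ji": 85,
    "Fuzi": 99, "Gan Cao": 393, "Guizhi": 130, "Huang Qin": 128,
    "Kuxingren": 119, "Ma Huang": 74, "Renshen": 272, "Shengjiang": 168,
}
unique_compounds = 1484  # reported count after duplicate removal

total = sum(compound_counts.values())
removed = total - unique_compounds  # entries shared between herbs
print(total, removed, round(removed / total, 3))
```

Roughly one in five database entries is therefore a compound that occurs in more than one herb.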
Figure 1. Workflow for classification model building, validation, and virtual screening (VS) as applied to neuroprotective agents.
Figure 2. Visual representation of the chemical space of active compounds (red) and inactive compounds (light green) against hypoxia-induced (a and c) and H2O2-induced (b and d) neurotoxicity. Panels (a) and (b) were generated using t-distributed stochastic neighbor embedding (t-SNE) based on Morgan fingerprints (4096 bits). Panels (c) and (d) were generated using principal component analysis (PCA) based on Morgan fingerprints (4096 bits).
Detailed statistical description of the entire data set.
| Model | Training set: Active | Training set: Inactive | Training set: Total | Test set: Active | Test set: Inactive | Test set: Total |
|---|---|---|---|---|---|---|
| Hypoxia-induced | 197 | 792 | 989 | 66 | 264 | 330 |
| H2O2-induced | 87 | 348 | 435 | 29 | 116 | 145 |
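The counts in this table imply two regularities worth making explicit: each set holds roughly one active compound for every four inactive ones, and the training set is about three times the size of the test set. The sketch below derives both from the table values; the function and variable names are illustrative, and the source does not state how the split was produced.

```python
# Counts copied from the data set table above: (active, inactive) per split.
datasets = {
    "Hypoxia-induced": {"train": (197, 792), "test": (66, 264)},
    "H2O2-induced": {"train": (87, 348), "test": (29, 116)},
}

def summarize(splits):
    """Return split sizes, active fraction per split, and train:test ratio."""
    sizes = {name: active + inactive for name, (active, inactive) in splits.items()}
    active_fraction = {
        name: round(active / sizes[name], 3)
        for name, (active, _inactive) in splits.items()
    }
    ratio = round(sizes["train"] / sizes["test"], 2)
    return sizes, active_fraction, ratio

summary = {model: summarize(splits) for model, splits in datasets.items()}
for model, (sizes, fractions, ratio) in summary.items():
    print(model, sizes, fractions, ratio)
```

With only about 20% actives, plain accuracy would look high even for a model that labels everything inactive, which is one reason a balance-aware metric such as MCC is useful for this data.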
Performance of single classification models for the training set (5-fold cross-validation result) and the test set (validation result using external test set) using different combinations of molecular properties.
| No. | Model | Descriptors | SE (5-fold CV) | SP (5-fold CV) | PPV (5-fold CV) | MCC (5-fold CV) | SE (test set) | SP (test set) | PPV (test set) | MCC (test set) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | RF-a1 | 32 | 0.682 | 0.970 | 0.912 | 0.750 | 0.675 | 0.975 | 0.915 | 0.718 |
| 2 | RF-b1 | 53 | 0.621 | 0.973 | 0.903 | 0.750 | 0.650 | 0.981 | 0.915 | 0.716 |
| 3 | RF-c1 | 79 | 0.667 | 0.992 | 0.927 | 0.823 | 0.690 | 0.980 | 0.922 | 0.742 |
| 4 | K-NN-a1 | 32 | 0.682 | 0.905 | 0.861 | 0.710 | 0.695 | 0.926 | 0.880 | 0.622 |
| 5 | K-NN-b1 | 53 | 0.621 | 0.936 | 0.873 | 0.710 | 0.599 | 0.931 | 0.865 | 0.557 |
| 6 | K-NN-c1 | 79 | 0.636 | 0.936 | 0.876 | 0.760 | 0.609 | 0.931 | 0.867 | 0.565 |
| 7 | Tree-a1 | 32 | 0.621 | 0.943 | 0.879 | 0.680 | 0.635 | 0.942 | 0.881 | 0.609 |
| 8 | Tree-b1 | 53 | 0.667 | 0.939 | 0.885 | 0.566 | 0.624 | 0.914 | 0.856 | 0.545 |
| 9 | Tree-c1 | 79 | 0.682 | 0.947 | 0.894 | 0.780 | 0.711 | 0.943 | 0.897 | 0.670 |
| 10 | AB-a1 | 32 | 0.682 | 0.920 | 0.873 | 0.659 | 0.731 | 0.900 | 0.867 | 0.603 |
| 11 | AB-b1 | 53 | 0.667 | 0.936 | 0.882 | 0.741 | 0.695 | 0.902 | 0.860 | 0.578 |
| 12 | AB-c1 | 79 | 0.621 | 0.928 | 0.867 | 0.824 | 0.741 | 0.941 | 0.901 | 0.687 |
| 13 | NB-a1 | 32 | 0.766 | 0.795 | 0.790 | 0.483 | 0.758 | 0.746 | 0.748 | 0.421 |
| 14 | NB-b1 | 53 | 0.777 | 0.904 | 0.879 | 0.644 | 0.682 | 0.894 | 0.852 | 0.555 |
| 15 | NB-c1 | 79 | 0.761 | 0.908 | 0.879 | 0.640 | 0.712 | 0.905 | 0.867 | 0.598 |
| 16 | RF-a2 | 26 | 0.805 | 0.994 | 0.956 | 0.860 | 0.655 | 0.991 | 0.924 | 0.710 |
| 17 | RF-b2 | 50 | 0.839 | 0.994 | 0.963 | 0.882 | 0.690 | 0.983 | 0.924 | 0.675 |
| 18 | RF-c2 | 65 | 0.747 | 0.997 | 0.947 | 0.830 | 0.724 | 1.000 | 0.945 | 0.761 |
| 19 | K-NN-a2 | 26 | 0.793 | 0.960 | 0.926 | 0.766 | 0.724 | 0.957 | 0.910 | 0.574 |
| 20 | K-NN-b2 | 50 | 0.747 | 0.945 | 0.906 | 0.702 | 0.724 | 0.957 | 0.910 | 0.585 |
| 21 | K-NN-c2 | 65 | 0.782 | 0.951 | 0.917 | 0.739 | 0.793 | 0.957 | 0.924 | 0.597 |
| 22 | Tree-a2 | 26 | 0.759 | 0.971 | 0.929 | 0.769 | 0.655 | 0.966 | 0.903 | 0.601 |
| 23 | Tree-b2 | 50 | 0.805 | 0.937 | 0.910 | 0.726 | 0.448 | 0.983 | 0.876 | 0.629 |
| 24 | Tree-c2 | 65 | 0.782 | 0.963 | 0.926 | 0.765 | 0.793 | 0.966 | 0.931 | 0.656 |
| 25 | AB-a2 | 26 | 0.805 | 0.937 | 0.910 | 0.726 | 0.655 | 0.957 | 0.897 | 0.602 |
| 26 | AB-b2 | 50 | 0.828 | 0.931 | 0.910 | 0.732 | 0.793 | 0.948 | 0.917 | 0.621 |
| 27 | AB-c2 | 65 | 0.851 | 0.951 | 0.931 | 0.788 | 0.828 | 0.974 | 0.945 | 0.570 |
| 28 | NB-a2 | 26 | 0.839 | 0.951 | 0.929 | 0.780 | 0.724 | 0.871 | 0.841 | 0.551 |
| 29 | NB-b2 | 50 | 0.816 | 0.966 | 0.936 | 0.796 | 0.621 | 0.914 | 0.855 | 0.542 |
| 30 | NB-c2 | 65 | 0.885 | 0.943 | 0.931 | 0.795 | 0.712 | 0.905 | 0.867 | 0.598 |
1-15: neuroprotective models against hypoxia-induced neurotoxicity (NIN models).
16-30: neuroprotective models against H2O2-induced neurotoxicity (NHN models).
a: models built by DS_2D descriptors.
b: models built by MOE_2D descriptors.
c: models built by DS_MOE 2D descriptors.
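For readers unfamiliar with the column headings, the sketch below shows the standard definitions of sensitivity (SE), specificity (SP), positive predictive value (PPV), and MCC in terms of confusion-matrix counts. The counts used in the example are hypothetical and chosen only to mimic a 1:4 active-to-inactive ratio; they are not results from the study, and the function name is illustrative.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Compute SE, SP, PPV and MCC from confusion-matrix counts."""
    se = tp / (tp + fn)                          # sensitivity: actives recovered
    sp = tn / (tn + fp)                          # specificity: inactives recovered
    ppv = tp / (tp + fp) if (tp + fp) else 0.0   # precision of "active" calls
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "PPV": ppv, "MCC": mcc}

# Hypothetical example: 50 actives and 200 inactives.
example = classification_metrics(tp=40, fn=10, tn=180, fp=20)
print({name: round(value, 3) for name, value in example.items()})
```

In this made-up case SE and SP are both high while PPV is noticeably lower, which illustrates how class imbalance affects the individual metrics and why a single summary value such as MCC is convenient for ranking models.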
Figure 3. Comparison of average MCC values obtained with different algorithms (a and c) and different sets of descriptors (b and d) against hypoxia-induced neurotoxicity (a and b) and H2O2-induced neurotoxicity (c and d) on the training set and test set.
Figure 4. Comparison of MCC values obtained with the four single classifiers and the s-NB-LPFP6 model using different sets of descriptors against hypoxia-induced neurotoxicity (a and b) and H2O2-induced neurotoxicity (c and d) on the training set (a and c) and test set (b and d).
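The stacking idea compared in Figure 4, in which probabilities from the four single classifiers are fed into a Bayesian model, can be illustrated with a small sketch. This is not the authors' implementation: it uses a Gaussian naïve Bayes classifier as a stand-in, omits the fingerprint descriptors that the s-NB models also use, and runs on invented probability values. All function names are hypothetical.

```python
import math

def fit_gaussian_nb(rows, labels):
    """Fit per-class priors, means and variances for each meta-feature."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        # A small variance floor keeps the density finite for constant columns.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-3)
            for col, m in zip(zip(*members), means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model

def predict_proba_active(model, row):
    """Posterior probability of the active class (label 1) for one compound."""
    log_post = {}
    for cls, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for x, m, var in zip(row, means, variances):
            ll += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        log_post[cls] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - top) for c, v in log_post.items()}
    return weights[1] / sum(weights.values())

# Invented base-model probabilities in the order (AB, kNN, CT, RF); 1 = active.
meta_features = [
    [0.90, 0.80, 0.85, 0.95], [0.70, 0.90, 0.75, 0.80], [0.80, 0.60, 0.90, 0.85],
    [0.20, 0.10, 0.30, 0.05], [0.10, 0.30, 0.20, 0.15], [0.30, 0.20, 0.10, 0.10],
    [0.15, 0.25, 0.05, 0.20], [0.25, 0.15, 0.20, 0.10],
]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

stacked = fit_gaussian_nb(meta_features, labels)
p_high = predict_proba_active(stacked, [0.85, 0.75, 0.80, 0.90])
p_low = predict_proba_active(stacked, [0.15, 0.20, 0.10, 0.10])
print(round(p_high, 3), round(p_low, 3))
```

The second-level model learns how much weight each base classifier's output deserves, so a compound that all four classifiers score highly receives a posterior close to 1, while one scored low by all four receives a posterior close to 0.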
Figure 5. Component analysis of the XXMD database. Blue indicates the percentage of compounds predicted to be active against hypoxia-induced neurotoxicity; green, against H2O2-induced neurotoxicity; red, against both hypoxia- and H2O2-induced neurotoxicity; and gray, the percentage of compounds predicted to be inactive.
Chemical structures of representative compounds predicted by two phenotypic screening models in XXMD.
| ID | Name | s-NB-1-LPFP6 EstPGood | s-NB-1-LPFP6 Prediction | s-NB-2-LPFP6 EstPGood | s-NB-2-LPFP6 Prediction |
|---|---|---|---|---|---|
| PubChem CID 21670038 | 5- | 0.993 | TRUE | 0.133 | TRUE |
| CHEMBL 8260 | Baicalein | 0.999 | TRUE | 0.349 | TRUE |
| CHEMBL 485818 | Baicalein | 1.000 | TRUE | 0.397 | TRUE |
| PubChem CID 5281607 | Chrysin | 0.994 | TRUE | 0.183 | TRUE |
| PubChem CID 441960 | Cimifugin | 0.881 | TRUE | 0.183 | TRUE |
| CHEMBL 504256 | Fangchinoline | 0.828 | TRUE | 0.727 | TRUE |
| CHEMBL 1734606 | Prim-O-glucosylcimifugin | 0.996 | TRUE | 0.101 | TRUE |
| CHEMBL 176045 | Tetrandrine | 0.754 | TRUE | 0.787 | TRUE |
| CHEMBL 16171 | Wogonin | 0.994 | TRUE | 0.416 | TRUE |
| PubChem CID 12004622 | Wogonoside | 1.000 | TRUE | 0.441 | TRUE |
Figure 6. Neuroprotective effects of chemicals on H2O2-induced and Na2S2O4-induced SH-SY5Y cells. The viability of the untreated cells was set to 100%. The values represent mean (%) ± SD of three individual experiments (n = 3). ##P < 0.01 versus control group; ∗P < 0.05 and ∗∗P < 0.01 versus model group.