Hong Bai, Xianhong Li, Hongjun Li, Jialiang Yang, Kang Ning.
Abstract
Traditional Chinese Medicine (TCM) preparations have been used in China for thousands of years. The quality of TCM preparations can be evaluated on the basis of either chemical ingredients or biological ingredients. To date, the overwhelming majority of studies have focused on chemical ingredients, while few have addressed biological ingredients; only recently have assessments based on biological ingredients drawn broader attention. In this work, we established a method for quality evaluation of TCM preparations that combines chemical ingredients determined by HPLC fingerprinting with biological ingredients obtained by high-throughput sequencing. This proof-of-concept method was evaluated and compared with existing methods on Liuwei Dihuang Wan (LDW), a classical TCM preparation in China. Comparison of this method with those based only on chemical or biological ingredients suggests that (1) biological ingredients can complement chemical ingredients in separating TCM preparations from different manufacturers and batches with high accuracy; and (2) classification of samples based on selected features always outperformed classification based on all features (chemical, biological, or both). By rationally selecting representative biological and chemical features, we have shown that these two types of features complement each other in assessing ingredient consistency and differences among TCM samples, which helps to ensure the effectiveness, safety and legality of TCM preparations.
Year: 2019 PMID: 30971728 PMCID: PMC6458136 DOI: 10.1038/s41598-019-42341-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Analytical results of the precision, stability and repeatability for 6 characteristic common peaks in LDW sample (MT.A) (n = 5).
| No. | Mean RT (min) | RSD of RT (%): Precision | RSD of RT (%): Repeatability | RSD of RT (%): Stability | RSD of PA (%): Precision | RSD of PA (%): Repeatability | RSD of PA (%): Stability |
|---|---|---|---|---|---|---|---|
| 1 | 24.55 | 0.89 | 0.63 | 0.21 | 1.17 | 0.32 | 0.67 |
| 2 | 27.90 | 0.78 | 0.72 | 0.19 | 0.58 | 0.55 | 0.62 |
| 3 | 30.22 | 0.79 | 0.73 | 0.21 | 2.67 | 2.79 | 0.22 |
| 4 | 36.10 | 0.70 | 0.67 | 0.21 | 1.95 | 1.73 | 0.27 |
| 5 | 42.59 | 0.62 | 0.60 | 0.12 | 0.34 | 0.75 | 0.40 |
| 6 | 48.22 | 0.24 | 0.24 | 0.13 | 0.34 | 0.79 | 0.28 |
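The RSD columns above are relative standard deviations. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the mean, times 100), the calculation for one peak could look like the following. The source does not state which standard deviation convention was used, and the replicate values below are hypothetical, not taken from the study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation in percent.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not say which convention applies.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical retention times (min) from five replicate injections of one peak.
retention_times = [24.50, 24.55, 24.60, 24.52, 24.58]

print(f"Mean RT = {mean(retention_times):.2f} min")
print(f"RSD of RT = {rsd_percent(retention_times):.2f} %")
```

The same function would apply unchanged to peak areas (PA) for the precision, repeatability and stability series.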
Figure 1. Correlation coefficients of LDW samples from three manufacturers (MH, MS and MT) compared with the mean chromatographic fingerprint.
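A comparison of this kind can be sketched as follows, assuming each fingerprint is reduced to a vector of peak areas for the common peaks and that similarity is measured with the Pearson correlation coefficient. Both are assumptions: the caption does not specify the similarity measure, and the peak areas and sample labels below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def mean_fingerprint(fingerprints):
    """Element-wise mean of several fingerprints (the reference profile)."""
    n = len(fingerprints)
    return [sum(column) / n for column in zip(*fingerprints)]


# Hypothetical peak areas for six common peaks in three samples.
samples = {
    "MH.A": [120.0, 85.0, 40.0, 60.0, 200.0, 150.0],
    "MS.A": [110.0, 90.0, 35.0, 65.0, 210.0, 140.0],
    "MT.A": [130.0, 80.0, 50.0, 55.0, 190.0, 160.0],
}

reference = mean_fingerprint(list(samples.values()))
for name, fingerprint in samples.items():
    print(name, round(pearson(fingerprint, reference), 4))
```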
Figure 2. Heat maps showing clusters of 27 LDW samples obtained by hierarchical clustering based on (a) HPLC fingerprint similarity and (b) detectable species similarity.
Figure 3. Feature selection results. The main panels show the ranking of each (A) chemical ingredient and (B) biological ingredient by importance score (IM score), calculated with the random forest algorithm; the top 5 ingredients were selected as features. The insets show the change in mean prediction accuracy over 500 repetitions of 10-fold cross-validation as (A) chemical ingredients and (B) biological ingredients were added one by one in order of importance score (both starting with the combination of the first two ingredients).
Figure 4. Accuracy analysis of different LDW samples based on the random forest method. Bar plots of accuracy based on chemical and biological ingredients for differentiating samples among (A) different batches of different manufacturers and (B) different manufacturers. Category information: FP, all chemical features; FP-5, 5 selected chemical features; Taxa, all biological features; Taxa-5, 5 selected biological features; FP_Taxa_10, integrated features.
Figure 5. Accuracy analysis of different LDW samples based on the KNN method. (a) Bar plot of accuracy based on chemical ingredients; (b) ROC curve based on chemical ingredients; (c) bar plot of accuracy based on biological ingredients; (d) ROC curve based on biological ingredients; (e) bar plot of accuracy based on both chemical and biological ingredients; (f) ROC curve based on both chemical and biological ingredients. Insets for (b,d,f): enlarged views of the top-left region of the ROC curves, showing the differences among approaches. (1) For Fig. 5a,c,e, the X-axis represents the number of samples used for the training set, and the Y-axis represents classification accuracy. Here accuracy refers to the classification accuracy for samples in the testing set at the manufacturer level (not batch level). (2) For Fig. 5b,d,f, the X-axis represents the false positive rate (FPR) of classification, and the Y-axis represents the true positive rate (TPR). (3) Category information: FP, all chemical features; FP-auto, 5 automatically selected chemical features; FP-man, 5 manually selected chemical features; Taxa, all biological features; Taxa-auto, 5 automatically selected biological features; Taxa-man, 5 manually selected biological features; Auto-10, 5 automatically selected chemical features and 5 automatically selected biological features; Man-10, 5 manually selected chemical features and 5 manually selected biological features.
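To illustrate manufacturer-level classification accuracy with KNN, the following is a minimal sketch rather than the authors' pipeline. It assumes Euclidean distance, k = 3 and majority voting, none of which are stated in the caption. The two-dimensional feature vectors are hypothetical stand-ins for the selected chemical and biological features.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training samples.

    Assumptions: Euclidean distance and k = 3. Ties are resolved in favour
    of the label encountered first, i.e. that of the closest neighbour.
    """
    neighbours = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples assigned to the correct manufacturer."""
    correct = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return correct / len(test_y)


# Hypothetical feature vectors for samples from manufacturers MH, MS and MT.
train_X = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
           (3.0, 3.1), (3.2, 2.9), (2.9, 3.0),
           (5.0, 1.0), (5.1, 1.2), (4.9, 0.9)]
train_y = ["MH"] * 3 + ["MS"] * 3 + ["MT"] * 3

test_X = [(1.0, 1.0), (3.0, 3.0), (5.0, 1.1)]
test_y = ["MH", "MS", "MT"]

print("Accuracy:", accuracy(train_X, train_y, test_X, test_y))
```

Repeating the evaluation while varying the number of training samples would give one accuracy value per training-set size, which is the quantity plotted on the bar-plot axes described in the caption.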