Xin Dong1, Yi Zheng1, Zixin Shu1, Kai Chang1, Jianan Xia1, Qiang Zhu1, Kunyu Zhong1, Xinyan Wang1, Kuo Yang1,2, Xuezhong Zhou1.
Abstract
Traditional Chinese medicine (TCM) has played an indispensable role in clinical diagnosis and treatment. Based on a patient's symptom phenotypes, computation-based prescription recommendation methods can recommend personalized TCM prescriptions using machine learning and artificial intelligence technologies. However, owing to the complexity and individuality of a patient's clinical phenotypes, current prescription recommendation methods do not perform well. Meanwhile, it is very difficult to effectively represent symptom terms that are not recorded in an existing knowledge base. In this study, we proposed a subnetwork-based symptom term mapping method (SSTM) and constructed an SSTM-based TCM prescription recommendation method (termed TCMPR). Our SSTM can extract the subnetwork structure between symptoms from a knowledge network to effectively represent the embedding features of clinical symptom terms (especially the unrecorded terms). The experimental results showed that our method performs better than state-of-the-art methods. In addition, comprehensive experiments on TCMPR with different hyperparameters (i.e., feature embedding, feature dimension, subnetwork filter threshold, and feature fusion) demonstrate that our method performs well on TCM prescription recommendation and could potentially promote clinical diagnosis and treatment in TCM precision medicine.
Year: 2022 PMID: 35224094 PMCID: PMC8872682 DOI: 10.1155/2022/4845726
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. An overview of our methods. First, we constructed the herb-symptom-related knowledge graph (HSKG) and the symptom network (a). Second, a comprehensive embedding of the patient's symptoms was formed with the SSTM and the symptom network (b). Finally, the patient's comprehensive embedding vector was used for TCM prescription recommendation (c); the output is the predicted probability of each herb, from which the recommended prescription is obtained.
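The last step in the Figure 1 caption, turning per-herb predicted probabilities into a recommended prescription, can be sketched as a simple top-k selection. This is only an illustration under assumptions: the function name `recommend_top_k`, the herb names, and the probability values are hypothetical, and the source does not state how ties or cut-offs are handled in TCMPR.

```python
def recommend_top_k(herb_probs, k):
    """Return the k herbs with the highest predicted probability.

    herb_probs: dict mapping herb name -> predicted probability.
    This is an assumed selection rule, not the authors' published code.
    """
    # Sort herbs by probability, highest first, and keep the first k names.
    ranked = sorted(herb_probs.items(), key=lambda item: item[1], reverse=True)
    return [herb for herb, _ in ranked[:k]]


# Hypothetical herbs and probabilities, for illustration only.
example_probs = {"herb_a": 0.91, "herb_b": 0.12, "herb_c": 0.67, "herb_d": 0.45}
top_two = recommend_top_k(example_probs, 2)
print(top_two)
```

With k set to 5, 10, or 15, the same selection would produce the candidate lists scored in the Top@5, Top@10, and Top@15 columns of the performance table.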
Figure 2. Symptom distribution, herb distribution, and symptom-herb correlation of clinical samples. (a) The distributions of symptom number before and after screening are similar to the Poisson distribution. (b) The distributions of the herb number are also similar to the Poisson distribution. (c) No matter which symptom segmentation algorithm is adopted, the average similarity of symptoms increases with the increase in prescription similarity, and our SSTM method can achieve the best symptom-herb correlation.
The construction of the herb-symptom-related knowledge graph.
| Entity type | Entity amount | Relationship type | Relationship amount |
|---|---|---|---|
| Symptom (SY) | 8,669 | HB-SY | 37,528 |
| Herb (HB) | 8,464 | SY-SY | 3,122 |
| Efficacy (EF) | 1,177 | HB-EF | 34,268 |
| Property (PP) | 45 | HB-PP | 22,244 |
| Meridian (MD) | 182 | HB-MD | 4,958 |
| Total | 18,537 | Total | 102,120 |
The construction of the symptom network.
| Source | Relationship type | Threshold | Relationship amount |
|---|---|---|---|
| HB-SY | SY-SY_word | / | 3,038 |
| SY-SY | SY-SY (synonymy) | / | 3,122 |
| SY-SY_word | SY-SY_word | / | 12,542 |
| HB-EF, HB-SY | SY-SY (EF) | 1,000 | 52,729 |
| HB-MD, HB-SY | SY-SY (MD) | 1,000 | 51,394 |
| HB-PP, HB-SY | SY-SY (PP) | 100 | 45,248 |
| Total | | | 168,073 |
Figure 3. Performance of TCMPR and baselines.
Performance comparison of TCMPR and baseline methods.
| Methods | Top@5 Precision | Top@5 Recall | Top@5 F1-score | Top@10 Precision | Top@10 Recall | Top@10 F1-score | Top@15 Precision | Top@15 Recall | Top@15 F1-score |
|---|---|---|---|---|---|---|---|---|---|
| MLKNN_1 | 0.2519 | 0.1118 | 0.1549 | 0.1832 | 0.1596 | 0.1706 | 0.1541 | 0.1990 | 0.1737 |
| MLKNN_5 | 0.2773 | 0.1217 | 0.1692 | 0.2272 | 0.1987 | 0.2120 | 0.1897 | 0.2477 | 0.2148 |
| MLKNN_10 | 0.2805 | 0.1262 | 0.1741 | 0.2304 | 0.2058 | 0.2174 | 0.1970 | 0.2630 | 0.2252 |
| ML-DT | 0.2799 | 0.1264 | 0.1742 | 0.2251 | 0.1996 | 0.2116 | 0.1949 | 0.2597 | 0.2227 |
| TCMPR (ours) | 0.2823 | 0.1298 | 0.1778 | 0.2311 | 0.2095 | 0.2197 | 0.1989 | 0.2692 | 0.2288 |
| Improvement (TCMPR vs. MLKNN_1) | 12.1% | 16.1% | 14.8% | 26.1% | 31.3% | 28.8% | 29.1% | 35.3% | 31.7% |
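As a reading aid for the table, the sketch below shows one common way to compute precision, recall, and F1-score over the top-k recommended herbs, plus the relative improvement over a baseline. The metric definitions are assumptions (the source does not spell them out), and the function names are hypothetical. The relative-improvement formula, (ours − baseline) / baseline, does reproduce the reported 12.1% for Top@5 precision from the table values 0.2823 and 0.2519.

```python
def precision_recall_f1_at_k(ranked_herbs, true_herbs, k):
    """Score the top-k recommended herbs against the true prescription.

    Assumed definitions: precision = hits / k, recall = hits / |true herbs|,
    F1 = harmonic mean of precision and recall.
    """
    top_k = ranked_herbs[:k]
    hits = len(set(top_k) & set(true_herbs))
    precision = hits / k
    recall = hits / len(true_herbs) if true_herbs else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def relative_improvement(ours, baseline):
    """Relative gain of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


# Hypothetical ranking and ground truth, for illustration only.
p, r, f1 = precision_recall_f1_at_k(["a", "b", "c", "d"], ["a", "c", "e"], k=2)

# Top@5 precision values taken from the table: TCMPR vs. MLKNN_1.
gain = round(relative_improvement(0.2823, 0.2519), 1)
print(p, r, f1, gain)
```

The F1-scores reported in the table are also consistent with the harmonic mean of the reported precision and recall (for example, 0.2823 and 0.1298 give roughly 0.1778), although whether the authors averaged per-sample F1 or derived it from averaged precision and recall is not stated.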
Figure 4. Performance comparison of different hyperparameters. (a) contains the comparison of embedding methods. In (a), DW represents DeepWalk, NV represents node2vec, LN represents LINE, OH represents One-Hot, and TE represents TransE. (b) is the comparison of fusion methods, (c) shows the feature dimension experiments, and (d) shows the comparison between different subnetwork filter thresholds.