Tingting Zhang, Zonghai Huang, Yaqiang Wang, Chuanbiao Wen, Yangzhi Peng, Ying Ye.
Abstract
Background: The practice of traditional Chinese medicine (TCM) began several thousand years ago, and the knowledge of practitioners is recorded in paper and electronic versions of case notes, manuscripts, and books in multiple languages. Developing a method of information extraction (IE) from these sources to generate a cohesive data set would be a great contribution to the medical field. The goal of this study was to perform a systematic review of the status of IE from TCM sources over the last 10 years.
Year: 2022 PMID: 35600940 PMCID: PMC9122692 DOI: 10.1155/2022/1679589
Source DB: PubMed Journal: Evid Based Complement Alternat Med ISSN: 1741-427X Impact factor: 2.650
Figure 1. Example of extracting information from a TCM clinical record.
Figure 2. The search strategy and reasons for each search string category.
Figure 3. Flow diagram of included articles and eligibility criteria.
Figure 4. Timeline of data sources, models (or algorithms), and extracted targets of TCM.
Figure 5. The number of papers enrolled for each type of data source.
Figure 6. Frameworks of dictionary-based methods and rule-based methods in IE tasks.
Figure 7. Frameworks of shallow machine learning and deep learning methods in IE tasks.
Details of the statistical-based methods, core content, evaluation and best performance of IE systems.
| Author, year | Method | Method type | Supervised type | Core content of extracted TCM information | Evaluation and best performance |
|---|---|---|---|---|---|
| Zhu W et al. [ | SVM | Machine learning | Supervised | Attributes of herbal medicine entities and prescriptions, including medicine application, medicine class, medicine dosage, medicine property, prescription usage, prescription attend illness, and prescription function | Medicine application: |
| Zhu W et al. [ | SVM and regular expression | Machine learning combined with rule-based method | Supervised | Symptom, diagnosis | Not mentioned |
| Cai D et al. [ | Bootstrapping approach | Machine learning | Semisupervised | Prescriptions, drugs | |
| Zhu W et al. [ | Decision tree and CRF | Machine learning | Supervised | Irrelevant information, basic information, multiclass information, symptom information, prescription information, and result information | Paragraph (unit) identification: |
| Feng L et al. [ | CRF and bootstrapping | Machine learning | Supervised | Clinical phenotype | Not mentioned |
| Wang Y et al. [ | HMM/MEMM/CRF | Machine learning | Supervised | Symptom | The best |
| Liu H et al. [ | CRF | Machine learning | Supervised | Symptoms and pathogenesis | Symptoms: |
| Jiang Q et al. [ | CRF/HMM/MEMM | Machine learning | Supervised | Symptoms, signs, TCM diagnosis, Chinese medicines (drug), prescriptions, TCM syndrome type, etc. | CRF : |
| Jiang Q et al. [ | CRF | Machine learning | Supervised | Symptoms or signs, TCM diagnosis, TCM syndrome type, Chinese medicines (drug), and names of TCM prescriptions | With all features: |
| Wang J et al. [ | Decision tree/rules | Machine learning and rule-based method | Supervised | Effect relation and conditional effect relation | Rule-based approach: |
| Wan H et al. [ | Heterogeneous factor graph model (HFGM) | Machine learning | Semisupervised | Herb-syndrome relations, herb-disease relations, formula-syndrome relations, formula-disease relations, and syndrome-disease relations | |
| Liang J et al. [ | CRF | Machine learning | Supervised | TCM drug name | Recognition of traditional Chinese medicine drug names: |
| Ruan T et al. [ | CRF | Machine learning | Supervised | Symptoms, departments, disease, medicines, and examinations | Precision of TCM symptom is 93.26% |
| Sun S. et al. [ | SVM | Machine learning | Supervised | Disease-treatment relation, healthcare relation, meridian points treatment and healthcare method relation, and medication treatment and healthcare method relation | The |
| Zhang H. et al. [ | Short-text LSTM classifier (STLC) | Deep learning | Supervised | Person name | |
| Chi Y. et al. [ | CRF for entity extraction; SVM, naive Bayes (NB), LSTM, and KNN for relation extraction | Machine learning and deep learning | Supervised | Entities: food material, dish, nutritional element, symptom, and crowd; relations: “good for” or “bad for” relationships between “food material” and “symptom”, and “same”, “related”, or “different” relationships between two “food material” entities | Concept extraction and relationship recognition were all above 85% |
| Chen T et al. [ | Combine BERT with a one-dimensional CNN to fine-tune the pretrained model | Deep learning | Supervised | Herb-syndrome relation, herb-disease relation, formula-syndrome relation, formula-disease relation, and syndrome-disease relation | The “1d-CNN fine-tuning” approach: |
| Jin Z et al. [ | TCMKG-LSTM-CRF | Deep learning | Supervised | Medicine, alias, prescription, pieces, disease, symptom, syndrome, meridian, property, flavor, and function | |
| Song B et al. [ | BiLSTM-CRF | Deep learning | Supervised | Symptom | |
| Deng N et al. [ | Co-training-based method | Machine learning | Semisupervised | Four-character medicine effect phrases | In each iteration, the extraction accuracy is all above 97% |
| Deng N et al. [ | Serialized co-training method | Machine learning | Semisupervised | Medicine names | Precision is 98.6% when the number of patent texts reaches 3000 |
| Feng L et al. [ | BiLSTM-CRF, lattice LSTM-CRF, BERT | Deep learning | Supervised | Symptom | The BERT model: |
| Liu L et al. [ | BiLSTM-CRF | Deep learning | Semisupervised | Traditional Chinese medicine, symptoms, patterns, disease, and formulas | |
| Zhang M et al. [ | BERT-BiLSTM-CRF | Deep learning | Semisupervised | Symptoms, syndromes, treatment, Chinese medicine, prescriptions, pulse, tongue, and efficacy | Accuracy = 81.24%, |
| Deng N et al. [ | Serialized co-training method | Machine learning | Semisupervised | Disease name | Not mentioned |
| Qu Q et al. [ | BERT-BiLSTM-CRF | Deep learning | Supervised | Symptoms, disease names, time, prescription names, and drug names | The best recognition effect on drugs, with an |
| Ruan C et al. [ | Multiview graph model to extract relation (MVG2RE) | Deep learning | Supervised | Herb-syndrome, herb-disease, formula-disease, formula-syndrome, and syndrome-disease | |
| Zhang D et al. [ | BiLSTM | Deep learning | Distantly supervised | Chinese medicine, symptom, medicine prescription, dose, tongue-like, and pulse | |
| Zhang H et al. [ | BiLSTM-CRF | Deep learning | Supervised | Chief complaints (symptom name, symptom duration, accompanying symptom, etc.); pathological history (symptom name, symptom cause, etc.); personal history (place of birth, smoking history, drinking history, etc.); and TCM diagnosis (tongue quality, furred tongue, tongue body, pulse, etc.) | Not mentioned |
| Zheng Z et al. [ | BiLSTM-CRF | Deep learning | Supervised | Syndrome, disease, symptom, prescription, therapy, herb | |
| Zhou S et al. [ | Structural bidirectional long short-term memory (SLSTM) model | Deep learning | Supervised | Subjects, methods, and results | SLSTM model achieves close to 90% performance in precision, recall, and |
| Bai T et al. [ | SEGATT-CNN combined with different classifiers | Deep learning | Supervised | Entity: disease, herb names; relation: herbal-disease relation, herbal-chemical relation | Herbal-disease relation: SVM combined SEGATT-CNN: |
| Deng N et al. [ | BiLSTM-CRF | Deep learning | Supervised | Herb name, disease name, symptom, therapeutic effect, and nonentity | |
| Jia Q et al. [ | Span-level distantly supervised NER approach | Deep learning | Distantly supervised | Symptom, medicine, prescriptions, dose, tongue-like, and pulse | |
| Zhang T et al. [ | Multihop self-attention mechanism + BiLSTM | Deep learning | Supervised | Therapeutic relation and causal relation | Therapeutic relation: |
| Xu H et al. [ | Nested NER model based on LSTM-CRF | Deep learning | Supervised | Medicine, symptom, pulse, tongue, medicine prescription, dose, disease location, onset time and duration, severity, color, quantity, and frequency | |
| Guan Y et al. [ | BERT-BiLSTM-CRF | Deep learning | Supervised | Disease, pattern, and symptom | |
Details of evaluation metrics.
| Evaluation index | Formula | Role |
|---|---|---|
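| Precision | $\text{Precision} = \frac{TP}{TP + FP}$ | Ratio of correctly predicted positive samples to all samples predicted as positive |
| Accuracy | $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ | Proportion of all samples that are classified correctly |
| Recall | $\text{Recall} = \frac{TP}{TP + FN}$ | Ratio of correctly predicted positive samples to the total number of real positive samples |
| F-measure | $F\text{-measure} = \frac{(1 + \beta^2) \cdot \text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}}$ | Examine precision and recall jointly |
| AUC | Area under curve | Intuitively express the real classification ability of the model |

TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives; β weights recall relative to precision (β = 1 gives the F1 score).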
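
As a minimal sketch, the precision, accuracy, recall, and F-measure listed in the table above can be computed from confusion counts, where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives. The class name `Counts`, the function names, and the example counts are hypothetical and chosen only for illustration; they do not come from any of the reviewed systems.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one entity or relation type (illustrative only)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def precision(c: Counts) -> float:
    # Correct positive predictions over all positive predictions.
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Counts) -> float:
    # Correct positive predictions over all real positives.
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c: Counts) -> float:
    # Correct decisions over all decisions.
    total = c.tp + c.tn + c.fp + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def f_measure(c: Counts, beta: float = 1.0) -> float:
    # Weighted harmonic mean of precision and recall; beta = 1 gives F1.
    p, r = precision(c), recall(c)
    denom = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denom if denom else 0.0


# Made-up counts, e.g. for a "symptom" entity type.
example = Counts(tp=8, fp=2, tn=85, fn=5)
print(precision(example), recall(example), accuracy(example), f_measure(example))
```

Returning 0.0 when a denominator is zero is one common convention, not the only one; some toolkits raise a warning or report the value as undefined instead.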
Figure 8. Example of the reference relations in the ancient TCM literature.