Ming Cheng¹, Shufeng Xiong², Fei Li³, Pan Liang⁴, Jianbo Gao⁴.
Abstract
BACKGROUND: Named entity recognition (NER) on Chinese electronic medical/healthcare records has attracted significant attention, as it can be applied to building applications that understand these records. Most previous methods have been purely data-driven, requiring high-quality, large-scale labeled medical data. However, labeled data is expensive to obtain, and such data-driven methods have difficulty handling rare and unseen entities.
Keywords: Chinese clinical named entity recognition; Deep neural network; Dictionary features; Multi-task learning
Year: 2021 PMID: 34972505 PMCID: PMC8719412 DOI: 10.1186/s12911-021-01717-1
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Parallel multi-task model incorporating external knowledge and a shared representation among tasks
Fig. 2 The framework of our system. First, the system embeds a sentence into a high-dimensional space to extract features. It then concatenates the resulting vectors of each encoder and performs multi-task learning: the pink node layer on the right represents segmentation, while the blue node layer represents categorization
An illustrative example of the tag sequence
“Sym” is an abbreviation for “Symptom”, and “Ana” is an abbreviation for “Anatomy”
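The tag sequence illustrated above follows a character-level BIO scheme: each Chinese character receives one tag, and entity spans are decoded from the B-/I- prefixes. A minimal sketch of that decoding (not the authors' code; the example sentence and its spans are illustrative assumptions, using the "Sym" and "Ana" labels from the caption):

```python
# Decode entity spans from a BIO-tagged character sequence.
sentence = list("患者腹部疼痛")  # illustrative text: "the patient has abdominal pain"
tags = ["O", "O", "B-Ana", "I-Ana", "B-Sym", "I-Sym"]

def extract_entities(chars, tags):
    """Collect (entity_text, label) spans from character-level BIO tags."""
    entities, buf, label = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if buf:                      # close the previous span, if any
                entities.append(("".join(buf), label))
            buf, label = [ch], tag[2:]
        elif tag.startswith("I-") and buf and tag[2:] == label:
            buf.append(ch)               # continue the current span
        else:
            if buf:                      # "O" or inconsistent tag ends the span
                entities.append(("".join(buf), label))
            buf, label = [], None
    if buf:
        entities.append(("".join(buf), label))
    return entities

print(extract_entities(sentence, tags))  # [('腹部', 'Ana'), ('疼痛', 'Sym')]
```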
N-gram feature templates of the ith character
| Types | Template |
|---|---|
| 2-gram | |
| 3-gram | |
| 4-gram | |
| 5-gram | |
| 6-gram | |
Fig. 3 An illustration of n-gram feature construction. A segment sample of the drug entity is marked with a solid rectangle; the character is highlighted in yellow
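The template table above can be read as: for the i-th character, collect every window of 2 to 6 characters that covers position i. A hedged sketch of that construction (the paper's exact window offsets did not survive extraction, so full coverage windows are assumed here):

```python
def ngram_features(chars, i, n_min=2, n_max=6):
    """Collect every n-gram (n_min..n_max characters) that covers position i.

    Assumption: each template enumerates all windows containing the i-th
    character; the paper's table lists templates for 2- to 6-grams.
    """
    feats = []
    for n in range(n_min, n_max + 1):
        # Start positions such that the n-gram still contains index i.
        for start in range(i - n + 1, i + 1):
            if 0 <= start and start + n <= len(chars):
                feats.append("".join(chars[start:start + n]))
    return feats

# All 2- to 6-grams covering the character 'c' (index 2) of "abcde":
print(ngram_features(list("abcde"), 2))
# ['bc', 'cd', 'abc', 'bcd', 'cde', 'abcd', 'bcde', 'abcde']
```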
Statistics of the entity recognition in Chinese clinical texts (2,932 records, 70,472 clinical entities overall)
| Dataset subset (records) | Entity counts by type | | | | | Total |
|---|---|---|---|---|---|---|
| Total (1596) | 10142 | 1275 | 12689 | 1513 | 13740 | 39359 |
| Total (600) | 5574 | 1085 | 849 | 2764 | 1708 | 11980 |
| Total (736) | 9686 | 1164 | 1105 | 4117 | 3061 | 19133 |
| Overall (2932) | | | | | | 70472 |
Parameters of our model in the experiments
| Parameter | Value |
|---|---|
| Dim of character embedding | 100 |
| Dim of radical embedding | 50 |
| Number of BiLSTM hidden units | 128 |
| Dropout | 0.5 |
| Batch size | 32 |
| Epochs | 300 |
Comparative results (F-measure) of different models on three datasets
| Method | CCKS2017 | CCKS2018 | FCCd |
|---|---|---|---|
| Wang et al. | 91.24 | 89.72 | 86.07 |
| Hu et al. | 91.03 | – | – |
| Zhang et al. | 90.52 | – | – |
| Qiu et al. | 91.32 | – | – |
| Li et al. | 91.60 | 89.56 | 86.87 |
| Tang et al. | 90.61 | 88.63 | 86.24 |
| Luo et al. | 91.36 | 88.63 | 85.52 |
| Yang et al. | 90.16 | 89.13 | 84.73 |
| Ours | 91.84 | 90.29 | 87.05 |
Impact of the different character embeddings in our method
| Dataset | Method | Precision | Recall | F-measure |
|---|---|---|---|---|
| CCKS2017 | Random | 89.56 | 89.29 | 89.42 |
| | BERT (Baseline1) | 90.73 | 90.51 | 90.62 |
| | FT-BERT (Baseline2) | 91.27 | 91.21 | 91.24 |
| | FT-BERT-Radical | 91.69 | 91.34 | **91.51** |
| FCCd | Random | 84.23 | 83.32 | 83.77 |
| | BERT (Baseline1) | 86.11 | 85.52 | 85.81 |
| | FT-BERT (Baseline2) | 86.21 | 85.83 | 86.02 |
| | FT-BERT-Radical | 86.95 | 86.56 | **86.75** |
The best result is in bold
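In this and the following ablation tables, the F-measure is the harmonic mean of precision and recall, so any cell can be cross-checked from its neighbors. A one-line sketch:

```python
def f_measure(precision, recall):
    """F1 score: harmonic mean of precision and recall (values in percent)."""
    return 2 * precision * recall / (precision + recall)

# Checking the FT-BERT-Radical row on CCKS2017 from the table above:
print(round(f_measure(91.69, 91.34), 2))  # 91.51
```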
Impact of the dictionary features on our method
| Dataset | Method | Precision | Recall | F-measure |
|---|---|---|---|---|
| CCKS2017 | FT-BERT-Radical | 91.69 | 91.34 | 91.51 |
| | FT-BERT-Radical+Dictionary | 91.91 | 91.78 | **91.84** |
| FCCd | FT-BERT-Radical | 86.95 | 86.56 | 86.75 |
| | FT-BERT-Radical+Dictionary | 87.32 | 86.79 | **87.05** |
The best result is in bold
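A common way to realize dictionary features, consistent with the "+Dictionary" rows above, is to flag each character according to its position inside a lexicon match. The forward-maximum-matching scheme and B/I/E/S flag set below are assumptions for illustration, not necessarily the paper's exact design:

```python
def dictionary_features(chars, lexicon, max_len=6):
    """Flag each character with B/I/E/S if it falls inside the longest
    lexicon match starting at its position, 'O' otherwise.

    Assumption: forward maximum matching against `lexicon`, a set of
    known entity strings (e.g. from an external medical dictionary).
    """
    flags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        match = 0
        # Try the longest candidate first (greedy forward matching).
        for n in range(min(max_len, len(chars) - i), 0, -1):
            if "".join(chars[i:i + n]) in lexicon:
                match = n
                break
        if match == 1:
            flags[i] = "S"                      # single-character match
        elif match > 1:
            flags[i] = "B"                      # begin of the match
            for j in range(i + 1, i + match - 1):
                flags[j] = "I"                  # inside
            flags[i + match - 1] = "E"          # end
        i += max(match, 1)
    return flags

lex = {"腹部", "疼痛"}  # tiny illustrative lexicon
print(dictionary_features(list("患者腹部疼痛"), lex))
# ['O', 'O', 'B', 'E', 'B', 'E']
```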
Impact of the different dictionary sizes on method performance
| Dataset | Dictionary size | Precision | Recall | F-measure |
|---|---|---|---|---|
| CCKS2017 | 70% | 91.79 | 91.61 | 91.70 |
| | 80% | 91.83 | 91.69 | 91.76 |
| | 90% | 91.87 | 91.73 | 91.80 |
| | 100% | 91.91 | 91.78 | **91.84** |
| FCCd | 70% | 87.19 | 86.65 | 86.92 |
| | 80% | 87.23 | 86.71 | 86.97 |
| | 90% | 87.28 | 86.75 | 87.01 |
| | 100% | 87.32 | 86.79 | **87.05** |
Performance of the networks with and without multi-task learning on the two datasets
| Dataset | Method | Precision | Recall | F-measure |
|---|---|---|---|---|
| CCKS2017 | Single-task for NES | 92.06 | 91.55 | 91.80 |
| | Single-task for NER | 91.72 | 91.59 | 91.65 |
| | Multi-task for NER | 91.91 | 91.78 | **91.84** |
| FCCd | Single-task for NES | 87.24 | 86.81 | 87.02 |
| | Single-task for NER | 87.03 | 86.60 | 86.81 |
| | Multi-task for NER | 87.32 | 86.79 | **87.05** |
The best result is in bold
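The multi-task rows above correspond to one shared sentence encoder feeding two parallel heads, segmentation (NES) and categorization (NER), as pictured in Figs. 1 and 2. A schematic NumPy sketch (the 256-dim shared feature follows the parameter table's 128 BiLSTM hidden units per direction; the tag-set sizes 4 and 11 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared encoder output for a 6-character sentence:
# 2 directions x 128 BiLSTM hidden units = 256 features per character.
shared = rng.standard_normal((6, 256))

# Two task-specific linear heads over the same shared representation.
W_seg = rng.standard_normal((256, 4))    # segmentation tags (e.g. B/I/E/S)
W_cat = rng.standard_normal((256, 11))   # category tags (e.g. BIO over 5 types + O)

seg_logits = shared @ W_seg   # per-character segmentation scores, shape (6, 4)
cat_logits = shared @ W_cat   # per-character category scores, shape (6, 11)

# Training would minimize the sum of the two tasks' losses, so that
# gradients from both heads update the shared encoder.
```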
Comparative performance (recall) of different methods on rare entities, grouped by entity frequency in the training data
| Dataset | Method | 0 | 1 | 2 | 3 |
|---|---|---|---|---|---|
| CCKS2017 | BERT (Baseline1) | 51.47 | 69.29 | 81.36 | 90.34 |
| | FT-BERT (Baseline2) | 53.62 | 71.36 | 82.54 | 90.97 |
| | Ours | | | | |
| FCCd | BERT (Baseline1) | 50.37 | 59.47 | 71.36 | 84.91 |
| | FT-BERT (Baseline2) | 52.56 | 60.38 | 73.82 | 84.75 |
| | Ours | | | | |
The best result is in bold
Performance of our model on each category of entity
| Dataset | Entity type | Precision | Recall | F-measure |
|---|---|---|---|---|
| CCKS2017 | Symptom | 96.87 | 97.12 | 96.99 |
| | Disease | 86.32 | 79.61 | 82.83 |
| | Exam | 94.13 | 93.81 | 93.97 |
| | Treatment | 82.69 | 83.73 | 83.21 |
| | Anatomy | 89.67 | 88.03 | 88.84 |
| | Average | 89.93 | 88.46 | 89.16 |
| FCCd | Anatomy | 87.03 | 86.63 | 86.83 |
| | Operation | 86.32 | 86.03 | 86.17 |
| | Drug | 87.86 | 85.41 | 86.62 |
| | IndeSym | 88.37 | 87.58 | 87.97 |
| | DesSym | 87.32 | 87.02 | 87.17 |
| | Average | 87.38 | 86.53 | 86.95 |
Fig. 4 Distribution of representative radicals in three datasets