Bin Ji, Rui Liu, Shasha Li, Jie Yu, Qingbo Wu, Yusong Tan, Jiaju Wu.
Abstract
BACKGROUND: With the rapid spread of electronic medical records and the arrival of the medical big data era, the application of natural language processing technology in biomedicine has become a hot research topic.
Keywords: Attention; BiLSTM-CRF; Chinese electronic medical record; Drug dictionary; Named entity recognition
Year: 2019 PMID: 30961597 PMCID: PMC6454595 DOI: 10.1186/s12911-019-0767-2
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 BIO tagging schema
Fig. 2 Architecture of neural network based approach to medical NER on Chinese EMRs
Fig. 3 Architecture of BiLSTM-CRF model
Fig. 4 Architecture of Attention-BiLSTM-CRF model
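The BIO schema named in Fig. 1 can be illustrated with a minimal sketch. It assumes character-level tagging (common for Chinese text) and a `B-`/`I-` prefix followed by the entity category; the function name `bio_tags`, the label string `Drug`, and the sample sentence are hypothetical, and the tag names used in the paper may differ.

```python
def bio_tags(text, entities):
    """Return one BIO tag per character of `text`.

    `entities` is a list of (start, end, label) spans with `end` exclusive.
    Characters outside every span are tagged "O".
    """
    tags = ["O"] * len(text)
    for start, end, label in entities:
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the entity
    return tags


# Hypothetical sentence fragment; the drug name 贝伐珠单抗 occupies offsets 2..6.
text = "给予贝伐珠单抗治疗"
entities = [(2, 7, "Drug")]

for char, tag in zip(text, bio_tags(text, entities)):
    print(char, tag)
```

Each character receives exactly one tag, so a sequence model such as BiLSTM-CRF can be trained to predict the tag sequence directly.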
Examples of entity boundary partition error
| Recognized entity | Correct entity | Category |
|---|---|---|
| 左附件 | 左附件区 | Anatomy |
| 卵巢切除 | 卵巢切除术 | Surgery |
| 贝伐 | 贝伐珠单抗 | Drug |
| 睡眠不佳 | 饮食睡眠不佳 | Independent symptom |
| 疼痛 | 疼痛不适 | Symptom description |
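Boundary errors like these matter because the results are reported under a strict index. The sketch below assumes the usual reading of "strict": a predicted entity counts as correct only when its start offset, end offset, and category all match a gold entity. The function name `strict_counts` and the character offsets are hypothetical.

```python
def strict_counts(predicted, gold):
    """Return (true positives, false positives, false negatives).

    Entities are (start, end, category) tuples; only exact matches count.
    """
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)
    return tp, len(pred_set) - tp, len(gold_set) - tp


# Hypothetical offsets: the gold entity is 贝伐珠单抗, the model returned only 贝伐.
gold = [(2, 7, "Drug")]
predicted = [(2, 4, "Drug")]

print(strict_counts(predicted, gold))  # (0, 1, 1)
```

Under this assumption a truncated entity is penalised twice, once as a false positive and once as a false negative, so it lowers both precision and recall.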
Entity auto-correct algorithm
Test results of the two neural network models on the given dataset (strict index, %)
| Entity name | BiLSTM-CRF P | BiLSTM-CRF R | BiLSTM-CRF F1 | Attention-BiLSTM-CRF P | Attention-BiLSTM-CRF R | Attention-BiLSTM-CRF F1 |
|---|---|---|---|---|---|---|
| Anatomy | 85.57 | 85.61 | 85.59 | 86.29 | 86.24 | 86.27 |
| Surgery | 85.81 | 85.58 | 85.69 | 86.19 | 84.90 | 85.54 |
| Drug | 94.92 | 78.11 | 85.70 | 89.73 | 85.98 | 87.81 |
| Independent symptom | 92.45 | 89.52 | 90.96 | 91.93 | 90.20 | 91.05 |
| Symptom description | 91.81 | 87.91 | 89.82 | 91.58 | 87.69 | 89.59 |
| Total | 87.66 | 85.72 | 86.68 | 87.75 | 86.77 | 87.26 |
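The F1 columns are consistent with the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). A minimal sketch that re-derives two reported values from the table; the function name `f1_score` is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R for the Drug row and the Total row of the BiLSTM-CRF columns.
print(f"{f1_score(94.92, 78.11):.2f}")  # 85.70
print(f"{f1_score(87.66, 85.72):.2f}")  # 86.68
```

Because F1 is a harmonic mean, the Drug row of BiLSTM-CRF ends up close to the other categories despite its high precision: the low recall of 78.11 pulls the score down.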
Test results of the two models with drug dictionary (strict index, %)
| Approach | P | R | F1 |
|---|---|---|---|
| BiLSTM-CRF | 87.66 | 85.72 | 86.68 |
| BiLSTM-CRF + dictionary | 88.61 | 86.83 | 87.71 |
| Attention-BiLSTM-CRF | 87.75 | 86.77 | 87.26 |
| Attention-BiLSTM-CRF + dictionary | 88.79 | 87.80 | 88.29 |
Test results of the two models with post-processing rules (strict index, %)
| Approach | P | R | F1 |
|---|---|---|---|
| BiLSTM-CRF | 87.66 | 85.72 | 86.68 |
| BiLSTM-CRF + rule | 89.43 | 87.47 | 88.44 |
| Attention-BiLSTM-CRF | 87.75 | 86.77 | 87.26 |
| Attention-BiLSTM-CRF + rule | 89.61 | 88.58 | 89.09 |
Test results of the two models with entity auto-correct algorithm (strict index, %)
| Approach | P | R | F1 |
|---|---|---|---|
| BiLSTM-CRF | 87.66 | 85.72 | 86.68 |
| BiLSTM-CRF + algorithm | 88.73 | 86.78 | 87.74 |
| Attention-BiLSTM-CRF | 87.75 | 86.77 | 87.26 |
| Attention-BiLSTM-CRF + algorithm | 88.72 | 87.71 | 88.21 |
Test results of the two models with all auxiliary measures (strict index, %)
| Approach | P | R | F1 |
|---|---|---|---|
| BiLSTM-CRF | 87.66 | 85.72 | 86.68 |
| BiLSTM-CRF + all | 91.12 | 89.21 | 90.15 |
| Attention-BiLSTM-CRF | 87.75 | 86.77 | 87.26 |
| Attention-BiLSTM-CRF + all | 91.26 | 90.38 | 90.82 |