Xishuang Dong1, Shanta Chowdhury1, Lijun Qian1, Xiangfang Li1, Yi Guan2, Jinfeng Yang3, Qiubin Yu4.
Abstract
Specific entity terms such as diseases, tests, symptoms, and genes in Electronic Medical Records (EMRs) can be extracted by Named Entity Recognition (NER). However, the limited availability of labeled EMRs poses a great challenge for mining medical entity terms. In this study, a novel multitask bi-directional RNN model combined with deep transfer learning is proposed as a potential solution that uses knowledge transfer and data augmentation to enhance NER performance with limited data. The proposed model has been evaluated using micro-average F-score, macro-average F-score, and accuracy. The proposed model is observed to outperform the baseline model on the discharge datasets. For instance, on discharge summaries, the micro-average F-score is improved by 2.55% and the overall accuracy is improved by 7.53%. On progress notes, the micro-average F-score and the overall accuracy are improved by 1.63% and 5.63%, respectively.
Year: 2019 PMID: 31048840 PMCID: PMC6497281 DOI: 10.1371/journal.pone.0216046
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Framework of the proposed model for NER.
Fig 2. Contextual word representation from vector representation.
To extract relevant context information from a sentence, a bi-directional RNN with LSTM cells takes a vector that combines the word embedding (red shaded box) and the character embedding (white shaded box) and produces a contextual word representation (green shaded box).
The proposed network architecture.
| Name | Description |
|---|---|
| Input | Sentences in EMR |
| Word Embedding | Mikolov model |
| Character Embedding Layer | 150 LSTM cells for each hidden layer, one forward hidden layer and one backward hidden layer, Dropout = 0.5 |
| Transferred layer | 300 LSTM cells for each hidden layer, one forward hidden layer and one backward hidden layer, Dropout = 0.5 |
| Shared Layer | 300 LSTM cells for each hidden layer, one forward hidden layer and one backward hidden layer, Dropout = 0.5 |
| Parts-of-speech tag (POS) layer | 300 LSTM cells for each hidden layer, one forward hidden layer and one backward hidden layer, Dropout = 0.5 |
| Named Entity recognition (NER) Layer | 300 LSTM cells for each hidden layer, one forward hidden layer and one backward hidden layer, Dropout = 0.5 |
| Output | Softmax |
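
The layer settings in the table above can be written down as a small configuration, which makes the per-layer sizes easier to compare. This is only a sketch: the class name `BiLSTMLayerSpec`, the layer identifiers, and the `output_width` property are hypothetical, and the width calculation assumes that the forward and backward hidden states are concatenated, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiLSTMLayerSpec:
    """One bi-directional LSTM layer as described in the architecture table."""

    name: str
    cells_per_direction: int  # LSTM cells in the forward and in the backward hidden layer
    dropout: float = 0.5      # the table lists Dropout = 0.5 for every recurrent layer

    @property
    def output_width(self) -> int:
        # Assumption (not stated in the table): forward and backward
        # hidden states are concatenated, doubling the width.
        return 2 * self.cells_per_direction


# Values copied from the table; the ordering here is the table's row order,
# not a claim about how the layers are wired together.
LAYERS = [
    BiLSTMLayerSpec("character_embedding", 150),
    BiLSTMLayerSpec("transferred", 300),
    BiLSTMLayerSpec("shared", 300),
    BiLSTMLayerSpec("pos", 300),
    BiLSTMLayerSpec("ner", 300),
]

for layer in LAYERS:
    print(
        f"{layer.name}: 2 x {layer.cells_per_direction} cells, "
        f"dropout={layer.dropout}, output width {layer.output_width}"
    )
```

How these layers connect to one another is shown in Fig 3 and is deliberately not encoded here.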
Fig 3. Main architecture of the proposed model, which contains a transferred layer (yellow shaded box) initialized by deep transfer learning and three other layers: a shared layer (blue shaded box), an NER layer (red shaded box), and a POS layer (green shaded box). The NER layer and the POS layer serve the NER and POS tasks, respectively.
Comparison results of MicroP, MicroR and MicroF measure on discharge summaries.
| Model | MicroP | MicroR | MicroF |
|---|---|---|---|
| Naive Bayes (NB) | 78.07 | 77.91 | 77.99 |
| Maximum Entropy (ME) | 88.81 | 88.81 | 88.81 |
| Support Vector Machine (SVM) | 90.52 | 90.52 | 90.52 |
| Conditional Random Field (CRF) | 93.15 | 93.15 | 93.15 |
| Convolutional Neural Network (CNN) | 88.64 | 88.64 | 88.64 |
| Bi-RNN model (BRNN) | 90.90 | 90.90 | 90.90 |
| Transfer learning Bi-RNN model (TBRNN) | 92.25 | 92.25 | 92.25 |
| Multitask Bi-RNN model (MBRNN) | 93.31 | 93.31 | 93.31 |
| Our proposed model | 93.45 | 93.45 | 93.45 |
Comparison results of MicroP, MicroR and MicroF measure on progress notes.
| Model | MicroP | MicroR | MicroF |
|---|---|---|---|
| Naive Bayes (NB) | 79.42 | 79.37 | 79.40 |
| Maximum Entropy (ME) | 91.45 | 91.45 | 91.45 |
| Support Vector Machine (SVM) | 93.07 | 93.06 | 93.06 |
| Conditional Random Field (CRF) | 94.93 | 94.02 | 94.02 |
| Convolutional Neural Network (CNN) | 91.13 | 91.14 | 91.13 |
| Bi-RNN model (BRNN) | 93.58 | 93.58 | 93.58 |
| Transfer learning Bi-RNN model (TBRNN) | 94.37 | 94.37 | 94.37 |
| Multitask Bi-RNN model (MBRNN) | 96.65 | 96.65 | 96.65 |
| Our proposed model | 95.21 | 95.21 | 95.21 |
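
The micro-average F-score gains quoted in the abstract (2.55% and 1.63%) can be reproduced from the two MicroF tables above as absolute differences in percentage points. The sketch below assumes that the "baseline model" in the abstract is the plain Bi-RNN (BRNN) row; the names `MICRO_F`, `gain`, and `GAINS` are illustrative.

```python
# Micro-averaged F-scores (%) copied from the BRNN and proposed-model rows
# of the discharge-summary and progress-note tables.
MICRO_F = {
    "discharge summaries": {"BRNN": 90.90, "proposed": 93.45},
    "progress notes": {"BRNN": 93.58, "proposed": 95.21},
}


def gain(scores, baseline="BRNN", model="proposed"):
    """Absolute improvement in percentage points, rounded to two decimals."""
    return round(scores[model] - scores[baseline], 2)


# Assumption: BRNN is the baseline the abstract compares against.
GAINS = {dataset: gain(scores) for dataset, scores in MICRO_F.items()}

for dataset, points in GAINS.items():
    print(f"{dataset}: +{points:.2f} percentage points over BRNN")
```

Read this way, the reported improvements are differences in percentage points rather than relative changes.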
Comparison results of NER on discharge summaries.
| Model | Entity type | Precision | Recall | F-measure |
|---|---|---|---|---|
| Multitask model | Disease | 84.11 | 84.70 | 84.40 |
| Multitask model | Symptom | 88.08 | 84.01 | 86.00 |
| Multitask model | Disease group | 43.75 | 82.35 | 57.14 |
| Multitask model | Treatment | 73.91 | 82.06 | 77.77 |
| Multitask model | Test | 89.23 | 87.99 | 88.61 |
| Multitask model | Macro average | 75.82 | 84.22 | 78.79 |
| Our proposed model | Disease | 84.31 | 85.32 | 84.82 |
| Our proposed model | Symptom | 87.52 | 85.14 | 86.32 |
| Our proposed model | Disease group | 62.50 | 83.33 | 71.43 |
| Our proposed model | Treatment | 76.20 | 79.59 | 77.86 |
| Our proposed model | Test | 90.16 | 88.91 | 89.53 |
| Our proposed model | Macro average | 80.14 | 84.46 | 81.99 |
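
The macro-average row for the proposed model on discharge summaries can be checked directly: it is the unweighted mean of the five per-entity values in each column. The sketch below also recomputes each F-measure as the harmonic mean of precision and recall, which is the usual F1 definition and is assumed here rather than stated in the table. The names `PROPOSED`, `macro_average`, and `f_measure` are illustrative.

```python
# (precision, recall, F-measure) in % for the proposed model on discharge
# summaries, copied from the per-entity rows of the table.
PROPOSED = {
    "Disease": (84.31, 85.32, 84.82),
    "Symptom": (87.52, 85.14, 86.32),
    "Disease group": (62.50, 83.33, 71.43),
    "Treatment": (76.20, 79.59, 77.86),
    "Test": (90.16, 88.91, 89.53),
}


def macro_average(rows):
    """Unweighted mean of each column across entity types."""
    n = len(rows)
    return tuple(round(sum(column) / n, 2) for column in zip(*rows.values()))


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


MACRO = macro_average(PROPOSED)
print("macro P, R, F:", MACRO)

for entity, (p, r, f_reported) in PROPOSED.items():
    print(f"{entity}: recomputed F = {f_measure(p, r):.2f}, reported F = {f_reported:.2f}")
```

Recomputed F-measures can differ from the reported ones in the last digit, presumably because the published precision and recall values are already rounded.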
Comparison results of NER on progress notes.
| Model | Entity type | Precision | Recall | F-measure |
|---|---|---|---|---|
| Multitask model | Disease | 94.06 | 95.07 | 94.5 |
| Multitask model | Symptom | 94.50 | 90.79 | 92.61 |
| Multitask model | Disease group | 77.27 | 80.95 | 79.06 |
| Multitask model | Treatment | 88.15 | 87.19 | 87.67 |
| Multitask model | Test | 92.53 | 93.36 | 92.94 |
| Multitask model | Macro average | 89.31 | 89.47 | 89.37 |
| Our proposed model | Disease | 92.88 | 91.13 | 92.00 |
| Our proposed model | Symptom | 92.79 | 88.02 | 90.35 |
| Our proposed model | Disease group | 59.09 | 81.25 | 68.42 |
| Our proposed model | Treatment | 88.46 | 90.68 | 89.56 |
| Our proposed model | Test | 80.71 | 81.20 | 80.95 |
| Our proposed model | Macro average | 82.78 | 86.46 | 84.25 |
Comparison results (% accuracy) on discharge summaries, per entity type and overall.
TMBRNN is the proposed model.
| Model | Disease | Symptom | Disease group | Treatment | Test | Accuracy |
|---|---|---|---|---|---|---|
| NB | 44.82 | 51.72 | N/A | 59.00 | 65.96 | 58.91 |
| ME | 48.32 | 56.34 | 34.19 | 58.80 | 76.10 | 65.68 |
| SVM | 57.18 | 62.52 | 37.22 | 60.48 | 80.17 | 70.46 |
| CRF | 77.33 | 77.83 | 48.39 | 77.47 | 90.05 | 83.94 |
| CNN | 52.80 | 65.76 | 40.00 | 53.14 | 79.28 | 68.60 |
| BRNN | 73.83 | 79.35 | 28.00 | 67.99 | 82.63 | 77.85 |
| TBRNN | 74.30 | 82.60 | 44.00 | 68.20 | 86.79 | 80.75 |
| MBRNN | 76.86 | 87.22 | 36.00 | 71.33 | 89.20 | 83.51 |
| TMBRNN | 80.37 | 86.14 | 60.00 | 72.17 | 90.84 | 85.20 |
Comparison results (% accuracy) on progress notes, per entity type and overall.
TMBRNN is the proposed model.
| Model | Disease | Symptom | Disease group | Treatment | Test | Accuracy |
|---|---|---|---|---|---|---|
| NB | 69.50 | 70.09 | N/A | 41.59 | 71.85 | 67.49 |
| ME | 71.49 | 72.37 | 41.15 | 52.93 | 77.58 | 72.44 |
| SVM | 77.77 | 76.92 | 21.12 | 56.36 | 81.49 | 76.45 |
| CRF | 87.42 | 87.09 | 36.06 | 75.60 | 90.31 | 87.22 |
| CNN | 76.19 | 76.65 | 12.50 | 51.83 | 76.65 | 73.40 |
| BRNN | 87.48 | 87.01 | 25.00 | 63.99 | 83.75 | 82.72 |
| TBRNN | 88.70 | 88.49 | 31.25 | 72.93 | 86.12 | 85.43 |
| MBRNN | 92.24 | 94.19 | 75.00 | 86.46 | 92.61 | 92.13 |
| TMBRNN | 89.93 | 92.02 | 50.00 | 77.29 | 88.94 | 88.35 |
Fig 4. Overall performance with different batch sizes.
Fig 5. Accuracy on mining different categories of medical terms with different batch sizes.
Comparison results of NER in terms of different learning rates.
| Dataset | Learning rate | MicroF | MacroF | Overall accuracy |
|---|---|---|---|---|
| Discharge Summary | 0.01 | 93.45 | 81.98 | 85.20 |
| Discharge Summary | 0.001 | 91.22 | 70.32 | 78.92 |
| Discharge Summary | 0.0001 | 83.30 | 49.65 | 54.30 |
| Progress Note | 0.01 | 94.92 | 81.20 | 89.19 |
| Progress Note | 0.001 | 94.51 | 76.91 | 87.60 |
| Progress Note | 0.0001 | 86.79 | 54.63 | 64.48 |
Comparison results of NER on discharge summaries and progress notes.
| Dataset | Entity type | lr = 0.01 | lr = 0.001 | lr = 0.0001 |
|---|---|---|---|---|
| Discharge Summary | Disease | 80.37 | 70.32 | 41.58 |
| Discharge Summary | Symptom | 86.14 | 16.00 | 56.52 |
| Discharge Summary | Disease group | 60.00 | 70.32 | 0.00 |
| Discharge Summary | Treatment | 72.17 | 65.60 | 40.16 |
| Discharge Summary | Test | 90.94 | 85.78 | 62.84 |
| Progress Note | Disease | 89.25 | 88.84 | 71.02 |
| Progress Note | Symptom | 93.73 | 91.23 | 73.80 |
| Progress Note | Disease group | 31.25 | 25.00 | 0.00 |
| Progress Note | Treatment | 78.89 | 75.23 | 28.67 |
| Progress Note | Test | 89.96 | 88.88 | 66.55 |