Dongmei Li, Jiao Long, Jintao Qu, Xiaoping Zhang.
Abstract
Traditional clinical named entity recognition methods struggle to balance effective feature extraction from unstructured text against the complexity of the neural network model. We propose a model based on ALBERT and a multihead attention (MHA) mechanism to address this problem. The model first obtains character-level embeddings from the pretrained ALBERT language model, then feeds these embeddings into an iterated dilated convolutional neural network (IDCNN) to quickly extract global semantic information, and finally decodes the predicted labels with a conditional random field (CRF) to obtain the optimal label sequence. We also apply the MHA mechanism to capture intercharacter dependencies from multiple aspects, and we use the RAdam optimizer to speed up convergence and improve the generalization ability of the model. Experimental results show that our model achieves an F1 score of 85.63% on the CCKS-2019 dataset, 4.36 percentage points higher than the baseline model.
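The CRF decoding step mentioned in the abstract is commonly implemented with the Viterbi algorithm. The following is a minimal sketch of that idea, not the authors' code: the three-label tag set, the emission scores, and the transition scores are all made-up illustrative values, and `viterbi_decode` is a hypothetical helper name.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path and its score.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for reaching label j at this position.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backpointers.append(pointers)

    # Follow the back-pointers from the best final label.
    last = max(range(n_labels), key=lambda j: score[j])
    best_score = score[last]
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path, best_score


# Toy tag set and scores (assumptions for illustration only).
LABELS = ["O", "B-Disease", "I-Disease"]
FORBIDDEN = -1e4  # large penalty: an entity cannot start with an I- tag after O
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-Disease
    [0.0, 0.0, 0.0],        # from I-Disease
]
emissions = [
    [1.0, 0.8, 0.0],  # position 0: greedy choice would be O
    [0.2, 0.3, 1.0],  # position 1: greedy choice would be I-Disease
    [1.0, 0.1, 0.2],  # position 2: greedy choice would be O
]

path, best = viterbi_decode(emissions, transitions)
print([LABELS[i] for i in path])
```

Picking the best label independently at each position would give `O, I-Disease, O`, which violates the tagging scheme. Decoding over the whole sequence with transition scores avoids that, which is the reason a CRF layer is placed on top of the per-character scores.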
Year: 2022 PMID: 35656458 PMCID: PMC9152388 DOI: 10.1155/2022/2056039
Source DB: PubMed Journal: Evid Based Complement Alternat Med ISSN: 1741-427X Impact factor: 2.650
Figure 1. Main architecture of our ALBERT-IDCNN-MHA-CRF model.
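To see why an iterated dilated CNN can cover wide context with few layers, the sketch below computes the receptive field of stacked 1-D dilated convolutions with stride 1. The kernel size of 3, the dilation widths `[1, 1, 2]`, and the four iterations are assumptions chosen for illustration; this record does not state the configuration the authors used.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field, in characters, of stacked stride-1 dilated convolutions."""
    # Each layer widens the field by (kernel_size - 1) * dilation positions.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


block = [1, 1, 2]  # assumed dilation widths of one block
iterations = 4     # assumed number of times the block is re-applied

one_block = receptive_field(block)
full_stack = receptive_field(block * iterations)
print(one_block, full_stack)
```

Under these assumptions a single block sees 9 characters and the four-times iterated stack sees 33, while the parameters of the block can be shared across iterations.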
Figure 2. Input example.
Statistics of the different entity types in the CCKS-2019 dataset.
| | Disease | Exam | Test | Operation | Drug | Anatomy | Sum |
|---|---|---|---|---|---|---|---|
| Train | 2116 | 222 | 318 | 765 | 456 | 1486 | 5363 |
| Test | 682 | 91 | 193 | 140 | 263 | 447 | 1816 |
Results of different models.
| Method | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| BiLSTM-CRF(Baseline) | 79.79 | 82.81 | 81.27 |
| IDCNN-CRF | 80.37 | 82.65 | 81.49 |
| IDCNN-MHA-CRF | 82.16 | 82.81 | 82.48 |
| ALBERT-IDCNN-CRF | 82.70 | 84.03 | 83.36 |
| ALBERT-IDCNN-MHA-CRF | | | **85.63** |
The best result on each metric is shown in bold face.
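The F1 column can be checked against the P and R columns, since F1 is the harmonic mean of precision and recall. The sketch below recomputes it for the four rows with complete values and also recomputes the gain over the baseline quoted in the abstract. The dictionary layout and function name are illustrative; because P and R are already rounded, a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) in percent, copied from the table above.
RESULTS = {
    "BiLSTM-CRF (baseline)": (79.79, 82.81, 81.27),
    "IDCNN-CRF": (80.37, 82.65, 81.49),
    "IDCNN-MHA-CRF": (82.16, 82.81, 82.48),
    "ALBERT-IDCNN-CRF": (82.70, 84.03, 83.36),
}

for name, (p, r, reported) in RESULTS.items():
    print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {reported:.2f}")

# Gain of the full model (F1 from the abstract) over the baseline.
gain = 85.63 - 81.27
print(f"gain over baseline: {gain:.2f} percentage points")
```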
Figure 3. Results on different types of entities.
Error samples.
| Prediction | True entity |
|---|---|
| 肝细胞性肝癌 (hepatocellular carcinoma) | (左肝)肝细胞性肝癌(中度分化) ((left liver) hepatocellular carcinoma (moderately differentiated)) |
| 胃癌 (gastric cancer) | 胃癌根治术 (radical gastrectomy for gastric cancer) |
| 肾上腺 (adrenal gland) | 左肾上腺 (left adrenal gland) |
| 淋巴结 (lymph nodes) | 腹主动脉旁淋巴结 (abdominal para-aortic lymph nodes) |
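In all four error samples the predicted string is contained in the true entity, so the model located the entity but cut its boundary short. The sketch below makes that check explicit. `error_type` is a hypothetical helper, and the category names are illustrative rather than taken from the paper.

```python
def error_type(predicted, gold):
    """Classify a predicted entity string against the gold entity string."""
    if predicted == gold:
        return "exact"
    if predicted in gold:
        return "boundary (too short)"
    if gold in predicted:
        return "boundary (too long)"
    return "other"


# (prediction, true entity) pairs copied from the table above.
ERROR_SAMPLES = [
    ("肝细胞性肝癌", "(左肝)肝细胞性肝癌(中度分化)"),
    ("胃癌", "胃癌根治术"),
    ("肾上腺", "左肾上腺"),
    ("淋巴结", "腹主动脉旁淋巴结"),
]

for predicted, gold in ERROR_SAMPLES:
    print(predicted, "->", error_type(predicted, gold))
```

If evaluation requires an exact span match, which is an assumption here, each of these predictions counts as both a false positive and a false negative even though the core of the entity was found.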
Summary of different optimizers.
| Optimizer | Year | Learning rate | Gradient |
|---|---|---|---|
| AdaGrad | 2011 | √ | × |
| RMSprop | 2012 | √ | × |
| Adam | 2014 | √ | √ |
| Lookahead + Adam | 2019 | √ | √ |
| RAdam | 2019 | √ | √ |
“√” means the quantity is adjusted dynamically; “×” means it is not.
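RAdam differs from Adam by rectifying the adaptive learning rate during the first training steps, when the second-moment estimate is still unreliable. The sketch below computes that rectification factor as defined in the original RAdam formulation. The value `beta2=0.999` is the common Adam default and is an assumption, since this record does not list the hyperparameters used.

```python
import math


def radam_rectification(step, beta2=0.999):
    """Return RAdam's rectification factor for a 1-based step, or None.

    None means the variance is not yet tractable, so the update falls back
    to a momentum-only step without the adaptive learning rate.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        return None
    return math.sqrt(
        ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )


for step in (1, 10, 100, 1000, 10000):
    print(step, radam_rectification(step))
```

The factor is undefined at the very first steps, then grows toward 1 as training proceeds, which acts like a built-in warmup without a hand-tuned schedule.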
Results on different optimizers.
| Optimizer | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| AdaGrad | 78.12 | 80.99 | 79.53 |
| RMSprop | 80.62 | 83.71 | 82.13 |
| Adam | 83.46 | 85.96 | 84.69 |
| Lookahead + Adam | 83.69 | 85.80 | 84.74 |
| RAdam | | | **85.63** |
The best result on each metric is shown in bold face.
Figure 4. Comparison of the performance of different optimizers.
Comparison with state-of-the-art models.
| Team name | Method | F1 (%) |
|---|---|---|
| Alihealth | BBC + BBT + FBBC + rule | 85.62 |
| THU_MSIIP | Ensemble | 85.59 |
| DUTIR | ELMO + BiLSTM-CRF | 85.16 |
| Jfhealthcare | — | 84.85 |
| Suda-hlt | — | 84.12 |
| ZJUCST | — | 83.80 |
| Ours | ALBERT-IDCNN-MHA-CRF | **85.63** |
The best result is shown in bold face.
Figure 5. Influence of the number of heads.
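The number of heads controls how the feature dimension is split: with `num_heads` heads, each head attends over a slice of width `d_model / num_heads`. The sketch below is a deliberately simplified, projection-free version of multi-head self-attention in plain Python. A real MHA layer also applies learned query, key, value, and output projections, which are omitted here, and the input values are made up.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out


def multi_head_self_attention(x, num_heads):
    """Split features into heads, attend per head, and concatenate the results."""
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        head_outputs.append(attention(part, part, part))
    # Concatenate the per-head vectors for each position.
    return [sum((head[t] for head in head_outputs), []) for t in range(len(x))]


# Three characters with 4-dimensional features (toy values).
x = [[1.0, 0.0, 0.5, 0.2],
     [0.0, 1.0, 0.1, 0.9],
     [1.0, 1.0, 0.3, 0.3]]
y = multi_head_self_attention(x, num_heads=2)
print(len(y), len(y[0]))
```

Each head produces its own attention pattern over the characters, so more heads give more distinct views of the sequence at the cost of a narrower slice per head. That trade-off is what a sweep over the number of heads explores.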