| Literature DB >> 36059424 |
Wenchao Gao1, Yu Li1, Xiaole Guan1, Shiyu Chen1, Shanshan Zhao1.
Abstract
Commonly used nested entity recognition methods are span-based entity recognition methods, which focus on learning the head and tail representations of entities. This method lacks obvious boundary supervision, which leads to the failure of the correct candidate entities to be predicted, resulting in the problem of high precision and low recall. To solve the above problems, this paper proposes a named entity recognition method based on multi-task learning and biaffine mechanism, introduces the idea of multi-task learning, and divides the task into two subtasks, entity span classification and boundary detection. The entity span classification task uses biaffine mechanism to score the resulting spans and select the most likely entity class. The boundary detection task mainly solves the problem of low recall caused by the lack of boundary supervision in span classification. It captures the relationship between adjacent words in the input text according to the context, indicates the boundary range of entities, and enhances the span representation through additional boundary supervision. The experimental results show that the named entity recognition method based on multi-task learning and biaffine mechanism can improve the F1 value by up to 7.05%, 12.63%, and 14.68% on the GENIA, ACE2004, and ACE2005 nested datasets compared with other methods, which verifies that this method has better performance on the nested entity recognition task.Entities:
Mesh:
Year: 2022 PMID: 36059424 PMCID: PMC9436550 DOI: 10.1155/2022/2687615
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1Named entity example.
Figure 2MTL-BAM model structure.
Figure 3The input representation structure of the BERT.
Figure 4Score matrix for entity span.
Figure 5Boundary detection model.
MTL-BAM algorithm flow.
| Algorithm: | MTL-BAM | |
|---|---|---|
| Input: | Original dataset | |
| Output: | Named entity recognition model | |
| 1: | While (not traversing all original dataset sentences) do | |
| 2: |
| |
| 3: |
| |
| 4: | The corresponding word vector | |
| 5: | Connect | |
| 6: | While (not traversed all | |
| 7: | Input BiLSTM layer to get output | |
| 8: | Use two multilayer perceptrons for | |
| 9: | While (NER model parameters did not converge) do | |
| 10: | While (not traversed all | |
| 11: | Input biaffine network training to get loss | |
| 12: | Input boundary detection module training to get loss | |
| 13: | Multi_Loss=loss | |
Experimental parameter setting.
| Parameter | Value |
|---|---|
| BiLSTM size | 200 |
| BiLSTM layer | 3 |
| BiLSTM dropout | 0.4 |
| MLP size | 150 |
| MLP dropout | 0.2 |
| BERT size | 1024 |
| FastText embedding size | 300 |
| CharCNN filter widths | [ |
| Char embedding size | 50 |
| Embeddings dropout | 0.5 |
| Optimiser | Adam |
| Learning rate | 0.001 |
Experimental results of entity types in GENIA dataset.
| Entity type | BAM | MTL-BAM | ||||
|---|---|---|---|---|---|---|
|
|
|
|
|
|
| |
| DNA |
| 75.67 |
| 78.35 |
| 77.15 |
| RNA | 86.79 | 84.40 | 85.58 |
|
|
|
| Protein |
| 83.56 | 83.20 | 81.85 |
|
|
| cell_line | 81.08 | 67.42 | 73.62 |
|
|
|
| cell_type |
| 74.09 | 75.50 | 75.87 |
|
|
| Total |
| 79.42 | 80.33 | 80.62 |
|
|
Figure 6Experimental results of various entity types in GENIA dataset.
Experimental results of entity types in ACE2004 dataset.
| Entity type | BAM | MTL-BAM | ||||
|---|---|---|---|---|---|---|
|
|
|
|
|
|
| |
| LOC | 65.69 | 63.81 | 64.73 |
|
|
|
| WEA | 71.43 | 46.88 | 56.60 |
|
|
|
| GPE | 83.70 | 83.94 | 83.82 |
|
|
|
| PER |
| 89.12 |
| 88.35 |
| 89.71 |
| FAC | 72.63 |
|
| 69.00 |
| 65.40 |
| ORG |
| 78.80 | 80.84 | 80.39 |
|
|
| VEH | 88.24 | 88.24 | 88.24 |
|
|
|
| Overall |
| 83.70 | 85.00 | 84.88 |
|
|
Experimental results of entity types in ACE2005 dataset.
| Entity type | BAM | MTL-BAM | ||||
|---|---|---|---|---|---|---|
|
|
|
|
|
|
| |
| LOC |
| 59.34 | 62.14 | 64.71 |
|
|
| WEA |
| 85.15 |
| 82.69 |
| 84.31 |
| GPE |
| 83.15 | 84.20 | 84.14 |
|
|
| PER |
| 87.74 | 88.25 | 86.10 |
|
|
| FAC | 69.50 | 72.06 | 70.76 |
|
|
|
| ORG |
| 76.82 |
| 84.04 |
| 80.78 |
| VEH | 75.64 | 69.75 | 72.58 |
|
|
|
| Overall |
| 85.36 | 84.79 | 84.23 |
|
|
Comparison results between MTL-BAM model and other entity recognition models on three nested datasets.
| Model | GENIA | ACE2004 | ACE2005 | ||||||
|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
| |
| Katuyar and Cardie [ | 79.8 | 68.2 | 73.6 | 73.6 | 71.8 | 72.7 | 70.6 | 70.4 | 70.5 |
| Ju et al. [ | 78.5 | 71.3 | 74.7 | — | — | — | 74.2 | 70.3 | 72.2 |
| Zheng et al. [ | 74.5 | 75.6 | 75.0 | — | — | — | — | — | — |
| Wang and Lu [ | 77.0 | 73.3 | 75.1 | 78.0 | 72.4 | 75.1 | 76.8 | 72.3 | 74.5 |
| Yi et al. [ | — | — | 76.2 | — | — | 84.7 | — | — | 82.9 |
| Sohrab and miwa [ | 93.2 | 64.0 | 77.1 | — | — | — | — | — | — |
| Jana et al. [ | — | — | 78.3 | — | — | 84.4 | — | — | 84.3 |
| MTL-BAM | 80.62 | 80.68 |
| 84.88 | 85.78 |
| 84.23 | 86.15 |
|
Comparison results of MTL-BAM and BAM models on JNLPBA dataset.
| Model |
|
|
|
|---|---|---|---|
| BAM |
| 79.12 | 75.84 |
| MTL-BAM | 72.42 |
|
|
Comparison results between MTL-BAM model and other entity recognition models on JNLPBA dataset.
| Model |
|
|---|---|
| Ju et al. [ | 70.1 |
| Zheng et al. [ | 73.6 |
| Wang et al. [ | 73.52 |
| Song et al. [ | 75.04 |
| MTL-BAM |
|
Comparison results of MTL-BAM and BAM models on CoNLL2003 dataset.
| Model |
|
|
|
|---|---|---|---|
| BAM |
| 93.41 | 93.19 |
| MTL-BAM | 92.72 |
|
|
Comparison results between MTL-BAM model and other entity recognition models on CoNLL2003 dataset.
| Model |
|
|---|---|
| Lample et al. [ | 90.94 |
| Strubell et al. [ | 90.7 |
| Devlin et al. [ | 92.8 |
| Akbik et al. [ | 93.09 |
| MTL-BAM |
|