| Literature DB >> 35240782 |
Chaofan Li, Kai Ma.
Abstract
Named entities are the main carriers of relevant medical knowledge in Electronic Medical Records (EMRs). Because of the specific structure of the Chinese language, clinical EMRs suffer from word-segmentation ambiguity and polysemy, so a Clinical Named Entity Recognition (CNER) model based on multi-head self-attention combined with a BiLSTM neural network and Conditional Random Fields (CRF) is proposed. First, a pre-trained language model organically combines character vectors and word vectors for the text sequences of the original dataset. The sequences are then fed in parallel into a multi-head self-attention module and a BiLSTM module, and the outputs of the two modules are concatenated to obtain multi-level information such as contextual features and feature-association weights. Finally, entity labeling is performed by the CRF layer. Multiple comparison experiments show that the proposed model structure is reasonable and robust, and that it effectively improves Chinese CNER. The model extracts multi-level, more comprehensive text features and compensates for the loss of long-distance dependencies, giving better applicability and recognition performance.
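The multi-head self-attention branch of the architecture described in the abstract can be sketched as below. This is a minimal NumPy illustration of scaled dot-product multi-head self-attention, not the authors' implementation; all function names, weight shapes, and dimensions are assumptions for illustration.

```python
import numpy as np

def multi_head_self_attention(X, Wq, Wk, Wv, n_heads):
    """Scaled dot-product multi-head self-attention over one sequence.

    X:          (seq_len, d_model) input embeddings (e.g. fused char/word vectors)
    Wq, Wk, Wv: (d_model, d_model) projection matrices (hypothetical parameters)
    n_heads:    number of attention heads; d_model must be divisible by n_heads
    Returns:    (seq_len, d_model) attended representation.
    """
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    # Project inputs to queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        q, k, v = Q[:, s], K[:, s], V[:, s]
        # Scaled dot-product scores, then a numerically stable softmax
        # over the key dimension.
        scores = q @ k.T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        heads.append(weights @ v)
    # Concatenate per-head outputs back to d_model, mirroring how the paper
    # concatenates the attention and BiLSTM branch outputs before the CRF.
    return np.concatenate(heads, axis=-1)
```

In the full model, this attention output would be concatenated with the BiLSTM hidden states and the combined features passed to a CRF layer for entity labeling.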
Keywords: bi-directional long-short term memory ; clinical named entity recognition ; conditional random fields ; electronic medical records ; multi-head self-attention
Year: 2022 PMID: 35240782 DOI: 10.3934/mbe.2022103
Source DB: PubMed Journal: Math Biosci Eng ISSN: 1547-1063 Impact factor: 2.080