Young-Min Kim, Tae-Hoon Lee.
Abstract
BACKGROUND: While clinical entity recognition mostly targets electronic health records (EHRs), there is also demand for handling other types of text data. Automatic medical diagnosis is one example of a new application that uses a different data source. In this work, we are interested in extracting Korean clinical entities from a new medical dataset, which is completely different from EHRs. The dataset is collected from an online QA site for medical diagnosis. Bidirectional Encoder Representations from Transformers (BERT), one of the best-performing language representation models, is used to extract the entities.
Keywords: BERT; Clinical entity recognition; Diagnosis text; Korean
Year: 2020 PMID: 32998724 PMCID: PMC7526093 DOI: 10.1186/s12911-020-01241-8
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
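The abstract describes fine-tuning BERT as a token classifier over clinical entity labels. Below is a minimal sketch of that setup using the Hugging Face transformers API; the multilingual checkpoint, the Korean example sentence, and the single training step are illustrative assumptions, not the authors' exact configuration (only the DZ/SX/BP label set comes from the paper's tables).

```python
# Minimal sketch of fine-tuning BERT as a token classifier for clinical NER.
# The checkpoint, example sentence, and training step are illustrative
# assumptions; the paper's exact Korean BERT model and setup may differ.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# IOB2 tags for the three entity types defined in the paper: DZ, SX, BP
LABELS = ["O", "B-DZ", "I-DZ", "B-SX", "I-SX", "B-BP", "I-BP"]
label2id = {l: i for i, l in enumerate(LABELS)}
id2label = {i: l for l, i in label2id.items()}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=len(LABELS), id2label=id2label, label2id=label2id,
)

# One toy training step on a single sentence ("my head hurts and I feel dizzy")
enc = tokenizer("머리가 아프고 어지러워요", return_tensors="pt")
labels = torch.full(enc["input_ids"].shape, label2id["O"])  # placeholder gold tags
loss = model(**enc, labels=labels).loss
loss.backward()  # in practice, step an optimizer such as AdamW over many batches
```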
Fig. 1 Architecture of a dialogue system
Fig. 2 BERT architecture and fine-tuning
Fig. 3 BERT input representation (modified image from [8])
Table 1 Statistics of the QA dataset for diagnosis
| neurology | 126 | 630 |
| neurosurgery | 156 | 780 |
| internal medicine | 131 | 655 |
| otorhinolaryngology | 123 | 615 |
| total | 536 | 2189 |
Table 2 Entity definition for clinical NER
| Entity (tag) | Definition |
| Disease (DZ) | Disease name; used for the final diagnosis |
| Symptom (SX) | Symptom that can be detected by users |
| Body Part (BP) | Body part where the symptom appears |
Table 3 Characteristics of the annotated diagnosis data
| | DZ | SX | BP |
| # annotated unique entities | 297 | 228 | 199 |
| # annotated entities | 915 | 1,267 | 1,010 |
Fig. 4 An example of a tokenized sentence in the diagnosis dataset. The columns correspond to the output labels, the input tokens, and the English translation of the input
Fig. 5 An example of the WordPiece input representation with three different labeling strategies (WP, IOB, TRAD) used for evaluation. BERT labels are used for training, whereas WP and IOB labels are used only for evaluation
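The labeling strategies in Fig. 5 hinge on how word-level IOB tags are projected onto WordPiece sub-tokens. The sketch below shows one common projection (the first piece keeps the word's tag, continuation pieces become I- tags); it illustrates the general technique rather than reproducing the paper's exact WP/IOB/TRAD mappings, and the example sentence and tags are hypothetical.

```python
# Illustrative projection of word-level IOB tags onto WordPiece sub-tokens:
# the first piece keeps the word's tag, continuation pieces become I- tags.
# This shows the general technique only; the paper's exact WP/IOB/TRAD
# mappings are those of Fig. 5, and the example below is hypothetical.
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")

def align_labels(words, word_tags):
    """Expand one IOB tag per word into one tag per WordPiece."""
    pieces, piece_tags = [], []
    for word, tag in zip(words, word_tags):
        for i, sub in enumerate(tokenizer.tokenize(word)):
            pieces.append(sub)
            if i == 0 or tag == "O":
                piece_tags.append(tag)
            else:
                # continuation piece inside an entity: B-X becomes I-X
                piece_tags.append("I-" + tag.split("-", 1)[1])
    return pieces, piece_tags

words = ["만성", "두통이", "있습니다"]  # "I have chronic headaches"
tags = ["B-SX", "I-SX", "O"]           # hypothetical gold annotation
print(align_labels(words, tags))
```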
Fig. 6 Bi-LSTM-CRF architecture
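For reference, a compact Bi-LSTM-CRF tagger of the kind shown in Fig. 6 can be written with the third-party pytorch-crf package (pip install pytorch-crf); the dimensions, vocabulary size, and toy batch below are placeholders, not the paper's settings.

```python
# Minimal Bi-LSTM-CRF tagger in the spirit of the baseline in Fig. 6.
# Uses the third-party pytorch-crf package; all dimensions are illustrative.
import torch
import torch.nn as nn
from torchcrf import CRF

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden // 2, bidirectional=True,
                            batch_first=True)
        self.proj = nn.Linear(hidden, num_tags)  # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, tags=None, mask=None):
        emissions = self.proj(self.lstm(self.emb(tokens))[0])
        if tags is not None:  # training: negative log-likelihood under the CRF
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths

model = BiLSTMCRF(vocab_size=10000, num_tags=7)  # 7 = O + B/I for DZ, SX, BP
x = torch.randint(1, 10000, (2, 12))             # toy batch of token ids
y = torch.zeros(2, 12, dtype=torch.long)         # toy gold tags
loss = model(x, y)
loss.backward()
```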
Table 4 NER performance comparison results of BERT and bi-LSTM-CRF on the Exobrain dataset
| Precision | 0.93 | 0.93 | 0.90 | 0.89 |
| Recall | 0.94 | 0.94 | 0.87 | 0.86 |
| F1 | 0.94 | 0.93 | 0.89 | 0.88 |
Table 5 NER performance comparison results of BERT and bi-LSTM-CRF on the diagnosis dataset
| Precision | 0.82 | 0.83 | 0.81 | 0.82 |
| Recall | 0.85 | 0.85 | 0.78 | 0.79 |
| F1 | 0.83 | 0.84 | 0.79 | 0.81 |
Table 6 NER results of BERT on the diagnosis dataset evaluated with BERT labels and WP labels
| Precision | 0.72 | 0.83 | 0.70 | 0.82 |
| Recall | 0.72 | 0.84 | 0.73 | 0.84 |
| F1 | 0.72 | 0.84 | 0.72 | 0.83 |
Table 7 Detailed evaluation results with BERT for the diagnosis dataset
| Tag | P | R | F1 | P | R | F1 | P | R | F1 |
| B-DZ | 0.85 | 0.88 | 0.87 | 0.85 | 0.89 | 0.87 | 0.85 | 0.88 | 0.87 |
| I-DZ | 0.83 | 0.82 | 0.82 | 0.83 | 0.84 | 0.83 | 0.85 | 0.88 | 0.87 |
| B-SX | 0.86 | 0.85 | 0.86 | 0.86 | 0.84 | 0.84 | 0.86 | 0.85 | 0.86 |
| I-SX | 0.53 | 0.54 | 0.54 | 0.54 | 0.55 | 0.55 | 0.83 | 0.81 | 0.82 |
| B-BP | 0.83 | 0.86 | 0.84 | 0.78 | 0.83 | 0.81 | 0.83 | 0.86 | 0.84 |
| I-BP | 0.38 | 0.39 | 0.39 | 0.36 | 0.42 | 0.39 | 0.72 | 0.79 | 0.75 |
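Per-tag precision/recall/F1 figures like those above can be obtained by comparing predicted and gold tag sequences token by token. The sketch below uses scikit-learn's classification_report on hypothetical sequences; the authors' actual evaluation script is not part of this record.

```python
# Sketch of per-tag precision/recall/F1 evaluation over flattened token tags.
# The gold and predicted sequences here are hypothetical examples.
from sklearn.metrics import classification_report

gold = ["B-SX", "I-SX", "O", "B-BP", "O", "B-DZ", "I-DZ"]
pred = ["B-SX", "O",    "O", "B-BP", "O", "B-DZ", "I-DZ"]

print(classification_report(gold, pred, digits=2, zero_division=0))
```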
Table 8 Transfer learning results on the question data using the BERT model trained on the diagnosis (answer) dataset
| Tag | Precision | Recall | F1 |
| B-DZ | 0.87 | 0.81 | 0.84 |
| I-DZ | 0.84 | 0.85 | 0.85 |
| B-SX | 0.78 | 0.52 | 0.63 |
| I-SX | 0.19 | 0.07 | 0.10 |
| B-BP | 0.81 | 0.81 | 0.81 |
| I-BP | 0.69 | 0.26 | 0.37 |
| macro-avg | 0.70 | 0.55 | 0.62 |
| micro-avg | 0.81 | 0.67 | 0.73 |
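As a consistency check (not taken from the source), the macro-averaged F1 in Table 8 matches the harmonic mean of the macro-averaged precision and recall:

F1 = 2PR / (P + R) = (2 × 0.70 × 0.55) / (0.70 + 0.55) ≈ 0.62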