Zengjian Liu, Xiaolong Wang, Qingcai Chen, Buzhou Tang, Hua Xu.
Abstract
BACKGROUND: The goal of temporal indexing is to assign an occurrence time or time interval to each medical entity in clinical notes, so that all medical entities can be placed on a unified timeline. This supports the understanding of clinical notes and downstream applications of medical entities. Several shared tasks on temporal relations for medical entities in English clinical notes have been organized in recent years, such as the 2012 i2b2 NLP challenge and the 2015 and 2016 Clinical TempEval challenges. Many heuristic rule-based and machine learning-based systems have been developed for these tasks. More recently, deep neural network models have shown great potential on many problems, including relation classification.
Keywords: Clinical notes; Convolutional neural network; Medical entity; Recurrent neural network; Temporal indexing
Year: 2019 PMID: 30700331 PMCID: PMC6354334 DOI: 10.1186/s12911-019-0735-x
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 The main flow for temporal indexing
Sections and their corresponding occurrence times in the Chinese clinical notes
| Section name | Occurrence time | Relation |
|---|---|---|
| Chief complaint, History of present illness, Past medical history, Personal history, Conditions in admission | Admission | Before |
| Physical examination, Assistant examination, Preliminary diagnosis, Diagnosis on admission | Admission | Simultaneous |
| Diagnosis and treatment | Admission | After |
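The section table above can be read as a lookup from section name to a default time anchor and relation. The following is a minimal sketch of such a lookup, not the authors' implementation: the names `SECTION_TIME_RULES` and `default_time_for_section` are hypothetical, and how the mapping is actually applied in the pipeline is not stated in this excerpt.

```python
from typing import Optional, Tuple

# Transcribed from the table above: section name -> (anchor time, relation).
# The dictionary and function names are illustrative assumptions.
SECTION_TIME_RULES = {
    "Chief complaint": ("Admission", "Before"),
    "History of present illness": ("Admission", "Before"),
    "Past medical history": ("Admission", "Before"),
    "Personal history": ("Admission", "Before"),
    "Conditions in admission": ("Admission", "Before"),
    "Physical examination": ("Admission", "Simultaneous"),
    "Assistant examination": ("Admission", "Simultaneous"),
    "Preliminary diagnosis": ("Admission", "Simultaneous"),
    "Diagnosis on admission": ("Admission", "Simultaneous"),
    "Diagnosis and treatment": ("Admission", "After"),
}


def default_time_for_section(section: str) -> Optional[Tuple[str, str]]:
    """Return (anchor time, relation) for a section, or None if unlisted."""
    return SECTION_TIME_RULES.get(section.strip())


print(default_time_for_section("Physical examination"))
```

A section that does not appear in the table returns `None`, so a caller has to decide separately how to handle it.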
Fig. 2 The main flow for candidate selection
Fig. 3 An example for the candidate selection
Fig. 4 Overview architecture of RNN-CNN
Fig. 5 Examples for the annotation of temporal indexing
Statistics of the annotated dataset for temporal indexing of medical entity
| | #Record | #ME | #TE | #Time node | #Time interval |
|---|---|---|---|---|---|
| Training set | 413 | 9348 | 2982 | 6461 | 2887 |
| Test set | 150 | 3263 | 1024 | 2131 | 1132 |
| Total | 563 | 12,611 | 4006 | 8592 | 4019 |
Hyper-parameters used for the neural network
| Hyper-parameter | Value |
|---|---|
| Dimension of word representation | 50 |
| Dimension of position representation | 30 |
| Dimension of feature representation | 20 |
| Size of LSTM unit | 50 |
| Size of convolution filter | 3/5/7 |
| Number of convolution filters | 100 |
| Max length of the sentence | 100 |
| Dropout probability | 0.5 |
| Batch size | 32 |
| Training epochs | 20 |
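For readers who want to reuse these settings, the values in the table can be collected into a single configuration object. This is only a sketch: the class name `HyperParams` and all field names are hypothetical, and reading "3/5/7" as three separate filter widths is an assumption. The table does not say whether the 100 filters are per width or in total, so the sketch records the number without interpreting it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperParams:
    """Values copied from the hyper-parameter table; field names are illustrative."""
    word_dim: int = 50                 # dimension of word representation
    position_dim: int = 30             # dimension of position representation
    feature_dim: int = 20              # dimension of feature representation
    lstm_units: int = 50               # size of LSTM unit
    conv_filter_sizes: Tuple[int, ...] = (3, 5, 7)  # assumed: three filter widths
    num_conv_filters: int = 100        # per width or total is not stated
    max_sentence_len: int = 100
    dropout: float = 0.5
    batch_size: int = 32
    epochs: int = 20


cfg = HyperParams()
print(cfg)
```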
The results of our candidate selection method
| | Selected TEs | Selected MEs | Coverage scale |
|---|---|---|---|
| Training set | 31,156 | 8995 | 96.22% |
| Test set | 10,233 | 3171 | 97.18% |
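The reported coverage figures are consistent with dividing the number of selected MEs by the total number of annotated MEs in the dataset statistics table (9348 for training, 3263 for test). That reading is an inference from the numbers, not a definition given in this excerpt. The function name `coverage_percent` is hypothetical.

```python
def coverage_percent(selected_mes: int, total_mes: int) -> float:
    """Share of annotated medical entities kept by candidate selection, in percent."""
    if total_mes <= 0:
        raise ValueError("total_mes must be positive")
    return 100.0 * selected_mes / total_mes


# Selected MEs come from the candidate selection table,
# totals from the #ME column of the dataset statistics table.
train_coverage = coverage_percent(8995, 9348)
test_coverage = coverage_percent(3171, 3263)

print(f"training: {train_coverage:.2f}%  test: {test_coverage:.2f}%")
```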
The performance of various methods for the temporal relation classification (%)
| Method | Precision | Recall | F1-score |
|---|---|---|---|
| SVM | 70.13 | 73.18 | 71.63 |
| CNN | 70.68 | 79.56 | 74.86 |
| RNN | 67.10 | 84.68 | 74.87 |
| RNN-CNN | 71.41 | 81.15 | 75.97 |
| Merged | 74.97 | 82.07 | 78.36 |
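The F1-score column follows the usual harmonic mean of precision and recall, $F_1 = 2PR/(P+R)$. As a quick check, the sketch below recomputes F1 from the rounded precision and recall values in the table. Because the inputs are already rounded, the recomputed values can differ from the reported ones by about 0.01. The names `f1_score` and `REPORTED` are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above, in percent.
REPORTED = {
    "SVM": (70.13, 73.18, 71.63),
    "CNN": (70.68, 79.56, 74.86),
    "RNN": (67.10, 84.68, 74.87),
    "RNN-CNN": (71.41, 81.15, 75.97),
    "Merged": (74.97, 82.07, 78.36),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in REPORTED.items()}

for name, (p, r, reported) in REPORTED.items():
    print(f"{name:8s} reported={reported:.2f} recomputed={recomputed[name]:.2f}")
```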
The accuracies of various methods for the temporal indexing (%)
| Method | Relaxed | Strict |
|---|---|---|
| Rule | 86.42 | 67.36 |
| SVM | 85.53 | 69.32 |
| CNN | 86.33 | 69.84 |
| RNN | 87.13 | 71.38 |
| RNN-CNN | 88.26 | 71.96 |
| Merged | 88.57 | 73.31 |
The performances of various methods on the time and time interval indexing (%)
| Method | Time node Precision | Time node Recall | Time node F1-score | Time interval Precision | Time interval Recall | Time interval F1-score |
|---|---|---|---|---|---|---|
| Rule | 69.62 | 79.35 | 74.17 | 60.79 | 44.79 | 51.58 |
| SVM | 65.79 | 83.39 | 73.55 | 86.30 | 42.84 | 57.26 |
| CNN | 72.02 | 79.35 | 75.50 | 64.26 | 51.94 | 57.45 |
| RNN | 74.13 | 78.41 | 76.21 | 65.21 | 58.13 | 61.47 |
| RNN-CNN | 74.80 | 78.84 | 76.76 | 65.68 | 59.01 | 62.17 |
| Merged | 73.38 | 81.23 | 77.10 | 73.12 | 58.39 | 64.93 |