Xingwang Li1, Yijia Zhang2, Faiz Ul Islam1, Deshi Dong3, Hao Wei1, Mingyu Lu1.
Abstract
BACKGROUND: Clinical notes are documents that contain detailed information about the health status of patients, and they are generally accompanied by medical codes. However, manual diagnosis is costly and error-prone. Moreover, large clinical diagnosis datasets are susceptible to noisy labels caused by erroneous manual annotation. Machine learning has therefore been used to perform automatic diagnosis. Previous state-of-the-art (SOTA) models used convolutional neural networks to build document representations for predicting medical codes. However, the distribution of codes over clinical notes is usually long-tailed, and most models fail to deal with noise during code allocation. A denoising mechanism and long-tailed classification are therefore key to automated coding at scale.
Keywords: Attention mechanism; Automatic diagnosis; Denoising model; Joint learning; Multi-label classification
Year: 2021 PMID: 34903164 PMCID: PMC8667397 DOI: 10.1186/s12859-021-04520-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Examples of ICD-9 codes (011-016)
| ICD code | Description |
|---|---|
| 011 | Tuberculosis |
| 012 | Respiratory tuberculosis |
| 013 | Tuberculosis of the meninges and central nervous system |
| 014 | Bowel and intestinal membrane gland tuberculosis |
| 015 | Bone and joint tuberculosis |
| 016 | Reproductive urinary system tuberculosis |
Fig. 1 Example of noise interference
Fig. 2 The distribution of ICD codes on MIMIC-III
Fig. 3 Schematic overview of JLAN
Fig. 4 The scheme of the joint learning mechanism
Statistics of the datasets
| Dataset | Vocab | Train | Valid | Test |
|---|---|---|---|---|
| MIMIC-III-50 | 59,168 | 8067 | 1574 | 1730 |
| MIMIC-III | 140,795 | 47,724 | 1632 | 3372 |
Performance comparison of using different T-loss in JLAN
| Config | MIMIC-III-full Micro-F1 | MIMIC-III-full Macro-F1 | MIMIC-III-50 Micro-F1 | MIMIC-III-50 Macro-F1 |
|---|---|---|---|---|
| T-loss=0.05 | 0.542 | 0.061 | 0.623 | 0.571 |
| T-loss=0.1 | 0.557 | 0.068 | 0.626 | 0.574 |
| T-loss=0.15 | 0.556 | 0.068 | 0.627 | 0.573 |
| T-loss=0.2 | 0.547 | 0.064 | 0.625 | 0.573 |
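The tables report both Micro-F1 and Macro-F1. As a rough illustration of how the two differ for multi-label coding, the sketch below computes both from sets of gold and predicted codes. It assumes one common definition (Macro-F1 as the unweighted mean of per-label F1); the paper's own evaluation script may differ, for example by averaging precision and recall first. The function names and the toy labels are hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for multi-label predictions."""
    # Per-label [tp, fp, fn] counts accumulated over all documents.
    counts = {label: [0, 0, 0] for label in labels}
    for g, p in zip(gold, pred):
        for label in labels:
            if label in p and label in g:
                counts[label][0] += 1
            elif label in p:
                counts[label][1] += 1
            elif label in g:
                counts[label][2] += 1

    # Micro: pool the counts over all labels, then compute one F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1_from_counts(tp, fp, fn)

    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(labels)
    return micro, macro


# Toy data (made up): one frequent code that is always predicted correctly,
# and two rare codes that are never predicted.
labels = ["401.9", "428.0", "011"]
gold = [{"401.9"}, {"401.9", "428.0"}, {"401.9"}, {"011"}]
pred = [{"401.9"}, {"401.9"}, {"401.9"}, set()]
micro, macro = micro_macro_f1(gold, pred, labels)
```

In this toy case the frequent code dominates the micro average, while the two missed rare codes pull the macro average down. That pattern is consistent with the large gap between Micro-F1 and Macro-F1 on MIMIC-III-full in the table above, where the label distribution is long-tailed.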
Performance comparison of using different residual blocks in JLAN
| Config | MIMIC-III-full Micro-F1 | MIMIC-III-full Macro-F1 | MIMIC-III-50 Micro-F1 | MIMIC-III-50 Macro-F1 |
|---|---|---|---|---|
| P = 1 | 0.543 | 0.062 | 0.637 | 0.585 |
| P = 2 | 0.541 | 0.059 | 0.597 | 0.558 |
| P = 3 | 0.540 | 0.059 | 0.582 | 0.524 |
The performance of the JLAN model and baseline models on the MIMIC-III-50 test set
| Model | AUC (Macro) | AUC (Micro) | F1 (Macro) | F1 (Micro) | P@5 | R@5 |
|---|---|---|---|---|---|---|
| CNN | 87.6 | 90.7 | 57.6 | 62.5 | 62.0 | – |
| BiGRU | 82.8 | 86.8 | 48.4 | 54.9 | 59.1 | – |
| LEAM | 88.1 | 91.2 | 54.0 | 61.9 | 61.2 | – |
| CAML | 87.5 | 90.9 | 53.2 | 61.4 | 60.9 | – |
| DR-CAML | 88.4 | 91.6 | 57.6 | 63.3 | 61.8 | – |
| MSATT-KG | 91.4 | 93.6 | 63.8 | 68.4 | 64.4 | – |
| MultiResCNN | 89.9 | 92.8 | 60.6 | 67.0 | 64.1 | 62.1 |
| JLAN |
Fig. 5 Comparison of JLAN and baseline model
The performance of JLAN and the baseline models on the MIMIC-III-full test set
| Model | AUC (Macro) | AUC (Micro) | F1 (Macro) | F1 (Micro) | P@15 | P@8 |
|---|---|---|---|---|---|---|
| LR | 56.1 | 93.7 | 1.1 | 27.2 | – | 54.2 |
| CNN | 80.6 | 96.9 | 4.2 | 41.9 | – | 58.1 |
| BiGRU | 82.2 | 97.1 | 3.8 | 41.7 | | 58.5 |
| CAML | 89.5 | 98.6 | 8.8 | 53.9 | – | 70.9 |
| DR-CAML | 89.7 | 98.5 | 8.6 | 52.9 | – | 69.0 |
| MSATT-KG | 91.0 | | 9.0 | 55.3 | – | 72.8 |
| MultiResCNN | 91.0 | 98.6 | 8.5 | 55.2 | | 73.4 |
| JLAN | 98.8 | 57.9 |
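The P@5, P@8 and P@15 columns are precision-at-k scores: the fraction of the k highest-scored codes for a note that appear in its gold code set, averaged over notes. A minimal sketch under that standard definition follows. The function names, score dictionaries and toy codes are hypothetical, and the paper's evaluation code may handle ties or missing scores differently.

```python
from typing import Dict, List, Set


def precision_at_k(scores: Dict[str, float], gold: Set[str], k: int) -> float:
    """Fraction of the k highest-scored codes that are in the gold set."""
    # Rank codes by predicted score, highest first, and keep the top k.
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for code in top_k if code in gold) / k


def mean_precision_at_k(
    all_scores: List[Dict[str, float]], all_gold: List[Set[str]], k: int
) -> float:
    """Average precision-at-k over all documents."""
    values = [precision_at_k(s, g, k) for s, g in zip(all_scores, all_gold)]
    return sum(values) / len(values)


# Toy example (made-up scores): two notes, four candidate codes each.
all_scores = [
    {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.05},
    {"a": 0.2, "b": 0.7, "c": 0.6, "d": 0.1},
]
all_gold = [{"a", "c"}, {"b", "c"}]
p_at_2 = mean_precision_at_k(all_scores, all_gold, k=2)
```

For the first note the top two codes are `a` and `b`, of which only `a` is correct; for the second the top two are `b` and `c`, both correct, so the mean is 0.75.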
Fig. 6 Result of the ablation experiment. 'L', 'S' and 'J' denote label attention, self-attention and joint learning, respectively
Fig. 7 Results of the joint learning experiment. The blue and orange rectangles represent training with and without the joint learning mechanism, respectively
Fig. 8 Effect of the denoising model
Fig. 9 Visualization of self-attention mechanism on patient-A
Fig. 10 Visualization of label attention mechanism on patient-A