Kunli Zhang, Hongchao Ma, Yueshu Zhao, Hongying Zan, Lei Zhuang.
Abstract
Obstetric electronic medical records (EMRs) contain massive amounts of medical data and health information. Information extraction from obstetric EMRs, and diagnosis assistants built on them, are of great significance in improving the fertility level of the population. The admitting diagnosis in the first course record of the EMR is reasoned from various sources, such as chief complaints, auxiliary examinations, and physical examinations. Based on an analysis of obstetric EMRs, this paper treats the diagnosis assistant as a multilabel classification task. Latent Dirichlet allocation (LDA) topics and word vectors are used as features, and four multilabel classification methods, BP-MLL (backpropagation multilabel learning), RAkEL (RAndom k labELsets), MLkNN (multilabel k-nearest neighbor), and CC (chain classifier), are used to build the diagnosis assistant models. Experimental results on real cases show that BP-MLL achieves the best performance, with an average precision of up to 0.7413 ± 0.0100 when the number of label sets and the word dimensions are 71 and 100, respectively. The output of the diagnosis assistant can be introduced as a supplementary learning method for medical students. Additionally, the method can be used not only for obstetric EMRs but also for other medical records.
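To make the multilabel framing concrete, the following minimal sketch shows how a set of admitting diagnoses per record can be turned into a binary target vector, one position per diagnosis. The diagnosis names, the `records` list, and the helper names `build_label_index` and `to_multi_hot` are illustrative assumptions, not taken from the paper's data or code.

```python
from typing import Dict, Iterable, List, Set


def build_label_index(diagnosis_sets: Iterable[Set[str]]) -> Dict[str, int]:
    """Assign each distinct diagnosis a column index (alphabetical order)."""
    labels = sorted({d for ds in diagnosis_sets for d in ds})
    return {label: i for i, label in enumerate(labels)}


def to_multi_hot(diagnoses: Set[str], index: Dict[str, int]) -> List[int]:
    """Encode one record's diagnoses as a 0/1 vector over all labels."""
    vec = [0] * len(index)
    for d in diagnoses:
        vec[index[d]] = 1
    return vec


# Hypothetical records: each EMR carries one or more diagnoses.
records = [
    {"gestational diabetes", "anemia"},
    {"anemia"},
    {"premature rupture of membranes", "gestational diabetes"},
]

index = build_label_index(records)
targets = [to_multi_hot(r, index) for r in records]
```

Each row of `targets` is what a multilabel classifier would be trained to predict from the record's features (LDA topics or word vectors in the paper's setting).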
Year: 2018 PMID: 29666671 PMCID: PMC5832137 DOI: 10.1155/2018/7273451
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1: An example of the first course of disease record.
Figure 2: The frequency distribution of diagnoses.
Figure 3: The workflow of the diagnosis assistant.
Results with |L2| = 233, K = 120, and T = 100.
| Method | Feature | HL↓ | C↓ | OE↓ | RL↓ | AP↑ |
|---|---|---|---|---|---|---|
| RAkEL | LDA | 0.0085 ± 0.0002 | 124.5190 ± 2.9857 | 0.3479 ± 0.0192 | 0.2874 ± 0.0092 | 0.5727 ± 0.0090 |
| RAkEL | Word vector | 0.0078 ± 0.0002 | 127.1671 ± 2.4166 | 0.2902 ± 0.0173 | 0.2984 ± 0.0104 | 0.5906 ± 0.0109 |
| MLkNN | LDA | 0.0078 ± 0.000 | 15.5416 ± 0.7277 | 0.2425 ± 0.0127 | 0.0292 ± 0.0009 | 0.6571 ± 0.0087 |
| MLkNN | Word vector | | 13.6120 ± 0.6596 | | 0.0240 ± 0.0007 | |
| CC | LDA | 0.0093 ± 0.0002 | 109.6586 ± 2.9200 | 0.4908 ± 0.0150 | 0.2430 ± 0.0078 | 0.5073 ± 0.0097 |
| CC | Word vector | 0.0088 ± 0.0001 | 90.2732 ± 2.8796 | 0.4427 ± 0.0109 | 0.1960 ± 0.0070 | 0.5408 ± 0.0074 |
| BP-MLL | LDA | 0.0341 ± 0.0058 | 14.6960 ± 1.0139 | 0.2426 ± 0.0136 | 0.0276 ± 0.0020 | 0.6264 ± 0.0114 |
| BP-MLL | Word vector | 0.0244 ± 0.0012 | | 0.2431 ± 0.0136 | | 0.6588 ± 0.0091 |
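The column abbreviations are not expanded in this record. Assuming they denote the five metrics commonly reported for multilabel ranking (HL: Hamming loss, C: coverage, OE: one-error, RL: ranking loss, AP: average precision, with arrows marking whether lower or higher is better), a minimal sketch of how each could be computed is given below. The function names and the toy data are illustrative, and the paper's exact definitions may differ in detail.

```python
from typing import List, Sequence, Set


def ranks(scores: Sequence[float]) -> List[int]:
    """Rank labels by descending score; the highest score gets rank 1."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    rank = [0] * len(scores)
    for position, label in enumerate(order, start=1):
        rank[label] = position
    return rank


def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of label slots where prediction and truth disagree."""
    total = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return total / (len(true_sets) * n_labels)


def one_error(true_sets, scores):
    """Fraction of examples whose top-ranked label is not a true label."""
    misses = 0
    for t, s in zip(true_sets, scores):
        top = max(range(len(s)), key=lambda j: s[j])
        misses += int(top not in t)
    return misses / len(true_sets)


def coverage(true_sets, scores):
    """Average depth in the ranking needed to cover all true labels, minus 1."""
    total = 0
    for t, s in zip(true_sets, scores):
        r = ranks(s)
        total += max(r[j] for j in t) - 1
    return total / len(true_sets)


def ranking_loss(true_sets, scores):
    """Average fraction of (true, false) label pairs that are mis-ordered."""
    total = 0.0
    for t, s in zip(true_sets, scores):
        irrelevant = [j for j in range(len(s)) if j not in t]
        bad = sum(1 for a in t for b in irrelevant if s[a] <= s[b])
        total += bad / (len(t) * len(irrelevant))
    return total / len(true_sets)


def average_precision(true_sets, scores):
    """Average, over true labels, of the precision at each true label's rank."""
    total = 0.0
    for t, s in zip(true_sets, scores):
        r = ranks(s)
        per_label = [sum(1 for k in t if r[k] <= r[j]) / r[j] for j in t]
        total += sum(per_label) / len(t)
    return total / len(true_sets)


# Toy data: 2 examples, 4 labels. Every example is assumed to have at least
# one true and one false label, otherwise some metrics are undefined.
true_sets: List[Set[int]] = [{0, 2}, {1}]
scores = [[0.9, 0.2, 0.4, 0.1], [0.3, 0.6, 0.7, 0.1]]
pred_sets = [{j for j, v in enumerate(s) if v > 0.5} for s in scores]
```

Hamming loss is computed from thresholded label sets, while the other four depend only on how the scores rank the labels, which is one reason a method can look weak on HL and strong on the ranking-based columns.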
Results with |L3| = 153, K = 120, and T = 100.
| Method | Feature | HL↓ | C↓ | OE↓ | RL↓ | AP↑ |
|---|---|---|---|---|---|---|
| RAkEL | LDA | 0.0120 ± 0.0003 | 67.4355 ± 1.5205 | 0.3044 ± 0.0125 | 0.2317 ± 0.0057 | 0.6205 ± 0.0074 |
| RAkEL | Word vector | 0.0114 ± 0.0003 | 74.6069 ± 1.7410 | 0.2636 ± 0.0147 | 0.2527 ± 0.0088 | 0.6228 ± 0.0068 |
| MLkNN | LDA | 0.0113 ± 0.0002 | 11.3434 ± 0.5319 | 0.2511 ± 0.0074 | 0.0347 ± 0.0016 | 0.6650 ± 0.0065 |
| MLkNN | Word vector | | 11.7522 ± 0.4557 | | 0.0318 ± 0.0015 | |
| CC | LDA | 0.0136 ± 0.0003 | 71.7498 ± 3.5310 | 0.4942 ± 0.0107 | 0.2533 ± 0.0142 | 0.5108 ± 0.0098 |
| CC | Word vector | 0.0134 ± 0.0002 | 61.0079 ± 1.8989 | 0.4427 ± 0.0109 | 0.2050 ± 0.0075 | 0.5430 ± 0.0075 |
| BP-MLL | LDA | 0.0362 ± 0.0043 | 10.2577 ± 0.5443 | 0.2531 ± 0.0087 | 0.0302 ± 0.0022 | 0.6522 ± 0.0149 |
| BP-MLL | Word vector | 0.0276 ± 0.0011 | | 0.2417 ± 0.0140 | | 0.6751 ± 0.0091 |
Results with |L4| = 71, K = 120, and D = 100.
| Method | Feature | HL↓ | C↓ | OE↓ | RL↓ | AP↑ |
|---|---|---|---|---|---|---|
| RAkEL | LDA | 0.0244 ± 0.0004 | 26.3255 ± 1.1150 | 0.2799 ± 0.0123 | 0.1870 ± 0.0081 | 0.6575 ± 0.0090 |
| RAkEL | Word vector | 0.0237 ± 0.0004 | 29.9007 ± 0.8173 | 0.2391 ± 0.0113 | 0.2074 ± 0.0071 | 0.6595 ± 0.0082 |
| MLkNN | LDA | 0.0241 ± 0.0003 | 9.2824 ± 0.2916 | 0.2498 ± 0.0112 | 0.0631 ± 0.0026 | 0.6697 ± 0.0085 |
| MLkNN | Word vector | | 9.0997 ± 0.4973 | | 0.0547 ± 0.0033 | 0.7356 ± 0.0088 |
| CC | LDA | 0.0288 ± 0.0006 | 34.4526 ± 1.4447 | 0.4850 ± 0.0220 | 0.2729 ± 0.0130 | 0.5228 ± 0.0125 |
| CC | Word vector | 0.0285 ± 0.0004 | 30.4830 ± 0.7443 | 0.4427 ± 0.0109 | 0.2301 ± 0.0068 | 0.5509 ± 0.0069 |
| BP-MLL | LDA | 0.0458 ± 0.0046 | 7.4636 ± 0.4216 | 0.2521 ± 0.0128 | 0.0462 ± 0.0030 | 0.7081 ± 0.0098 |
| BP-MLL | Word vector | 0.0349 ± 0.0014 | | 0.2325 ± 0.0131 | | 0.7413 ± 0.0100 |
Figure 4: Experimental results for different numbers of topics.
Figure 5: Experimental results for different word dimensions.