Yanlong Qiu, Wei Wang, Chengkun Wu, Zhichang Zhang.
Abstract
BACKGROUND: Cardiovascular disease (CVD) is a serious threat to human health and one of the main causes of death. Automatically predicting CVD from a patient's electronic medical record (EMR) therefore has important application value in intelligent assisted diagnosis and treatment, and is an active topic in intelligent medical research. However, existing methods based on natural language processing can only predict CVD from the whole or part of the context information of the EMR.
Keywords: Attention mechanism; CVD prediction; CVD risk factors extraction; Chinese electronic medical record; Information fusion
Year: 2022 PMID: 36241999 PMCID: PMC9569064 DOI: 10.1186/s12859-022-04963-w
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Attributes of CVD
| No. | Attributes | Description |
|---|---|---|
| 1. | Overweight/Obesity (O2) | A diagnosis of overweight or obesity |
| 2. | Hypertension | A diagnosis or history of hypertension |
| 3. | Diabetes | A diagnosis or a history of diabetes |
| 4. | Dyslipidemia | A diagnosis of dyslipidemia, hyperlipidemia or a history of hyperlipidemia |
| 5. | Chronic kidney disease (CKD) | A diagnosis of CKD |
| 6. | Atherosis | A diagnosis of atherosclerosis or atherosclerotic plaque |
| 7. | Obstructive sleep apnea syndrome (OSAS) | A diagnosis of OSAS |
| 8. | Smoking | Smoking or a patient history of smoking |
| 9. | Alcohol abuse (A2) | Alcohol abuse |
| 10. | Family history of CVD (FHCVD) | Patient has a family history of CVD or has a first-degree relative (parents, siblings, or children) who has a history of CVD |
| 11. | Age | The age of the patient |
| 12. | Gender | The gender of the patient |
Fig. 1 The main process of CVD prediction
Fig. 2 The architecture of the BiLSTM-CRF model
Fig. 3 The architecture of the RFAB model
Fig. 4 Generating the character embedding for experiments
Distribution of CVD risk factors and their occurrence times
| Risk factors | Before DHS | During DHS | After DHS | Continuing DHS | Total |
|---|---|---|---|---|---|
| O2 | 0 | 0 | 0 | 18 | 18 |
| Hypertension | 405 | 1909 | 10 | 1405 | 3729 |
| Diabetes | 60 | 57 | 13 | 877 | 1007 |
| Dyslipidemia | 4 | 287 | 6 | 75 | 372 |
| CKD | 0 | 0 | 0 | 26 | 26 |
| Atherosis | 3 | 4 | 0 | 137 | 144 |
| OSAS | 0 | 0 | 0 | 1 | 1 |
| Smoking | 8 | 0 | 0 | 500 | 508 |
| A2 | 9 | 0 | 0 | 86 | 95 |
| FHCVD | 0 | 0 | 0 | 10 | 10 |
| Age | – | – | – | – | 1859 |
| Gender | – | – | – | – | 1909 |
DHS: duration of hospital stay; “–” denotes not considered
Hyperparameters of RFAB
| Description | Value |
|---|---|
| Dimension of word embedding | 100 |
| Learning rate | 1e-3 |
| Batch size | 10 |
| Each neuron’s deactivation rate (dropout) | 0.5 |
| Decay rate for | 0.99 |
| Number of decay steps | 500 |
| Each BiLSTM’s hidden unit quantity | 256 |
| Number of epochs | 60 |
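The table lists a decay rate of 0.99 and 500 decay steps without saying what is decayed or how. A minimal sketch, assuming these values describe an exponential decay of the learning rate (read here as 1e-3) of the form `lr = initial_lr * decay_rate ** (step / decay_steps)`; the function name, the `staircase` option, and the schedule itself are illustrative assumptions, not details confirmed by the source:

```python
def decayed_learning_rate(step, initial_lr=1e-3, decay_rate=0.99,
                          decay_steps=500, staircase=False):
    """Exponentially decayed learning rate (assumed schedule, see lead-in).

    With staircase=True the rate drops only once every `decay_steps` steps;
    otherwise it decays smoothly at every step.
    """
    if staircase:
        exponent = step // decay_steps
    else:
        exponent = step / decay_steps
    return initial_lr * decay_rate ** exponent


if __name__ == "__main__":
    # Print the rate at a few training steps for both variants.
    for step in (0, 250, 500, 5000):
        smooth = decayed_learning_rate(step)
        stepped = decayed_learning_rate(step, staircase=True)
        print(f"step={step:5d}  smooth={smooth:.6e}  staircase={stepped:.6e}")
```

Under this assumption the rate after 500 steps is 0.99 × 1e-3 = 9.9e-4, so the schedule is very gentle relative to the listed 60 epochs.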
Fig. 5 Comparison of CRF and BiLSTM-CRF models
The comparison of each model for CVD prediction results
| Model | Accuracy % | Precision % | Recall % | F-score % | |
|---|---|---|---|---|---|
| | 90.91 | 90.91 | 90.91 | 90.91 | 4.98 |
| | 89.39 | 89.03 | 89.39 | 89.21 | 6.64 |
| | 92.83 | 92.64 | 92.83 | 92.73 | 3.13 |
| | 93.94 | 89.43 | 93.21 | 91.28 | 3.93 |
| | 92.24 | 93.46 | 92.73 | 93.09 | 3.01 |
| | 82.58 | 81.35 | 83.01 | 82.17 | 13.61 |
| | 93.91 | 93.83 | 93.91 | 93.86 | 2.01 |
| | 89.23 | 88.96 | 89.23 | 89.07 | 6.77 |
| | 95.43 | 95.39 | 95.43 | 95.41 | 0.48 |
| | 95.87 | 95.98 | 95.87 | 95.86 | – |
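The source does not say how the F-score is computed. A small sketch, assuming it is the usual F1 (the harmonic mean of precision and recall), checks that assumption against two rows of the table above: the values 89.43 and 93.21 sit next to a reported 91.28, and 93.46 and 92.73 sit next to a reported 93.09. The function name `f_score` is illustrative:

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall (same units as inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score %) taken from two rows of the table.
rows = [
    (89.43, 93.21, 91.28),
    (93.46, 92.73, 93.09),
]

if __name__ == "__main__":
    for precision, recall, reported in rows:
        computed = round(f_score(precision, recall), 2)
        print(f"P={precision}  R={recall}  computed F={computed}  reported F={reported}")
```

Both rows reproduce the reported value to two decimals, which supports reading the F-score column as F1.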
The performance of each model at random embedding
| Model | Accuracy % | Precision % | Recall % | F-score % | |
|---|---|---|---|---|---|
| | 91.67 | 90.87 | 91.24 | 91.05 | 3.99 |
| | 81.82 | 79.45 | 82.36 | 80.88 | 14.07 |
| | 92.31 | 92.18 | 92.31 | 94.09 | 2.47 |
| | 95.22 | 95.16 | 95.22 | 95.19 | – |
Fig. 6 Visualization of learned attention α. (a) Basic physical status of a patient in the EMR. (b) A description in the Case Characteristics module of a patient's EMR