Zeguo Shao, Yuhong Xiang, Yingchao Zhu, Aiqin Fan, Peng Zhang.
Abstract
PURPOSE: To explore the influences of smoking, alcohol consumption, drinking tea, diet, sleep, and exercise on the risk of stroke and relationships among the factors, present corresponding knowledge-based rules, and provide a scientific basis for assessment and intervention of risk factors of stroke.
Year: 2020 PMID: 32565878 PMCID: PMC7285386 DOI: 10.1155/2020/3217356
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Subjects' clinical data.
| Type of data | Risk factor of stroke | Field | Data distribution |
|---|---|---|---|
| Clinical diagnosis | Hypertension | Hyte | y: 1242, n: 3782, uncertain: 575 |
| | Dyslipidemia | Dysl | y: 511, n: 4508, uncertain: 580 |
| | Diabetes | Diab | y: 403, n: 4618, uncertain: 578 |
| | Atrial fibrillation | AF | y: 75, n: 4940, uncertain: 584 |
| Medical history and family history | Family history of stroke | FSH | y: 449, n: 4460, uncertain: 690 |
| | History of stroke | SH | y: 165, n: 4730, uncertain: 704 |
| | TIA | TIA | y: 95, n: 4350, uncertain: 1154 |
| Demographic information | Gender | Gen | M: 2491, F: 3108 |
| | Age | Age | Refer to |
| Physical examination | BMI | BMIc | B1: 205, B2: 2926, B3: 1760, B4: 520, B5: 150, uncertain: 38 |
| Daily habits | Smoking | Smok | y: 1192, n: 4379, null: 28 |
| | Alcohol consumption | Alco | y: 1065, n: 4500, null: 34 |
| | Drinking tea | Tea | y: 1563, n: 3997, null: 39 |
| | Diet | DT | C1: 2812, C2: 263, C3: 2181, null: 370 |
| | Sleep | Sleep | TS: 366, TB: 4958, BL: 205, null: 70 |
| | Exercise sport | Sport | C1: 1518, C2: 1624, C3: 2275, null: 182 |
“y” means “yes,” “n” indicates “no,” and definitions of the types of BMI, diet, sleep, and exercise are presented in Figure 1. In Figure 1, we sometimes use fields to represent their corresponding stroke risk factors.
Figure 1. Distribution of age- and gender-based data.
Sleep classification.
| Age | Duration of sleep (hours) | Mark |
|---|---|---|
| <3 (months) | <14 | TS |
| | 14~17 | TB |
| | >17 | TL |
| 1~2 (years old) | <11 | TS |
| | 11~14 | TB |
| | >14 | TL |
| 6~13 (years old) | <9 | TS |
| | 9~11 | TB |
| | >11 | TL |
| 14~17 (years old) | <8 | TS |
| | 8~10 | TB |
| | >10 | TL |
| 18~64 (years old) | <6 | TS |
| | 6~10 | TB |
| | >10 | TL |
| >64 (years old) | <7 | TS |
| | 7~8 | TB |
| | >8 | TL |
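The sleep classification can be read as a simple lookup on age band and hours slept. The following is a minimal sketch, not the authors' code. It assumes that TL marks sleep longer than the upper bound of the TB range, that ages are whole years (so ">64" is treated as 65 and above), and that ages outside the listed bands are left unclassified. The infant row given in months is omitted. The names `SLEEP_BANDS` and `classify_sleep` are hypothetical.

```python
# Age bands in years taken from the sleep classification table.
# Each entry: (min_age, max_age, tb_lower_hours, tb_upper_hours).
# max_age of None means the band is open-ended (assumed: ">64" = 65+).
SLEEP_BANDS = [
    (1, 2, 11, 14),
    (6, 13, 9, 11),
    (14, 17, 8, 10),
    (18, 64, 6, 10),
    (65, None, 7, 8),
]


def classify_sleep(age_years, hours):
    """Return 'TS', 'TB', or 'TL', or None if the age is not covered."""
    for min_age, max_age, tb_lower, tb_upper in SLEEP_BANDS:
        in_band = age_years >= min_age and (max_age is None or age_years <= max_age)
        if not in_band:
            continue
        if hours < tb_lower:
            return "TS"  # shorter than the TB range
        if hours > tb_upper:
            return "TL"  # assumed: longer than the TB range
        return "TB"      # within the TB range, bounds included
    return None          # age band not listed in the table
```

For instance, `classify_sleep(30, 8)` falls in the 18~64 band and inside the 6~10 hour range, so it yields `"TB"`.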
Figure 2. BMI classification.
Definition of different levels of risk factors of stroke.
| Type | Definition |
|---|---|
| Y | Has a history of stroke. |
| T | Has had a previous transient ischemic attack. |
| H | Has 2 or more of the major risk factors defined in the guidelines, or has 1 major risk factor and 2 or more secondary risk factors. |
| M | Has 1 of the major risk factors defined in the guidelines and fewer than 2 secondary risk factors. |
| L | Has none of the major risk factors defined in the guidelines and 2 or more secondary risk factors. |
| N | Has none of the major risk factors defined in the guidelines and fewer than 2 secondary risk factors. |
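The six risk levels above amount to a small rule cascade. The sketch below is illustrative only: it assumes that a history of stroke (Y) takes precedence over a previous TIA (T), and that both take precedence over the count-based levels, which the table does not state explicitly. Which factors count as major or secondary is defined by the guidelines and is not reproduced here, so the function takes the two counts as inputs. The name `risk_level` and its parameters are hypothetical.

```python
def risk_level(history_of_stroke, previous_tia, major_count, secondary_count):
    """Map a subject's findings to one of the levels Y, T, H, M, L, N."""
    if history_of_stroke:
        return "Y"  # assumed to override every other rule
    if previous_tia:
        return "T"  # assumed to override the count-based rules
    if major_count >= 2 or (major_count == 1 and secondary_count >= 2):
        return "H"
    if major_count == 1:
        return "M"  # exactly 1 major factor, fewer than 2 secondary factors
    if secondary_count >= 2:
        return "L"  # no major factor, 2 or more secondary factors
    return "N"      # no major factor, fewer than 2 secondary factors
```

For example, a subject with no stroke or TIA history, one major factor, and three secondary factors is assigned `"H"` by the second clause of the H rule.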
Figure 3. SMOTE+C4.5 classification model.
Figure 4. A decision tree to classify risk factors of stroke.
Figure 5. Decision tree #1 to classify risk factors of stroke.
Figure 6. Decision tree #2 to classify risk factors of stroke.
Figure 7. Decision tree #3 to classify risk factors of stroke.
Figure 8. Decision tree #4 to classify risk factors of stroke.
Confusion matrix achieved by the optimized C4.5 algorithm (rows: risk level analyzed by physicians; columns: risk level analyzed by the optimized C4.5 algorithm).
| Risk level analyzed by physicians | H | M | Y | T | N | L | Recall |
|---|---|---|---|---|---|---|---|
| H | … | 127 | 0 | 0 | 0 | 0 | 0.910 |
| M | 44 | … | 0 | 0 | 0 | 0 | 0.972 |
| Y | 0 | 0 | … | 0 | 0 | 0 | 1.000 |
| T | 2 | 0 | 0 | … | 0 | 0 | … |
| N | 0 | 0 | 0 | 0 | … | 255 | 0.727 |
| L | 0 | 0 | 0 | 0 | 182 | … | 0.766 |
| Precision | 0.966 | 0.922 | 1.000 | 1.000 | 0.789 | 0.700 | |
| Accuracy | 87.53% | | | | | | |
| Kappa | 0.8344 | | | | | | |
Confusion matrix achieved by the random forest algorithm (rows: risk level analyzed by physicians; columns: risk level analyzed by the random forest algorithm).
| Risk level analyzed by physicians | H | M | Y | T | N | L | Recall |
|---|---|---|---|---|---|---|---|
| H | … | 115 | 0 | 0 | 0 | 0 | 0.919 |
| M | 72 | … | 0 | 0 | 1 | 0 | 0.953 |
| Y | 6 | 0 | … | 0 | 0 | 1 | 0.958 |
| T | 24 | 6 | 0 | … | 3 | 9 | … |
| N | 0 | 0 | 0 | 0 | … | 235 | 0.748 |
| L | 0 | 0 | 0 | 0 | 239 | … | 0.693 |
| Precision | 0.927 | 0.924 | 1.000 | 1.000 | 0.742 | 0.688 | |
| Accuracy | 85.46% | | | | | | |
| Kappa | 0.8063 | | | | | | |
Confusion matrix achieved by the Logistic algorithm (rows: risk level analyzed by physicians; columns: risk level analyzed by the Logistic algorithm).
| Risk level analyzed by physicians | H | M | Y | T | N | L | Recall |
|---|---|---|---|---|---|---|---|
| H | … | 124 | 0 | 1 | 0 | 1 | 0.911 |
| M | 97 | … | 1 | 1 | 1 | 0 | 0.935 |
| Y | 0 | 0 | … | 0 | 0 | 1 | 0.994 |
| T | 5 | 0 | 1 | … | 1 | 0 | 0.868 |
| N | 0 | 0 | 0 | 0 | … | 244 | 0.739 |
| L | 0 | 1 | 0 | 0 | 214 | … | 0.724 |
| Precision | 0.927 | 0.920 | 0.988 | 0.958 | 0.762 | 0.696 | |
| Accuracy | 85.83% | | | | | | |
| Kappa | 0.8119 | | | | | | |
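The recall, precision, accuracy, and kappa figures reported with these confusion matrices follow standard definitions. The sketch below shows one way to compute them from a square matrix whose rows are the reference labels (here, the physicians' levels) and whose columns are the predicted labels. It assumes the reported kappa is Cohen's kappa, which the tables do not state. Because the tables as reproduced here are incomplete, the usage example runs on a small made-up 2x2 matrix rather than on the study data. The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(matrix):
    """Per-class recall and precision, overall accuracy, and Cohen's kappa.

    matrix[i][j] = number of samples with reference class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                             # reference counts
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # predicted counts
    diag = [matrix[i][i] for i in range(n)]                             # correct predictions

    nan = float("nan")
    recall = [d / r if r else nan for d, r in zip(diag, row_sums)]
    precision = [d / c if c else nan for d, c in zip(diag, col_sums)]
    accuracy = sum(diag) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "kappa": kappa}


# Made-up example matrix, not data from the study.
example = confusion_metrics([[8, 2], [1, 9]])
print(example)
```

Kappa corrects accuracy for chance agreement, which matters when classes are imbalanced, as they are in the risk-level data.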
Figure 9. Illustration of errors of the optimized C4.5 algorithm.
Figure 10. Illustration of errors of the random forest algorithm.
Figure 11. Illustration of errors of the Logistic algorithm.
Values of risk factors for stroke.
| Risk factors | Frequencies at the depths where the factor occurs (in order of increasing depth, range 0 to 14) | Average depth |
|---|---|---|
| SH | 1 | 0.00 |
| Hyte | 2 | 2.00 |
| Dysl | 2, 1 | 3.33 |
| Diab | 4, 2, 1 | 3.71 |
| FSH | 4, 1 | 5.40 |
| TIA | 1, 1, 1, 2, 2 | 5.43 |
| Smok | 1, 2, 1 | 6.00 |
| AF | 7, 2 | 6.67 |
| Sport | 1, 1, 3 | 7.00 |
| Sleep | 1, 1, 1 | 8.67 |
| Gen | 1, 1, 3, 2 | 9.00 |
| BMI | 3, 1 | 9.25 |
| Tea | 1, 1, 1 | 9.33 |
| Age | 1, 2, 1, 3, 3, 1 | 10.00 |
| Alco | 1, 2, 1, 1 | 10.60 |
| DT | 1, 3 | 10.75 |
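The average depth column is consistent with a frequency-weighted mean of the depths at which a factor appears in the decision trees. A minimal sketch under that assumption follows; the function name `average_depth` and the dictionary layout (depth mapped to frequency) are hypothetical.

```python
def average_depth(depth_counts):
    """Frequency-weighted mean depth.

    depth_counts maps a tree depth (0 = root) to the number of times the
    factor appears at that depth.
    """
    total = sum(depth_counts.values())
    if total == 0:
        raise ValueError("factor does not appear at any depth")
    return sum(depth * count for depth, count in depth_counts.items()) / total


# SH appears once with an average depth of 0.00, so it sits at depth 0.
print(average_depth({0: 1}))
# Hyte appears twice with an average depth of 2.00, i.e. both times at depth 2.
print(average_depth({2: 2}))
```

Under this reading, a smaller average depth means the factor tends to be tested closer to the root, where a split affects more subjects.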
A factor-based relationship matrix.
| | SH | Hyte | Dysl | Diab | FSH | TIA | Smok | AF | Sport | Sleep | Gen | BMIc | Tea | Age | Alco |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hyte | 6.84 | ||||||||||||||
| Dysl | 6.34 | 6.34 | |||||||||||||
| Diab | 5.71 | 5.71 | 5.46 | ||||||||||||
| FSH | 5.71 | 4.95 | 4.95 | 4.64 | |||||||||||
| TIA | 3.91 | 3.91 | 3.66 | 3.46 | 3.29 | ||||||||||
| Smok | 4.16 | 4.16 | 4.16 | 4.16 | 3.85 | 3.03 | |||||||||
| AF | 0.45 | 0.45 | 0.45 | 0.45 | 0.45 | 0.33 | 0.20 | ||||||||
| Sport | 4.49 | 4.49 | 4.49 | 3.82 | 3.98 | 2.90 | 3.44 | 0.20 | |||||||
| Sleep | 0.42 | 0.42 | 0.42 | 0.42 | 0.27 | 0.08 | 0.42 | 0.00 | 0.27 | ||||||
| Gen | 1.74 | 1.74 | 1.74 | 1.43 | 1.43 | 1.05 | 1.05 | 0.00 | 1.74 | 0.08 | |||||
| BMIc | 2.17 | 2.17 | 2.17 | 2.17 | 2.17 | 2.17 | 2.17 | 0.00 | 2.17 | 0.08 | 1.05 | ||||
| Tea | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.48 | 0.48 | 0.13 | 0.48 | 0.00 | 0.22 | 0.38 | |||
| Age | 6.84 | 6.84 | 6.34 | 5.71 | 4.95 | 3.91 | 4.16 | 0.45 | 4.49 | 0.42 | 1.74 | 2.17 | 0.60 | ||
| Alco | 0.70 | 0.70 | 0.70 | 0.70 | 0.70 | 0.40 | 0.70 | 0.00 | 0.70 | 0.19 | 0.22 | 0.40 | 0.14 | 0.70 | |
| DT | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.00 | 0.96 | 0.00 | 0.62 | 0.86 | 0.38 | 0.96 | 0.22 |
Factors with higher correlation values than the mean values within the group.
| Smok | | Sport | | Sleep | | Tea | | Alco | | DT | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Factors | Correlation | Factors | Correlation | Factors | Correlation | Factors | Correlation | Factors | Correlation | Factors | Correlation |
| SH | 4.16 | SH | 4.49 | SH | 0.42 | SH | 0.60 | SH | 0.70 | SH | 0.96 |
| Hyte | 4.16 | Hyte | 4.49 | Hyte | 0.42 | Hyte | 0.60 | Hyte | 0.70 | Hyte | 0.96 |
| Dysl | 4.16 | Dysl | 4.49 | Dysl | 0.42 | Dysl | 0.60 | Dysl | 0.70 | Dysl | 0.96 |
| Diab | 4.16 | Age | 4.49 | Age | 0.42 | Age | 0.60 | Age | 0.70 | Age | 0.96 |
| Age | 4.16 | FSH | 3.98 | Diab | 0.42 | Diab | 0.60 | Diab | 0.70 | Diab | 0.96 |
| FSH | 3.85 | Diab | 3.82 | Smok | 0.42 | FSH | 0.60 | FSH | 0.70 | FSH | 0.96 |
| Sport | 3.44 | Smok | 3.44 | FSH | 0.27 | Smok | 0.48 | Smok | 0.70 | Smok | 0.96 |
| TIA | 3.03 | TIA | 2.90 | Sport | 0.27 | Sport | 0.48 | Sport | 0.70 | Sport | 0.96 |
| | | | | | | TIA | 0.48 | | | TIA | 0.96 |
| | | | | | | | | | | BMI | 0.86 |
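The selection in this table can be reproduced from the relationship matrix by keeping, for each daily habit, the factors whose value exceeds the mean of that habit's values. The sketch below does this for diet (DT), using the DT row of the relationship matrix. Treating "the mean values within the group" as the mean over all 15 of the habit's pairings is an assumption, and the name `above_mean` is hypothetical.

```python
# DT row of the factor-based relationship matrix.
DT_ROW = {
    "SH": 0.96, "Hyte": 0.96, "Dysl": 0.96, "Diab": 0.96, "FSH": 0.96,
    "TIA": 0.96, "Smok": 0.96, "AF": 0.00, "Sport": 0.96, "Sleep": 0.00,
    "Gen": 0.62, "BMIc": 0.86, "Tea": 0.38, "Age": 0.96, "Alco": 0.22,
}


def above_mean(correlations):
    """Return (factor, value) pairs whose value exceeds the row mean,
    sorted by descending value."""
    mean = sum(correlations.values()) / len(correlations)
    selected = [(name, value) for name, value in correlations.items() if value > mean]
    return sorted(selected, key=lambda pair: -pair[1])


for name, value in above_mean(DT_ROW):
    print(name, value)
```

For the DT row the mean is roughly 0.71, so the nine factors at 0.96 and BMIc at 0.86 are kept while Gen at 0.62 is dropped, which agrees with the DT column of the table.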
The effects of the 6 daily habits (smoking, alcohol consumption, drinking tea, diet, sleep, and exercise) on stroke risk are discussed in the next sections.
Figure 12. Radar charts illustrating the effects of daily life habits on risk factors of stroke.