Songhee Cheon, Jungyoon Kim, Jihye Lim.
Abstract
The increase in stroke incidence with the aging of the Korean population will rapidly impose an economic burden on society. Timely treatment can improve stroke prognosis, and awareness of stroke warning signs and of the appropriate actions to take in the event of a stroke improves outcomes. Medical service use and health behavior data are easier to collect than medical imaging data. Here, we used a deep neural network to detect stroke using medical service use and health behavior data; we identified 15,099 patients with stroke. Principal component analysis (PCA) featuring quantile scaling was used to extract relevant background features from medical records; we used these to predict stroke. We compared our method (a scaled PCA/deep neural network [DNN] approach) to five other machine-learning methods. The area under the curve (AUC) value of our method was 83.48%; hence, it can be used by both patients and doctors to prescreen for possible stroke.
Keywords: deep learning; feature extraction; prediction; stroke
Year: 2019 PMID: 31141892 PMCID: PMC6603534 DOI: 10.3390/ijerph16111876
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. The patient selection process. KNHDS = Korean National Hospital Discharge In-depth Injury Survey; ICD = International Classification of Diseases.
Distribution of subjects by general characteristics.
| Variable | Category | N (%) |
|---|---|---|
| Mean age | | 66.1 years |
| Gender | Male | 8252 (54.7) |
| | Female | 6847 (45.3) |
| Mortality | Yes | 1038 (6.9) |
| | No | 14,061 (93.1) |
| Stroke type | Ischemic | 10,668 (70.7) |
| | Hemorrhagic | 4431 (29.3) |
Figure 2. The architecture of the deep neural network (DNN)/scaled principal component analysis (PCA) approach.
Figure 3. Two-dimensional plots of the first and second principal components (class 0 indicates non-stroke patients and class 1 indicates stroke patients).
Figure 4. The architecture of the proposed DNN.
Figure 5. Training loss during training with early stopping.
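The scaled PCA/DNN approach named in the captions above (quantile scaling, then PCA, then a neural network trained with early stopping) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the study's records, and the number of components, layer sizes, and stopping patience are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in data (assumption): 11 features, imbalanced classes.
X, y = make_classification(n_samples=2000, n_features=11, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Quantile scaling -> PCA feature extraction -> small feed-forward network.
# All hyperparameters below are illustrative assumptions.
model = make_pipeline(
    QuantileTransformer(n_quantiles=100, output_distribution="uniform",
                        random_state=0),
    PCA(n_components=5),
    MLPClassifier(hidden_layer_sizes=(32, 16), early_stopping=True,
                  n_iter_no_change=10, max_iter=500, random_state=0),
)
model.fit(X_train, y_train)

# Predicted probability of the positive (stroke) class for each test record.
stroke_probability = model.predict_proba(X_test)[:, 1]
```

Wrapping the three steps in one pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking test information into the feature extraction.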
Confusion matrix for our method.
| Confusion Matrix | Predicted (T) | Predicted (F) |
|---|---|---|
| Actual (T) | 238 | 132 |
| Actual (F) | 688 | 4076 |
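As a worked check, the summary metrics can be recomputed from this confusion matrix. The sketch below assumes the usual definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), positive predictive value = TP/(TP+FP), accuracy = (TP+TN)/total) and reads the 132 in the Actual (T)/Predicted (F) cell as false negatives and the 688 in the Actual (F)/Predicted (T) cell as false positives; the function name is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, positive predictive value, and accuracy in percent."""
    return {
        "SN": 100 * tp / (tp + fn),
        "SP": 100 * tn / (tn + fp),
        "PP": 100 * tp / (tp + fp),
        "ACC": 100 * (tp + tn) / (tp + fn + fp + tn),
    }

# Counts taken from the confusion matrix above.
dnn = confusion_metrics(tp=238, fn=132, fp=688, tn=4076)
print({name: round(value, 2) for name, value in dnn.items()})
# -> {'SN': 64.32, 'SP': 85.56, 'PP': 25.7, 'ACC': 84.03}
```

These values agree with the DNN figures reported in the classifier comparison.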
Comparison of the confusion matrix values and performance for six classifiers (testing data).
| Classifier | TH | TP | FP | FN | TN | SN (%) | SP (%) | PP (%) | ACC (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| RFC | 0.077 | 223 | 960 | 147 | 3804 | 60.27 | 79.85 | 18.85 | 78.44 | 77.59 |
| ADB | 0.487 | 234 | 928 | 136 | 3836 | 63.24 | 80.52 | 20.14 | 79.28 | |
| GNB | 0.065 | 258 | 1396 | 112 | 3368 | 69.73 | | | 70.63 | 78.08 |
| KNNC | 0.065 | 219 | 892 | 151 | 3872 | | 81.28 | 19.71 | 79.68 | 72.11 |
| SVC | 0.065 | 221 | 1380 | 149 | 3384 | 59.73 | 71.03 | 13.8 | | |
| DNN | 0.13 | 238 | 688 | 132 | 4076 | 64.32 | 85.56 | 25.7 | 84.03 | |
TH, threshold; RFC, random forest classifier; ADB, AdaBoost classifier; GNB, Gaussian naive Bayes; KNNC, K-nearest neighbor classifier; SVC, support vector machine; DNN, deep neural network.
Figure 6. Regarding predictive performance, the area under the receiver operating characteristic curve value was highest (83.48%) for the DNN/scaled PCA classifier.
Comparison of the performance for six classifiers (10-fold cross-validation).
| Classifier | AUC (%) | Classifier | AUC (%) |
|---|---|---|---|
| RFC | 79.4 | ADB | |
| GNB | 80.0 | KNNC | 72.2 |
| SVC | 69.7 | DNN | |
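A 10-fold cross-validated AUC comparison of this kind could be set up along the following lines. This is a hedged sketch that assumes scikit-learn and synthetic data; only two of the six classifiers are shown, their settings are hypothetical, and the numbers it produces are unrelated to those reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption), imbalanced like a screening task.
X, y = make_classification(n_samples=1000, n_features=11,
                           weights=[0.9, 0.1], random_state=0)

# Stratified folds keep the positive rate similar in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

classifiers = {
    "RFC": RandomForestClassifier(n_estimators=50, random_state=0),
    "GNB": GaussianNB(),
}

mean_auc = {}
for name, clf in classifiers.items():
    # One ROC AUC score per fold, averaged and expressed in percent.
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    mean_auc[name] = 100 * scores.mean()
```

Averaging over folds gives a less split-dependent estimate than a single held-out test set, which is presumably why both views are reported.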
Correlation coefficients of the variables (over 0.09) among 11 variables.
| Variable | Corr. coef. | Variable | Corr. coef. |
|---|---|---|---|
| Brain surgery required | | Admission mode | |
| Stroke type | | Mortality | 1 |
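Because Mortality has a coefficient of 1, the table appears to list each variable's correlation with mortality. Under that assumption, the selection of variables with a coefficient over 0.09 could be sketched as below. The data and variable names are synthetic and hypothetical, and the use of the Pearson coefficient and of its absolute value are assumptions rather than details stated in the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic binary outcome standing in for mortality (assumption).
mortality = rng.integers(0, 2, size=n).astype(float)

variables = {
    "mortality": mortality,
    # Hypothetical variable constructed to correlate with the outcome.
    "related_variable": mortality + rng.normal(0.0, 1.0, size=n),
    # Hypothetical variable that is independent noise.
    "unrelated_variable": rng.normal(0.0, 1.0, size=n),
}

threshold = 0.09
selected = {}
for name, values in variables.items():
    # Pearson correlation between this variable and the outcome.
    r = np.corrcoef(values, mortality)[0, 1]
    if abs(r) > threshold:
        selected[name] = round(float(r), 3)
```

The outcome itself always passes the filter with a coefficient of 1, which matches the Mortality entry in the table.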