Shih-Yi Lin1,2, Kin-Man Law3,4, Yi-Chun Yeh3, Kuo-Chen Wu3,5, Jhih-Han Lai2, Chih-Hsueh Lin1,6, Wu-Huei Hsu1,7, Cheng-Chieh Lin1,6, Chia-Hung Kao1,3,8,9.
Abstract
BACKGROUND: Although carotid sonographic features have been used as predictors of recurrent stroke, few large-scale studies have explored the use of machine learning analysis of carotid sonographic features for the prediction of recurrent stroke.
Keywords: CatBoost model; acute stroke; carotid sonographic features; machine learning; recurrent stroke
Year: 2022 PMID: 35155629 PMCID: PMC8833232 DOI: 10.3389/fcvm.2022.804410
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Figure 1. Flow chart.
Clinical characteristics in 2,411 study patients.
| Characteristic | Stroke once (n = 1,896) | Recurrent stroke (n = 515) | p-value |
|---|---|---|---|
| Men | 1,169 (61.66%) | 320 (62.14%) | 0.843 |
| Age* | 66.18 ± 12.67 (24–98) | 67.63 ± 13.14 (27–96) | <0.05 |
| Occlusion and stenosis | 1,438 (75.84%) | 413 (80.19%) | <0.05 |
| Hemorrhage | 144 (7.59%) | 30 (5.83%) | 0.169 |
| TIA and related syndrome | 167 (8.81%) | 35 (6.80%) | 0.144 |
| Stroke syndrome | 51 (2.69%) | 8 (1.55%) | 0.139 |
| Other cerebral vascular disease | 96 (5.06%) | 29 (5.63%) | 0.606 |
| Hypertension | 1,227 (64.72%) | 371 (72.04%) | <0.05 |
| Diabetes mellitus | 682 (35.97%) | 233 (45.24%) | <0.001 |
| Hyperlipidemia | 503 (26.53%) | 144 (27.96%) | 0.516 |
| End stage renal disease | 10 (0.53%) | 21 (4.08%) | <0.0001 |
| Atrial fibrillation | 81 (4.27%) | 44 (8.54%) | <0.001 |
| Heart failure | 123 (6.49%) | 64 (12.43%) | <0.0001 |
| Liver cirrhosis | 36 (1.90%) | 7 (1.36%) | 0.412 |
| Cancer | 113 (5.96%) | 43 (8.35%) | 0.051 |
| BMI > 25 | 675 (35.60%) | 188 (36.50%) | 0.862 |
| Medication after first stroke | | | |
| Angiotensin II receptor blockers (ARBs) | 753 (39.72%) | 287 (55.73%) | <0.0001 |
| Dihydropyridine derivatives | 1,083 (57.12%) | 385 (74.76%) | <0.0001 |
| Anti-coagulant | 665 (35.07%) | 353 (68.54%) | <0.0001 |
| Anti-platelet | 1,738 (91.67%) | 500 (97.09%) | <0.0001 |
| HMG-CoA inhibitors | 1,181 (62.29%) | 381 (73.98%) | <0.0001 |
| NSAID | 761 (40.14%) | 312 (60.58%) | <0.0001 |
*Values are expressed as the mean ± SD.
HTN, hypertension; DM, diabetes mellitus; ESRD, end stage renal disease.
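This record does not name the statistical test behind the p-values for the categorical rows. As a rough sketch only, assuming a Pearson chi-square test without continuity correction (an assumption, not the authors' stated method), the comparison for the "Men" row can be worked through as follows. The group sizes 1,896 and 515 are taken from the row totals of the confusion matrix reported further down; the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]].

    Rows are the two patient groups; columns are "has the characteristic"
    and "does not have it". Returns the statistic and its p-value (1 d.f.).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Group sizes (assumed from the confusion matrix row totals) and the
# counts of men reported in the table.
n_once, n_recurrent = 1896, 515
men_once, men_recurrent = 1169, 320

chi2, p = chi_square_2x2(
    men_once, n_once - men_once,
    men_recurrent, n_recurrent - men_recurrent,
)
print(f"chi2 = {chi2:.4f}, p = {p:.3f}")
```

Under these assumptions the p-value comes out at roughly 0.84, in line with the 0.843 reported for that row. Rows reported only as thresholds (for example <0.0001) can be checked the same way, but only against the threshold.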
Comparison of the predictive performance for six models (cross-validated data).
| Model | Data balancing method | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|
| RF | - | 0.070 (0.049–0.091) | 1.000 (1.000–1.000) | 0.801 (0.797–0.805) | 0.819 (0.802–0.846) |
| SVM | - | 0.216 (0.184–0.247) | 0.973 (0.968–0.978) | 0.811 (0.805–0.818) | 0.759 (0.739–0.781) |
| LR | - | 0.305 (0.256–0.353) | 0.958 (0.948–0.969) | 0.819 (0.809–0.829) | 0.774 (0.759–0.793) |
| DT | - | 0.155 (0.116–0.194) | 0.997 (0.994–1.000) | 0.817 (0.811–0.824) | 0.688 (0.686–0.733) |
| CatBoost | - | 0.441 (0.394–0.488) | 0.994 (0.990–0.998) | 0.876 (0.867–0.885) | 0.844 (0.824–0.868) |
| LGBM | - | 0.421 (0.381–0.461) | 0.982 (0.977–0.987) | 0.862 (0.855–0.870) | 0.832 (0.813–0.851) |
| RF | Class weight | 0.678 (0.632–0.722) | 0.762 (0.742–0.782) | 0.744 (0.726–0.762) | 0.787 (0.766–0.818) |
| SVM | Class weight | 0.060 (0.038–0.082) | 0.996 (0.993–0.999) | 0.796 (0.791–0.801) | 0.647 (0.617–0.683) |
| LR | Class weight | 0.678 (0.636–0.720) | 0.735 (0.707–0.763) | 0.723 (0.703–0.743) | 0.779 (0.762–0.798) |
| DT | Class weight | 0.717 (0.664–0.769) | 0.624 (0.606–0.641) | 0.644 (0.627–0.661) | 0.684 (0.674–0.726) |
| CatBoost | Class weight | 0.522 (0.484–0.560) | 0.928 (0.912–0.943) | 0.841 (0.832–0.851) | 0.829 (0.814–0.849) |
| LGBM | Class weight | 0.493 (0.467–0.519) | 0.954 (0.943–0.965) | 0.856 (0.845–0.866) | 0.825 (0.808–0.843) |
| RF | Balanced bagging | 0.604 (0.558–0.649) | 0.814 (0.790–0.838) | 0.769 (0.751–0.788) | 0.796 (0.770–0.823) |
| SVM | Balanced bagging | 0.474 (0.432–0.516) | 0.675 (0.636–0.714) | 0.632 (0.602–0.662) | 0.588 (0.563–0.620) |
| LR | Balanced bagging | 0.687 (0.640–0.734) | 0.736 (0.706–0.765) | 0.725 (0.707–0.744) | 0.781 (0.764–0.800) |
| DT | Balanced bagging | 0.497 (0.463–0.531) | 0.829 (0.815–0.842) | 0.758 (0.752–0.764) | 0.735 (0.717–0.755) |
| CatBoost | Balanced bagging | 0.606 (0.555–0.656) | 0.845 (0.823–0.867) | 0.794 (0.780–0.808) | 0.818 (0.797–0.843) |
| LGBM | Balanced bagging | 0.592 (0.559–0.625) | 0.851 (0.835–0.866) | 0.796 (0.786–0.805) | 0.811 (0.793–0.833) |
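The record does not say how the class weights were chosen. One common choice is the inverse-frequency ("balanced") heuristic, in which each class receives the weight n_samples / (n_classes × class count); this is the rule scikit-learn applies for `class_weight="balanced"`. The sketch below applies that heuristic to the two group sizes in this cohort (1,896 and 515, taken from the confusion matrix row totals). Whether the authors used this exact rule is an assumption, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# 0 = stroke once, 1 = recurrent stroke (label encoding assumed for illustration).
labels = [0] * 1896 + [1] * 515
weights = balanced_class_weights(labels)

for cls, w in sorted(weights.items()):
    print(f"class {cls}: weight {w:.4f}")
```

With these weights the minority class counts for more per sample, so both classes contribute the same total weight to the training loss. That is consistent with the pattern in the table, where class weighting raises sensitivity at the cost of specificity for most models.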
Figure 2. ROC curves of combinations of different data balancing methods and machine learning algorithms. (A) No data balancing. (B) After class weight. (C) After balanced bagging.
Figure 3. CatBoost model for different settings: clinical data, sonographic data, and combination of clinical and sonographic data for recurrent stroke prediction.
Confusion matrix for predictions of the CatBoost model without a balancing method.
| | | Predicted: No | Predicted: Yes |
|---|---|---|---|
| True | No | 1,885 | 11 |
| True | Yes | 288 | 227 |
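The confusion matrix can be tied back to the performance table with a few lines of arithmetic. The sketch below assumes that "Yes" denotes recurrent stroke (the positive class) and that the matrix columns are predicted "No" and predicted "Yes", in that order; the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Sensitivity, specificity, and accuracy from the four confusion matrix cells."""
    total = tn + fp + fn + tp
    return {
        "sensitivity": tp / (tp + fn),  # share of true "Yes" cases predicted "Yes"
        "specificity": tn / (tn + fp),  # share of true "No" cases predicted "No"
        "accuracy": (tp + tn) / total,  # share of all cases predicted correctly
    }


# Cells read from the matrix: true No -> (1,885, 11); true Yes -> (288, 227).
metrics = confusion_metrics(tn=1885, fp=11, fn=288, tp=227)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these values match the CatBoost row without balancing in the performance table (0.441, 0.994, and 0.876). The row totals, 1,896 and 515, sum to the 2,411 study patients.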
Figure 4. Top 10 features in the CatBoost model without a balancing method for distinguishing a single stroke from recurrent stroke. Dist, distal; Lt, left; Rt, right; CCA, common carotid artery; ICA, internal carotid artery; ECA, external carotid artery; SUBC, subclavian artery; VERT, vertebral artery; BA, basilar artery; Prox, proximal; TCD, transcranial Doppler; PI, pulsatility index; RI, resistive index; TAMEAN, intensity-weighted mean frequency; EDV, end-diastolic velocity; PSV, peak systolic velocity.