Shraddha Mainali, Marin E. Darsie, Keaton S. Smetana.
Abstract
The application of machine learning has rapidly evolved in medicine over the past decade. In stroke, commercially available machine learning algorithms have already been incorporated into clinical practice for rapid diagnosis. The creation and advancement of deep learning techniques have greatly improved the clinical utility of machine learning tools, and new algorithms continue to emerge with improved accuracy in stroke diagnosis and outcome prediction. Although imaging-based feature recognition and segmentation have significantly facilitated rapid stroke diagnosis and triage, stroke prognostication depends on a multitude of patient-specific and clinical factors, and accurate outcome prediction therefore remains challenging. Despite its vital role in stroke diagnosis and prognostication, it is important to recognize that machine learning output is only as good as the input data and the appropriateness of the algorithm applied to a given data set. Additionally, many machine learning studies are limited by small sample sizes, so concerted efforts to collate data could improve the evaluation of future machine learning tools in stroke. In its present state, machine learning serves as a helpful and efficient tool for rapid clinical decision making, although oversight from clinical experts is still required to address aspects not accounted for in an automated algorithm. This article provides an overview of machine learning technology and a tabulated review of pertinent machine learning studies related to stroke diagnosis and outcome prediction.
Keywords: artificial intelligence; deep learning; machine learning; machine learning in medical imaging; machine learning in medicine; stroke diagnosis; stroke outcome prediction; stroke prognosis
Year: 2021 PMID: 34938254 PMCID: PMC8685212 DOI: 10.3389/fneur.2021.734345
Source DB: PubMed Journal: Front Neurol ISSN: 1664-2295 Impact factor: 4.003
Figure 1 Supervised learning. In supervised learning, a predictive model is built from labeled images [Subarachnoid Hemorrhage (SAH) and Not Subarachnoid Hemorrhage (Not SAH)] and is then tested for accuracy on unlabeled images (gray box). Source: Western Digital blog.
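The label-train-test loop in Figure 1 can be sketched in a few lines of code. The following is a minimal, illustrative example only: the two-feature "image" vectors, the labels, and the nearest-centroid classifier are all invented for demonstration and are not the method of any study cited here.

```python
# Minimal supervised-learning sketch: fit on labeled examples,
# then classify held-out cases (as in Figure 1, SAH vs. Not SAH).
# Features and labels here are invented for illustration only.

def fit_centroids(features, labels):
    """Compute the mean feature vector (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, x):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: dist2(centroids[y]))

# Labeled training images, reduced to two hypothetical intensity features.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = ["SAH", "SAH", "Not SAH", "Not SAH"]

model = fit_centroids(train_x, train_y)
print(predict(model, [0.85, 0.75]))  # prints "SAH"
print(predict(model, [0.15, 0.15]))  # prints "Not SAH"
```

The deep learning models in the tables below replace the hand-picked features and centroid rule with learned representations, but the workflow — labeled training data, a fitted model, evaluation on unseen cases — is the same.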
Figure 3 Created from the following references: Dey (28), Zhou (29), Geron (30).
Studies utilizing machine learning for stroke diagnosis and prediction.
| References | Objective | ML algorithm(s) | Validation method | Sample size | Input data | Performance | Best-performing algorithm | Potential clinical application | Limitations |
|---|---|---|---|---|---|---|---|---|---|
| **Ischemic stroke** | | | | | | | | | |
| García-Terriza et al. | Stroke type diagnosis and mortality | RF | 10-fold cross-validation resampling | 119 | Type of stroke | - | - | May predict the type of stroke a patient is at risk for, and outcomes | Data were obtained after the event for the prediction models and do not include the usual risk factors |
| Sung et al. | Ischemic stroke phenotype | Various models (C4.5, CART, KNN, RF, SVM, LR) with aggregation algorithms | 10-fold cross-validation | 4,640 | Clinical notes with preprocessing and MetaMap to identify medical entities +/- NIHSS | - | - | Clinical text plus validated scoring tools might aid in phenotyping of stroke | Phenotype based on OCSP definitions |
| Giri et al. | Ischemic stroke diagnosis by EEG | 1D CNN vs. various models (NB, classification tree, ANN, RF, KNN, LR) | Leave-one-out cross-validation | 32 (AIS) | 15-min EEG with 24 chosen features | Accuracy 0.86 | 1D CNN (leave-one-out scenario) | May help diagnose AIS in areas with limited access to CT imaging | Time required to apply EEG electrodes may delay care |
| Lee et al. | Identify patients within the 4.5-h thrombolysis window | LR, RF, SVM | 85% training / 15% testing | 355 | MRI features | Sensitivity 75.8% | RF | Better sensitivity than human readings in identifying stroke patients within the thrombolysis window | Assessed only dichotomized visibility of signals in the lesion territory |
| Ho et al. | Classifying onset time from imaging | LR, RF, GBRT, SVM, SMR | 10-fold cross-validation on training data with optimal hyperparameters | 104 | MRI | Sensitivity 78.8% | LR with deep autoencoder features | Improved stroke onset detection compared to DWI-FLAIR mismatch | Trained on MRI only |
| Takahashi et al. | Detection of the MCA dot sign on unenhanced CT | SVM | Not described | 297 images | Unenhanced CT | Sensitivity 97.5% | SVM | Accurately detects the hyperdense MCA dot sign | Data from only 7 patients |
| Chen et al. | Automatically segment stroke lesions on DWI | CNN | Train / test | 741 subjects | DWI | Dice score 0.67 | CNN | Segments stroke lesions automatically | Dice scores improved only on larger lesions |
| Bouts et al. | Depict ischemic tissue that can recover after reperfusion | GLM, GAM, SVM, adaptive boosting, RF | Generalized cross-validation with unbiased risk estimator scoring | 19 rats | MRI | Dice score 0.79 | GLM | MRI-based algorithms could estimate the extent of salvageable tissue | Varying efficacy in differentiating irreversibly damaged from salvageable tissue after reperfusion |
| Chen et al. | Quantify cerebral edema following infarction | RF with geodesic active contour segmentation | 10-fold cross-validation | 38 subjects | CT imaging | Baseline Dice score 0.76 | RF with geodesic active contour segmentation | Efficiently and accurately measures the evolution of cerebral edema | - |
| Colak et al. | Stroke prediction | MLP ANN and SVM with radial basis function kernel | Train / test | 297 subjects (130 sick, 167 healthy) | 9 predictors (CAD, DM, HTN, CVA history, AF, smoking, carotid Doppler findings, cholesterol, CRP) | Accuracy 85.9% | ANN | Ability to screen patients at risk for stroke based on comorbidities | Factors used in the model are already known risk factors for stroke |
| Maier et al. | Classify lesion segmentation | KNN, GNB, GLM, RF, CNN | Leave-one-out cross-validation | 37 subjects | MRI | RF | RF | Future work may be able to segment lesions | No method achieved results in the range of human inter-observer agreement |
| Öman et al. | Detection of ischemic stroke | 3D CNN | Train / test | 60 subjects | CT angiography | Sensitivity 93% | 3D CNN | Lesions can be detected with a CNN | Contralateral hemisphere data may reduce false-positive findings |
| Chen et al. | Prehospital detection of large vessel occlusion | ANN | 10-fold cross-validation | 600 subjects | Baseline demographics, medical history, NIHSS, risk factors | Youden index 0.640 | ANN | Known patient risk factors may help predict large vessel occlusion | Cohort included stroke patients but not mimics or hemorrhagic stroke |
| **Intracerebral hemorrhage (ICH)** | | | | | | | | | |
| Dhar et al. | Hemorrhage and perihematomal edema (PHE) quantification | CNN | 10-fold cross-validation | 124 | 24-h CT head scans | Dice score | - | Rapid and consistent measurements of supratentorial ICH | IVH not delineated from ICH |
| Arab et al. | Hematoma segmentation and volume quantification | CNN with deep supervision based on reader labeling | Train / test | 55 | 64 axial slices of 128 × 128 voxels | Dice score | CNN with deep supervision | Fast and reliable quantification of hematoma volume | False positives observed with calcifications |
| Ko et al. | ICH detection | CNN and long short-term memory | Train / test | 5,244,234 | Pre-processed CTH to balance subtypes and window settings | Classification accuracy | - | Identification of ICH and subtypes | Preprocessing of data required to attain accuracy |
| Irene et al. | ICH segmentation and volume approximation | Dynamic graph CNN | 4-fold cross-validation | 27 | CTH | Accuracy 96.4% | SVM method with radial basis function kernel | Identification of ICH and blood volume prediction | Small dataset |
| Arbabshirani et al. | Diagnose ICH and prioritize radiology worklists | Deep CNN | Training (75%) | 46,573 studies | Preprocessing of CTH images | ROC 0.846 | - | Assist in upgrading image reads from "routine" to "stat" | Did not identify the location of ICH |
| Sage et al. | ICH subtype detection | Double-branch CNN with SVM, RF | Concatenation of double-branch features and classification | 9,997 subjects | 372,556 images (11,454 CT scans) | Accuracy range | - | Identify and classify ICH | EDH performed the worst in SVM and RF, possibly due to under-representation in the data |
| Ye et al. | ICH subtype detection | 3D joint CNN-recurrent NN | Training (80%) | 2,836 subjects | 76,621 slices from non-contrast head CT scans | AUC for +/- ICH | - | Identify and classify ICH | SAH classification may have been more difficult due to blended ICH examples |
| Chang et al. | ICH detection and volume measurements | Hybrid 3D/2D CNN | 5-fold cross-validation | 10,841 scans | Non-contrast CTH | - | - | Identification of ICH and blood volume prediction | Generalization needs to be confirmed at other institutions |
| **Subarachnoid hemorrhage (SAH)** | | | | | | | | | |
| Capoglu et al. | Vasospasm prediction | Sparse dictionary learning and covariance-based features | Not described | 20 | 3D brain angiograms | ROC 0.93 | - | Proof of concept for predicting which patients might develop vasospasm | Small dataset |
| Ramos et al. | DCI prediction | LR, SVM, RF, MLP | Monte-Carlo cross-validation with 100 random splits (75% training / 25% test) and 5-fold cross-validation | 317 | Non-contrast CT image data and 48 clinical variables | ROC 0.74 | RF with clinical variables and image features | ML improved prediction of DCI, especially when image features (aneurysm height/width) were included | Manual extraction of features from medical images is time-consuming |
| Tanioka et al. | DCI prediction | RF | Leave-one-out cross-validation | 95 | Clinical variables and matricellular proteins (MCP) on days 1-3 | Accuracy | - | MCP might play a role in predicting DCI, but further data are needed | Other biomarkers not assessed |
| **Other** | | | | | | | | | |
| Ni et al. | Stroke case detection | LR, SVM-P, SVM-R, RF, ANN | Two iterations of 10-fold cross-validation | 8,131 | Medical record information compared to ICD codes | Accuracy 88.6% | RF | Detection of miscoded stroke diagnoses from EHR data | Accurate ICD codes limit the utility of the algorithm |
| Park et al. | Autonomously grade NIHSS and MRC scores through wearable sensors | SVM | 5-fold cross-validation searched by Bayes optimization in 30 trials | 240 | Wearable sensors | NIHSS | SVM | Automatic real-time grading of proximal weakness | Requires sensors to be applied |
Studies utilizing machine learning for stroke outcome prediction.

| References | Objective | ML algorithm(s) | Validation method | Sample size | Input data | Performance | Best-performing algorithm or predictor(s) | Potential clinical application | Limitations |
|---|---|---|---|---|---|---|---|---|---|
| **Ischemic stroke** | | | | | | | | | |
| Nielsen et al. | Prediction of final infarct volume | CNN-deep | 85% training / 15% testing | 222 | MRI images | AUC 0.88 ± 0.12 | - | Facilitates treatment selection | No external validation; retrospective |
| Giacalone et al. | Prediction of final infarct volume | SVM | K-fold cross-validation | 4 | MRI images | 95% accuracy | - | Facilitates treatment selection | Small sample size; retrospective |
| Grosser et al. | Prediction of final infarct volume | XGBoost | Leave-one-out cross-validation | 99 | MRI images | AUC 0.893 ± 0.085 | Spatial lesion probability | Facilitates treatment selection | Retrospective; limited generalizability (patient data from 2006 to 2009) |
| Foroushani et al. | Prediction of malignant cerebral edema | LR | 10-fold cross-validation | 361 | Serial quantitative CT images | AUC 0.96 | Reduction in CSF volume | Facilitates treatment selection | No external validation |
| Bentley et al. | Prediction of sICH | SVM | K-fold cross-validation | 116 | Unenhanced CT images | AUC 0.744 | Baseline NIHSS, CT evidence of acute ischemia | Facilitates treatment selection | Image processing took ~30 min; small number of sICH cases |
| Yu et al. | Prediction of HT | SR-KDA | Leave-one-out cross-validation | 155 | MRI images | 83.7 ± 2.6% accuracy | - | Facilitates treatment selection | Single-center; retrospective |
| Scalzo et al. | Prediction of HT | SR-KDA | 10-fold cross-validation | 263 | MRI images | 88% accuracy | - | Facilitates treatment selection | Retrospective; current limitations in measuring BBB permeability |
| van Os et al. | Prediction of reperfusion after EVT (mTICI <2b vs. ≥2b) | LR (using backward elimination) | Nested cross-validation with an outer and an inner loop | 1,383 | EHR data, CT/CTA images | AUC 0.57 | - | Facilitates treatment selection | Retrospective; only moderate predictive value; LR outperformed the machine-learning models |
| Hilbert et al. | Prediction of reperfusion after EVT (mTICI <2b vs. ≥2b) | RFNN-ResNet-AE, fine-tuned | 4-fold cross-validation | 1,301 | CTA images | Average AUC 0.65 | - | Facilitates treatment selection | Retrospective; only moderate predictive value |
| Rondina et al. | Comparison of imaging approaches (lesion load per ROI vs. pattern of voxels) to predict post-stroke motor impairment | GPR | 10-fold cross-validation | 50 | Post-stroke MRI | Best prediction was obtained using motor ROI and CST (derived from probabilistic tractography): R = 0.83, RMSE = 0.68 | Patterns of voxels representing lesion probability produced better results | Informs appropriate methodology for predicting long-term motor outcomes from early post-stroke MRI | Small sample size; no external validation |
| **Mortality and complications** | | | | | | | | | |
| Matsumoto et al. | Prediction of all-cause in-hospital mortality | LASSO | 10-fold cross-validation | 4,232 | EHR data | AUC 0.88 | - | Facilitates GOC decision making | Retrospective; single-center; limited generalizability (ETV used in only 1.5% of patients); low rate (3.5%) of in-hospital mortality |
| Scrutinio et al. | Prediction of 3-yr mortality after severe stroke | SMOTE RF | 10-fold cross-validation | 1,207 | EHR data | AUC 0.928 | Age | Facilitates GOC decision making | No external validation |
| Ge et al. | Prediction of SAP at 7 and 14 d | Attention-augmented GRU | 10-fold cross-validation | 13,930 | EHR data | 7 d: AUC 0.928 | PPI use | Facilitates early detection and targeted application of prophylaxis interventions | Single-center; no external validation |
| Li et al. | Prediction of SAP at 7 d | XGBoost | 5-fold cross-validation | 3,160 | EHR data | AUC 0.841 | Age, baseline NIHSS, FBG, sex, premorbid mRS score, history of AF | Facilitates early detection and targeted application of prophylaxis interventions | Single-center; no external validation |
| Wang et al. | Predicting functional outcome (mRS) at 1 and 6 months | RF | 10-fold cross-validation | 333 | Demographics, labs, CT brain | 1-month outcome: AUC 0.899 | 1-month outcome: 26 attributes | ML prediction of functional outcome after ICH is feasible, and the RF model provided the best performance | Small sample size; excluded large hematomas; did not evaluate hematoma or edema expansion; no external validation |
| **Functional outcome** | | | | | | | | | |
| Heo et al. | Prediction of mRS score (0-2 vs. 3-6) at 90 d | Deep neural network | 67% training / 33% testing | 2,604 | EHR data | AUC 0.888 | - | Informs patient expectations, facilitates GOC decision making | Single-center; no external validation |
| Lin et al. | Prediction of mRS score (0-2 vs. 3-6) at 90 d | SVM | 10-fold cross-validation | 35,798 | Registry data | F1-score 87.9 ± 0.2% (92.9 ± 0.1% with follow-up data) | mRS score at 30 d, degree of dependence in toilet use | Informs patient expectations, facilitates GOC decision making | More severe strokes accounted for most prediction errors |
| Brugnara et al. | Prediction of mRS score (0-2 vs. 3-6) at 90 d | Gradient boosting classifiers | Not specified | 246 | Clinical data; radiological data (CT, CTA, CTP, and angiographic images) | AUC 0.856 | NIHSS score at 24 h, premorbid mRS score, final infarct volume on CT | Informs patient expectations, facilitates GOC decision making | Single-center; no external validation; retrospective |
| Forkert et al. | Prediction of mRS score at 90 d | SVM (extended problem-specific model) | Leave-one-out cross-validation | 68 | Clinical data, MRI images | mRS score ± 1: 82.4% accuracy | L-hemisphere strokes: lesion-based | Informs patient expectations, facilitates GOC decision making | No external validation; retrospective |
| Monteiro et al. | Prediction of mRS score (0-2 vs. 3-6) at 90 d | RF | 10-fold cross-validation | 425 | Clinical data, CT or MRI images | AUC 0.936 ± 0.34 | Baseline NIHSS score, baseline NIHSS sub-score 2 (best gaze, horizontal EOMs) | Informs patient expectations, facilitates GOC decision making | Single-center; no external validation; retrospective; performed worse than the non-imaging model |
| Jang et al. | Prediction of mRS score (>1 vs. >2) at 90 d | XGBoost | 3-fold cross-validation with a random search strategy | 6,731 | Registry data | mRS >1: AUC 0.84 | - | Informs patient expectations, facilitates GOC decision making | Treatment-related factors not included; no external validation |
| Hope et al. | Prediction of speech production scores | GPR | Leave-one-out cross-validation | 270 | Clinical data, assessments, MRI images | - | Time post-stroke, lesion site | Informs patient expectations | Post-stroke imaging obtained over a wide range of times (<1 month to >30 y); no external validation; retrospective |
| Lopes et al. | Prediction of cognitive functions at 3 y after minor stroke | Ridge regression | 3-step nested leave-one-out cross-validation (inner, middle, and outer loops) | 72 | Clinical data, assessments, functional MRI images | - | - | Informs patient expectations | Limited generalizability (mean admission NIHSS 1.5 ± 2.2); retrospective |
| Sale et al. | Prediction of change in BI and FIM scores during inpatient rehab | SVM | Nested 5-fold cross-validation | 55 | Clinical biomarker data, assessments | Discharge cognitive FIM score: MADP 17.55%, RMSE 4.28 | Cognitive FIM score on admission | Informs patient expectations, facilitates GOC decision making | Small sample size; included hemorrhagic stroke patients |
| Iwamoto et al. | Prediction of ADL dependence after inpatient rehab | CART method | Not specified | 994 | Clinical data, assessments | AUC 0.83 | FIM transfer score (≤4 or >4) | Informs patient expectations, facilitates GOC decision making | Single-center; retrospective |
| Lin et al. | Prediction of BI score (<60, 60-90, >90) on discharge from inpatient rehab | LR, RF | 5-fold cross-validation | 313 | Clinical data, assessments | LR: AUC 0.796; RF: AUC 0.792 | BI, IADL, and BBT scores on admission | Informs patient expectations, facilitates GOC decision making | Limited generalizability due to aggressive rehab strategy; no external validation |
| Tozlu et al. | Prediction of post-intervention UE motor impairment in chronic stroke | Elastic net | Nested 10-fold cross-validation with outer and inner loops | 102 | Clinical data, assessments | Median | Pre-intervention UE-FMA, difference in MT between affected and unaffected hemispheres | Informs patient expectations, increases rehabilitation efficiency | Retrospective; no external validation |
| Stinear et al. | Predicts potential for UE recovery | Cluster analyses | Not applicable | 40 | Clinical assessments ± neurophysiological assessments and MRI images | Partial η² 0.811 | - | Informs patient expectations, increases rehabilitation efficiency | Small sample size; single-center; no external validation |
ADL, Activities of daily living; AE, Auto-encoders; AF, Atrial fibrillation; AIS, Acute ischemic stroke; ANN, Artificial neural network; AUC, area under the receiver operating characteristic curve; BBB, blood-brain barrier; BBT, Berg balance test; BI, Barthel Index; CART, Classification and regression tree; CNN, convolutional neural network; CSF, cerebrospinal fluid; CST, Corticospinal tract; CT, computed tomography; CTA, Computed tomography angiography; CTP, Computed tomography perfusion; CXR, Chest radiograph; D, days; DCI, delayed cerebral ischemia; DTI, Diffusion tensor imaging; DWI, diffusion-weighted imaging; EDH, epidural hematoma; EEG, electroencephalogram; EHR, electronic health record; EOMs, Extra-ocular movements; EVT, endovascular treatment; FBG, Fasting blood glucose; FIM, Functional independence measure; GAM, generalized additive model; GBRT, gradient-boosted regression tree; GLM, generalized linear model; GOC, Goals of care; GRU, gated recurrent unit; GPR, Gaussian process regression; H, hours; HT, hemorrhagic transformation; IADL, Instrumental activities of daily living scale; ICH, Intracerebral hemorrhage; IVH, intraventricular hemorrhage; KNN, K-nearest neighbors; L, Left; LASSO, Least absolute shrinkage and selection operator regression; LR, logistic regression; MADP, Mean absolute percentage deviation; MCA, middle cerebral artery; MCP, matricellular proteins; Min, minutes; MLP, multilayer perceptron; MRC, Medical Research Council; MRI, magnetic resonance imaging; mRS, modified Rankin Scale; MT, motor threshold; NB, naïve Bayes; NIHSS, National Institutes of Health Stroke Scale; PHE, perihematomal edema; PPI, Proton pump inhibitor; RF, Random forest; RFNN, Structured receptive field neural network; RMSE, Root mean square error; ROI, region of interest; Rt, Right; SAP, Stroke-associated pneumonia; sICH, symptomatic intracranial hemorrhage; SMOTE, synthetic minority oversampling technique; SMR, stepwise multilinear regression; SR-KDA, Kernel spectral regression for discriminant analysis; SVM, support vector machine; SVM-P, support vector machine with polynomial kernel; SVM-R, support vector machine with radial basis function kernel; UE, Upper extremity; UE-FMA, Upper extremity Fugl-Meyer Assessment; XGBoost, Extreme gradient boosting; Yr, year.
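Several of the performance measures defined above and reported throughout the tables (Dice score, sensitivity, specificity, Youden index) have simple closed forms. The following is a minimal sketch computing them from binary labels; the example vectors are invented for illustration only.

```python
# Minimal sketch of metrics reported in the tables above.
# Dice = 2*TP / (2*TP + FP + FN)   (overlap of predicted vs. true masks)
# Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP)
# Youden index J = Sensitivity + Specificity - 1

def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) counts for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def dice(y_true, y_pred):
    tp, _, fp, fn = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def youden(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return tp / (tp + fn) + tn / (tn + fp) - 1

# Invented example: 1 = lesion voxel / positive case, 0 = background.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred  = [1, 1, 1, 0, 1, 0, 0, 0]
print(round(dice(truth, pred), 3))    # prints 0.75
print(round(youden(truth, pred), 3))  # prints 0.5
```

AUC differs from these in that it sweeps a decision threshold over continuous model scores rather than scoring one fixed binary prediction, which is why it is the preferred summary for probabilistic outcome models.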
NB: List of ML terms with definitions is provided in .
Many of the listed studies evaluate a variety of machine learning (ML) approaches. The approach listed in the table is the one with the optimal result from each individual study.
Phenotype based on Oxfordshire Community Stroke Project (OCSP) (total anterior circulation infarcts, lacunar infarcts, partial anterior circulation infarcts, posterior circulation infarcts).
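Most studies in the tables validate with k-fold or leave-one-out cross-validation: the data are partitioned into k folds, and each fold serves once as the test set while the remainder trains the model. The sketch below only generates the index partitions (no model is fitted) and is purely illustrative.

```python
# Minimal k-fold cross-validation sketch (the validation scheme most
# studies in the tables use). Each sample appears in exactly one test
# fold; leave-one-out is the special case k == n.

def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n samples."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
print(len(folds))        # prints 5
print(folds[0][1])       # first test fold: prints [0, 1]
# Every index is tested exactly once across the folds:
tested = sorted(i for _, test in folds for i in test)
print(tested == list(range(10)))  # prints True
```

Reported performance (e.g., "10-fold cross-validation, AUC 0.928") is then the average of the metric computed on each held-out fold, which gives a less optimistic estimate than evaluating on the training data itself.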