Xinrui Wang, Yiming Fan, Nan Zhang, Jing Li, Yang Duan, Benqiang Yang.
Abstract
Machine learning (ML) has been proposed for lesion segmentation in acute ischemic stroke (AIS). This study aimed to provide a systematic review and meta-analysis of the overall performance of current ML algorithms for final infarct prediction from baseline imaging. We conducted a comprehensive literature search for eligible studies that developed ML models for core infarcted tissue estimation on admission CT or MRI in AIS patients. Eleven studies meeting the inclusion criteria were included in the quantitative analysis. Study characteristics, model methodology, and predictive performance of the included studies were extracted. A meta-analysis of the Dice similarity coefficient (DSC) was conducted using a random-effects model to assess the overall predictive performance. Study heterogeneity was assessed with Cochran's Q and Higgins' I² tests. The pooled DSC of the included ML models was 0.50 (95% CI 0.39-0.61), with high heterogeneity across studies (I² = 96.5%, p < 0.001). Sensitivity analyses using the one-study-removed method showed that the adjusted overall DSC ranged from 0.47 to 0.52. Subgroup analyses indicated that deep learning (DL)-based models outperformed conventional ML classifiers, with the best performance observed for DL algorithms combined with CT data. Despite the presence of heterogeneity, current ML-based approaches for final infarct prediction showed moderate but promising performance. Before such models are integrated into the clinical stroke workflow, future investigations should train ML models on large-scale, multi-vendor data, validate them on external cohorts, and adopt formalized reporting standards to improve model accuracy and robustness.
Keywords: computed tomography; deep learning; ischemic stroke; machine learning; magnetic resonance imaging; meta-analysis
Year: 2022 PMID: 35873778 PMCID: PMC9305175 DOI: 10.3389/fneur.2022.910259
Source DB: PubMed Journal: Front Neurol ISSN: 1664-2295 Impact factor: 4.086
Figure 1. Flow diagram of literature review and study selection process.
Study characteristics, model methodology, and predictive performance of the included studies.
| Study | Data source | | No. of patients | Training set | Test set | | |
|---|---|---|---|---|---|---|---|
| Gottrup et al. ( | Single center | N | 14 | Leave-one-out cross validation | N | NR | |
| | Single center | Y | 61 | 25 | 36 | N | N |
| Livne et al. ( | Multicenter (I-KNOW study and the Ischemic Preconditioning trial) | Y | 195 | ≈156 | ≈39 | N | Y |
| Nielsen et al. ( | Multicenter (I-KNOW and remote ischemic preconditioning studies) | Y | 222 | 187 | 35 | N | Y |
| Pinto et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| Winzeck et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| Clèrigues et al. ( | Multicenter (ISLES 2018 dataset) | Y | 103 | 63 | 40 | N | Y |
| Ho et al. ( | Single center | Y | 48 | ≈43 | ≈5 | N | N |
| | Multicenter | Y | 103 | ≈82 | ≈21 | N | NR |
| Pérez Malla et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| Robben et al. ( | Multicenter (MR CLEAN study) | Y | 188 | ≈150 | ≈38 | N | NR |
| Winder et al. ( | Single center | Y | 90 | Leave-one-out cross validation | N | N | |
| Grosser et al. ( | Multicenter | Y | 99 | Leave-one-out cross validation | N | NR | |
| Grosser et al. ( | Multicenter | Y | 99 | Leave-one-out cross validation | N | NR | |
| Hu et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| | Single center | Y | 92 (unsuccessful recanalization 36, successful recanalization 56) | 53 | 39 | N | N |
| Kumar et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| Qiu et al. ( | Single center | Y | 257 | 157 | 100 | N | N |
| Wang et al. ( | Multicenter (ISLES 2018 dataset) | Y | 103 | 63 | 40 | N | Y |
| Yu et al. ( | Multicenter (ICAS and DEFUSE-2 studies) | Y | 182 | ≈146 | ≈36 | N | NR |
| | Single center | Y | 394 | ≈358 | ≈36 | N | N |
| | Multicenter (HIBISCUS-STROKE and I-KNOW cohorts) | Y | 109 (reperfused 74, non-reperfused 35) | Reperfused ≈69, non-reperfused ≈28 | Reperfused ≈15, non-reperfused ≈7 | N | NR |
| Hakim et al. ( | Multicenter (ISLES 2018 dataset) | Y | 103 | 63 | 40 | N | Y |
| Hokkinen et al. ( | Single center | Y | 83 | NR | NR | N | N |
| Hokkinen et al. ( | Single center | Y | 89 | None | 89 | N | N |
| Klug et al. ( | Single center | Y | 144 (intravenous thrombolysis [IVT] 80, endovascular thrombectomy [EVT] 64) | ≈115 | ≈29 | N | N |
| | Multicenter (Prove-IT study and HERMES collaboration) | Y | 205 | 68 | 137 | Y | NR |
| Modrau et al. ( | Multicenter (TEA-Stroke Trial) | Y | 52 (theophylline 27, control group 25) | NR | NR | N | NR |
| Pinto et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| Qiu et al. ( | Multicenter (Prove-IT study) | Y | 196 | 170 | 26 | N | NR |
| | Multicenter (ISLES 2018 dataset) | Y | 103 | 63 | 40 | N | Y |
| Vupputuri et al. ( | Multicenter (ISLES 2017 dataset) | Y | 75 | 43 | 32 | N | N |
| | Multicenter (ICAS, DEFUSE and DEFUSE-2 studies) | Y | 185 | 118 | 67 | N | NR |
| He et al. ( | Single center | Y | 70 | 59 | 11 | N | N |
| | Single center | Y | 261 | ≈209 | ≈52 | N | NR |
| Shi et al. ( | Multicenter (ISLES 2018 dataset) | Y | 103 | 63 | 40 | N | Y |
| | Multicenter | N | 89 | ≈71 | ≈18 | N | N |
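| Study | ML model | Input data | Reference standard (ground truth) | DSC | Other performance metrics | | |
|---|---|---|---|---|---|---|---|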
| Gottrup et al. ( | k-nearest neighbor classification | MR-CBF, CBV, MTT, DWI, ADC, T2WI | Infarct lesions manually segmented on follow-up T2WI 5 days or later | NR | AUC: 0.814 ± 0.001 | ||
| | Random forest classifier, including segmentation and predictive classifiers | Features extracted from MR-T1 contrast, T2WI, ADC, CBF, CBV, TTP, Tmax | Final infarct lesions manually segmented on follow-up T2WI at 90 days by 2 radiologists | 0.34 ± 0.22 | AUC: 0.94 ± 0.08 | ||
| Livne et al. ( | Extreme gradient boosting (XGBoost) | MR-DWI, T2-FLAIR, and TTP derived from the concentration curve; CBF, MTT and Tmax using oscillatory singular value decomposition deconvolution; CBF, CBV, MTT, Tmax, relative transit time heterogeneity and capillary transit time heterogeneity using a statistical approach | Final infarct lesions semi-automatically segmented on follow-up T2-FLAIR | NR | AUC: 0.92 | ||
| Nielsen et al. ( | Modified SegNet | MR-mean capillary transit time, CBV, CBF, cerebral metabolism of oxygen, relative transit time heterogeneity, delay, TRACE DWI, ADC, and T2-FLAIR | Infarct lesions manually segmented on follow-up T2-FLAIR at 30 days by 4 expert radiologists | NR | AUC: 0.88 ± 0.12 | ||
| Pinto et al. ( | Fully convolutional U-Net combined with a 2D gated recurrent unit layer | MR-ADC, rCBF, rCBV, MTT, TTP, Tmax and clinical information (TICI score) | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.29 ± 0.22 | Precision: 0.26 ± 0.23 | ||
| Winzeck et al. ( | Multiscale U-Net architecture trained with negative Dice score | MR-ADC, rCBF, rCBV, MTT, Tmax, TTP, raw PWI and clinical information (time since stroke, time to treatment, TICI and mRS scores) | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.31 ± 0.23 | Sensitivity: 0.45 ± 0.31 | ||
| Clèrigues et al. ( | 2D asymmetrical residual encoder–decoder CNN by using a more regularized network training procedure, symmetric modality augmentation and uncertainty filtering | CT-raw CTP series and CBF, CBV, MTT, Tmax | Infarct core manually segmented by a single investigator and then subjected to group review until acceptance | 0.547 ± 0.242 | Sensitivity: 0.609 ± 0.250 | ||
| Ho et al. ( | Unit CNN-contralateral model including modified input patches (patches of interest paired with contralateral patches), convolutional layer architecture and unit temporal filter learning | MR-PWI source image | Infarct lesions semi-automatically segmented on follow-up FLAIR at 3–7 days by a radiologist | NR | AUC: 0.871 ± 0.024 | ||
| | Feed-forward ANN | CT-rCBF, CBV, MTT, and Tmax | Acute infarct lesions segmented on follow-up DWI at a median time delay of 40.5 min | 0.48 (IQR 0.23–0.70) | AUC: 0.85 | ||
| Pérez Malla et al. ( | DeepMedic model with PReLU activation using transfer learning, data augmentation and binary morphological post-processing operations | MR-ADC, MTT, and rCBF | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.34 | - | ||
| Robben et al. ( | Fully convolutional network with PReLU activation | CT-native CTP, downsampled CTP, arterial input function and clinical data-time between stroke onset and imaging, time between imaging and the end of the mechanical thrombectomy, mTICI score and persistence of occlusion at 24 h | Infarct lesions semi-automatically segmented on follow-up NCCT at 1–5 days by an experienced reader | 0.48 | Mean absolute volume error: 36.7 ml | ||
| Winder et al. ( | Random forest classifier | MR-ADC, distance to ischemic core, tissue type, anatomical location, CBV, MTT, Tmax, CBF and clinical data-NIHSS, age, sex, and time from symptom onset | Final infarct lesion manually segmented on FLAIR or DWI or NCCT at 5–7 days by an experienced medical expert | 0.447 ± 0.247 | - | ||
| Grosser et al. ( | Random forest classifier trained by local and global approaches | MR-ADC, CBF, CBV, MTT, Tmax | Infarct lesions manually segmented on follow-up FLAIR at 1–7 days by 2 neurologists in consensus | 0.353 ± 0.220 | AUC: 0.859 ± 0.089 Sensitivity: 0.415 ± 0.231 Specificity: 0.964 ± 0.034 | ||
| Grosser et al. ( | XGBoost | MR-ADC, CBF, CBV, MTT, Tmax and voxel-wise lesion probabilities | Infarct lesions manually segmented on follow-up FLAIR within 7 days by 2 neuroradiologists in consensus | 0.395 ± 0.229 | AUC: 0.888 ± 0.101 | ||
| Hu et al. ( | Brain SegNet: a 3D dense segmentation network based on ResNet and trained with data augmentation and Focal loss | MR-TTP, Tmax, rCBV, rCBF, MTT, ADC | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.30 ± 0.22 | Precision: 0.35 ± 0.27 Recall: 0.43 ± 0.27 | ||
| | Random forest classifier | Features derived from MR-ADC and rTTP: range, mean, median, min, max, standard deviation, skew, kurtosis, 10th percentile, 25th percentile, 75th percentile, and 90th percentile | Infarct lesions manually segmented on follow-up DWI at 7 days | 0.49 (IQR 0.37–0.59) | Unsuccessful recanalization: AUC: 0.746 ± 0.048; mean volume error: −32.5 ml. Successful recanalization: AUC: 0.764 ± 0.127; mean volume error: 3.5 ml | ||
| Kumar et al. ( | Classifier-Segmenter network, using a hybrid training strategy with a self-similar (fractal) U-Net model | MR-DWI, ADC, CBV, CBF, MTT, TTP, Tmax | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.28 ± 0.22 | Precision: 0.37 ± 0.29 Recall: 0.45 ± 0.34 | ||
| | Two-branch Restricted Boltzmann Machine that provides lesion and hemodynamics features from parametric MRI maps, which are then combined with the parametric MRI maps and fed to a U-Net using NReLU activation | MR-ADC, MTT, TTP, rCBF and rCBV | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.38 ± 0.22 | Precision: 0.41 ± 0.26 Recall: 0.53 ± 0.29 | ||
| Qiu et al. ( | Random forest classifier | Features derived from NCCT: Hounsfield units, bilateral density difference, hypoattenuation measurement, distance feature, atlas-encoded lesion location feature | Early infarct lesions manually segmented on follow-up DWI within 1 h | NR | Mean volume error: 11 ml | ||
| Wang et al. ( | CNN model with a feature extractor, a pseudo-DWI generator and a final lesion segmenter using hybrid loss function | CT-CBF, CBV, MTT, Tmax and synthesized pseudo-DWI | Infarct core manually segmented by a single investigator and then subjected to group review until acceptance | 0.54 ± 0.21 | Precision: 51.20 ± 22.00 Recall: 64.20 ± 23.99 | ||
| Yu et al. ( | 2.5D attention-gated U-Net using mixed loss functions | MR-DWI, ADC, Tmax, MTT, CBF, CBV | Final infarct lesions manually segmented on follow-up T2-FLAIR at 3–7 days by a neuroradiologist | 0.53 (IQR 0.31–0.68) | AUC: 0.92 (IQR 0.87–0.96) Mean volume error: 9 ml (IQR −14 to 29 ml) | ||
| | Gradient Boosting | MR-DWI, ADC, Tmax, MRR, CBF, CBV | Infarct lesions manually segmented on follow-up DWI around 24 h by a neuroradiologist | 0.53 (IQR 0.29–0.68) | AUC: 0.98 (IQR 0.95–0.99) | ||
| | U-Net with multi-class Dice loss functions | MR-DWI, ADC, Tmax, CBF, CBV | Final infarct lesions semi-automatically segmented on follow-up T2-FLAIR at 6 or 30 days using an intensity-based thresholding method | Reperfused: 0.44 ± 0.25 | Reperfused: AUC: 0.87 ± 0.13 | ||
| Hakim et al. ( | 3D multi-scale U-shape network with atrous convolution | CT-CTP source data, CBF, CBV, MTT, Tmax | Infarct core manually segmented by a single investigator and then subjected to group review until acceptance | 0.51 ± 0.31 | Mean absolute volume error: 10.24 ± 9.94 ml | ||
| Hokkinen et al. ( | 3D CNN | CT-CTA source image | Infarct lesions manually segmented on follow-up CT with median time interval of 36 h | NR | Mean volume error: −16.3 ml | ||
| Hokkinen et al. ( | 3D CNN | CT-CTA source image | Infarct lesions manually segmented on follow-up CT or DWI within 5 days by a radiologist | NR | Mean volume error: 13.9 ± 12.5 ml | ||
| Klug et al. ( | General linear regression model | CT-MTT, Tmax, CBF and CBV and multi-perfusion parameter analysis | Final infarct lesions segmented on T2-FLAIR within 10 days by 2 neuroradiologists | 0.155 | AUC: 0.89 | ||
| | Random forest classifier | CT-average map, Tmax, CBF, CBV and clinical data (onset-to-imaging time, imaging-to-reperfusion time) | Prove-IT study: infarct lesions manually segmented on follow-up DWI or NCCT by 2 experts in consensus; HERMES collaboration: infarct lesions automatically segmented followed by manual corrections | 0.388 (IQR 0.192–0.541) | AUC: 0.81 ± 0.11 | ||
| Modrau et al. ( | Random forest classifier | MR-ADC, CBF, CBV, MTT, Tmax, tissue type probability, anatomical location, distance to the ischemic core and clinical data-age, sex, baseline NIHSS, time of stroke onset to medical application | Infarct lesions manually segmented on follow-up T2-FLAIR at 24 h | Theophylline subgroup: 0.40 ± 0.249 | |||
| Pinto et al. ( | 2D U-Net with a data-driven branch computing spatio-temporal features from DSC-MRI | MR-DSC-MRI spatio-temporal information, Tmax, TTP, MTT, rCBV, rCBF, ADC | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.31 ± 0.21 | Precision: 0.29 ± 0.23 | ||
| Qiu et al. ( | Random forest classifier | Features derived from multi-phase CTA: average and standard deviation of HUs across 3-phase CTA images, coefficient of variance of HUs in 3-phase CTA images, changing slopes of HUs between any two phases, peak of HUs in 3-phase CTA images, time of peak HU | Infarct lesions manually segmented on follow-up DWI/NCCT at 24/36 h by 2 radiologists | 0.247 (IQR 0.138–0.304) | Mean volume error: 21.7 ml | ||
| | MultiRes U-Net | CT-CBF, CBV, MTT, Tmax, contrast map, Tmax heatmap | Infarct core manually segmented by a single investigator and then subjected to group review until acceptance | 0.68 ± 0.26 | Sensitivity: 0.68 ± 0.15 Mean absolute volume error: 22.62 ± 7.3 ml | ||
| Vupputuri et al. ( | MCN-DN: Multi-path convolution leveraged attention deep network with LReLU | MR-ADC, CBF, CBV, MTT, TTP | Final infarct lesions manually segmented on follow-up T2WI at 90 days by a neuroradiologist | 0.47 | Sensitivity: 0.867 | ||
| | Attention-gated U-Net with mixed loss functions | MR-DWI, ADC, Tmax, MTT, CBV, CBF and masks of Tmax (>6 s) and ADC (620 × 10⁻⁶ mm²/s) | iCAS and DEFUSE-2 studies: final infarct lesions segmented on T2-FLAIR at 3–7 days; DEFUSE study: final infarct lesions segmented on T2-FLAIR at 30 days | 0.57 (IQR 0.30–0.69) | AUC: 0.94 (IQR 0.89–0.97) | ||
| He et al. ( | 2D U-Net with binary focal loss and Jaccard loss combined functions | CT-CBF, CBV, MTT, Tmax | Infarct lesions manually segmented on follow-up DWI/SWI or NCCT | 0.61 | AUC: 0.92 | ||
| | R2U-RNet with residual refinement unit (RRU) activation and multiscale focal loss functions | CT-NCCT with intensity normalization and histogram equalization | Infarct lesion manually segmented on follow-up DWI within 7 days by a radiologist | 0.54 ± 0.29 | - | ||
| Shi et al. ( | C2MA-Net: a cross-modal cross-attention network | CT-CBF, CBV, MTT, Tmax | Infarct core manually segmented by a single investigator and then subjected to group review until acceptance | 0.48 | Precision: 0.48 | ||
| | ISP-Net: a multi-scale atrous convolution with weighted cross entropy loss functions | CT-CTP source data, CBF, CBV, MTT, Tmax | Infarct lesions segmented on follow-up CT or DWI at 1–7 days | 0.801 ± 0.078 | AUC: 0.721 ± 0.108 | ||
Y, yes; N, no; NR, not reported; PReLU, parametric rectified linear unit; NReLU, noisy rectified linear unit; LReLU, leaky rectified linear unit; CTA, CT angiography; CTP, CT perfusion; DWI, diffusion-weighted imaging; ADC, apparent diffusion coefficient; CBF, cerebral blood flow; CBV, cerebral blood volume; MTT, mean transit time; TTP, time to peak; Tmax, time to maximum of the residue function; NIHSS, National Institutes of Health Stroke Scale; mTICI, modified thrombolysis in cerebral infarction; mRS, modified Rankin scale; DSC, Dice similarity coefficient; AUC, area under the receiver operating characteristic curve.
Studies included in the meta-analysis were presented in bold font.
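The DSC values reported above measure the overlap between a predicted infarct mask and the reference segmentation, DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch of that computation follows; the function name, the flat 0/1 mask representation, the toy voxel values, and the convention for two empty masks are all assumptions made for illustration, not details taken from the included studies.

```python
def dice_similarity(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as infarct in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of infarct voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy eight-voxel "volumes" (illustrative values only).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
follow_up = [0, 0, 1, 1, 1, 0, 1, 0]
score = dice_similarity(predicted, follow_up)
print(score)  # 0.75: three shared voxels, four infarct voxels in each mask
```

In practice the masks would be 3D arrays flattened voxel by voxel, and a per-patient DSC would be averaged over the test set to give the mean ± SD figures of the kind listed in the table.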
Methodological quality assessment of the included studies.
| Quality assessment item | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Are all three image sets (training, validation, and test sets) defined? | N | Y | N | N | Y | N | N | N | Y | N | N |
| Is an external test set used for final statistical reporting? | N | N | N | N | N | N | Y | N | N | N | N |
| Have multivendor images been used to evaluate the AI algorithm? | N | U | N | N | N | U | U | Y | U | U | N |
| Are the sizes of the training, validation and test sets justified? | U | U | U | U | U | U | U | U | U | U | U |
| Was the AI algorithm trained using a standard of reference that is widely accepted in our field? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Was preparation of images for the AI algorithm adequately described? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Were the results of the AI algorithm compared with radiology experts and/or pathology? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Was the manner in which the AI algorithm makes decisions demonstrated? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Is the AI algorithm publicly available? | N | N | N | Y | Y | N | N | N | N | N | N |
Y, yes; N, no; U, unknown.
Figure 2. Forest plot of the included studies that assessed the performance of infarct tissue outcome prediction. The forest plot shows that the Dice similarity coefficient (DSC), representing the performance of the machine learning-based approaches for final infarct prediction, centers around 0.50 with a 95% confidence interval (CI) ranging from 0.39 to 0.61.
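The pooled estimate and heterogeneity statistics summarized in this figure come from a random-effects model. The sketch below shows one common way to obtain them, the DerSimonian-Laird estimator; the source does not state which estimator or software was used, so that choice is an assumption, as are the function name and the input values, which are illustrative and not the data of the included studies.

```python
import math


def random_effects_pool(means, ses):
    """DerSimonian-Laird pooling; returns (pooled, ci_low, ci_high, Q, I2 in percent)."""
    k = len(means)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    # Higgins' I2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * m for wi, m in zip(w_re, means)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, q, i2


# Illustrative inputs only: per-study mean DSC, SD, and test-set size.
means = [0.30, 0.45, 0.55, 0.68]
sds = [0.22, 0.25, 0.24, 0.26]
ns = [32, 90, 40, 40]
ses = [sd / math.sqrt(n) for sd, n in zip(sds, ns)]  # standard error of each mean

pooled, ci_low, ci_high, q, i2 = random_effects_pool(means, ses)
print(f"pooled DSC = {pooled:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), "
      f"Q = {q:.1f}, I2 = {i2:.1f}%")
```

When Q exceeds its degrees of freedom, the between-study variance is positive and the random-effects weights become more equal across studies, which widens the confidence interval relative to a fixed-effect analysis.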
Figure 3. Sensitivity analysis for the overall predictive performance using the one-study-removed method.
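The one-study-removed method re-runs the pooling k times, each time leaving out a single study, and reports the range of the resulting estimates. A minimal sketch is given below, assuming a DerSimonian-Laird random-effects estimator (the source does not name the estimator); the helper names and the input values are hypothetical and purely illustrative.

```python
def dl_pool(means, ses):
    """Pooled mean under a DerSimonian-Laird random-effects model."""
    w = [1.0 / s ** 2 for s in ses]  # inverse-variance weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(means) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]  # random-effects weights
    return sum(wi * m for wi, m in zip(w_re, means)) / sum(w_re)


def leave_one_out(means, ses):
    """Re-pool k times, omitting one study each time; returns k pooled estimates."""
    return [
        dl_pool(means[:i] + means[i + 1:], ses[:i] + ses[i + 1:])
        for i in range(len(means))
    ]


# Illustrative per-study mean DSC values and standard errors (not the paper's data).
means = [0.30, 0.45, 0.55, 0.68, 0.48]
ses = [0.039, 0.026, 0.038, 0.041, 0.050]

estimates = leave_one_out(means, ses)
print(f"all studies: {dl_pool(means, ses):.2f}")
print(f"one-study-removed range: {min(estimates):.2f} to {max(estimates):.2f}")
```

A narrow range of leave-one-out estimates suggests that no single study dominates the pooled result.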
Figure 4. Funnel plot of the included studies. The effect size (mean Dice similarity coefficient, DSC) is displayed on the horizontal axis and the standard error on the vertical axis.
Figure 5. Forest plot of subgroup analyses in conventional machine learning (ML) classifiers using MR data (A) and CT data (B) as model input, and deep learning models using MR data (C) and CT data (D) as model input, respectively.