Devon Watts, Rafaela Fernandes Pulice, Jim Reilly, Andre R Brunoni, Flávio Kapczinski, Ives Cavalcante Passos.
Abstract
Selecting a course of treatment in psychiatry remains a trial-and-error process, and this long-standing clinical challenge has prompted an increased focus on predictive models of treatment response using machine learning techniques. Electroencephalography (EEG) represents a cost-effective and scalable potential measure to predict treatment response in major depressive disorder (MDD). We performed separate meta-analyses to determine the ability of models to distinguish between responders and non-responders using EEG across treatments, as well as subgroup analyses of response to repetitive transcranial magnetic stimulation (rTMS) and antidepressants (Registration Number: CRD42021257477), by searching PubMed, Scopus, and Web of Science for articles published between January 1960 and February 2022. We included 15 studies that predicted treatment responses among patients with MDD using machine-learning techniques. Within a random-effects model with a restricted maximum likelihood estimator comprising 758 patients, the pooled accuracy across studies was 83.93% (95% CI: 78.90-89.29), with an Area-Under-the-Curve (AUC) of 0.850 (95% CI: 0.747-0.890), and a partial AUC (pAUC) of 0.779. The average sensitivity and specificity across models were 77.96% (95% CI: 60.05-88.70) and 84.60% (95% CI: 67.89-92.39), respectively. In a subgroup analysis, greater performance was observed in predicting response to rTMS (pooled accuracy: 85.70% (95% CI: 77.45-94.83), AUC: 0.928, pAUC: 0.844), relative to antidepressants (pooled accuracy: 81.41% (95% CI: 77.45-94.83), AUC: 0.895, pAUC: 0.821). Furthermore, across all meta-analyses, the specificity (true negative rate) of EEG models was greater than the sensitivity (true positive rate), suggesting that EEG models thus far better identify non-responders than responders to treatment in MDD.
Studies varied widely in important features across models, although relevant features included absolute and relative power in frontal and temporal electrodes, measures of connectivity, and asymmetry across hemispheres. Predictive models of treatment response using EEG hold promise in major depressive disorder, although there is a need for prospective model validation in independent datasets, and a greater emphasis on replicating physiological markers. Crucially, standardization in cut-off values and clinical scales for defining clinical response and non-response will aid in the reproducibility of findings and the clinical utility of predictive models. Furthermore, several models thus far have used data from open-label trials with small sample sizes and evaluated performance in the absence of training and testing sets, which increases the risk of statistical overfitting. Large consortium studies are required to establish predictive signatures of treatment response using EEG, and to better elucidate the replicability of specific markers. Additionally, it is speculated that greater performance was observed in rTMS models because EEG assesses neural networks more likely to be directly targeted by rTMS, comprising electrical activity primarily near the surface of the cortex. Prospectively, there is a need for models that examine the comparative effectiveness of multiple treatments across the same patients. However, this will require thoughtful consideration of cumulative treatment effects, and of whether washout periods between treatments should be utilised. Regardless, longitudinal cross-over trials comparing multiple treatments across the same group of patients will be an important prerequisite step to both facilitate precision psychiatry and identify generalizable physiological predictors of response between and across treatment options.
Year: 2022 PMID: 35961967 PMCID: PMC9374666 DOI: 10.1038/s41398-022-02064-z
Source DB: PubMed Journal: Transl Psychiatry ISSN: 2158-3188 Impact factor: 7.989
Machine learning studies predicting treatment response using EEG in major depressive disorder (a summary of sample size, treatment outcomes, machine learning algorithms, and performance metrics).
| First author, year | Sample size and diagnosis [ | Intervention | Outcome | Machine learning model | Accuracy | Other measures |
|---|---|---|---|---|---|---|
| Bailey [ | 39 patients with treatment-resistant depression | 3 weeks (15 sessions) unilateral left 10 Hz rTMS | Responders (≥50% decrease in HAM-D after 5–8 weeks of rTMS) vs. Non-responders | Linear SVM | 91% | Sensitivity: 91% Specificity: 92% F1 score: 0.93 |
| Bailey [ | 32 patients with treatment-resistant depression | 3 weeks (15 sessions) unilateral left 10 Hz rTMS | Responders (≥50% decrease in HAM-D after 5–8 weeks of rTMS) vs. Non-responders | Linear SVM | 86.66% | Sensitivity: 84% Specificity: 89% |
| Corlier [ | 109 patients with MDD | 3 weeks (15 sessions) of 10 Hz left DLPFC rTMS (68 subjects received unilateral left treatment, 41 were changed to sequential bilateral treatment—10 Hz left DLPFC, 1 Hz right DLPFC) | Responders (≥40% decrease in IDS-30 scores from baseline to treatment 30) vs. Non-responders | Elastic Net | 61.8–79.2% (Best performance observed with alpha band frequency and IDS-30 percent change score) | AUC: 0.52–0.77 Specificity: 70.9–82.7% Sensitivity: 34.8–75.7% PPV: 58.2–79.7% NPV: 63.8–82.2% |
| Erguzel [ | 147 patients with treatment-resistant depression | 18 sessions of 25 Hz left PFC rTMS | Responders (≥50% decrease in HAM-D scores after 3 weeks of treatment) vs. Non-responders | BPNN | 89.12% | Sensitivity: 94.44% AUC: 0.904 |
| Erguzel [ | 55 patients with MDD | 18 sessions of 25 Hz left PFC rTMS | Responders (≥50% decrease in HAM-D scores after 3 weeks of treatment) vs. Non-responders | ANN | 89.09% | Sensitivity: 86.67–93.33% Specificity: 80–84% AUC: 0.686–0.909 Best model (6-fold CV) Sensitivity: 93.3% Specificity: 84.0% AUC: 0.909 |
| Erguzel [ | 147 patients with treatment-resistant depression | 20 sessions of adjunctive 25 Hz left PFC rTMS | Responders (≥50% decrease in HAM-D scores after 20 sessions of rTMS) vs. Non-responders | ANN SVM DT | Accuracy: 78.3–86.4% Best performance using SVM Balanced Accuracy: 54.71–75.42% | Sensitivity: 60.41–68.62% Specificity: 49.01–82.22% |
| Hasanzadeh [ | 46 patients with MDD | 5-sessions of 10 Hz left DLPFC rTMS | Responders (≥50% decrease in BDI-II or HAM-D scores from baseline) vs. Non-responders Remission (Remission defined as BDI ≤ 8 or HAM-D ≤ 9) vs. Non-remission | kNN | 76.1–91.3% best performance with power spectral features | Sensitivity: 69.6–87% Specificity: 82.6–95.7% |
| Cao [ | 37 patients with treatment-resistant depression | Patients randomized to one of three groups (1:1:1): 0.5 mg/kg ketamine 0.2 mg/kg ketamine Normal saline | Responders (≥45% reduction in HAM-D score from baseline to 240 min posttreatment) vs. Non-responders | LDA NMSC kNN PARZEN PERLC DRBMC SVM Radial kernel | 78.4% Best performance using SVM with a radial kernel | Sensitivity: 79.3% Specificity: 84.2% Recall: 78.5% Precision: 87.0% |
| Cook [ | 180 patients with MDD | 8-week trial of escitalopram (10 mg) or bupropion (150 mg) (1-week single-blind escitalopram followed by 7 weeks double-blind trial) | Remission (≤7 HDRS at week 8) vs. Non-remission | LDA | 64.4% | Sensitivity: 74.3% Specificity: 55.3% PPV: 60.5% NPV: 70.0% AUC: 0.635 |
| de la Salle [ | 47 patients with MDD | 12-week double-blinded trial of: (1) escitalopram+ bupropion (2) escitalopram+ placebo (3) bupropion+placebo | Responders (≥50% reduction in MADRS scores from baseline to posttreatment) vs. Non-responders Remitters (≤10 MADRS at post-treatment) vs. Non-responders | LR | Response: Change in PF Cordance: 81% Change in MRF Cordance: 74% Remission: Change in PF Cordance: 70% Change in MRF Cordance: 51% | Response (ΔPF): AUC: 0.85 Sensitivity: 70% Specificity: 85% PPV: 0.95 NPV: 0.76 Remission (ΔPF): AUC: 0.66 Sensitivity: 65% Specificity: 74% PPV: 65% NPV: 74% Response (ΔMRF): AUC: 0.80 Sensitivity: 70% Specificity: 95% PPV: 95% NPV: 76% Remission (ΔMRF): AUC: 0.59 Sensitivity: 93% Specificity: 31% PPV: 39% NPV: 91% |
| Jaworska [ | 51 patients with MDD | 12-week double-blinded trial of: (1) escitalopram+bupropion (2) escitalopram+placebo (3) bupropion+placebo | Responders (≥50% reduction in MADRS scores from baseline to posttreatment) vs. Non-responders | RF SVM AdaBoost CART MLP GNB | 88% Combined model, accuracy of each individual model not reported | AUC: 0.716-0.901 Highest AUC observed in Random Forest Model Combined model Sensitivity = 77% Specificity = 99% PPV = 99 NPV = 81 |
| Mumtaz [ | 34 patients with MDD | Open-label trial of an SSRI | Responders (Responders defined as ≥50% improvement in pre- vs. post-treatment BDI-II scores) vs. Non-responders | LR | 87.5% | Sensitivity: 95% Specificity: 80% |
| Rajpurkar [ | 518 patients with MDD | Patients randomized in a 1:1:1 ratio to escitalopram, sertraline, or extended-release venlafaxine for 8 weeks | Regression model (Continuous improvement in individual symptoms, defined as the difference in score for each of the symptoms on the HAM-D from baseline to week 8) | GBM | Best model observed using EEG and baseline symptom features | 95% CI: 0.473–0.639 Used C-index to assess performance (probability that the algorithm will correctly identify, given 2 random patients with different improvement levels, which patient showed greater improvement) |
| Wu [ | 309 patients with MDD | 8-week course of sertraline or placebo | Regression model (Pre- minus post-treatment difference in HAMD17 scores, with missing endpoint values, imputed to maintain an intent-to-treat framework.) | SELSER Algorithm developed in the current study | Sertraline Placebo | NA |
| Zhdanov [ | 122 patients with MDD | 8-weeks of open-label escitalopram (10–20 mg) treatment | Responders (≥50% improvement in MADRS scores from baseline to post-treatment) vs. Non-responders | SVM radial kernel | 79.2% Using baseline EEG data 82.4% Using baseline and week 2 EEG data | Baseline Model Sensitivity—67.3% Specificity—91.0% Baseline and Week 2 Model Sensitivity: 79.2% Specificity: 85.5% |
ANN artificial neural network, BDI Beck depression inventory, BPNN back-propagation neural networks, CART classification and regression trees, CNN convolutional neural network, DLPFC dorsolateral prefrontal cortex, DRBMC discriminative restricted Boltzmann machine, DT decision trees, ELM extreme learning machine, GBM gradient boosting machine, GNB Gaussian naive Bayes, HAM-D Hamilton depression rating scale, IDS-SR inventory of depressive symptomatology (self-report), kNN k-nearest neighbors, KPLSR kernelized partial least squares regression, LASSO least absolute shrinkage and selection operator, LDA linear discriminant analysis, LR logistic regression, MADRS Montgomery–Asberg depression rating scale, MFA mixture of factor analysis, MLP multi-layer perceptron, MRF middle right frontal, NMSC nearest mean classifier, PARZEN Parzen density estimation, PERLC perceptron classifier, RF random forest, SCZ schizophrenia, SELSER sparse EEG latent space regression, SVM support vector machine.
Extracted features across studies (a summary of pre-processing strategies, feature extraction methods, feature selection, and top predictors across studies).
| First author, year | Pre-processing strategy | EEG features | Feature extraction method | Feature selection method | Top features |
|---|---|---|---|---|---|
| Bailey [ | Data down-sampled to 1000 Hz Second order Butterworth filtering with bandpass from 1 to 80 Hz and a band-stop filter 47–53 Hz Fast ICA used to manually select and remove eye blinks, movements, and remaining muscle artifacts. | Power spectral analysis connectivity analysis | - Morlet Wavelet transform to calculate power in the upper alpha band (10–12.5 Hz), theta band (4–8 Hz), and gamma band (30–45 Hz) - Average power calculated across the entire retention period with each frequency band and averaged over trials - Hanning taper time–frequency transform to determine instantaneous phase values for complex Fourier-spectra from 4 to 45 Hz with a 1 Hz resolution across a 3-oscillation sliding time window - Weighted phase lagged index (wPLI) calculated between each electrode - wPLI provides a value between 0 and 1 for each electrode pair at each frequency and time point | Not applicable | - Greater theta power at Fz in responders vs. non-responders ( - No significant differences for alpha or gamma power, or theta-gamma coupling - Responders showed a non-significant pattern of less gamma connectivity than non-responders at baseline ( - Responders showed significantly more theta connectivity across baseline and week 1, with both interhemispheric fronto-parietal coupling and frontal and parietal interhemispheric coupling (overall |
| Bailey [ | Same Procedure as Bailey [ | Power spectral analysis Connectivity analysis Theta cordance analysis | - Absolute power values for each epoch 1–80 Hz underwent a multi-taper fast Fourier frequency transformation with a Hanning taper - Absolute power averaged across neighboring electrode pairs - Relative power in reattributed absolute theta band calculated by dividing power in theta band by total power from 1 to 80 Hz - Subtracted half-maximal values from normalized absolute and relative power in theta band, and summed together for each electrode - Individualized alpha peak frequency averaged across F3, Fz, and F4 electrodes - Multitaper fast Fourier frequency transformation - Gaussian distribution with least-squared error fitted to electrodes in 6–14 Hz range - Peaks of distribution selected from each electrode and averaged | Not applicable | - Greater theta connectivity in responders vs. non-responders ( - No main effect of theta cordance, frontal-midline theta power, or alpha power. |
| Corlier [ | ICA-based FASTER algorithm Dominant alpha frequency peak determined for each subject (highest spectral peak within 7-13 Hz alpha range) | EEG functional connectivity measures (coherence, envelope correlation, and alpha band frequency) | - Coherence: correlation of amplitude and phase - Envelope: correlation of amplitude - Alpha frequency band: similarity of the spectral waveform of the alpha band across regions | Elastic Net | Coherence & Envelope: Connections in the frontal to temporo-parietal nodes Alpha frequency band: Connections between the left frontal seeds (near stimulation site) and contralateral fronto-temporal locations EN models for coherence and envelope correlation showed a diffuse coupling pattern, while αSC showed a more focal connectivity. |
| Erguzel [ | Manually selected artifact-free EEG data with a minimum split-half reliability ratio of 0.95 and minimum test-retest reliability ratio of 0.90. FFT | EEG cordance (combines absolute and relative EEG power, and negative discordance values) | - Normalized power across electrode sites and frequency bands - Maximum absolute and relative power of each frequency band is calculated to derive normalized absolute and relative power - Half-maximal value is subtracted, absolute/relative normalized power is summed. | Genetic algorithm - adaptive heuristic search algorithm was applied to features of all selected channels to reduce the number of dimensions | Fp1, Fp2, F7, F8, and F3 in the theta frequency band |
| Erguzel [ | Band-pass filter with 0.15–30 Hz frequency FFT used to calculate absolute and relative power in each of two non-overlapping frequency bands (Delta—1–4 Hz, theta—4–8 Hz) | EEG cordance (combines absolute and relative EEG power, and negative discordance values) | - Normalized power across electrode sites and frequency bands - Maximum absolute and relative power of each frequency band is calculated to derive normalized absolute and relative power - Half-maximal value is subtracted, absolute/relative normalized power is summed. | ANN | NA |
| Erguzel [ | Band-pass filter with 0.15–30 Hz frequency Manually selected artifact-free EEG data (at least 2 min) FFT | EEG cordance (combines absolute and relative EEG power, and negative discordance values) | EEG cordance analyses follow the same procedure as Erguzel 2014 | Not applicable | |
| Hasanzadeh [ | Sampling frequency 500 Hz Bandpass FIR filter (1–42 Hz) ICA to remove noisy data MARA to label noisy ICs Visually inspected to eliminate remaining artifacts | 21 features in four categories (nonlinear, PSD, spectral, and cordance) | - LZC: Complexity measure of time series to estimate stochastic and chaotic behavior of time series - KFD: Algorithm for computing fractal dimension, a measure of self-similarity of a time series based on number of pattern repetitions - Delta (1–4 Hz)—Beta (12–30 Hz) by Welch method with a non-overlapped window, 500 samples in length - Average power computed for frequencies in each band - Method that quantifies the degree of phase coupling between components of a signal - measure of complexity of system based on chaos and time delay reconstruction theory | mRMR | - Nonlinear (LZC, KFD, CD)—80.4% accuracy - Power (D, T, A, B) - 91.3% accuracy - Spectrum (BispSL, Bisp2M, and BispEn in all bands)—84.8% accuracy - Cordance (Fr, Pre, Fr)—76.1% accuracy - All—87% accuracy |
| Cao [ | Real-time artifact removal algorithm based on CCA, feature extraction, and a GMM used to improve signal quality | Power spectral analysis EEG Alpha Asymmetry EEG Theta Cordance | - 256-point FFT using Welch’s method - 10 min spans of data with 256-point moving window at 128-point overlap - Absolute and relative power of four prefrontal channels from delta (1–3.5 Hz), theta (4–7.5 Hz), lower alpha (8–10 Hz) and upper alpha (10.5–12 Hz) bands. - mid-prefrontal (Fp1/Fp2) and mid-lateral (AF7/AF8) hemispheric asymmetry index to establish a relative measure of the difference in EEG (lower and upper) alpha power between the right and left forehead areas. - Combines information from both absolute and relative powers in the EEG theta band | 0.5 mg/kg dose - AF7 theta— - Fp2 theta— 0.2 mg/kg dose - Fp1 theta— - Fp2 theta— | |
| Cook [ | Artifact-free epochs selected following rejection of muscle, electrocardiographic, and drowsiness artifacts. | Power spectral analysis ATR Relative combined theta and alpha power | - Calculated using consecutive two-second epochs of eyes-closed rest, by averaging values calculated separately for each channel in each epoch - Non-linear weighted combination of relative combined theta and alpha power (3–12 Hz), alpha1 power (8.5–12 Hz) and alpha2 absolute power (9–11.5 Hz) | Relative combined theta and alpha power was scaled to a range from 0 to 100; a cut-off score of ≥46.2 was selected | NA |
| Jaworska [ | Bandpass filters 0.1–80 Hz 100 s of artifact-free data subjected to a FFT ln-transformed prior to analyses to ensure normality (Minimizes influence of extreme values) | eLORETA analysis Theta Cordance | - estimates neural activity as current density based on MNI-152 template, creating a low-resolution activation image - Values from prefrontal electrodes (Fp1, Fp2) at baseline and week 1 | Tree-based feature selection kernel PCA | eLORETA features were most important, comprising 17 delta, 20 theta, 14 alpha1, 20 alpha2, and 17 beta EEG features. Power at week 1 at T8 followed by power at Cp6 Baseline power at Fp2 and week 1 power at Fc2 Alpha1 Baseline power at F7/8 Alpha2 Baseline power at P8 and week 1 power at O1 Baseline power at T7 and week 21 power at Fz |
| Mumtaz [ | Bandpass filters 0.1–70 Hz EEG data collected during 5 min eyes open, and 5 min eyes closed - 3-stimulus visual Oddball task used 50 Hz notch filter used to suppress power line noise | Wavelet coefficients in the delta and theta frequency range | - involves a window function to capture both low and high-frequency components of the signal | Rank-based feature selection according to their relevance to class labels minimum redundancy and maximum relevance | C3—theta frequency F7—delta frequency F3—delta frequency F7—theta frequency T4—theta frequency F8—theta frequency F4—delta frequency Fz—delta frequency F4—delta frequency C4—delta frequency F8—theta frequency T4—delta frequency P3—theta frequency |
| Rajpurkar [ | Raw EEG signal was filtered using a band-pass filter with 0.15 - 30 Hz frequency prior to artifact removal FFT | Relative and absolute band power Frontal alpha asymmetry Occipital asymmetry Ratio of beta/alpha band power Ratio of theta/alpha band power | Frontal alpha asymmetry - difference in alpha bandpower between O2 and O1 Occipital beta asymmetry - difference in beta bandpower between O2 and O1 ratio of beta/alpha and theta/alpha band power - Calculated for each electrode Feature selection: Decision tree weight in LightGBM | Gradient boosted feature selection | 2. T7-T3 beta absolute ratio 3. F7 gamma relative 4. Fp2 delta relative 5. F3 alpha absolute 6. Fp2 theta absolute 7. P4 alpha absolute 8. T7-T3 beta relative ratio 9. F7 beta relative |
| de la Salle [ | Data was filtered (0.1–30 Hz), ocular-corrected, and inspected for artifacts (voltages ±μV, faulty channels, drift) Minimum of 100 s of artifact-free data was required for participant inclusion | Theta Cordance (Prefrontal— Fp1, Fp2 MRF—Fz, Fp2, F4, F8) | Combines information from both absolute and relative powers in the EEG theta band | NA | Change in MRF theta cordance (Fz, Fp2, F4, F8) = 74% accuracy |
| Wu [ | 60 Hz AC line noise artifact removed using CleanLine - Non-physiological slow drifts in EEG recordings were removed using 0.01 Hz high-pass filter - Spectrally filtered EEG data were re-referenced to common average - Bad channels were rejected based on thresholding spatial correlations among channels - Subjects with more than 20% bad channels were discarded - Rejected channels were interpolated from EEG of adjacent channels via spherical spline interpolation - Remaining artifacts were removed using ICA - EEG data re-referenced to common average | SELSER Channel-level alpha band power Theta Coherence Band power features of latent signals extracted with ICA or PCA | - spatial filter transforms multi-channel EEG data into a single latent signal, where the power is used as a feature - model fitting is done under a sparse constraint on the number of spatial filters, which reduces dimensionality - eigenvalues of the covariance matrix to reduce dimensionality | SELSER | Best performance using SELSER on alpha frequency range eyes-open rsEEG data |
| Zhdanov [ | 0.05–100 Hz bandpass filter Filtering performed using 2nd order Butterworth filters applied to the data in forward and reverse direction, to eliminate phase distortion Data pre-processed with EEGLAB toolbox Channels contaminated by large sporadic artifact were identified by human analyst and deleted EEG data bandpass filtered 1–80 Hz Notch-filtered at 60 Hz | Electrode-level spectral features Source-level spectral features Multiscale-entropy-based features Microstate-based features | - EEGLAB function - log-transformed absolute power obtained for each channel - For each pair, absolute power at left electrode divided by right, resulting in 25 features for each band eLORETA algorithm as implemented by LORETA-KEY software Following regions selected on basis of prior literature: ACC, rACC, and mOFC - Quantifies variability of time series by estimating predictability of amplitude patterns across a time series - Two consecutive data points were used for data matching, and points were considered to match if their absolute amplitude difference was <15% of the standard deviation of the time series. - Implemented using CARTOOL - - - | Unpaired 2-tailed | MSE asymmetry features—C3/C4 (baseline) MSE asymmetry features—FC3/FC4 (baseline) MSE asymmetry features—T7/T8 (week 2) MSE asymmetry features—CP3/CP4 (week 2) Electrode-level spectral asymmetry—P3/P4 alpha low (baseline) Electrode-level spectral asymmetry —T7/TP8 theta (week 2) Electrode-level spectral asymmetry —F7/F8 beta mid (week 2) Source-level spectral features—alpha high ACC, rACC (week 2) |
ACC anterior cingulate cortex, rACC rostral anterior cingulate cortex, ANN artificial neural network, CCA canonical correlation analysis, Coh coherence, eLORETA exact low-resolution brain electromagnetic tomography, FDR Fisher’s discriminant ratio, FIR finite impulse response, FFT fast Fourier transformation, GMM Gaussian mixture model, ICA independent component analysis, KFD Katz fractal dimension, LASSO least absolute shrinkage and selection operator, LCMV linearly constrained minimum variance, LightGBM light gradient boosting machine, LZC Lempel–Ziv complexity, MARA multiple artifact rejection algorithm, MNI Montreal Neurological Institute, mOFC medial orbitofrontal cortex, MRF middle right frontal, mRMR maximum relevance minimum redundancy, MSC magnitude squared coherence, PCA principal component analysis, PSD power spectral density, rsEEG resting-state EEG, SELSER sparse EEG latent space regression.
Fig. 1 Pooled effects of treatment response accuracy using EEG.
Pooled accuracy of treatment response prediction models in Major Depressive Disorder across 792 patients within a random-effects model using a restricted maximum-likelihood estimator to calculate the heterogeneity variance τ². Model accuracy across studies was used in conjunction with the standard deviation, calculated by multiplying the standard error by the square root of the sample size (SD = SE × √n). Knapp–Hartung adjustments were used to calculate the confidence interval around the pooled effect. The average accuracy across models was 83.94% (95% CI: 78.91–89.29), with a heterogeneity variance τ² of 0.0044.
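The pooling steps named in this caption can be illustrated with a short Python sketch. It is not the R code used in the study: it substitutes the closed-form DerSimonian-Laird estimator for the restricted maximum-likelihood estimator (which needs iterative fitting), and the function names, the `t_crit` argument, and the two-study input are hypothetical.

```python
import math

def sd_from_se(se, n):
    """Convert a standard error to a standard deviation: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)

def pool_random_effects(effects, variances, t_crit):
    """Random-effects pooling with a Knapp-Hartung confidence interval.

    effects   -- per-study estimates (e.g. accuracies as proportions)
    variances -- per-study sampling variances (SE squared)
    t_crit    -- 97.5th percentile of Student's t with k - 1 degrees of
                 freedom, supplied by the caller (hypothetical parameter)
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and mean, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # DerSimonian-Laird moment estimate of tau^2 (a stand-in for REML).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to each study's own variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    # Knapp-Hartung standard error of the pooled estimate.
    se_kh = math.sqrt(
        sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, effects))
        / ((k - 1) * sum(w_re))
    )
    return pooled, tau2, (pooled - t_crit * se_kh, pooled + t_crit * se_kh)

# Hypothetical two-study input; the t quantile for 1 degree of freedom is 12.706.
pooled, tau2, ci = pool_random_effects([0.7, 0.9], [0.01, 0.01], t_crit=12.706)
```

With only two hypothetical studies the Knapp-Hartung interval is very wide, which reflects how the adjustment penalizes a small number of studies.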
Fig. 2 Sensitivity and specificity across models.
A calculation of the sensitivity and specificity summary statistics across 12 studies using the frequencies of true positives, false negatives, false positives, and true negatives, using the madad function in the mada package in R. Overall, the balanced accuracy ((sensitivity + specificity)/2) across studies was 81.28%. Across studies, model sensitivity was lower than specificity, suggesting that predictive models of treatment response using EEG overall show better performance in identifying true non-responders to treatment (specificity), relative to true responders to treatment (sensitivity).
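As a minimal sketch of the quantities in this caption, the Python snippet below derives sensitivity, specificity, and balanced accuracy from confusion-matrix counts. The function names and the single-study counts are hypothetical; the last line reuses the average sensitivity and specificity reported in the performance table (0.776 and 0.846).

```python
def sensitivity(tp, fn):
    """Proportion of true responders that the model labels as responders."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of true non-responders that the model labels as non-responders."""
    return tn / (tn + fp)

def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity: (sens + spec) / 2."""
    return (sens + spec) / 2

# Hypothetical counts for one study (not taken from any included paper).
sens = sensitivity(tp=30, fn=10)
spec = specificity(tn=45, fp=5)
bal = balanced_accuracy(sens, spec)

# Averages from the performance table give roughly 0.811.
table_bal = balanced_accuracy(0.776, 0.846)
```

Note the parenthesization: the sum is divided by two, rather than only the specificity.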
Model performance metrics across EEG models.
(a)

| Authors | Sensitivity | Sensitivity 95% CI lower (2.5%) | Sensitivity 95% CI upper (97.5%) | Specificity | Specificity 95% CI lower (2.5%) | Specificity 95% CI upper (97.5%) |
|---|---|---|---|---|---|---|
| Bailey [ | 0.731 | 0.460 | 0.896 | 0.946 | 0.798 | 0.988 |
| Bailey [ | 0.700 | 0.448 | 0.870 | 0.914 | 0.758 | 0.973 |
| Corlier [ | 0.607 | 0.494 | 0.709 | 0.643 | 0.477 | 0.780 |
| Erguzel [ | 0.919 | 0.772 | 0.975 | 0.827 | 0.643 | 0.927 |
| Erguzel [ | 0.841 | 0.665 | 0.945 | 0.938 | 0.769 | 0.985 |
| Hasanzadeh [ | 0.854 | 0.665 | 0.945 | 0.938 | 0.769 | 0.985 |
| Cao [ | 0.794 | 0.558 | 0.922 | 0.886 | 0.694 | 0.964 |
| Cook [ | 0.731 | 0.576 | 0.845 | 0.542 | 0.383 | 0.692 |
| de la Salle [ | 0.696 | 0.511 | 0.834 | 0.929 | 0.741 | 0.983 |
| Jaworska [ | 0.768 | 0.585 | 0.886 | 0.980 | 0.834 | 0.998 |
| Mumtaz [ | 0.921 | 0.719 | 0.982 | 0.763 | 0.539 | 0.899 |
| Zhdanov [ | 0.791 | 0.666 | 0.878 | 0.846 | 0.742 | 0.913 |
| Average | 0.776 | 0.600 | 0.892 | 0.846 | 0.678 | 0.923 |
| Test for equality of sensitivities: | ||||||
| Test for equality of specificities: | ||||||
| Correlation of sensitivities and false-positive rates: Rho = −0.203 (−0.696 to 0.420) | ||||||
| Total DOR: 23.49 (95% CI: 10.40–52.02), | ||||||
| posLR: 5.232 (95% CI: 3.15–8.67), | ||||||
| negLR: 0.271 (95% CI: 0.195–0.376), | ||||||
| AUC: 0.850 (95% CI: 0.747–0.890); pAUC: 0.777 | ||||||
A summary of performance metrics across all predictive models of treatment response using EEG.
(a) The madad function in the “mada” package was used to calculate the sensitivity, specificity, and partial Area-Under-The-Curve (AUC) across studies, while the madauni function was used to calculate the Diagnostic Odds Ratio (DOR), positive likelihood ratio (posLR), and negative likelihood ratio (negLR). AUC was calculated using the AUC_boot function in dmetatools, with an alpha of 0.95 and 2000 bootstrap iterations. Overall, the balanced accuracy ((sensitivity + specificity)/2) was 81.1%.
(b) The metamean function in the “meta” package was used to pool accuracy across studies in a random effects model using an inverse variance method with Knapp–Hartung adjustments to calculate the confidence interval around the pooled effect. Across models, overall model accuracy was 83.93% (95% CI: 78.90–89.29).
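For readers unfamiliar with the likelihood ratios and diagnostic odds ratio in note (a), the Python sketch below shows their standard definitions for a single sensitivity and specificity pair. The function name is hypothetical, and this is not the pooling performed by the mada package: plugging in the table averages (0.776, 0.846) yields values near, but not equal to, the pooled estimates reported above, because those are pooled across the per-study estimates.

```python
def likelihood_ratios(sens, spec):
    """Return (posLR, negLR, DOR) for one sensitivity/specificity pair."""
    pos_lr = sens / (1.0 - spec)      # how much a positive prediction raises the odds
    neg_lr = (1.0 - sens) / spec      # how much a negative prediction lowers the odds
    dor = pos_lr / neg_lr             # diagnostic odds ratio
    return pos_lr, neg_lr, dor

# Average sensitivity and specificity from the performance table.
pos_lr, neg_lr, dor = likelihood_ratios(0.776, 0.846)
```

A posLR well above 1 and a negLR well below 1 both indicate that a model's prediction shifts the odds of response in the expected direction.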