Jiang Wu, Tengfei Zhou, Taiyong Li.
Abstract
Epilepsy is a common nervous system disease characterized by recurrent seizures. An electroencephalogram (EEG) records neural activity and is commonly used for the diagnosis of epilepsy. To detect epileptic seizures accurately, an automatic detection approach that integrates complementary ensemble empirical mode decomposition (CEEMD) and extreme gradient boosting (XGBoost), named CEEMD-XGBoost, is proposed. Firstly, the decomposition method CEEMD, which effectively reduces the influence of mode mixing and end effects, was used to divide raw EEG signals into a set of intrinsic mode functions (IMFs) and residues. Secondly, multi-domain features were extracted from the raw signals and the decomposed components, and they were then selected according to their importance scores. Finally, XGBoost was applied to develop the epileptic seizure detection model. Experiments were conducted on two benchmark epilepsy EEG datasets, the Bonn dataset and the CHB-MIT (Children's Hospital Boston and Massachusetts Institute of Technology) dataset, to evaluate the performance of the proposed CEEMD-XGBoost. The experimental results indicated that, compared with some previous EEG classification models, CEEMD-XGBoost can significantly enhance the detection performance of epileptic seizures in terms of sensitivity, specificity, and accuracy.
Keywords: complementary ensemble empirical mode decomposition (CEEMD); electroencephalogram (EEG); epileptic seizure detection; extreme gradient boosting (XGBoost); feature selection
Year: 2020 PMID: 33285915 PMCID: PMC7516550 DOI: 10.3390/e22020140
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. An illustration of empirical mode decomposition (EMD).
Figure 2. The complementary ensemble empirical mode decomposition (CEEMD)/extreme gradient boosting (XGBoost) model.
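The decomposition step named in the abstract can be illustrated with a minimal sketch. It assumes the usual complementary-ensemble idea: white noise is added to the signal in positive and negative pairs, each noisy copy is decomposed by EMD, and the components are averaged. The function names, the piecewise-linear envelopes (cubic splines are more common), the fixed number of sifting passes, and all parameter values are simplifications chosen for illustration, not the authors' settings.

```python
import math
import random
from statistics import pstdev

def _interp(idx, vals, n):
    """Piecewise-linear interpolation through (idx, vals), evaluated at 0..n-1."""
    out = [0.0] * n
    for k in range(len(idx) - 1):
        i0, i1, v0, v1 = idx[k], idx[k + 1], vals[k], vals[k + 1]
        for i in range(i0, i1 + 1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out

def _envelope_mean(x):
    """Mean of the upper and lower envelopes, or None if x has too few extrema."""
    n = len(x)
    maxima = [i for i in range(1, n - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, n - 1) if x[i - 1] > x[i] <= x[i + 1]]
    if len(maxima) + len(minima) < 3:
        return None
    up = [0] + maxima + [n - 1]   # pin envelopes to the end points (crude)
    lo = [0] + minima + [n - 1]
    upper = _interp(up, [x[i] for i in up], n)
    lower = _interp(lo, [x[i] for i in lo], n)
    return [(u + l) / 2 for u, l in zip(upper, lower)]

def emd(x, max_imfs=4, n_sift=10):
    """Plain EMD: returns (imfs, residue) with sum(imfs) + residue == x."""
    residue = list(x)
    imfs = []
    while len(imfs) < max_imfs and _envelope_mean(residue) is not None:
        h = residue
        for _ in range(n_sift):          # fixed number of sifting passes
            mean = _envelope_mean(h)
            if mean is None:
                break
            h = [a - m for a, m in zip(h, mean)]
        imfs.append(h)
        residue = [r - a for r, a in zip(residue, h)]
    # Pad with zero components so every trial has the same shape.
    imfs += [[0.0] * len(residue) for _ in range(max_imfs - len(imfs))]
    return imfs, residue

def ceemd(x, n_pairs=5, noise_ratio=0.2, max_imfs=4, seed=0):
    """Average EMD results over complementary (+noise, -noise) pairs."""
    rng = random.Random(seed)
    n = len(x)
    scale = noise_ratio * pstdev(x)
    trials = 2 * n_pairs
    imf_avg = [[0.0] * n for _ in range(max_imfs)]
    res_avg = [0.0] * n
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, scale) for _ in range(n)]
        for sign in (1.0, -1.0):
            noisy = [a + sign * e for a, e in zip(x, noise)]
            imfs, res = emd(noisy, max_imfs)
            for k in range(max_imfs):
                for i in range(n):
                    imf_avg[k][i] += imfs[k][i] / trials
            for i in range(n):
                res_avg[i] += res[i] / trials
    return imf_avg, res_avg

# Toy two-tone signal standing in for an EEG segment.
signal = [math.sin(2 * math.pi * 5 * i / 256) + 0.5 * math.sin(2 * math.pi * 40 * i / 256)
          for i in range(256)]
imfs, residue = ceemd(signal)
```

Because each noise realization is used once with each sign, the added noise cancels in the average, so the averaged components plus the averaged residue still sum to the original signal.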
Summary of the Bonn electroencephalogram (EEG) data.
| | Set A | Set B | Set C | Set D | Set E |
|---|---|---|---|---|---|
| Volunteer type | Healthy | Healthy | Epileptic | Epileptic | Epileptic |
| Volunteer state | Awake state with eyes open | Awake state with eyes closed | Interictal | Interictal | Ictal |
| Number of channels | 100 | 100 | 100 | 100 | 100 |
| Electrode placement | International 10–20 system | International 10–20 system | Hippocampus opposite to hemisphere | Within epileptogenic zone | Within epileptogenic zone |
Figure 3. A raw multi-channel EEG signal from patient 01 in the CHB-MIT (Children’s Hospital Boston and Massachusetts Institute of Technology) dataset.
Figure 4. A raw EEG segment from set A in the Bonn dataset and the corresponding components decomposed by complementary ensemble empirical mode decomposition (CEEMD).
Figure 5. A raw EEG segment from patient 01 in the CHB-MIT dataset and the corresponding components decomposed by complementary ensemble empirical mode decomposition (CEEMD).
A subset of the extracted features.
| Category | Sub-Category | Features |
|---|---|---|
| Time domain | Energy | Absolute energy, energy ratio |
| | Autocorrelation | Mean and variance over the autocorrelation for different lags |
| | Autoregression | Autoregressive coefficients |
| | Linear trend intercept | Correlation coefficient, |
| | Statistics | Mean absolute change |
| Frequency domain | Fourier transform spectrum | Fourier transform aggregate and coefficients |
| Time-frequency domain | Wavelet | Continuous wavelet coefficients and peaks |
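Three of the time-domain features listed above have short closed-form definitions, sketched here for illustration. The function names are hypothetical, and the autocorrelation is normalized by the population variance, which is an assumption about the exact estimator rather than something the table specifies.

```python
def absolute_energy(x):
    """Sum of squared sample values."""
    return sum(v * v for v in x)

def mean_absolute_change(x):
    """Mean of |x[i+1] - x[i]| over consecutive samples."""
    return sum(abs(b - a) for a, b in zip(x, x[1:])) / (len(x) - 1)

def autocorrelation(x, lag):
    """Lagged autocovariance divided by the population variance."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
    return cov / var

segment = [1.0, 2.0, 3.0, 4.0]
energy = absolute_energy(segment)          # 1 + 4 + 9 + 16
change = mean_absolute_change(segment)     # every step has size 1
rho1 = autocorrelation(segment, 1)
```

The same functions can be applied to the raw segment and to each decomposed component, which is how a single segment yields many features.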
Various cases considered in this study.
| Dataset | Cases | Classes | Description | Type |
|---|---|---|---|---|
| Bonn | I | A-E | Non-seizure (eyes open) and ictal | Two |
| | II | B-E | Non-seizure (eyes closed) and ictal | Two |
| | III | C-E | Interictal and ictal | Two |
| | IV | D-E | Interictal and ictal | Two |
| | V | A-D | Non-seizure (eyes open) and interictal | Two |
| | VI | AB-E | Non-seizure and ictal | Two |
| | VII | CD-E | Interictal and ictal | Two |
| | VIII | ACD-E | Non-ictal and ictal | Two |
| | IX | BCD-E | Non-ictal and ictal | Two |
| | X | ABCD-E | Non-ictal and ictal | Two |
| | XI | A-D-E | Non-seizure, interictal and ictal | Three |
| | XII | AB-CD-E | Non-seizure, interictal and ictal | Three |
| CHB-MIT | XIII | Non-seizure–Seizure | Non-ictal and ictal | Two |
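The Bonn case names in the table encode which sets are merged into each class: hyphens separate classes and the letters within a group are pooled. A small sketch of turning such a name into per-set labels follows; the helper name `parse_case` is hypothetical, and it applies only to the Bonn cases, not to case XIII.

```python
def parse_case(classes):
    """Map each Bonn set letter to a class index, e.g. 'AB-CD-E' -> A,B:0  C,D:1  E:2."""
    return {
        set_name: label
        for label, group in enumerate(classes.split("-"))
        for set_name in group
    }

two_class = parse_case("ABCD-E")     # case X
three_class = parse_case("AB-CD-E")  # case XII
```

With this convention the ictal set E always receives the highest label, so in the two-class cases label 1 is the seizure class.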
Figure 6. An example of the CHB-MIT EEG segmentation.
Classification performance (in %) of proposed methodology using 10-fold cross-validation. SEN—sensitivity; SPE—specificity; ACC—accuracy.
| Dataset | Cases | Classes | SEN | SPE | ACC | Time |
|---|---|---|---|---|---|---|
| Bonn | I | A-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 10.4s |
| | II | B-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 10.2s |
| | III | C-E | 99.00 ± 0.06 | 100.00 ± 0.00 | 99.50 ± 0.03 | 10.3s |
| | IV | D-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 10.2s |
| | V | A-D | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 10.4s |
| | VI | AB-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 14.0s |
| | VII | CD-E | 99.00 ± 0.06 | 100.00 ± 0.00 | 99.33 ± 0.03 | 14.3s |
| | VIII | ACD-E | 100.00 ± 0.00 | 99.67 ± 0.02 | 99.75 ± 0.02 | 18.3s |
| | IX | BCD-E | 99.00 ± 0.06 | 99.33 ± 0.03 | 99.50 ± 0.02 | 19.6s |
| | X | ABCD-E | 99.00 ± 0.06 | 99.50 ± 0.02 | 99.60 ± 0.02 | 23.6s |
| | XI | A-D-E | - | - | 100.00 ± 0.00 | 42.4s |
| | XII | AB-CD-E | - | - | 99.00 ± 0.04 | 85.1s |
| CHB-MIT | XIII | Non-seizure–Seizure | 95.70 ± 0.06 | 95.89 ± 0.10 | 95.79 ± 0.05 | 772.0s |
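For reference, the three reported measures follow the standard confusion-matrix definitions: sensitivity is the share of ictal segments detected, specificity is the share of non-ictal segments correctly rejected, and accuracy is the share of all segments classified correctly. The sketch below computes them in percent for one fold; the function names are hypothetical, and the choice of population standard deviation for summarizing folds is an assumption, since the table does not say how its ± values were obtained.

```python
from statistics import mean, pstdev

def seizure_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy (in %) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "SEN": 100.0 * tp / (tp + fn),
        "SPE": 100.0 * tn / (tn + fp),
        "ACC": 100.0 * (tp + tn) / len(pairs),
    }

def summarize(fold_values):
    """Mean and (population) standard deviation across folds."""
    return mean(fold_values), pstdev(fold_values)

# Toy fold: 4 ictal (1) and 4 non-ictal (0) segments, one error of each kind.
fold = seizure_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])
acc_mean, acc_std = summarize([100.0, 100.0, 95.0, 100.0])
```

Sensitivity and specificity are only defined for the two-class cases, which is consistent with the dashes in the three-class rows of the table.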
Accuracy (in %) comparison with some of the existing techniques for cases I–V on the Bonn dataset.
| Authors | Year | Methods | A-E | B-E | C-E | D-E | A-D |
|---|---|---|---|---|---|---|---|
| Kumar et al. | 2014 | DWT + SVM | 100.00 | 100.00 | 99.60 | 95.85 | - |
| Sharmila and Geethanjali | 2016 | DWT + KNN | 100.00 | 98.25 | 97.25 | 95.62 | - |
| | | DWT + NB | 100.00 | 99.25 | 99.62 | 95.12 | - |
| Hassan and Subasi | 2016 | CEEMDAN + LPBoost | 100.00 | - | 100.00 | 97.00 | - |
| Swami et al. | 2016 | DTCWT + GRNN | 100.00 | 100.00 | 100.00 | 99.50 | - |
| Tawfik et al. | 2016 | WPE + SVM | 99.50 | 85.00 | 93.50 | 96.50 | - |
| Sharma et al. | 2017 | ATFFWT + LS-SVM | 100.00 | 100.00 | 99.00 | 98.50 | - |
| Jaiswal and Banka | 2017 | LNGP + ANN | 99.82 | 99.25 | 99.02 | 98.18 | 99.90 |
| | | 1D-LGP + ANN | 99.80 | 98.92 | 99.10 | 99.07 | 99.37 |
| Tiwari et al. | 2017 | Keypoint-based LBP + SVM | 100.00 | - | - | - | - |
| Kaur and Singh | 2017 | EMD + spike parameters + ANN | 100.00 | 100.00 | 100.00 | 99.00 | - |
| Li et al. | 2018 | CWT + GMM + GLCM + relief + SVM | 100.00 | - | 100.00 | - | - |
| Singh and Dehuri | 2018 | DWT + MLPNN | 99.50 | 97.00 | 98.51 | 100.00 | - |
| Zhang et al. | 2018 | WPD + FDE + KNN | 100.00 | 99.95 | 99.86 | 99.39 | - |
| Gupta and Pachori | 2019 | FBSE-based rhythms + WMRPE + LS-SVM | 99.50 | 99.50 | 99.50 | 97.50 | - |
| Mamli and Kalbkhani | 2019 | FSST + GLCM + SVM | 100.00 | 99.38 | 99.54 | 96.48 | - |
| Raghu et al. | 2019 | Matrix determinant + MLP | 99.45 | 96.06 | 97.60 | 97.60 | - |
| Proposed method | 2019 | CEEMD + XGBoost | 100.00 | 100.00 | 99.50 | 100.00 | 100.00 |
DWT: discrete wavelet transform; SVM: support vector machine; KNN: k-nearest neighbor; NB: naive Bayes; CEEMDAN: complete ensemble empirical mode decomposition with adaptive noise; LPBoost: linear programming boosting; DTCWT: dual-tree complex wavelet transform; GRNN: general regression neural network; WPE: weighted permutation entropy; ATFFWT: analytic time-frequency flexible wavelet transform; LS-SVM: least squares support vector machine; LNGP: local neighbor gradient pattern; ANN: artificial neural network; 1D-LGP: one-dimensional local gradient pattern; GMM: Gaussian mixture model; GLCM: gray-level co-occurrence matrix; MLPNN: multilayer perceptron neural network; WPD: wavelet packet decomposition; FDE: fuzzy distribution entropy; FBSE: Fourier–Bessel series expansion; WMRPE: weighted multiscale Renyi permutation entropy; FSST: Fourier synchro-squeezed transform; MLP: multi-layer perceptron; CEEMD: complementary ensemble empirical mode decomposition; XGBoost: extreme gradient boosting.
Accuracy (in %) comparison with some of the existing techniques for cases VI–X on the Bonn dataset.
| Authors | Year | Methods | AB-E | CD-E | ACD-E | BCD-E | ABCD-E |
|---|---|---|---|---|---|---|---|
| Kumar et al. | 2014 | DWT + SVM | - | - | 98.80 | - | 97.38 |
| Sharmila and Geethanjali | 2016 | DWT + KNN | 98.83 | 96.08 | 96.80 | 96.37 | 97.10 |
| | | DWT + NB | 99.16 | 98.75 | 97.31 | 95.10 | 95.85 |
| Hassan and Subasi | 2016 | CEEMDAN + LPBoost | - | - | - | - | 99.20 |
| Sharma et al. | 2017 | ATFFWT + LS-SVM | 100.00 | 98.67 | - | - | 99.20 |
| Jaiswal and Banka | 2017 | LNGP + ANN | - | 98.88 | - | - | 98.72 |
| Tiwari et al. | 2017 | Keypoint-based LBP + SVM | - | 99.45 | - | - | 99.31 |
| Kaur and Singh | 2017 | EMD + spike parameters + ANN | - | - | - | - | 99.80 |
| Singh and Dehuri | 2018 | DWT + MLPNN | 89.00 | 99.33 | 98.00 | 95.75 | 95.60 |
| Zhang et al. | 2018 | WPD + FDE + KNN | 99.98 | 99.58 | - | - | 99.71 |
| Zhang et al. | 2018 | GST + SVD-based features + RF | - | 99.12 | - | - | 99.63 |
| Gupta and Pachori | 2019 | FBSE-based rhythms + WMRPE + LS-SVM | - | 99.00 | - | - | 98.60 |
| Mamli and Kalbkhani | 2019 | FSST + GLCM + SVM | 99.73 | 99.59 | - | - | 97.38 |
| Raghu et al. | 2019 | Matrix determinant + MLP | 97.10 | 96.85 | 96.00 | - | 97.20 |
| Proposed method | 2019 | CEEMD + XGBoost | 100.00 | 99.33 | 99.75 | 99.50 | 99.60 |
GST: generalized Stockwell transform; SVD: singular value decomposition; RF: random forest.
Accuracy (in %) comparison with some of the existing techniques for cases XI–XII on the Bonn dataset.
| Authors | Year | Methods | A-D-E | AB-CD-E |
|---|---|---|---|---|
| Hassan and Subasi | 2016 | CEEMDAN + LPBoost | 97.60 | - |
| Tawfik et al. | 2016 | WPE + SVM | 97.50 | - |
| Jaiswal and Banka | 2017 | LNGP + ANN | 98.22 | - |
| Tiwari et al. | 2017 | Keypoint-based LBP + SVM | - | 98.80 |
| Kalbkhani and Shayesteh | 2017 | ST + NN | 99.37 | 99.54 |
| Zhang et al. | 2018 | WPD + FDE + KNN | 99.39 | 98.76 |
| Zhang et al. | 2018 | GST + SVD-based features + RF | 99.03 | 98.62 |
| Mamli and Kalbkhani | 2019 | FSST + GLCM + SVM | 99.67 | 99.26 |
| Raghu et al. | 2019 | Matrix determinant + MLP | - | 96.50 |
| Proposed method | 2019 | CEEMD + XGBoost | 100.00 | 99.00 |
ST: Stockwell transform.
Accuracy (in %) comparison with some of the existing techniques for case XIII on the CHB-MIT dataset.
| Authors | Year | Methods | Non-Seizure-Seizure |
|---|---|---|---|
| - | - | CEEMD + NN | 89.18 |
| - | - | CEEMD + SVM | 90.07 |
| - | - | CEEMD + RF | 90.90 |
| Rafiuddin et al. | 2011 | WT + LDA | 80.16 |
| Khan et al. | 2012 | DWT + LDA | 91.80 |
| Behnam et al. | 2015 | DWT + SLMM + MLP + KNN | 90.00 |
| Zabihi et al. | 2016 | PSR + LDA + NB | 94.69 |
| Yuan et al. | 2018 | WT + CtxFusion EEG | 95.71 |
| Wei et al. | 2019 | CNN + MIDS | 84.00 |
| Proposed method | 2019 | CEEMD + XGBoost | 95.79 |
NN: neural network; RF: random forest; WT: wavelet transform; LDA: linear discriminant analysis; SLMM: singular Lorenz measures method; PSR: phase-space reconstruction; NB: naive Bayes; CtxFusion EEG: wavelet transform context fusion EEG; CNN: convolutional neural network; MIDS: merger of increasing and decreasing sequences.
Performances (in %) of detection models with CEEMD vs. without CEEMD.
| Dataset | Cases | Classes | XGBoost SEN | XGBoost SPE | XGBoost ACC | CEEMD-XGBoost SEN | CEEMD-XGBoost SPE | CEEMD-XGBoost ACC |
|---|---|---|---|---|---|---|---|---|
| Bonn | I | A-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| | II | B-E | 99.00 ± 0.06 | 98.00 ± 0.08 | 98.50 ± 0.05 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| | III | C-E | 99.00 ± 0.06 | 99.00 ± 0.06 | 99.00 ± 0.04 | 99.00 ± 0.06 | 100.00 ± 0.00 | 99.50 ± 0.03 |
| | IV | D-E | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| | V | A-D | 100.00 ± 0.00 | 99.00 ± 0.06 | 99.50 ± 0.03 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| | VI | AB-E | 99.00 ± 0.06 | 99.00 ± 0.04 | 99.00 ± 0.03 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| | VII | CD-E | 97.00 ± 0.13 | 98.50 ± 0.05 | 98.00 ± 0.04 | 99.00 ± 0.06 | 100.00 ± 0.00 | 99.33 ± 0.03 |
| | VIII | ACD-E | 97.00 ± 0.13 | 99.00 ± 0.03 | 98.50 ± 0.05 | 100.00 ± 0.00 | 99.67 ± 0.02 | 99.75 ± 0.02 |
| | IX | BCD-E | 96.00 ± 0.13 | 99.67 ± 0.02 | 98.75 ± 0.05 | 99.00 ± 0.06 | 99.33 ± 0.03 | 99.50 ± 0.02 |
| | X | ABCD-E | 96.00 ± 0.13 | 99.75 ± 0.02 | 99.00 ± 0.03 | 99.00 ± 0.06 | 99.50 ± 0.02 | 99.60 ± 0.02 |
| | XI | A-D-E | - | - | 100.00 ± 0.00 | - | - | 100.00 ± 0.00 |
| | XII | AB-CD-E | - | - | 97.40 ± 0.07 | - | - | 99.00 ± 0.04 |
| CHB-MIT | XIII | Non-seizure–Seizure | 93.46 ± 0.12 | 92.83 ± 0.17 | 93.14 ± 0.07 | 95.70 ± 0.06 | 95.89 ± 0.10 | 95.79 ± 0.05 |
The 20 most important features in the Bonn dataset.
| Feature | Importance Score | Component | Domain | Description |
|---|---|---|---|---|
| Change_quantiles | 0.0487 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.1 and 0.2 |
| Agg_linear_trend | 0.0430 | | Time | A linear least-squares regression for values of the time series |
| Agg_linear_trend | 0.0358 | | Time | A linear least-squares regression for values of the time series |
| Svd_entropy | 0.0352 | | Entropy | Singular value decomposition entropy |
| FFT_aggregated_skew | 0.0352 | | Frequency | The spectral skew of the absolute Fourier transform spectrum |
| AR_coefficient | 0.0336 | | Frequency | The autoregressive coefficients |
| Percentage_of_reoccurring_datapoints_to_all | 0.0272 | Raw | Time | The percentage of unique values which appear more than once |
| Ratio_value_number_to_time_series_length | 0.0230 | Raw | Time | A factor which is 1 when all values in the time series appear only once, and below 1 otherwise |
| Agg_linear_trend | 0.0204 | Raw | Time | A linear least-squares regression for values of the time series |
| AR_coefficient | 0.0165 | Raw | Time | The autoregressive coefficients |
| AR_coefficient | 0.0153 | | Time | The autoregressive coefficients |
| Approximate_entropy | 0.0143 | | Entropy | Approximate entropy |
| FFT_coefficient | 0.0141 | | Frequency | The Fourier coefficients of discrete Fourier transform |
| Autocorrelation | 0.0138 | | Time | The autocorrelation of the specified lag (8) |
| Absolute_sum_of_changes | 0.0137 | | Time | The sum of the absolute value of consecutive changes |
| Change_quantiles | 0.0113 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.8 and 0.0 |
| Agg_autocorrelation | 0.0108 | Raw | Time | The value of an aggregation function over the autocorrelation |
| Autocorrelation | 0.0102 | Raw | Time | The autocorrelation of the specified lag (6) |
| Approximate_entropy | 0.0098 | Raw | Entropy | Approximate entropy |
| Number_CWT_peaks | 0.0094 | Raw | Time-frequency | The number of peaks that appear at enough width scales and with sufficiently high signal-to-noise ratio (SNR) |
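Selecting features by importance score, as the abstract describes, amounts to ranking the scored features and keeping the strongest ones. The sketch below does this for a few rows copied from the table above. The cut-off `k` and the helper name `select_top` are hypothetical, since the exact selection rule is not stated here.

```python
def select_top(scored_features, k):
    """Return the k entries with the highest importance score, best first."""
    ranked = sorted(scored_features, key=lambda entry: entry[2], reverse=True)
    return ranked[:k]

# (feature, component, importance score) rows taken from the Bonn table.
bonn_scores = [
    ("Agg_linear_trend", "Raw", 0.0204),
    ("Percentage_of_reoccurring_datapoints_to_all", "Raw", 0.0272),
    ("AR_coefficient", "Raw", 0.0165),
    ("Ratio_value_number_to_time_series_length", "Raw", 0.0230),
]
kept = select_top(bonn_scores, k=2)
```

Keeping the component alongside the feature name matters because the same feature type can appear several times in the ranking, once per decomposed component.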
The 20 most important features in the CHB-MIT dataset.
| Feature | Importance Score | Component | Domain | Description |
|---|---|---|---|---|
| Change_quantiles | 0.0527 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.2 and 0.4 |
| Quantile | 0.0423 | Raw | Time | The 0.8 quantile of the raw signal |
| Agg_linear_trend | 0.0419 | | Time | A linear least-squares regression for values of the time series |
| Change_quantiles | 0.0359 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.2 and 1.0 |
| Change_quantiles | 0.0206 | Raw | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.4 and 0.6 |
| FFT_coefficient | 0.0179 | | Frequency | The Fourier coefficients of discrete Fourier transform |
| Quantile | 0.0165 | | Time | The 0.2 quantile of the raw signal |
| Number_peaks | 0.0148 | | Time | The number of peaks of at least support 1 in the signal |
| FFT_coefficient | 0.0145 | | Frequency | The Fourier coefficients of discrete Fourier transform |
| number_crossing | 0.0083 | | Time | The number of crossings of the signal on 1 |
| Autocorrelation | 0.0051 | | Time | The autocorrelation of the specified lag (9) |
| Range_count | 0.0046 | | Time | Count observed values within the interval [−1, 1) |
| energy_ratio_by_chunks | 0.0043 | | Time | The sum of squares of chunks; 3 out of 10 chunks expressed as a ratio with the sum of squares over the whole time series |
| Permutation_entropy | 0.0041 | | Entropy | Permutation entropy |
| Change_quantiles | 0.0040 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.0 and 0.6 |
| FFT_coefficient | 0.0039 | | Frequency | The Fourier coefficients of discrete Fourier transform |
| FFT_coefficient | 0.0036 | | Frequency | The Fourier coefficients of discrete Fourier transform |
| Cwt_coefficients | 0.0035 | | Time-frequency | Continuous wavelet transform for the Ricker wavelet |
| Number_peaks | 0.0035 | | Time | The number of peaks of at least support 3 in the signal |
| Change_quantiles | 0.0034 | | Time | The absolute, average value of consecutive changes inside the corridor given by the quantiles 0.0 and 0.6 |
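The top-ranked feature in both tables, Change_quantiles, is described as the average absolute change between consecutive samples that lie inside a corridor bounded by two quantiles of the signal. One possible reading of that description is sketched below. The linear quantile interpolation, the inclusive corridor bounds, and the requirement that both samples of a change lie inside the corridor are assumptions, and the function names are hypothetical.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def change_quantiles(x, ql, qh, is_abs=True):
    """Mean (absolute) consecutive change with both samples inside the corridor."""
    if ql >= qh or len(x) < 2:
        return 0.0
    low, high = quantile(x, ql), quantile(x, qh)
    inside = [low <= v <= high for v in x]
    changes = [
        abs(b - a) if is_abs else b - a
        for a, b, in_a, in_b in zip(x, x[1:], inside, inside[1:])
        if in_a and in_b
    ]
    return sum(changes) / len(changes) if changes else 0.0

# Corridor [2, 8]: the samples 0, 10, 1 and 9 fall outside and break the runs.
series = [0, 10, 2, 3, 5, 8, 1, 9, 4, 6, 7]
value = change_quantiles(series, 0.2, 0.8)
```

Restricting the changes to a corridor makes the feature insensitive to large excursions outside it, which may be part of why it separates ictal from non-ictal segments well.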