Xiashuang Wang, Guanghong Gong, Ni Li.
Abstract
Automatic recognition methods for non-stationary electroencephalogram (EEG) data collected from EEG sensors play an essential role in neurological detection. The integrated approach proposed in this study consists of Symlet wavelet processing, a gradient boosting machine, and a grid search optimizer for a three-class classification scheme covering normal subjects, intermittent epilepsy, and continuous epilepsy. Fourth-order Symlet wavelets were adopted to decompose the EEG data into five frequency sub-bands (gamma, beta, alpha, theta, and delta), whose statistical features were computed and used as classification features. The grid search optimizer was used to automatically find the optimal parameters for training the classifier. The classification accuracy of the gradient boosting machine was compared with that of a conventional support vector machine and a random forest classifier constructed according to previous descriptions. Multiple performance indices were used to evaluate the proposed classification scheme, which provided better classification accuracy and detection effectiveness than has been recently reported in other studies on three-class classification of EEG data.
Keywords: Symlet wavelet; gradient boosting machine; grid search optimizer; multiple performance indices evaluation; recognition of epilepsy EEG
Year: 2019 PMID: 30634406 PMCID: PMC6359608 DOI: 10.3390/s19020219
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Auxiliary medical diagnostic system for epilepsy electroencephalogram.
Dataset description.
| Data Sources | Parameter Description | Dataset Category | Subject | Epileptogenic Foci | Electrode Collection Area | Number of Samples |
|---|---|---|---|---|---|---|
| Bonn | 5 groups | {OZ} | Healthy volunteers | Scalp surface | All brain areas | 200 |
| | | {FN} | Intermittent epilepsy | Intracranial site | Lesion outside inside area | 200 |
| | | {S} | Continuous ictal epilepsy | Intracranial site | Intra-lesional area | 100 |
Figure 2. Frequency bands of epilepsy EEGs extracted using wavelet decomposition.
Figure A1. (a) Raw {S} data and corresponding wavelet decomposition; (b) raw {FN} data and corresponding wavelet decomposition; and (c) raw {OZ} data and corresponding wavelet decomposition.
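The dyadic structure behind a five-sub-band wavelet decomposition can be sketched with simple arithmetic. The snippet below is an illustration only: it assumes a four-level discrete wavelet transform, in which each level halves the remaining band, and it assumes the sampling rate of 173.61 Hz commonly quoted for the Bonn recordings (this record does not state it). The function name `dwt_band_edges` and the labels `D1`…`A4` are hypothetical, and the correspondence between these nominal bands and the gamma, beta, alpha, theta, and delta rhythms is approximate.

```python
def dwt_band_edges(fs, levels):
    """Return nominal (name, low_hz, high_hz) bands of a dyadic wavelet decomposition.

    Detail level j nominally covers [fs / 2**(j + 1), fs / 2**j]; the final
    approximation covers everything below the last detail band.
    """
    bands = []
    high = fs / 2.0  # Nyquist frequency: upper edge of the first detail band
    for j in range(1, levels + 1):
        low = high / 2.0
        bands.append((f"D{j}", low, high))
        high = low
    bands.append((f"A{levels}", 0.0, high))
    return bands


if __name__ == "__main__":
    # Assumed sampling rate for the Bonn recordings (not given in this record).
    for name, low, high in dwt_band_edges(173.61, 4):
        print(f"{name}: {low:.2f} to {high:.2f} Hz")
```

With four levels this yields four detail bands plus one approximation band, which matches the count of five sub-bands described in the abstract.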
Statistical features of the dataset.
| Datasets | {FN} | {OZ} | {S} |
|---|---|---|---|
| Mean | −5.94 | −6.31 | −4.74 |
| Number of cases | 4097 | 4097 | 4097 |
| Standard deviation | 13.10 | 4.56 | 38.55 |
Gradient boosting machine classifier in pseudocode.
Build predicted classifier:
1. Initialize.
2. For each boosting iteration:
   a. Compute the negative gradient.
   b. Fit a new base-learner function.
   c. Find the best gradient descent step-size.
   d. Update the function.
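The boosting loop named in the pseudocode (initialize, compute the negative gradient, fit a base learner, find a step size, update) can be illustrated with a deliberately simplified sketch. This is not the classifier used in the study: it assumes a squared-error loss, one-dimensional inputs with at least two distinct values, and decision stumps as base learners, whereas the paper addresses three-class classification. All function names and the `learning_rate` shrinkage factor are hypothetical.

```python
from statistics import mean


def fit_stump(x, r):
    """Least-squares decision stump on 1-D inputs: (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(x))[:-1]:  # candidate thresholds; needs >= 2 distinct x values
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv, rv = mean(left), mean(right)
        sse = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def stump_predict(stump, xi):
    t, lv, rv = stump
    return lv if xi <= t else rv


def gradient_boost(x, y, n_rounds=30, learning_rate=0.5):
    """Gradient boosting for squared-error regression (simplified illustration)."""
    f0 = mean(y)                      # initialize with a constant model
    pred = [f0] * len(y)
    ensemble = []
    for _ in range(n_rounds):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)           # fit a new base learner
        h = [stump_predict(stump, xi) for xi in x]
        denom = sum(hi * hi for hi in h)
        # Closed-form line search for squared error: best multiplier of h.
        rho = sum(ri * hi for ri, hi in zip(residuals, h)) / denom if denom else 0.0
        step = learning_rate * rho
        pred = [pi + step * hi for pi, hi in zip(pred, h)]  # update the function
        ensemble.append((step, stump))
    return f0, ensemble


def predict(model, xi):
    f0, ensemble = model
    return f0 + sum(step * stump_predict(s, xi) for step, s in ensemble)
```

For a classifier, the same loop would be run with a classification loss (for example, multinomial deviance), which changes the gradient and step-size computations but not the overall structure.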
Figure 3. Parameter optimization flow in the grid search optimization algorithm.
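A grid search of the kind referred to in Figure 3 evaluates every combination of candidate parameter values and keeps the best-scoring one. The following is a minimal sketch under stated assumptions: the parameter names `learning_rate` and `n_estimators`, the candidate values, and the toy scoring function are all hypothetical and are not taken from the study. In practice the score would typically be a cross-validated accuracy of the classifier trained with those parameters.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Exhaustively score every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:  # keep the first combination with the highest score
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, n_estimators):
    """Hypothetical stand-in for cross-validated accuracy, peaking at (0.1, 100)."""
    return -((learning_rate - 0.1) ** 2) - ((n_estimators - 100) / 1000) ** 2


if __name__ == "__main__":
    grid = {"learning_rate": [0.01, 0.1, 0.5], "n_estimators": [50, 100, 200]}
    print(grid_search(grid, toy_score))
```

The cost grows with the product of the candidate list lengths, so the grid is usually kept coarse.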
Figure 4. Classification implementation.
Definition of the multiple performance indices used in the experiments. One set of parameters gives the probability of correct classification for each sub-dataset; similarly, a second set gives the probability of incorrect classification. A third set gives the sum of all classification rates over the sub-datasets.
| Test/Real Type | {OZ} | {FN} | {S} | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|
| {OZ} | | | | | | |
| {FN} | | | | | | |
| {S} | | | | | | |
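As an illustration of how per-class sensitivity and specificity and an overall accuracy can be derived from a three-class confusion matrix, the sketch below uses the standard one-vs-rest definitions. These are assumed here and may differ in detail from the formulas in the table, which did not survive in this record. It also assumes that rows hold the true class and columns the predicted class, and that every class has at least one sample. The counts are made up purely for demonstration.

```python
def one_vs_rest_metrics(cm, labels):
    """Per-class sensitivity/specificity and overall accuracy from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j
    (an assumed convention).
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(labels)))
    per_class = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                  # true class k, predicted as another class
        fp = sum(row[k] for row in cm) - tp   # another class, predicted as k
        tn = total - tp - fn - fp
        per_class[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return per_class, correct / total


if __name__ == "__main__":
    # Hypothetical counts, not results from the study.
    example = [[40, 0, 0],
               [2, 37, 1],
               [0, 1, 19]]
    print(one_vs_rest_metrics(example, ["OZ", "FN", "S"]))
```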
Comparison of the proposed method with existing methods in terms of accuracy (ACC), area under the receiver operating characteristic curve (AUC), confusion matrix (CM), and precision–recall curve (PRC) for the two- and three-class classifications on the Bonn University data.
| Authors | Techniques | 10-Fold CV | Dataset | ACC (%) | AUC | CM/PRC |
|---|---|---|---|---|---|---|
| Guo et al. (2010) | DWT and line length, ANN | No | {Z}-{S} | 100 | No | No |
| Gandhi et al. (2011) | DWT, energy and std | Yes | {FNOZ}-{S} | 95.4 | No | No |
| Nicolaou et al. (2012) | Permutation entropy, SVM | No | {Z}-{S} | 93.5 | No | No |
| Shafiul Alam and Bhuiyan (2013) | EMD, higher order moments, ANN | No | {O}-{S} | 100 | No | No |
| Samiee et al. (2015) | STFT spectral coefficients with their statistical values; Bayes, LR, SVM, KNN, and ANN | No | {Z}-{S} | 99.8 | No | No |
| Swami et al. (2016) | DTCWT, energy and std, Shannon entropy features, RNN | Yes | {Z}-{S} | 100 | No | No |
| Li et al. (2016) | Distribution entropy and sample entropy, statistical analysis | No | for sample entropy distribution entropy for short length data | mean | Yes | No |
| Manish et al. (2017) | ATFFWT and FD, LS-SVM | Yes | {Z}-{S} | 100 | No | No |
| Wang et al. (2017) | DWT, SVM | No | | | No | No |
| Proposed method | Symlet wavelets, statistical mean, energy, std and PCA, GBM-GSO, RF, SVM | Yes | {Z}-{S} | 100 | Yes | Yes |
Figure 5. Confusion matrices comparing the results of the gradient boosting machine, random forest, and support vector machine with grid search optimizer on {FN}-{OZ}-{S} classification.
Performance comparisons between gradient boosting machine, support vector machine and random forest.
| | GBM | SVM | RF |
|---|---|---|---|
| Multi-class classification ability | | | |
| Sensitivity of parameter selection | | | |
| Generalization ability | | | |
| Strong: | | | |
Figure 6. Comparison of receiver operating characteristics for the three-class classification.
Figure 7. Comparison of the precision–recall curves for the three-class classification.