
Approximate entropy and support vector machines for electroencephalogram signal classification.

Zhen Zhang¹, Yi Zhou¹, Ziyi Chen², Xianghua Tian³, Shouhong Du³, Ruimei Huang¹.

Abstract

The automatic detection and identification of electroencephalogram waves play an important role in the prediction, diagnosis and treatment of epileptic seizures. In this study, a nonlinear dynamics index, approximate entropy, and a support vector machine with strong generalization ability were applied to classify electroencephalogram signals at epileptic interictal and ictal periods. Our aim was to verify whether approximate entropy can be effectively applied to the automatic real-time detection of epilepsy in the electroencephalogram, and to explore the generalization ability of a classifier trained using a nonlinear dynamics index. Four patients presenting with partial epileptic seizures were included in this study. They were all diagnosed with neocortex localized epilepsy and epileptic foci were clearly observed by electroencephalogram. The electroencephalogram data from the four patients were segmented and the characteristic value of each segment, that is, the approximate entropy, was extracted. The support vector machine classifier was constructed with the approximate entropy extracted from one epileptic case, and then electroencephalogram waves of the other three cases were classified, reaching a 93.33% accuracy rate. Our findings suggest that the use of approximate entropy allows the automatic real-time detection of electroencephalogram data in epileptic cases. The combination of approximate entropy and support vector machines shows good generalization ability for the classification of electroencephalogram signals for epilepsy.

Keywords: approximate entropy; automatic real-time detection; brain injury; classification; electroencephalogram; epilepsy; generalization; grants-supported paper; neural regeneration; neuroregeneration; nonlinear dynamics; support vector machine

Year: 2013 PMID: 25206493 PMCID: PMC4145978 DOI: 10.3969/j.issn.1673-5374.2013.20.003

Source DB: PubMed Journal: Neural Regen Res ISSN: 1673-5374 Impact factor: 5.135

Research Highlights

The characteristic index of the electroencephalogram signal was extracted using a nonlinear dynamics method, and a support vector machine was used for the classification of epileptic electroencephalogram signals. Our findings are more accurate than previous classification studies. Previous studies used electroencephalogram data from one or two cases in one electroencephalogram database, while we selected electroencephalogram data from four epileptic patients from the electroencephalogram databases of two different hospitals, so the results are more representative. In this study, electroencephalogram data from different cases were used as training data and test data for the support vector machine, which differs from previous studies that only used data from the same case, so our results are more meaningful for the prediction of clinical seizures. We calculated the average classification accuracy rate of three cases as the final result, which is more convincing. Our findings indicate that a classifier trained with a nonlinear dynamics index can effectively identify epileptic electroencephalogram signals, and has good generalization ability.

INTRODUCTION

Epilepsy is caused by the highly synchronized abnormal discharge of neurons in the brain; it may recur with sudden onset, and can seriously affect the patient's life and work. The electroencephalogram signal contains a large amount of physiological and pathological information about the brain, and is regarded as a key tool for the diagnosis and treatment of epilepsy and other brain diseases[1]. The electroencephalogram signal consists of a group of randomized electrophysiological signals, which are characteristically nonstationary and nonlinear. Recently, nonlinear approaches have attracted increasing attention in electroencephalogram signal detection because they are well suited to representing the randomness of such signals[2]. The nonlinear properties of electroencephalogram signals have become a hot topic. At present, epileptic patients account for 5‰ of the total population worldwide, including six million cases in China. The majority of affected cases can be effectively controlled by drug treatment, while the remainder have intractable epilepsy because drug treatment is ineffective. Among them, 21–24% of intractable epilepsy patients can be cured by surgical removal of epileptic loci; however, the symptoms of the remaining 75% of intractable epilepsy patients cannot be controlled[3]. Therefore, being able to predict recurrence and sudden seizures is very important for improving the patient's quality of life. A recent study found that a rule-based seizure prediction method using a nonlinear dynamics index achieved greater than 90% sensitivity for focal neocortical epilepsy[4]. An effective prediction of an epileptic seizure requires automatic classification of epileptic electroencephalogram signals[5]. Therefore, we designed this study to explore the automatic identification and classification of electroencephalogram signals in epilepsy.

A growing body of research has focused on the automatic detection and identification of epilepsy signals[6]. A variety of classification methods have been proposed[7], such as decision trees, Gaussian mixture classification, fuzzy classification and support vector machines. In our experiment, we used a support vector machine for classification of electroencephalogram signals, because it requires only a small amount of data and has strong generalization ability[8–10]. Unfortunately, in previous experiments[8–10], the training data and test data were often derived from the electroencephalogram signals of the same cases, which may affect the clinical applicability of the classifiers. To test the generalization ability of the nonlinear dynamics index and verify the applicability of nonlinear dynamics to real-time detection of epileptic electroencephalogram signals, we selected training data and test data from the electroencephalogram signals of different cases. In brief, we first collected multi-segment electroencephalogram signals from one epileptic patient and continuous electroencephalogram signals from different cases. Then the signals were divided chronologically, and the nonlinear dynamics index was measured. The electroencephalogram data we collected were scalp data with high noise disturbance, so we used approximate entropy, which has the best noise resistance among nonlinear dynamics indices[11]. After the approximate entropy was calculated, we classified the data in each segment. Our findings indicate that this test method can clearly elucidate the generalization ability of the classifier.

RESULTS

Quantitative analysis of subjects

Four patients with partial epileptic seizures (neocortex localized epilepsy with clear loci) were included in this study. Their electroencephalogram data were divided into segments and the characteristic values of each segment were measured, i.e., their approximate entropy. The support vector machine classifier was constructed using the approximate entropy of one epileptic case, and then the electroencephalogram waves of the other three cases were classified. All patients were involved in the result analysis, with no loss.

Construction of classifiers

The electroencephalogram data of patient 1, which were used to construct the support vector machine classifier, were divided into group A and group B. Group A contained 40 segments of electroencephalogram data during the epileptic interictal period, and group B contained 40 segments of electroencephalogram data during the epileptic seizure period. There were significant differences in the approximate entropy between group A and group B (t = 5.863, P < 0.01; paired sample t-test). The grouping is listed in Figure 1.

Figure 1

Flowchart of support vector machine classifier.

Groups A and B: epileptic electroencephalogram data at interictal and ictal states, respectively. ‘Extract characteristics’ extracts the approximate entropy of the electroencephalogram data. ‘Classifier construction algorithm’ takes the approximate entropy at interictal and ictal states as input, from which the classifier is obtained.

After the classifiers were constructed, the electroencephalogram data from the other three patients were chronologically and equally divided into two groups (patient 2: C1, C2; patient 3: D1, D2; patient 4: E1, E2), with 40 segments of data in each group. All of the approximate entropies of the electroencephalogram data in each group were input into the constructed classifier for classification (Figure 2).

Figure 2

Test flowchart of electroencephalogram data of epileptic patients (C1, C2; D1, D2; E1, E2).

All approximate entropy values of the electroencephalogram data in the six groups were input into the constructed classifier for classification. ‘Extract characteristics’ extracts the approximate entropy of the electroencephalogram data. Finally, the classification results of the electroencephalogram data from the six groups were obtained.

As the classification result of the electroencephalogram data in each group was a one-dimensional vector, we can combine electroencephalogram C1, C2, D1, D2, E1, E2 into groups C, D, and E.

Classification results

The characteristics of classifiers can be described using the following indices, where a positive result means that a segment is classified as ictal:

$$\text{Sensitivity} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \qquad (1)$$

Sensitivity is the percentage of ictal (seizure) segments that are correctly identified as ictal.

$$\text{Specificity} = \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}} \qquad (2)$$

Specificity is the percentage of interictal segments that are correctly excluded from the seizure class.

$$\text{Accuracy} = \frac{\text{number of correct classifications}}{\text{total number of test samples}} \qquad (3)$$

Accuracy is the overall identification rate of the classifier.

In group C, according to the clinician's diagnosis report, 96–130 seconds were regarded as the ictal period, and the patient presented with apparent spasm in the left upper limb, so segments 48–65 were taken as the ictal data. As shown in Figure 3, the data in segments 49, 51–65, and 75 (17 segments in total) were classified as ictal data.
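As a worked illustration of the three indices, the sketch below recomputes them for group C from the segment numbers quoted above, using the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = correct / total, with "ictal" as the positive class). The function names are illustrative, and the resulting values are derived only from these segment lists; they are not taken from Table 1.

```python
def confusion_counts(true_ictal, predicted_ictal, n_segments):
    """Count TP, FN, FP, TN with 'ictal' treated as the positive class."""
    tp = len(true_ictal & predicted_ictal)   # ictal segments found by the classifier
    fn = len(true_ictal - predicted_ictal)   # ictal segments that were missed
    fp = len(predicted_ictal - true_ictal)   # interictal segments flagged as ictal
    tn = n_segments - tp - fn - fp           # interictal segments left alone
    return tp, fn, fp, tn


def classifier_indices(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Group C, segment numbers as quoted in the text (80 segments of 2 s each).
clinician_ictal = set(range(48, 66))               # segments 48-65
classified_ictal = {49, 75} | set(range(51, 66))   # segments 49, 51-65 and 75

counts = confusion_counts(clinician_ictal, classified_ictal, n_segments=80)
sens, spec, acc = classifier_indices(*counts)
print(counts)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
```

Under these assumptions, segments 48 and 50 count as missed ictal segments and segment 75 as a false alarm.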

Figure 3

Electroencephalogram data classification results of patient 2 (group C).

According to the support vector machine classifier algorithm, the Y axis represents classification results, 1 represents the epileptic interictal period and –1 represents the epileptic ictal period. The X axis represents electroencephalogram data at 1–80 segments for 0–160 seconds.

Each segment of data represents 2 seconds. Segments 49, 51–65, 75 (total 17 segments) can be considered as ictal data.

In group D, according to the clinician's diagnosis report, the data in segments 44–63 were the ictal data (Figure 4).

Figure 4

Electroencephalogram data classification results of patient 3 (group D).

According to the support vector machine classifier algorithm, the Y axis represents classification results, 1 represents the epileptic interictal period and –1 represents the epileptic ictal period. The X axis represents the electroencephalogram data at 1–80 segments for 0–160 seconds.

Each segment of data represents 2 seconds. Segments 44–63 can be considered as ictal data.

In group E, according to the clinician's diagnosis report, the data in segments 53–72 were the ictal data (Figure 5).

Figure 5

Electroencephalogram data classification results of patient 4 (group E).

According to the support vector machine classifier algorithm, the Y axis represents classification results, 1 represents the epileptic interictal period and –1 represents the epileptic ictal period. The X axis represents the electroencephalogram data at 1–80 segments for 0–160 seconds.

Each segment of data represents 2 seconds. Segments 53–72 can be considered as ictal data.

Based on the clinician's diagnosis report, the classification sensitivity, specificity and accuracy rate were very high in groups C, D, and E. The accuracy rate of electroencephalogram data classification in the three groups reached 93.33% (Table 1).

Table 1

Comparison of the sensitivity, specificity and accuracy of electroencephalogram data classification results for three epileptic patients

DISCUSSION

In this study, we extracted an electroencephalogram signal index using a nonlinear dynamics method and used a support vector machine to classify epileptic electroencephalogram signals. Compared with previous classification results[1,2,7–10], our findings are more accurate and convincing for the following reasons: First, previous studies used electroencephalogram data from one database and included only a few cases[1,2,7–10], while our study collected the electroencephalogram data of four patients from two different hospitals. Second, previous studies also used training data and test data from one database[1,2,7–10], while our study collected training data and test data for the support vector machine from different databases and different cases. Finally, we calculated the average classification accuracy rate of three patients as the classification result, which is more convincing. Our experimental findings suggest that classifiers constructed with nonlinear dynamics indexes can effectively identify epileptic electroencephalogram signals and have good generalization ability. This method is simple and easy to use, and may benefit real-time detection and identification systems for epilepsy, thus assisting clinical diagnosis and treatment.

Currently, neuroscience has become a hot spot in life science research. The characteristics of the nervous system may be quantitatively described through research into neurophysiological signals and the use of nonlinear dynamics methods, thus furthering our understanding of epilepsy. In this study, we used nonlinear dynamics and a support vector machine to classify epileptic electroencephalogram signals, and obtained a high accuracy rate compared with previous studies[1,2,7–10]. Furthermore, the classifiers constructed with a nonlinear dynamics index showed good generalization ability. However, our experiments are theoretical research and are not yet fit to be used clinically, so practical applications deserve further exploration.

SUBJECTS AND METHODS

Design

A computer-simulation, neuro-electrophysiological experiment.

Time and setting

Experiments were performed at the Medical Information Processing Laboratory, Department of Biomedical Engineering, Zhongshan School of Medicine, Sun Yat-sen University, China from July to October in 2012.

Subjects

Electroencephalogram data from partial epileptic patients were collected from the First Affiliated Hospital of Sun Yat-sen University and the Second Affiliated Hospital of Guangzhou Medical College, China.

Inclusion criteria

(1) Neocortical localized epilepsy; (2) clear epileptic loci; (3) onset within 1 minute, with clear ictal and interictal periods.

Exclusion criteria

(1) Limbic system localized epilepsy; (2) unclear epileptic loci; (3) onset later than 1 minute, with unclear ictal and interictal periods.

Electroencephalogram data

Electroencephalogram data were collected from the electroencephalogram room, First Affiliated Hospital of Sun Yat-sen University (patient 1) and the electroencephalogram room, Department of Neurology, Second Affiliated Hospital of Guangzhou Medical College (patients 2–4), China, at sampling frequencies of 200 Hz and 256 Hz, respectively.

Patient 1: Temporal lobe epilepsy. Electroencephalogram data were collected via the F7 lead (left anterior temple) for 1 200 seconds continuously, and divided into interictal data (40 segments; group A) and ictal data (40 segments; group B) according to the clinician's reports. Each group contained 2 000 points of data.

Patient 2: Temporal lobe epilepsy. Electroencephalogram data were collected via the F8 lead (right anterior temple) for 160 seconds continuously, and chronologically divided into 80 segments (group C). Each segment contained 514 points of data, with each segment representing 2 seconds. According to clinical performance, 88–126 seconds were regarded as the ictal period, and the patient presented with left upper limb convulsions.

Patient 3: Temporal lobe epilepsy. Electroencephalogram data were collected via the T4 lead (right middle temple) for 160 seconds continuously, and chronologically divided into 80 segments (group D). Each segment contained 514 points of data, with each segment representing 2 seconds. According to clinical performance, 88–126 seconds were regarded as the ictal period, and the patient presented with left upper limb convulsions and a left deflection of the head and the eyes.

Patient 4: Frontal lobe epilepsy. Electroencephalogram data were collected via the Fp1 lead (left anterior temple) for 160 seconds continuously, and chronologically divided into 80 segments (group E). Each segment contained 514 points of data, with each segment representing 2 seconds. According to clinical performance, 106–144 seconds were regarded as the ictal period, and the patient presented with left upper limb stiffness and a right deflection of the head and the eyes.

All of the electroencephalogram data were collected from the scalp, processed and stored in the First Affiliated Hospital of Sun Yat-sen University, China and the Second Affiliated Hospital of Guangzhou Medical College, China. Prior to the experiments, all four epileptic patients were informed of the experimental scheme and risk, and gave their informed consent[12]. The electroencephalogram data from the four epileptic patients are summarized in Table 2.

Table 2

General information of the four partial epileptic patients

Methods

Extraction of nonlinear characteristics value

Introduction of approximate entropy: Common nonlinear dynamics parameters include the correlation dimension, the Lyapunov exponent, and approximate entropy. Among these, approximate entropy is a statistic that quantifies the complexity of a time series. It measures the probability that new patterns appear when the dimension increases from m to m + 1, that is, the rate at which new information is generated in the time series. Complex signals tend to have high approximate entropy; regular signals tend to have low approximate entropy. Approximate entropy can reflect the overall characteristics of the signals. Compared with the correlation dimension and the Lyapunov exponent, approximate entropy requires only a small amount of data (500–1 000 points) and is robust to noise.

Generally, approximate entropy is calculated from its definition. For a given N-point time series {u(i)}, approximate entropy can be obtained by the following steps (m is the preset pattern dimension, and r is the preset similarity tolerance):

(1) The sequence {u(i)} is arranged into m-dimensional vectors X(i), namely:

$$X(i) = [u(i), u(i+1), \ldots, u(i+m-1)], \quad i = 1, \ldots, N-m+1$$

(2) For every i, the distance between vector X(i) and each vector X(j) is calculated:

$$d[X(i), X(j)] = \max_{k=0,\ldots,m-1} |u(i+k) - u(j+k)|$$

(3) Given the threshold r (r > 0), for each i the number of vectors X(j) with d[X(i),X(j)] < r is counted, and the ratio of this number to the total number of vectors, N − m + 1, is denoted $C_i^m(r)$, namely:

$$C_i^m(r) = \frac{\text{number of } j \text{ with } d[X(i), X(j)] < r}{N-m+1}$$

(4) The logarithm of $C_i^m(r)$ is taken and averaged over all i, denoted $\Phi^m(r)$, namely:

$$\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$$

(5) For m + 1, repeat steps 1–4 to obtain $\Phi^{m+1}(r)$.

(6) The final result is obtained as follows:

$$\mathrm{ApEn}(m, r) = \lim_{N \to \infty} \left[ \Phi^m(r) - \Phi^{m+1}(r) \right]$$

For a series of finite length N, this is estimated as $\mathrm{ApEn}(m, r, N) = \Phi^m(r) - \Phi^{m+1}(r)$.

Usually m = 2 and r = 0.1–0.25 SD(u) (SD: standard deviation), for which the obtained approximate entropy has reasonable statistical properties, consistent with previous studies[13], so we set m = 2 and r = 0.25 SD(u) for approximate entropy.

A fast approximate entropy algorithm: Approximate entropy was extracted using a fast algorithm, which overcomes the shortcomings of the common algorithm, such as low calculation efficiency, slow speed and unsuitability for real-time operation. This algorithm was proposed by Hong et al.[14] using the theory of the binary distance matrix; its speed was 5 times that of calculation from the definition.

1) For the N-point sequence, the N × N binary distance matrix D is calculated, and the element in the ith row and jth column of D is denoted $d_{ij}$:

$$d_{ij} = \begin{cases} 1, & |u(i) - u(j)| < r \\ 0, & |u(i) - u(j)| \ge r \end{cases}$$

2) $C_i^2(r)$ and $C_i^3(r)$ are calculated using the elements of matrix D:

$$C_i^2(r) = \frac{1}{N-1} \sum_{j=1}^{N-1} d_{ij}\, d_{(i+1)(j+1)}, \qquad C_i^3(r) = \frac{1}{N-2} \sum_{j=1}^{N-2} d_{ij}\, d_{(i+1)(j+1)}\, d_{(i+2)(j+2)}$$

3) The logarithms of $C_i^2(r)$ and $C_i^3(r)$ are taken and averaged over all i:

$$\Phi^2(r) = \frac{1}{N-1} \sum_{i=1}^{N-1} \ln C_i^2(r), \qquad \Phi^3(r) = \frac{1}{N-2} \sum_{i=1}^{N-2} \ln C_i^3(r)$$

4) Approximate entropy estimate:

$$\mathrm{ApEn}(2, r, N) = \Phi^2(r) - \Phi^3(r)$$

This fast approximate entropy algorithm omits the vector construction process and works directly on the differences between data points in the time series, thus improving the calculation speed of approximate entropy.

During seizures, a large group of neurons in the human brain discharge synchronously and several brain functions are inhibited; therefore, compared with normal brain electrical activity, the complexity is reduced. From the interictal period to the ictal period, approximate entropy decreases to varying degrees and then begins to increase after the end of the seizure[15,16].
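The two calculation routes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, similarity is taken as a distance strictly below r, SD(u) is the population standard deviation, and the test signals (a sine wave and Gaussian noise) are synthetic stand-ins for electroencephalogram segments.

```python
import math
import random
import statistics


def apen_definition(u, m=2, r_factor=0.25):
    """Approximate entropy computed step by step from the definition."""
    n = len(u)
    r = r_factor * statistics.pstdev(u)   # tolerance r = 0.25 SD(u)

    def phi(dim):
        count = n - dim + 1
        vectors = [u[i:i + dim] for i in range(count)]   # the vectors X(i)
        total = 0.0
        for xi in vectors:
            # Number of vectors X(j) whose maximum-norm distance to X(i) is < r.
            similar = sum(
                1 for xj in vectors
                if max(abs(a - b) for a, b in zip(xi, xj)) < r
            )
            total += math.log(similar / count)
        return total / count

    return phi(m) - phi(m + 1)


def apen_fast(u, r_factor=0.25):
    """Approximate entropy for m = 2 from a binary distance matrix."""
    n = len(u)
    r = r_factor * statistics.pstdev(u)
    # d[i][j] is True when |u(i) - u(j)| < r.
    d = [[abs(ui - uj) < r for uj in u] for ui in u]
    phi2 = 0.0
    for i in range(n - 1):
        c2 = sum(d[i][j] and d[i + 1][j + 1] for j in range(n - 1))
        phi2 += math.log(c2 / (n - 1))
    phi3 = 0.0
    for i in range(n - 2):
        c3 = sum(d[i][j] and d[i + 1][j + 1] and d[i + 2][j + 2]
                 for j in range(n - 2))
        phi3 += math.log(c3 / (n - 2))
    return phi2 / (n - 1) - phi3 / (n - 2)


random.seed(0)
n_points = 300
regular = [math.sin(2 * math.pi * i / 25) for i in range(n_points)]
irregular = [random.gauss(0.0, 1.0) for _ in range(n_points)]

print("regular:  ", apen_definition(regular), apen_fast(regular))
print("irregular:", apen_definition(irregular), apen_fast(irregular))
```

Both routes apply the same similarity test to the same pairs of points, so they should agree up to floating-point rounding; the regular signal should yield the lower value, in line with the statement that regular signals have low approximate entropy. This pure-Python sketch is not meant to reproduce the reported speed-up.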

Support vector machine

Definition of support vector machine: The support vector machine is an optimal design rule for linear classifiers proposed by Vapnik et al.[17–19]. Its basic principle is to find the optimal hyperplane in the feature space that correctly separates two classes of samples. The margin between the hyperplane and the nearest samples of each class should be maximal, so as to achieve greater generalization capability. When the optimal classification hyperplane cannot completely separate the two classes of samples, a slack variable ξ is introduced, allowing the presence of misclassified samples and thus balancing the empirical risk against the generalization ability. A support vector machine is based on structural risk minimization, while other methods are based on empirical risk minimization. Compared with other classification methods, a support vector machine retains the sample characteristics and considers adaptability to new input data; therefore, it has a stronger learning ability for small sample sizes[20].

Data points are expressed as an n-dimensional vector x, and the present study discusses a binary classification algorithm, with y = 1 or –1 representing the two classes. The classifier aims to find a hyperplane in the n-dimensional space, using the following equation:

$$w^{T} x + b = 0$$

Through the hyperplane, the two classes of data can be separated. If the data points on one side of the hyperplane are all labeled –1, then those on the other side are all labeled 1:

$$f(x) = w^{T} x + b$$

Obviously, if f(x) = 0, then x is a point located on the hyperplane. All points with f(x) < 0 correspond to y = –1, while f(x) > 0 corresponds to y = 1. During the classification process, we substituted the data points x into f(x). If the result was less than 0, we assigned the label –1, and if greater than 0, we assigned the label 1[21]. After the optimal classification plane in the feature space was chosen, the binary classification results could be tested with the test data. For new data x, f(x) can be defined as:

$$f(x) = \operatorname{sgn}\left( \sum_{i} \lambda_i y_i K(x_i, x) + b \right)$$

Here, $\lambda_i \ge 0$ is the Lagrange multiplier of training sample $x_i$, and the support vectors meet $y_i(\langle w, \Phi(x_i) \rangle + b) = 1$. $K(x_i, x)$ is the kernel function, which takes two vectors from the low-dimensional space and returns their inner product after mapping to a high-dimensional space, so that a linear discriminant function can be constructed in the high-dimensional space. Many nonlinear classification problems that are difficult in a low-dimensional space can be solved by using the kernel function to move to a high-dimensional space.

Influencing factors: There are two factors contributing to the properties of the support vector machine classifier: the kernel function and the error penalty parameter C[22,23]. Currently, the selection criteria for the kernel function remain elusive, and radial basis functions have achieved good outcomes in electroencephalogram signal analysis[24]. Radial basis functions can map sets of low-dimensional vectors from nonlinear time series into a high-dimensional space, so the kernel function in the support vector machine we used was a radial basis function. The error penalty parameter C is applied to optimize the generalization ability of the machine by trading off the proportion of misclassified samples against the algorithm complexity. In our experiments, we tried C = 50, C = 100, and C approaching infinity in the numerical calculation. When C = 50, there were two misclassified samples; when C = 100, there were also two misclassified samples; and when C was approaching infinity, only one sample was misclassified, so we set the error penalty parameter C as approaching infinity.
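To make the kernel decision rule concrete, the sketch below evaluates f(x) = sgn(Σ λ_i y_i K(x_i, x) + b) with a radial basis function kernel on a one-dimensional feature such as approximate entropy. Everything numeric here is a hypothetical assumption chosen for illustration (two support vectors, unit multipliers, zero bias, gamma = 10); none of it is fitted to electroencephalogram data or taken from the study, and training the classifier is not shown.

```python
import math


def rbf_kernel(a, b, gamma):
    """Radial basis function kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def decision_value(x, support_vectors, labels, multipliers, bias, gamma):
    """Weighted sum of kernel evaluations against the support vectors, plus bias."""
    return sum(
        lam * y * rbf_kernel(sv, x, gamma)
        for sv, y, lam in zip(support_vectors, labels, multipliers)
    ) + bias


def classify(x, **model):
    """Return 1 (interictal) for a positive decision value, otherwise -1 (ictal).

    A value of exactly 0 lies on the hyperplane; this sketch assigns it to -1.
    """
    return 1 if decision_value(x, **model) > 0 else -1


# Hypothetical model: illustrative numbers only, not a trained classifier.
model = dict(
    support_vectors=[(0.4,), (0.8,)],   # assumed approximate entropy values
    labels=[-1, 1],                     # -1 = ictal, 1 = interictal
    multipliers=[1.0, 1.0],             # assumed Lagrange multipliers
    bias=0.0,
    gamma=10.0,                         # assumed kernel width parameter
)

print(classify((0.3,), **model))   # low entropy, nearer the ictal support vector
print(classify((0.9,), **model))   # high entropy, nearer the interictal one
```

With these assumed values the decision boundary sits midway between the two support vectors, so low approximate entropy values fall on the ictal side and high values on the interictal side.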

Statistical analysis

All data were derived from raw electroencephalogram recordings. The data of patient 1 were divided into groups A and B, and the approximate entropy of the two groups was compared using a paired samples t-test. A value of P < 0.01 was considered significant.
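For reference, the paired samples t statistic is the mean of the pairwise differences divided by its standard error. The short sketch below computes it with the Python standard library; the function name and the eight numbers are hypothetical placeholders, not approximate entropy values from the study, and pairing segment k of group A with segment k of group B is an assumption about how the pairs were formed.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """t = mean(d) / (sd(d) / sqrt(n)) for the pairwise differences d = a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical approximate entropy values, four segments per group.
interictal = [1.0, 1.2, 0.9, 1.1]   # stand-in for group A
ictal = [0.5, 0.6, 0.6, 0.5]        # stand-in for group B

t = paired_t_statistic(interictal, ictal)
print(round(t, 4), "with", len(interictal) - 1, "degrees of freedom")
```

The statistic is then compared with the t distribution on n − 1 degrees of freedom to obtain the P value.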

17 in total

1. Physiological time-series analysis using approximate entropy and sample entropy.

Authors: J S Richman; J R Moorman
Journal: Am J Physiol Heart Circ Physiol Date: 2000-06 Impact factor: 4.733

2. Discriminating preictal and interictal states in patients with temporal lobe epilepsy using wavelet analysis of intracerebral EEG.

Authors: Kais Gadhoumi; Jean-Marc Lina; Jean Gotman
Journal: Clin Neurophysiol Date: 2012-04-03 Impact factor: 3.708

3. Support vector machines for histogram-based image classification.

4. Support vector machines for spam categorization.

5. A rule-based seizure prediction method for focal neocortical epilepsy.

6. Classification of patterns of EEG synchronization for seizure prediction.

7. Epileptic seizure prediction and control. (Review)

8. Automatic seizure detection using wavelet transform and SVM in long-term intracranial EEG.

9. Approximate entropy-based epileptic EEG detection using artificial neural networks.

10. Application of approximate entropy on dynamic characteristics of epileptic absence seizure.
