Malihe Hassani1, Mohammad Reza Karami1. 1. Department of Electronic Engineering, Faculty of Electrical and Computer Engineering, Babol University of Technology, Babol, Iran.
Abstract
The Volterra model is widely used for nonlinearity identification in practical applications. In this paper, we employed Volterra model to find the nonlinearity relation between electroencephalogram (EEG) signal and the noise that is a novel approach to estimate noise in EEG signal. We show that by employing this method. We can considerably improve the signal to noise ratio by the ratio of at least 1.54. An important issue in implementing Volterra model is its computation complexity, especially when the degree of nonlinearity is increased. Hence, in many applications it is urgent to reduce the complexity of computation. In this paper, we use the property of EEG signal and propose a new and good approximation of delayed input signal to its adjacent samples in order to reduce the computation of finding Volterra series coefficients. The computation complexity is reduced by the ratio of at least 1/3 when the filter memory is 3.
The Volterra model is widely used for nonlinearity identification in practical applications. In this paper, we employed Volterra model to find the nonlinearity relation between electroencephalogram (EEG) signal and the noise that is a novel approach to estimate noise in EEG signal. We show that by employing this method. We can considerably improve the signal to noise ratio by the ratio of at least 1.54. An important issue in implementing Volterra model is its computation complexity, especially when the degree of nonlinearity is increased. Hence, in many applications it is urgent to reduce the complexity of computation. In this paper, we use the property of EEG signal and propose a new and good approximation of delayed input signal to its adjacent samples in order to reduce the computation of finding Volterra series coefficients. The computation complexity is reduced by the ratio of at least 1/3 when the filter memory is 3.
An electroencephalogram (EEG) is a test that detects the electrical activity over a short period of time, usually 20–40 min in the brain using electrodes placed on the scalp. German physiologist and psychiatrist, Hans Berger (1873–1941) measured the first human EEG in 1924, although similar studies had been carried out in animals as early as 1870.The common activity of millions of neurons, at the depth of several millimeters that have same spatial direction yields an electrical field, which is strong enough in order to measure from the people scalp.[1] The amplitude of an EEG signal is almost in the range of 40–100 μV with the five major frequency bands from 0 to 100 Hz.[1]These frequency bands from low to high frequencies are respectively called delta (δ) 0.5–4 hz, theta (θ) 4–8 hz, alpha (α) 8–13 hz, beta (β) 13–30 hz, and gamma (γ) >30 Hz.An important branch of studies regarding to EEG is artifact removal and noise canceling. EEG is nonstationary and highly vulnerable to a variety of noises, especially frequency interference, thus, eliminating the noise in the raw EEG data so as to obtain useful information that reflects the brain activities and states is an important subject for EEG.[2] Various artifact removal methods were introduced in the literature; the most usual artifact elimination method is based on linear regression, and its purpose is to reject the most repeated artifact, that is, eye movement or blinking. This approach can approximate the concurrently measured electrooculogram (EOG) signals in the EEG and eliminate them.[3] However, because the EOG electrodes are located near the EEG electrodes, they involve a detailed amount of brain activity. Therefore, eliminating these signals may produce distortions in the EEG signals. More complete methods are established on a linear decomposition of the multichannel EEG recordings. The most common approach is the blind source separation (BSS).[4] The principal BSS assumption is based on the fact that the artifact sources are independent from brain sources, either normal or pathologic; the goal is to retrieve the original sources (brain and artifactual), by giving only sensor observations.Normally, source separation algorithms outperform in no noise conditions, so denoising step (to remove additive noise) should be performed in the preprocessing stage of the EEG signal. A common solution for noise canceling from nonstationary signals is wavelet denoising (WD). The decomposition of a noisy signal on a wavelet basis (discrete orthogonal wavelet transform, discrete wavelet transform [DWT]) concentrates the main signal in a few wavelet coefficients that have large absolute values without changing the noise random distribution. Therefore, denoising can be performed by putting a threshold on the wavelet coefficients.[5678]In this paper, the noise is considered dependent on the signal and has a nonlinear relation to it,[9] so we employed a nonlinear noise estimation approach based on Volterra series expansion to remove the noise. Recognition and compensation of undesired nonlinearity is one of the important subjects in the field of digital signal processing.[10] Adaptive Volterra filtering also has some applications to acoustic echo cancelation.[11] It was shown that the performance of the system was affected by unwanted nonlinearities in the system.[12] The influences of unwanted nonlinearities can be decreased by different ways.[131415] Nonlinear models also have some applications in the speech processing domain and result in a better performance in this field.[16] Appling Volterra series to model a nonlinear system is widely used with great success. However, in the present conditions, methods of calculating the Volterra series coefficients are not general, despite the fact that they can be estimated for systems with known and fixed order. When the nonlinear system order is unknown, adaptive methods and algorithms are widely used for the Volterra kernel estimation. One important issue in kernel calculation is the speed of the process. If there is fast Volterra series coefficients estimation, we can use a higher order model for the system that leads to a better performance. This paper proposes a new implementation of the higher order Least mean squares (LMS) Volterra filter. A second order nonlinear system with memory is identified using the new implementation of the LMS algorithm for Volterra kernels estimation.The new implementation uses the property of EEG signal and is based on an approximation of delayed input signal to its adjacent samples and then applying only linear adaptive LMS algorithm. After finding Volterra series coefficients, we employed Volterra model to find the nonlinearity relation between EEG signal and the noise, which is a new way to estimate noise in EEG signal. We show that, by employing this method we can considerably improve the signal to noise ratio (SNR).
THE VOLTERRA MODEL
For a discrete-time and causal nonlinear system with memory, the input-output relation of the nonlinear Volterra filter is given by:[17]Where the functions hr, r = 1,…, N represent the Volterra kernels, and N represents the nonlinearity degree of the model. By choosing N = 2, the Volterra series expansion can be expressed as:Above equation contains one zero-order sentence, that is, h0, M sentence corresponding to first-order Volterra filter and M2 sentence corresponding to second-order Volterra filter. We can express the relation of input-output by applying nonlinear operators as illustrated below.y[n]=h0[x[n]]+H1[x[n]]+H2[x[n]] (3)In above representations, the functions hi, i = 0, 1, 2 are the kernels corresponding to the nonlinear operators Hi [x[n].The model characterized by the Eq. 2 and Eq. 3 is called a second-order nonlinear Volterra model. In above expressions, filter memory is considered equal for all nonlinearity orders. In general form, Eq. 1 can have a different memory for each nonlinearity order. We can also consider symmetric Volterra coefficients that lead to more simplification.The second-order Volterra kernel is an (M × M) matrix. The third-order kernel has M matrices with the size of (M × M). By considering symmetric kernels with memory M, we need to determine M (M + 1)/2 coefficients for the second-order Volterra kernel, while the third-order one needs M (M + 1) (M + 2)/6 coefficients.[15]The accuracy of determining Volterra coefficients is an important issue in the practical applications. It was shown that homogeneity and non-orthogonality were two features of Volterra kernels. As a result of non-orthogonality, cross-correlation methods cannot be used to calculate Volterra kernels and the quantities of the Volterra kernels is dependent to the order of the applied Volterra filter.[1819] However, for inputs that their amplitude density function is symmetric, like the Gaussian noise, the odd order Volterra functional is orthogonal to the even order Volterra functional. Hence, for this kind of input, a second-order Volterra model, with zero DC component, is an orthogonal model. Hence, we can apply cross-correlation technique in order to directly calculate Volterra kernel.[17] When the higher order Volterra model is used, adaptive techniques of kernel measurements are applied. Since the relation of the input-output is linear according to Volterra coefficients, using adaptive algorithms for the implementation of Volterra model is very easy. Here, we present the input vectors for different order filters. The first order input vector, when M = 3 is written as follows:X(1)=[x[n] x[n–1] x[n–2]] (4)By considering equal memories for different orders kernels, “the second order input vector” is given by:X(2)=X(1)×X(1) (5)When the kernels are symmetric, just xi, j on the assumption of i ≥ j, of X(2), are chosen in the input-output relation of the Volterra model. Therefore, “the second order input vector,” in the vector form is:And its length is 6. For “the third order input vector” it is proposed to express the multiple input delayed signal products in Eq. 2 by matrices elements.[11] Multiplying “the second order input vector” (Eq. 5) by the first order input vector can create these matrices. By considering equal memory for all filters in Eq. 2, M = 3, and symmetric kernels we have:[20]Therefore, “the third order input vector” composed of three matrices as illustrated in Eq. 7–9.[20] “The third order input vector” can be expressed in vector form as indicated below:Its size is (1 × 10). These input vectors are used to implement the LMS Volterra filter in nonlinear system identification.
ESTIMATING SECOND-ORDER VOLTERRA SERIES COEFFICIENTS
At the first step, it is required to find the nonlinear relation between noisy signal and noise. Then, we employ this relation to estimate noise present in the signal at next step. Consider the following block diagram [Figure 1]:
Figure 1
Relation between noisy signal and noise
Relation between noisy signal and noiseIn which y[n] is noisy signal and e[n] is the noise. As expressed in previous part, we have the following relation for above block diagram:Where the nonlinearity degree of the model is 2 and M is the filter memory. h0, h1 (k1), h2 (k1, K2) are Volterra series coefficients. These coefficients will be obtained by LMS adaptive filter and the technique explained later is used for reducing its computation complexity. We employ these coefficients in nonlinear modeling of noise.
LEAST MEAN SQUARES ALGORITHM FOR TRUNCATED VOLTERRA SERIES MODEL
The development of a gradient type LMS adaptive algorithm for truncated Volterra series nonlinear models follows a similar method of development as for linear systems.[21]The truncated rth order Volterra series expansion is:Assuming h0 = 0 and r = 2, the weight vector can be expressed as:The input vector is:Linear and quadratic coefficients are updated separately by minimizing the instantaneous square of the error:Where y(n) = (n) is the estimate of d (n). This results in the update equations:Where μ1 and μ2 are step-sizes used to control the speed of convergence and ensure the stability of the filter.Using the weight vector notation, H[n], we can combine the two update equations into one as the coefficient update equation:Where μ is chosen such that 0 < μ1,μ2 < and λmax is the maximum eigenvalue of the autocorrelation matrix of the input vector X[n].
Reducing Computation Complexity to Find Volterra Series Coefficients
Consider the following second-order Volterra representation as an example:By extending above relation for M = 3 and considering symmetric kernel we have:In which h2 [k1] is the same as h2 [k1, k2] if k1 and k2 are equal.An example of EEG signal is shown in terms of μV in Figure 2, Because difference between adjacent samples in EEG signal is negligible as shown in Table 1 and Figure 3,[22] we can consider x[n] x[n−1] equal by x[n]2, x[n] x[n−2] equal by x[n−1]2, and x[n−1]x[n−2] equal by x[n−2]2, then we can rewrite above equation:
Figure 2
An example of electroencephalogram signal
Table 1
An example of 10 consecutive sample of an EEG signal
Figure 3
Difference between consecutive samples of an electroencephalogram signal
An example of electroencephalogram signalAn example of 10 consecutive sample of an EEG signalDifference between consecutive samples of an electroencephalogram signalWe can replace z[n] = x[n]2:Instead of employing nonlinear adaptive filter that is very complicated, we can use the linear one and compute h2[0], h2[1], h2[2], and use these same coefficients instead of 2h2[0,1], 2h2[0,2], and 2 h2[1,2], respectively.As a result we do not need to compute 9 coefficients, 3 coefficients need not to be computed due to symmetry, and 3 coefficients is obtained by sufficient approximation of x[n] with its adjacent samples.Hence, instead of computing M2 coefficients in second order kernel h2 [k1, k2], we only need to compute M coefficients. Then when M = 3, we can reduce the computation by the ratio of 1/3 even more because we do not consider the complexity of nonlinear adaptive filtering.
Estimating Noise Based on Second-Order Volterra Series Coefficients
After estimating the second-order Volterra series coefficients in the previous part, it is time to employ these coefficients for estimating noise. Consider the following system [Figure 4]:
Figure 4
Block diagram of noise estimation by Volterra model
In which y[n] is noisy signal, e[n] is estimated noise, and d[n] is denoised signal.Block diagram of noise estimation by Volterra modelWe use Volterra series coefficients h0, h1 (k1), h2 (k1, k2) obtained in previous part to estimate e[n], that is, noise present in the noisy signal y[n]. Then d[n] is obtained according to the following equation:d[n]=y[n]–e[n] (23)
The Evaluation Criteria
For denoising algorithms evaluation, we have used the classical criteria of the mean squared error (MSE) value and (SNR) value between the original signals and their denoised versions, as follow:Where x(n) the original signal, (n) its denoised version and N is the length of the signal. Furthermore, we have obtained the ratio of SNRs in denoised signal and noisy signal as a criterion of reduction of noise as follows:Moreover, smaller MSE and larger SNR values indicated a better denoisng algorithm.
SIMULATION RESULTS
Synthetic Data
The synthetic noisy signal y has a length 500 and is analogous to EEG signal in no stationary points of view. It also has variable spectrum. The synthetic signal is a sine wave with a specific frequency in every 100 samples and has variance 0.5051, and the Gaussian noise is added to signal in two cases with mean zero and variance 0.1 and 0.3.[2324]As stated above, y can be expressed as follows:Where w(n) is the Gaussian noise and f1,…, f5 is the frequencies in the range of EEG frequency. Because EEG signal is not stationary and has different characteristics at different times, the synthetic signal is chosen so that it has the same behavior.In order to find second-order Volterra coefficients between y[n] and e[n], we divide y[n] into periods of P points and compute the noise of current period by using Volterra coefficients of previous period, because adjacent points have approximately similar coefficients. Tables 2-4 show the results for noise variances (v) of 0.1, 0.3, and 0.5, respectively. As you see, SNR is more and more improved by decreasing P as it is expected.
Table 2
Results from denoising simulated signals by changing the value of P (variance of noise is 0.1)
Table 4
Results from denoising simulated signals by changing the value of P (variance of noise is 0.5)
Results from denoising simulated signals by changing the value of P (variance of noise is 0.1)Results from denoising simulated signals by changing the value of P (variance of noise is 0.3)Results from denoising simulated signals by changing the value of P (variance of noise is 0.5)In every case, several simulations were performed, and the results of denoising algorithms are shown as a mean ± standard deviation.As nowadays classical solution for noise removal from nonstationary signals is WD, which has the capability of studying both frequency and time information simultaneously. The basic idea is simple: By decomposing the signal on a wavelet basis (DWT), we obtain a representation of the signal that concentrates most of its energy in few wavelet coefficients having large absolute values. On the contrary, most of the noise tends to be represented by the wavelet coefficients having small values, which means that its energy will not be retained by large value coefficients.Consequently, performing a partial reconstruction of the signal using only these large coefficients (by inverse DWT) leads to an almost noise-free version of the signal.[25]To do comparison between our proposed algorithm with other denoising algorithms, the results are computed for Daubechies, “db8,” “db6,” and “db4” wavelets and are shown in Tables 5-7 for noise variances of 0.1, 0.3, and 0.5, respectively. As you understand from the results, although WD is used for no stationary signals, in comparison with our nonlinear method, it does not perform well.
Table 5
Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.1)
Table 7
Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.5)
Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.1)Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.3)Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.5)Furthermore, we depicted the amount of R in different variances of 0.1, 0.3, and 0.5 for Daubechies, “db8,” “db6,” and “db4” wavelets and nonlinear modeling of noise when P = 2 in Figure 5.
Figure 5
The amount of R in different variances of 0.1, 0.3 and 0.5 for Daubechies, “db8,” “db6,” and “db4” wavelets and nonlinear modeling of noise when P = 2
The amount of R in different variances of 0.1, 0.3 and 0.5 for Daubechies, “db8,” “db6,” and “db4” wavelets and nonlinear modeling of noise when P = 2We depicted noisy signal y[n], estimated noise e[n], and denoised signal d[n] for 100 sample length (in order to see the details) for the case that variance of noise is 0.1 in Figures 6 and 7 for P = 2 and 250, respectively.
Figure 6
Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 2
Figure 7
Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 250
Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 2Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 250Furthermore, in Figure 8 we depicted the histogram of estimated noise when P = 250 and variance of noise is 0.3, and then we calculated the goodness of fit by applying Chi-square test in different variances of noise and P = 250 for level significant of 0.05. As we know the P > 0.05 means, there is no significant difference between the estimated noise and the existing noise in the signal. We performed the simulation several times and Table 8 shows the percentage of times that the P value of Chi-square test is >0.05 in different variances of noise, that is, denoted by G. We can see, in most of the time the P > 0.05 and also G is increased by increasing variance of noise.
Figure 8
Histogram of estimated noise when P = 250 and variance of noise is 0.3
Table 8
Percentage of times that P > 0.05 in different variances of noise
Histogram of estimated noise when P = 250 and variance of noise is 0.3Percentage of times that P > 0.05 in different variances of noise
Real Data
EEG has small amplitude in the range of μV. There are five major frequency bands in the EEG. These frequency bands from low to high frequencies are called delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ), respectively.[1]Delta is the frequency range up to 4 Hz. It has the highest amplitude. It exists normally in adults in slow wave sleep. It is also seen normally in babies. Theta waves lie within the range of 4 Hz to 7 Hz. Theta is seen normally in young children, in drowsiness in older children and adults. Alpha is the frequency range from 7 Hz to 14 Hz. It is seen in normal adults during relaxed and mentally inactive awareness. Beta waves lie within the range of 15 Hz to about 30 Hz. It is associated with expectancy states and tension. Gamma is the frequency range approximately 30–100 Hz. Usually, it is not of clinical interests and thus often filtered out in many practical situations.The real data were recorded from 32 electrodes placed at the standard positions of the 10–20 international system with a Biosemi Active Two system and are thus reference free.[2226] They are not recorded from the scalp. The average signal from the two mastoid electrodes was used for referencing. Arbitrary referencing schemes can be implemented by subtracting the reference channel(s) from all other channels. The sampling rate is 2048 Hz. Note that the data might contain artifacts coming from eye-blinks, eye-movements, muscle-activity, etc. This means that the data are recorded under real conditions.Human data based on five disabled and four healthy subjects. The disabled subjects (1–5) were all wheelchair-bound but had varying communication and limb muscle control abilities. The four healthy subjects (6–9) were all male PhD students, ages 30, and had no known neurological deficits. Signals were recorded at 2048 Hz sampling rate from 32 electrodes placed at the standard positions of the 10–20 international system.We run the proposed algorithm for real data of length 0.48 s (1000 samples), but for the purpose of better representation, Figure 9 shows 100 samples of an EEG noisy signal, y[n], its estimated noise e[n] and denoised EEG signal d[n].
Figure 9
Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 2, when the data is real
Noisy signal y[n], estimated noise e[n] (it is depicted bigger for the purpose of better representation), and denoised signal d[n], for P = 2, when the data is realIn order to compute and investigate the evaluation criteria for real data, at first we denoised it by our proposed method and consider it as an original signal and then add it Gaussian noise with mean zero and three different variances. Tables 9-11 show the results in the case of real data for noise variances of 0.1, 0.3, and 0.5, respectively.
Table 9
Results from denoising real data by changing the value of P (variance of noise is 0.1)
Table 11
Results from denoising real data by changing the value of P (variance of noise is 0.5)
Results from denoising real data by changing the value of P (variance of noise is 0.1)Results from denoising real data by changing the value of P (variance of noise is 0.3)Results from denoising real data by changing the value of P (variance of noise is 0.5)In order to do a qualitative comparison of our denoising algorithm with other algorithms in the case of real data, we ask some expert to grade the quality of the denoised signal. We did it for all data and averaged the grades for final comparison. Note that we compared the results of our method when P = 2 in different variances of noise with wavelet db4.Table 12 shows the rating system for the quality of denoised signal and Table 13 shows the mean grades for it.
Table 12
The rating system for the quality of denoised signal
Table 13
Mean grades for the quality of denoised signal
The rating system for the quality of denoised signalMean grades for the quality of denoised signal
CONCLUSION AND FUTURE WORK
This paper uses the property of EEG signal and proposes a new and good approximation of delayed input signal to its adjacent samples to reduce the complexity of Volterra series coefficients computation in nonlinear modeling of EEG signal. This method is important in practical applications in which computing higher order kernels by using nonlinear adaptive filtering is very complicated. By using the proposed technique, we only need to compute M coefficients instead of M2 so reducing the computation by the ratio of at least 1/3, when M = 3. After finding Volterra series coefficients, we employed Volterra model to find the nonlinearity relation between EEG signal and the noise that is a new way to estimate noise in EEG signal. By employing this method, we can considerably improve the SNR in comparison with WD. We have to combine the linear and nonlinear method of estimating noise in the EEG signal in the future.
BIOGRAPHIES
Malihe Hassani has received her Bs.c & Ms.c in in electrical and electronic engineering 2002 & 2005 respectively. Now she is PhD candidate in Babol Noshirvani Institute of Tecknology, department of electrical and electronic engineering, she published more than 7 articled in journals and conferences. Her research interests are digital signal proessing,biomedical signal processing and image proessing.E-mail:
Mhassani@nit.ac.irMohammad Reza Karami has received the Bs.c in electrical and electronic engineering in 1992, Ms.c of signal processing in 1994, and PhD in 1998 in biomedical engineering from I.N.P.L d‘Nancy of France. He is now the Associate professor with the Department of Electrical and Computer Engineering, Babol University of Technology. Since 1998 his research is in signal and speech processing. He published mor than 100 articles in journals and conferences. He teaches Digital signal processing, Biomedical signal processing and speech processing in university. His research interests include Speech, Image and signal processing.E-mail:
mkarami@nit.ac.ir
Table 3
Results from denoising simulated signals by changing the value of P (variance of noise is 0.3)
Table 6
Results from denoising simulated signals by using Daubechies wavelets (variance of noise is 0.3)
Table 10
Results from denoising real data by changing the value of P (variance of noise is 0.3)