It is a challenge in evoked potential (EP) analysis to incorporate prior physiological knowledge into estimation. In this paper, we address the problem of single-channel trial-to-trial EP characteristics estimation. Prior information about phase-locked properties of the EPs is assessed by means of the estimated signal subspace and eigenvalue decomposition. Then, for situations in which dynamic fluctuations from stimulus to stimulus can be expected, this prior information is exploited by means of state-space modeling and recursive Bayesian mean square estimation (Kalman filtering and smoothing). We demonstrate that a few dominant eigenvectors of the data correlation matrix are able to model trend-like changes of some components of the EPs, and that the Kalman smoother algorithm is preferable in terms of tracking capability and mean square error reduction. We also demonstrate the effect of strong artifacts, particularly eye blinks, on the quality of the signal subspace and the EP estimates by means of independent component analysis applied as a preprocessing step on the multichannel measurements.
Evoked potentials (EPs) and ongoing brain activity oscillations,
obtained by scalp electroencephalogram (EEG) recordings, have been linked with
various cognitive processes and provide a means for studying cerebral function [1]. An EP is usually considered to be a wave or complex
elicited by and time-locked to a physiological or nonphysiological stimulation
or event. EPs are buried in background brain activity and in nonneural activity such as muscle noise. Since many parallel mental processes may occur simultaneously in the brain, it is difficult to observe and determine an evoked potential on a single-trial basis. Therefore, the simplest way to investigate
EPs is to use ensemble averages of time-locked EEG epochs obtained by repeated
stimulation. It is well known that this signal enhancement implies a loss of
information related to trial-to-trial variability, and nonstationary features of
event-related phenomena.

The generation mechanism of evoked responses is not precisely known in many situations. EPs are assumed to be generated either separately from ongoing brain activity, or through stimulus-induced
reorganization of ongoing activity. For example, it might be possible that
during the performance of an auditory oddball discrimination task, the brain
activity is being restructured while attention is focused on the target
stimulus [2]. Phase synchronization of ongoing brain activity is
one possible mechanism for the generation of event-related responses. That is,
following the onset of a sensory stimulus, the phase distribution of ongoing
activity changes from uniform to one which is centered around a specific phase [3]. Moreover, several studies have concluded that
averaged EPs are not separate from ongoing cortical processes, but rather, are
generated by phase synchronization and partial phase resetting of ongoing activity
[4, 5]. However, phase coherence over trials observed with
common signal decomposition methods (e.g., wavelets) can result both from a
phase-coherent state of ongoing rhythms and from the presence of a
phase-coherent event-related potential that is additive to the ongoing EEG [6]. Furthermore, stochastic changes in the amplitude and latency of different components of the EPs are able to explain a significant part of the intertrial variability of the measurements [6-9].

Several methods have been proposed for EP estimation and denoising; see, for example, [10-13]. In general, most of the methods for single-trial EP
analysis aim to decompose the measurements into relevant components or to
explain the data through some parameters. The parametrization gives the
necessary means to investigate, for example, the changes that the stimulus
causes to the ongoing EEG signal, or that the repetition of the test causes to
the responses. Most of the methods are based on an explicit model or on some
specific assumptions for the EPs. Every decomposition then involves at least
two main considerations. On the one hand, if the resulting estimates follow the measurements too closely, it is possible that some features of the data will still be hidden by phenomena unrelated to the stimulation. On the other hand, if the estimates do not follow the measurements, some features may have been neglected. Usually a balance between these considerations is made, and care is given to the correct interpretation of a parametrization that is able to reveal specific features of the experiment.

The performance and applicability of every
single-trial estimation method depends on the prior information used and the
statistical properties of the EP signals. Here, we focus on the case in which some parameters of the EPs change dynamically from stimulus to stimulus. This situation could be a trend-like change of the amplitude or latency of some phase-locked component of the EPs.
Although, for example, the above-mentioned methods [10-13] could be used to estimate such changes, they do not take this trend-like variability into account in the estimation procedure.

The most obvious way to handle time variations between single-trial measurements is subaveraging of the measurements in groups. Subaveraging
could give optimal estimators if the EPs are assumed to be invariant within the
subaveraged groups. A better approach is to use moving window or exponentially
weighted average filters; see, for example, [14, 15]. Other adaptive methods have also been proposed for
EP estimation, especially for brain stem potential tracking, for example, [16]. The statistical properties of some average filters
and different recursive estimation methods for EP estimation have been
discussed through Kalman filtering in [17]. Some smoothing methods have also been proposed for
modeling trial-to-trial variability in EPs (e.g., [18]).

An elegant way to describe trial-to-trial variations in EPs can be given through state-space models. State-space modeling for
single-trial dynamical estimation considers the EP as a vector-valued random
process with stochastic fluctuations from stimulus to stimulus [17]. Then, past and future realizations contain
information of relevance to be used in the estimation procedure. Recursive
estimates for the states, that are optimal in the mean square sense, are given
by Kalman filter and smoother algorithms. Of importance is also the
parametrization of the problem and the selection of an observation model for
the measurements. For example, in [16, 17] generic observation models were used based on shifted
Gaussian-shaped smooth functions. While other generic observation models could
also be considered, when all the measurements are available, data-based
observation models can be used.

In this paper, we extend the method presented in [17] to the use of the Kalman smoother algorithm. We demonstrate that for batch processing the use of the smoother algorithm is preferable: fixed-interval smoothing improves the tracking performance of EP characteristics and achieves greater noise reduction. In parallel, we propose a novel
method for state-space modeling of EPs. The method is based on the eigenvalue
decomposition of the ensemble data correlation matrix. A few dominant
eigenvectors form a signal subspace that can be used for single-trial
estimation. Subspace-based methods have already been proposed for EP
estimation, for example, in [12, 19]. However, these approaches do not take into account, in the estimation procedure, situations in which some characteristics of the EPs change dynamically from stimulus to stimulus. In this paper, we demonstrate that such a signal subspace can be used to model dynamic changes present in EP measurements.

The approach is demonstrated with simulated and real
measurements obtained by an auditory EP experiment. Finally, we investigate the
effect of strong artifacts on the quality of the estimates by means of
independent component analysis (ICA), which is applied as a preprocessing step
on the multichannel measurements.
2. METHODS
The sampled potential (from a given channel) relative to the successive stimulus or trial t can be denoted with a column vector of length M:

z_t = [z_t(1), z_t(2), ..., z_t(M)]^T,  t = 1, 2, ..., T,

where T is the total number of trials.
2.1. Linear estimation and additive noise model
A widely used model for EP estimation is the additive noise model. The observations are then assumed to be of the form

z_t = s_t + v_t.

The vector s_t corresponds to the part of the activity that is related to the stimulation, and the rest of the activity v_t is usually assumed to be independent of the stimulus and the EP. Single-trial EPs can be further modeled as a linear combination of some preselected basis vectors. Then, the observation model takes the form

z_t = H θ_t + v_t,

where H is the M × k observation matrix, which contains the basis vectors of length M in its columns, and θ_t is a parameter vector of length k. The estimated EPs can then be obtained by using the estimated parameters as follows:

ŝ_t = H θ̂_t.

By treating both θ_t and v_t as random, the estimator θ̂_t that minimizes the mean square Bayes cost is given by the conditional mean E{θ_t | z_t} [20] of the posterior distribution

p(θ_t | z_t) ∝ p(z_t | θ_t) p(θ_t).

By taking into account the linear observation model, and that θ_t and v_t are assumed uncorrelated, that is, E{θ_t v_t^T} = 0, the linear conditional mean estimator takes the form [20]

θ̂_t = η_θ + C_θ H^T (H C_θ H^T + C_v)^(−1) (z_t − H η_θ),

where C_θ and η_θ are, respectively, the covariance and the mean of θ_t, C_v is the covariance of the zero-mean measurement noise v_t, and ^T denotes transpose. The estimator is optimal in the mean square sense among all possible estimators, not only linear ones, if θ_t and v_t are Gaussian. In Bayesian estimation this is also called the maximum a posteriori (MAP) estimator, and C_θ and η_θ represent prior information about the parameters θ_t. If they are not available, we can assume C_θ^(−1) → 0, corresponding to infinite prior variance for the parameters. In this case, the estimator reduces to the ordinary minimum variance Gauss-Markov estimator, which treats the parameters as nonrandom. If we further assume that the errors are independent with equal variances, that is, C_v = σ_v^2 I, the estimator is identical to the ordinary least squares estimator

θ̂_t = (H^T H)^(−1) H^T z_t.
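The two estimators above can be sketched in NumPy; the matrix sizes, prior values, and noise levels below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def linear_mmse(z, H, theta_mean, C_theta, C_v):
    # Linear conditional mean (Bayesian MMSE / MAP) estimator:
    # theta_hat = eta + C_theta H^T (H C_theta H^T + C_v)^(-1) (z - H eta)
    S = H @ C_theta @ H.T + C_v
    gain = C_theta @ H.T @ np.linalg.inv(S)
    return theta_mean + gain @ (z - H @ theta_mean)

def ols(z, H):
    # Ordinary least squares: theta_hat = (H^T H)^(-1) H^T z
    return np.linalg.lstsq(H, z, rcond=None)[0]
```

With a very wide prior (large C_theta) and white measurement noise, linear_mmse approaches the ordinary least squares solution, mirroring the limiting argument above.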
2.2. State-space modeling of EPs
Estimators of the above linear conditional mean form can be used to model time-varying characteristics of EPs, for example, in terms of amplitude and latency estimates of some characteristic peak of the signals. However, such estimators do not take into account situations in which some dynamical behavior is expected from stimulus to stimulus. A mathematically plausible way to incorporate prior information about time-varying phenomena into the estimation is given through state-space modeling.

The measurement vectors z_t can be considered as realizations of a stochastic vector process that depends on some unobserved parameters (state vector) θ_t through the observation model z_t = H θ_t + v_t. The parameters θ_t are the quantities that we are primarily interested in, and their form depends on the parametrization of the estimation problem. In order to model the time evolution of the hidden process θ_t, a linear first-order Markov model can be used, that is,

θ_{t+1} = F_t θ_t + w_t,

with some initial distribution for θ_1. The observation model and the state evolution model together form a linear state-space model, where F_t and H are preselected matrices. Other important assumptions for the model are:

(i) for every t, the observation noise vectors v_t as well as the state noise vectors w_t are mutually independent and also mutually independent of the initial state θ_1;
(ii) the vectors v_t and w_s are mutually independent for all t, s.

For the white noise sequences w_t and v_t, we can also assume E{w_t} = 0 and E{v_t} = 0 for every t, but the covariances C_{w,t} and C_{v,t} can still be time-varying.
2.3. Kalman filter and smoother algorithms
The Kalman filtering problem is related to the determination of the mean square estimator for the state θ_t given the observations Z_t = {z_1, ..., z_t}. This is equal to the conditional mean θ̂_{t|t} = E{θ_t | Z_t} that relates to the density [20]

p(θ_t | Z_t) ∝ p(z_t | θ_t) p(θ_t | Z_{t−1}),

where

p(θ_t | Z_{t−1}) = ∫ p(θ_t | θ_{t−1}) p(θ_{t−1} | Z_{t−1}) dθ_{t−1}.

The optimal linear mean square estimator can then be obtained recursively by restricting to a linear conditional mean, or by assuming θ_t and v_t to be Gaussian [20]. The recursive estimator can be written as

θ̂_{t|t} = θ̂_{t|t−1} + K_t (z_t − H θ̂_{t|t−1}),

where θ̂_{t|t−1} is the prediction of θ_t based on θ̂_{t−1|t−1}, and θ̂_{t|t} is the optimal MS estimate at time t. Clearly this is of the same form as the linear conditional mean estimator of Section 2.1, that is, the Bayesian MAP estimator using the last available estimate as prior information. After adding the initializations, the Kalman filter algorithm can be written as follows.

Initialization:

θ̂_{0|0} = η_{θ_0},  C_{0|0} = C_{θ_0}.

Prediction step:

θ̂_{t|t−1} = F_{t−1} θ̂_{t−1|t−1},
C_{t|t−1} = F_{t−1} C_{t−1|t−1} F_{t−1}^T + C_{w,t−1}.

Filtering step:

K_t = C_{t|t−1} H^T (H C_{t|t−1} H^T + C_{v,t})^(−1),
θ̂_{t|t} = θ̂_{t|t−1} + K_t (z_t − H θ̂_{t|t−1}),
C_{t|t} = (I − K_t H) C_{t|t−1},

for t = 1, ..., T. The matrix K_t is called the Kalman gain matrix.

If all the measurements Z_T = {z_1, ..., z_T} are available, then the fixed-interval smoothing problem can be considered, that is, the determination of θ̂_{t|T} = E{θ_t | Z_T}, which relates to the density [21]

p(θ_t | Z_T) = p(θ_t | Z_t) ∫ [ p(θ_{t+1} | θ_t) p(θ_{t+1} | Z_T) / p(θ_{t+1} | Z_t) ] dθ_{t+1}.

The last form suggests again a recursive estimation procedure for the determination of the conditional density. It is thus possible to compute filtered and prediction distributions in a forward (filtering) recursion, and then execute a backward recursion in which each smoothed distribution p(θ_t | Z_T) relies upon the quantities calculated in the forward run and the previous (in reverse time) smoothed distribution p(θ_{t+1} | Z_T). This property enables the formulation of the forward-backward method for the smoothing problem [20], which gives the smoother estimates as corrections of the filter estimates. So for the linear or Gaussian case the smoothing problem is completed through the backward recursion.

Smoothing:

A_t = C_{t|t} F_t^T C_{t+1|t}^(−1),
θ̂_{t|T} = θ̂_{t|t} + A_t (θ̂_{t+1|T} − θ̂_{t+1|t}),
C_{t|T} = C_{t|t} + A_t (C_{t+1|T} − C_{t+1|t}) A_t^T,

for t = T−1, ..., 1. For the initialization of the backward recursion the filter estimates at time T can be used, that is, θ̂_{T|T} and C_{T|T}.
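A minimal NumPy sketch of the filter and smoother recursions above, assuming for brevity a time-invariant observation matrix, state transition matrix, and noise covariances:

```python
import numpy as np

def kalman_filter(Z, H, F, C_w, C_v, theta0, C0):
    """Forward pass over the M x T data matrix Z (one column per trial).
    Returns filtered means/covariances and one-step predictions."""
    k = theta0.shape[0]
    T = Z.shape[1]
    th_f = np.zeros((T, k)); C_f = np.zeros((T, k, k))
    th_p = np.zeros((T, k)); C_p = np.zeros((T, k, k))
    th, C = theta0, C0
    for t in range(T):
        # prediction step
        th_pred = F @ th
        C_pred = F @ C @ F.T + C_w
        # filtering step (Kalman gain, measurement update)
        K = C_pred @ H.T @ np.linalg.inv(H @ C_pred @ H.T + C_v)
        th = th_pred + K @ (Z[:, t] - H @ th_pred)
        C = (np.eye(k) - K @ H) @ C_pred
        th_f[t], C_f[t] = th, C
        th_p[t], C_p[t] = th_pred, C_pred
    return th_f, C_f, th_p, C_p

def rts_smoother(th_f, C_f, th_p, C_p, F):
    """Backward fixed-interval pass, initialized with the last filter estimate."""
    T = th_f.shape[0]
    th_s = th_f.copy(); C_s = C_f.copy()
    for t in range(T - 2, -1, -1):
        A = C_f[t] @ F.T @ np.linalg.inv(C_p[t + 1])
        th_s[t] = th_f[t] + A @ (th_s[t + 1] - th_p[t + 1])
        C_s[t] = C_f[t] + A @ (C_s[t + 1] - C_p[t + 1]) @ A.T
    return th_s, C_s
```

On data simulated from the same model, the smoothed estimates typically show a lower mean square error than the filtered ones, which is the behavior reported in the results section.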
2.4. Signal and noise subspaces
Singular value decomposition (SVD) has many theoretical and practical applications in signal processing and identification problems [23]. In relatively high signal-to-noise ratio (SNR) conditions, SVD of a data matrix can divide the measurements into signal and noise subspaces. Alternatively, it can also be understood in terms of principal component regression (PCR) as a combined method for signal enhancement and optimal model dimension reduction [24]. The subspace method has been used to enhance stimulus phase-locked activity in different studies (e.g., [19]).

The available M × T data matrix Z = [z_1, z_2, ..., z_T], which has as columns the EEG sampled epochs relative to the stimulation, can be decomposed as

Z = U Σ V^T,

where U satisfies U^T U = U U^T = I, V satisfies V^T V = V V^T = I, and Σ is a pseudodiagonal M × T matrix with nonnegative diagonal elements (the singular values) such that σ_1 ≥ σ_2 ≥ ... ≥ 0. If M < T, then Σ has the form Σ = [Σ_1  O], where Σ_1 = diag(σ_1, ..., σ_M) and O is a zero matrix. If M ≥ T, then Σ has the form Σ = [Σ_1; O], where Σ_1 = diag(σ_1, ..., σ_T). Only r singular values are nonzero, where r = rank(Z).

For the additive noise model and relatively small noise the following decomposition can be considered:

Z = U_s Σ_s V_s^T + U_n Σ_n V_n^T.

The matrix Σ_s contains the p largest singular values, and U_s the respective left singular vectors associated mainly with the signals s_t. Thus the matrices U_s, Σ_s, V_s represent a signal subspace, and U_n, Σ_n, V_n represent primarily the noise subspace.

From the SVD of the matrix Z we also have

Z Z^T = U Σ Σ^T U^T.

This means that the left singular vectors of Z are the eigenvectors of the matrix Z Z^T, or equivalently the eigenvectors of the data correlation matrix.

If we denote with H_p the M × p matrix with columns the p dominant eigenvectors, then the ordinary least squares estimator for the parameters becomes

θ̂_t = (H_p^T H_p)^(−1) H_p^T z_t = H_p^T z_t.

Estimates for the EPs can then be obtained as ŝ_t = H_p θ̂_t. Quantitatively, the first basis vector is the best mean-square fit of a single waveform to the entire set of epochs. Thus, the first eigenvector is similar to the mean of the epochs, and the corresponding parameters or principal components reveal the contribution of the eigenvector to each epoch. The rest of the dominant eigenvectors model primarily amplitude differences between individual EP peak components, and latency variations from trial to trial. Therefore, since this basis contains prior information about phase-locked characteristics of the EP signals, we consider the following state-space model for dynamical estimation:

θ_{t+1} = θ_t + w_t,
z_t = H_p θ_t + v_t,

that is, the selection F_t = I (a random walk model), with C_{w,t} = σ_w^2 I and C_{v,t} = σ_v^2 I for all t. Estimates for the parameters can then be obtained by the Kalman filter and smoother algorithms for different selections of the state and observation noise covariance matrices. Thus, the applicability of the proposed method depends on the quality of the signal subspace in low signal-to-noise ratio conditions, as well as on the assumption of hidden dynamical behavior from trial to trial.
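The subspace construction can be sketched as follows; because the dominant left singular vectors are orthonormal, the least squares solution reduces to a simple projection (the dimensions in the usage below are illustrative):

```python
import numpy as np

def signal_subspace(Z, p):
    """Return the p dominant left singular vectors (M x p basis H_p)
    of the M x T ensemble data matrix Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :p]

def ols_coefficients(Z, Hp):
    # Since Hp has orthonormal columns, (Hp^T Hp)^(-1) Hp^T = Hp^T,
    # so the OLS parameters are plain projections onto the subspace.
    return Hp.T @ Z          # p x T single-trial parameter vectors
```

Single-trial EP estimates are then obtained by backprojection, Hp @ ols_coefficients(Z, Hp), which for exactly rank-p data reproduces the data matrix.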
2.5. Artifact correction by ICA
Individual EEG channels measure superimposed activity
generated simultaneously by various brain sources. The behavior of the sources
is stochastic and generally nonstationary. In addition, artifact sources, such
as eye blinks, can distort statistical properties of the signals and increase
complexity. In the problem of blind source separation (BSS) of multichannel EEG measurements, the target is to recover the unobserved, brain-generated source signals by using only the available sensor data and some statistical properties assumed for the sources [25, 26].

Fundamentally, the basic problem that BSS attempts to solve assumes a set of measured data points x(t) at time instant t to be a linear combination of unknown sources s(t), that is,

x(t) = A s(t).

For EEG measurements, the dimension of x(t) is the number of available channels, and the measurements can be summarized in a matrix X having the vectors x(t) in its columns and the different channel recordings in its rows. A time-invariant mixing matrix A is the common approach for ICA and BSS of EEG, for example, in event-related studies [3]. This model can be interpreted as the fixed biophysical structure of the brain itself, whilst the sources distributed within this structure change their intensity over time [25].

A general formulation for BSS without any assumptions
(prior information) about the nature of the data, noise, or mixing system will
leave the problem of EEG separation intractable. Therefore, some basic
assumptions are needed. For example, the goal of ICA is to recover independent
sources given only sensor observations that are unknown linear mixtures of
unobserved independent source signals [27, 28].

The assumption of physiological independence of the
sources can be quite obvious in some situations, for example, when used in
artifact rejection separating brain signals from ocular artifacts. Note that
the ICA model considers the signals as independent and identically distributed,
and requires non-Gaussian sources. Thus, by ignoring time structure, the
estimation is based solely on investigating structure across the sensors as
estimated by the sample distribution of the measurements, and an embedded density
parametrization (differentiating at least between sub-Gaussian and
super-Gaussian sources). Therefore, the model might not be able to separate every kind of source (e.g., stationary Gaussian random processes). However, in many situations predominant artifacts show a highly kurtotic sample distribution that enables their estimation.

ICA methods carry ambiguities about the ordering and
the overall amplitude and sign of the estimated sources. The rows of the data matrix X are the EEG channel recordings, and the matrix is decomposed as X = A S, where S has in its rows the independent components. The mixing matrix A contains the spatial information of the sources obtained at the sensors. Therefore, the columns of A are the spatial distributions of the estimated sources, which are normalized to unit variance. For example, eye movements and eye blinks project mainly to frontal sites. An artifact source can be eliminated and removed from the measurements by setting the corresponding row of S to zero before backprojection.
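Given an estimated mixing matrix A and source matrix S (e.g., from an ICA algorithm), backprojection with artifact components zeroed out can be sketched as follows; the component index used in the example is hypothetical:

```python
import numpy as np

def remove_components(X, A, S, artifact_idx):
    """Backproject all sources except the marked artifact components.
    X: channels x samples data, A: mixing matrix, S: sources x samples."""
    S_clean = S.copy()
    S_clean[artifact_idx, :] = 0.0   # zero out, e.g., the blink components
    return A @ S_clean               # artifact-corrected channel data
```

In an exact (noise-free) mixture, removing one source this way leaves precisely the contribution of the remaining sources at the sensors.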
3. RESULTS
In this section, we present the performance of the Kalman filter and smoother algorithms in tracking dynamic variations and estimating
single-trial EPs in a simulated and a real data set. In parallel, we
investigate the performance of the method when the signal subspace is enhanced
by rejecting eye-related artifacts with the use of ICA.
3.1. Measurements and artifact removal
EEG measurements were obtained from a standard oddball
paradigm with auditory stimulation (1 subject, 60 EEG channels, reference:
ears). In the recording, 569 auditory stimuli were presented with an
interstimulus interval of 1 second. Eighty-five percent of the stimuli were the
standard tones at 800 Hz. Fifteen percent were the deviant tones at 560 Hz. The
deviant tones were randomly presented. The subject was sitting in a chair, and
was asked to press a button every time he heard the deviant target tone. The
sampling rate of the measurements was 500 Hz.

Noise reduction for EEG signals can be performed with linear filtering without altering the basic ICA model [27]. If we further assume fewer sources than sensors and that the sensor noise is relatively small, then principal component analysis (PCA) of the data covariance matrix and dimension reduction can be used to reduce the noise and to prevent overlearning [27]. For the analysis, the data were digitally filtered to the range 1–35 Hz. The whole measurement set (about 10 minutes) was used for the estimation of the separating matrix. The dimension of the data was reduced with PCA to 31, by keeping eigenvectors associated with eigenvalues larger than 1, resulting in more than 99% of explained variance. The FastICA algorithm in parallel form [27] was used for the estimation of the independent components.

By visual inspection of the estimated components and
scalp activations, two components appeared to be related to eye activity. The blink components are presented in Figure 1: on the left, the time activations corresponding to the first minute of the recordings, and on the right, the spatial distributions. Furthermore, these components did not show any significant correlation with the two types of stimuli (standard and target). Correlation with stimulation time was investigated by computing EP image plots for every estimated component. The component-based EP image plots are not shown here, but such images are also used in the next section (Figures 3 and 5). EP image plots are constructed by color-coding potential variations occurring in single-trial epoch vectors (e.g., [3]). The thin color-coded horizontal bars, each representing a single trial, are, for example, stacked row by row according to data collection time (data epochs sampled relative to the successive stimulus or trial t), producing an EP image.
Figure 1
Blink-related components estimated with ICA. Time activations (left) and scalp activations
(right). The left plots correspond to the first minute of the measurement set.
Figure 3
Simulations resembling the N100/P200 auditory complex and obtained estimates. For background noise, prestimulus EEG samples relative to standard tones from channel CZ after ocular artifact removal were used. Left: simulations (Gaussian functions) and noisy simulations, single trials as image plots (top), and the respective 10 dominant eigenvectors of the data correlation matrix (bottom). The EP images represent stimulus-locked stacked epochs (row by row). The color maps describe the amplitude level in μV, the y-axis represents the successive stimulus or trial t, and the x-axis represents within-trial latency variation. Right: single-trial estimates as image plots with the Kalman filter and smoother (top) and estimated variability of the second positive peak (bottom). Simulated amplitude and latency trends (light bold), estimates based on the Kalman filter (dark thin) and on the fixed-interval Kalman smoother (dark bold). For estimation the selected value of the state noise variance parameter was used.
Figure 5
N100/P200 auditory complex, measurements from channel CZ. EEG epochs relative to the standard tones (0–500 milliseconds after auditory stimulation) and obtained estimates. Left: EEG epochs as image plots after and before blink correction (top) and the respective 10 dominant eigenvectors of the data correlation matrix (bottom). The EP images represent stimulus-locked stacked epochs (row by row). The color maps describe the amplitude level in μV, the y-axis represents the successive stimulus or trial t, and the x-axis represents within-trial latency variation. Right: single-trial estimates as image plots with the Kalman smoother (top), based on artifact-corrected measurements and original measurements respectively; amplitude and latency estimates of the P200 peak (bottom) based on original measurements (thin) and artifact-corrected measurements (bold). For estimation the selected value of the state noise variance parameter was used.
Note that PCA-based dimension reduction is a rather subjective approach for the determination of the number of brain source signals in EEG measurements [25]. Some relatively weak brain sources, as measured at the sensors, may be eliminated. Additionally, some estimated independent components may remain mixtures of more than one source signal. However, by computing different EP image plots we did not observe any significant loss of phase-locked EP activity. Furthermore, filtering and dimension reduction provided good estimates for the blink components and fast convergence of the FastICA algorithm. Therefore, the performance was considered satisfactory for ocular artifact removal, and for the demonstration needs of the proposed subspace method for dynamical estimation of single-channel single-trial EPs.
3.2. Single-trial estimation
Real EEG data were used as background EEG activity, or noise, in the simulations. From the recordings, we used only channel CZ, after preprocessing and artifact removal by ICA. Only ocular artifacts were considered. As background activity for the simulations, we sampled prestimulus EEG epochs ending at 0 milliseconds relative to the standard stimulus onset. Simulated EPs were constructed according to the additive noise model by superimposing, upon the selected real EEG epochs, linear combinations of two Gaussian-shaped functions. In order to be consistent with the real measurements (standard tones and N100/P200 complex), each pseudoreal EP vector has two Gaussian peaks: a negative one after 100 milliseconds and a positive one after 200 milliseconds. Trial-to-trial sinusoidal variations of the amplitude and latency of the second peak were generated. Random variations were also added to the amplitudes, latencies, and widths of both simulated peaks.

The estimated time-varying SNR with respect to the second peak only can be seen in Figure 2 as a function of the stimulus number t. Thus, the important assumption in the simulations is trend-like behavior in low signal-to-noise ratio conditions. By construction, the simulated EPs have trend-like trial-to-trial characteristics. This can be observed in Figure 3 (left) and the EP image plots. In the same figure (bottom left), the 10 dominant eigenvectors of the data correlation matrix obtained before and after EEG addition are also presented.
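A simulation of this kind can be sketched as follows; the amplitudes, latencies, widths, and trend ranges below are illustrative stand-ins for the exact values used in the study, and the real prestimulus EEG noise is omitted:

```python
import numpy as np

def gaussian_peak(t_ms, latency_ms, width_ms, amplitude):
    """Gaussian-shaped EP component on a millisecond time axis."""
    return amplitude * np.exp(-0.5 * ((t_ms - latency_ms) / width_ms) ** 2)

def simulate_trials(T=400, fs=500.0, dur_ms=500.0, rng=None):
    """Noise-free pseudoreal EPs: N100-like peak with random jitter,
    P200-like peak with slow sinusoidal amplitude/latency trends."""
    rng = np.random.default_rng(0) if rng is None else rng
    t_ms = np.arange(0.0, dur_ms, 1000.0 / fs)
    trials = np.zeros((T, t_ms.size))
    for i in range(T):
        # first (negative) peak: small random amplitude/latency variability
        a1 = -5.0 + 0.5 * rng.standard_normal()
        l1 = 100.0 + 5.0 * rng.standard_normal()
        # second (positive) peak: sinusoidal trends plus random jitter
        a2 = 7.0 + 5.0 * np.sin(2.0 * np.pi * i / T) + 0.5 * rng.standard_normal()
        l2 = 200.0 + 20.0 * np.sin(2.0 * np.pi * i / T + 1.0) + 2.0 * rng.standard_normal()
        trials[i] = (gaussian_peak(t_ms, l1, 20.0, a1)
                     + gaussian_peak(t_ms, l2, 30.0, a2))
    return t_ms, trials
```

Adding sampled real prestimulus EEG epochs to these trials then yields noisy pseudoreal measurements in the spirit of the additive noise model.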
Figure 2
Time-varying SNR (dB) for the simulated second peak as a function of the stimulus number (trial) t, that is, SNR_t = 10 log10( Σ_j s_t^2(j) / Σ_j v_t^2(j) ), where s_t are the simulated noise-free single-trial EPs and v_t the prestimulus EEG epochs sampled relative to the standard tone from channel CZ after ocular artifact removal with ICA. The sums were computed over a smaller interval around 200 milliseconds covering only the second peak (see also Figure 3).
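The trialwise SNR defined in the caption can be computed as, for example:

```python
import numpy as np

def snr_db(s, v):
    # SNR_t = 10*log10( sum_j s_t(j)^2 / sum_j v_t(j)^2 ), one value per trial;
    # s and v are (trials x window) arrays restricted to the peak interval
    return 10.0 * np.log10(np.sum(s ** 2, axis=1) / np.sum(v ** 2, axis=1))
```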
It must be noted that the aim in the creation of the simulations was that the average of the simulated EPs be close to the average of the real measurements (standard tones and N100/P200 complex at channel CZ). The average of the real measurements has a negative peak around 110 milliseconds (N100) and a positive peak (P200) around 230 milliseconds with an amplitude of about 5 μV. The simulations were then created as follows. For the first peak, random variability in a small range in amplitude and latency was simulated, giving an ensemble average with the required peak amplitude at the required latency. For the second peak, dynamic variability was created with a range of about 10 μV (2–12 μV) in amplitude and about 45 milliseconds in latency, such that the average has a peak amplitude of about 6 μV and a latency similar to the real measurements. Then prestimulus EEG was added. In that respect, SNR conditions were not directly considered; instead, a reasonable range for the time-varying behavior was assumed that can produce an average similar to the real measurements.

For estimation, the state-space model
described in Section 2.4 was selected. For the covariances we used C_{w,t} = σ_w^2 I and C_{v,t} = σ_v^2 I for every stimulus t. The selection of the observation noise variance term is then not essential, since only the ratio σ_w^2/σ_v^2 affects the estimates. The choice σ_v^2 = 1 can therefore be made, and care should be given to the selection of σ_w^2. In general, if it is tuned too small, fast fluctuations of the EPs are going to be lost; if it is selected too large, the estimates have too much variance and tend toward the ordinary least squares or principal component regression solution. The selection can be based on experience and visual inspection of the estimates, as a balance between preserving expected dynamic variability and greater noise reduction.

In order to identify an optimal value for the variance
term σ_w^2 for the simulations, we calculated root mean square errors (RMSEs) between the estimates based on the noisy data and the noiseless simulated EPs. The RMSEs were computed with respect to the second peak only, over a smaller time interval around 200 milliseconds. For initialization of the algorithms, we used half the data set by filtering backwards in time. The last estimates were used for initializing the forward run. Finally, the last state estimate of the Kalman filter forward run was used to initialize the backward smoothing procedure.

Means of RMSEs over all single trials for different
values of the state noise variance parameter and for different dimensions of the observation matrix are presented in Figure 4 as contour plots for the Kalman filter (top) and smoother (middle). In all cases, the Kalman smoother provides a smaller error than the filter. This is to be expected, since all the measurements are included in the estimation procedure. The reduction of the error during backward smoothing is due to greater noise cancellation, as well as better tracking of the dynamic fluctuations. Optimal values of σ_w^2 for all the selected observation matrices fall within a limited range (see Figure 4). By considering the contour plots and by inspection of the estimates, around 10 eigenvectors are enough for tracking the dynamic fluctuations. Single-trial estimates for that dimension (p = 10) and with the selected σ_w^2 are presented in Figure 3 as image plots for the Kalman filter and smoother. On the right (bottom) of the same figure, estimates for the single-trial latency and amplitude of the second peak are presented as a function of the stimulus number or trial t.
Figure 4
Means of RMSEs for different values of the state noise variance parameter and different numbers of dominant eigenvectors included in the observation model. Contour plots of the means for the Kalman filter (top) and smoother (middle). Means when 10 eigenvectors are included in the observation model (bottom). In all plots the axis of the variance parameter is on a logarithmic scale.
State-space representation and a few dominant
eigenvectors obtained from the ensemble data correlation matrix are able to
model the amplitude and latency changes. Bayesian recursive mean square
estimation is able to reveal the hidden dynamic variability under unfavorable
signal-to-noise ratio conditions. Clearly, Kalman smoother tracks better the
dynamic changes and reduces greater the noise.For the real measurements we considered epochs 0–500
milliseconds after the presentation of the standard tones from channel CZ
before and after eye artifact removal. For the two data sets we selected 10
eigenvectors of the data correlation matrix for estimation. The strong blink
contributions clearly affect the eigenvectors and the signal subspace,
especially after the first half of the measurements, see Figure 5. This can
also be seen by observing the first two eigenvectors that reflect mainly blink
artifacts. However, since the blinks occur sufficiently randomly from trial to trial, recursive mean square estimation largely reduces their contribution. This can be observed in Figure 5 from the estimates, which are obtained with the Kalman smoother with the same parameter choices for both data sets. The estimated dynamic variability of the second peak (P200) in terms of amplitudes and latencies is presented on the left (bottom) of the same figure.

Some representative individual single-trial estimates
are presented in Figure 6 for the simulations (left) and real EP measurements
(right). The estimates for the simulations and the real EP measurements
(standard tones and N100/P200 complex) are based on the artifact corrected EEG
and the Kalman smoother algorithm. The identification of peak potentials from raw measurements can be misleading even in simple simulations (e.g., stimulus number , left). The proposed method produced accurate estimates for the simulations even in very low SNR conditions (e.g., stimulus number , left). This is because we assumed a trend-like variability. The evaluation of the estimates for the real EPs is naturally more difficult. For example, clear N100 and P200 peaks are obtained for stimuli 50 and 250 (right), whereas the identification of the peaks is not trivial for stimulus 450 (right). However, it must be noted that the proposed method does not make assumptions about the number of peaks or their exact form. This
information is obtained from the estimated signal subspace and the included
eigenvectors.
Figure 6
Representative single-trial estimates based on Kalman smoother algorithm. Estimates for the
simulations (left) and for the real measurements (right) (standard tones and
N100/P200 complex after ocular artifact correction by ICA). Measurements and
noisy simulations (dark thin), noise-free simulations (light bold), and
estimates (dark bold).
In summary, the proposed approach for single-trial
dynamical estimation of EPs consists of the following steps. (1) Band-pass
filter the selected EEG channels. This improves the quality of the signal subspace; for example, it can reduce high-frequency components and therefore provide smoother eigenvectors and estimates. (2) Enhance the quality of the signal subspace. If the EEG
epochs contain strong artifact contributions, such as eye blinks, an artifact
correction method can be applied, for example, ICA. (3) Estimate the data correlation
matrix and compute eigenvectors. In the simplest case, a basic artifact
correction method based on thresholding of potential values and excluding very
noisy single-trial epochs can be applied prior to the computation of the
correlation matrix. (4) Select a few dominant eigenvectors to form the
observation model for estimation. The estimated signal subspace must be able to
model latency changes for the different phase-locked EP components. (5) Estimate the EP
characteristics with the Kalman smoother algorithm. The smoothing parameter can be
selected by visual inspection of the estimates (EP image plots), and by
considering the expected trial-to-trial variability of individual peaks.
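Steps (3) and (4) of this pipeline can be sketched as follows; the simulated epoch ensemble, the rejection threshold, and the subspace dimension are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical epoch ensemble: two "EP peaks" (N100-like and P200-like)
# with small latency jitter plus background noise; all values illustrative.
T, N = 300, 100
t = np.linspace(0, 0.5, N)                        # 0-500 ms epoch
epochs = np.empty((T, N))
for i in range(T):
    jit = rng.normal(0, 0.005)                    # latency jitter (s)
    epochs[i] = (-np.exp(-((t - 0.10 - jit) / 0.02) ** 2)   # "N100"
                 + np.exp(-((t - 0.20 - jit) / 0.03) ** 2)  # "P200"
                 + rng.normal(0, 0.8, N))                   # EEG noise

# Step (3): basic artifact rejection by amplitude thresholding, then the
# data correlation matrix and its eigendecomposition.
keep = np.max(np.abs(epochs), axis=1) < 4.0
D = epochs[keep]
Rhat = D.T @ D / len(D)
evals, evecs = np.linalg.eigh(Rhat)               # ascending order
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Step (4): a few dominant eigenvectors form the observation matrix
# for the state-space model used in step (5).
k = 10
Hk = evecs[:, :k]
frac = evals[:k].sum() / evals.sum()
print(f"kept {keep.sum()}/{T} epochs, top-{k} energy fraction: {frac:.2f}")
```

The dominant eigenvectors concentrate the phase-locked signal energy, while the remaining directions carry mostly background-EEG noise, which is why a small subspace dimension suffices.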
4. DISCUSSION AND CONCLUSION
We presented a new dynamical estimation method for
single-trial EP estimation based on a state-space representation for the
trial-to-trial evolution of EP characteristics. The method uses the eigenvalue
decomposition of the data correlation matrix for the identification of the
state-space model. This is an extension of the method presented in [17], where a generic observation model was used. A few dominant eigenvectors obtained from the ensemble measurements incorporate prior
information about shape characteristics and within trials correlations of
individual EP peaks. This approach also takes individual subject characteristics into account in the estimation. Therefore, the method is applicable to different types of EP experiments as long as dynamical behavior from
trial-to-trial could be expected. For a Gaussian basis selection as in [17], one has to select the number of basis vectors and their width. This is not always trivial, since a given wave shape may behave differently for each individual peak. Therefore, a benefit of the SVD is the rather straightforward selection of an observation model that can take into account shape information about the different peaks and individual subject characteristics. However, for very weak EPs a generic observation model may have better performance.

Estimates for the state parameters are obtained with the
Kalman filter and fixed-interval smoother algorithms. Both share the optimality
of Bayesian recursive mean square estimation. The fixed-interval smoothing
method estimates the hidden dynamic changes better and achieves greater noise reduction. Therefore, it should be preferred when all the measurements are
available. The same behavior can be shown when other observation models are
considered, for example, generic basis vectors as in [17]. Therefore, the present paper introduces the use of the Kalman smoother algorithm for the dynamical estimation of EPs. The use of the filter is appropriate for online estimation. However, a compromise between better tracking capability and nearly online estimation can be sought with fixed-lag smoothing methods [29].

For the demonstration of the methods we used
measurements from an auditory experiment (oddball paradigm). Since the aim was
to investigate the performance of the methods when strong artifacts exist, we
only considered the standard tone measurements and not the deviant and the P300
target response. For this data set the blink artifacts were more prominent for
the standard tones. In addition, the estimates of latency and amplitude of the
P200 peak (slower and smaller responses towards the end of the measurements)
simply show that even in ordinary experiments some dynamic behavior from stimulus to stimulus could be expected. However, the method should be directed to the study of more specific experimental settings. The investigation of latency or
amplitude estimates could, for example, be used to study possible habituation
effects due to repetition of stimuli, or to study cognitive changes due to
time-varying task difficulty or extra distraction. Latency or amplitude changes
of peak potentials can also be used to track changes caused by sedative drugs
during anesthesia.

EP measurements are usually made with multiple
electrodes providing spatial information for the experiment. This information
can be used at least to remove artifacts from the signals. We showed by means
of ICA that even when the signal subspace is distorted by characteristic artifacts, the method is still able to track changes in EP peak components. This
is because in the filtering or smoothing procedure phenomena uncorrelated from
trial to trial are largely eliminated. In fact, this is exactly the main
advantage of dynamical estimation for single-trial EP analysis. However,
accurate artifact removal or further elimination of undesirable brain-generated components can improve the quality of the signal subspace and the individual
channel measurements. Extensions to multichannel measurements could be pursued by applying the method to each channel separately. Then the variable signal-to-noise ratio conditions from channel to channel should be considered. Another approach could be to introduce spatial information directly into the state-space model. Such multichannel extensions could be investigated for
further development of the method. Finally, the signal subspace method can be
extended to multichannel measurements. Then it could, for example, be combined
with BSS methods.