Alva Appelgren1, William Penny2, Sara L Bengtsson1
1. Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
2. Wellcome Trust Center for Neuroimaging, University College London, UK
Abstract
We investigated if certain phases of performance monitoring show differential sensitivity to external feedback and thus rely on distinct mechanisms. The phases of interest were: the error phase (FE), the phase of the correct response after errors (FEC), and the phase of correct responses following corrects (FCC). We tested accuracy and reaction time (RT) on 12 conditions of a continuous-choice-response task; the 2-back task. External feedback was either presented or not in FE and FEC, and delivered on 0%, 20%, or 100% of FCC trials. The FCC₂₀ was matched to FE and FEC in the number of sounds received so that we could investigate when external feedback was most valuable to the participants. We found that external feedback led to a reduction in accuracy when presented on all the correct responses. Moreover, RT was significantly reduced for FCC₁₀₀, which in turn correlated with the accuracy reduction. Interestingly, the correct response after an error was particularly sensitive to external feedback since accuracy was reduced when external feedback was presented during this phase but not for FCC₂₀. Notably, error-monitoring was not influenced by feedback-type. The results are in line with models suggesting that the internal error-monitoring system is sufficient in cognitively demanding tasks where performance is ∼ 80%, as well as theories stipulating that external feedback directs attention away from the task. Our data highlight the first correct response after an error as particularly sensitive to external feedback, suggesting that important consolidation of response strategy takes place here.
Keywords:
error-monitoring; external feedback; information theory; internal feedback; working memory
Error-monitoring is thought to be of
particular importance for successful performance, since error signals directly call
for adjustment of actions (Botvinick, Braver,
Barch, Carter, & Cohen, 2001; Holroyd, Yeung, Coles, & Cohen,
2005; Ridderinkhof,
Ullsperger, Crone, & Nieuwenhuis, 2004). One early
observation that has been made in support of this claim is that whereas RTs on most
correct responses in a learned continuous choice task are fast, a characteristic of
error-monitoring is a post-error slowing in RTs (Danielmeier, Eichele, Forstmann, Tittgemeyer, & Ullsperger,
2011; King, Korb,
Von Cramon, & Ullsperger, 2010; Rabbitt, 1969). Rabbitt (1969) suggested that the
slowing of responses immediately after errors is due to the validation of an error,
and thus transient changes in response strategy to minimize the possibility of
further errors. This proposal is supported by empirical findings that post-error
slowing lowers the probability of committing a subsequent error in the post-error
trial (Danielmeier et al.,
2011; Rabbitt,
1969; Rabbitt &
Rodgers, 1977). The conflict monitoring model by
Botvinick et al.
(2001) specifies that the Anterior Cingulate Cortex (ACC)
plays a central role in error detection, serving as a learning signal that increases
the threshold for executing the subsequent response. ACC has been found to register
errors both when they are detected by the individual and when external
error-feedback is provided and is thus regarded as a general error-monitoring module
(Holroyd et al.,
2004; Ullsperger,
Nittono, & von Cramon, 2007).

However, post-error slowing does not
always lead to improved performance (Hajcak, McDonald, & Simons, 2003). Notebaert et al. (2009) have proposed
an alternative account for the post-error slowing where the slowing is caused by the
error being a rare outcome and therefore grasping attention. Thus, it may take
attentional resources from the task, which may result in reduced performance
(Huettel & McCarthy,
2004). They found that when correct responses outnumbered
error responses, post-error slowing was observed, whereas when the majority of the
trials were incorrect post-correct slowing was observed (Notebaert et al., 2009). Regardless of
whether external error-feedback was present or not, they found the same pattern of
prolonged post-error RT when errors were rare outcomes and the absence of post-error
slowing when error frequency reached 50%, which made them argue that the
internal error-monitoring system is more important than the external. The accuracy
levels were fixed and therefore the impact of feedback on accuracy was not
investigated (Houtman, Castellar,
Notebaert, & Nu, 2012).

External feedback on trial outcomes
informs us on task success. It has been argued that we use this feedback to confirm,
restructure, or tune information so that behavior meets the task goals (Hattie &
Timperley, 2007). Feedback signals are designed to minimize the risk that a
participant would miss the outcome and as such the feedback may grasp attention. It
is, however, unclear whether this is beneficial for performance or if it directs
attention away from the task. A meta-analysis on feedback interventions showed that
one third of the studies reported reduced performance upon external feedback
(Kluger & DeNisi,
1996). No consistent conclusion could be drawn as to whether
feedback played a different role dependent on the type of task, for example,
vigilance tasks or problem-solving tasks. The main factors contributing to the
impact of explicit feedback on performance were if outcome was measured on a
trial-to-trial basis or after a time of consolidation (Goodman, 1998; Schmidt, Young, Swinnen, & Shapiro, 1989), if
outcome was measured in terms of the intention of the participants to invest effort
(motivation) (Van-Dijk & Kluger,
2004), or if feedback was given on errors or corrects
(Wade, 1974).
Goodman (1998) showed
that detailed task-feedback when solving a puzzle helped the participants to perform
better, but the absence of explicit feedback had beneficial learning effects in the
long run, that is, to solve a later puzzle. A similar pattern of results was
observed in a study by Schmidt et al.
(1989) where the frequency of feedback was manipulated and
they observed that error rate increased when feedback was delivered after every
trial, compared to when feedback was delivered after every 15th trial. They
concluded that feedback after every trial may eliminate the participant’s
internal evaluation process. Van-Dijk and
Kluger (2004) demonstrated that the participants’
intention to invest effort was influenced by whether they preferred positive or
negative feedback. Wade
(1974) used a letter matching task and asked participants to
confirm with a button press that they had understood the task-feedback after each
trial. They either confirmed the feedback for errors, for corrects, for both the
errors and corrects or neither. Selective feedback on correct responses or on the
error responses led to the best performance results. Even though results suggest
that external error-feedback has limited impact (Holroyd et al., 2004; Houtman et al., 2012), it may still be
argued that we process error-feedback as more valuable than feedback on correct
responses when errors are rare outcomes, as would be predicted from an information
theoretic perspective (Shannon &
Weaver, 1963). For example, if an individual makes
20% errors on a continuous performance choice task, providing external
feedback on the error trials would give them more information than if external
feedback was given on 20% of the correct responses. This argument is
outlined in more detail in the Information Theory section. We can compute the Mutual
Information (MI) between feedback and outcome, which quantifies how informative the
external feedback is about the outcome. It has been shown that external
error-feedback is processed in different neural circuits than external feedback on
correct responses (Ullsperger & von
Cramon, 2003). These results illustrate that feedback-type,
that is, feedback on errors versus feedback on correct responses, may matter for performance.

An interesting observation is that
among the correct responses, the first correct response after an error seems to
differ from other correct responses, where the correct response following an error
gives rise to more activity in, for example, right dorsolateral prefrontal cortex
(Kerns et al.,
2004; King et al.,
2010; Marco-Pallarés, Camara, Münte, & Rodríguez-Fornells,
2008). Although less explored than the post-error slowing,
there are reports of the first correct response after an error also slowing RT
(Laming, 1979;
Marco-Pallarés et al.,
2008; Rabbitt,
1969). This slowing could reflect that the individual
responds more cautiously after a recent error in order to guard against
further errors (Laming, 1968), or that the individual changes strategy upon
recognizing the mistake (Rabbitt, 1969). The impact of external feedback has not been
evaluated for this phase in particular.

In the present study we investigate whether
three phases of performance monitoring, the error phase, the phase of the correct
response after an error, and the phase of corrects following correct responses, are
differentially influenced by external feedback and whether the external feedback is
beneficial for performance or not. We measured accuracy and RTs on a 2-back task for
letters. The 2-back task is a continuous performance task where each trial is
dependent on other trials, and as such it measures a person’s sustained and
selective attention. This is useful when investigating interaction effects of
feedback between the phases. Interactions, that is, how feedback in one phase may be
influenced by feedback on previous trials, require that there is a sequential
dependence between trials. This is seen for tasks such as the
n-back task, but not for tasks where each trial is preceded by
separate rules. In the present study it was important to use a task that was
moderately difficult, since we are investigating error processing. The accuracy
level of the n-back task can easily be manipulated by varying
n. Additionally, by comparing experimental conditions with the
same number of feedback events (sounds), but varying in the amount of information
feedback conveys about outcome (the mutual information), we can test if information
content has an effect on performance.

Because the above studies suggest that
the three phases rely on different processes, we hypothesize that external feedback
is processed differently for errors, correct after error, and corrects following
corrects. Whereas we do not predict that external error-feedback will alter
performance when compared to no external feedback on errors, we do hypothesize that
error-feedback will be more informative than feedback on correct responses.
Method
Participants
Sixty-three neurologically
healthy, right-handed participants took part in this study (age range
18–40 years, mean age ± SD: 26.8 ± 5.1, 43
females). Three participants were excluded before the data analysis because they
did not complete the task. Participants were recruited from the Stockholm area
and they all gave written informed consent prior to participating in the study.
The study was approved by the ethics committee in Stockholm, Sweden (Dnr No.
2010/1546-31/1).
Experimental Procedure
The experimental task was
performed on a PC (Latitude E5510, DELL Inc., Texas, US) with a screen
resolution of 1366 × 768. We used Cogent (UCL, London, UK) for sequence
presentation and data collection. Prior to data collection we conducted a pilot
study where n was either 1, 2, 3, or 4 and found that
n = 2 yielded an accuracy level of ≈ 80%. In
this pilot study eight participants performed a sequence of 60 letters for each
n. Accuracy was: n = 1 (84.0%
± 14.3), n = 2 (78.3% ± 21.5),
n = 3 (63.4% ± 27.0), n = 4
(56.5% ± 25.4).

The 60 participants in the present
study were seated in a quiet testing room and were tested on the 2-back task for
letters (Figure 1), a task widely used to test the ability
to maintain information across a delay (Cohen, MacWhinney, Flatt, & Provost, 1993). We used
a sequence of 200 letters per condition. White letters (10 mm in height) were
presented centrally on a black computer screen, one letter at a time. Each
letter was presented for 230 ms with an interstimulus interval (ISI) fixed to
1,400 ms. If the letter they saw also appeared two letters back the participant
made a “yes” response, otherwise they made a “no”
response. The “yes” response consisted of pressing the button
corresponding to the right index finger, while a “no” response was
made by pressing the button corresponding to their right middle finger, on the
computer keyboard. The same letter, whether written as a capital or a
lowercase letter, was regarded as a match. Both capital and lowercase letters were
used in the sequences to reduce the possibility that participants solely relied
on visual memory. A sequence had 30% hits (“yes”
responses).
Figure 1
The 2-back task. A sequence of letters is presented on a computer screen one letter
at a time. Participants are asked to make a response for each presented letter: A
“yes” response on the computer keyboard if the letter also appeared two
letters back, or a “no” response if it did not.
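The response rule of the 2-back task can be illustrated with a minimal sketch. It assumes letters are compared case-insensitively, as described above; the function names `two_back_targets` and `score` and the example sequence are hypothetical and not taken from the original Cogent implementation.

```python
def two_back_targets(letters):
    """Return the expected response for each letter from the third one onward.

    A letter is a match ("yes") if the same letter, ignoring case,
    appeared two positions earlier; otherwise the expected response is "no".
    """
    expected = []
    for i in range(2, len(letters)):
        is_match = letters[i].lower() == letters[i - 2].lower()
        expected.append("yes" if is_match else "no")
    return expected


def score(letters, responses):
    """Compare given responses (for letters 3..N) with the expected ones."""
    expected = two_back_targets(letters)
    return [given == wanted for given, wanted in zip(responses, expected)]


# Hypothetical example sequence mixing capital and lowercase letters.
sequence = ["b", "K", "B", "k", "r", "K"]
print(two_back_targets(sequence))  # ['yes', 'yes', 'no', 'yes']
```

The first two letters have no expected response, which is why scoring starts at the third letter.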
In order to study the influence of
external feedback on the performance monitoring system, either an auditory
signal delivered through headphones, or no sound, followed immediately after
each key response. Two different sounds were used as external feedback; a 74 Hz
beep (55 ms) indicating an error and a 740 Hz beep (55 ms) indicating a correct
answer. The participants were not instructed to correct their errors.

We compared external and no
external feedback on errors and correct responses, where the correct responses
were divided into corrects after errors, and corrects following corrects. This
enables us to study if the correct responses differ in their processing
depending on the outcome of the preceding trial. This gives us three factors:
the error phase (FE), the phase of corrects after errors (FEC), and the phase of
corrects following corrects (FCC). Each of the factors had two or three levels
of feedback. The error phase had two levels of feedback; either external
feedback on all errors (FE100) or no external feedback
(FE0). The phase “corrects after errors” had two
levels of feedback; either external (FEC100) or no external feedback
(FEC0). The phase “corrects following corrects” had
three levels of feedback; external feedback on 100% of the correct
responses (FCC100), external feedback on 20% of the correct
responses randomly distributed (FCC20), or no external feedback
(FCC0). We included three levels of feedback on FCC
because we wanted to compare external feedback with internal feedback
(100% sound vs. 0% sound), as well as to investigate a parametric
modulation of the amount of external feedback on performance, and thirdly, to
test the information theory hypothesis suggested in the Introduction and
Information Theory sections. Testing this hypothesis required that we introduce
sequences with feedback on 20% of the correct following correct responses
(FCC20), since this would roughly correspond to the percentage of
errors made. We cannot know beforehand how many errors the participants will
make, so an exact correspondence in the amount of sound between the two
sequences was not possible. In total, the study comprised twelve 2-back
conditions, each consisting of a 200-letter 2-back sequence. These
conditions formed a 2 × 2 × 3 factorial design (Figure 2).
Figure 2
The 2 × 2 × 3 factorial design. We focused our analysis on three phases
of performance monitoring; the error phase (FE), the phase of the correct response after
an error (FEC), and the phase of the corrects following correct responses (FCC). We
manipulate the performance monitoring by delivering external feedback (sounds), or no
external feedback, on FE and FEC, while on FCC trials we either provide external
feedback on 100%, 20%, or none of the trials. This results in 12
conditions of the 2-back task with different combinations of feedback. We denote the
experimental conditions in the order [FE; FEC; FCC].
The three phases of interest are
denoted as follows: FE, feedback on errors; FEC, feedback on the correct response after an
error; and FCC, feedback on correct responses following corrects. When
describing our 12 different feedback conditions we use the order error, correct
after errors, correct following corrects [FE; FEC; FCC]. We denote external
feedback (sound) as 1 and no external feedback (silence) as 0 for the phases FE
and FEC. For FCC, 0 corresponds to no external feedback (silence), 1 corresponds
to external feedback on 20% of the trials, and 2 corresponds to external
feedback on all of the corrects following corrects (Figure 2). For example, [101] denotes a 2-back
sequence where external feedback was received on error trials as well as on
20% of the FCC trials, and [002] denotes a 2-back sequence where no
external feedback is given on errors, nor the subsequent correct response, but
external feedback is given on all corrects following corrects.

For each condition, an instruction describing
the feedback characteristics was presented on the computer screen for 1,000 ms.
This was followed by a sequence of 100 letters. Each feedback condition was
presented twice, so in total 200 letters were presented for each condition for
each participant, apart from sequences [000] and [011] where only 96% of
the letters were presented due to technical failure. There were four types of
2-back sequences of letters that were randomized between conditions. The design
was mixed: each participant performed on average 3.5 ± 1.5
conditions. The order of conditions between participants was pseudorandomized,
and the subject effect was taken into account in the statistical analysis.

Prior to data collection, the
participants practiced each of the sequences they were to perform, for 25
letters per condition, and were at the same time becoming familiar with the two
sounds representing errors and corrects respectively. They were verbally
instructed on the task rules with the aid of a cartoon. They were carefully
instructed on the characteristics of each sequence and its corresponding
computer instruction label.
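The feedback rule implied by the condition codes [FE; FEC; FCC] can be sketched as follows. This is an illustration under stated assumptions rather than the original experiment code: the function name `feedback_sound` is hypothetical, and FCC20 is modelled as an independent 20% draw on each eligible trial, whereas the text only states that the feedback was randomly distributed.

```python
import random

# Probability of a sound on "correct following correct" trials per FCC level.
FCC_PROB = {0: 0.0, 1: 0.2, 2: 1.0}


def feedback_sound(condition, correct, prev_correct, rng=random):
    """Decide which sound, if any, follows a response.

    condition    -- three-digit code such as "101" in the order [FE; FEC; FCC]
    correct      -- whether the current response was correct
    prev_correct -- whether the previous response was correct
    Returns "error" (74 Hz beep), "correct" (740 Hz beep), or None (silence).
    """
    fe, fec, fcc = (int(digit) for digit in condition)
    if not correct:                      # error phase (FE)
        return "error" if fe == 1 else None
    if not prev_correct:                 # correct after an error (FEC)
        return "correct" if fec == 1 else None
    # Correct following a correct (FCC); assumption: independent draw per trial.
    return "correct" if rng.random() < FCC_PROB[fcc] else None


print(feedback_sound("101", correct=False, prev_correct=True))  # error
print(feedback_sound("002", correct=True, prev_correct=True))   # correct
```

How the very first scored trial of a sequence was classified (it has no preceding outcome) is not stated in the text, so the caller has to choose a value for `prev_correct` there.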
Statistical Analysis
We measured percent correct
responses (accuracy) and RT as dependent variables (Table 1). Prior to data analysis, we excluded
nonresponse trials and removed the first two trials of each 100-letter sequence
because of the nature of the 2-back task, that is, only from the third letter
presented can a response be a match or a mismatch. When computing RT, we
excluded error-trials that were followed by another error trial. When computing
the RTs we extracted the time between the stimulus presentation and key press.
Accuracy was computed on all trials included in the analysis. In total 31,103
trials were entered into the analysis. On average 173.4 ± 10.0
trials/condition/participant were entered into the analysis.
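The exclusion rules above can be sketched for a single 100-letter sequence. The data layout (a list of dicts with `correct` and `rt` keys, with `correct` set to `None` for a nonresponse) and the function name `summarize_sequence` are assumptions made for illustration. The sketch also assumes that "followed by another error" is evaluated after nonresponse trials have been removed, which the text does not specify.

```python
def summarize_sequence(trials):
    """Apply the exclusion rules to one sequence and return summary values.

    trials -- list of dicts: {"correct": True/False/None, "rt": milliseconds}
              where None marks a nonresponse (hypothetical layout).
    """
    # Drop the first two trials (no 2-back comparison possible) and nonresponses.
    kept = [t for t in trials[2:] if t["correct"] is not None]

    accuracy = 100.0 * sum(t["correct"] for t in kept) / len(kept)

    # For RT, exclude error trials that are followed by another error.
    rt_values = []
    double_errors = 0
    for i, trial in enumerate(kept):
        next_is_error = i + 1 < len(kept) and not kept[i + 1]["correct"]
        if not trial["correct"] and next_is_error:
            double_errors += 1
            continue
        rt_values.append(trial["rt"])

    mean_rt = sum(rt_values) / len(rt_values)
    return {"accuracy": accuracy, "mean_rt": mean_rt, "double_errors": double_errors}


example = [
    {"correct": None, "rt": None}, {"correct": None, "rt": None},  # first two letters
    {"correct": True, "rt": 500}, {"correct": False, "rt": 600},
    {"correct": False, "rt": 700}, {"correct": True, "rt": 650},
    {"correct": None, "rt": None}, {"correct": True, "rt": 550},
]
print(summarize_sequence(example))
```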
Table 1
Descriptive statistics for accuracy, RT and double errors are shown for each of the
12 conditions
Conditions [FE;FEC;FCC]   Accuracy (%) ± SEM   RT (ms) ± SEM   Double errors ± SEM
[000]                     87.4 ± 2.2           580.0 ± 38.7    1.4 ± 0.8
[001]                     88.1 ± 0.8           558.0 ± 12.3    2.5 ± 0.3
[002]                     85.2 ± 0.8           545.2 ± 13.7    4.9 ± 1.0
[010]                     81.2 ± 0.9           561.2 ± 6.9     8.1 ± 1.1
[011]                     85.2 ± 0.7           578.1 ± 9.3     3.8 ± 0.8
[012]                     87.2 ± 0.8           539.6 ± 6.7     4.3 ± 0.8
[100]                     88.9 ± 1.3           570.9 ± 18.5    1.0 ± 0.3
[101]                     86.8 ± 0.6           552.5 ± 7.8     3.4 ± 0.9
[102]                     86.0 ± 1.5           566.7 ± 14.7    4.4 ± 1.3
[110]                     85.0 ± 0.7           575.3 ± 8.7     3.0 ± 1.2
[111]                     85.0 ± 1.2           581.4 ± 15.4    3.9 ± 1.2
[112]                     83.9 ± 0.8           555.6 ± 9.4     4.6 ± 0.8
We performed a 3-way ANOVA based
on summary statistics for each subject and feedback combination
(df = 164) using Matlab (r2010a, The Math Works, Natick,
MA) and the spm_ancova function from the SPM software library
(Friston, Ashburner, Kiebel, Nichols,
& Penny, 2007) compatible with Matlab, to make a
between-subjects design after correcting for subject effects. We investigated
the main effect of FE, FEC, and FCC for accuracy and RTs using the 12 different
conditions, as well as the interaction effects among them. The main effects show
us the average effect of a factor when this factor is “high”
versus “low,” that is, to compute the main effects (RT and
accuracy) of external feedback on errors we subtract the average response of all
experimental runs for which FE was low (no external feedback, conditions [000]
[001] [002] [010] [011] [012]) from the average responses of all experimental
runs for which FE was high (external feedback on errors [100] [101] [102] [110]
[111] [112]).

We then counted the committed
double-errors for each participant and condition. The number of double-errors is
sometimes used to study how readily participants are monitoring and adjusting
their errors (Hajcak & Simons,
2008; Houtman
et al., 2012; Notebaert et al., 2009). This measure will give us an
indication of: (i) if in the absence of external error-feedback the participants
make more double-errors because they monitor their error less readily or (ii) if
when external error-feedback is present, the feedback disturbs the
participants’ internal error-monitoring hence resulting in more
double-errors. We compared the number of double-errors between conditions where
external error-feedback was presented with those without external error-feedback
using a one-way ANOVA. We also correlated double-errors with performance using
Pearson’s correlations (SPSS Statistics 17.0, Chicago, IL) for each
condition to study possible individual differences in response to external
feedback.

Additionally, for each subject and
sequence, we compute the Mutual Information (MI) between feedback and outcome.
Details of this computation are provided in Information Theory section. MI
quantifies how informative the external feedback is about the outcome. For each
sequence, we then regress subject RTs onto subject MIs to see
if, over the group, more informative feedback significantly increases or
decreases RT. Here we could compare MI for the sequences [100] and [001] to test
our information theoretic hypothesis (see Introduction). We also compare their
accuracy levels with a two-sided Student’s t-test.

We supplemented our hypothesis
testing concerning “feedback on errors” with Bayesian statistics
in order to quantify how much evidence there is in favor of the null hypothesis.
This approach is now becoming widely adopted in experimental psychology
(Dienes, 2011).
Our analysis was based on mean-corrected average accuracy and average RT for
each condition. We used a custom written Matlab script for Bayesian ANOVAs where
computations were based on Equation 1 in Wetzels and Wagenmakers (2012). The output of
this analysis is a Bayes Factor which quantifies the strength of evidence for
the alternative versus the null hypotheses, with values larger than 1 favoring
the alternative and less than 1 favoring the null. These values are grouped in
ranges (Jeffreys,
1961) quantifying “weak” (1/3–1),
“substantial” (1/10–1/3), and “strong”
(1/30–1/10) evidence for the null. The equivalent natural-log Bayes Factors are
−1.1 to 0 for weak, −2.3 to −1.1 for substantial, and
−3.4 to −2.3 for strong.
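The mapping from a log Bayes Factor to these evidence categories can be sketched as below. The function name `evidence_for_null` is hypothetical, the boundaries are computed as natural logarithms of the Jeffreys ranges quoted above, and the labels returned outside those ranges are placeholders rather than categories taken from the text.

```python
import math

# Natural-log boundaries of the Jeffreys ranges for evidence favoring the null.
LOG_WEAK = math.log(1 / 3)      # about -1.10
LOG_SUBSTANTIAL = math.log(1 / 10)  # about -2.30
LOG_STRONG = math.log(1 / 30)   # about -3.40


def evidence_for_null(log_bf):
    """Label a natural-log Bayes Factor (alternative versus null)."""
    if log_bf > 0:
        return "favors alternative"
    if log_bf >= LOG_WEAK:
        return "weak"
    if log_bf >= LOG_SUBSTANTIAL:
        return "substantial"
    if log_bf >= LOG_STRONG:
        return "strong"
    return "beyond strong"  # placeholder label, not defined in the text


print(evidence_for_null(-2.55))  # strong
print(evidence_for_null(-2.23))  # substantial
```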
Information Theory
If there is an outcome
o = {c,e} where
c is correct and e is error with probabilities
pc and pe with pc = 1
− pe then Shannon defines the “surprise”
of an outcome as measuring the improbability of that event (Shannon & Weaver, 1963).
Mathematically, surprise is defined as log2(1/p)
where p is the probability of an event and use of base-2
logarithms means that surprise is measured in bits. Thus the surprise associated
with an error is log2(1/pe) and with a correct is
log2(1/pc). The information content of a
variable, also known as the entropy, is then the average surprise. The entropy
of the outcome is H(o) = pe
× log2(1/pe) + pc
× log2(1/pc). The entropy measures the
information content of a variable, in bits. The more uncertain we are about the
value of a variable the greater the information conveyed when it is observed.
For pe = 0.2, we have H(o) =
0.72 bits. Note that H(o) would reach a
maximal possible value of one bit if pe = 0.5.

If there is feedback
(f) in the form of a sound f =
{s,n} where s is sound
and n is no sound with probabilities ps and
pn, with pn = 1 −
ps, then the entropy of the feedback is
H(f) = ps ×
log2(1/ps) + pn ×
log2(1/pn).

Importantly, we can also quantify
the information one variable contains about another. This is given by the mutual
information (MI). For example, the mutual information between feedback and
outcome is the reduction in uncertainty about outcome after experiencing
feedback. Mathematically this is given by the uncertainty in the outcome,
H(o), minus the uncertainty in the outcome
after having received feedback, H(o|f). That
is, MI = H(o) −
H(o|f). The mutual
information is a non-negative quantity.

Calculating the mutual information
of our two fictive sequences (see next section for details of this calculation)
gives the following result: Sequence 1 (20% errors, auditory feedback on
all errors) gives MI = 0.722; Note that this is the same as
H(o) because there is no uncertainty in
the outcome after feedback (i.e.,
H(o|f) is zero). This is
because feedback is always provided after an error so, upon hearing a sound we
can be sure we made an error. Sequence 2 (20% errors, no feedback on
errors, auditory feedback on 20% of correct responses) gives MI = 0.057.
That is, Sequence 2 feedback provides less information about outcome than does
Sequence 1.

Note that we cannot match the
number of sounds between the two sequences perfectly since the error rate varies
between participants. We have used an estimation based on previous data that
participants perform between 80% and 90% correct and therefore set
the amount of feedback received on the correct trials to 20%, which
corresponds to approximately 16%–18% of the total amount of
trials. However, the potential difference between the two sequences is
small.
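The entropy and mutual information values quoted for the two fictive sequences can be checked with a short sketch. It uses the definitions given above, MI = H(o) − H(o|f), and treats feedback as depending only on the current outcome. The function names and the parameterization by p(sound | error) and p(sound | correct) are choices made for this illustration.

```python
import math


def entropy(probs):
    """Average surprise in bits; zero-probability events contribute nothing."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def mutual_information(p_error, p_sound_given_error, p_sound_given_correct):
    """MI between feedback (sound / no sound) and outcome (error / correct)."""
    p_correct = 1.0 - p_error
    joint = {
        ("s", "e"): p_error * p_sound_given_error,
        ("n", "e"): p_error * (1.0 - p_sound_given_error),
        ("s", "c"): p_correct * p_sound_given_correct,
        ("n", "c"): p_correct * (1.0 - p_sound_given_correct),
    }
    h_outcome = entropy([p_error, p_correct])
    # H(o|f): entropy of the outcome within each feedback event, weighted by p(f).
    h_outcome_given_feedback = 0.0
    for f in ("s", "n"):
        p_f = joint[(f, "e")] + joint[(f, "c")]
        if p_f > 0:
            h_outcome_given_feedback += p_f * entropy(
                [joint[(f, "e")] / p_f, joint[(f, "c")] / p_f]
            )
    return h_outcome - h_outcome_given_feedback


print(round(entropy([0.2, 0.8]), 3))                     # 0.722
print(round(mutual_information(0.2, 1.0, 0.0), 3))       # Sequence 1: 0.722
print(round(mutual_information(0.2, 0.0, 0.2), 3))       # Sequence 2: 0.057
```

With 20% errors, a sound on every error removes all uncertainty about the outcome, whereas a sound on 20% of correct responses leaves most of it in place.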
Computing the Mutual Information Between Outcome and Feedback
For many of the sequences we
have used, the type of feedback (sound or no sound) depends on the outcome
of the current trial and the previous trial. The levels of the three
experimental factors FE, FEC, and FCC determine the values of the following
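probabilities:

$$p(s_t \mid e_t) = p(FE), \qquad p(s_t \mid c_t, e_{t-1}) = p(FEC), \qquad p(s_t \mid c_t, c_{t-1}) = p(FCC)$$

where $t$ indexes the trial.
p(FE) can be 0 or 1, p(FEC) can be 0 or 1,
and p(FCC) can be 0, 0.2, or 1. The experimental condition
specifies these probabilities. Given these, and the error probabilities
$p(e_t) = 1 - p(c_t)$, we have the quantities we need to compute the entropies and mutual
information. First we compute the joint probability of the eight possible
three-way events:

$$p(f_t, o_t, o_{t-1}) = p(f_t \mid o_t, o_{t-1})\, p(o_t)\, p(o_{t-1})$$

where we have assumed $p(e_t, e_{t-1}) = p(e_t)\, p(e_{t-1})$. We also assume
$p(e_t) = p(e_{t-1})$. We then compute the probabilities of
the four possible two-way events

$$p(f_t, o_t) = \sum_{o_{t-1}} p(f_t, o_t, o_{t-1})$$

and then the probabilities of sound and no sound:

$$p(s_t) = \sum_{o_t} p(s_t, o_t), \qquad p(n_t) = 1 - p(s_t)$$

The mutual information between
feedback and outcome is then given by

$$MI = \sum_{f_t} \sum_{o_t} p(f_t, o_t) \log_2 \frac{p(f_t, o_t)}{p(f_t)\, p(o_t)}$$

Our calculation of the mutual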
information assumes that subjects have no knowledge of the outcome prior to
receiving external feedback. However, it may be the case that subjects are
able to assess whether their response was correct or incorrect using their
internal monitoring system. Evidence against the information theoretic
hypothesis (as characterized using the MI equation derived above) is
therefore evidence in favor of an internal monitoring system. We return to
this topic in the discussion.
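The computation described in this section can be sketched for a single experimental condition. This is an illustration under stated assumptions, not the authors' script: it assumes that feedback is treated as a binary sound/no-sound event, that successive outcomes are independent with a constant error probability, and that the sound probability depends on the current and previous outcome through p(FE), p(FEC), and p(FCC). The function name `condition_mi` is hypothetical.

```python
import math
from itertools import product


def condition_mi(p_fe, p_fec, p_fcc, p_error):
    """Mutual information (bits) between feedback and current outcome."""
    p_outcome = {"e": p_error, "c": 1.0 - p_error}

    def p_sound(o_t, o_prev):
        # Sound probability given the current and the previous outcome.
        if o_t == "e":
            return p_fe
        return p_fec if o_prev == "e" else p_fcc

    # Build the three-way joint and sum over the previous outcome
    # to obtain the four two-way probabilities p(f_t, o_t).
    joint = {(f, o): 0.0 for f in "sn" for o in "ec"}
    for o_t, o_prev in product("ec", repeat=2):
        weight = p_outcome[o_t] * p_outcome[o_prev]  # independence assumption
        ps = p_sound(o_t, o_prev)
        joint[("s", o_t)] += ps * weight
        joint[("n", o_t)] += (1.0 - ps) * weight

    # Marginal probabilities of sound and no sound.
    p_feedback = {f: joint[(f, "e")] + joint[(f, "c")] for f in "sn"}

    mi = 0.0
    for (f, o), p in joint.items():
        if p > 0:  # zero-probability events contribute nothing
            mi += p * math.log2(p / (p_feedback[f] * p_outcome[o]))
    return mi


# Conditions [100] and [001] at an assumed error probability of 0.2.
print(condition_mi(1, 0, 0.0, 0.2))
print(condition_mi(0, 0, 0.2, 0.2))
```

Under these assumptions, feedback on all errors is far more informative about the outcome than feedback on 20% of the corrects following corrects, which is the contrast the information theoretic hypothesis relies on.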
Results
Accuracy
Effect of Gender, Age, and Order
There was no effect of gender,
age, or condition order on the accuracy level.
Main Effects
FE: External feedback on
errors showed no significant difference compared to no external feedback on
errors in accuracy level, F(1, 164) = 0.02,
p > 0.89, log Bayes Factor = −2.55
(Figure 3A). The Bayes factor provides strong
evidence for the null hypothesis.
Figure 3
Accuracy; main effects of feedback. Each bar corresponds to the average ±
SEM of the mean accuracy of each of the conditions within the main
effects. (A) Errors: The main effect of FE showed no significant difference to whether
external or internal (no external) feedback was presented (p >
0.89). (B) The correct response after an error: Main effect of FEC showed a significant
reduction in performance when external feedback was presented during this period
(p < 0.002). (C) Correct following corrects: Main effect of FCC
showed a significant effect (p < 0.01).
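The direction of a main effect, defined in the Statistical Analysis section as the average over "high" conditions minus the average over "low" conditions, can be illustrated with the condition means from Table 1. This is only a rough sketch: it uses unweighted condition means, whereas the reported ANOVA was based on subject-level summary statistics corrected for subject effects, so the values below are not expected to reproduce the reported effect sizes. The function name `main_effect` is hypothetical.

```python
# Mean accuracy (%) per condition [FE; FEC; FCC], copied from Table 1.
accuracy = {
    "000": 87.4, "001": 88.1, "002": 85.2,
    "010": 81.2, "011": 85.2, "012": 87.2,
    "100": 88.9, "101": 86.8, "102": 86.0,
    "110": 85.0, "111": 85.0, "112": 83.9,
}


def main_effect(values, factor_index, high="1", low="0"):
    """Average of conditions where the factor is high minus where it is low."""
    high_vals = [v for cond, v in values.items() if cond[factor_index] == high]
    low_vals = [v for cond, v in values.items() if cond[factor_index] == low]
    return sum(high_vals) / len(high_vals) - sum(low_vals) / len(low_vals)


fe_effect = main_effect(accuracy, 0)   # feedback on errors
fec_effect = main_effect(accuracy, 1)  # feedback on corrects after errors
print(round(fe_effect, 2))   # 0.22
print(round(fec_effect, 2))  # -2.48
```

The near-zero difference for FE and the negative difference for FEC agree in direction with the main effects reported here.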
FEC: External feedback on
corrects after errors revealed a significant effect, compared to no external
feedback on corrects after errors, F(1, 164) = 9.94, mean
effect size 0.11%, p < 0.002. As seen in Figure 3B,
there was a reduction in performance when participants were presented with
external feedback.

FCC: External feedback on
corrects following corrects revealed a significant effect,
F(2, 164) = 4.74, mean effect size 0.6%,
p < 0.01 (Figure 3C). A post hoc pairwise analysis showed a
significant difference in performance between FCC100 and
FCC20 (p < 0.04). The comparison between
FCC100 and FCC0 did not reach significance
(p > 0.16). There was no significant change in
accuracy between FCC20 and FCC0 (p
> 0.55).
Interactions
FE-FCC: There was a
significant interaction between errors and corrects following corrects,
where external feedback on error, together with FCC100, that is,
the two conditions [102] [112], resulted in reduced performance compared to
other FE and FCC combinations, F(2, 164) = 71.8,
p < 0.0001.

FEC-FCC: The interaction
analysis between corrects after errors and corrects following corrects also
revealed a significant effect F(2, 164) = 75.2,
p < 0.0001. Performance was significantly improved
when no external feedback was presented on FEC (FEC0) in
combination with either no external feedback on the corrects following
corrects or when there is external feedback on only 20% of the
corrects following corrects.

The interaction between FE and
FEC and the three-way interaction FE-FEC-FCC did not reveal any significant
differences.
Feedback and Double-Errors
To investigate if there may be
any sign of reduced error detection in the six conditions without external
error-feedback we compared the number of double-errors between the
conditions with and without external error-feedback. There was no
significant difference between the two groups t(10) = 0.31,
p > 0.76, log Bayes Factor = −2.31 (mean
double-errors external error-feedback: 4.17 ± 0.86; no external
error-feedback: 3.39 ± 0.49), nor between the 12 conditions,
F(11) = 1.6, p > 0.10. See
Table 1 for
individual data. The Bayes factor provides substantial evidence for the null
hypothesis.
Correlations between accuracy
and double-errors showed that in the four sequences where external feedback
was given on some random general corrects (FCC20), participants
who performed worse made more double-errors: [001] r² = 0.495, p < 0.01;
[101] r² = 0.61, p > 0.001; [011] r² = 0.42, p < 0.001;
[111] r² = 0.60, p < 0.01. Also, in two of the
conditions where external feedback was presented on all “corrects
following corrects” (FCC100), the participants who
performed worse made more double-errors: [112] r² = 0.34, p < 0.05;
[002] r² = 0.52, p < 0.05. There was a marginal
significance in the condition [012], r = 0.211,
p < 0.11.
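As a rough illustration of how trials could be sorted into the three phases and how double-errors could be counted, the sketch below labels a sequence of trial outcomes. It rests on assumptions not stated in this section: a double-error is taken to be an error immediately preceded by another error, a correct first trial is labelled FCC, and the function names `label_phases` and `count_double_errors` are hypothetical.

```python
def label_phases(outcomes):
    """Label each trial as FE, FEC, or FCC.

    outcomes: list of bool, True = correct response, False = error.
    Assumption: a correct response on the very first trial counts as FCC.
    """
    labels = []
    previous = None
    for correct in outcomes:
        if not correct:
            labels.append("FE")    # error phase
        elif previous is False:
            labels.append("FEC")   # correct response right after an error
        else:
            labels.append("FCC")   # correct response following a correct
        previous = correct
    return labels


def count_double_errors(outcomes):
    """Count errors that directly follow another error (assumed definition)."""
    return sum(
        1
        for before, after in zip(outcomes, outcomes[1:])
        if not before and not after
    )


# Hypothetical outcome sequence, for illustration only.
trials = [True, False, False, True, True, False, True]
phases = label_phases(trials)
double_errors = count_double_errors(trials)
```

Under this definition a run of three consecutive errors would contribute two double-errors; a different counting rule would change the totals.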
Reaction Time
FE: External feedback on
errors revealed no significant effect compared to no external feedback on
errors F(1, 164) = 0.69, p > 0.41, log
Bayes Factor = −2.23 (Figure 4A). The Bayes factor provides
substantial evidence for the null hypothesis.
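To make the reported log Bayes factors easier to read, the following minimal sketch converts a log Bayes factor into odds in favor of the null hypothesis. It assumes that the values are natural logarithms and that negative values favor the null; the function name `odds_for_null` is hypothetical.

```python
import math


def odds_for_null(log_bayes_factor):
    """Convert a log Bayes factor into odds in favor of the null.

    Assumptions: natural logarithm, and a negative value means the
    data favor the null hypothesis over the alternative.
    """
    return math.exp(-log_bayes_factor)


# Log Bayes factors reported in the text.
reported = {
    "RT, feedback on errors": -2.23,
    "double-errors": -2.31,
    "[100] vs [001] accuracy": -3.1,
}
odds = {name: odds_for_null(value) for name, value in reported.items()}
for name, value in odds.items():
    print(f"{name}: data about {value:.1f} times more likely under the null")
```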
Figure 4
RT; Main effect of feedback. Each bar corresponds to the average ±
SEM of the mean RT of each of the conditions within the main
effects. (A) Errors: Main effect of FE showed no significant effect. (B) The correct
response after an error: Main effect of FEC showed no significant effect. (C) Correct
following corrects: There was a significant main effect of FCC. RT was significantly
faster when external feedback was provided on FCC100 compared to no external
feedback (p < 0.05).
FEC: External feedback on
corrects after errors did not show any significant difference in RTs
compared to no external feedback on corrects after errors
F(1, 164) = 0.14, p > 0.71 (Figure 4B).
FCC: There was a significant
main effect of external feedback on corrects following corrects,
F(2, 164) = 4.88, mean effect size 17.41 ms,
p < 0.008. A significant shortening in RT was
observed for FCC100 when compared to FCC0
(p < 0.05). There was a marginal significance
(p < 0.11), in a shortening of RT for
FCC100 when compared to FCC20. No significant
difference was observed between no external feedback and 20% external
feedback on corrects following corrects, FCC0 versus
FCC20 (p > 0.66; Figure 4C).
FEC-FCC: The interaction
analysis regarding RT between external feedback on corrects after errors and
corrects following corrects revealed a significant effect,
F(2, 164) = 3.3, p < 0.04, meaning that
RT was significantly faster in the conditions where external feedback was
received on corrects after errors together with external feedback on all
corrects following corrects, that is, [012] and [112].
No other interactions were
found to be significant.
Testing the Use of Feedback With Information Theory
To evaluate the hypothesis that
external feedback on an error would be of more information value to participants
than external feedback on correct responses, we compared conditions [100] and
[001]. There was no significant difference in performance between the conditions
[100] and [001], t(22) = 0.27, p > 0.6, log
Bayes Factor = −3.1, nor were these conditions influenced by MI, that is,
the amount of information the feedback signal provides about the outcome, [100]
r = 0.2, r² = 0.04, p > 0.52; [001] r = 0.02,
r² = 0.0004, p > 0.94. The Bayes factor provides
strong evidence for the null hypothesis.
The participants used the external feedback signal as information about the
outcome, to the extent that it influenced RT, only in the two sequences that
contained the largest amount of sound: [012] and [102] ([012] r = −0.64,
r² = 0.41, p < 0.02; [102] r = −0.52,
r² = 0.27, p < 0.04). These significant
correlations mean that the more information the participant extracts from the
feedback signal about the outcome the shorter the RT. Note that the analysis
could not be performed on the sequences [000] and [112], as MI did not vary over
participants (this is because external feedback is provided on none or every
outcome).
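The mutual information (MI) between the outcome and the feedback signal can be illustrated with a small calculation. This is a sketch under stated assumptions rather than the authors' procedure: the outcome is treated as binary (correct or error), the signal as binary (tone or no tone), and the probabilities passed in (for example 80% correct) are illustrative inputs. The function names are hypothetical.

```python
import math


def joint_from_rates(p_correct, p_tone_given_correct, p_tone_given_error):
    """Build a 2x2 joint table: rows = outcome (correct, error),
    columns = signal (tone, no tone)."""
    p_error = 1.0 - p_correct
    return [
        [p_correct * p_tone_given_correct, p_correct * (1.0 - p_tone_given_correct)],
        [p_error * p_tone_given_error, p_error * (1.0 - p_tone_given_error)],
    ]


def mutual_information(joint):
    """Mutual information in bits between rows and columns of a joint table."""
    total = sum(sum(row) for row in joint)
    p = [[cell / total for cell in row] for row in joint]
    p_rows = [sum(row) for row in p]
    p_cols = [sum(col) for col in zip(*p)]
    mi = 0.0
    for i, row in enumerate(p):
        for j, p_ij in enumerate(row):
            if p_ij > 0:  # empty cells contribute nothing
                mi += p_ij * math.log2(p_ij / (p_rows[i] * p_cols[j]))
    return mi


# Tone on every error and never on corrects (a [100]-like schedule).
mi_error_only = mutual_information(joint_from_rates(0.8, 0.0, 1.0))
# Tone on 20% of corrects and never on errors (a [001]-like schedule).
mi_some_corrects = mutual_information(joint_from_rates(0.8, 0.2, 0.0))
# No tone at all (a [000]-like schedule): the signal carries no information.
mi_no_tone = mutual_information(joint_from_rates(0.8, 0.0, 0.0))
```

In this toy setting a tone that marks every error removes all uncertainty about the outcome, whereas a signal that never varies carries no information, which matches the remark that MI cannot vary when feedback is given on none of the outcomes.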
Discussion
Our results indicate a differential
effect of feedback on performance depending on the phase in which the feedback is
presented. Accuracy and RTs vary depending on feedback-type and phase. We find that
error-monitoring differs from the subsequent correct response, in the sense that the
phase of the correct response after an error (FEC) is sensitive to external feedback, whereas
errors (FE) are not. FEC appears to differ from FCC responses as well. There was a
reduction in performance for both the main effects (FEC and FCC) when external
feedback was provided; however, a closer look at the FCC conditions revealed that
FCC100 was responsible for this effect. Moreover, the feedback did
not influence RTs on FEC, but did so significantly for FCC100. This
finding shows that the FEC in particular is a phase sensitive to external
disturbance.
We do not seem to care about whether
we are externally informed about errors or not, since there is no difference in how
people perform with and without error-feedback, as revealed by our main effects
analyses. To quantify how much evidence there is in favor of no difference in
performance between external and no external feedback on errors, we computed the log
Bayes Factor (logBF). We found that for accuracy logBF was −2.55 and for RT
logBF was −2.23. This tells us that the data are about 12 times
(exp(2.5) ≈ 12.2) more likely to have occurred under the null hypothesis than under the
alternative hypothesis. In other words, this is strong support for the null
hypothesis (Jeffreys, 1961). When investigating the effect of external feedback on
errors with an information theoretic model, again we found no evidence for the
hypothesis that the brain utilizes external error information more readily than
external information about other outcomes in a cognitively demanding sequential
response task. Looking at the two sequences with the highest performance scores
[100] and [001], one of which had external feedback on errors (approximately
20% errors), the other which had sounds delivered on approximately 20%
of the correct responses randomly distributed, there was no significant difference
in accuracy scores. Supplementary Bayesian statistical analysis gave a Log Bayes
Factor of −3.10, which gave strong support for the null hypothesis. The
finding is in line with a brain imaging study by Holroyd et al. (2004) showing that ACC responds with
a similar magnitude to errors independent of external or internal feedback. It
therefore seems unlikely that the participants are unaware of their errors in the
conditions without external error-feedback, or that the external error-feedback
would interfere with performance monitoring. Nevertheless, we looked into this issue
by counting double-errors arguing that there would be more of these if the
participants lacked coherent error-monitoring. We found no support for more
double-errors being committed in either the internal or the external error-feedback
conditions. The estimated Log Bayes Factor was −2.31, which gives
substantial support in favor of the null hypothesis. This supports the claim that
feedback-type on errors, on a task where the accuracy level is around 80%,
has no impact on error-monitoring.
When we computed the MI, that is, the
reduction in uncertainty before versus after hearing the feedback, we assumed that
the participant thinks they got it right with a probability of 80% (average
performance level) before hearing the tone. This, however, turned out to be wrong.
This is most likely due to the fact that the brain has already worked out the
outcome (error or correct) prior to the feedback signal. The real uncertainty before
hearing the feedback is much less and so the MI is much less. Thus, we can infer
from the results given from the information theory that the participants are not
ignorant about the outcome before hearing the feedback because the internal
monitoring system is doing a good job. This is consistent with our other analyses,
which show that external feedback does not help. We argue that this is due to the
efficiency of our error-monitoring system, which has developed through evolution to
assist progress and survival without having to rely on external sources.
Only when external feedback was given
on each of the correct responses following corrects was there a significant
reduction in both accuracy and RT. Reduced RT with increased amount of external
feedback has previously been observed by Houtman et al. (2012). The correlation between MI and RT
for these conditions supported the above finding in showing that RT is influenced by
the information from the external feedback when the sequences consist of a large
amount of external feedback (>80%) and is influenced in such a way that RT
is shortened. The information-theoretic finding that the participants most
likely register their outcome before the feedback signal is delivered suggests that
the effect of feedback on many correct responses is preparatory, or
confirmatory, rather than reactive. We know from a previous study that predictable
auditory signals automatically activate pre- and primary motor cortices and
suggestively lower the execution threshold (Bengtsson et al., 2009). In order to generate a response,
according to the Evidence Accumulation type models (Gold & Shadlen, 2001), the motor system
triggers a response signal when enough information has been accumulated to reach
decision threshold. In the present study, it seems as if the feedback signal is
incorporated into preparing a response that lowers the threshold. For about
80% of the trials the participants are doing fine; they are in a
“standard/automatic response mode,” perhaps gradually losing task
control exercised on the motor system by the prefrontal cortex. Alternatively, the
effect that a large amount of external feedback leads to reduced performance accuracy
could be due to superfluous external information taking up attentional resources
(MacLeod & MacDonald,
2000). A third possibility is that the phonological loop
used during working memory (Baddeley,
Gathercole, & Papagno, 1998) is active during the
n-back task for letters, and that the auditory feedback
interferes with this loop. However, we have unpublished pilot data showing that
visual feedback, in the form of a flash of light, also disturbs performance, which would
speak against an interaction between the external feedback and the
n-back task within the phonological loop. Future brain imaging
data will shed light on which of these mechanisms is operating.
From our results we conclude that
processes active during FEC are different from those active during FE. It is
therefore unlikely that the phase FEC would display simply more
“cautious” behavior as a consequence of the error as suggested by
Laming (1968).
Instead, we suggest that this period contains an additional process unique to this
phase, which may be one of consolidation, confirming that the change of strategy was
accurate. This finding is in line with brain imaging studies showing a different
activity pattern in this phase when compared to errors as well as other correct
responses (Marco-Pallarés et al.,
2008). Delivering external feedback on 20% of
corrects following corrects did not significantly change performance. When
participants make an error they need to reset their response mode and the outcome of
the trial after an error is therefore crucial for evaluating whether the response
mode is reset correctly. While they are assessing this it seems particularly
deleterious to also process external feedback signals, while on a correct response
after a correct response they have already established that their response mode has
been appropriately reset.
We found that the participants who
were the weaker performers made significantly more double-errors in the conditions
where they were presented with feedback on random correct responses and
FCC100. This shows that not only are there individual differences in
how people handle external feedback, but that the sequential structure of the
feedback matters as well. In fact, we find that certain combinations of feedback
between the different phases matter for accuracy. For example, external
error-feedback together with external feedback on corrects gave the poorest
accuracy, whereas no external feedback on the first correct after an error together
with less than 20% feedback on other corrects, regardless of error-feedback,
led to the best performance. This suggests that the participants, to a certain
degree, process an outcome in relation to the character of previous trials.
Conclusion
In summary, our finding that external
error-feedback does not influence performance is in line with the theories that
outline ACC as a generic error-monitoring system (Botvinick et al., 2001; Holroyd et al., 2005) and resonates
with the finding of Houtman et al.
(2012). Thus, our finding supports the notion that the
internal error-monitoring system is sufficient in cognitive tasks where accuracy is
around 80%. We find that external feedback on correct responses leads to
deteriorating accuracy, which suggests that external signals are diverting attention
away from the task when present on correct responses. An interesting novel finding
is that the correct response after an error is particularly sensitive to external
signals, which suggests that important internal consolidation of strategy
implementation takes place here. We propose that feedback manipulations of three
different phases can be used in future studies to investigate individual
characteristics and deviations in performance monitoring.