We examined the neural signatures of stimulus features in visual working memory (WM) by integrating functional magnetic resonance imaging (fMRI) and event-related potential data recorded during mental manipulation of colors, rotation angles, and color-angle conjunctions. The N200, negative slow wave, and P3b were modulated by the information content of WM, and an fMRI-constrained source model revealed a progression in neural activity from posterior visual areas to higher order areas in the ventral and dorsal processing streams. Color processing was associated with activity in inferior frontal gyrus during encoding and retrieval, whereas angle processing involved right parietal regions during the delay interval. WM for color-angle conjunctions did not involve any additional neural processes. The finding that different patterns of brain activity underlie WM for color and spatial information is consistent with ideas that the ventral/dorsal "what/where" segregation of perceptual processing influences WM organization. The absence of characteristic signatures of conjunction-related brain activity, which was generally intermediate between the 2 single conditions, suggests that conjunction judgments are based on the coordinated activity of these 2 streams.
Working memory (WM) is the essential cognitive ability to maintain and manipulate information that is no longer physically available (Baddeley 1992). This ability is mediated by a network of brain regions, mainly in prefrontal and parietal cortex, which may be differentially engaged depending on the task and load (Curtis and D'Esposito 2003; Linden et al. 2003). In the visual domain, WM appears to depend on separate subsystems for storage and manipulation of visual and spatial information (Mohr and Linden 2005), and it has been suggested that the ventral/dorsal “what/where” segregation of perceptual processing (Ungerleider and Mishkin 1982) may continue into frontal areas (Goldman-Rakic 1987; Ungerleider et al. 1998). This idea is supported by experiments using functional magnetic resonance imaging (fMRI), which reveal dorsolateral prefrontal cortex (DLPFC) or premotor cortex (PMC) to be preferentially involved in WM for spatial information, whereas ventrolateral prefrontal cortex (PFC) is preferentially involved in WM for nonspatial information (Courtney et al. 1998; Munk et al. 2002; Sala et al. 2003). Alternatively, WM has been conceptualized as “process specific,” with dorsal and ventral PFC specialized for manipulation and maintenance, respectively (D'Esposito et al. 1999; Owen et al. 1999). The domain- and process-specific accounts of WM architecture are not mutually exclusive, and recent neuroimaging experiments have suggested that both maintenance and manipulation processes recruit content-specific brain regions, as well as regions specialized according to the executive demands of the task (Mohr et al. 2006).

Given that fMRI provides only limited information about the timing of cognitive processes, in the present report we wish to elucidate the temporal dynamics of visuospatial WM using the event-related potential (ERP) approach. In particular, the P3b subcomponent of the P300 (see Polich 2003) has been strongly linked with visual WM encoding and retrieval. For example, the P3b is larger during encoding for stimuli that are subsequently recalled (Fabiani et al. 1986), and increasing WM load causes P3b amplitude to decrease at both encoding and retrieval (Kok 2001; Morgan et al. 2008). Amplitude modulations of the P3b are thought to indicate the allocation of processing resources (see Kok 2001), whereas P3b topography may reflect the nature of the information to be processed (Johnson 1989), and the generators of P3b may vary depending on whether the task involves objects or spatial locations (Mecklinger and Müller 1996; Mecklinger et al. 1998).

WM maintenance during delay intervals has been indexed by a sustained negative slow wave (NSW) that varies in amplitude and topography depending on the nature of the retained information. The NSW increases with WM load and is largest over left frontal areas during WM maintenance of phonological and auditory information and over posterior occipital and parietal areas during maintenance of visuospatial information (Barrett and Rugg 1989, 1990; Lang et al. 1992; Ruchkin et al. 1992). Within the visual domain, the NSW may reflect segregation of what and where information, with larger amplitude for maintenance of spatial compared with visual information (Mecklinger and Müller 1996; Woodman and Vogel 2008). Mental rotation and size scaling of WM contents produce an NSW that is also maximal over parieto-occipital scalp sites (Rösler et al. 1995), which suggests that information coding in WM may be similar for both maintenance and manipulation processes. However, it is not known whether ERPs associated with WM manipulation differ for visual and spatial information and visual–spatial conjunctions.

To increase our understanding of the sequence of processing stages underlying complex cognitive tasks, termed “mental chronometry” (Posner 1978), we have combined the temporal resolution of ERPs with the spatial resolution of fMRI using an fMRI-guided source analysis approach (Scherg and Berg 1991, 1996; Bledowski et al. 2006). More specifically, the current study examines the mental chronometry of visuospatial WM by integrating fMRI and ERP data recorded during “manipulation” of visual or spatial information, as well as the conjunction of visual and spatial information. In a delayed matching-to-sample paradigm modeled after Mohr and Linden (2005), participants were instructed to mentally manipulate the colors, rotation angles, or both colors and angles (dual task) of 2 briefly presented sample stimuli, and then indicate whether or not a subsequent test stimulus was the intermediate color blend and/or rotation angle of the previous samples. The goal of this approach was to decompose the processing stages involved in WM encoding, manipulation, and retrieval in order to shed light on the neural mechanisms underlying content-specific stimulus processing.
Materials and Methods
Participants
Eighteen neurologically healthy students from Bangor University (8 males, 10 females) aged between 20 and 38 years (mean age 25 years) took part in the study in return for payment. Participants all had normal or corrected-to-normal visual acuity and were tested for color vision (Dvorine 1953).
Study Design (Stimuli and Procedure)
The experiment required manipulation of colors, angles, or both colors and angles in WM. Figure 1 shows an example of the trial sequence. The stimuli were colored semicircles (visual angle = 2.2° × 4.1°) on a black background. Each trial began with a central instruction letter, indicating which task to perform. “A” indicated the angle task, “C” indicated the color task, and “D” indicated the dual task (i.e., the combination of both angle and color tasks). A white fixation cross appeared for 500 ms and then 2 sample stimuli with different colors and rotation angles appeared for 500 ms on the left and right of the fixation cross (distance from fixation = 3.8°). There was a 2000-ms delay in which only the fixation cross was present. Then a test stimulus appeared in the center of the screen for 3000 ms, during which participants had to indicate with a left or right hand button press whether the test stimulus matched or mismatched the average of the 2 sample stimuli in terms of color, rotation angle, or both color and angle. This was followed by a feedback display for 1000 ms, in which the fixation cross turned green for a correct response, red for an incorrect response, and gray if no response was made during the 3000-ms presentation of the test stimulus. The assignment of match and mismatch to the left and right response buttons was counterbalanced across participants. The intertrial interval, which contained only the fixation cross, was 3–7 s (average 5 s) during the electroencephalography (EEG) session and 3.5–11.5 s (average 7.5 s) during the fMRI session. The experiment was divided into separate blocks of 45 trials, with 15 trials per task. For each task, the test stimulus matched the average of the 2 samples on one-third of trials and mismatched on two-thirds of trials. The order of conditions within each block was randomized.
Figure 1.
Example of a typical trial. A single letter instructed participants which task to perform (A for angle, C for color, and D for dual task). Participants had to mentally manipulate the colors, rotation angles, or both colors and angles (dual task) of the sample stimuli and then indicate whether or not the test stimulus matched the intermediate color blend and/or rotation angle of the samples. The subsequent fixation cross changed color to indicate response accuracy (red for incorrect, green for correct, and gray for no response/timeout).
The rotation angles of the 2 sample stimuli differed by a rotation of 60°. In the “match” condition of the angle task, the rotation angle of the test stimulus differed from each sample stimulus by a rotation of 30°. In the “mismatch” condition, the rotation angle of the test stimulus differed from the matching rotation angle by 20° (50%) or 30° (50%). Colors were defined in hue saturation value color space, in which the hue is represented by 0°–360°. The 2 sample stimuli differed in hue by 60°, and the hue of the test stimulus in the match condition differed from each sample by 30°. In the mismatch condition, the hue of the test stimulus differed from the matching hue by either 50° (50%) or 30° (50%). During the EEG session, the colors of the sample and test stimuli were matched in luminance.

The order of the fMRI and EEG sessions was counterbalanced across participants. The EEG session contained 6 blocks, with overall 90 trials per task. Stimuli were presented on a 54-cm thin film transistor (TFT) monitor, and participants responded using keys “A” and “L” on the computer keyboard. The fMRI session contained 4 blocks, with overall 60 trials per task. Stimuli were back projected onto a frosted screen and viewed through a mirror positioned on the head coil. Participants’ responses were registered using the left and right keys of a fiber-optic response box (Current Designs, Philadelphia, PA). 
A practice block of 9 trials per task was given before each recording session.
EEG Recording and Analysis
The EEG session took place inside a Faraday cage to minimize electrical interference. The EEG was recorded from 64 ring electrodes using Abralyt Light (FMS, Munich, Germany) as a conducting agent. An elastic cap (Easy Cap; FMS) was used to place the electrodes in the following 10-10 positions (American Electroencephalographic Society 1991): Nz, FP1, FP2, AFz, F9, F7, F5, F3, F1, Fz, F2, F4, F6, F8, F10, FT9, FT7, FC5, FC3, FC1, FCz, FC2, FC4, FC6, FT8, FT10, T7, C5, C3, C1, C2, C4, C6, T8, TP9, TP7, CP5, CP3, CP1, CPz, CP2, CP4, CP6, TP8, TP10, P9, P5, P3, P1, Pz, P2, P4, P6, P10, PO7, PO3, POz, PO4, PO8, O1, O2, Iz. Two infraorbital channels (IO1 and IO2) were located vertically below each eye. All channels were referenced during recording to a reference electrode positioned at Cz, and an electrode positioned at FPz served as ground. Electrode impedances were kept below 5 kΩ. The EEG was recorded with 2 BrainAmps DC amplifiers (Brain Products, Munich, Germany) and sampled at 1000 Hz with a 250-Hz low-pass filter.

BESA software (MEGIS Software GmbH, Gräfelfing, Germany) was used for EEG analysis. The EEG was rereferenced to the average reference and separated into epochs of 3700-ms duration for correct response trials only, starting 200 ms before the onset of the sample stimuli and ending 1000 ms after the onset of the test stimulus. Eyeblink artifacts were identified using a template-based method (Ille et al. 2002) to allow for subsequent correction in the ERP analysis (see below). Apart from eyeblinks, epochs with amplitudes exceeding ±100 μV in any channel were excluded from further analysis. On average, 87% of trials were retained after the artifact rejection.

ERPs for each condition were calculated separately for encoding (−200 to 800 ms from the onset of the sample stimuli), delay (800 to 2300 ms from the onset of the sample stimuli), and retrieval (−200 to 800 ms from the onset of the test stimulus). 
A 0.5- to 20-Hz filter was applied prior to ERP averaging of the encoding and retrieval phases; this filter removed the delay activity from the retrieval baseline. For analysis of delay activity, a 0.05- to 8-Hz filter was used, and the 200 ms prior to the onset of the sample stimuli was used as the baseline. Figure 2 shows the grand-average waveform at Pz over the entire trial (filtered: 0.5–20 Hz), as well as the separate encoding, delay, and retrieval waveforms used in the ERP analysis.
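To illustrate the rejection and averaging steps, here is a minimal sketch in plain Python. It is not the BESA procedure. It only shows the logic of discarding epochs that exceed ±100 μV in any channel and subtracting a prestimulus baseline before averaging. The data layout (`epochs[trial][channel][sample]`) and all function names are assumptions made for the example. Filtering and blink correction are omitted.

```python
def exceeds_threshold(epoch, threshold_uv=100.0):
    """True if any sample in any channel lies outside +/- threshold_uv."""
    return any(abs(v) > threshold_uv for channel in epoch for v in channel)


def baseline_correct(channel, n_baseline):
    """Subtract the mean of the first n_baseline samples from one channel."""
    base = sum(channel[:n_baseline]) / n_baseline
    return [v - base for v in channel]


def average_erp(epochs, n_baseline, threshold_uv=100.0):
    """Average the retained epochs; returns (erp[channel][sample], n_kept)."""
    kept = [e for e in epochs if not exceeds_threshold(e, threshold_uv)]
    if not kept:
        raise ValueError("all epochs were rejected")
    erp = []
    for ch in range(len(kept[0])):
        corrected = [baseline_correct(e[ch], n_baseline) for e in kept]
        # zip(*corrected) iterates over time points across epochs.
        erp.append([sum(col) / len(kept) for col in zip(*corrected)])
    return erp, len(kept)


# Three toy single-channel epochs; the second one contains a 150 uV artifact.
toy = [[[1.0, 1.0, 2.0, 3.0]], [[0.0, 0.0, 150.0, 0.0]], [[3.0, 3.0, 4.0, 5.0]]]
erp, n_kept = average_erp(toy, n_baseline=2)
print(n_kept, erp)
```

At the 1000-Hz sampling rate reported above, a 200-ms baseline would correspond to `n_baseline=200`.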
Figure 2.
Grand-average ERP waveforms for each condition over the entire trial, including encoding, delay, and retrieval phases (A). Grand-average ERP waveforms for each condition and spline-interpolated topographical maps of scalp voltage (collapsed over all conditions) at the latencies of each component for encoding (B), delay (C), and retrieval (D). * Indicates significant amplitude differences between conditions.
For the ERP analysis, eyeblink artifacts were corrected using an adaptive artifact correction method (Ille et al. 2002) implemented in BESA. Prior to analysis, individual ERPs were interpolated to the standardized 81-electrode configurations of the 10-10 system using spherical spline interpolation. ERP components were identified by their scalp topographies and peak latencies. For analysis of early visual ERP components, peak amplitudes were determined as the local maxima or minima within the time segments 90–160 ms (P1) and 170–230 ms (N200) after the onset of the sample stimuli (encoding phase) and the test stimulus (retrieval phase). P1 peak amplitude was measured at P9, P10, PO9, and PO10 for encoding and at PO9, PO10, O1, and O2 for retrieval. N200 peak amplitude was measured at P7, P8, PO7, and PO8 for encoding and retrieval. P1 peak latency was measured at PO9 for encoding and O1 for retrieval. N200 peak latency was measured at PO8 for encoding and retrieval. The amplitudes of the other electrodes were measured at those latencies. For analysis of the P3b, peak latency was measured at Pz between 260 and 400 ms (early P3b) and between 450 and 700 ms (late P3b at retrieval only). Mean amplitudes from a 50- (early P3b) or 100-ms (late P3b) window around each individual's peak latency were measured at 5 medial parietal electrodes (CPz, P1, Pz, P2, and POz). For analysis of delay activity, mean amplitudes of 6 lateral parietal electrodes (P5, P3, P1, P2, P4, and P6) were measured over three 500-ms windows from 800 to 2300 ms following the onset of the sample stimuli. 
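The two amplitude measures used here (a peak within a latency window, and a mean over a window around the peak) can be sketched as follows. This is an illustration, not the BESA implementation. It takes the most extreme value in the window rather than testing for a true local extremum, and it assumes the averaging window is centered on the peak, which the text does not state explicitly. Function names are hypothetical.

```python
def peak_in_window(times_ms, amplitudes, start_ms, end_ms, polarity):
    """Return (latency_ms, amplitude) of the most extreme point in a window.

    polarity=+1 looks for a positive peak (e.g., P1),
    polarity=-1 looks for a negative peak (e.g., N200).
    """
    window = [(t, a) for t, a in zip(times_ms, amplitudes)
              if start_ms <= t <= end_ms]
    return max(window, key=lambda ta: polarity * ta[1])


def mean_around_peak(times_ms, amplitudes, peak_ms, width_ms):
    """Mean amplitude in a window of width_ms centered on peak_ms."""
    half = width_ms / 2.0
    vals = [a for t, a in zip(times_ms, amplitudes)
            if peak_ms - half <= t <= peak_ms + half]
    return sum(vals) / len(vals)


# Toy waveform sampled every 10 ms with three artificial deflections.
times = list(range(0, 500, 10))
amps = [0.0] * len(times)
amps[12], amps[20], amps[30] = 3.0, -5.0, 6.0    # at 120, 200, and 300 ms

print(peak_in_window(times, amps, 90, 160, +1))   # P1-like window
print(peak_in_window(times, amps, 170, 230, -1))  # N200-like window
print(mean_around_peak(times, amps, 300, 50))     # 50-ms window at the peak
```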
For each component, repeated-measures analyses of variance (ANOVAs) with the factor task (color, angle, or dual) were performed. The additional factor hemisphere was included for all components except the P3b. An alpha level of P < 0.05 (2 sided) was used for all tests. Greenhouse–Geisser corrections were applied where appropriate to correct for violations of the assumption of sphericity. The F values, the uncorrected degrees of freedom, and the corrected P values are reported, and only the significant effects are presented. Significant effects were followed up by Bonferroni-corrected post hoc tests.
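As a reminder of where the reported degrees of freedom come from, the sketch below computes a one-way repeated-measures ANOVA by hand. With 18 participants and 3 task levels it yields 2 and 34 degrees of freedom. It is a bare-bones illustration: the Greenhouse–Geisser correction, the hemisphere factor, and the Bonferroni-corrected post hoc tests are all omitted, and the function name and toy data are made up for the example.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data[s][c] is the value for subject s in condition c.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    # Error term: what remains after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Toy data: 3 subjects x 3 conditions.
f_value, df1, df2 = rm_anova_oneway([[1, 2, 4], [2, 3, 4], [3, 5, 6]])
print(f_value, df1, df2)
```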
fMRI Data Recording and Analysis
Magnetic resonance imaging data were recorded using a 3-T Philips MR Scanner with a SENSE parallel head coil. The functional imaging used a T2*-weighted gradient echo planar imaging (EPI) sequence (17 axial slices, repetition time/echo time = 1000/30 ms, flip angle = 65°, field of view = 240 × 240 mm, voxel size = 3 × 3 × 3 mm). There were 4 EPI runs, and 736 volumes were collected within each run. In addition, 2 dummy volumes were acquired before each run to allow steady-state tissue magnetization. For coregistration and display of functional data, high-resolution 3D data sets (1 × 1 × 1 mm) were acquired for each individual using a T1-weighted sequence.

fMRI data were analyzed using the BrainVoyager QX software package (Brain Innovation, Maastricht, The Netherlands). The following preprocessing steps were performed: Talairach transformation, temporal slice scan time correction using sinc interpolation, 3D motion correction using trilinear interpolation, spatial smoothing (8 mm Gaussian kernel), and a temporal high-pass filter (3 cycles per time course). The Talairach transformed 3D anatomical scans were coregistered with the functional data in order to yield 3D functional volumes (volume time courses). fMRI contrasts were a comparison of the whole trial duration with the intertrial baseline, which consisted of a fixation cross. Only trials with correct responses were entered into the analysis. Seventy-two z-normalized volume time courses were entered into a whole brain, random effects general linear model. Multisubject statistical maps for all tasks (color, angle, and dual) compared with fixation were thresholded at P < 0.000002 (uncorrected), and the results were visualized on a surface reconstruction of the Montreal Neurological Institute template brain (see Fig. 3).
Figure 3.
Group statistical maps of blood oxygenation level–dependent signal increase for all conditions compared with fixation (P < 0.000002 uncorrected) and positions of the RSs superimposed on a surface reconstruction of the Montreal Neurological Institute template brain.
Source Analysis
fMRI-guided discrete multiple source analysis was performed using BESA software. Source activities were computed using a 4-shell spherical head model and a regularization constant of 1%. Regional sources (RSs), consisting of 3 dipoles with orthogonal orientations, were used in the model to account for individual variance in cortical folding. RSs were placed in the foci of fMRI activity clusters for all tasks (color, angle, and dual) compared with fixation, in order to enable statistical comparison of time courses across conditions (Bledowski et al. 2004, 2006). Because the amount of ERP variance explained by the model was high for all conditions, a probe source analysis was not considered necessary. Eyeblink artifacts were corrected by adding a spatial component containing the averaged blink topography. This model was applied to the grand-average ERP waves collapsed over all tasks for encoding, delay, and retrieval, with the orientation of the first dipole of each RS determined at the maximum source strength during each epoch. Note that all RSs were placed in the model prior to adjusting the orientation, in order to avoid contagion between sources. Source activities were projected back to scalp voltage, and topographical maps were calculated at peak latencies for each condition (see Figs 4, 5, and 6).
Figure 4.
Source activity during encoding. Time courses of RS intensity and topographical voltage maps at the peak intensity. * Indicates significant differences in source intensity between conditions.
Figure 5.
Source activity during delay. Time courses of RS intensity and topographical voltage maps at the peak intensity. * Indicates significant differences in source intensity between conditions.
Figure 6.
Source activity during retrieval. Time courses of RS intensity and topographical voltage maps at the peak intensity. * Indicates significant differences in source intensity between conditions.
To compare source activity between conditions, the model was applied to individual ERPs for the color, angle, and dual conditions and a spatial component containing the individual blink topography was added to correct for eyeblink artifacts. As an orientation-independent measure of source intensity, the square root of the sum of squares of the 3 orthogonal components was obtained for each RS (e.g., Weisser et al. 2001), and the resulting waveforms were averaged across individuals (see Figs 4, 5, and 6). Statistical analysis of source intensity was accomplished using the same nonparametric bootstrapping procedure as in previous work (Bledowski et al. 2004, 2006; Strobel et al. 2008), with an alpha level of P < 0.02 to correct for the increased number of comparisons. For encoding and retrieval source intensity waveforms, 98% confidence intervals of the difference waveforms between the 3 conditions were calculated using the bootstrap bias-corrected and accelerated (BCa) method (Efron and Tibshirani 1993) with 1000 bootstrap samples. Two conditions were considered to be significantly different at time points where the confidence interval of the difference did not include zero. 
For analysis of delay activity, mean source intensities over three 500-ms time windows during delay were obtained for each RS, and 98% confidence intervals of the differences between mean source intensity for the 3 conditions were determined using the BCa method with 1000 bootstrap samples.
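The two computations in this section, the orientation-independent source intensity and the bootstrap confidence interval of a condition difference, can be sketched in plain Python. This is an illustration under stated assumptions rather than the procedure used in BESA or in the cited work. In particular, it assumes that the resampled units are per-participant differences between two conditions at a single time point (or in a single time window) and that the statistic is their mean. The function names, the seed, and the toy values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def rs_intensity(x, y, z):
    """Root sum of squares of the 3 orthogonal dipole waveforms of one RS."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def bca_interval(diffs, confidence=0.98, n_boot=1000, seed=0):
    """BCa bootstrap confidence interval for the mean of paired differences."""
    rng = random.Random(seed)
    nd = NormalDist()
    n = len(diffs)
    theta_hat = mean(diffs)
    boots = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_boot))

    # Bias correction: share of bootstrap means below the observed mean,
    # clamped so that the inverse normal CDF stays finite.
    prop = sum(b < theta_hat for b in boots) / n_boot
    prop = min(max(prop, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(prop)

    # Acceleration estimated from leave-one-out (jackknife) means.
    total = sum(diffs)
    jack = [(total - d) / (n - 1) for d in diffs]
    jbar = mean(jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    accel = sum((jbar - j) ** 3 for j in jack) / den if den > 0 else 0.0

    alpha = (1.0 - confidence) / 2.0
    bounds = []
    for q in (alpha, 1.0 - alpha):
        z = nd.inv_cdf(q)
        adjusted = nd.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        idx = min(max(int(adjusted * n_boot), 0), n_boot - 1)
        bounds.append(boots[idx])
    return bounds[0], bounds[1]


# Toy differences (condition A minus condition B) for 10 participants.
diffs = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.05, 0.95]
low, high = bca_interval(diffs)
print(rs_intensity([3.0], [4.0], [0.0]), low, high)
```

Under the criterion described above, the two conditions would count as different at that time point if the interval `(low, high)` excludes zero.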
Results
Behavioral Data
As can be seen in Table 1, accuracy was lower in the dual task compared with each of the single tasks. Response times (RTs) for correct response trials and A′ scores were submitted to 1-way repeated-measures ANOVAs with the factor task (color, angle, and dual). Analysis of A′ revealed a significant effect of task, F(2,34) = 7.5, P = 0.002. Post hoc tests (Bonferroni corrected) showed that accuracy was significantly lower in the dual task compared with the angle task (P = 0.046) and the color task (P = 0.004). The RT analysis also found a significant effect of task, F(2,34) = 17.4, P < 0.001; RT was slower in the dual task than in the color (P = 0.006) and angle (P < 0.001) tasks and slower in the color task than in the angle task (P = 0.02).
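The text does not state how A′ was computed. A common choice is the nonparametric sensitivity index in the form given by Grier (1971), and the sketch below assumes that formula. Applied to the group-mean hit and false alarm rates in Table 1, it gives values close to the tabulated A′, although the published values were presumably averaged over individual participants' scores.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric sensitivity index A' (Grier's formula; assumed here).

    0.5 corresponds to chance performance and 1.0 to perfect discrimination.
    """
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance case: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


# Group-mean rates from Table 1 (hits, false alarms).
for task, hits, fas in [("Color", 0.80, 0.23),
                        ("Angle", 0.80, 0.25),
                        ("Dual", 0.77, 0.32)]:
    print(task, round(a_prime(hits, fas), 3))
```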
Table 1
Mean RT in milliseconds on correct response trials, hit rate, false alarm rate, and A′ for each condition

Condition   RT (ms)     Hits          False alarms   A′
Color       1155 (49)   0.80 (0.02)   0.23 (0.02)    0.86 (0.01)
Angle       1042 (52)   0.80 (0.03)   0.25 (0.03)    0.85 (0.02)
Dual        1214 (52)   0.77 (0.04)   0.32 (0.04)    0.81 (0.02)

Note: Standard error is shown in parentheses.
ERP Results
ERP responses are illustrated in Figure 2. The sample shapes (encoding phase; Fig. 2, panel B) and the test shape (retrieval phase; Fig. 2, panel D) both elicited a positive wave (P1) at 141 ms (encoding) and 147 ms (retrieval) with maximum peak amplitude at PO9 during encoding and at O1 during retrieval. This was followed at 203 ms by a negative deflection (N200), which was maximal over lateral parieto-occipital electrodes (PO7 and PO8). The N200 was followed by a large ERP response over parietal electrodes corresponding to the P3b component. The P3b elicited by the sample stimuli at 300 ms was followed by an NSW (Fig. 2, panel C), which was maximal over parietal sites from 800 ms until presentation of the test stimulus. The P3b occurred later and was more sustained during retrieval compared with encoding. Consistent with previous work showing 2 distinct P3b subcomponents (Bledowski et al. 2006; Morgan et al. 2008), the P3b at retrieval was divided into 2 peaks at 357 and 555 ms. Analysis of ERP latencies revealed no significant task differences.
Encoding ERPs
N200 analysis revealed a main effect of task, F(2,34) = 3.8, P = 0.03. N200 amplitude was larger in the color task compared with the dual task (P = 0.02), whereas the angle task did not significantly differ from the color or dual tasks. There were no significant effects on P1 or P3b.
Delay Activity
Analysis of the first time window (800–1300 ms) revealed a main effect of task, F(2,34) = 3.8, P = 0.03. The NSW was larger in the angle task compared with the color task, and this difference was marginally significant (P = 0.07). In the second time window (1300–1800 ms), there was also a main effect of task, F(2,34) = 3.4, P = 0.045, with a larger NSW in the angle task compared with the color task (P = 0.04). The dual task did not significantly differ from either angle or color in the first and second time windows. In the third time window (1800–2300 ms), there were no significant effects.
Retrieval ERPs
There was a significant main effect of task on the late P3b subcomponent, F(2,34) = 4.5, P = 0.02, with reduced amplitude for the color task compared with the angle task (P = 0.04). There were no significant differences on P1, N200, or early P3b at retrieval.
fMRI Results
Random effects statistical maps for all conditions compared with fixation showed significant activation in occipital, inferior temporal, parietal, and frontal areas (see Fig. 3). Although these areas were active in all conditions, there were significant differences between conditions, which are reported in detail by Jackson MC, Morgan HM, Mohr H, Shapiro KL, Linden DEJ (unpublished data). To summarize, the results largely replicate previous fMRI work with a similar paradigm (Mohr et al. 2006), showing increased activity for color in left inferior frontal gyrus (IFG) and right occipital cortex and for angle in right parietal cortex (PC), left temporal cortex, and superior frontal cortex. Interestingly, conjunction activity was intermediate in most areas that differentiated between the color and angle conditions.
Source Localization
The fMRI-guided seeding procedure produced 4 bilateral pairs of RSs in inferior temporal cortex (ITC), PC, PMC, and IFG and 2 single RSs in medial frontal cortex (MFC) and medial occipital cortex (MOC). The right IFG RS was placed in the activity center of a large frontal cluster, which extended into right DLPFC. Placing an additional RS in right DLPFC would have degraded the model fit because of its proximity to right IFG. Locations of the RSs are shown in Figure 3 and Table 2.
Table 2
Talairach coordinates of RSs

RS            x     y     z
MOC           2   −74    17
Left ITC    −30   −57    −1
Right ITC    29   −58     1
Left PC     −27   −43    43
Right PC     26   −47    43
Left PMC    −25    −4    53
Right PMC    28     0    51
MFC           2    17    40
Left IFG    −28    18     9
Right IFG    32    19    10
Model Validation
The amount of cross talk between sources was calculated by applying the source model to a simulated data set consisting of temporally nonoverlapping activity in all RS locations. Cross talk analysis was then performed by computing the proportion of variance in each source waveform that was caused by activity in all other RS locations. Mean cross talk for all RSs in the model was 12%, indicating that there was sufficient separation between sources (e.g., Bledowski et al. 2006). Moreover, analysis of peak latencies across the entire trial found no significant correlations between different RSs (all r values < 0.40, P values > 0.1).
Encoding
The source model explained on average 98.7% (color and angle) and 98.8% (dual) of the scalp ERP variance. Figure 4 shows the grand-average source intensity waveforms for each RS and scalp voltage topography at the latency of RS peak intensity. Because the ITC source intensity waveforms showed 2 clear peaks at around 194 and 320 ms, scalp topographies at both peak latencies are shown. These different scalp topographies generated in ITC likely reflect separate stages of object recognition, such as perceptual encoding followed by identification. Note that a third peak occurred around 200 ms after the offset of the sample stimuli and had the same scalp topography as the first ITC peak. Analysis of scalp projections and peak latencies showed that the N200 was mainly generated in ITC, with smaller contributions from PC. The P3b was associated with activity in ITC and MOC, whereas IFG appeared to contribute to increased negativity over parietal scalp regions in the time range of the P3b. The contributions of the PMC and MFC sources during encoding were minimal.

The bootstrap analysis showed that the increased N200 for the color condition was generated by significantly stronger activity in left ITC for the color task compared with the angle and dual tasks. Activity in left ITC was also significantly higher for the color and dual tasks compared with the angle task between 298 and 357 ms, and MOC activity was significantly increased for the angle task relative to the color and dual tasks between 261 and 280 ms. However, these differences in source activities did not produce significant ERP differences, possibly because they canceled each other out. Left IFG activity was significantly higher for the color task compared with the angle and dual tasks between 355 and 432 ms, and right IFG activity was significantly greater for the color task compared with the angle task between 406 and 428 ms and compared with the dual task between 364 and 439 ms. 
As the IFG source contributed to negativity over parietal scalp regions, this corresponds to the reduction in the later part of the P3b for the color task. To examine whether this ERP reduction was significant, mean amplitudes of the parietal electrodes (CPz, P1, Pz, P2, and POz) between 340 and 430 ms following the onset of the sample stimuli were submitted to a repeated-measures ANOVA. This revealed a significant effect of task, F(2,34) = 6.2, P = 0.005, with reduced amplitude in the color task compared with the angle task (P = 0.08) and the dual task (P = 0.009).
Delay
The source model explained on average 98.2% (color), 98.5% (angle), and 98.4% (dual) of the scalp ERP variance. Figure 5 shows the grand-average source intensity waveforms for each RS and scalp voltage topography at the latency of RS peak intensity. Confidence intervals of the difference in source activity between the pre-encoding baseline and the delay interval showed that source activity was significantly greater than baseline throughout the delay interval for each RS and each condition. The ITC and MOC sources showed initially strong activity, which decreased throughout the delay interval. The ITC sources contributed to a negative lateral parieto-occipital scalp topography, whereas the MOC source generated a positive central parieto-occipital scalp topography. Activity of the parietal and frontal sources was sustained throughout the delay period. The left IFG source produced a scalp topography that was negative over left inferior frontal scalp sites, whereas the right IFG source was associated with positivity over the right lateral frontal scalp. The PC sources generated negativity at parietal scalp sites, and the PMC and MFC sources contributed to negative scalp voltage over bilateral central sites.
The bootstrap analysis found that during the first part of the delay (800–1300 ms), ITC activity was significantly greater for the angle task compared with the color task (right ITC only) and the dual task (left ITC only). Right PC activity was significantly greater for the dual compared with the color task during the second part of the delay (1300–1800 ms) and significantly greater for both angle and dual tasks compared with color during the final part of the delay (1800–2300 ms). This suggests that the NSW increase for the angle and dual tasks during delay was generated initially by activity in ITC and then by activity in right PC.
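The logic of comparing source activity between tasks with bootstrap confidence intervals can be sketched as follows. The exact resampling scheme is not described in this section, so the sketch assumes a paired percentile bootstrap across participants on the mean source intensity within one time window. The function names `bootstrap_diff_ci` and `is_significant` and all numerical values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_diff_ci(cond_a, cond_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (a - b).

    cond_a, cond_b: one value per participant (same order in both lists),
    e.g. mean source intensity within a delay window.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Resample participants with replacement and recompute the mean difference.
    boot_means = sorted(
        mean(rng.choice(diffs) for _ in range(n)) for _ in range(n_boot)
    )
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def is_significant(ci):
    """A difference is treated as significant if the CI excludes zero."""
    lower, upper = ci
    return lower > 0 or upper < 0


# Hypothetical per-participant source intensities for two tasks in one window.
angle = [3.0, 3.2, 2.9, 3.1, 3.3, 3.0]
color = [1.0, 1.1, 0.9, 1.2, 1.0, 1.1]
ci = bootstrap_diff_ci(angle, color)
print(ci, is_significant(ci))
```

Applying the same comparison separately to each source, task pair, and time window would give a pattern of window-specific effects of the kind reported above, although the sketch ignores any correction for multiple comparisons.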
Retrieval
The source model explained on average 98.8% (color) and 98.7% (angle and dual) of the scalp ERP variance. Figure 6 shows the grand-average source intensity waveforms for each RS and scalp voltage topography at the latency of RS peak intensity. Because the ITC source intensity waveforms showed 3 clear peaks, scalp topographies at all peak latencies are shown. As already noted, these probably reflect the hierarchical process of object recognition. The MOC waveform also contained 2 peaks at 149 and 244 ms, but because the topographies were similar, only the first is shown. Similar to encoding, the main contribution was from the ITC sources, with smaller contributions from MOC, PC, and IFG, and minimal activity in MFC and PMC.
ITC activity peaked first around 143–144 ms and was associated with a positive scalp deflection at occipital scalp sites, corresponding to the P1. The first MOC peak at 149 ms also appeared to contribute to P1, as the scalp projection revealed a central occipital positivity. The second ITC peak occurred around 197–199 ms and generated a negative scalp topography over lateral parieto-occipital sites, corresponding to the N200. The N200 also appeared to be related to PC activity, as the PC sources showed a negative deflection over parieto-occipital sites at 209–215 ms. The main generators of the P3b appeared to be the ITC sources, which showed sustained activity associated with positive scalp topographies over parietal sites in the time range of both early and late P3b subcomponents. The IFG sources also showed sustained activity over the time range of the early and late P3b, but the scalp projection showed that these sources generated a negative scalp deflection over parietal sites, suggesting that IFG activity contributed to a reduction in P3b amplitude, similar to encoding.
The bootstrap analysis showed that the numerically, but not significantly, increased early scalp P3b amplitude for the color task was due to significantly stronger source activity in bilateral ITC for the color task compared with the angle task between 289 and 364 ms. Left IFG activity was significantly higher for the color relative to the angle task from 413 to 469 ms, which may have contributed to the reduction in the amplitude of the late P3b in the color task.
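The percentage of scalp ERP variance explained by a source model, as reported for each task phase, is commonly computed as 100% minus the residual variance between the recorded and the model-predicted scalp data. The Python sketch below assumes that definition, with the residual normalized by the summed squared recorded signal. The function name `explained_variance` and the example values are hypothetical, and the source analysis software used in the study may differ in detail.

```python
def explained_variance(measured, modeled):
    """Percentage of variance in `measured` accounted for by `modeled`.

    measured, modeled: lists of rows (one row per channel), each row a list
    of samples over time; both must have the same shape.
    """
    # Sum of squared residuals between recorded and model-predicted data.
    ss_residual = sum(
        (m - p) ** 2
        for m_row, p_row in zip(measured, modeled)
        for m, p in zip(m_row, p_row)
    )
    # Sum of squared recorded data (signal power).
    ss_signal = sum(m ** 2 for m_row in measured for m in m_row)
    return 100.0 * (1.0 - ss_residual / ss_signal)


# Hypothetical 2-channel, 3-sample example.
recorded = [[1.0, 2.0, 3.0], [-1.0, -2.0, -3.0]]
predicted = [[1.1, 1.9, 3.0], [-0.9, -2.1, -3.0]]
print(explained_variance(recorded, predicted))
```

Under this definition, a value near 98 to 99% means that the residual left unexplained by the regional sources amounts to only about 1 to 2% of the signal power.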
Discussion
This study combined ERP and fMRI recordings from the same participants to decompose the processing stages underlying WM for colors, angles, and color–angle conjunctions. ERP topographies were largely similar across the 3 tasks, but task-related amplitude modulations were observed for the N200 at encoding, the NSW during delay, and the P3b at retrieval. We now discuss these findings in the context of previous work and models of WM architecture.
Task Performance
The finding of a slight dual task cost appears inconsistent with the idea that visual and spatial WM processes operate in parallel (Baddeley and Hitch 1974) and with previous work showing no dual task cost for WM manipulation of colors and angles (Mohr and Linden 2005). However, there are 2 possible explanations for this result. First, in the study of Mohr and Linden (2005), the single visual and spatial tasks were not matched in terms of difficulty, and a dual task cost was defined as lower performance in the dual task compared with the more difficult of the 2 single tasks. By contrast, the color and angle tasks in the current study were closely matched in terms of difficulty. Second, most previous work showing equivalent WM performance for single tasks (of equal difficulty) and visual–spatial conjunctions has examined WM maintenance (e.g., Munk et al. 2002; Sala and Courtney 2007), whereas the current study required manipulation of WM contents. Recent work has shown that WM consists of both content- and task-specific components, with manipulation recruiting frontoparietal regions associated with cognitive control (Mohr et al. 2006). This suggests that manipulation relies more on executive processes than maintenance; therefore, the dual task cost observed in the present study may reflect increased demands on limited central executive resources in the dual task relative to the single tasks.
Encoding
Previous work has shown that selective attention to color is associated with a posterior negativity between 150 and 300 ms following stimulus onset (Hillyard and Münte 1984), which may be generated by sources in occipitotemporal regions (Anllo-Vento et al. 1998). Indeed, in the current study, enhancement of the parieto-occipital N200 component in the color task was associated with increased activity in the left ITC source. This corresponds to fMRI findings of color-selective regions in ventral occipitotemporal cortex, which show increased activity when color is task relevant (Beauchamp et al. 1999).
The P3b was generated mainly by sources in ITC and MOC, whereas IFG activity was associated with P3b suppression. During the early part of the P3b, left ITC activity was reduced and MOC activity was increased for the angle task, which is consistent with previous work showing differences in P3b topography for object and location information (Mecklinger and Müller 1996). However, these task differences in source activities produced only nonsignificant differences in the topography or amplitude of scalp ERPs, possibly because the ERP measures were not sufficiently sensitive to detect these effects. During the later part of the P3b, increased IFG activity contributed to a significant reduction in scalp P3b amplitude for the color task. Increased IFG activity in the color condition is compatible with the view that IFG, part of the ventrolateral PFC, is part of the ventral system for visual WM (Linden 2007a). Recent work has shown that WM encoding is faster for colors compared with orientations or shapes but is slowed when orientations or shapes must also be encoded in a conjunction task (Woodman and Vogel 2008). PFC has been implicated in WM consolidation processes, such as chunking and compressing information (Bor et al. 2003). Therefore, the increased IFG activity in the later part of the P3b time range during the color task compared with the dual task likely reflects the more efficient WM consolidation of colors compared with color–angle conjunctions. Importantly, these task-related modulations of N200 and P3b show that participants were not simply encoding all stimulus attributes regardless of the task.
Delay
As expected, the P3b elicited by the encoding array was followed by a parieto-occipital NSW, which began around 300 ms after the offset of the encoding array and was sustained throughout the delay period. This corresponds to previous reports that the NSW is largest over posterior occipital and parietal areas during maintenance of visuospatial information (Barrett and Rugg 1989; Lang et al. 1992; Ruchkin et al. 1992). Of most interest, the NSW was increased in the angle task relative to the color task. Increases in task difficulty, regardless of information content, have been shown to increase NSW amplitude (Ruchkin et al. 1995; Woodman and Vogel 2008). However, this does not explain the current results, as the behavioral data show that the color and angle tasks were of equal difficulty.
Rather, the NSW modulation observed here may reflect the recruitment of different WM subsystems for visual and spatial information. These results are consistent with previous work reporting increases in the amplitude of the parieto-occipital NSW during maintenance of spatial locations compared with objects (Mecklinger and Müller 1996) and during maintenance of orientations and color–orientation conjunctions compared with colors (Woodman and Vogel 2008). Importantly, although spatial manipulation (mental rotation and size scaling) in WM has previously been shown to produce a parieto-occipital NSW (Rösler et al. 1995), the current study provides the first evidence for content-specific differences in NSW amplitude during WM manipulation.
Source analysis revealed that the NSW was generated by sustained activity in parietal and frontal sources throughout the delay period, together with initially high activity in ITC and MOC, which decreased over the delay. As can be seen in Figure 2, the NSW was initially maximal over the inferior parieto-occipital scalp region and then moved toward the vertex as the delay period progressed. This supports the idea that the NSW is generated by cortical areas close to the recording site (Birbaumer et al. 1990; Rösler et al. 1997). The increased NSW in the angle task was indeed associated with increased activity in ITC during the first part of the delay and in right PC during the final part of the delay, which may reflect the transfer of information from early visual processing to parietal areas supporting spatial transformations (Kosslyn et al. 1998). In particular, the transformational processes underlying mental rotation are thought to be nonsymbolic and analogue in nature, resulting in a right hemisphere bias in tasks involving mental rotation (Corballis 1997).
There is evidence to suggest that the medial temporal lobe (MTL) is involved in WM maintenance of object–location conjunctions, but not in maintenance of object–color conjunctions or single visual or spatial items (Olson et al. 2006; Piekema et al. 2006). The present study did not find any contribution of MTL in WM maintenance of color–angle conjunctions. Although computing an intermediate angle has a spatial component, this intermediate angle might be maintained in a within-object frame-of-reference. That is, MTL may only be involved in WM for object–location bindings, which require a between-object frame-of-reference. Other work suggests that parietal activity during WM maintenance reflects retrospective sensory coding of space (Curtis 2006), and it has been proposed that MTL is involved in WM for relational information only when it is not possible to maintain this information in viewer-centered coordinates (Hannula and Ranganath 2008).
Retrieval
There were no task-related modulations of P1 or N200 during retrieval. Whereas the P3b at encoding was relatively brief, the P3b elicited by the test stimulus consisted of a brief peak followed by a broader peak. This is consistent with recent work showing that in complex tasks, the P3b may consist of 2 subcomponents, thought to reflect stimulus evaluation in posterior brain regions followed by executive processes in prefrontal and parietal areas (Bledowski et al. 2006; Morgan et al. 2008). Indeed, the early P3b subcomponent was generated mainly in ITC, with a significantly enhanced source activity in the color task. Although the late P3b subcomponent also had a strong ITC contribution, the amplitude reduction in the color task was associated with increased IFG activity. This finding is similar to the IFG-generated amplitude reduction in the late part of the P3b during encoding, which seems plausible in the context of work showing that WM consolidation and response selection depend on the same resources (Jolicœur and Dell'Acqua 1998, 1999; Hommel and Doeller 2005).
The current findings are consistent with the idea that initial posterior activity is followed by anterior executive processing in situations when WM retrieval cannot proceed automatically and prefrontal areas are recruited for disambiguation of relevant information (Kostopoulos and Petrides 2003). However, this only seems to be the case when WM retrieval is exogenously triggered in the absence of any interference. Other work has shown that if WM retrieval is endogenously initiated following interference during the delay period, then anterior executive processing precedes posterior activation (Kessler and Kiefer 2005).
Color versus Angle
Our data partly support a modular model of visual processing where color information is handled by areas in the “ventral” stream (e.g., IFG) and spatial information by areas in the “dorsal” stream (e.g., posterior PC) (Mohr et al. 2006). However, such a simple spatial dissociation cannot explain all our findings, considering the clear temporal dissociation, with color-specific effects appearing during encoding and retrieval and angle-specific effects mainly during the delay. This temporal differentiation may be more related to the manipulation than the maintenance component of the present task. That is, computation of a color blend may proceed relatively quickly compared with computation of an intermediate angle. As already noted, information about an object's color can be processed faster than spatial information (e.g., Woodman and Vogel 2008). Therefore, computation of the intermediate color could be achieved in the presence of the sample display. This color could then be maintained with relatively little effort and compared against the test stimulus at retrieval. Conversely, computation of the intermediate angle, which involves analogue transformation (Corballis 1997), may be a more incremental process that requires longer coactivation of visual and (right) parietal areas, resulting in the angle- and conjunction-related parietal activity during the delay. Alternatively, it is possible that neural signatures of color maintenance during WM delays may not be evident in ERP measures. Indeed, other work has also shown an increased parieto-occipital NSW during maintenance of spatial compared with object information when no transformations were required (Mecklinger and Müller 1996; Woodman and Vogel 2008). Future work can examine the spatiotemporal characteristics of manipulation compared with maintenance of visual features.
Conclusions
The integration of fMRI and ERP data recorded during a WM task requiring mental manipulation of colors and angles revealed a progression in neural activity from posterior visual areas to higher order areas in the ventral and dorsal processing streams, which showed content-specific modulations in activity. Interestingly, this progression was not confined to the task phases in which a visual stimulus was present (encoding and retrieval) but was also observed for the NSW during the delay interval, which moved from occipitotemporal to parietal scalp regions. This latter observation suggests that memory consolidation, at least for spatial material, proceeds from visual to higher order transformational representations.
Color processing was associated with activity in ITC and IFG during encoding and retrieval, whereas angle processing involved right parietal regions during the delay interval. The right PC is strongly associated with visuospatial operations both in the neuropsychological and the neuroimaging literature (Sack et al. 2002). In accordance with other findings (Munk et al. 2002; Sala and Courtney 2007; Woodman and Vogel 2008), no additional neural processes appeared to be involved in WM for color–angle conjunctions. This is consistent with a biased competition model of WM architecture, in which integrated representations are formed by competitive interactions between visual and spatial processing streams (Sala and Courtney 2007). Our study also illustrates the importance of using integrated fMRI/ERP methods to examine brain activity during cognitive tasks (Linden 2007b) by revealing the spatiotemporal properties of the neural mechanisms of WM for stimulus features and conjunctions. Interestingly, neural signatures of color processing were observed at encoding and retrieval, whereas angle-specific effects were only found during the WM delay interval. In addition, the finding that IFG activity contributed to posterior scalp negativity shows that the generators of scalp ERP effects are not necessarily located in cortical structures close to the recording electrodes. In summary, our results support the view that the neural systems for visual and spatial WM differ both in space and time and that feature conjunctions are remembered through a dynamic interplay of these systems.
Funding
The Wellcome Trust (grant number 077185/Z/05/Z); Wales Institute for Cognitive Neuroscience.