Effect of pattern awareness on the behavioral and neurophysiological correlates of visual statistical learning.

Sonia Singh¹, Jerome Daltrozzo¹, Christopher M Conway^1,2.

Abstract

Statistical learning is the ability to extract predictive patterns from structured input. A common assumption is that statistical learning is a type of implicit learning that does not result in explicit awareness of learned patterns. However, there is also some evidence that statistical learning may involve explicit processing to some extent. The purpose of this study was to examine the effect of pattern awareness on behavioral and neurophysiological correlates of visual statistical learning. Participants completed a visual learning task while behavioral responses and event-related potentials were recorded. Following the completion of the task, awareness of statistical patterns was assessed through a questionnaire scored by three independent raters. Behavioral findings indicated learning only for participants exhibiting high pattern awareness levels. Neurophysiological data indicated that only the high-pattern awareness group showed expected P300 event-related potential learning effects, although there was also some indication that the low awareness groups showed a sustained mid- to late-latency negativity. Linear mixed-model analyses confirmed that only the high awareness group showed neurophysiological indications of learning. Finally, source estimation results revealed left hemispheric activation was associated with statistical learning extending from frontal to occipital and parietal regions. Further analyses suggested that left insula, left parahippocampal, and right precentral regions showed different levels of activation based on pattern awareness. To conclude, we found that pattern awareness, a dimension associated with explicit processing, strongly influences the behavioral and neurophysiological correlates of visual statistical learning.

Keywords: P300; awareness; event-related potential; explicit processing; implicit learning; statistical learning

Year: 2017 PMID: 29877520 PMCID: PMC5858025 DOI: 10.1093/nc/nix020

Source DB: PubMed Journal: Neurosci Conscious ISSN: 2057-2107

Introduction

Statistical learning is the ability to extract statistical associations or predictive patterns from structured input. Statistical learning can be used to infer sequential probabilities among ordered elements in the environment (Saffran ). For example, in natural language, linguistic units (e.g. phonemes, syllables, and words) are arranged in a non-random sequence according to the specific language’s phonology, phonotactics, semantics, and syntax. Statistical learning can occur even after relatively brief exposure times (Saffran ; Aslin ; Fiser and Aslin 2002; Kirkham ), allowing the extraction of statistical structures to anticipate and predict future events (Conway ).

Article Highlights

The effect of pattern awareness on the behavioral and neural correlates of visual statistical learning was explored.

Analyses of the behavioral and neurophysiological data showed that the level of awareness of the individuals for the underlying statistical patterns, assessed through a questionnaire following the completion of the task, was closely associated with behavioral and neurophysiological indications of learning.

Specifically, the findings demonstrated that only the participants with a high level of pattern awareness showed clear evidence of statistical learning as measured by response times and event-related potentials.

Source estimation results further indicated that statistical learning was associated with left hemispheric activation in a network spanning occipital, parietal, and frontal regions, with increased activation observed for participants demonstrating high pattern awareness relative to low pattern awareness in a subset of brain regions including left parahippocampal cortex.

Overall, these findings show that pattern awareness and learning ability are closely linked, suggesting that statistical learning in this task is mediated largely by explicit processes.

Traditionally, statistical learning is believed to be a type of implicit learning, occurring in the absence of explicit pattern awareness of the underlying structure that needs to be learned (Reber 1989; Perruchet and Pacton 2006; Reber 2013). Implicit learning can be defined as learning without the intention to learn or without conscious awareness of the knowledge that has been acquired (Cohen and Squire 1980; Reber and Squire 1994; Travers ; Jeste ). However, there is also evidence that statistical learning may involve explicit processing to some extent (Turk-Browne ; Wessel ; Daltrozzo and Conway 2014). In particular, a number of studies are in line with the assumption that statistical learning involves both implicit and explicit mechanisms (Batterink ; for a recent review, see Daltrozzo and Conway 2014). Although some findings imply that attention to stimuli is not required for statistical learning (Saffran ), others indicate that attention improves both visual (Turk-Browne ) and auditory learning (Toro ; Emberson ). More specifically, Turk-Browne concluded that visual statistical learning is both automatic and intentional, meaning, although attention is a prerequisite for relevant stimulus selection, subsequent learning occurs without intent or awareness. In addition, Hendricks utilized a dual-task paradigm involving a working memory task in conjunction with an artificial grammar learning task to dissociate (visual) automatic and intentional learning. They found that although some aspects of visual statistical learning are relatively automatic, making direct grammaticality judgments at test as well as transferring knowledge to perceptually dissimilar stimulus sets both appeared to depend on explicit processing resources. The extent to which statistical learning results in awareness of the learned information is also controversial. 
On the one hand, several researchers have suggested that the relationship between statistical learning performance and awareness may not be so clear, because implicit learning can occur independently of explicit awareness (Curran and Keele 1993; Goschke 1998; Song ). Others, such as Cleeremans (2006), propose a more nuanced view, where statistical learning affects awareness but only under specific circumstances. Specifically, mental representations obtained from exposure to a sequence might only result in awareness when the strength of activation of these representations reaches some critical level. Consequently, statistical learning without awareness may ensue whenever these representations are poorly activated. Findings from another recent study suggested that measures of learning that primarily target the explicit knowledge of sequences (e.g. recognition judgment and familiarity ratings) were not as sensitive as other indirect indices, such as response times, that do not rely solely on awareness (Batterink ). According to another view, the relationship between statistical learning and awareness may be bidirectional. In particular, for participants who become aware of the existence of structured sequences, their level of intention to learn the sequence structures might modulate learning (Rüsseler ). There are other reasons to believe that awareness of the to-be-learned patterns can affect performance. For example, Cleeremans (1993) proposed an information processing model of statistical learning by building on the simple recurrent network (Cleeremans and McClelland 1991; Cleeremans 1993). The guiding hypothesis behind the model is that awareness of sequence structure alters the nature of the task: rather than anticipating subsequent events from their temporal context, participants switch to retrieving upcoming events from short-term memory. 
Here, performance is contingent on attentional resources, and such dependence could result in degradation of output, especially when memory representations are less reliable (during dual-task performance). Their main findings were that explicit knowledge may enhance implicit learning and also that participants will attempt to utilize explicit knowledge whenever accessible. A different study by McIntosh involved participants who were either aware or unaware of whether a tone predicted a visual event. Aware and unaware participants showed different responses both behaviorally and neuroanatomically (as measured by regional cerebral blood flow). Several of the interacting brain areas (left prefrontal cortex, contralateral prefrontal cortex, sensory cortex, and cerebellum) showed changes in functional connectivity that also correlated with the awareness of participants.

The present study

The purpose of the present study was to explore the behavioral and neurophysiological effects of pattern awareness on statistical learning. To achieve this aim, we measured event-related potentials (ERPs) in response to a visual statistical learning task following a paradigm similar to that of Jost . The task involved the presentation of a series of visual stimuli wherein “target” stimuli could be predicted with varying probability by the preceding “predictor” stimulus. ERPs were recorded to two different types of predictor, cueing the target with either high predictability (HP) or low predictability (LP). Jost found that a greater P300 component was observed for the HP relative to the LP stimuli following learning. In the present study, this ERP effect served as the dependent variable, and pattern awareness, as assessed through a questionnaire after completion of the statistical learning task, served as the independent variable. We hypothesized that participants who show more awareness of the underlying statistical patterns would also show the largest learning effects in terms of behavioral response times and ERPs. We also incorporated source estimation analyses to explore the activation of brain areas during the statistical learning task and to determine how activation was affected by the level of pattern awareness of participants.

Methods

Participants

A total of 34 participants from Georgia State University (27 females, aged 18–49 years; M = 22.4 years, SD = 6.3) without any language, neurological, or psychological deficits participated in the study for class credit. All participants were right-handed according to the Edinburgh Handedness Inventory (Oldfield 1971), except seven (3 left-handed and 4 ambidextrous). All participants were native English speakers. None of them spoke, wrote, read, or understood Chinese (some of the stimuli were Chinese characters, see Visual Statistical Learning Task section below). Participants were recruited from the local University online recruiting system and provided written informed consent to participate. The study was approved by the local ethics committee (The Institutional Review Board of Georgia State University).

Visual statistical learning task

Participants were administered a visual statistical learning task similar to that used in Jost and Daltrozzo while ERPs were recorded (see the Electroencephalography Acquisition section below). To discourage verbal rehearsal or naming of the stimuli (and to increase the difficulty of the learning task), the task used non-verbal and unfamiliar characters from standard Chinese script (common to Mandarin and Cantonese), and participants were only allowed to participate if they were unfamiliar with this script. As a follow-up, after the task, participants were asked whether they used verbal labels during the task. In the statistical learning task, a set of traditional Chinese characters was presented to participants interspersed with target face stimuli (Fig. 1). The target faces could be either “happy” or “unhappy.” The target was assigned at the beginning of the experiment and, once chosen, was used throughout the experiment for that participant. Fifty percent of participants saw a happy smiley face as the target, and the remainder saw the unhappy smiley as the target (i.e. none of the participants ever saw both happy and unhappy targets during the experiment). Both happy and unhappy faces were used to balance any spontaneous emotion elicited by the stimuli (Halberstadt ) that might affect emotion perception and in turn affect learning across participants. The task of the participants was to press a button as fast as possible when they saw the target. Unbeknownst to the participants, the target followed either a high-probability predictor (HP) on 90% of trials or a low-probability predictor (LP) on 20% of trials (Fig. 1). The predictors and the target were presented within a stream of standard (S) items. For each participant, HP, LP, and S were pseudo-randomly assigned to 1 of the 6 Chinese characters displayed on the top panel of Fig. 1. As in Jost , the participant was expected to learn the statistical relationship between the predictors and the target. To encourage incidental learning, participants were not told about the underlying probabilities governing the co-occurrence of the predictors and the target.

Figure 1.

Statistical rules of the sequential learning task. The target (T) followed either a high predictability predictor (HP) on 90% of trials or a low predictability predictor (LP) on 20% of trials. These two items were presented within a stream of standard (S) items. For each participant, HP, LP, and S were randomly assigned to one of the six Chinese characters displayed on the top panel of the figure. For each participant, T was pseudo-randomly assigned to one of the two smileys, see top panel. Bottom panel shows stimulus onset asynchrony (SOA) and inter-stimulus intervals (ISI).

Each predictability condition (HP and LP) was presented 50 times. All trials were continuous and pseudo-randomly ordered across the two predictability conditions, so that participants encountered a seamless presentation between trials. Each participant was presented with a total of 100 trials, divided into 5 blocks of 20 trials each. A break of 30 s was given between each block. Stimuli were presented electronically using E-Prime 2.0.8.90 software (Psychology Software Tools, Pittsburgh, PA, USA) on a Dell Optiplex 755 computer. All visual stimuli were presented in white in the center of the computer screen on a dark background. Stimuli were displayed for 500 ms, followed by a dark screen, which was displayed for an additional 500 ms (inter-stimulus interval was 500 ms). Thus, the visual stimuli were presented with a 1000-ms stimulus onset asynchrony.
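
The trial structure described above can be sketched in Python. This is a minimal illustration, not the E-Prime script used in the study: the function names, the fixed seed, and the choice to fix the number of target-followed trials per condition at exactly 90% and 20% (rather than sampling each trial independently) are assumptions. The standard (S) items that fill the stream are omitted because their number per trial is not stated.

```python
import random

STIMULUS_MS = 500   # stimulus duration (from the text)
ISI_MS = 500        # blank screen between stimuli (from the text)
SOA_MS = STIMULUS_MS + ISI_MS  # 1000-ms stimulus onset asynchrony


def build_trials(n_per_condition=50, p_target_hp=0.9, p_target_lp=0.2, seed=0):
    """Return a shuffled list of (predictor, target_follows) pairs.

    Assumption: exact proportions are enforced per condition, so 45 of the
    50 HP trials and 10 of the 50 LP trials are followed by the target.
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    trials = []
    for predictor, p in (("HP", p_target_hp), ("LP", p_target_lp)):
        n_target = round(n_per_condition * p)
        outcomes = [True] * n_target + [False] * (n_per_condition - n_target)
        trials.extend((predictor, outcome) for outcome in outcomes)
    rng.shuffle(trials)  # pseudo-random order across both conditions
    return trials


def split_into_blocks(trials, n_blocks=5):
    """Divide the trial list into equally sized consecutive blocks."""
    size = len(trials) // n_blocks
    return [trials[i * size:(i + 1) * size] for i in range(n_blocks)]


trials = build_trials()
blocks = split_into_blocks(trials)
```

With the default arguments this yields 100 trials in 5 blocks of 20, matching the counts given in the text.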

Electroencephalography acquisition

During the statistical learning task, electroencephalography (EEG) data were recorded from 256 scalp sites using an Electrical Geodesic Inc. (EGI) sensor net (Fig. 2) and were preprocessed using Net Station Version 4.3.1, with subsequent processing using custom scripts written in Matlab (version R2012b 8.0.0783, The MathWorks) and the EEGLAB toolbox (version 10.2.2.2.4a; Delorme and Makeig 2004). Electrode impedances were kept below 50 kΩ. The EEG was acquired with a 0.1- to 100-Hz band-pass filter at 250 Hz with vertex reference and then rereferenced to the average reference of all sensors and low-pass filtered at 30 Hz. Signals containing non-stereotypical artifacts, including high-amplitude, high-frequency muscle noise and electrode cable movements, were rejected (∼25% of trials). Prior to segmentation, stereotypical artifacts, such as vertical eye blinks and horizontal eye movements, were corrected with an extended Infomax independent component analysis using EEGLAB (Lee ). The continuous EEG was then segmented into epochs from −200 ms to +1000 ms relative to predictor onset. ERPs were baseline corrected using the 200 ms of prestimulus data. Individual ERPs were computed for each participant, predictor type, and electrode. All experimental sessions were conducted in a 132-square-foot double-walled, soundproof acoustic chamber.
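
The segmentation and baseline-correction steps can be illustrated with a short Python sketch for a single channel. It is not the Net Station/EEGLAB pipeline used in the study. The function names are hypothetical, and the code assumes that the 250 Hz figure above is the sampling rate, so the 200-ms baseline spans 50 samples and the full epoch spans 300 samples.

```python
SAMPLING_RATE_HZ = 250  # assumption: "250 Hz" in the text is the sampling rate


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * rate_hz / 1000))


def extract_epoch(signal, onset_index, pre_ms=200, post_ms=1000,
                  rate_hz=SAMPLING_RATE_HZ):
    """Cut one epoch around a predictor onset and baseline-correct it.

    signal      -- list of voltages for one channel (continuous recording)
    onset_index -- sample index of the predictor onset
    Returns the epoch with the mean of the prestimulus interval subtracted.
    """
    pre = ms_to_samples(pre_ms, rate_hz)
    post = ms_to_samples(post_ms, rate_hz)
    start, stop = onset_index - pre, onset_index + post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    baseline = sum(epoch[:pre]) / pre  # mean of the 200-ms prestimulus data
    return [value - baseline for value in epoch]
```

Averaging such epochs within each participant, predictor type, and electrode would give the individual ERPs described above.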

Figure 2.

Electrical Geodesic Inc. sensor net with the nine ROIs highlighted [left (LAn), middle (FRz), and right anterior (RAn); left (LCn), middle (CNz), and right central (RCn); and left (LPo), middle (POz), and right posterior (RPo) regions used for recording cortical activity].

Pattern awareness questionnaire

After the statistical learning task, the EEG electrode net was removed, and the participants completed a questionnaire to assess their level of pattern awareness (see Table 1). Pattern awareness levels were obtained from an inter-rater agreement among three independent scorers of the participants’ responses. Each rater was requested to provide a score of 1 (“low awareness” to the statistical rules/patterns embedded in stimuli sequences of the statistical learning task) or 2 (“high awareness” of the pattern). The inter-rater reliability was 96.5% (Cronbach’s alpha). For each participant, the final pattern awareness score was the mean of the scores of the three raters. The participants were then separated into two groups with n = 17 in each group based on a median split of these mean scores of pattern awareness: the group of high pattern awareness and the group of low pattern awareness.
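
The scoring and grouping procedure can be sketched as follows. This is an illustration only: the function name and participant labels are hypothetical, and the text does not say how participants with tied mean scores were assigned, so breaking ties by participant label is purely an assumption.

```python
from statistics import mean


def median_split(rater_scores):
    """Split participants into low and high pattern awareness groups.

    rater_scores -- dict mapping a participant label to the three rater
                    scores (each 1 = low awareness, 2 = high awareness)
    Returns (low_group, high_group) as lists of participant labels.
    """
    mean_scores = {p: mean(scores) for p, scores in rater_scores.items()}
    # Assumption: ties in mean score are broken by participant label.
    ordered = sorted(mean_scores, key=lambda p: (mean_scores[p], p))
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]
```

With 34 participants, this returns two groups of 17, as in the study.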

Table 1.

Pattern awareness questionnaire

Pattern Awareness Questionnaire

Think about the task with Chinese characters you did. Did you notice anything about the Chinese characters? Tell me about your perception of the task. [Verbatim record]

Do you think the Chinese characters were occurring randomly? [If the participant says no, ask to explain how the characters were non-randomly displayed.]

Was there a pattern or anything regular in the order that the Chinese characters were presented?

Was there a Chinese character that usually came before the target (the smiley face you were looking for)?

If you noticed a pattern, at what point did you notice it? Before the 1st break, after the 1st break, after the 2nd break, after the 3rd break, or after the 4th break?

The two groups did not differ significantly in terms of age [high pattern awareness: M = 22.47, SD = 7.15; low pattern awareness: M = 22.29, SD = 5.86; Uʹ = 154.5, P = 0.74, Mann–Whitney, two tailed]. In addition, before the statistical learning task, the executive control capacity of the participants was assessed with the Flanker task (Eriksen and Eriksen 1974), which is commonly used to test executive control (Fan ). There was no significant difference in Flanker performance between the high pattern awareness (n = 16; M = 2.187, SD = 87.817) and the low pattern awareness groups [n = 16; M = −58.000, SD = 89.026; after removing 2 outliers and using the centered Flanker difference scores; incongruent minus congruent trials; F(30, 1) = 0.500, P = 0.832; η = 0.938].

Analyses of the ERPs

We applied a linear mixed model (LMM) (West ) to our pattern awareness and statistical learning data at the single-trial level. (The LMM was performed with R (version 3.1.2) using the lmer() function of the lme4 library, Bates .) The LMM is increasingly used to analyze EEG data comprising such large data sets (Bagiella ; Davidson and Indefrey 2007; Moratti ; Pritchett ; Wierda et al. 2010; Newman ). The LMM offers advantages over traditional repeated-measures analyses of variance (ANOVAs), including richer modeling of random effects with, for instance, multiple, crossed, and/or nested random effects (Newman ). As a result, this model allows increased accuracy and external validity of the parameter estimates. Another important advantage of the LMM over an ANOVA approach is its ability to handle missing data and non-sphericity. Thus, unlike the ANOVA model, there is no need for subsequent correction for non-sphericity (e.g. Greenhouse–Geisser or Huynh–Feldt; Bagiella ; Baayen ). To analyze the effect of the cortical topography, nine regions of interest (ROIs; see Fig. 2) were defined: left (LAn), middle (FRz), and right anterior (RAn); left (LCn), middle (CNz), and right central (RCn); and left (LPo), middle (POz), and right posterior (RPo) regions. The applied LMM was similar to the model used by Newman , with fixed effects (defined below) for predictability (HP or LP), pattern awareness (PA, i.e. high or low awareness of the statistical patterns), and ROI, as well as with intercept by pattern awareness and intercept and ROI by participant random factors. According to the R syntax, the LMM (as specified in Table 2) was:

EEG Mean ∼ Predictability + PA + Predictability:PA + Predictability:ROI + Predictability:PA:ROI + (1 + ROI | participant)

Similar to Newman , the LMM was applied to single-trial data. Single-trial data were the mean values of EEG over the 200–700 ms time window, based on the topography of the statistical learning ERP effect in Jost . 
[To correct for the incompatibility between the additive nature of the LMM (and ANOVA models) and the multiplicative nature of interactions, which could yield spuriously significant (i.e. Type I error) interactions involving ROI, McCarthy and Wood (1985) developed a correction by EEG mean scaling; see also Dien and Santuzzi 2005. In every condition, for each participant, mean EEG amplitudes are scaled by the square root of the sum of the squared mean EEG amplitudes, i.e. $X_{ij} / \sqrt{\sum X_{ij}^{2}}$, where $X_{ij}$ is the EEG mean amplitude for participant i in condition j. If the scalp ROI by condition interaction remains significant after rescaling, this allows for more confidence in the authenticity of the interaction, under certain conditions (Urbach and Kutas 2002).] Response times (RTs) were also analyzed with the LMM procedure. The model was based on the model used to analyze ERPs. Because it pertains to behavioral RT data only, the model contains neither the EEG mean variable nor the ROI factor and is adjusted accordingly. [Bonferroni-corrected pairwise comparisons were applied with the mcposthoc.fnc() function of the LMERConvenienceFunctions (version 2.10) library (Tremblay and Ransijn 2015). RT data were normalized using a square root transformation.] According to the R syntax, the LMM (as specified in Table 2) was:

RT ∼ Predictability + PA + Predictability:PA + (1 | participant)

Table 2 shows the model specifications for both RT and ERP models.
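
The McCarthy and Wood (1985) scaling mentioned above can be illustrated with a small Python sketch. The function name is hypothetical. The code assumes that the sum of squares runs over the ROI amplitudes of one participant in one condition, which is the usual vector-scaling formulation but is not spelled out in the text.

```python
import math


def vector_scale(roi_amplitudes):
    """Scale one participant-by-condition vector of ROI mean amplitudes.

    Each amplitude is divided by the square root of the sum of squared
    amplitudes, so the scaled vector has unit length.
    Assumption: the sum runs over the ROIs within one participant/condition.
    """
    norm = math.sqrt(sum(a * a for a in roi_amplitudes))
    if norm == 0:
        return list(roi_amplitudes)  # nothing to scale in an all-zero vector
    return [a / norm for a in roi_amplitudes]


# Two conditions whose topographies differ only by a multiplicative factor
# become identical after scaling, so no spurious ROI interaction remains.
condition_a = vector_scale([1.0, 2.0, 2.0])
condition_b = vector_scale([2.0, 4.0, 4.0])
```

The design point is that a purely multiplicative change in overall amplitude leaves the scaled profile unchanged, whereas a genuine change in scalp distribution survives the scaling.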

Table 2.

Table of LMM specifications (df, AIC, BIC, LL, deviance) for the neurophysiological (ERP) model (within 200–700 ms) and behavioral (RT) model across all trials

Model Specifications	df	AIC	BIC	LL	Deviance	chisq	Df	P
Behavioral RT
RT ∼ Predictability + PA + Predictability:PA + (1 | participant)	6	7548.4	7581.5	−3768.2	7536.4	0	0	1
Neurophysiological data (200–700 ms)
EEG Mean ∼ Predictability + PA + Predictability:PA + Predictability:ROI + Predictability:PA:ROI + (1 + ROI | participant)	82	−1 972 002	−1 971 122	986 083	−1 972 166	14 257	43	<0.001***

EEG source estimation

Source estimation was performed to further investigate how the neural mechanisms underlying statistical learning differed between the levels of pattern awareness. [For source estimation, all procedures were carried out with the Brainstorm software package (Tadel ). Cortical generators of cue-locked ERP activity were reconstructed by modeling conductive head volume according to OpenMEEG BEM (Kybic ; Gramfort ), as implemented in the Brainstorm software package (Tadel ). The solution space was constrained to the cerebral cortex, and cortical current source density mapping was obtained using a distributed model consisting of 15 000 fixed dipoles normally oriented to the cortical surface. Additionally, the inverse transformation was applied to Brainstorm’s default template Montreal Neurological Institute (MNI) brain (colin27 atlas) (Collins ; Tzourio-Mazoyer ), i.e. a canonical mesh of the cortex to approximate real anatomy (see Tadel for a review). This head model was then fit to the standard geometry of the current 256 sensor net. All subsequent source analyses and the statistical estimation of Z-scores relative to the baseline (before cue onset) were then performed. Cortical current maps were computed from the EEG time series using a linear inverse estimator called weighted minimum-norm current estimate (WMNE). WMNEs are a measure of the current density flowing at the surface of the cortex.] Source estimation was first applied at a preliminary level on the entire data set (i.e. all 34 participants of the study) to visually and statistically infer the difference between HP and LP conditions. This difference was computed as HP − LP. To further obtain separate source estimate maps for the between-group variable, pattern awareness (high and low), each having two within conditions (HP and LP), we first computed the difference (HP − LP) in each group of high and low pattern awareness. 
Thus, for each participant i (i = 1,…, n = 17), we computed: (i) a single average for HP; (ii) a single average for LP; and (iii) the difference $D_i = \mathrm{HP}_i - \mathrm{LP}_i$. Then, at the group level, we computed the following: (i) $m_1 = |\mathrm{mean}(D_i)|$ for high pattern awareness: the absolute value of the mean of $D_i$ (with i = 1,…, n = 17, i.e. over all participants of the high awareness group); (ii) $m_2 = |\mathrm{mean}(D_j)|$ for low pattern awareness: the absolute value of the mean of $D_j$ (with j = 1,…, n = 17, i.e. over all participants of the low awareness group); (iii) $D = m_1 - m_2$: the difference between these means. As a final result, we obtained a signed (±) source estimation difference D, indicating for which group the difference was larger. A non-parametric permutation t-test (Maris and Oostenveld 2007), as implemented in the Brainstorm software, was also used to test for statistically significant differences between the high and low pattern awareness groups, using scouts (see Results section) defined per the Mindboggle brain atlas via Brainstorm (Klein ).
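
The group-level contrast described above can be sketched in Python for a single source location. This is only an illustration of the arithmetic, not Brainstorm code. The function names are hypothetical, and each input list is assumed to hold one averaged source amplitude per participant.

```python
from statistics import mean


def condition_difference(hp_values, lp_values):
    """Per-participant difference between the HP and LP averages."""
    return [hp - lp for hp, lp in zip(hp_values, lp_values)]


def group_contrast(hp_high, lp_high, hp_low, lp_low):
    """Signed difference between the two groups' absolute mean HP - LP effects.

    A positive value means the HP - LP difference is larger (in absolute
    terms) in the high pattern awareness group; a negative value means it
    is larger in the low pattern awareness group.
    """
    m1 = abs(mean(condition_difference(hp_high, lp_high)))  # high awareness
    m2 = abs(mean(condition_difference(hp_low, lp_low)))    # low awareness
    return m1 - m2
```

Because absolute values are taken before subtracting, the sign of the result indicates which group shows the larger effect rather than the direction of the HP − LP difference itself.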

Results

Behavioral results

Analysis of the RT data showed that predictability (i.e. HP vs. LP) interacted with pattern awareness [F(1, 1805.03) = 9.195, P < 0.01]. Figure 3 shows the average RTs to targets following the HP and LP stimuli for the high and low pattern awareness groups. Post hoc tests indicated a significant effect of predictability (i.e. learning had occurred) in the high pattern awareness group [P = 0.001] but a non-significant effect of predictability in the low pattern awareness group [P = 0.627]. That is, participants in the high pattern awareness group responded more quickly to the targets when they were preceded by the HP stimulus than when they were preceded by the LP stimulus, suggesting that they had learned the predictor–target probabilities. No such behavioral facilitation was observed in the low pattern awareness group. Table 3 shows the fixed and random effects from the RT analyses across groups.

Figure 3.

Response times in the high (HP, solid bar) and low predictability (LP, textured bar) conditions for the high and low pattern awareness (PA) groups. The vertical axis is mean response time in milliseconds; Horizontal axis depicts the post hoc tests to the Predictability × PA interaction; **P < 0.001.

Table 3.

Model RT
Fixed effects	Estimate	SE	P
Intercept	19.824	0.352	<0.001***
High predictability	0.075	0.155	0.627
High PA	0.184	0.498	0.713
High predictability:high PA	−0.667	0.220	0.002**
Random effects	SD
Participant	1.33
Residual	1.818

Table of fixed and random effects from the RT analyses in the high PA and low PA groups, [*P < 0.01, **P < 0.001, ***P < 0.0001; pattern awareness (PA); high predictability (HP); low predictability (LP)]
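
As a rough reading aid for Table 3, the fixed-effect estimates can be combined and back-transformed from the square-root scale to milliseconds. The sketch below assumes treatment coding with LP and low PA as the reference levels (suggested by the row labels, but not stated in the text). It also assumes that squaring a fitted square-root value gives an adequate approximation of the response time. The function name is hypothetical.

```python
# Fixed-effect estimates copied from Table 3 (square-root RT scale).
INTERCEPT = 19.824
HIGH_PREDICTABILITY = 0.075
HIGH_PA = 0.184
HIGH_PREDICTABILITY_BY_HIGH_PA = -0.667


def predicted_rt_ms(high_predictability, high_pa):
    """Approximate model-implied RT in ms for one predictability/PA cell.

    Assumption: treatment coding with LP and low PA as reference levels.
    """
    sqrt_rt = INTERCEPT
    if high_predictability:
        sqrt_rt += HIGH_PREDICTABILITY
    if high_pa:
        sqrt_rt += HIGH_PA
    if high_predictability and high_pa:
        sqrt_rt += HIGH_PREDICTABILITY_BY_HIGH_PA
    return sqrt_rt ** 2  # undo the square-root transformation


for pa_label, high_pa in (("low PA", False), ("high PA", True)):
    for cond_label, high_pred in (("LP", False), ("HP", True)):
        rt = predicted_rt_ms(high_pred, high_pa)
        print(f"{pa_label}, {cond_label}: {rt:.1f} ms")
```

Under these assumptions, the implied response time is shorter for HP than for LP only in the high PA group, which is the pattern reported in the post hoc tests.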

ERP results

The grand averaged ERPs across participants with high pattern awareness (n = 17) for the HP and LP conditions are shown in Fig. 4. Overall, the right ROI appears to show evidence of a P300-like response, a component that was also observed by Jost using a similar paradigm. Specifically, there is an increased positivity for the HP stimulus relative to LP roughly between 250 ms and 500 ms. There also appears to be an N200 effect in the medial ROI, with a more negative peak for HP compared with LP. Based on visual inspection, one-way ANOVAs were conducted on the EEG means averaged across trials for two time windows: (i) 150–250 ms for the N200 and (ii) 300–500 ms for the P300. Results showed that there was no significant difference between high and low predictability for the N200 time window for any of the ROIs: left [F(1, 1528) = 0.006, P = 0.936], medial [F(1, 1528) = 2.304, P = 0.129], and right [F(1, 1528) = 0.745, P = 0.388]. On the other hand, the results confirmed the existence of the P300, with a significant difference between high and low-predictability conditions in the right ROI [F(1, 1528) = 4.351, P = 0.037] but not the left [F(1, 1528) = 1.062, P = 0.303] or medial [F(1, 1528) = 0.105, P = 0.746] ROI.

Figure 4.

Grand averaged ERPs for high pattern awareness participants within left, medial, and right ROIs (n = 17). ERPs to the predictor in the HP (solid line) and LP (dashed line) conditions (vertical axis: electric potential in microvolts, positivity upward; horizontal axis: time in seconds). Shaded gray bar shows time window corresponding to ANOVA; *P < 0.01.

The grand averaged ERPs across participants with low pattern awareness (n = 17) for the HP and LP conditions are shown in Fig. 5. Overall, the right and, particularly, the medial ROIs appear to show differences in the waveforms between predictor stimuli in mid and late latencies, but unlike the high pattern awareness participants, the pattern is reversed, i.e. there is higher amplitude for the LP than the HP condition. Akin to the high pattern awareness group, we performed ANOVAs for each ROI using the EEG averages. For the low awareness group, we focused on a larger 300–1000 ms time window due to the observed sustained negativity. The results showed no significant difference between the high- and low-predictability conditions for the left ROI [F(1, 1528) = 1.745; P = 0.187]. In contrast, there was a significant difference between predictor conditions for medial [F(1, 1528) = 40.539; P < 0.001] and right [F(1, 1528) = 4.502; P = 0.034] ROIs, with the high-predictability stimuli showing more negative potentials than the low-predictability stimuli (see Fig. 5).

Figure 5.

Grand averaged ERPs for low pattern awareness participants within left, medial, and right ROIs (n = 17). ERPs to the predictor in the HP (solid line) and LP (dashed line) conditions (vertical axis: electric potential in microvolts, positivity upward; horizontal axis: time in seconds). Shaded gray bar shows time window corresponding to ANOVA; *P < 0.01, **P < 0.001.

To further explore the possible effects of pattern awareness on the ERP correlates of statistical learning, we applied an LMM on the single-trial ERP data [using the McCarthy and Wood (1985) correction] within the 200–700 ms window. This window includes most of the primary ERP effects observable by visual inspection. The results indicated a two-way interaction: Predictability × ROI [F(16, 62) = 8.39, P < 0.001]; a second two-way interaction: Predictability × Pattern Awareness [F(1, 340203) = 65.763, P < 0.001]; and a three-way interaction: Predictability × Pattern Awareness × ROI [F(16, 62) = 2.74, P < 0.001]. Because all factors of the two-way interactions (i.e. predictability, pattern awareness, and ROI) belong to the three-way interaction, we focus on the three-way interaction. Table 4 displays the fixed and random effects from the LMM analyses.
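The text names the McCarthy and Wood (1985) correction but does not say which variant was applied. As a hedged illustration, the sketch below assumes vector-length scaling, one normalization commonly associated with that correction: within each condition, every site's amplitude is divided by the root sum of squares across sites. This removes overall amplitude differences so that the remaining differences reflect the shape of the scalp distribution. The function name and the values are hypothetical.

```python
import math


def vector_scale(amplitudes):
    """Divide each site's amplitude by the vector length across sites.

    Assumption: vector-length scaling is used here for illustration only;
    the source does not state which normalization was applied.
    """
    length = math.sqrt(sum(a * a for a in amplitudes))
    if length == 0:
        # A flat, all-zero distribution has no shape to preserve
        return list(amplitudes)
    return [a / length for a in amplitudes]


# Hypothetical mean amplitudes at three sites for two conditions.
# LP is exactly twice HP at every site: same topography, different strength.
hp_sites = [1.0, 2.0, 2.0]
lp_sites = [2.0, 4.0, 4.0]

hp_scaled = vector_scale(hp_sites)
lp_scaled = vector_scale(lp_sites)

# After scaling, the two distributions coincide (up to rounding), so a
# condition-by-site interaction would no longer be driven by gain alone.
for hp, lp in zip(hp_scaled, lp_scaled):
    print(f"{hp:.4f} {lp:.4f}")
```

The design point is that a purely multiplicative difference between conditions can otherwise appear as a condition × site interaction in an ANOVA-type model. Scaling guards against reading such an interaction as a difference in topography.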

Table 4.

Table of fixed and random effects from the ERP analyses in the high pattern awareness (HPA) and low pattern awareness (LPA) groups, [*P < .01, **P < .001, ***P < .0001; pattern awareness (PA)]

Full model
Fixed effects	Estimate	SE	P
(Intercept)	−0.002	0.001	0.003**
HP	0.000	0.000	0.948
HPA	−0.001	0.001	0.556
HP:HPA	0.000	0.000	0.180
LP:CNz	0.001	0.001	0.064
HP:CNz	0.001	0.001	0.334
LP:FRz	−0.002	0.001	0.006**
HP:FRz	−0.002	0.001	0.015*
LP:LCn	0.004	0.001	0.000***
HP:LCn	0.003	0.001	0.000***
LP:LPo	0.007	0.001	0.000***
HP:LPo	0.007	0.001	0.000***
LP:POz	0.007	0.001	0.000***
HP:POz	0.005	0.001	0.001***
LP:RAn	0.000	0.001	0.580
HP:RAn	0.000	0.001	0.915
LP:RCn	0.004	0.001	0.003**
HP:RCn	0.003	0.001	0.009**
LP:RPo	0.007	0.002	0.000***
HP:RPo	0.007	0.002	0.000***
LP:HPA:CNz	0.001	0.001	0.201
HP:HPA:CNz	0.002	0.001	0.039*
LP:HPA:FRz	0.000	0.001	0.596
HP:HPA:FRz	−0.001	0.001	0.456
LP:HPA:LCn	0.000	0.001	0.747
HP:HPA:LCn	0.001	0.001	0.384
LP:HPA:LPo	0.001	0.002	0.666
HP:HPA:LPo	0.001	0.002	0.717
LP:HPA:POz	0.002	0.002	0.266
HP:HPA:POz	0.003	0.002	0.102
LP:HPA:RAn	0.000	0.001	0.612
HP:HPA:RAn	0.000	0.001	0.633
LP:HPA:RCn	0.001	0.002	0.472
HP:HPA:RCn	0.002	0.002	0.274
LP:HPA:RPo	0.002	0.003	0.401
HP:HPA:RPo	0.002	0.003	0.471
Random effects	SD
Participant	0.003
CNz	0.003
FRz	0.002
LCn	0.003
LPo	0.005
POz	0.006
RAn	0.003
RCn	0.004
RPo	0.007
Residual	0.013

A graphical depiction of the model effects from the three-way interaction between the ERP means (McCarthy corrected) for each level of predictability and ROI is provided, for high pattern awareness (Fig. 6, right panel) and low pattern awareness (Fig. 6, left panel) groups. [Figure 6 was obtained with the Effects displays package in R (Fox, 2003).] Participants with high pattern awareness show an expected ERP effect across all ROIs, i.e. enhanced positivity for HP compared with LP stimuli (similar to the positivity observed in Jost). The low pattern awareness group shows little differentiation of the predictor conditions in any ROIs, and when it does occur, the HP predictor is more negative than LP.

Figure 6.

Predictability × Pattern Awareness × ROI interaction for all participants (n = 34). The right panel depicts high pattern awareness and the left panel depicts low pattern awareness. The horizontal axis shows the nine ROIs [left (LAn), middle (FRz), and right anterior (RAn); left (LCn), middle (CNz), and right central (RCn); and left (LPo), middle (POz), and right posterior (RPo)]. The vertical axis is the LMM-estimated ERP mean amplitude between 200 ms and 700 ms post-predictor onset (positivity upward in microvolts; LP: solid line; HP: dashed line).

Source estimation results

First, we used source estimation to examine differences in brain activation related to learning across all participants (see Methods section). This was performed using scouts defined by the Mindboggle brain atlas (Klein) as incorporated in Brainstorm, rather than the full cortical maps. (By restricting activity to the scouts, we discard any spatial resolution, and thus, the statistical results per se cannot be represented on a cortical map. However, we can explore the following issues: (i) the brain region where a statistically significant difference of source activity was found (upon correcting for multiple comparisons) and (ii) the direction of the difference, i.e. which condition was associated with a higher or a lower source activity.) A scout represents a region or a subset of dipoles demarcated on the cortical surface or head volume (Tadel). Scout selection was performed within the same time window as the ERP LMM analyses, which was 200–700 ms post-stimulus onset, and the ROIs were chosen a priori. We checked for differences between HP and LP source activation across all 34 participants over the entire time window and then in 50 ms bins. Second, we examined brain activation for high and low pattern awareness using scouts across the full 200–700 ms time window as well as in 50 ms bins. For time periods across and within these time windows that are not depicted graphically, t-values were non-significant. Figure 7 depicts the results of the source estimation for the difference between HP and LP (i.e. areas of the brain indicative of a learning effect) for all participants. Each subfigure shows the demarcated ROI generically on the brain template corresponding to the view (left, L and right, R), and below each template is the associated graph.
All t-values are representative of results from a non-parametric permutation test (Maris and Oostenveld 2007) in ROIs (scouts), where significance at the 0.05 level remained even after correcting for multiple comparisons (for signal, frequency, and time) with the false discovery rate procedure (Benjamini and Hochberg 1995). (The t-values are only reported as statistically significant when significance remains after the false discovery rate correction; α = 0.05; the positive or negative direction depicted in the t-value graph denotes which condition was associated with higher or lower source activity. For predictability and pattern awareness, the high condition was denoted by positive values and the low by negative values.) In Fig. 7A, the negative t-values indicate a statistically significant HP − LP difference in the superior parietal region, with greater activation for the LP condition between 200 ms and 300 ms post-stimulus. Between 450 ms and 500 ms, a significant HP − LP difference (greater for HP) in the left lateral occipital region (Fig. 7B) as well as a significant HP − LP difference (greater for HP) in the left pericalcarine region (Fig. 7C) was observed. Between 650 ms and 700 ms, activation was found in the left caudal mid-frontal region (Fig. 7D). Thus, we see a left lateralized posterior–anterior shift over time associated with the HP − LP learning effect, with superior parietal, pericalcarine, and lateral occipital regions showing activation in the first 200–500 ms and caudal mid-frontal regions showing activation after 650 ms post-stimulus.
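The false discovery rate step mentioned above can be illustrated with a short sketch of the Benjamini and Hochberg (1995) step-up procedure applied to one P value per 50 ms bin of the 200–700 ms window. The helper names and the P values are made up for illustration. The permutation test itself is not reproduced here, and the authors' software may organize the correction differently.

```python
def time_bins(start_ms, end_ms, width_ms):
    """Split [start_ms, end_ms) into consecutive bins of width_ms."""
    return [(t, t + width_ms) for t in range(start_ms, end_ms, width_ms)]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking the tests that survive FDR control."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


bins = time_bins(200, 700, 50)  # ten 50 ms bins

# Made-up P values, one per bin, for a single hypothetical scout
p_vals = [0.004, 0.009, 0.20, 0.35, 0.60, 0.012, 0.45, 0.70, 0.80, 0.03]

for (start, end), keep in zip(bins, benjamini_hochberg(p_vals)):
    label = "significant" if keep else "n.s."
    print(f"{start}-{end} ms: {label}")
```

In this toy example the bin with P = 0.03 would pass an uncorrected 0.05 threshold but does not survive the correction, which is the behaviour the procedure is meant to provide when many bins and scouts are tested.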

Figure 7.

Cortical source estimation maps with t-values for scouts for HP minus LP (n = 34). (A) Highlighted scouts show significant difference for HP − LP in superior parietal region (left) at 200–250 ms and 250–300 ms. (B) Highlighted scouts show significant HP − LP difference during a 450–500 ms time window in the lateral occipital (left) region. (C) Highlighted scouts show significant HP − LP difference in a 450–500 ms time window in the pericalcarine (left) area. (D) Highlighted scouts show significant HP − LP difference in a 650–700 ms time window in the caudal mid-frontal (left) region. (A–D) Each map is accompanied by a graph showing averaged t-values (vertical axis) and time in seconds (horizontal axis).

We then tested for statistically significant effects of pattern awareness at 50 ms increments within the 200–700 ms time window. These effects identify brain areas whose learning-related activation (HP − LP) differed between the high and low awareness groups. As shown in Fig. 8, the insula (L), parahippocampal (L), and precentral (R) regions were found to show significant activation differences between the high and low awareness groups.

Figure 8.

Cortical source estimation maps with t-values for scouts for differences in pattern awareness (n = 17). (A) Highlighted scouts show significant differences between high and low pattern awareness between 200 ms and 250 ms for the insula (left) region. (B) Highlighted scouts show significant differences between high and low pattern awareness between 350 ms and 400 ms and between 400 ms and 450 ms for the parahippocampal (left) region. (C) Highlighted scouts show significant differences between high and low pattern awareness between 500 ms and 550 ms and between 550 ms and 600 ms for the precentral (right) region. (A–C) Accompanied by graph showing averaged t-values (vertical axis) and time in seconds (horizontal axis).

Discussion

The aim of this study was to examine the extent to which pattern awareness influences the learning of visual statistical regularities. Our main findings were the following. First, only participants showing high levels of pattern awareness demonstrated robust behavioral learning effects as measured by RTs. Second, only participants with high pattern awareness showed the expected P300 ERP effects as well as clear indications of learning as assessed with the LMM analyses. Finally, source estimation results showed left lateralization and a caudal–rostral gradient accompanying learning across all participants. Differences in brain activation were also observed between the high and low pattern awareness groups in specific brain regions. We discuss each of these primary findings in turn.

Behavioral findings

Behavioral data revealed that pattern awareness appeared to have a strong effect on learning. For those participants with low pattern awareness, there was no difference in the response times between predictor conditions and thus no evidence of learning. When awareness was high, the response times were much lower for HP compared with LP conditions, indicating learning. Thus, in contrast to earlier conceptualizations of statistical learning being an implicit process (Reber 1989), these findings reveal that only participants demonstrating high awareness showed behavioral indications of learning.

ERP findings

In a previous study using a similar learning paradigm, Jost observed a P300-like positivity elicited by the high predictability condition. The P300 is regarded as an index of target detection and evaluation (van Zuijen) and has also been observed in other learning tasks (Baldwin and Kutas 1997; Rüsseler; Carrión and Bly 2007). In the present study, we also obtained a P300 but only for participants with high pattern awareness (Fig. 4, right ROI). For the low pattern awareness group (Fig. 5), instead of the expected P300 effect, there was an extended negativity of the HP relative to LP conditions in the medial ROI. At least one other study has reported a similar result, with unconsciously processed stimuli eliciting a reversed (i.e. negative) P300 effect (van Gaal), albeit with a different task (the go/no-go task). The results of that study suggested that inhibitory control functions might be influenced by unconscious events. Applied to the current findings, the extended negativity would appear to suggest that some learning occurred in the low awareness group. On the other hand, the results of the LMM analyses are much clearer in showing ERP effects only in the high pattern awareness group (Fig. 6), suggesting that only the high pattern awareness participants demonstrated learning as revealed by the ERPs. With regard to the interpretation of the P300 in the high pattern awareness group, Jost suggested that the occurrence of the P300, which is normally elicited by targets in a standard oddball paradigm, shifted from the target to the stimulus that predicted the target with a high level of probability. Ample exposure to the sequential statistics of the input array may have enabled participants to view the frequent HP stimulus as if it were the target itself, displaying the prototypical P300 response.
That the P300 effect was observed only in the high pattern awareness group suggests qualitatively different neural processes occurring during the task for the two groups of participants. In sum, the ERP results show clear indications of learning for the high awareness participants but much less clear evidence for the low awareness participants. Taken together with the behavioral evidence suggesting learning only for the high awareness participants, these results add to the pre-existing literature underscoring awareness as a prerequisite for, or at the very least an influence on, statistical learning ability (Cleeremans 1993; McIntosh; McIntosh).

Source estimation findings

Source analysis for all 34 participants of the study indicated left hemispheric activation across the 200–700 ms time window, specifically over the superior parietal, occipital, and mid-frontal ROIs (Fig. 7). The left superior parietal region is involved in spatial orientation (Corbetta), receives visual and sensory input, and is closely tied to self-awareness (Goldberg). Other visual processing areas were also observed, specifically, left lateral occipital and left pericalcarine cortex. Lateral occipital regions are known to be involved in object processing (Grill-Spector) and possibly visual awareness (Ro). More specifically, perception at early stages of visual encoding can result from exposure to a prior visual stimulus via feedback projections in the visual cortex. Pericalcarine cortex is a part of the occipital lobe and is closely tied to the central visual field. Research shows that this area is implicated in early visual processing, which in turn is also associated with pre-attentive and attentive vision (see Lamme and Roelfsema 2000). In general, the involvement of visual processing areas during the visual statistical learning task is consistent with previous empirical findings and theory suggesting an important role of modality-specific perceptual processing during statistical learning (Conway and Christiansen 2005; Turk-Browne; Frost). The caudal mid-frontal region also showed activation and corresponds roughly to Brodmann area 46, a part of the frontal cortex associated with sustained attention and working memory (Curtis and D’Esposito 2003; Rypma). Frontal activations are consistently observed across different kinds of statistical learning and sequential learning tasks (e.g., Fletcher; Skosnik).
While investigating the asymmetry, connectivity, and segmentation of the arcuate fascicle, Fernández-Miranda concluded that the caudal middle frontal gyrus along with other cortical areas (pars opercularis, pars triangularis, and ventral precentral gyrus) are part of a frontal trajectory that is integral to language processing. A possible overlap in neural areas supporting statistical learning and the “language network” is consistent with previous research suggesting strong links between statistical learning and language processing (Conway; Misyak; Arciuli and Simpson 2012; Christiansen; Tabullo; Daltrozzo). Overall, these source estimation findings point to a general left hemispheric activation pattern associated with statistical learning, across a network of areas involved in perceptual processing, working memory, sustained attention, and language, along a caudal to rostral temporal gradient (i.e. earlier activation in posterior brain regions and later activation in frontal regions). In addition, the left hemispheric pattern of activation is reminiscent of the left laterality observed in language acquisition and other aspects of learning (Friederici and Alter 2004). Further source estimation analyses comparing high and low pattern awareness groups revealed different levels of activation for the left insula, right precentral cortex, and left parahippocampal regions. The anterior insula has been shown to be activated during performance monitoring and is also modulated by error awareness. Such activity is thought to be associated with automatic consciously perceived errors, and the encountered errors elicit responses akin to an orienting response (Ullsperger). Additionally, early activation of the insula (which is what we report) has also been associated with risk prediction error, with a time course consistent with a role in rapid updating (Preuschoff).
The precentral region is a part of the motor cortex and has been associated with sequence learning. Specifically, learning-related changes have been observed in the right precentral region, in addition to other areas (Bischoff-Grethe). The left precentral gyrus is involved in speech articulation (Itabashi) and language processing (Fernández-Miranda). In addition, some research shows that learning new words in a language is associated with increased functional connectivity in learners (compared with non-learners) between the left supplementary motor area and the left precentral gyrus, among other regions implicated in phonological rehearsal (Veroude). The left parahippocampal region also showed differences between the low and high pattern awareness groups. There is evidence that parahippocampal activation is associated with item-based processing in humans (Davachi and Wagner 2002). Electrophysiological findings (in rats and monkeys) have indicated that the neuronal responses in parahippocampal regions represent information about previously occurring items (Brown; Li). Preston and Gabrieli (2008) found that activations in hippocampus and parahippocampal cortex were associated with explicit memory, dissociating between subsequently remembered and forgotten repeated contexts, but were unrelated to context-dependent learning. Importantly, the parahippocampal cortex has recently been implicated as playing an important role during statistical learning (Schapiro).

Limitations and future directions

One limitation of this study is that our measure of pattern awareness was taken at the end of the statistical learning task, rather than during it, and as such cannot provide fine-grained information about awareness as it might have unfolded in time. Also, regarding the pattern awareness questionnaire, although we tried to quantify subjective participant self-report, there are other ways to measure awareness that are likely less subjective and less reliant on participants’ own reports. One such method is the process dissociation procedure (Jacoby 1991), which uses a combination of direct and indirect assessments to tease apart the contribution of explicit and implicit memory (i.e. conscious from unconscious learning). Future research could usefully use such a procedure during statistical learning and investigate how differences in conscious awareness contribute to neural patterns of activation. Another potential limitation is that in our statistical analyses we ignore the fact that the statistical significance obtained with ROIs as a predictor variable is also influenced by the spatial proximity between ROIs and as such could be modeled in terms of spatial distance, when entered as an interaction term. From a technical point of view, adding such complex interaction terms in the model involves additional parameter estimation. Provided that such technical issues can be adequately addressed, future statistical analyses using mixed models with ERP data could benefit from a more detailed analysis of each ROI’s independent as well as interdependent effects on the other variables. Another limitation to consider is to what extent the current findings will generalize to other statistical learning and implicit learning tasks. The task we used here differs from other learning tasks in important ways.
First, even though we used Chinese character stimuli in non-Chinese speakers to increase the difficulty of the learning task (see Methods section), the statistical contingencies are relatively simple and easy to learn, which might make it easier to become aware of the patterns in this task than in more complex learning tasks, such as artificial grammar learning tasks, that require not only learning of the statistical regularities but also generalization to novel patterns. Another aspect of the current task that makes it somewhat unique (and, we believe, is a strength) is that the primary behavioral and neurophysiological indicators of learning are online and indirect measures. That is, at no point (except after the task is over, when pattern awareness is assessed) are participants explicitly queried as to their knowledge of the patterns. This is not the case in artificial grammar learning tasks (Reber 1989) or in word segmentation/triplet tasks (Saffran), where learning is typically assessed through direct explicit measures. This task characteristic could actually make it less likely that the measures of learning themselves are contaminating the learning process. Thus, it is currently an open question to what extent the findings obtained in the current study will generalize to other measures of statistical learning commonly used in the literature. In addition, future research might usefully attempt to disentangle the influence of attention and working memory in relation to awareness during statistical learning. In the current study, executive control was measured with the Flanker task at the start of the experimental session. A comparison of the Flanker data across the two pattern awareness groups showed that they did not differ, which suggests that executive control may not affect the relationship between awareness and statistical learning.
However, one limitation with the use of the Flanker task in the current study is that it was measured before the statistical learning task and therefore does not provide an online measure of executive control during learning. It is likely that executive control and awareness interact in a complex way to affect learning processes. In a review of the neural mechanisms of attention and awareness, Tallon-Baudry (2012) discusses different ways that attention and awareness could be related: (i) the gateway hypothesis, (ii) the reverse dependence hypothesis, and (iii) the cumulative influence hypothesis. The gateway hypothesis is the classical view associated with Dehaene, in which attention facilitates awareness and might even be a prerequisite for awareness to emerge. According to the reverse dependence hypothesis, attentional mechanisms are only activated if a stimulus is detected at the neural level, implying awareness. In the cumulative influence hypothesis, attention and awareness are each implemented by separate mechanisms, but both independently influence the participant’s report of the existence of the stimulus. In contrast, Lamme (2003) argues that although visual attention and awareness are intimately related, an overlap between the mechanisms of attention and memory is more likely than one between attention and awareness. According to Lamme (2003), the current state of the neural network characterizes attentional selection, whereas phenomenal experience ensues from the recurrent interaction between groups of neurons. Future research examining the constructs of attention, memory, and awareness in relation to statistical learning is needed. Finally, due to the nature of this study, it is not possible to determine the direction of cause and effect between awareness and learning.
One possibility is that as participants become increasingly aware of the patterns during the course of the task (possibly due to the use of explicit strategies or the deployment of attention or cognitive effort), then learning improves and the P300 effect results. This possibility would be more consistent with the gateway hypothesis discussed above. On the other hand, it is also possible that differences in learning ability directly affect awareness of the patterns, with better learning resulting in heightened pattern awareness. This perspective seems more consistent with the reverse dependence hypothesis. Additional research is needed to better understand how these variables causally affect one another.
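The process dissociation procedure (Jacoby 1991) raised earlier in this section as an alternative awareness measure rests on two simple equations. Under its independence assumption, performance in an inclusion condition reflects controlled plus automatic influences, I = C + A(1 − C), while performance in an exclusion condition reflects automatic influences that escape control, E = A(1 − C). The sketch below solves these for C and A. The function name and the proportions are hypothetical, and applying the procedure to a statistical learning task would require an inclusion/exclusion design that the present study did not use.

```python
def process_dissociation(p_inclusion, p_exclusion):
    """Estimate controlled (C) and automatic (A) contributions.

    Solves the two process-dissociation equations
        inclusion = C + A * (1 - C)
        exclusion = A * (1 - C)
    for C and A. Returns (C, A); A is None when C equals 1, because the
    automatic estimate is then undefined (division by zero).
    """
    controlled = p_inclusion - p_exclusion
    if controlled >= 1:
        return controlled, None
    automatic = p_exclusion / (1 - controlled)
    return controlled, automatic


# Hypothetical proportions of pattern-consistent responses
c_est, a_est = process_dissociation(p_inclusion=0.7, p_exclusion=0.3)
print(f"controlled = {c_est:.2f}, automatic = {a_est:.2f}")
```

A participant whose inclusion and exclusion scores are equal would receive a controlled estimate of zero, meaning that any pattern-consistent responding is attributed to automatic influences alone.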

Conclusion

In conclusion, we have provided evidence for the influence of pattern awareness on statistical learning. Both behaviorally and neurophysiologically, our findings suggest that pattern awareness is closely associated with statistical learning ability. Neurophysiologically, we observed more distinct ERP learning effects in participants who demonstrated high pattern awareness. Across all participants, source estimation results revealed left lateral regions (superior parietal, lateral occipital, pericalcarine, and caudal mid-frontal) that were activated temporally in a caudal-to-rostral manner. Furthermore, differences in pattern awareness were associated with different levels of activation in left (insula and parahippocampal) as well as right (precentral) hemispheric regions. These findings suggest that pattern awareness influences visual statistical learning and point to an increased need to manipulate and/or measure how this construct affects learning across a variety of individuals and tasks.

78 in total

1. Conscious, preconscious, and subliminal processing: a testable taxonomy.

Authors: Stanislas Dehaene; Jean-Pierre Changeux; Lionel Naccache; Jérôme Sackur; Claire Sergent
Journal: Trends Cogn Sci Date: 2006-04-17 Impact factor: 20.229

2. Implicit, intuitive, and explicit knowledge of abstract regularities in a sound sequence: an event-related brain potential study.

Authors: Titia L van Zuijen; Veerle L Simoens; Petri Paavilainen; Risto Näätänen; Mari Tervaniemi
Journal: J Cogn Neurosci Date: 2006-08 Impact factor: 3.225

3. Neuronal evidence that inferomedial temporal cortex is more important than hippocampus in certain processes underlying recognition memory.

Authors: M W Brown; F A Wilson; I P Riches
Journal: Brain Res Date: 1987-04-14 Impact factor: 3.252

4. Motor-linked implicit learning in persons with autism spectrum disorders.

5. Superior parietal cortex activation during spatial attention shifts and visual feature conjunction.

6. The representation of stimulus familiarity in anterior inferior temporal cortex.

7. Exploring the neurodevelopment of visual statistical learning using event-related brain potentials.

8. Is statistical learning constrained by lower level perceptual organization?

9. Similar Neural Correlates for Language and Sequential Learning: Evidence from Event-Related Brain Potentials.

10. OpenMEEG: opensource software for quasistatic bioelectromagnetics.
