The power of rhythms: how steady-state evoked responses reveal early neurocognitive development.

Claire Kabdebon¹, Ana Fló², Adélaïde de Heering³, Richard Aslin⁴.

Abstract

Electroencephalography (EEG) is a non-invasive and painless recording of cerebral activity, particularly well-suited for studying young infants, allowing the inspection of cerebral responses in a constellation of different ways. Of particular interest for developmental cognitive neuroscientists is the use of rhythmic stimulation, and the analysis of steady-state evoked potentials (SS-EPs) - an approach also known as frequency tagging. In this paper we rely on the existing SS-EP early developmental literature to illustrate the important advantages of SS-EPs for studying the developing brain. We argue that (1) the technique is both objective and predictive: the response is expected at the stimulation frequency (and/or higher harmonics), (2) its high spectral specificity makes the computed responses particularly robust to artifacts, and (3) the technique allows for short and efficient recordings, compatible with infants' limited attentional spans. We additionally provide an overview of some recent inspiring use of the SS-EP technique in adult research, in order to argue that (4) the SS-EP approach can be implemented creatively to target a wide range of cognitive and neural processes. For all these reasons, we expect SS-EPs to play an increasing role in the understanding of early cognitive processes. Finally, we provide practical guidelines for implementing and analyzing SS-EP studies.

Year: 2022; PMID: 35351649; PMCID: PMC9294992; DOI: 10.1016/j.neuroimage.2022.119150

Source DB: PubMed; Journal: Neuroimage; ISSN: 1053-8119; Impact factor: 7.400

Introduction

Much remains to be learned about the complex processes that give rise to human cognition, and infant studies represent a unique opportunity to explore the origins and the growth of human cognitive achievements. However, because of their limited behavioral repertoire, it is particularly challenging to explore infants’ early abilities for at least their first three years of life. Over the past decades, developmental psychologists have deployed a panoply of ingenious tricks to shed light on the developing mind, mostly relying on measures of attention such as variations in heart rate, in non-nutritive sucking, or recordings of the direction and duration of eye gaze during short experimental studies (Golinkoff and Hirsh-Pasek, 2012). Together, these measures have yielded critical insights into various aspects of infant cognition including sensory thresholds (Dobson and Teller, 1978), perceptual categories (Quinn and Eimas, 1986), speech processing (Mehler et al., 1988), word recognition (Swingley and Aslin, 2002), conceptual development (Spelke, 1990), and social preferences (Vouloumanos et al., 2009). Despite the convenience and utility of these behavioral measures, their validity can be - and has been - questioned (Aslin, 2007) on at least two grounds. First, because of the limited duration of infant cooperation and the sluggishness of the measured attentional responses, behavioral estimates are highly variable, typically derived from a small number of test trials. Second, because each behavioral measure results from a series of complex and hidden cognitive processes, the linking hypothesis that joins the recorded dependent variable to the cognitive process of interest is oftentimes unclear or undefined. Both of these issues can therefore weaken the strength of the inferences one can draw from behavioral measures and call for novel methods that can get as close as possible to infants’ hidden brain processes. 
Electroencephalography (EEG) appears to be an ideal alternative for investigating cognitive processes directly and non-invasively from the infant brain. This versatile and accessible technique is indeed relatively well tolerated across many age ranges, including premature newborns, neonates, infants, toddlers and school-aged children. In addition, EEG signals are rich and complex, and with the advent of modern computational abilities, many novel approaches have become available to analyze EEG data. Traditional Event-Related Potentials (ERPs) are classically used to assess the brain responses evoked by a stimulus. However, detecting the signal from the background electrophysiological noise requires many trials, which means long and usually highly repetitive experiments. Additionally, the technique is particularly susceptible to movement artifacts that may contaminate the averaged time-locked response. In sum, ERPs are particularly challenging to use with infants, and developmental neuroscientists often have to make compromises in their design and/or research questions. The aim of this article is to highlight another EEG experimental approach, which has substantial practical advantages for infant testing: Steady-State Evoked Potentials (SS-EPs, also denoted in the literature as rhythmic stimulation, frequency tagging, neural tracking or neural entrainment). Rather than an exhaustive review of the SS-EP literature, we provide an overview of the method and a selection of representative studies, from both inside and outside the field of infancy research. We review and discuss the practical and conceptual advantages of SS-EPs for studying cognitive functions in developmental populations, and we lay down some practical guidelines to help more researchers exploit the method to its full potential. 
SS-EPs occur in the brain when it is presented with a periodic and sustained sequence of stimulation. Brain activity naturally synchronizes to the rhythm of the stimulation, and this periodic electrical activity (or SS-EP) can be recorded from the scalp. Over the years, the technique has been successfully applied to shed light on various aspects of human brain function including vision (D. Regan, 1966), audition (Galambos et al., 1981), and tactile perception (Namerow et al., 1974). More recently, it has also been used to investigate higher-level processes such as attention (Morgan et al., 1996; Muller et al., 2006; Silberstein et al., 2003), memory (Peterson et al., 2014; Wimber et al., 2012), object perception (Alp et al., 2016; Boremanse et al., 2013; Rossion and Boremanse, 2011), conscious perception (Parkkonen et al., 2008; Tononi et al., 1998), learning (Buiatti et al., 2009; Henin et al., 2021) or language processing (Ding et al., 2016). Crucially, the SS-EP technique offers several advantages for studying infants (see Box 1). First, it complies with their limited attentional resources. While in traditional ERPs, stimuli are presented as discrete events, interspaced with long inter-stimulus intervals, SS-EP experiments allow for a continuous presentation of a high number of events in a short amount of time. For example, to observe the various time-locked waveforms in the ERP, stimulation rates are typically in the range of 1–2 Hz, whereas SS-EP stimulus repetitions are typically in the range of 3–15 Hz. Second, this technique is both objective and predictive: the SS-EP response is restricted to a specific narrow frequency band corresponding to the exact stimulus frequency, in contrast to ERP analyses, for which the polarity and/or timing of the expected effect is often underspecified. 
Third, due to its sharp spectral definition, the SS-EP response can easily and robustly be segregated from artifacts (e.g., blinks or eye movements) and spontaneous background activity, both of which are uncorrelated with the rhythmic stimulation and typically spread out across the entire frequency spectrum. In this article, we first introduce some technical and conceptual prerequisites which underlie the SS-EP technique. We then provide an overview of the existing SS-EP literature in infancy research, to illustrate its practical advantages for studying developmental populations. We additionally report some inspiring uses of the technique in the M/EEG adult literature to further illustrate its conceptual richness. Finally, we end with a practical description of the design and analysis of an example SS-EP study.

Definitions and measurements

Time and frequency domains

Brain function arises from the cooperative generation of a multitude of electrochemical currents within and between neurons, inducing electric variations at the millisecond scale. EEG signals capture some of these electrical dynamics as time series of voltage changes recorded from scalp electrodes. Crucially, it is possible to represent this time-resolved signal, without any loss of information, in terms of a set of oscillating voltages at various frequencies. This is achieved by means of Fourier analysis, a mathematical process named after the French mathematician Joseph Fourier, who demonstrated that any waveform can be expressed as the sum of a set of sine waves of specific frequencies, amplitudes and phases (Fig. 1B). This theorem establishes a mathematical equivalence between the time domain, in which the signal is represented as a waveform that changes in amplitude over time (Fig. 1A) and the frequency domain in which the signal is represented as a spectrum of amplitudes and phases that change with frequency (Fig. 1C).
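For a digitized signal, the equivalence between the two domains can be written compactly with the discrete Fourier transform. The notation below is an assumption introduced for illustration and is not taken from the article: x[n] denotes the N voltage samples of one electrode, and X[k] the complex coefficient of the k-th frequency bin.

```latex
% Time domain -> frequency domain (analysis)
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i\, k n / N}, \qquad k = 0, \dots, N-1
% Frequency domain -> time domain (synthesis, no loss of information)
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{2\pi i\, k n / N}
% Amplitude (up to a scaling convention) and initial phase of component k
A_k \propto \lvert X[k] \rvert, \qquad \varphi_k = \arg X[k]
```

The second line is what makes the decomposition lossless: summing all components with their amplitudes and phases returns the original samples.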

Fig. 1.

Frequency decomposition and steady-state measurements. A – Time-domain representation of a simulated EEG signal, constructed as the sum of 5 pure sinusoids. B – The Fourier transform decomposes the signal into a set of pure sinusoids, each characterized by an amplitude A and an initial phase φ. Each time-resolved sinusoid, or spectral component, (on the left) can be represented as a complex number depicted in the complex plane (polar plots on the right). The amplitude of the sinusoid corresponds to the distance of the point from the origin (absolute value). The initial phase of the sinusoid corresponds to the angle of the point from the horizontal axis (argument). C – In the frequency domain, the power spectrum (top) represents the squared amplitude of each spectral component, and the phase spectrum represents the initial phase of each spectral component. D – Power spectrum of real EEG recordings, exhibiting a typical inverse power law distribution (1/f). A 6 Hz steady-state response is visible in the spectrum as a sharp narrow-band peak. Background noise is often quantified as the average activity over neighboring frequency bins. E – For each frequency, phase coherence captures the variance of the distribution of phases across epochs. Polar plots represent phase distributions at the stimulation frequency (in red, high phase coherence) and at a non-stimulated frequency (in black, low phase coherence). F – Examples of SNR (top) and ITC (bottom) plots.

The fast Fourier transform (FFT) is a popular and computationally efficient algorithm for converting a digitized signal from the time domain to the frequency domain. Because a minimum of two samples per cycle is required to detect the time course of a sinusoid, the upper frequency limit depends on the sampling rate. The frequency spectrum of a digitized signal thus extends from 0 Hz up to one half of the digitization rate, a limit known as the Nyquist frequency. The resolution of the spectrum, that is, the frequency difference between two adjacent frequency bins, corresponds to the reciprocal of the duration of the signal analyzed. For example, for a 2-s EEG epoch digitized at 250 Hz, the FFT decomposes the signal into a set of frequencies at a resolution of 0.5 Hz from 0 Hz to 125 Hz. The FFT outputs a set of complex numbers, or Fourier coefficients, which contain information about the phases (argument of the coefficients) and amplitudes (absolute value of the coefficients) associated with the successive frequencies (between 0 Hz and the Nyquist frequency) embedded in the signal (Fig. 1B). The EEG waveform can then be reconstructed by summing up all of these sine waves. The phase at each frequency represents the point in the cycle of the sine wave at the beginning of the analysis window. In other words, it contains the timing information of the signal. The amplitude characterizes the strength of the oscillation at each frequency. The amplitude associated with the lowest frequency, 0 Hz, represents the mean of the signal. The frequency content of a signal is most commonly represented by its amplitude or power (squared amplitude) spectrum (Fig. 1C-top).
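The 2-s, 250 Hz example above can be checked numerically. The following is a minimal sketch using NumPy; the test signal (a 6 Hz cosine of amplitude 2 on a constant offset of 1) and the single-sided amplitude scaling are assumptions chosen for illustration, not values from the article.

```python
import numpy as np

fs = 250.0        # sampling rate in Hz (from the example in the text)
duration = 2.0    # epoch length in seconds (from the example in the text)
n = int(fs * duration)
t = np.arange(n) / fs

# Hypothetical test signal: offset 1, plus a 6 Hz cosine of amplitude 2, phase pi/4.
signal = 1.0 + 2.0 * np.cos(2 * np.pi * 6.0 * t + np.pi / 4)

coeffs = np.fft.rfft(signal)             # complex Fourier coefficients
freqs = np.fft.rfftfreq(n, d=1.0 / fs)   # frequency of each bin in Hz

resolution = freqs[1] - freqs[0]   # equals 1 / duration
nyquist = freqs[-1]                # equals fs / 2 because n is even

# Single-sided amplitude: divide by n, then double every bin except 0 Hz and Nyquist.
amplitude = np.abs(coeffs) / n
amplitude[1:-1] *= 2.0
phase = np.angle(coeffs)           # initial phase (cosine convention), in radians

peak_bin = int(np.argmax(amplitude[1:])) + 1   # strongest non-DC component
print(resolution, nyquist, freqs[peak_bin], amplitude[peak_bin], amplitude[0])
```

With this scaling, the 0 Hz bin holds the mean of the signal and the 6 Hz bin recovers the amplitude and initial phase of the cosine. Other scaling conventions exist, so amplitudes should only be compared across spectra computed the same way.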

Rhythmic stimulation and EEG brain responses

Brain activity can thus be described in terms of rhythms, and thanks to Fourier analysis, the strength and timing of these rhythms can be mathematically computed. While in typical ERP experiments, cognition is assessed by presenting discrete and isolated stimulus events, each of which elicits a transient response evaluated in the time domain, the SS-EP method consists of repeatedly presenting stimuli at a regular rate, which elicits a periodic neural response at that rate. In other words, the brain is entrained by the rhythmic stimulation, and reaches a steady-state mode in which the frequency content of the neural response is stable over time and directly related to the frequency content of the stimulation. For slow presentation rhythms, the evoked steady-state response will consist of the successive transient responses, phase-locked to the stimulation. For faster rhythms, however, brain responses to individual stimuli start to overlap in time, and the overall response reaches its steady-state mode after some time. In both cases, the evoked activity is periodic, resonating at the stimulation frequency f. Moreover, depending on the properties of the transient response and on the non-linearities of the processing stream, the evoked neural response can additionally show energy at higher harmonic frequencies (i.e., exact integer multiples of the stimulus frequency: 2f, 3f and so on; see Zhou et al., 2016). The recorded steady-state response can thus be conveniently evaluated in the frequency domain (see Section 2.3). However, the exact nature of the neurophysiological mechanisms that give rise to the recorded rhythmic brain response remains a controversial issue. The SS-EP response was initially conceptualized as exclusively arising from the (nonlinear) superimposition of the successive transient responses to the individual stimuli of the stimulation stream (Capilla et al., 2011; Dawson, 1954; Keitel et al., 2014; D. Regan, 1965). 
In other words, the rhythmic sensory input elicits the regular repetition of impulse-like evoked responses, resulting in the recorded periodic neural activity. Over the past decades, a richer interpretation has gained popularity, which assumes that steady-state responses can additionally include the contribution of endogenous rhythmic neural generators synchronized to the exogenous rhythmic stimulation (Calderone et al., 2014; Makeig et al., 2002; Notbohm et al., 2016). Contrary to the former interpretation which does not make any mechanistic assumption about the underlying neural process, this more recent perspective posits that neural encoding relies on populations of neurons whose resting-state firing activity naturally fluctuates around an intrinsic frequency defined by the neurobiology of the neural ensemble. Those endogenous oscillators can then slightly shift their firing frequencies and/or phase to align with the external stimulation, in order to facilitate information processing (Giraud and Poeppel, 2012; Hyafil et al., 2015; Schroeder and Lakatos, 2009). This model is particularly attractive as it offers a mechanistic framework to explore both the neural underpinnings of cognition and the functional role of neural oscillations (see section 4.3). The oscillatory framework has proven productive to explore many cognitive processes ranging from attention selection (Lakatos et al., 2008) to predictive processes (Kösem et al., 2014), speech parsing (Hyafil et al., 2015) or music perception (Doelling and Poeppel, 2015). 
While there is substantial evidence that endogenous neural oscillations are indeed involved in the response to certain rhythmic input, in practice, it remains particularly challenging to assess whether a steady-state response exclusively reflects a series of evoked responses, or whether it also includes the action of an endogenous neural oscillator entrained to the rhythm of the exogenous stimulation (Doelling et al., 2019; Lerousseau et al., 2021; Zoefel et al., 2018). The answer may depend upon the brain region and the stimulation frequency, and may be modulated by the maturational age of the participants. Yet, studies often implicitly posit that a rhythmic brain response reflects an underlying endogenous neural oscillator. When conducting a steady-state study, it is thus important to explicitly state and justify whether and why it is assumed that endogenous oscillatory processes are involved in the generation of the expected steady-state response. Obleser and Kayser (2019) propose a distinction between (neural) entrainment in the narrow sense, which implies an endogenous neural oscillator playing a functional role in the processing of the external stimulation, and the term (neural) tracking, or neural entrainment in the broad sense, which refers to the mere alignment of the brain signal to the external stimulation, irrespective of the generating mechanism. In this article, the term entrainment is used in the broad sense. Regardless of this important conceptual delineation, rhythmic stimulation and steady-state responses remain a versatile tool to investigate brain function and cognitive processes.
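The superposition account described in this section can be illustrated with a small simulation. The sketch below only shows that repeating an overlapping transient at a fixed rate yields a spectrum confined to the stimulation frequency and its harmonics; it does not model endogenous oscillators and cannot adjudicate between the two interpretations. The transient shape (a damped 10 Hz oscillation), the 4 Hz rate, and all other values are hypothetical choices.

```python
import numpy as np

fs = 500.0         # sampling rate in Hz (assumed)
stim_rate = 4.0    # stimulation frequency f in Hz (assumed)
duration = 10.0    # seconds; holds an integer number of stimulation cycles
n = int(fs * duration)

# Hypothetical transient response to one stimulus: damped 10 Hz oscillation, 0.4 s long.
# It outlasts the 0.25 s stimulation period, so successive responses overlap.
kernel_t = np.arange(int(0.4 * fs)) / fs
kernel = np.exp(-kernel_t / 0.08) * np.sin(2 * np.pi * 10.0 * kernel_t)

# Stimulus onsets as a train of unit impulses, one every 1 / stim_rate seconds.
onsets = np.zeros(n)
onsets[:: int(fs / stim_rate)] = 1.0

# Circular convolution (via the FFT) sums the overlapping transients and makes the
# simulated response exactly periodic, i.e. already in its steady-state mode.
response = np.real(np.fft.ifft(np.fft.fft(onsets) * np.fft.fft(kernel, n)))

amplitude = np.abs(np.fft.rfft(response)) / n
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

def amp_at(f_hz):
    """Amplitude at the frequency bin closest to f_hz."""
    return amplitude[int(np.argmin(np.abs(freqs - f_hz)))]

bins_per_harmonic = int(round(stim_rate * duration))   # bin spacing between harmonics
off_harmonic = np.arange(freqs.size) % bins_per_harmonic != 0
print(amp_at(stim_rate), amp_at(2 * stim_rate), amplitude[off_harmonic].max())
```

In this linear simulation the harmonics come purely from the non-sinusoidal shape of the repeated transient; non-linearities in the processing stream, as mentioned in the text, are a separate source of harmonic energy that is not modelled here.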

Measurements

The frequency-domain signal processing approaches for detecting a steady-state response can be subdivided into two categories based on whether they rely on amplitude/power spectrum or phase coherence computation. First, the most common and perhaps most intuitive approach derives from the inspection of the signal power spectrum in which steady-state responses are qualitatively observed as narrowband peaks at the stimulation frequency and higher harmonic frequencies (Fig. 1D). The strength of the peak is then quantified as a signal-to-noise ratio (SNR) between the power recorded at the stimulation frequency and some estimate of the background spectral noise. The higher the SNR, the stronger the steady-state response. For example, background spectral noise can be estimated as the average power at frequencies surrounding the response frequency (Fig. 1D), provided these surrounding frequencies do not contain any harmonically related frequencies (M. P. Regan and Regan, 1989; Zurek, 1992). Alternative noise estimates are further discussed in Section 4 of this paper. The second approach for quantifying SS-EPs focuses on the phase information of the signal, and consists of quantifying phase coherence, also denoted as phase locking. It relies on the fact that an SS-EP response by definition is precisely and steadily synchronized to the rhythmic stimulation. As a result, the phase of the recorded neural response at the stimulation frequency will be fairly stable from one stimulation cycle to the next, and thus phase coherence will be high across stimulation cycles. Conversely, if the recorded neural activity is not synchronized to the external stimulation, the phase will vary randomly across stimulation cycles, and phase coherence will be low. Several implementations of phase coherence have been proposed, some relying solely on the phase information computed by the FFT of the signal (Jerger et al., 1986; D. R. Stapells et al., 1987), others incorporating both phase and amplitude information (Dobie and Wilson, 1989, 1995).
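Both families of measures can be sketched in a few lines. The code below is a hedged illustration on simulated data, not the authors' pipeline: the function names, the number of neighboring bins (10 per side, skipping the immediately adjacent bin), and the simulation parameters are assumptions. Phase coherence is computed here across epochs, as in Fig. 1E, using only the phase of the Fourier coefficients.

```python
import numpy as np

def snr_spectrum(power, n_neighbors=10, n_skip=1):
    """Power in each bin divided by the mean power of neighboring bins on both sides."""
    snr = np.full(power.shape, np.nan)   # edge bins without enough neighbors stay NaN
    margin = n_skip + n_neighbors
    for k in range(margin, len(power) - margin):
        low = power[k - margin : k - n_skip]
        high = power[k + n_skip + 1 : k + margin + 1]
        snr[k] = power[k] / np.mean(np.concatenate([low, high]))
    return snr

def phase_coherence(epochs):
    """Length of the mean unit phasor across epochs, per frequency (0 = random, 1 = locked)."""
    coeffs = np.fft.rfft(epochs, axis=1)
    unit_phasors = coeffs / np.abs(coeffs)   # discard amplitude, keep phase only
    return np.abs(unit_phasors.mean(axis=0))

# Simulated recording (assumed values): 40 epochs of 2 s at 250 Hz,
# a weak 6 Hz response with a fixed phase buried in white noise.
rng = np.random.default_rng(0)
fs, duration, n_epochs, stim_hz = 250.0, 2.0, 40, 6.0
t = np.arange(int(fs * duration)) / fs
epochs = 0.5 * np.sin(2 * np.pi * stim_hz * t) + rng.standard_normal((n_epochs, t.size))

freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
power = np.mean(np.abs(np.fft.rfft(epochs, axis=1)) ** 2, axis=0)
snr = snr_spectrum(power)
itc = phase_coherence(epochs)

stim_bin = int(np.argmin(np.abs(freqs - stim_hz)))
print(snr[stim_bin], itc[stim_bin])
```

With a 0.5 Hz resolution, the chosen neighborhood around 6 Hz spans 0.5 to 5 Hz and 7 to 11.5 Hz, so it excludes both the 0 Hz bin and the 12 Hz harmonic. In a real analysis the neighborhood would need to be adapted to the frequency resolution and to the harmonics expected in the design.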

SS-EP in infancy research

Historically, the SS-EP approach was first introduced by David Regan in the 1960s, as a complementary method to the traditional time-domain signal averaging used for enhancing the signal-to-noise ratio of visual evoked potentials (D. Regan, 2009). The method initially aimed at measuring “the characteristics of the observer’s color vision by purely electrophysiological means, without the use of any psychophysical measurements” (D. Regan, 1965). Together with a growing research community, Regan rapidly enlarged the scope of SS-EP applications beyond color vision, to explore other aspects of the visual system (for a review see (Norcia et al., 2015; Vialatte et al., 2010)). Meanwhile, auditory SS-EPs (often denoted as auditory steady-state responses - ASSRs) were first described in detail by Galambos et al. (1981), as a candidate technique to test the integrity of auditory pathways, closely followed by several other groups around the globe (Kuwada et al., 1986; Rees et al., 1986; Rickards and Clark, 1984; D. Stapells et al., 1984). The practical advantages of the technique (see Box 1 for a discussion on the advantages and limitations of the technique), in providing rapid and robust neural responses, soon triggered the interest of developmental scientists to investigate low-level vision and audition, yielding some important theoretical results as well as practical screening applications to detect early visual and auditory impairments. To a lesser extent, SS-EPs were also used with infants to investigate higher-level cognition. We review in this section the use of SS-EPs with pediatric populations when the rhythmic stimulation was in the visual or auditory modality.

The visual modality

Inspired by the work of Regan, a few research groups started in the 1980s to adapt ingenious SS-EP protocols to provide rapid and objective measures of visual function in pediatric populations. Among these, the sweep visual evoked potential (VEP) (Norcia and Tyler, 1985) and the steady-state orientation-reversal VEP (O. J. Braddick et al., 1986) were used to assess visual acuity and orientation selectivity, respectively. In the sweep VEP (Norcia and Tyler, 1985), a visual stimulus is flashed on a screen at a fixed frequency while one of its physical attributes is parametrically varied, or ‘swept’, across a range of values during a few tens of seconds. For example, visual acuity can be rapidly assessed at the individual level by sweeping the spatial frequency of visual gratings from low to high. SS-EP amplitude is computed at several points throughout the sweep and visual acuity is estimated as the highest spatial frequency at which a significant SS-EP response is recorded. In their seminal study, Norcia and Tyler (1985) used the sweep-VEP to track the developmental trajectory of visual acuity during infants’ first year of life, and showed that visual acuity reaches adult levels at about 8 months. Since then, the sweep-VEP procedure has been successfully adapted to reliably assess other aspects of low-level vision in infants, such as Vernier acuity (Skoczenski and Norcia, 1999) or contrast sensitivity (Norcia et al., 1989) (for a review on the sweep-VEP see Almoqbel et al. (2008)). In the orientation-reversal VEP (O. J. Braddick et al., 1986), both the spatial phase and the orientation of visual gratings change, but at two different rates. This dual frequency-tagging allows for isolating an orientation-specific response, corresponding to the orientation-reversal rate of the stimulation, over and above brain responses to local changes in luminance and contrast induced by the rhythmic phase alternations. Using this procedure, Braddick et al. (1986) demonstrated that newborns do not exhibit any orientation-specific response. Their results showed that orientation selectivity develops rapidly during the first two postnatal months. The procedure was thereafter adapted to objectively study the ontogeny of several other aspects of visual function such as motion sensitivity (O. Braddick et al., 2005) or binocular vision (O. Braddick et al., 1980, 1983). These electrophysiological measures of early visual functions were shown to correlate particularly well with behavioral estimates of visual acuity, obtained using preferential looking techniques (Allen et al., 1992; Sokol et al., 1988). SS-EP was even shown to provide better visual acuity estimates than behavioral assessments, with higher test-retest reliability (Dobson and Teller, 1978; Polevoy et al., 2017). 
Only from the 2010s did SS-EPs emerge as a promising technique to address more complex cognitive processes, with a focus on the processing of high-level visual stimuli such as faces and objects. Farzin et al. (2012) recorded infants’ neural responses to a stream of either face or object images presented at a fixed rate, and reported that the two categories elicited different scalp topographies. A comparable design was recently used with newborn infants, contrasting newborns’ responses to slowly contrast-modulated facelike and non-facelike schematic patterns (Buiatti et al., 2019). Within only a few minutes of stimulation, results showed that facelike patterns elicited stronger SS-EPs, already recruiting a right-lateralized cortical network (Fig. 2A). In another ingenious design using a dual frequency-tagging paradigm, de Heering and Rossion (2015) used a stream of face and object images presented at a fixed base rate of 6 Hz, with face images appearing every 5th item, i.e. at a fixed rate of 1.2 Hz (Fig. 2B). 
Importantly the specific tokens in each category were never repeated, such that only high-level content was periodically modulated at 1.2 Hz. After only a few minutes of recording, significant activity was observed at this f/5 frequency, over right occipito-temporal sites in 4- to 6-month-old infants, reflecting the face categorization response (Fig. 2B). This paradigm (termed the fast periodic oddball paradigm) was thereafter adapted to investigate other categorical responses (Bertels et al., 2020; Peykarjou et al., 2017), as well as individual-level discrimination of unfamiliar faces (Barry-Anwar et al., 2018). More recently, studies used similar fast periodic oddball paradigms while infants were exposed – or not – to their maternal odor to investigate whether and how olfactory information shapes high-level vision. Results demonstrated that maternal odor enhanced face-selective neural processing (Leleu et al., 2020), and possibly triggered infant’s interpretation of facelike stimuli as faces (Rekow et al., 2021).
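The timing logic of the fast periodic oddball design can be made explicit with a short sketch. It reproduces only the schedule reported for de Heering and Rossion (2015), a 6 Hz base rate with a face as every fifth item; the sequence length, the labels "F" and "O", and the variable names are hypothetical, and no stimulus selection or presentation code is implied.

```python
import numpy as np

base_rate = 6.0      # images per second (base frequency, from the text)
oddball_every = 5    # a face appears as every 5th item (from the text)
n_items = 60         # hypothetical sequence length: 10 s of stimulation

# Category label of each item: "O" for object, "F" for face.
labels = np.array(["F" if (i + 1) % oddball_every == 0 else "O" for i in range(n_items)])
onsets = np.arange(n_items) / base_rate      # onset time of each image in seconds

oddball_rate = base_rate / oddball_every     # frequency tagged to the face category
face_intervals = np.diff(onsets[labels == "F"])

print("".join(labels[:10]))   # OOOOFOOOOF
print(oddball_rate)           # 1.2
```

Because 6 Hz is the fifth multiple of 1.2 Hz, every fifth harmonic of the category rate coincides with the base rate or one of its harmonics, which is worth keeping in mind when attributing spectral peaks to one process or the other.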

Fig. 2.

Examples of infant SS-EP studies. A – Schematic patterns are contrast-modulated at a fixed rate (0.8 Hz), eliciting a steady state response in newborns. A significant difference between upright and inverted facelike patterns emerges in the power spectrum at 0.8 Hz over two clusters of electrodes (occipital and frontal). Adapted from Buiatti et al. (2019). B – Face (F) and object (O) stimuli are presented at 6 Hz to 4–6-month-old infants, with face stimuli appearing every fifth image, i.e. at 1.2 Hz. A clear peak associated to the face categorization response appears in the SNR plot at 1.2 Hz, over the right-lateralized electrode P8. Adapted from de Heering and Rossion (2015). C – Amplitude modulated sounds are constructed by varying the amplitude of a carrier signal (e.g. 500 Hz pure tone, in blue) using a rhythmic modulation signal (e.g. 25 Hz sinusoid in red). The strength of the auditory SS-EP amplitude varies as a function of modulation frequency. In infants, SS-EPs decrease with increasing frequency, while adult SS-EPs show a large enhancement around 40 Hz. Adapted from Stapells et al. (1988), with permission from Elsevier. D – Tri-syllabic non-sense words embedded in a continuous speech stream are presented to 8-month-old infants. The rhythmic presentation of syllables elicits a large steady-state response at syllable presentation rate. As infants progressively discover the systematic dependencies between the first and last syllables of the tri-syllabic words, another steady-state response emerges at word presentation rate. Adapted from Kabdebon et al. (2015).

SS-EPs have also been used in a more indirect way to investigate how low-level steady-state visual responses driven by flickering luminance can be modulated by various cognitive processes. Using this approach, Robertson et al. (2012) explored how 3-month-old infants deploy attention to a complex visual scene, and how attention modulates SS-EPs. In this study, three visual objects tagged at different frequencies were presented simultaneously, while infants’ looking behavior and neural activity were recorded. Results showed that SS-EPs provided a measure of covert attention, as the SS-EP response to the soon-to-be fixated object increased shortly before gaze redirection. In a series of experiments, Christodoulou et al. (2018) demonstrated that infants’ sustained attention during habituation and recovery could be measured using SS-EPs. Importantly, they showed that SS-EPs yielded more robust results than looking time measures. Recent studies also explored whether and how low-level SS-EP responses can be modulated by surprising events (Kabdebon and Dehaene-Lambertz, 2019; Köster et al., 2019).

The auditory modality

Because acoustic sounds consist of mechanical vibrations and the cochlea processes these vibrations as a Fourier analyzer, SS-EPs stand out as an especially promising technique to explore auditory function. As introduced by Galambos et al. (1981), the integrity of auditory pathways can be objectively assessed by purely electrophysiological means. Indeed, when presenting a continuous acoustic high-frequency vibration (the carrier frequency, e.g., 500 Hz), modulated in amplitude at a lower frequency (the modulation frequency, e.g., 25 Hz, see Fig. 2C), the auditory cortex resonates at the modulation frequency. If a steady-state response is recorded at the modulation frequency, it provides unequivocal evidence that the cochlea transforms the carrier frequency tone into a neural signal which is efficiently transmitted along the brainstem auditory pathway. In their seminal study, Galambos et al. (1981) demonstrated that the adult auditory cortex resonates strongly for modulation frequencies around 40 Hz, rendering steady-state measurements particularly robust. Importantly, they also reported that the strength of the auditory steady-state response (auditory SS-EP, often denoted as ASSR or SSR) was directly linked to behavioral hearing sensitivity associated with the carrier frequency. These early results triggered a large body of research in the field of applied audiology to develop early screening procedures for hearing loss in pediatric populations. However, infants’ immature auditory cortices show a different resonance pattern than adults’ mature auditory cortices (Fig. 2C). Rhythmically modulated sounds were found to elicit much weaker steady-state responses in infants (D. R. Stapells et al., 1988). In addition, the amplitude of the infant ASSR decreases with increasing modulation frequency (as does the adult response), but unlike adults, it does not show any enhancement around 40 Hz (Levi et al., 1993; Riquelme et al., 2006). 
This discrepancy arises from the fact that the underlying neural generator of the 40 Hz response (as well as of responses to lower modulation frequencies) is located in the auditory cortex (Hari et al., 1989; Herdman et al., 2002; Ross et al., 2005). This cortical region has a protracted maturation, which implies delayed temporal processing (Adibpour et al., 2020; A. Chen et al., 2016; Shafer et al., 2015; Wunderlich and Cone-Wesson, 2006), such that it does not support steady-state responses at high rates. Besides, as with any cortical source, the recorded activity is strongly affected by the arousal state of the subject (L. T. Cohen et al., 1991), which is problematic for screening newborns and young infants who are typically asleep. Above 50 Hz, however, the brainstem becomes the main source of entrained activity (Cone-Wesson et al., 2002; Hari et al., 1989; Herdman et al., 2002), and efficient steady-state responses can be recorded from sleeping adult and infant participants (L. T. Cohen et al., 1991; Rickards et al., 1994). These auditory brainstem steady-state responses (ABR) are less impacted by arousal state and developmental factors (Pethe et al., 2004), and they are reliably detectable even at high frequencies (e.g., 80 Hz) in individual newborns and sleeping infants (Levi et al., 1993, 1995; Rickards et al., 1994; Riquelme et al., 2006). Auditory SS-EPs have thus become an important component of the test battery for evaluating newborns and young infants with hearing impairment (Korczak et al., 2012; Picton, 2010; Rance, 2008). The technique allows audiologists to stimulate various frequency regions along the cochlea (depending on the properties of the carrier signal), and automatically assess the integrity of the auditory pathways up to the central nervous system. 
Thanks to the spectral specificity of the steady-state response, more advanced screening protocols allow for simultaneously testing multiple carrier frequency tones, each associated with a different modulation frequency, presented in either one or both ears (Hatton and Stapells, 2011; Lins and Picton, 1995). With recent technological and analytical improvements, the technique is still under development (Sininger et al., 2018, 2020) and many opportunities remain to extend SS-EP applications in audiology. SS-EPs have also been used to track the emergence and development of auditory cortical function in very early infancy. In a recent study using fetal magnetoencephalography, Niepel et al. (2020) reported auditory steady-state responses in human fetuses. They were exposed to 500 Hz tones modulated in amplitude at 27 Hz or 42 Hz. The authors reported enhanced phase coherence at 27 Hz, but not at 42 Hz, consistent with previous work showing that the immature brain cannot sustain fast rhythms. In a much slower frequency range, Daneshvarfard et al. (2019) presented preterm infants aged 29 to 34 gestational weeks with repetitive syllabic stimuli at a 1.6 Hz rate, and found that both phase coherence and spectral amplitude at the stimulation frequency increased with gestational age. They additionally observed a rightward lateralization of the steady-state response. These results demonstrate that during the last trimester of pregnancy when thalamocortical fibers are still growing into the cortical plate (Kostović and Judaš, 2015), the human auditory network is already functional and able to process slow modulation rhythms. Beyond low-level auditory function, a few studies additionally investigated infants’ auditory statistical learning abilities using SS-EP (Choi et al., 2020; Fló et al., 2022; Kabdebon et al., 2015), extending earlier adult studies. In these studies, participants were presented with a continuous stream of syllables presented at a fixed rate. 
The syllable stream actually consisted of nonsense tri-syllabic words defined by transitional probabilities between syllables (Fig. 2D). If participants picked up on these statistical dependencies, they should have progressively discovered the nonsense words, and a steady-state response should emerge at the f/3 word presentation rate. Using this approach, Choi et al. (2020) demonstrated that 6-month-old infants can track statistical dependencies between adjacent syllables, and Kabdebon et al. (2015) showed that 8-month-old infants are sensitive to statistical dependencies between non-adjacent syllables (Fig. 2C). Finally, SS-EPs were used to investigate beat and meter perception in 7- and 15-month-old infants, and their relation to previous musical experience (Cirelli et al., 2016).

Summary

Overall, this review of infant SS-EP studies demonstrates the critical advantages that characterize the technique for developmental research: (1) an objective identification of the expected neural responses elicited at the exact frequency defined by the experimental design, (2) highly robust SS-EP responses, concentrated in a narrow frequency band, and (3) short recording durations compatible with infants’ limited attentional span (see box 1 for a larger discussion on the strengths and limitations of the technique for infant research). In the next section, we draw on the adult literature to further emphasize the diversity of the cognitive processes that can be addressed with SS-EP studies, which opens up exciting perspectives for developmental research.

Perspectives

The vast majority of the developmental SS-EP literature reviewed in the previous section focuses on vision and audition, studying relatively low-level processes – with only a few notable exceptions. By contrast, in the adult literature, the SS-EP technique has been used to investigate other sensory modalities, more diverse cognitive processes, and across a variety of imaging techniques. Although developmental research has constraints of its own and cannot consist of merely transposing adult-oriented questions and methods to infants, we believe that researchers studying infant development can draw insights from this adult literature to develop robust and innovative experimental approaches, as well as novel lines of inquiry for studying the infant mind. In order to illustrate its conceptual richness, this section highlights some inspiring new uses of the SS-EP technique for studying higher-level cognitive processes.

An electrophysiological correlate of attention

One of the primary applications of SS-EPs has been in the study of attention (Andersen et al., 2011). In a seminal study, Morgan et al. (1996) elegantly demonstrated that visual attentional processes modulate the strength of SS-EP brain responses in adult participants: attended flickering stimuli elicited enhanced steady-state responses compared to unattended stimuli. These results were replicated and extended in the auditory (Ross et al., 2004), and somatosensory (Giabbiconi et al., 2004) modalities, as well as in multisensory settings (Colon et al., 2015; Covic et al., 2017; Porcu et al., 2013). Just like fluctuations of attention in young infants can be monitored through looking time behavior, the level and focus of attention can be continuously tracked over space and time using the SS-EP technique. Thanks to the robustness of the elicited response, SS-EPs are also commonly used in brain-computer interface settings (X. Chen et al., 2015). By simply superimposing a continuous flicker over the display of a visual task, it was possible to monitor changes in adult participants’ attentional load, and dissect the successive cognitive processes induced by a working memory task (Ellis et al., 2006; Perlstein et al., 2003; Silberstein et al., 2001, 2003). Another approach consists in presenting multiple stimuli tagged at different frequencies, each driving an SS-EP at its respective rate. Variations around this procedure allowed for investigations of many aspects of attention. For example, the speed of spatial shifts of attention was characterized by analyzing the time course of SS-EP amplitudes following attention-directing cues (Andersen and Muller, 2010; Hindi Attar et al., 2010). The procedure was also adapted to explore the spatial distribution of visual attention by using concentric arrangements of flickering items around a centrally fixated flicker (Müller et al., 2003; Müller and Hübner, 2002), suggesting that the spotlight of attention is shaped like a donut. 
Using two superimposed fields of moving dots flickering at two different frequencies, SS-EPs further demonstrated that attention can be deployed to track distributed features, independently of spatial attention (Andersen and Muller, 2010; Muller et al., 2006).

Tagging high-level cognition

SS-EPs offer further opportunities to tag increasingly complex cognitive processes, and investigate their neural underpinnings. A major application of the SS-EP technique has focused on high-level vision, by presenting participants with a high-frequency periodic stream of visual stimuli in which high-level content is periodically modulated at a lower frequency. For example, the neural underpinnings of face identification can be isolated by modulating high-level features such as facial identity or facial expression (Rossion and Boremanse, 2011; Zhu et al., 2016). The fast-periodic oddball paradigm (described in section 3.1) can be used to directly tag the contrasting categorical response between two sets of stimuli, using a single stimulation stream. Using this paradigm with adult participants, it was possible to identify categorical responses to faces (Liu-Shuang et al., 2014; Rossion, 2014), tools (De Keyser et al., 2018), and letters (Lochy et al., 2016), but also words (Lochy et al., 2015), semantic categories (Stothart et al., 2017) or numerical quantities (Guillaume et al., 2018; Park, 2018). Importantly, the SS-EP technique proved particularly versatile to investigate face perception across a variety of neuroimaging techniques, including M/EEG (Rossion, 2014), intracranial recordings (Jonas et al., 2014, 2016; Rossion et al., 2018) and even fMRI (Gao et al., 2018). More recently, ingenious SS-EP paradigms were developed to investigate cognitive and neural integration mechanisms. In such designs, different elements of the stimulus are modulated at slightly different frequencies (e.g., f1 and f2), inducing peaks in the power spectrum not only at the fundamental modulation frequencies and their harmonics (e.g., f1, f2, 2f1, 2f2, etc.), but also at sums and differences of non-zero integer multiples of the input frequencies (e.g., f1 + f2, f1 − f2, 2f1 + f2, etc.). 
Crucially, these intermodulation components provide direct evidence for the non-linear integration of the brain signals driven by the differentially modulated elements of the stimulation (for a full review see (Gordon et al., 2019)). Intermodulation analyses are typically used to shed light on several aspects of perceptual binding. For example, visual “pacman” inducers can be frequency tagged to elicit an intermodulation response to the illusory percept of a Kanizsa square (Alp et al., 2016). Boremanse et al. (2013) used split face images flickering at different frequencies to isolate intermodulation components reflecting integrated – or holistic – face percepts. In a recent study, Adibpour et al. (2021) used a similar approach with interacting bodies to probe the emergence of a neural representation for social interactions. This approach can also be used across sensory modalities to investigate multisensory integration (Giani et al., 2012). High-level auditory processing has also been productively and creatively investigated using SS-EPs, with two main fields of inquiry: music and speech processing. SS-EPs are indeed a particularly useful tool to explore musical rhythm perception, and address how it gets impacted by various factors (for a review see (Nozaradan, 2014; Nozaradan et al., 2018)). While beat and meter percepts can be induced by simple metronomic pulses, they can also arise at frequencies that are not even present in the acoustic input in response to more complex rhythmic patterns, and/or as a result of top-down processes. SS-EP analyses allow for the capture of these percepts directly and objectively from the brain response. For example, Nozaradan et al. (2011) presented participants with a musical beat and asked them to imagine a binary or a ternary meter on this beat. They were able to record not only the entrained response at the beat frequency but also a neural response at the imaginary meter frequency. 
Others used a similar approach to explore how musical rhythm perception is influenced by various factors, including the nature of the sound (Lenc et al., 2018), body movements (Chemin et al., 2014), musical expertise (Stupacher et al., 2017) or development (Cirelli et al., 2016). This approach opens up exciting opportunities to investigate early music perception and its development. Another especially productive line of research actually derives from seminal infant studies on statistical learning (Saffran et al., 1996). Participants are exposed to a continuous syllable stream in which non-sense words are embedded, as defined by the co-occurrence patterns between syllables. If participants detect these statistical regularities, a steady-state response should emerge at the word presentation rate. Such artificial grammars proved particularly useful to directly explore the cerebral bases of an online learning mechanism (Batterink and Paller, 2017, 2019; Batterink and Zhang, 2022; Buiatti et al., 2009; Elmer et al., 2021; Farthouat et al., 2017, 2018; Henin et al., 2021; Ordin et al., 2020; Ramos-Escobar et al., 2021). High-level speech processing has recently been addressed using a similar SS-EPs approach, whereby a speech stream induces high-level linguistic processes at a fixed rate. In an ingenious experimental design, Ding et al. (2016, 2017) used cleverly designed speech stimuli with a fixed word rhythm such that the acoustic signal did not contain any cues about higher-level linguistic units. Nevertheless, a hierarchy of rhythmic brain responses was recorded in response to not only the word frequency but also phrasal and sentential rhythms, reflecting the extraction of high-level linguistic chunks (Jin et al., 2020). This approach was further developed to study the impact of attention (Ding et al., 2018; Har-shai Yahav and Zion Golumbic, 2021), sleep (Makov et al., 2017) or training (Y. Chen et al., 2020) on linguistic processes.
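
To make the intermodulation logic described earlier in this section concrete, the following minimal Python sketch lists the frequencies n1·f1 + n2·f2 at which intermodulation components would be expected, with both n1 and n2 non-zero. The function name, the order cutoff (|n1| + |n2| ≤ max_order) and the example frequencies (6 and 7.5 Hz) are assumptions chosen for illustration; they are not taken from the studies cited above.

```python
def intermodulation_terms(f1, f2, max_order=3, f_max=None):
    """Map each positive frequency n1*f1 + n2*f2 to the (n1, n2) pairs producing it.

    Only pairs with both coefficients non-zero and |n1| + |n2| <= max_order
    are kept (hypothetical cutoff, chosen for this example).
    """
    terms = {}
    for n1 in range(-max_order, max_order + 1):
        for n2 in range(-max_order, max_order + 1):
            if n1 == 0 or n2 == 0:
                continue  # pure harmonics of one input are not intermodulation terms
            if abs(n1) + abs(n2) > max_order:
                continue
            freq = round(n1 * f1 + n2 * f2, 6)
            if freq <= 0 or (f_max is not None and freq > f_max):
                continue  # negative values mirror a positive term already listed
            terms.setdefault(freq, []).append((n1, n2))
    return dict(sorted(terms.items()))


# Two elements of a display tagged at 6 Hz and 7.5 Hz (illustrative values).
im_terms = intermodulation_terms(6.0, 7.5, max_order=3)
for freq, pairs in im_terms.items():
    print(f"{freq:5.1f} Hz  from (n1, n2) = {pairs}")
```

Before running a study, such a list can be compared with the harmonics of each input frequency to check that the expected intermodulation terms fall on distinct frequency bins.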

A window into underlying neural mechanisms

A particularly inspiring and promising use of SS-EPs is to take advantage of the rhythmic nature of the stimulation to explore the mechanistic underpinnings of mental processes. Neural oscillations are indeed ubiquitous across the brain and reflect rhythmic fluctuations in the membrane potentials of neurons switching between high and low excitability states (Buzsáki, 2006). These endogenous neural oscillations have been linked to numerous perceptual and cognitive functions. They have been put forward as playing a mechanistic role in several of these cognitive functions, whereby the high-excitability phase of oscillations is adjusted so as to coincide with relevant sensory input (Lakatos et al., 2019). SS-EP experiments thus represent an elegant opportunity to interact with some of these ongoing neural oscillations and inspect whether and how they effectively modulate cognition. For example, Gulbinaite et al. (2019) demonstrated that the attentional enhancement of the steady-state response initially reported by Morgan et al. (1996) actually varies across flicker frequencies, with stronger effects in the alpha and gamma frequency bands, indicating that input rhythms interact with the intrinsic neurophysiological properties of the visual and attentional networks. In an extensive body of research, Lakatos et al. (2008) demonstrated that given two competing rhythmic stimuli, the brain phase-locks to the attended stimulus, inducing an increased response gain, and enhanced behavioral performances, suggesting that oscillations mediate attentional processes. Speech processing has also been proposed to rely on the recruitment of neural oscillations (Giraud and Poeppel, 2012; Kösem and van Wassenhove, 2017). SS-EP experiments thus represent a critical tool to test and enrich such theoretical frameworks (Kösem et al., 2018; Power et al., 2012), at the interface between cognition and neurophysiology. 
Note however that rhythmic brain activity does not necessarily entail the action of an endogenous neural oscillatory process and may be driven entirely by the exogenous stimulation (see section 2.2 for a discussion on rhythmic brain responses). This overview of the adult literature emphasized the potential of SS-EPs to explore a wide variety of cognitive and neural processes. The conceptual richness of these studies together with the practical advantages of using SS-EPs with developmental populations opens up exciting and timely opportunities for investigating the developing mind.

Implementing a SS-EP study

Study design

Direct or indirect paradigm?

As illustrated in the previous literature overview, there are two dissociable experimental approaches to investigate a given cognitive process using SS-EPs. In a first approach, the experimental paradigm is designed to periodically trigger the cognitive or sensory process of interest at a fixed frequency throughout the stimulation. Importantly, the paradigm must ensure that the process of interest can be isolated from lower-level processes. For example, in Buiatti et al. (2019) a schematic facelike pattern is periodically presented to newborns, triggering facelike pattern detection processes at every cycle of the stimulation. In this study, a non-facelike pattern control condition is used to isolate the cognitive process of interest. Another prime example of direct paradigms is the fast-periodic oddball paradigm (e.g., de Heering and Rossion, 2015; described in section 3.1) in which the stimulation stream is built hierarchically, with infrequent stimuli embedded at a slow frequency rate, within a faster periodic stream of stimuli. This dual frequency design allows for dissociating the rate of the slow categorization response from that of the faster local low-level changes within a single stimulation stream. Note that the stimulation frequency(ies) used in direct paradigms must be carefully selected, in order to ensure that the brain network supporting the process of interest can sustain the driving rhythm (e.g., the 3–15 Hz frequencies used with EEG must be lowered to 0.1–0.5 Hz with fMRI; Gao et al. 2018). The second, indirect approach consists in tagging low-level sensory features (e.g., contrast) while participants are engaged in a (non-rhythmic) cognitive task, and inspecting how experimental conditions modulate the low-level sensory response. SS-EP modulations arise from top-down effects whereby the cognitive process of interest induces changes in attention or cognitive load. 
Importantly, the low-level features should remain as similar as possible across experimental conditions. For example, in Kabdebon and Dehaene-Lambertz (2019), the background of a target image was flickered at a fixed rate, and the authors reported enhanced SS-EPs when the image was expected. The distinction between these two types of experimental paradigms – direct and indirect – is crucial for the interpretation of the results.

Which stimulation frequency(ies)?

The choice of the stimulation frequency(ies) should factor in (1) the properties of the targeted neural response, (2) the properties of the endogenous background physiological noise, and (3) some practical constraints. The stimulation frequency(ies) should match the temporal dynamics of the targeted neural system(s): sensory responses typically have short latencies and can be tagged at relatively fast frequencies, but as brain responses move up the cortical hierarchy, their latency typically increases such that integrated cognitive responses are usually tagged at lower frequencies. In other words, there is a ‘sweet spot’ in the frequency domain where stimulation will elicit robust SS-EPs, and ‘blind spots’ where the stimulation will not elicit any sustained response. Critically, these limitations depend upon the neurophysiological properties of the underlying neural system, which varies with age. The frequencies yielding robust responses with adults may thus be inappropriate to elicit SS-EP responses in the infant brain, especially for higher-level cognitive processes. Ideally, the frequency tuning function of the targeted system should be defined ahead of the experiment (Alonso-Prieto et al., 2013). This, however, represents an important cost, which is not always achievable. A second factor to consider when choosing the stimulation frequency is the background brain activity. Indeed, neural recordings contain a wide range of spontaneous fluctuations (termed background noise, by contrast to the evoked signal) which are not distributed uniformly across frequencies. Instead, EEG typically exhibits an inverse power law (or 1/f) distribution: in other words, the recordings are dominated by slow fluctuations, especially in young infants (Eisermann et al., 2013; Marshall et al., 2002). Additionally, some frequency bands can show particularly strong spontaneous activity. 
This is the case for the classical alpha band (9–12 Hz in adults, 6–9 Hz in infants (Marshall et al., 2002)) over occipital sites. As a result, unless the aim of the experiment is to interact with these endogenous rhythms, it is typically recommended, whenever possible, to avoid these noisy frequency bands in order to maximize the signal-to-noise ratio at the targeted frequency. If they cannot be avoided, the signal-to-noise ratio can be increased by presenting a longer stream, with more stimulation cycles. However, it is important to note that the distribution of the background noise level is impacted by the mental state of the participant (e.g., asleep vs awake). Finally, various practical aspects of the experimental design need to be considered. First, the stimulation frequency(ies) must be compatible with the duration of the trials. A general rule of thumb is to ensure that at least two to four stimulation cycles fit within a stimulation stream. The larger the number of cycles, the more robust the SS-EP response, and the cleaner the subsequent analyses. Another potential limiting factor is the stimulus presentation device. Stimulus presentation is indeed necessarily synchronized with the device’s refresh rate, which can severely restrict the range of possible stimulation frequencies. For example, with a visual flickering stimulation, if all stimuli must have a 50/50 on/off ratio for the purpose of the experiment, then frequencies are limited to the monitor’s refresh rate divided by an even integer (e.g., 30, 15, 10 or 7.5 Hz on a 60 Hz display; but see (Andersen and Müller, 2015) for interpolation methods). Finally, if the paradigm involves multiple stimulation frequencies, one should ensure that the successive frequencies and their harmonics can be separated by a minimum of 4 to 8 frequency bins during the spectral analysis. Failure to do so will allow cross-contamination and an inflated (or deflated) estimate of the true EEG power at the targeted frequency.
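
The two practical constraints above (refresh-rate-locked flicker frequencies and a minimum separation in frequency bins) can be checked before data collection. The sketch below is a minimal illustration in Python; the function names, the 60 Hz refresh rate, the 3–15 Hz search range and the 4 s epoch are assumptions made for the example only.

```python
from fractions import Fraction


def square_wave_frequencies(refresh_hz, f_min, f_max):
    """Flicker rates achievable with equal numbers of 'on' and 'off' frames.

    One cycle lasts 2*k frames (k on, k off), so f = refresh / (2*k).
    """
    out = []
    k = 1
    while True:
        f = Fraction(refresh_hz) / (2 * k)
        if f < f_min:
            break
        if f <= f_max:
            out.append(float(f))
        k += 1
    return out


def min_bin_separation(f1, f2, epoch_s, n_harmonics):
    """Smallest distance, in FFT bins, between any harmonic of f1 and any of f2.

    The bin width is 1 / epoch length, so a distance in Hz times epoch_s
    gives a distance in bins.
    """
    gap_hz = min(
        abs(a * f1 - b * f2)
        for a in range(1, n_harmonics + 1)
        for b in range(1, n_harmonics + 1)
    )
    return gap_hz * epoch_s


candidates = square_wave_frequencies(60, 3, 15)
print(candidates)
print(min_bin_separation(6.0, 7.5, epoch_s=4.0, n_harmonics=4))
print(min_bin_separation(6.0, 7.5, epoch_s=4.0, n_harmonics=5))
```

In this example the pair 6 Hz / 7.5 Hz is well separated up to the fourth harmonic, but the fifth harmonic of 6 Hz and the fourth harmonic of 7.5 Hz coincide at 30 Hz, which is the kind of overlap the separation criterion is meant to catch.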

Which control condition?

A control condition is crucial for the interpretation of SS-EP results. It allows not only for testing whether the paradigm specifically evokes the targeted periodic neural response, but also for further interpreting the potential non-linearities of the response. A good control condition typically preserves the exact same features as the experimental condition except that it does not trigger the cognitive process of interest. Some direct paradigms nonetheless allow for directly tagging the cognitive process of interest, such that the mere presence of a rhythmic response at the stimulation frequency above background noise provides strong evidence in support of the targeted cognitive process. For example, in the fast-periodic oddball paradigm in which face images are embedded within object images every 5th image at 1.2 Hz, the recorded 1.2 Hz peak in the EEG spectrum is necessarily linked to face perception, and one could dispense with a control condition. However, to ensure the clarity of the results and their interpretation, we recommend always including a control condition. One way to do this in the above-mentioned example is to randomly shuffle the assignment of images to conditions, with the expectation that this control condition will yield null results.

Data pre-processing

Once the data is recorded, brain responses to the steady-state stimulations must be extracted from the continuous data, and this is typically done by first segmenting the EEG data into epochs. Epochs should contain an integer number of stimulation cycles (a minimum of two to four), in order to ensure the reliable computation of the corresponding oscillation during transformation into the frequency domain. Moreover, the epoch length determines the frequency resolution of the spectral decomposition (frequency resolution (Hz) = 1/epoch length (s)), and hence the number of reconstructed frequency bins in the spectral domain. While the SS-EP response is concentrated in a unique frequency bin, non-SS-EP related activity (noise) is distributed across all frequency bins of the computed spectrum. As a result, the longer the epoch, the higher the frequency resolution, and the better the signal-to-noise ratio. However, a typical infant’s EEG recordings contain several periods of highly contaminated data, and/or moments when the infant is not exposed to the periodic stimulation (e.g., when the infant looks away from the visual display, in the case of visual SS-EPs). Due to their low signal-to-noise ratio, epochs overlapping these data portions should be rejected from the analyses. For paradigms involving the presentation of short stimulation streams (few seconds), an epoch will be typically created for each trial, and rejected if artifacts are present. However, for paradigms involving long stimulation streams, developmental scientists cannot afford to reject an entire epoch because of an occasional movement artifact contaminating only a portion of the epoch. Therefore, the brain response to a long stimulation stream is segmented into shorter epochs in order to save as much artifact-free data as possible. A trade-off should thus be found to ensure both an accurate SS-EP reconstruction and enough artifact-free trials. 
Finally, the epoching strategy also depends upon the subsequent intended analyses. While the power of the SS-EP response can in theory be derived from a single EEG epoch, the phase coherence computations require, by definition, an evaluation across multiple epochs (a minimum of 10 to 15 epochs). The aim of these measures is to evaluate how consistent the phase of the evoked SS-EP oscillation is across the successive stimulation cycles. Data thus needs to be segmented into multiple epochs each aligned to the beginning of a stimulation cycle (or to the same phase of a stimulation cycle). It is essential to note that the data should always be segmented into non-overlapping epochs to avoid the appearance of falsely enhanced activity due to the multiple uses of the same data (Benjamin et al., 2021). Moreover, the same number of epochs should be used for each of the stimulus frequencies and experimental conditions because the estimates of phase coherence are influenced by statistical power.
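
The epoching constraints described in this section (a whole number of stimulation cycles per epoch, no overlap between epochs, rejection of contaminated segments) can be sketched as follows in Python with NumPy. This is an illustrative sketch for a single channel, not a recommended pipeline: the function name, the sampling rate, the epoch length and the placeholder data are all assumptions, and the recording is assumed to start at the onset of a stimulation cycle.

```python
import numpy as np


def segment_epochs(signal, fs, stim_hz, cycles_per_epoch, bad_mask=None):
    """Cut a 1-D recording into non-overlapping epochs of whole stimulation cycles.

    Returns (epochs, frequency_resolution_hz). Any epoch containing at least
    one sample flagged in bad_mask is dropped.
    """
    samples = fs * cycles_per_epoch / stim_hz
    n = int(round(samples))
    if abs(samples - n) > 1e-9:
        raise ValueError("epoch does not span a whole number of samples")
    n_epochs = len(signal) // n
    epochs = np.asarray(signal)[: n_epochs * n].reshape(n_epochs, n)
    if bad_mask is not None:
        bad = np.asarray(bad_mask)[: n_epochs * n].reshape(n_epochs, n).any(axis=1)
        epochs = epochs[~bad]
    return epochs, fs / n  # frequency resolution = 1 / epoch length


fs, stim_hz = 250.0, 6.0
recording = np.zeros(2600)               # 10.4 s of placeholder data (hypothetical)
artifact = np.zeros(2600, dtype=bool)
artifact[600:650] = True                 # pretend a movement artifact occurred here
epochs, resolution = segment_epochs(recording, fs, stim_hz, 12, artifact)
print(epochs.shape, resolution)
```

With 12 cycles of a 6 Hz stimulation per epoch, each epoch lasts 2 s, giving a 0.5 Hz frequency resolution; only the epoch overlapping the flagged samples is discarded, and the incomplete segment at the end of the recording is ignored.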

Spectral analyses

Transformation into frequency domain

A crucial step in SS-EP analyses is to transform signals from the time-resolved EEG epochs into the frequency domain. A variety of mathematical tools have been developed to perform such spectral analysis. However, given the specificity of SS-EP paradigms, and the stationarity of the targeted SS-EP response, the traditional Fourier transform is the most prevalent and most appropriate (Bach and Meigen, 1999). Typically, a fast Fourier transform (FFT) algorithm is applied to the EEG epochs, yielding a set of sinusoidal components characterized by their amplitude and phase, such that the sum of all components perfectly describes the observed data. The spectral components are typically output as complex values $a\,e^{i\varphi}$ associated with each frequency f, where a corresponds to the amplitude and φ to the phase of the corresponding sinusoid. The component associated with the stimulation frequency is further analyzed. To some extent, the nominal frequency resolution can be increased by zero-padding (adding zeros at the edges of the data segment increases its length and hence the number of frequency bins). Note, however, that zero-padding only interpolates the spectrum and does not add information as such. Instead, whenever possible, we recommend setting the appropriate frequency resolution directly when defining the epoch length. Wavelet-based time-frequency decomposition has occasionally been used (Batterink and Paller, 2017; Choi et al., 2020). However, this is a computationally costly operation, without any real added value, unless a time-resolved representation of the steady-state response is needed.
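
As a minimal illustration of this step, the Python sketch below applies NumPy's real-input FFT to one synthetic epoch and reads out the amplitude and phase of the component at the stimulation frequency. The helper name, the synthetic signal (a 6 Hz cosine of amplitude 2 and phase π/4) and the sampling parameters are assumptions made for the example.

```python
import numpy as np


def component_at(epoch, fs, freq, n_fft=None):
    """Frequency, amplitude and phase of the Fourier component closest to `freq`.

    If n_fft is larger than the epoch, the epoch is zero-padded to n_fft samples.
    """
    spectrum = np.fft.rfft(epoch, n=n_fft)
    n = len(epoch) if n_fft is None else n_fft
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    # Factor 2 folds in the mirrored negative-frequency half of the spectrum
    # (valid for bins other than 0 Hz and the Nyquist frequency).
    amplitude = 2.0 * np.abs(spectrum[k]) / len(epoch)
    phase = np.angle(spectrum[k])
    return float(freqs[k]), float(amplitude), float(phase)


fs, stim_hz = 250.0, 6.0
t = np.arange(500) / fs                                   # 2 s epoch = 12 cycles
epoch = 2.0 * np.cos(2 * np.pi * stim_hz * t + np.pi / 4)

f_bin, amp, phi = component_at(epoch, fs, stim_hz)
print(f_bin, amp, phi)

# Zero-padding to 2000 samples gives finer bin spacing (0.125 Hz instead of
# 0.5 Hz) but the estimate at 6 Hz is unchanged: no information is added.
f_pad, amp_pad, phi_pad = component_at(epoch, fs, stim_hz, n_fft=2000)
```

Because the epoch contains a whole number of cycles, the stimulation frequency falls exactly on one frequency bin and the amplitude and phase of the synthetic sinusoid are recovered without leakage into neighboring bins.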

Quantifying neural entrainment

After spectral analysis, each epoch is expressed as a set of spectral components (or oscillations), each characterized by their amplitude and initial phase. Different analysis routines can be deployed to quantify neural entrainment at each frequency, depending on the research question and on the nature of the SS-EP paradigm. We provide here an overview of the most common approaches.

Phase and coherence.

Methods involving phase information are only compatible with paradigms where all epochs are precisely aligned to the same phase of a stimulation cycle (Fig. 3A). If the neural signal consistently follows the stimulation, then the SS-EP responses will be aligned across epochs. In other words, the distribution of phases computed from all epochs should have little variance. Measures of phase coherence (or phase locking, or phase consistency) can thus be implemented. The most common is the inter-trial coherence (ITC, also denoted as the phase-locking value, PLV), expressed as follows:

$$\mathrm{ITC}(f) = \left| \frac{1}{N} \sum_{n=1}^{N} e^{i \varphi_n(f)} \right|$$

Fig. 3.

Examples of SS-EP analysis pipelines. A – For a set of phase-locked epochs, EEG epochs are first Fourier-transformed, each yielding a set of spectral components. The polar plot represents the distribution of three spectral components (red, blue and black) across epochs, each characterized by a specific amplitude and phase. Each dot represents the spectral component computed from a single epoch, for a given frequency. Inter-trial coherence (ITC) can be computed at each frequency from these distributions. Alternatively, spectral components can be averaged across epochs before computing the power spectrum. In both cases, the SS-EP pops out as a sharp peak in the power or ITC spectrum. Signal-to-noise ratios can finally be computed to correct for the background noise. B – For a set of non-phase-locked epochs, SS-EPs can only be assessed using power-based measurements. After computing the Fourier transform and the power spectrum of each epoch, power spectra are averaged across epochs. The SS-EP is then visible as a sharp peak in the average power spectrum. Finally, the power spectrum can be normalized by some estimate of the background physiological noise, yielding a SNR value for each frequency.

In this expression, N is the number of epochs, $\varphi_n(f)$ is the computed phase at frequency f for epoch n, and i is the imaginary unit. If φ is constant across all N epochs (fully aligned brain responses), then the sum term will add up to N, and ITC will be equal to 1. On the contrary, if φ is highly variable across epochs (desynchronized brain responses), ITC will be close to zero. Crucially, parametric circular statistics (Rayleigh test) allow for assessing whether this metric could result from a random sampling of all possible phases, and thus test the reliability of the SS-EP. Note that this implementation of phase coherence does not involve the amplitude of the spectral components, but an alternative measure of coherence combining both amplitude and phase can be implemented (Dobie and Wilson, 1989; Picton et al., 2003), yielding, in theory, more sensitive estimates. In practice, the two measures provide very similar results (Dobie and Wilson, 1994). Overall, phase coherence measures show the interesting advantage of being less impacted by the inverse power law distribution of spectral noise (Fig. 3A). They can nonetheless be corrected by some estimate of the background noise, yielding a signal-to-noise ratio (Fló et al., 2022). However, because ITC essentially captures the variance of the distribution of phases across epochs, it is strongly influenced by the number of epochs: experimental conditions containing fewer epochs will generally have higher ITC values than conditions containing more epochs. It is thus recommended to equate the number of epochs before comparing experimental conditions. Besides, ITC is also sensitive to the noise level in the recordings. For these reasons, ITC cannot be compared across subjects, sessions and/or age ranges as such.
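
A minimal Python sketch of the inter-trial coherence computation is given below, for a set of phase-aligned single-channel epochs stored as rows of a NumPy array. The function name and the simulated data are assumptions made for illustration. The p-value uses the first-order large-sample approximation of the Rayleigh test, p ≈ exp(−N·ITC²), which is a simplification; dedicated circular-statistics tools apply more accurate corrections for small numbers of epochs.

```python
import numpy as np


def itc_at(epochs, fs, freq):
    """Inter-trial coherence at `freq` and an approximate Rayleigh p-value.

    epochs: 2-D array (n_epochs, n_samples), all aligned to the same phase
    of the stimulation cycle.
    """
    spectra = np.fft.rfft(epochs, axis=1)
    freqs = np.fft.rfftfreq(epochs.shape[1], d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    phases = np.angle(spectra[:, k])
    # Length of the mean unit phase vector: 1 = identical phases, ~0 = random.
    itc = float(np.abs(np.mean(np.exp(1j * phases))))
    # First-order approximation of the Rayleigh test (assumption of this sketch).
    p_value = float(np.exp(-len(phases) * itc ** 2))
    return itc, p_value


rng = np.random.default_rng(0)
fs, stim_hz, n_epochs = 250.0, 6.0, 40
t = np.arange(500) / fs

locked = np.tile(np.cos(2 * np.pi * stim_hz * t), (n_epochs, 1))  # same phase in every epoch
noise_only = rng.standard_normal((n_epochs, t.size))              # no evoked response

itc_locked, p_locked = itc_at(locked, fs, stim_hz)
itc_noise, p_noise = itc_at(noise_only, fs, stim_hz)
print(itc_locked, p_locked)
print(itc_noise, p_noise)
```

Since the expected ITC of pure noise shrinks as the number of epochs grows, running such a function on conditions with unequal epoch counts would reproduce the bias discussed above, which is why the epoch counts should be equated first.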

Amplitude and power.

Methods relying only on the amplitude information of the spectral components can be applied more generally to any kind of SS-EP paradigm (Fig. 3B). The strength of each spectral component corresponds to the amplitude (or power, computed as the squared amplitude) of the corresponding complex value. The amplitude/power spectrum represents the amplitude/power of each spectral component as a function of frequency. The log-transformed power spectrum is often represented instead of the raw power spectrum in order to mitigate the inverse power law distribution of the background noise. Besides, because the Fourier spectrum of a real-valued signal is symmetric, the spectrum is sometimes multiplied by two to account for the discarded negative frequencies. Such transformations affect the scaling of the represented spectrum and should be reported. The amplitude/power values are computed at each frequency and each epoch and averaged across epochs to yield the averaged amplitude/power spectrum. Note that if the successive epochs are phase-locked to the onset of a stimulation cycle, the complex spectral components can be averaged first, before amplitude/power spectrum computation (Fig. 3A). In that case, averaging maintains the signal but reduces the background noise (as in traditional ERPs). This approach is particularly useful for noisy recordings, and/or when small effects are expected. Because the amplitude/power spectrum is strongly impacted by background noise fluctuations, it is typically normalized: at each frequency, the spectral amplitude/power is corrected by some estimate of the background noise. SS-EP responses will thus pop out as sharp peaks in the corrected spectrum (Fig. 3). A common correction procedure consists in dividing the evoked power at the frequency of interest by the average power measured over N surrounding frequencies, yielding a signal-to-noise ratio (SNR). This approach is valid as long as the chosen neighboring frequencies do not include any harmonic of the stimulation frequency. This procedure is quite prevalent in the literature because, under the null hypothesis of no response, the resulting ratio is distributed as the F-statistic with degrees of freedom 2 and 2N (Dobie and Wilson, 1996; Zurek, 1992), which allows for directly assessing the reliability of the SS-EP using standard statistics (see section 5.4). Another way to correct for the background noise is to quantify local deviations of the power spectrum from the idealized inverse power-law function (e.g., Kabdebon and Dehaene-Lambertz, 2019). An inverse power law is fitted on the power spectrum, and the deviation from this idealized function at the frequency of interest is compared to local deviations computed over neighboring frequencies. This measure has the advantage of accounting for the 1/f distribution of the power spectrum, which can be particularly useful for low frequencies. Alternatively, the background noise can be recorded over a period of baseline activity.
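The neighboring-bin correction can be illustrated with a short sketch. It assumes the power spectrum has already been computed and is given as a plain list indexed by frequency bin; the function name, the `skip` parameter (leaving out the bins immediately adjacent to the target, a common precaution against spectral leakage) and the `exclude` parameter (bins such as harmonics that must not count as noise) are illustrative choices, not something prescribed by the text.

```python
def snr_at_bin(power, target, n_neighbors=10, skip=1, exclude=()):
    """Return (SNR, number of noise bins actually used) for one frequency bin.

    power:       power spectrum, one value per frequency bin
    target:      index of the bin of interest (e.g. stimulation frequency)
    n_neighbors: total number of noise bins requested (half on each side)
    skip:        bins directly adjacent to the target that are left out
    exclude:     bin indices never used as noise (e.g. harmonics)
    """
    excluded = set(exclude)
    noise = []
    for side in (-1, 1):
        offset = skip + 1
        taken = 0
        while taken < n_neighbors // 2:
            idx = target + side * offset
            if idx < 0 or idx >= len(power):
                break  # ran off the spectrum: fewer noise bins on this side
            if idx not in excluded:
                noise.append(power[idx])
                taken += 1
            offset += 1
    if not noise:
        raise ValueError("no usable noise bins around the target")
    return power[target] / (sum(noise) / len(noise)), len(noise)


# Toy spectrum (not real data): flat noise floor, a response at bin 20
# and a second tagged response at bin 24 that should not count as noise.
power = [1.0] * 41
power[20] = 8.0
power[24] = 5.0

snr_clean, n_clean = snr_at_bin(power, 20, n_neighbors=10, exclude=[24])
snr_biased, n_biased = snr_at_bin(power, 20, n_neighbors=10)
```

In this toy spectrum, leaving bin 24 in the noise estimate inflates the noise floor and lowers the SNR, which is why harmonics and other tagged frequencies are excluded from the neighborhood.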

Statistical analyses

Which H1 hypothesis?

Several approaches can be implemented to assess the significance of SS-EP results. It is crucial for the interpretation of the results to specify clearly what is being tested. Depending on the metric used, the research question, and the analysis pipeline, different hypotheses can be tested.

The evoked amplitude/power exceeds background noise level.

A very common way of assessing SS-EPs is to ask whether or not the oscillatory power associated with the stimulation frequency is significantly higher than that of the background noise level. This approach is implemented using metrics based on amplitude and power, and it is typically used in very straightforward experimental designs (e.g., to assess the presence of an auditory SS-EP response to an amplitude modulated tone), or as a first step in the analysis to establish the presence (or absence) of a response at a given frequency, before further comparisons between experimental conditions. As mentioned in the previous section, background noise is often assessed over a set of N frequencies surrounding the frequency of interest. A ratio can be computed in order to test the difference between the signal (i.e., power at the stimulation frequency) and the noise (i.e., average power over N neighboring frequencies). Under the null hypothesis of no response, this signal-to-noise ratio is distributed as the F-statistic with degrees of freedom 2 and 2N (Dobie and Wilson, 1996; Zurek, 1992), and the strength of the test will vary with the number of neighboring frequencies N used in the comparison. A higher number will increase the power of the test. However, extending the range of the surrounding frequencies may cause problems because the background noise is not uniformly distributed across frequencies; the technique can nonetheless be adapted to exclude certain frequency bins from the calculation. Using the same logic, other procedures using traditional statistics (t-test, z-score) can be implemented to compare the SS-EP activity with that of the surrounding frequency bins (e.g., de Heering and Rossion, 2015; Liu-Shuang et al., 2014), or some other estimate of background noise. The normality of the data must however be assessed. Overall, such statistical tests are performed at the individual participant level or at the group-average level. They typically represent a first step to detect significant above-noise activity. This is for example useful to identify a set of responsive electrodes, or to assess the presence of intermodulation responses. Similarly, this procedure can be used to identify higher harmonic activations in order to aggregate the brain responses (Heinrich, 2010). Then, to further assess the significance of the response above background noise across participants, the mean of the distribution of SNR measures can be compared against 1 (or 0 in the log space) (de Heering and Rossion, 2015).
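As a worked illustration of the F-test on a power SNR, the sketch below uses the fact that an F distribution with 2 numerator degrees of freedom has a closed-form tail probability: for degrees of freedom (2, 2N), P(F ≥ x) = (1 + x/N)^(−N). The function names are hypothetical, and the result is only meaningful under the usual assumptions of the test (a power ratio, with independent noise bins that share the same noise level).

```python
def snr_p_value(snr, n_noise_bins):
    """Tail probability P(F >= snr) for an F(2, 2N) distribution.

    Closed form valid when the numerator has 2 degrees of freedom:
    (1 + snr / N) ** (-N), with N the number of noise bins.
    """
    return (1.0 + snr / n_noise_bins) ** (-n_noise_bins)


def critical_snr(alpha, n_noise_bins):
    """Smallest SNR reaching significance level alpha (inverse of the above)."""
    return n_noise_bins * (alpha ** (-1.0 / n_noise_bins) - 1.0)


# With N = 10 noise bins, i.e. degrees of freedom (2, 20):
threshold_10 = critical_snr(0.05, 10)
# With N = 20 noise bins the threshold is lower (a more powerful test):
threshold_20 = critical_snr(0.05, 20)
p_example = snr_p_value(8.0, 10)
```

With 10 noise bins the critical SNR at alpha = 0.05 is about 3.49, matching the tabulated F(2, 20) critical value, and it decreases as more noise bins are included, in line with the gain in power described above.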

The evoked response is phase locked to the stimulation.

This approach aims at assessing phase consistency across the successive stimulation cycles, exploiting coherence measures (e.g., ITC). It can be used to test the presence of an SS-EP response at specific frequencies both at the individual participant level and at the group level. Phase coherence measures are typically adapted from standard circular statistics, and parametric statistical tests can be performed to test the presence of an SS-EP response at a given frequency. For example, the Rayleigh test assesses the probability that a given ITC measure results from a random distribution of phases. Like SNR measures, it allows for identifying the presence or absence of significant rhythmic responses at specific frequencies, or at specific electrode sites. This procedure is, by definition, implemented at the individual participant level. The significance of phase coherence can also be tested using non-parametric statistics, by comparing the phase coherence measure against the null distribution of phase coherence measures computed over surrogate datasets. Surrogate datasets correspond to the null hypothesis of non-rhythmic activity, and are constructed by randomly shuffling the onset (and thus the phase) of the successive epochs (Henin et al., 2021). This approach can be adapted to test the significance of SS-EP responses across participants (Kabdebon et al., 2015).
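The surrogate approach can be sketched as follows. This is one possible reading of the onset-shuffling idea, not a reproduction of the cited implementations: epochs are re-cut from the continuous signal at random onsets, which randomizes the phase of any steady-state response relative to the epoch start. The signal is simulated (a 2 Hz sinusoid in Gaussian noise), and all names and parameter values are hypothetical.

```python
import cmath
import math
import random


def epoch_phase(samples, freq, fs):
    """Phase of the Fourier component at `freq` (single-bin DFT)."""
    coeff = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
                for k, x in enumerate(samples))
    return cmath.phase(coeff)


def itc(signal, onsets, length, freq, fs):
    """Inter-trial coherence at `freq` over epochs cut at the given onsets."""
    phases = [epoch_phase(signal[o:o + length], freq, fs) for o in onsets]
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def surrogate_p_value(signal, onsets, length, freq, fs,
                      n_surrogates=200, seed=0):
    """Compare the observed ITC with ITCs obtained from random epoch onsets."""
    rng = random.Random(seed)
    observed = itc(signal, onsets, length, freq, fs)
    last_start = len(signal) - length
    exceed = 0
    for _ in range(n_surrogates):
        fake_onsets = [rng.randint(0, last_start) for _ in onsets]
        if itc(signal, fake_onsets, length, freq, fs) >= observed:
            exceed += 1
    # The +1 terms count the observed value as part of the null distribution.
    return observed, (exceed + 1) / (n_surrogates + 1)


# Simulated recording (not real data): 20 one-second epochs at 50 Hz.
fs, freq, length, n_epochs = 50, 2.0, 50, 20
noise_rng = random.Random(1)
signal = [math.sin(2 * math.pi * freq * k / fs) + noise_rng.gauss(0, 1)
          for k in range(n_epochs * length)]
onsets = [i * length for i in range(n_epochs)]

observed_itc, p_value = surrogate_p_value(signal, onsets, length, freq, fs)
```

Because the true onsets are locked to the stimulation cycle while the surrogate onsets are not, the observed ITC in this simulation sits far in the upper tail of the surrogate distribution.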

The SS-EP response in condition A exceeds that in condition B.

Contrasting entrainment measures across well-designed experimental conditions is the final test to probe the specificity of the targeted cognitive process. Typically conducted at the group level, this approach compares phase coherence measures or corrected power measures (e.g., SNR) evoked by the different experimental conditions. Statistical comparisons can be conducted using parametric statistics (if the entrainment measures follow the appropriate distributions), or non-parametric statistics. Non-parametric statistics generally involve randomly shuffling the condition labels for each participant, in order to construct a permutation distribution (Maris and Oostenveld, 2007). Importantly, non-parametric statistics do not require any distributional assumptions.
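For a within-participant design, shuffling the condition labels of a participant amounts to flipping the sign of that participant's A minus B difference. The sketch below implements this paired permutation test for the one-sided hypothesis that condition A exceeds condition B. The function name and the values (one SNR per participant and condition) are made up for illustration only.

```python
import random


def paired_permutation_p(cond_a, cond_b, n_perm=5000, seed=0):
    """One-sided paired permutation test of mean(A - B) > 0.

    Each permutation randomly swaps the two condition labels within
    each participant, i.e. flips the sign of that participant's difference.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    observed = sum(diffs) / len(diffs)
    count = 0
    for _ in range(n_perm):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if permuted / len(diffs) >= observed:
            count += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return observed, (count + 1) / (n_perm + 1)


# Made-up SNR values for 10 participants (illustration only).
snr_a = [1.8, 2.1, 1.5, 2.4, 1.9, 2.2, 1.7, 2.0, 2.3, 1.6]
snr_b = [1.2, 1.4, 1.3, 1.5, 1.1, 1.6, 1.2, 1.3, 1.4, 1.5]

mean_diff, p_perm = paired_permutation_p(snr_a, snr_b)
```

The test statistic here is the mean difference; other statistics (for instance a t-value) can be plugged into the same loop without changing the logic.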

Multiple comparisons

While SS-EP paradigms offer the important advantage of concentrating the signal in a limited set of frequency bins, the spatial localization of the effect can be more variable. With the advent of high-density recording systems (i.e., 64, 128, 256 electrodes), the dimensionality of the data can explode (number of electrodes × number of frequency bins), and multiple comparisons must be controlled. Dimensionality of the data can be reduced by averaging entrainment measures over all electrodes (Liu-Shuang et al., 2014) or a cluster of electrodes, defined a priori, before performing statistical tests. The cluster of electrodes can be defined based on the existing literature (Boremanse et al., 2013; Peykarjou et al., 2017), or based on the evoked response of the targeted sensory modality (Doelling et al., 2019). Another possibility for reducing the dimensionality of the data entails the implementation of spatial filters, whereby a single ideal electrode is constructed as a weighted sum of all electrodes to optimally separate the signal from the noise (M. X. Cohen and Gulbinaite, 2017; de Cheveigné and Parra, 2014; de Cheveigné and Simon, 2008). Alternatively, clustering and permutation algorithms can be used to control for multiple comparisons across electrodes, and identify the spatial distribution of the experimental effects (Kabdebon et al., 2015). Finally, more traditional methods can be implemented using Bonferroni or FDR corrections. Similarly, some studies require testing the experimental effects over multiple frequency bins (harmonics, or intermodulation components). It is then necessary to control for multiple comparisons across this set of frequencies. In any case, the key experimental comparisons of interest and the chosen statistical approach should be decided in advance.
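The two traditional corrections mentioned above can be written compactly. The sketch below assumes one p-value per test (for example, per electrode or per frequency bin) and returns, for each test, whether it survives the correction; the FDR procedure shown is the Benjamini-Hochberg step-up procedure, which is one common choice among several. Function names and p-values are illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each test whose p-value is at most alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg FDR control: reject the k smallest p-values,
    where k is the largest rank with p_(k) <= alpha * k / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[i] = True
    return reject


# Made-up p-values for 8 electrodes (illustration only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]

significant_bonferroni = bonferroni(p_values)
significant_fdr = benjamini_hochberg(p_values)
```

On these toy p-values, Bonferroni retains only the first electrode while the FDR procedure retains the first two, illustrating that FDR control is typically less conservative.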

How to interpret developmental differences

Although many experiments with infants are focused on whether a given cognitive process is present at a single age, studies of development often include comparisons across different ages (and with adults). There are a number of interpretative issues that should be considered when drawing inferences about developmental changes. First, EEG signals detected at a given electrode are the sum of a variety of cortical sources that originate from different sites (some quite distant from that electrode). Thus, the EEG signal is a mixture of multiple cortical generators whose relative amplitudes and phases are combined at each electrode site. As a result, there could be reliable SS-EP responses at locations in the brain that are not visible because of this spatial summation process at the level of the scalp. Second, the size of cortical generators and their distance from electrodes, even if scaled by the 10–20 system to compensate for age differences in head-size, could lead to developmental differences that are not reflective of true age differences in the underlying brain responses. That is, some developmental differences could be due to these measurement issues (e.g., propagation of potentials through multiple types of brain tissue) and not to actual differences in underlying neural mechanisms. Moreover, the absence of some developmental differences could be due to extraneous factors (e.g., maturational differences in internal noise). Thus, one must be careful not to over-interpret negative results obtained at a given age (especially early in infancy) as evidence for the absence of sensitivity to a given cognitive process simply because the SS-EP is not detectable. Third, the SS-EP paradigm holds the promise of providing a measure of neural responses from the developing brain that is more reliable or sensitive than what can be obtained from existing behavioral measures. 
However, it is important to keep in mind that a fully intact neural mechanism may be masked by internal noise or poor connectivity within a complex neural network, which may prevent the functional use of that mechanism. Thus, while we may discover using SS-EP paradigms that infants’ neural mechanisms are more mature than estimates provided by behavior, it is an infant’s behavioral competence that enables it to function effectively in its natural environment. In other words, an intact neural mechanism is a necessary but not sufficient requirement to support a behavioral response.

Conclusion

We have reviewed the SS-EP approach as a powerful and versatile neuroimaging technique, offering crucial advantages for testing developmental populations, as well as allowing for the evaluation of a broad range of cognitive and neural mechanisms. Although SS-EPs are already present in the developmental literature, this approach could attract a larger and broader community of developmental scientists, especially as applied to higher-level cognition. In order to facilitate the implementation of future SS-EP studies, we have attempted to lay down some basic practical guidelines, which will hopefully help to strengthen the position of SS-EP studies in developmental research.
