Regular rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: A pilot study.

Yannick Lagarrigue¹, Céline Cappe², Jessica Tallet¹.

Abstract

Procedural learning is essential for the effortless execution of many everyday life activities. However, little is known about the conditions influencing the acquisition of procedural skills. The literature suggests that the sensory environment may influence the acquisition of perceptual-motor sequences, as tested by a Serial Reaction Time Task. In the current study, we investigated the effects of auditory stimulations on procedural learning of a visuo-motor sequence. Given that the literature shows that regular rhythmic auditory stimulations and multisensory stimulations improve motor speed, we expected procedural learning (reaction times and errors) to improve with repeated practice when auditory stimulations were presented either simultaneously with visual stimulations or with a regular tempo, compared to control conditions (e.g., with irregular tempo). Our results suggest that both congruent audio-visual stimulations and regular rhythmic auditory stimulations promote procedural perceptual-motor learning. In contrast, auditory stimulations with an irregular or very quick tempo alter learning. We discuss how regular rhythmic multisensory stimulations may improve procedural learning with respect to a multisensory rhythmic integration process.

Year: 2021; PMID: 34780497; PMCID: PMC8592429; DOI: 10.1371/journal.pone.0259081

Source DB: PubMed; Journal: PLoS One; ISSN: 1932-6203; Impact factor: 3.240

Introduction

Procedural learning refers to the acquisition and retention of motor and cognitive skills with repeated practice [1-5]. It is essential for many everyday life activities such as driving a car or playing a musical instrument but also for reading or writing [6,7]. Many studies showed that repeated practice of a structured perceptual-motor sequence specified by a stimulation-response association improves the speed of the motor responses and leads to the acquisition of the perceptual-motor sequence [8-12]. The Serial Reaction Time Task (SRTT), previously developed by Nissen and Bullemer (1987) [13], has been widely used to study the implicit procedural learning of a visuo-motor sequence in clinical and nonclinical populations [14-17]. As described by [18], in the classic form of the task, participants have to respond as fast and as accurately as possible by pressing one of four keys corresponding to one of four locations of a visual cue. Unbeknownst to the participant, the first blocks of practice are composed of a repeated structured sequence (e.g., 10 items) that is implicitly learned. Then, on the sixth block of practice the sequence is unexpectedly removed and replaced with random trials. Participants who learned the perceptual-motor sequence will respond less quickly and/or less accurately in this random block. Thus, as explained by Robertson (page 10073), “the difference between sequential and random response times provides a specific and sensitive measure of skill acquisition in the SRTT”. Indeed, as opposed to the general learning that includes both familiarization and sequence learning, this difference in RT (or errors) reflects the learning of the specific sequence and is called specific learning. As stated by [19], multisensory-training protocols could enhance learning and better approximate natural multisensory settings. Several studies have tested how the manipulation of the perceptual stimulations could improve motor procedural learning. 
Globally, it appears that stimulations’ features play an essential role for learning of a motor sequence [for review see 20,21]. Manipulations refer to stimulation modality [22-24], stimulation type [17,25,26], stimulation-response mapping [27,28] or response effect mapping [29]. Results generally support the role of visuo-spatial stimulations to memorize the sequence. The effects of auditory stimulations on learning of a visuo-motor sequence have been more rarely studied. Yet, several studies suggest that the temporal regularity of the sequence is learned concomitantly with the visuo-spatial sequence itself [30-32]. More precisely, [32] showed that temporal patterns can be learned when the intervals are associated with concrete events, such as specific visual stimuli or finger movements, and that temporal and spatial parameters are learnt in an integrated fashion allowing to acquire the order of a repeated sequence. Given that temporal regularity detection is best accomplished with the auditory system [33,34], we hypothesized that providing auditory stimulations could facilitate the detection of the temporal regularity of the perceptual-motor sequence and so facilitate its learning. Two types of mechanisms have been discovered to explain the benefits of auditory stimulations on motor control and learning: multisensory integration and audio-motor entrainment. Firstly, auditory stimulations provided simultaneously with visual stimulations can lead to multisensory integration. [35] showed that some neural cells of animals have higher neuronal activity in response to multisensory stimulations compared to unisensory stimulations. To be integrated as a coherent whole, stimulations have to be presented in a spatial and temporal “binding window” [i.e., 36–39]. Multisensory integration leads to behavioral improvements, i.e. a reduction of simple reaction time [40] and choice reaction times [41] and a quickening of the detection of visual targets [e.g. 42]. 
Multisensory stimulations also seem to benefit perceptual learning [19]. In particular, [43] found faster improvements on a motion-detection learning task with multisensory stimulations compared with unisensory stimulations. On this basis, it is possible that multisensory stimulations could also enhance perceptual-motor learning. Secondly, another way to promote procedural perceptual-motor learning with auditory stimulations comes from studies on Regular Auditory Stimulations (RegAud). RegAud have been shown to benefit motor control [e.g., 44]. In many situations, movements are spontaneously attracted to external regular rhythms even though participants are not instructed to synchronize with them [see reviews in 45,46]. RegAud induce a priming effect that can facilitate the production of voluntary movements, called audio-motor entrainment [44,47]. This facilitation consists of an improvement in the stability of movements in both time and space [48-50], and these effects are still observable even when attention is focused on the visual modality [51]. The spontaneous sensorimotor synchronization with an auditory rhythm can be explained by the involvement of motor cerebral areas, particularly the supplementary motor area and the primary motor cortex, in rhythm perception and production [52-55]. Moreover, listening to auditory stimulations with a regular tempo, such as a metronome, modulates corticospinal excitability, as measured via motor-evoked potentials elicited by TMS [56], and creates a stable time scale with a predictable pace to which the motor system adjusts for motor programming [54,57]. [59] showed that audio-motor synchronization is more accurate with simple metrics (i.e., regular intervals) than with irregular metrics (i.e., irregular intervals). Benefits of RegAud on motor control could be explained by combined activations of both auditory and motor areas [57-60]. The possible benefits of RegAud on perceptual-motor learning remain to be explored.
Previous studies suggested that the temporal organization of the stimulations is an essential part of perceptual-motor sequence learning [30,32,61]. Hence, the predictable tempo of regular auditory stimulations may modulate the possible improvement of procedural learning. Thus, we hypothesize that RegAud improve procedural learning of a perceptual-motor sequence. On these bases, the aim of our study is to investigate the possible effects of auditory stimulations on procedural learning evaluated with an SRTT. Auditory stimulations are provided either in congruency with visual stimulations (Congruent Audio-Visual, CongrAV) or regularly (RegAud). Possible effects of these additional auditory stimulations are controlled with three conditions: visual only stimulations (Visual Only, VisOnly), incongruent audio-visual stimulations (Non-Congruent Audio-Visual, NonCongrAV), and non-regular stimulations (Irregular Auditory Stimulations, IrregAud). Our main hypothesis is that auditory stimulations presented congruently with visual stimulations (CongrAV condition) and regular rhythmic auditory stimulations (RegAud condition) will enhance procedural learning compared to control conditions. We also hypothesize that the effects of the regular auditory stimulations (RegAud) could be linked to their speed: if this speed is not suited to the task, we would not observe any benefit. Thus, speed effects are controlled with a fourth control condition using Quick Regular Auditory Stimulations (FastRhyth).

Method

Participants

Sixty right-handed (laterality quotient = 77.59 ± 21.45) adults participated in the study (32 females). Participants were undergraduate students pursuing sports science courses at Toulouse university. They were 18 to 30 years old (mean age = 21.80 ± 2.53), reported normal or corrected-to-normal vision and hearing, and were naïve as to the purpose of the study. They had not practiced music for more than two hours per week over a maximum of two years. They were randomly and equally assigned to the six different conditions. The study was conducted in accordance with the Declaration of Helsinki and approved by the Inserm (Institut National de la Santé et de la Recherche Médicale) ethical committee (Institutional Review Board IRB00003888, agreement n°14–156). Before the experiment began, all volunteers provided verbal informed consent and completed a written form specifying their motivation to participate in the study.

Materials

Stimulation presentation and data collection were achieved using the experimental software Presentation version 17.2 (Neurobehavioral System Inc, Albany, CA), which provides a precision under 1 millisecond for motor response measures [62]. The laptop was connected to an external display (40 cm, refreshing at 60 Hz) and to an adapted keyboard. On this keyboard, the keys were removed except those corresponding to the four letters D, F, G and H, which were marked with yellow pallets. Preparatory attention and divided attention were tested with the standardized alert phasic test and divided attention test of the Test of Attentional Performance (TAP, Version 2.3; Zimmermann and Fimm, 2002).

Procedure

The participant was seated in standardized sitting posture in a quiet room without visual or auditory distractors. The viewing distance was approximately 80cm and the keyboard was at 50cm from the screen. The experimenter space was separated from the participant space by a curtain. Before the experiment, useful data about the participants (date of birth, gender, handedness assessed with the Edinburgh Handedness Inventory, [63]), were collected.

Test of Attentional Performance (TAP)

Attentional performance was assessed to explore the link between visuo-motor learning and attentional skills. This measure was also taken to make sure that groups did not differ in terms of attentional functions. Each of the two neuropsychological tests was composed of two parts: a phase to familiarize the participant with the instructions and a test phase in which the results were recorded. We assessed two attentional functions: preparatory attention (alert phasic test) and divided attention. Note that one participant (RegAud condition) did not perform the attentional tests because of time constraints.

The Serial Reaction Time Task (SRTT)

We used a version of the serial reaction time task (SRTT) in which participants were instructed to respond with four fingers of their right hand (index, middle, ring, and little finger; the thumb was not used) by pressing the D, F, G, and H keys of the keyboard. Each of the four possible keys corresponded to one of the four stimulation locations. The four possible stimulation positions were specified by four equally spaced gray boxes, each a 2 cm square, presented on a computer screen so that the stimulation-response mapping would be compatible with the keyboard. On each trial, one of the four boxes on the monitor was colored yellow, and the participant’s task was to press the corresponding key on the keyboard (Fig 1) as fast as possible (“Try to go as fast as possible and make as few mistakes as possible”). Once a key was pressed, the computer recorded the participant’s reaction time and the target moved to a different box, with an interval of 250 ms before the next target. If the participant did not press any key, the stimulation remained on the screen for 3000 ms, after which the next stimulus was presented.

Fig 1

Serial reaction time task.

Each finger is associated with a response key and each key is associated with a visual cue. When a box lit up, the participant had to press the corresponding key as quickly as possible.

All participants went through 7 Blocks of 100 items each. First, 1 Block of familiarization (B0) displayed a sequence of 10 positions repeated 10 times (100 items). It was performed in the same condition as the following Blocks and aimed to make sure that there were no significant differences between the groups in performance at the beginning of the experiment (B0 can be considered as a baseline). Then, 5 Blocks of practice (B1 to B5) displayed the same repeated pattern of 10 positions presented ten times (100 items per Block, i.e., 500 items in total) in order to test general learning. Finally, a last Block (B6) presented the visual stimulations in a pseudo-random fashion (100 items): the sequence with the repeating pattern of positions was no longer played. This Block aimed to test specific learning of the sequence.

We used several different sequences instead of a unique one because learning can depend on the sequence used [64], which opens up the possibility that the obtained results may be specific to a particular fixed sequence. Thus, each group was given different controlled sequences sharing the same rules: (1) the same position could not appear on successive trials; (2) each position appeared an equal number of times; (3) the sequence could not contain runs (e.g., 1234); (4) the sequence could not contain trills of four units (e.g., 1313). On this basis, four sequences of ten positions were generated (sequence A: 1 3 4 2 3 1 4 2 1 4, sequence B: 2 4 1 3 4 2 1 4 3 1, sequence C: 3 1 4 2 1 3 4 1 2 4, sequence D: 4 2 3 1 2 4 1 3 4 1). Sequences were assigned to participants in a counterbalanced manner within each condition.
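The sequence constraints listed above can be checked mechanically. The following is a minimal sketch, not the authors' generation procedure: the function names are hypothetical, descending runs (e.g., 4321) are treated as runs by assumption, and because ten items cannot be split evenly over four positions, "equal number of times" is interpreted here as counts differing by at most one.

```python
from collections import Counter

# The four 10-item sequences reported in the text.
SEQUENCES = {
    "A": [1, 3, 4, 2, 3, 1, 4, 2, 1, 4],
    "B": [2, 4, 1, 3, 4, 2, 1, 4, 3, 1],
    "C": [3, 1, 4, 2, 1, 3, 4, 1, 2, 4],
    "D": [4, 2, 3, 1, 2, 4, 1, 3, 4, 1],
}


def has_immediate_repeat(seq):
    """True if the same position appears on two successive trials."""
    return any(a == b for a, b in zip(seq, seq[1:]))


def has_run(seq, length=4):
    """True if `length` successive items rise or fall by one each step (e.g., 1234).

    Counting descending runs is an assumption; the text only gives 1234.
    """
    for i in range(len(seq) - length + 1):
        steps = [seq[i + k + 1] - seq[i + k] for k in range(length - 1)]
        if all(s == 1 for s in steps) or all(s == -1 for s in steps):
            return True
    return False


def has_trill(seq):
    """True if four successive items alternate between two positions (e.g., 1313)."""
    return any(
        seq[i] == seq[i + 2] and seq[i + 1] == seq[i + 3]
        for i in range(len(seq) - 3)
    )


def is_balanced(seq, n_positions=4, tolerance=1):
    """True if position counts differ by at most `tolerance` (assumed reading)."""
    counts = Counter(seq)
    per_position = [counts.get(p, 0) for p in range(1, n_positions + 1)]
    return max(per_position) - min(per_position) <= tolerance


def check_sequence(seq):
    """Return one boolean per rule; True means the rule is satisfied."""
    return {
        "no_repeat": not has_immediate_repeat(seq),
        "no_run": not has_run(seq),
        "no_trill": not has_trill(seq),
        "balanced": is_balanced(seq),
    }


if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(name, check_sequence(seq))
```

The check is applied to each 10-item sequence on its own; whether the rules were also meant to hold across the boundary between two repetitions is not stated in the text.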
The sequence selected for B0 was different from that selected for B1 to B5 in order to avoid a possible transfer of learning between two specific sequences. For Block 6 (B6), all participants performed the same pseudo-random stimulations following the previous rules applied to 100 items (3,2,1,3,4,1,3,4,2,3,1,2,4,3,1,2,3,4,2,3,1,4,3,2,4,1,2,4,1,3,2,4,1,2,4,1,3,4,2,1,4,2,3,1,4,3,1,2,4,3,1,2,4,3,1,2,4,3,1,2,4,2,3,1,4,1,3,2,1,3,4,2,1,3,2,1,3,2,1,4,2,3,1,4,3,2,4,1,2,4,1,3,4,2,3,4,1,4,2,3).

All participants were randomly and equally assigned to one of the six different conditions. In the Visual Only (VisOnly) condition, visual stimulations were presented without auditory stimulations. In the Congruent Audio-Visual (CongrAV) condition, an auditory stimulation was presented at exactly the same time as each visual cue. In the Non-Congruent Audio-Visual (NonCongrAV) condition, an auditory stimulation was presented 200 ms after each visual cue. If the participant pressed a key before this delay, the auditory stimulation was not presented. In the Regular Rhythmic Auditory Stimulations (RegAud) condition, auditory stimulations were presented every 500 ms independently of visual stimulations and participants’ responses. In the Irregular Rhythmic Auditory Stimulations (IrregAud) condition, auditory stimulations were presented irregularly and independently of visual stimulations and participants’ responses. This soundtrack contained the same number of auditory stimulations as in the RegAud condition, but the intervals were pseudo-randomly generated in a range of 0.022 s to 2.891 s with a mean of 0.494 s. This pattern was programmed with the free software Audacity. The same soundtrack was presented in all blocks. In the Quick Regular Rhythmic Auditory Stimulations (FastRhyth) condition, auditory stimulations were presented every 300 ms independently of visual stimulations and participants’ responses.
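To make the timing rules of the conditions concrete, here is a small sketch of when tones would occur within a block. It is an illustration under stated assumptions, not the Presentation script used in the study: the function name and arguments are hypothetical, the first regular tone is assumed to fall at time 0 of the block, and a response made exactly at 200 ms is assumed not to cancel the tone. The IrregAud soundtrack was prerecorded and its intervals are not listed, so it is deliberately left out.

```python
def auditory_onsets(condition, visual_onsets_ms=(), reaction_times_ms=(), duration_ms=0):
    """Return tone onset times (ms) for one block under a given condition.

    visual_onsets_ms  : onset time of each visual cue
    reaction_times_ms : reaction time for each cue (None if no key was pressed)
    duration_ms       : block duration, used only by the regular-tempo conditions
    """
    if condition == "VisOnly":
        return []  # no auditory stimulation at all
    if condition == "CongrAV":
        return list(visual_onsets_ms)  # tone simultaneous with each visual cue
    if condition == "NonCongrAV":
        delay = 200  # tone follows the visual cue by 200 ms
        onsets = []
        for onset, rt in zip(visual_onsets_ms, reaction_times_ms):
            # The tone is skipped when the key press came before the delay elapsed.
            if rt is None or rt >= delay:
                onsets.append(onset + delay)
        return onsets
    if condition in ("RegAud", "FastRhyth"):
        period = 500 if condition == "RegAud" else 300
        # Tones are independent of visual cues and responses.
        return list(range(0, duration_ms, period))
    raise ValueError(
        f"No timing rule sketched for {condition!r} "
        "(IrregAud used a prerecorded soundtrack)."
    )


if __name__ == "__main__":
    print(auditory_onsets("RegAud", duration_ms=2000))
    print(auditory_onsets("NonCongrAV", [0, 1000, 2000], [150, 400, None]))
```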
In the CongrAV, NonCongrAV, RegAud, IrregAud and FastRhyth conditions, the auditory stimuli were presented via two external speakers placed on both sides of the screen and consisted of a 500 Hz, 100 ms sine wave presented at 80 dB. Participants were told that there would be auditory stimuli, but we did not tell them anything about their purpose. In these conditions, the auditory stimuli were presented from Block 0 to Block 6. Note that B0 was performed in the same condition as the following Blocks because changing the way stimuli were introduced between B0 and B1 could have induced confounding effects on participants’ performance, hence confusing the interpretation of the results of the general learning phase. Moreover, removing the auditory stimuli in the random Block (B6) would have induced a double change (sequence and auditory stimuli) for participants, and it would have been difficult to determine whether a change in performance was due to the change in the sequence or to the removal of the auditory stimuli. The order of the SRTT and TAP tests was counterbalanced across participants to prevent training or fatigue effects. At the end of the random Block (B6), the participants’ explicit knowledge of the sequence was measured by asking them whether or not they noticed a repeated sequence. The entire experiment took approximately 1 h.

Data analyses

Incorrect responses were excluded from all RT analyses.

Pre-tests

The laterality quotient was assessed with the Edinburgh Handedness Inventory [63]. For the two attentional tests, mean reaction times and errors were computed by the TAP software. For B0, the average reaction times (RTmean), variability of reaction times (RTsd) and errors were computed and compared between Conditions.

SRTT

Average reaction times (RTmean), variability of reaction times (RTsd) and errors for all participants in each Condition were computed across the trials of each Block. The difference in performance between Blocks 1 and 5 was considered as a measure of general learning (difference B1-B5 = RTmeanB1-B5, RTsdB1-B5, ErrorB1-B5), whereas the difference in performance between Blocks 5 and 6 (fixed to pseudo-randomized order of visual stimulations) was considered as a measure of specific learning of the sequence (difference B6-B5 = RTmeanB6-B5, RTsdB6-B5, ErrorB6-B5). A decrease in RTmean, RTsd and errors was expected between B1 and B5 as evidence of general learning, and an increase in RTmean, RTsd and errors was expected between B5 and B6 (fixed to pseudo-randomized order of visual stimulations) as evidence of specific sequence learning.
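The two learning scores defined above can be illustrated with a short sketch. This is only one plausible reading of the description, not the authors' analysis code: the function names and the trial format (a list of `(rt_ms, correct)` pairs per Block) are hypothetical, the general learning score is taken as B1 minus B5 and the specific learning score as B6 minus B5, and RTsd is computed as the sample standard deviation of correct-response RTs.

```python
from statistics import mean, stdev


def block_summary(trials):
    """Summarize one Block given a list of (rt_ms, correct) pairs.

    Incorrect responses are counted as errors and excluded from the RT measures.
    """
    correct_rts = [rt for rt, correct in trials if correct]
    return {
        "RTmean": mean(correct_rts),
        "RTsd": stdev(correct_rts),  # sample SD; needs at least two correct trials
        "errors": sum(1 for _, correct in trials if not correct),
    }


def learning_scores(blocks):
    """Compute general (B1 - B5) and specific (B6 - B5) learning scores.

    blocks: dict mapping "B1", "B5" and "B6" to their trial lists.
    """
    summary = {name: block_summary(trials) for name, trials in blocks.items()}
    measures = ("RTmean", "RTsd", "errors")
    general = {m: summary["B1"][m] - summary["B5"][m] for m in measures}
    specific = {m: summary["B6"][m] - summary["B5"][m] for m in measures}
    return general, specific


if __name__ == "__main__":
    # Toy data for a single hypothetical participant.
    toy = {
        "B1": [(500, True), (520, True), (480, False)],
        "B5": [(400, True), (420, True), (410, True)],
        "B6": [(450, True), (470, True), (460, False), (440, False)],
    }
    general, specific = learning_scores(toy)
    print("general:", general)
    print("specific:", specific)
```

With this sign convention, positive general scores indicate faster or more accurate responding at B5 than at B1, and positive specific scores indicate a cost when the learned sequence is replaced by pseudo-random trials.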

Explicit knowledge

Three kinds of responses were recorded to answer the question “Did you notice that the presentation of the boxes followed a repeated sequence?”: yes, no and maybe. We then computed the percentage of yes in each Condition.

Statistical analyses

The equality of variances was assessed with Levene’s test and the normality of the distribution with the Kolmogorov-Smirnov test. If assumptions were verified (at least 50% of data with equality of variance and normal distribution), analyses of variance (ANOVAs) were used. If the data did not satisfy the criteria of equality of variance or normality, nonparametric tests (Friedman, Mann-Whitney and Wilcoxon signed rank tests) were used. When appropriate, the data were further analyzed with post-hoc analyses (Fisher test). A significance level of 0.05 was adopted for all analyses. Only significant results (p < .05) are reported in the Results section. All results are plotted using means and standard errors.

To make sure that the laterality of participants did not differ between Conditions, a one-way ANOVA with Conditions as a Factor was conducted on the Laterality Quotient. To make sure that attentional characteristics of participants did not differ between Conditions prior to the SRTT, one-way ANOVAs with Conditions as a Factor were conducted on the mean reaction times and errors for the two attentional tests. Pearson’s correlations were used to explore the link between attentional performance (alert phasic and divided attention) and the SRTT general learning score (difference B1-B5 = RTmeanB1-B5, RTsdB1-B5, ErrorB1-B5) and specific learning score (difference B6-B5 = RTmeanB6-B5, RTsdB6-B5, ErrorB6-B5).

B0 analyses

To make sure that participants did not differ between Conditions prior to the SRTT, one-way ANOVAs with Conditions as a Factor were conducted on RTmean, RTsd and errors. Pearson’s correlations were used to explore the link between B0 performance (RTmean, RTsd and errors) and the SRTT general learning score (difference B1-B5 = RTmeanB1-B5, RTsdB1-B5, ErrorB1-B5) and specific learning score (difference B6-B5 = RTmeanB6-B5, RTsdB6-B5, ErrorB6-B5).

To determine general learning (B1-B5) for all conditions, Conditions x Blocks ANOVAs with Conditions (VisOnly, CongrAV, NonCongrAV, RegAud, FastRhyth and IrregAud) as between-subject factor and Blocks (B1 to B5) as a repeated measure were conducted on RTmean. A one-way ANOVA was conducted on the RTmean differences B1-B5 (RTmeanB1-B5) to compare general learning evolution between Conditions. Friedman tests were used to assess the evolution of RTsd and errors in each Condition.

To determine sequence-specific learning (B5-B6), Conditions x Blocks ANOVAs with Conditions (VisOnly, CongrAV, NonCongrAV, RegAud, FastRhyth and IrregAud) as between-subject factor and Blocks (B5 to B6) as a repeated measure were conducted on RTmean and errors from B5 to B6. One-way ANOVAs were conducted on the RTmean and error differences B6-B5 (RTmeanB6-B5 and ErrorB6-B5) to compare specific learning evolution between Conditions. Wilcoxon signed rank tests were used to compare RTsd evolution between Conditions from B5 to B6 (RTsdB6-B5).

We estimated the Bayes factor for these data using JASP [65]. The Bayes factor is used to compare two hypotheses (H0 and H1) based on collected data. It indicates how much more likely the data are under one hypothesis than under the other (e.g., [66-69]). The Bayes factor BF10 quantifies the evidence for H1 relative to H0 given our data. Although Bayes factors are defined on a continuous scale, several researchers have proposed to subdivide the scale into discrete evidential categories [70]. We used the standard non-informative Cauchy prior in JASP with a default width of 0.707.

We computed the percentage of “yes” reported by participants in each Condition. Kruskal-Wallis ANOVAs by Ranks were used to compare this percentage between Conditions. We also compared the specific learning score of participants who noticed a repeating pattern of positions and participants who did not with t-tests on the RTmean and error differences B6-B5 (RTmeanB6-B5 and ErrorB6-B5).
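As an illustration of the discrete evidential categories mentioned above for the Bayes factor, the sketch below maps a BF10 value to a verbal label. It assumes the commonly used Jeffreys-style cut-offs (1, 3, 10, 30, 100); the text does not list the exact thresholds applied in the study, and both the function name and the handling of values exactly on a boundary are hypothetical choices.

```python
def evidence_category(bf10):
    """Return a verbal label for a BF10 value (Jeffreys-style cut-offs, assumed)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be strictly positive.")
    if bf10 == 1:
        return "no evidence"
    if bf10 < 1:
        # The data favour H0: classify the reciprocal (BF01) and relabel.
        return evidence_category(1 / bf10).replace("H1", "H0")
    thresholds = [
        (3, "anecdotal"),
        (10, "substantial"),
        (30, "strong"),
        (100, "very strong"),
    ]
    for upper, label in thresholds:
        if bf10 < upper:
            return f"{label} evidence for H1"
    return "decisive evidence for H1"


if __name__ == "__main__":
    for value in (1.98, 2.50, 9.42, 150):
        print(value, "->", evidence_category(value))
```

Under these assumed cut-offs, a BF10 of 9.42 falls in the substantial band, just below the threshold for strong evidence.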

Results

Pre-tests

No difference was found between Conditions on the participants’ mean reaction time (RTmean) for the divided attention task or on the phasic arousal index. No difference was found on the Laterality Quotient or, at B0, on the RTmean, the variability of the reaction time (RTsd) or errors (Table 1a). Moreover, we did not find any correlation between the performance at the first block (B0) and SRTT learning scores, or between the two scores of the attentional tasks and SRTT learning scores (see Table 1b).

Table 1

	CONDITIONS						F and p values
	VisOnly	CongrAV	NonCongrAV	RegAud	IrregAud	FastRhyth	-
Number of participants	10	10	10	10	10	10	-
Laterality Quotient	72.00 (±19.85)	79.03 (±23.72)	72.85 (±27.88)	75.06 (±16.46)	80.81 (±25.17)	85.76 (±14.95)	F (5, 54) = 0.6, p = .712
B0 RTmean	409.70 (±56.28)	453.16 (±80.82)	452.52 (±65.27)	398.83 (±44.98)	440.15 (±80.20)	428.95 (±54.09)	F (5, 54) = 1.2, p = .323
B0 RTsd	113.24 (±37.39)	108.77 (±30.26)	136.79 (±53.17)	100.42 (±31.07)	122.99 (±55.39)	120.74 (±64.48)	F (5, 54) = 0.7, p = .613
B0 errors	2.5 (±1.90)	4.6 (±4.40)	2.0 (±2.87)	5.4 (±3.03)	3.2 (±2.10)	3.8 (±2.40)	F (5, 54) = 2.0, p = .101
Phasic Alert (PA)	0.311 (±0.05)	0.022 (±0.06)	0.018 (±0.06)	0.012 (±0.07)	-0.001 (±0.04)	0.057 (±0.08)	F (5, 53) = 1.0, p = .417
Divided Attention (DA)	596.76 (±51.66)	661.28 (±90.35)	632.88 (±68.39)	599.25 (±68.39)	641.06 (±77.62)	619.00 (±61.84)	F (5, 53) = 1.3, p = .274

a: Participants’ results for the Block 0 (B0) and the Tests of Attentional Performance (TAP). b: Pearson correlations’ results between Block 0 (B0), the Tests of Attentional Performance (TAP) and SRTT learning scores. Numbers in parentheses represent standard deviations. RTmean = Reaction Times / RTsd = standard deviation of the reaction time / VisOnly = Visual Only / CongrAV = Congruent Audio-Visual / NonCongrAV = Non-Congruent Audio-Visual / RegAud = Regular Auditory Stimulations / IrregAud = Irregular Auditory Stimulations / FastRhyth = Quick tempo Rhythmic Auditory Stimulations. ANOVAs were used to compare Conditions. None of the correlations was significant.

General learning (B1-B5)

The ANOVA on RTmean revealed a significant effect of Block (F (4, 216) = 27.11, p < .001, η2P = .334, BF10 > 100). A BF10 greater than 100 corresponds to decisive evidence for H1. As illustrated in Fig 2a, RTmean decreased from Block 1 to Block 5. The ANOVA also revealed a significant interaction between Blocks and Conditions (F (20, 216) = 1.76, p = .027, η2P = .140, BF10 = 1.98), suggesting that the evolution of RTmean differed between Conditions. However, a BF10 of 1.98 corresponds only to anecdotal evidence in favor of H1.

Fig 2

a. Mean Reaction Times for general learning (from B1 to B5) and for specific learning (from B5 to B6) of all Conditions. Regular Rhythmic Auditory Stimulations (RegAud in purple squares), Congruent Audio-Visual (CongrAV in blue squares), Visual Only (VisOnly in green triangles), Non-Congruent Audio-Visual (NonCongrAV in red triangles), Irregular Auditory Stimulations (IrregAud in grey triangles) and Quick Rhythmic Auditory Stimulations (FastRhyth in yellow triangles). Vertical bars represent the standard errors. b. Mean Reaction Times (z-scores) for general learning (from B1 to B5) and for specific learning (from B5 to B6) of all Conditions. Same symbols and colors as in a. Vertical bars represent the standard errors.

To explore this result more precisely, we conducted an ANOVA with Conditions as a Factor on RTmeanB1-B5, which revealed a significant Condition effect (F (5, 54) = 2.84, p = .024, η2P = .208, BF10 = 2.50). A BF10 of 2.50 corresponds to anecdotal evidence in favor of H1. Post-hoc Fisher tests revealed that RTmeanB1-B5 was lower for the FastRhyth Condition than for all the other Conditions (VisOnly: p = .007; CongrAV: p = .014; NonCongrAV: p = .003; RegAud: p = .005; IrregAud: p = .005). In order to remove inter-group differences in terms of absolute response speed and allow for more sensitive differences in general learning, each participant’s observation on each measure was converted to a z-score standardized on that participant’s mean and variance (pooled across blocks). As before, the ANOVA revealed a significant Block effect in the general learning phase (F (4,236) = 23.67, p < .001, η2P = .286, BF10 > 1000) (Fig 2b).
For general learning, Friedman tests revealed a significant Block effect only in the IrregAud Condition for both errors (χ²r (N = 10, df = 4) = 11.55, p = .021) and RTsd (χ²r (N = 10, df = 4) = 10.96, p = .027), suggesting that errors and RTsd increased in this Condition from B1 to B5 (Fig 3a and 3b respectively).

Fig 3

a. Mean variability of Reaction Times (RTsd) for the Irregular Auditory Stimulations Condition (IrregAud) over general learning (B1 to B5). Vertical bars represent the standard errors. b. Number of errors for the Irregular Auditory Stimulations Condition (IrregAud) over general learning (B1 to B5). Vertical bars represent the standard errors.

Specific learning (B5-B6)

The ANOVA on RTmean from B5 to B6 revealed a significant Block effect (F (1, 54) = 161.14, p < .001, η2P = .749, BF10 > 100). A BF10 greater than 100 corresponds to decisive evidence for H1. As illustrated in Fig 2, RTmean increased for all Conditions. The Block effect from B5 to B6 was also significant for errors (F (1, 54) = 39.75, p < .001, η2P = .424, BF10 > 100; decisive evidence for H1), suggesting that errors increased between B5 and B6 for all Conditions (Fig 4a). Moreover, the ANOVA conducted on ErrorB6-B5 revealed a significant Condition effect (F (5, 54) = 3.83, p = .005, η2P = .262, BF10 = 9.42). A BF10 of 9.42 corresponds to substantial-to-strong evidence for H1. Post-hoc Fisher tests revealed that ErrorB6-B5 was higher in the CongrAV Condition than in the VisOnly (p = .006) and FastRhyth (p = .020) Conditions (Fig 4b). Moreover, ErrorB6-B5 was higher in the RegAud Condition than in the VisOnly (p < .001), NonCongrAV (p = .016), IrregAud (p = .049) and FastRhyth (p = .003) Conditions (Fig 4b).

Fig 4

a. Errors for general learning (from B1 to B5) and for specific learning (from B5 to B6) of all Conditions. Regular Rhythmic Auditory Stimulations (RegAud in purple), Congruent Audio-Visual stimulations (CongrAV in blue), Irregular Auditory Stimulations (IrregAud in grey), Non-Congruent Audio-Visual stimulations (NonCongrAV in red), Quick tempo Rhythmic Auditory Stimulations (FastRhyth in yellow) and Visual Only (VisOnly in green). Vertical bars represent the standard errors. b. Error differences during specific learning from B5 to B6 (ErrorB6-B5) of all Conditions. Same colors as in a. Vertical bars represent the standard errors.

In order to remove inter-group differences in terms of absolute response speed and allow for more sensitive differences in specific learning, each participant’s observation on each measure was converted to a z-score standardized on that participant’s mean and variance (pooled across blocks). As before, the ANOVA revealed a significant Block effect in the specific learning phase (F (1, 59) = 393.21, p < .001, η2P = .870, BF10 > 1000) (Fig 2b).
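The within-participant standardization described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name and input format are hypothetical, and because the text does not specify the variance estimator, the sample standard deviation across a participant's Blocks is assumed.

```python
from statistics import mean, stdev


def zscore_within_participant(block_values):
    """Standardize one participant's per-Block values on their own mean and SD.

    block_values: dict mapping Block name to a measure (e.g., RTmean in ms)
                  for a single participant, pooled across Blocks.
    """
    values = list(block_values.values())
    participant_mean = mean(values)
    participant_sd = stdev(values)  # sample SD (assumed estimator)
    if participant_sd == 0:
        raise ValueError("All values are identical; z-scores are undefined.")
    return {
        block: (value - participant_mean) / participant_sd
        for block, value in block_values.items()
    }


if __name__ == "__main__":
    # Toy RTmean values (ms) for one hypothetical participant.
    print(zscore_within_participant({"B1": 500, "B5": 400, "B6": 450}))
```

Because each participant is scaled by their own mean and spread, differences in absolute response speed between groups no longer contribute to the Block effect.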

Explicit knowledge

Of the 60 participants, 39 (65%) reported that they had perceived a repeated pattern, including 6 participants in the VisOnly condition, 9 in the CongrAV condition, 6 in the NonCongrAV condition, 6 in the RegAud condition, 5 in the IrregAud condition and 7 in the FastRhyth condition. There were no significant differences between Conditions in explicit knowledge (H (5,60) = 4.11, p = .534). The t-tests on the B6-B5 differences in RTmean and errors (RTmeanB6-B5 and ErrorB6-B5) between participants who noticed a repeating pattern of positions and those who did not were not significant.
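The p value reported for the comparison between Conditions can be cross-checked from the H statistic alone. The sketch below assumes that H is a Kruskal-Wallis statistic referred to a chi-square distribution with 5 degrees of freedom (6 Conditions minus 1), which the text does not state explicitly. It uses the closed-form upper-tail probability for odd degrees of freedom so that only the Python standard library is needed; the function name is illustrative.

```python
import math


def chi2_upper_tail_odd_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with odd degrees of freedom."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    root = math.sqrt(x)
    # Sum of root**(2r - 1) / (1 * 3 * ... * (2r - 1)) for r = 1 .. (df - 1) / 2
    series, term = 0.0, root
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    tail_normal = math.erfc(root / math.sqrt(2))
    return tail_normal + math.sqrt(2 / math.pi) * math.exp(-x / 2) * series


# H value as reported in the text; df = 5 is an assumption (see lead-in).
p_value = chi2_upper_tail_odd_df(4.11, df=5)
print(f"p = {p_value:.3f}")
```

Under that assumption the result is approximately .53, in line with the reported p = .534.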

Discussion

The present study aimed to investigate the effects of auditory stimulations on procedural learning of a visuo-motor sequence. To this aim, auditory stimulations were introduced during an SRTT, either with a regular rhythm (RegAud) or in temporal congruency with visual stimulations (CongrAV). These conditions were compared to four control conditions: without auditory stimulations (VisOnly), with incongruent audio-visual stimulations (NonCongrAV), with irregular auditory stimulations (IrregAud) or with a quick tempo regular rhythm (FastRhyth). Globally, our results are in accordance with our hypotheses and indicate that both regular rhythmic auditory stimulations (RegAud) and congruent audio-visual (CongrAV) stimulations enhance procedural learning. This improvement concerns specific learning, as attested by a larger increase in errors when a randomized order of visual stimulations is introduced. Firstly, it is important to note that these results are not related to laterality, attentional scores or performance at B0, that is, at the beginning of the SRTT. Moreover, all conditions led to the same explicit detection of the sequence. There is considerable debate in the literature about the link between awareness and what is learned [71,72]. In our study, although some subjects became aware of a repeating pattern during the learning phase (B1 to B5), there were no learning differences between aware and unaware subjects, as also shown by [28]. Hence, the detection of the repeating pattern cannot explain the differences in learning between the conditions. However, further investigations of the explicit knowledge associated with learning with and without auditory stimuli are needed, for example with a subjective scale with more gradations, which might be more sensitive to differences.
Secondly, given that the RegAud and CongrAV conditions led to greater improvement than control conditions that also combined auditory and visual stimulations, one cannot explain the differences in learning by the addition of a second stimulation to a single cue. Thirdly, attentional profile is not linked to the learning process. This is in line with several studies showing no links between different executive function tasks and implicit learning [73,74]. Thus, the way by which the auditory stimulations are delivered is not likely to be responsible for the benefits. We discuss the learning improvement in the RegAud and the CongrAV conditions with respect to the involvement of a multisensory rhythmic integration process.

Specific learning can occur without general learning

In two conditions (IrregAud and FastRhyth) we found specific learning without general learning. It is in accordance with previous findings showing that it is possible to enhance general learning but not sequence-specific learning [see 75]. Thus, our results support the idea that general and specific learning are two distinct processes which are subserved by distinct neural correlates [76]. The first one corresponds to stimulus-based mappings and the second one to internalized sequence representation, or response-based mappings [e.g., 76,77].

General learning is lower with irregular stimulations and quick tempo auditory stimulations

Errors are discarded from a large number of studies using the SRTT although they are important indicators of learning. Indeed, the SRTT involves a permanent speed–accuracy trade-off and learning can be attested by a concomitant decrease in RTmean and/or errors. If a decrease in RTmean is accompanied by an increase in errors, it is not clear whether this change reflects a learning effect or only a change of strategy. However, if one of these two variables decreases while the other one remains stable, it means that performance increases. Interestingly, [78] showed that accuracy and speed depend on the instructions (directed more toward speed or toward accuracy) but that the learning effect occurs in both cases. Given that our results show differences between conditions on errors only, it is possible that our instructions emphasized accuracy more than speed. Taking both variables into account, our results reveal that the IrregAud condition did not lead to general learning because it induced a decrease in RTmean concomitant with an increase in errors between B1 and B5. Due to the irregularity (non-isochrony) of the auditory stimulations in the IrregAud condition, it is likely that extracting a temporal pattern from irregular auditory stimulations is more difficult than from regular rhythmic auditory stimulations [34,52,79-81]. During a motor task, the introduction of irrelevant auditory stimulation negatively influences motor control [82]. Indeed, [37] showed that irrelevant sounds can cause a disengagement of attention from the task. In this case, participants attempt to suppress the distractors in order to successfully complete the motor task. Other studies showed that auditory distraction alters both visual attention and motor control [83-85].
Hence, introducing irregular auditory stimulations could have generated a distractor effect which limits the attentional focus on the SRTT and could have limited general learning. Our results also highlight a higher RT variability with the irregular auditory stimulations compared to the other conditions. Again, this is consistent with results of studies showing that providing rhythmic auditory stimulations automatically attracts the tempo of tapping and leads to better stability of movement tempo compared to control conditions without auditory stimulations. In particular, [86] suggested that auditory rhythms can modify parameters related to motor production, especially by reducing the variability of muscle activity during the preparatory period. The decrease in RTmean was smaller in the quick tempo rhythmic auditory stimulations (FastRhyth) condition, suggesting that general learning was lower in this condition compared to the other conditions. Given that the only difference between FastRhyth and RegAud is the tempo of the auditory stimulations, this suggests that tempo affects motor learning. The delay between a participant’s response and the next stimulation was 200 ms. Thus, with a tempo of 300 ms, when a motor response occurred at the same time as an auditory stimulation, the next auditory stimulation occurred 100 ms after the next visual stimulation, which is too short for the participant to respond. Indeed, the RTmean achieved at the last Block (B5) in the FastRhyth condition was 381.93 ms (± 61.13 ms). The literature shows that audio-motor entrainment is strongest when the tempo of the external rhythm is close to the spontaneous movement tempo (about 600 ms) but vanishes when the difference between the tempo of the external rhythm and the individual’s movement tempo is too high [45,87,88]. Furthermore, our results also suggest that a tempo quicker than the spontaneous tempo is detrimental for general learning.
As in the IrregAud condition, the deleterious effect of the auditory metronome in the FastRhyth condition could be explained by a distractor effect.
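The timing argument for the FastRhyth condition can be made explicit with a few lines of arithmetic. The sketch below only uses the three values given in the text (200 ms response-to-stimulus delay, 300 ms tempo, 381.93 ms mean RT at B5) and assumes, as the text does, that a response lands exactly on an auditory beat. The function and variable names are illustrative.

```python
def beat_offset_after_next_visual(tempo_ms: float, response_stimulus_interval_ms: float) -> float:
    """Time between the next visual onset and the next auditory beat.

    Assumes a response coincides with a beat at t = 0, so the next visual
    stimulation appears at t = RSI and the next beat occurs at t = tempo.
    """
    return tempo_ms - response_stimulus_interval_ms


RSI_MS = 200.0          # response-to-stimulus delay given in the text
FAST_TEMPO_MS = 300.0   # FastRhyth tempo given in the text
MEAN_RT_B5_MS = 381.93  # mean RT at B5 in the FastRhyth condition, from the text

offset = beat_offset_after_next_visual(FAST_TEMPO_MS, RSI_MS)
beat_precedes_typical_response = offset < MEAN_RT_B5_MS
print(offset, beat_precedes_typical_response)
```

With these values the beat falls 100 ms after the visual onset, well before a typical response at about 382 ms, which is one way to see why the fast metronome could not line up with the participants' responses.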

Specific learning is enhanced with congruent audio-visual stimulations and regular rhythmic auditory stimulations

Specific learning of the sequence is attested by an increase in RTmean and errors at B6 (random visual stimulations) compared to B5 (sequenced visual stimulations). This means that a larger increase in RTmean and errors between B5 and B6 indicates greater specific learning of the visuo-motor sequence. Our results indicate that RTmean increased for all groups, suggesting that each condition led to specific learning of the visuo-motor sequence. However, errors increased more in the conditions with congruent audio-visual stimulations (CongrAV) and regular rhythmic auditory stimulations (RegAud) than in control conditions. Even though the number of participants is small, the relatively high BF10 supports the reliability of the observed effect. Even if we expected to find the effects on RTmean rather than on errors, this result is in accordance with our hypotheses. Hence, both the congruent audio-visual stimulations (CongrAV) and the regular rhythmic auditory stimulations (RegAud) enhance procedural learning of the sequence. Benefits in the CongrAV and RegAud conditions are not due to the introduction of two sources of stimulation rather than one, given that the IrregAud and FastRhyth conditions, which also provide two sources of stimulation, did not enhance learning. Our result in the CongrAV condition is in line with the findings of [43,89,90] showing that practice with audio-visual stimulations improves the acquisition of visual motion-detection skills faster than practice with visual stimulations only. The advantage of audio-visual stimulations compared to visual stimulations could be attributed to several processes related to multisensory integration such as (1) a faster detection of audio-visual stimulations than visual stimulations [35,39,42], (2) an improvement of spatial attention [91] and (3) a faster visual learning with multisensory stimulations compared to visual stimulations only [43].
Our results also contribute to the debate in the literature distinguishing (1) a modality-specific mechanism proposing that learning occurs in each modality separately and (2) a modality-general mechanism in which learning is independent across modalities [90,92]. In line with previous results [90], our results tend to favor the latter proposal. Indeed, the authors of [90] showed that learners are able to extract statistical regularities from audiovisual input and to integrate them into audio and visual streams separately. In our case, even if the auditory stimulations alone do not provide any cue regarding the sequence, it seems that they still helped the learning of the visual sequence when they were presented simultaneously with the visual stimulations. Therefore, our results are in line with the proposal that procedural learning is sensitive to multimodal input. With regard to the benefit of regular rhythmic auditory stimulations (RegAud), our results on errors are surprising given that previous studies indicate that RegAud quickens RTmean compared to visual stimulations [57]. Our results suggest that RegAud improves learning of the motor sequence by modulating errors (i.e., responses at a wrong spatial location), suggesting that RegAud enhances spatial encoding of the motor sequence. This facilitation is in line with previous results showing an improvement of movement stability in both time and space with RegAud [48-50]. Interestingly, we found facilitation in the RegAud condition even though we did not directly manipulate the temporal pattern of the to-be-learned material, as was done in most previous studies [see for example 32,93,94]. Indeed, the sequence of positions was played out through the visual modality whereas we implemented a temporal structure through auditory stimuli.
One hypothesis is that the temporal regularity of these stimuli could have prepared the attentional system to deal with specific stimuli arriving in the same temporal pattern [e.g., 95,96]. Indeed, these effects have already been shown using other tasks of implicit learning of pitch structures [97,98], working memory [99], and statistical learning of artificial language [100]. Moreover, this tempo seems to be well suited given that the optimum tempo for motor synchronization is between 400 and 800 ms [101]. Overall, the regularity of the RegAud may have facilitated the learning of the visual stimulation sequence, and the increase in the number of errors at the random block (B6) suggests that participants continue to inappropriately play the sequence out [18]. Note that we used an ambitious design with six different conditions. However, all of them were required to understand the overall effect. For example, we showed that it is not only the regularity of the tempo that is decisive, but also that the tempo must be suited to the motor task to facilitate learning. Moreover, the small sample might have led us to underestimate the effects. However, a within-participants design would not have been possible because of (1) the possible transfer or interference effects between conditions and (2) the length of the experiment (6 different conditions x 6 blocks of 100 stimuli + attentional tests). Despite this ambitious design, we found some promising results that need to be explored more deeply and replicated.
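The specific-learning measure defined at the start of this section is simply the B6 value minus the B5 value, computed on RTmean or on errors. The sketch below shows that computation per condition. The condition names and error counts are hypothetical placeholders, not data from the study, and the function name is illustrative.

```python
from statistics import mean


def specific_learning_score(sequence_block, random_block):
    """Mean of the random block (B6) minus mean of the last sequence block (B5)."""
    return mean(random_block) - mean(sequence_block)


# Hypothetical error counts, one value per participant; not data from the study.
errors = {
    "ConditionA": {"B5": [2, 3, 1, 2], "B6": [6, 7, 5, 6]},
    "ConditionB": {"B5": [2, 2, 3, 1], "B6": [3, 3, 4, 2]},
}

scores = {
    name: specific_learning_score(blocks["B5"], blocks["B6"])
    for name, blocks in errors.items()
}
```

A larger positive score means that performance deteriorated more when the learned sequence was replaced by random trials, which is how greater specific learning is read in this study.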

Conclusion

For the first time, our results provide a strong argument in favor of the benefits of audio-visual and regular rhythmic auditory stimulations on procedural learning. This benefit was absent in the control conditions. Given that the addition of auditory information does not automatically enhance procedural learning (control conditions), the benefits cannot be attributed to the addition of auditory information itself but rather to the rhythmic structure of the auditory stimulations and to the temporal congruency of the auditory and visual stimulations. This suggests that regular rhythmic audio-visual stimulations are a relevant condition to improve procedural learning of perceptual-motor sequences. These preliminary results need replication and extension with a retention test (with reintroduction of the repeated sequence in another Block, B7). Future research is also required to find out how sequence learning and temporal information are precisely related, possibly with investigations of the temporal structure of the sequence and the cerebral correlates of procedural learning with rhythmic multisensory stimulations.

Bayes factors levels.

Table for the interpretation of each Bayes factors level. Adapted from Jeffreys (1998). (DOCX)

SRTT data.

(CSV)

Auditory sounds—Regular auditory sequence.

(WAV)

Auditory sounds—Irregular auditory sequence.

(WAV)

10 May 2021

PONE-D-21-00943
Rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study
PLOS ONE

Dear Dr. Lagarrigue,

Thank you for submitting your manuscript to PLOS ONE, and we wish to apologize again about the unusually long delay in returning to you the reviewer comments. Both reviewers (and I) were generally positive about the contribution, but both raise substantial issues and questions that need to be addressed before the manuscript fully meets PLOS ONE’s publication criteria. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Hopefully the distinction is clear between what is suggested or recommended (e.g. "...Figure 2 could...") versus what is required to address (any direct questions), but feel free to reach out with any questions about that.

In addition to the scientific concerns, I would like to point out that the Data Availability requirement has not been met for this submission. Specifically, the data availability questions state: "Stating ‘data available on request from the author’ is not sufficient. If your data are only available upon request, select ‘No’ for the first question [Data Availability] and explain your exceptional situation in the text box." In most cases, PLOS ONE policy requires data to be deposited in a public repository or submitted as supplementary information.

Please submit your revised manuscript by Jun 24 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Christopher R. Fetsch Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. 
Please ensure that you have provided sufficient detail on participant recruitment in the Methods section. 3. Thank you for including your ethics statement: "All participants provided informed consent prior to data collection. Procedure was in accordance with the Declaration of Helsinki and followed institutional ethics board guidelines for research on humans (IRB00003888)". 3.1. Please amend your current ethics statement to include the full name of the ethics committee/institutional review board(s) that approved your specific study. 3.2. Please amend your current ethics statement to confirm that your named institutional review board or ethics committee specifically approved this study. 4.3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information. Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”). For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research. 4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. 
In your revised cover letter, please address the following prompts: a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. We will update your Data Availability statement on your behalf to reflect the information you provide. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: No Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The authors investigate the effects of auditory stimulations on learning a visuomotor sequence (serial reaction time task). Their findings suggest that, when these stimulations are congruent with the task and/or rhythmic at a regular tempo, they promote procedural learning. Their findings are interesting, and indeed show that auditory stimulations affect procedural learning of a motor sequence. I have some comments/questions as to whether this effect amounts to enhancement; my reservations may be resolved with some additional clarifications. 
1) The evidence for enhancement comes from studying specific learning (comparing performance in late learning of a sequence vs. subsequent performance in a random sequence block). The reaction times are similar across groups, but the error rates are different, which is the evidence for enhancement in the RAS and CAV groups. While SRTT studies seem to mostly rely on RT analysis, I agree with the authors that the error rate is also important and an indication of learning. However, one issue with looking at error rates (and their differences) is that they might be more likely to have floor/ceiling effects – for example, a participant might have little in terms of errors in the random block (B6), thus can't be much better in B5. In particular, maybe the VO group (essentially the “vanilla” SRTT group, against which we compare for enhancement) is simply pretty good in both B5 and B6, with little errors. So, given the importance of errors in justifying the paper’s thesis – that there is enhancement of learning – it will be helpful to see the absolute error rates during each block (like Figure 2, but for errors), rather than just the difference in error rates between B5 and B6.

2) What could obscure this enhancement when it comes to general learning? If we assume general learning encompasses specific learning (assessed by comparing B5 with B6) among other things, one would expect that increases in specific learning would increase general learning as well.

3) Another limitation in comparing learning across the different groups is that they start from different baselines: e.g. there are large differences in reaction times in B1, before much learning is supposed to occur. In other words, there might be a confound due to differences in baseline performance.

4) As a minor point – auditory stimuli might also act as cues to retrieve the previously learned sequence. So, during the random block, auditory stimuli might promote retrieval of the sequence (even though it is not helpful at that block) – in that way, we could see an over-expression of the previously learned sequence, which would not necessarily mean that the sequence was learned to a greater extent previously.

5) Participants might be learning the sequence but also learning to deal with the auditory stimulation – it might be helpful to show that their performance was stable *before* the sequence was introduced.

Errors and requests for clarification:

1) The caption in Figure 2 does not match the legend in terms of describing the different curves (*for the purpose of my review, I assumed the legend is correct*): it mentions RAS as yellow circles, whereas the legend shows them as purple squares; QRAS as gray circles, whereas the legend shows them as yellow triangles, etc. Please make sure all the colors/symbols mentioned in the caption are correct. The caption also mentions * / # that are not shown in the figure. Captions for figures 3a/3b also mention asterisks that are not shown.

2) p.15 , second to last line: “a larger improvement of errors” – is it meant “a larger increase of errors”?

3) Clarification: Page 4 “… than with irregular metrics (i.e. isochronous intervals)” - Aren’t isochronous intervals also regular tempo?

4) Having “results” as a subtitle for the methods section (2.6) might be not helpful since this is also section 3

5) Typos on Page 16, bottom: one of these two variable*s* decrease*s*… the performance increase*s*…. The IAS condition did not led -> lead

Reviewer #2: The present study investigates the potential benefit of additional auditory stimulation (with temporal regularity or not) on visual sequence learning, as implemented by a classical SRTT. The study is timely and novel, proposing to combine temporal regularity processing with prediction and sequence learning. The project is very interesting and promising.
However, the design with 6 between-participants conditions and only 10 participants per condition is ambitious and might lead to underestimate the effects or the missing observation of differences for RTs. The manuscript is missing numerous details and requires some explanations and clarifications (see below). Some streamlining of the writing and section presentation should also help the reader. The manuscript could also benefit from integrating the present work into other previous research. Numerous statistical learning studies investigating the influence of concurrent or implemented temporal structures are missing in the introduction and discussion (e.g., Buchner & Steffens, 2001; Shin & Ivry 2002; Selchenkova 2014a,b). The paper might also benefit from the integration of the Dynamic Attending Theory and its hypothesis of metric binding (see for example Jones, 2016, 2019). Further relevant references for the present work are the papers by Fujii & Wan (2014), Patel & Iversen (2014) and Tierney & Kraus (2014).

Buchner, A., and Steffens, M. C. (2001). Simultaneous learning of different regularities in sequence learning tasks: limits and characteristics. Psychol. Res. 65, 71–80.
Fujii, S., & Wan, C. Y. (2014). The role of rhythm in speech and language rehabilitation: The SEP hypothesis. Frontiers in Human Neuroscience, 8.
Jones, M. R. (2016). Musical time. In S. Hallam, I. Cross, & M. Thaut, The Oxford Handbook of Music Psychology (2nd ed.). Oxford University Press.
Jones, M. R. (2019). Time will tell. Oxford University Press.
Patel, A. D., & Iversen, J. R. (2014). The evolutionary neuroscience of musical beat perception: The Action Simulation for Auditory Prediction (ASAP) hypothesis. Frontiers in Systems Neuroscience, 8.
Selchenkova et al. (2014a). Metrical presentation boosts implicit learning of pitch structures. PLOS ONE 9(11): e112233
Selchenkova et al. (2014b). The influence of temporal regularities on the implicit learning of pitch structures. Quarterly Journal of Experimental Psychology, 67, 2360–2380.
Shin, J. C., & Ivry, R. B. (2002). Concurrent learning of temporal and spatial sequences. J. Exp. Psychol. Learn. Mem. Cogn. 28, 445–457.
Tierney, A., & Kraus, N. (2014). Auditory-motor entrainment and phonological skills: Precise auditory timing hypothesis (PATH). Frontiers in Human Neuroscience, 8.

Methods

- For the musical background of the participants, it would be relevant to further add the information of musical training in terms of average and SD. Table 1 does not present the participant characteristics (see page 5). One table seems to be missing and the reference to table 1 related to the pre-test condition should read table 2?
- Response-Stimulus-Interval was 200ms. In absence of response, the highlighted square remained on the screen for 3000ms. Did the sequence then go on with the next highlighted square? Please clarify.
- Please clarify the construction of B6. Does “same pseudo-random sequence following the same rules” now refer to all 100 stimuli? Or also a repeating sequence of 10 cycling through ten times?
- Considering the discussions in SRTT research about what is learned during exposure, why didn’t the authors use two controlled Second-Order-Conditional sequences (see research by Cleeremans or Destrebecqz), counterbalancing as test or training across participants?
- Why was the first block referred to as B0? What was the difference between B0 and B1? I guess B0 was presented without auditory stimulation, please clarify.
- Page 7 “This sequence was randomly assigned to each participant” – Does this refer to one of the four sequences (A, B, C or D)? Why were these sequences not respectively used as the test-block B6 sequences (such as, for example, Participant 1 would get sequence A for blocks B0 to B5 and sequence B for block 6 while Participant 2 would get the reverse attribution)?
- Why did the authors not include the expected B7 block that returns back to the exposure sequence (B0 to B5), as usually done in SRTT? - The material presentation requires further details: what was the irregular pattern used in IAS? Was it the same for all blocks? If yes, participants might have learned it and it might have become less unexpected over time. - Was the auditory material presented during all blocks (B0 to B6)? - How was the sound of the auditory stimulation made? At which pitch height and loudness level was it presented? How was it presented (via headphones or free field)? - What was told to the participants about the purpose of the sound? - When were the TAP tests presented? - For the explicit knowledge testing, why the choice or 3-alternatives rather than a subjective scale with more gradings, which might be more sensitive to differences? Results: - Were RTs only kept from correct responses? It is not clear how the average and the variability was calculated: “across Blocks of trials were computed” Wouldn’t it need to be done across trials of each of the blocks? - Figure 2 could integrate an extra data point presenting RTs at B0 (without connecting lines to B1). - Should title 2.6. read “data analyses”? This information could be combined with the data analysis section further up, allowing for streamlining the manuscript and removing redundancies. - 2.6.1: The correlations should also be completed with the task learning score B0-B5 for attention and B0 analyses. - 2.6.2: the two ANOVAs are redundant – using blocks as additional factor (to investigate the difference between B5 and B6) or running the analysis on the computed difference should provide equivalent results (as shown by same F-values, p-values and partial eta2, see page 13). - Page 11: Please present the used categories of the Bayes factor interpretation in the text. - 3. Results: o Page 12: add exact p-values for the correlations into the text (or the table). 
Were Bayesian analyses also run for these pre-test analyses (as for 3.2)?
o Page 13: why were additional Friedman tests run for IAS?
o Figure 2 suggests that the RTs differ in absolute terms between at least a subset of the conditions (e.g., fastest RT for RAS, followed by VO, QRAS and IAS being slowest). It might be interesting to normalize response times (e.g., using z-scores for each participant). This would remove these inter-group differences in terms of absolute response speed, but might allow for being more sensitive to reveal differences in general and specific learning.

Discussion
- Page 18/19: The discussion of the QRAS condition is interesting, but it requires that the overall speed of response of the participants in this condition is not faster (as faster RTs would leave less room for improvement, that is, decrease of RTs).
- Page 19: This study showed some learning on errors but not on RTs. Were RTs particularly fast in comparison to other studies? The explanation that error modulation with RAS reflects enhanced spatial encoding does not seem to be complete because one could also argue that enhanced spatial encoding of the motor sequence should lead to modulation of RTs.

The manuscript could gain in clarity and should be checked by a native English speaker too. At some points, clarity in writing could be increased (e.g., “Attentional performance was assessed to explore the link between visuo-motor learning and attentional skills. They were also used to make sure that groups …”). I guess ‘they’ refers to the two attentional tests, but they were not mentioned here.

The abbreviations used for the six conditions are somewhat abstract and difficult to remember; replacing them by more informative ones would facilitate reading (e.g., VisOnly, CongrAV, NonCongrAV, RythAud, IrregAud, FastRyth).
Also, “rhythmic” does not seem to be the best wording for the “rhythmic auditory stimulation”, in particular with its contrast to the Irregular auditory stimulation, the label “Regular auditory stimulation” (RegAud) seems more appropriate.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

21 Jul 2021

Dear Professor Fetsch and reviewers,

We would like to thank you for reviewing our manuscript entitled “Rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study” for publication in PLOS ONE.
All of you made very insightful comments and proposed relevant changes that greatly improved the quality of the manuscript. The manuscript has been revised for better readability according to the suggestions. We responded to all of the remarks with associated changes highlighted in blue. See all changes in the document "Response to Reviewers". In accordance with PLOS ONE’s policy, we uploaded an anonymized data set as Supporting Information files and we adjusted titles and Figures’ style to meet PLOS ONE’s requirements. All authors have reviewed and agreed to the submission of the revised manuscript. We hope that our responses fix all the issues that you raised and that the manuscript is now acceptable for publication. Please do not hesitate to contact me if there are any questions.

Submitted filename: Responses to reviewers.docx

17 Aug 2021

PONE-D-21-00943R1
Rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study
PLOS ONE

Dear Dr. Lagarrigue,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it largely meets PLOS ONE’s publication criteria, although there are a small number of remaining issues to address and suggestions for improvement. Therefore, we invite you to submit a revised version of the manuscript that addresses these issues.

Both reviewers and I appreciate the effort in response to the previous comments, and all agree the resulting manuscript is greatly improved. Reviewer 2 listed a number of relatively minor concerns/suggestions, and one or two more substantial ones, that should be addressed.

Please submit your revised manuscript by Oct 01 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org.
When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
- A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
- A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
- An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Christopher R. Fetsch
Academic Editor
PLOS ONE

Journal Requirements:
Please review your reference list to ensure that it is complete and correct.
If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: No

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file).
The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes

**********

6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I’d like to thank the authors for their response. I believe the inclusion of Figure 4a, and block b0 in multiple figures, paint a more clear picture of what’s going on. Moreover, thank you for the general vs. specific learning clarification. I originally understood general learning as “total” learning, i.e. something that would encompass specific learning; here however general and specific learning are distinct. My one remaining suggestion is that you add a short definition earlier in the manuscript, when these terms first appear, as not all readers might be familiar with the distinction.

Reviewer #2: The revised manuscript is considerably improved in clarity and presentation. I still have the following points that the authors should address in a revision.
- The manuscript should address the concern of the ambitious design of six between-participants conditions and only 10 participants per condition, which might lead to underestimating the effects, for example.
- The response letter explains that B0 was another sequence different from the to-be-learned sequence and supposed to act as baseline. However, this would require it to be presented without auditory stimulation. If B0 is presented with the same auditory condition as B1, then its purpose is not clear. Indeed, then even B0 is submitted to potential influences of the experimental condition. Using the first block to check that participants did not differ between groups would have served the same purpose. Considering that within-block learning can occur, presenting two different sequences at the beginning might also affect learning (i.e., learning and re-learning) and raises concerns regarding the similarity between B0 and the experimental sequence, which might affect learning differently.
- Page 84 (pdf) “following the previous rules applied to 100 trials” (p. 8): This “citation” is not in the manuscript, please check all citations to make sure that the manuscript contains the same text as claimed in the response letter to the reviewers.
- Page 84 (pdf) “applied to 100 trials”: “trials” is a terminology that might lead to confusion, “items” seems more appropriate (here and elsewhere) or clarify. Keep the same wording throughout (sometimes referred to as “stimulations”).
- The suggestion to use the same sequences across participants as test or learning was not to investigate “transfer effects” (page 85 of the pdf), but to control for sequence specific features. Please clarify and integrate your explanation and justification also in the manuscript as other readers might wonder the same.
- Page 53 pdf/page 8 manuscript: “a last Block (B6) presented the visual stimulations in a pseudo-random fashion”.
The authors should spell out the exact sequence used (as they do for sequences A, B, C and D) so that the reader can compare it with the different training sequences used.
- Page 86 pdf: The temporal pattern consisted of 160 stimuli. This needs to be clarified as one block consisted of 100 items. In addition, the authors should explain how the “intervals between them were determined randomly”, what were the possible intervals (range min/max, sampling, etc.). Also provide the used sequence as supplementary material in written form and audio file.
- It would be helpful for the readers to appreciate the work done by the authors and the manipulations used by adding as supplementary materials some short video excerpts displaying the paradigm and illustrating the auditory congruency or not (e.g., allowing for evaluating also the potential disturbance of the irregular sequence).
- Page 87: The presentation at 80 dB seems quite loud. Why did the authors opt for this loudness level?
- Page 89 (pdf): I agree with the authors that Jeffreys table could be placed in an appendix section.
- Page 90: Thanks for this response. The authors should present their normalised RT data also in the manuscript, as they propose here. The z-score transformation (pooled across blocks) should remove the between-group speed differences and allow for the discussion (previous version page 18/19 of the manuscript). However, I do not see the purpose of using z-scores standardized by block as this does not allow for testing learning effects (see bottom part of page 90).
- Page 10 manuscript: “fixed to randomised” - pseudorandomized?
- Page 19 manuscript: “more difficult than rhythmic auditory stimulations” - does this here refer to “regular” rhythmic stimulations? Please clarify (and check throughout the manuscript that the labelling is clear).
- The authors should also extend their discussion to the influence of multi-dimensional or dual cues in learning (e.g., Mitchel & Weiss, 2011 JEP:LMC).
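
To illustrate the pooled z-score transformation discussed in the Page 90 comment, the following is a minimal Python sketch and not the authors' analysis. It assumes one mean RT per block for each participant; the function names `zscore_pooled` and `learning_scores`, the RT values, and the sign convention of the two learning scores are hypothetical choices made only for this example.

```python
from statistics import mean, pstdev


def zscore_pooled(block_rts):
    """Standardise one participant's block RTs with the mean and SD
    pooled over all of that participant's blocks (assumed input:
    a dict mapping block label to mean RT in ms)."""
    values = list(block_rts.values())
    mu = mean(values)
    sigma = pstdev(values)  # population SD; sample SD would also be a valid choice
    if sigma == 0:
        raise ValueError("RTs are constant across blocks; z-scores are undefined")
    return {block: (rt - mu) / sigma for block, rt in block_rts.items()}


def learning_scores(z):
    """Assumed convention: positive 'general' = faster at B5 than at B1,
    positive 'specific' = slower on the random block B6 than on B5."""
    return {"general": z["B1"] - z["B5"], "specific": z["B6"] - z["B5"]}


# Hypothetical data: two participants with the same learning profile,
# the second one being uniformly 1.5 times slower than the first.
fast = {"B1": 400.0, "B2": 380.0, "B3": 370.0, "B4": 360.0, "B5": 350.0, "B6": 390.0}
slow = {block: rt * 1.5 for block, rt in fast.items()}

z_fast = zscore_pooled(fast)
z_slow = zscore_pooled(slow)
scores_fast = learning_scores(z_fast)
scores_slow = learning_scores(z_slow)

print(scores_fast)
print(scores_slow)
```

Because the two hypothetical participants differ only by a constant speed factor, their pooled z-scores, and therefore both learning scores, coincide up to rounding. Standardising across participants within each block would instead centre every block on zero, which is presumably why such scores cannot show change across blocks.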
**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

27 Sep 2021

Dear Professor Fetsch and reviewers,

We would like to thank you for your positive and constructive comments on our manuscript entitled “Regular rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study”. The manuscript has been revised for better readability according to the suggestions. We responded to all of the suggestions with associated changes highlighted in blue. All authors have reviewed and agreed to the submission of the revised manuscript.
We hope that our responses fix all the concerns that you raised and that the manuscript is now acceptable for publication. Please do not hesitate to contact me if there are any questions.

Submitted filename: Responses to Reviewers.docx

13 Oct 2021

Regular rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study
PONE-D-21-00943R2

Dear Dr. Lagarrigue,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Christopher R. Fetsch
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

26 Oct 2021

PONE-D-21-00943R2
Regular rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: a pilot study

Dear Dr. Lagarrigue:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Christopher R. Fetsch
Academic Editor
PLOS ONE

94 in total

1. Concurrent learning of temporal and spatial sequences.

Authors: Jacqueline C Shin; Richard B Ivry
Journal: J Exp Psychol Learn Mem Cogn Date: 2002-05 Impact factor: 3.051

Review 2. Development of multisensory integration from the perspective of the individual neuron.

Authors: Barry E Stein; Terrence R Stanford; Benjamin A Rowland
Journal: Nat Rev Neurosci Date: 2014-08 Impact factor: 34.870

Review 3. Benefits of multisensory learning.

Authors: Ladan Shams; Aaron R Seitz
Journal: Trends Cogn Sci Date: 2008-11 Impact factor: 20.229

Regular rhythmic and audio-visual stimulations enhance procedural learning of a perceptual-motor sequence in healthy adults: A pilot study.

Introduction

Method

Participants

Materials

Procedure

Test of Attentional Performance (TAP)

The Serial Reaction Time Task (SRTT)

Serial reaction time task.

Data analyses

Pre-tests

SRTT

Explicit knowledge

Statistical analyses

B0 analyses

Results

Pre-tests

General learning (B1-B5)

Specific learning (B5-B6)

Explicit knowledge

Discussion

Specific learning can occur without general learning

General learning is lower with irregular stimulations and quick tempo auditory stimulations

Specific learning is enhanced with congruent audio-visual stimulations and regular rhythmic auditory stimulations

Conclusion

Bayes factors levels.

SRTT data.

Auditory sounds—Regular auditory sequence.

Auditory sounds—Irregular auditory sequence.

1. Concurrent learning of temporal and spatial sequences.

Review 2. Development of multisensory integration from the perspective of the individual neuron.

Review 3. Benefits of multisensory learning.

4. Response-to-stimulus interval does not affect implicit motor sequence learning, but does affect performance.

5. Specific sequence effects in the serial reaction time task.

Review 6. Common mechanisms of human perceptual and motor learning.

Review 7. Capturing spatial attention with multisensory cues: a review.

8. Period basin of entrainment for unintentional visual coordination.

Review 9. The cerebellum and neural networks for rhythmic sensorimotor synchronization in the human brain.

10. Context dependent learning in the serial RT task.

1. Task-irrelevant auditory metre shapes visuomotor sequential learning.