
Temporal grouping effects in verbal and musical short-term memory: Is serial order representation domain-general?

Abstract

Whether the serial order mechanisms involved in short-term memory are domain-general or domain-specific is currently under debate. The present study addressed this question by examining temporal grouping effects in short-term memory tasks with musical material, a domain that has received little attention so far. The goal was to determine whether positional coding, currently the best account of grouping effects in verbal short-term memory, represents a viable mechanism to explain grouping effects in the musical domain. In a first experiment, non-musicians performed serial reconstruction of 6-tone sequences, half of which were presented in two groups of three items and the other half at a regular pace. The overall data pattern suggests that temporal grouping exerts the same effects on the reconstruction of tone sequences as in the verbal domain, except for ordering errors, which were not characterised by the typical increase in interpositions. This pattern was replicated in two additional experiments with verbal material, using the same grouping structure as in the musical experiment. The findings indicate that the verbal and musical short-term memory domains are characterised by similar temporal grouping effects for the recall of 6-item lists grouped by three, but they also suggest the existence of boundary conditions for observing the increase in interposition errors predicted by positional theories.


Keywords: Serial order; domain-general; grouping; music; verbal; working memory


Year: 2021 PMID: 34698553 PMCID: PMC9329764 DOI: 10.1177/17470218211057466

Source DB: PubMed Journal: Q J Exp Psychol (Hove) ISSN: 1747-0218 Impact factor: 2.138

Daily life activities such as remembering a phone number, having a discussion, or listening attentively to a piece of music all require the processing of serially organised information that unfolds over time and draws on short-term memory (STM) resources. The question of whether the mechanisms contributing to the maintenance of serially organised memoranda are domain-general or domain-specific is currently under debate in the STM literature (Hurlstone et al., 2014; Jones et al., 1995; Logie et al., 2016; Majerus, 2013; Soemer & Saito, 2016; Vandierendonck, 2016). Previous research comparing serial order STM for verbal and visuospatial items supported the view that the representation of serial order in STM is supported by domain-general mechanisms (for a review, see Hurlstone et al., 2014). However, the extent to which the domain-generality hypothesis applies to STM for music remains unanswered. Given its inherent rhythmic and sequential structure, music represents an appropriate candidate to further our understanding of the ordering mechanisms involved in STM, as well as to address the question of the domain-generality of these mechanisms. Contrary to the verbal domain, there are only a few models of musical STM (see Berz, 1995; Ockelford, 2007), none of which provide a comprehensive account of the processes responsible for representing serial order of musical sequences. For instance, Ockelford (2007) suggested that serial order is coded in musical working memory through the action of a tagging mechanism in which each item serves as a retrieving cue for the next item (for more details regarding the notion of tagging, see Kieras et al., 1999). However, to the best of our knowledge, there is no direct empirical evidence that such a tagging mechanism plays a role in the representation of musical order in STM. 
Moreover, the tagging notion relies on a chaining account of serial order representation that has been challenged by a recent study on serial order STM for music (see Gorin et al., 2018a). Several serial order effects considered as benchmark phenomena in the verbal STM domain (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008) have also been observed in a recent series of musical STM experiments (Gorin et al., 2016, 2018a, 2018b). The authors interpreted these results as evidence for the existence of domain-general processes to represent serial order information in the musical domain. This interpretation is also in line with the notion that verbal and musical STM systems involve common sequential processes even though they rely on different representational stores (Williamson et al., 2010). Thus, these results suggest that basic ordering principles are at work in the two domains. In addition, they justify the use of verbal order theories as a framework for exploring the nature of ordering mechanisms in musical STM and assessing the generality of these mechanisms. The best account of benchmark order phenomena in the verbal domain comes from models relying on positional codes to represent serial order information (see, for example, Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998; Lewandowsky & Farrell, 2008). In positional models, serial order is represented by associations between items and independent markers representing positions. The main strength of this class of models is its ability to account for temporal grouping effects. Temporal grouping is characterised by the insertion of additional pauses between some items during sequence presentation, inducing the perception of temporally distinct sub-groups of items.
With verbal material, such manipulations lead to well-replicated phenomena that constrain serial order models of STM (see Frankish, 1985, 1989; Hartley et al., 2016; Henson, 1996; Hitch et al., 1996; Maybery et al., 2002; Ng & Maybery, 2002, 2005; Ryan, 1969a, 1969b). For grouped sequences, a recall advantage as well as a multiply-bowed serial position curve are usually observed. An increase in the proportion of interposition errors, or between-group displacements of items that keep their initial within-group serial position, is also characteristic of the recall of grouped sequences. For instance, in a 6-item sequence composed of two groups of three items, an interposition error would be to recall the item from Position 2 (i.e., Position 2 in the first group) at Position 5 (Position 2 in the second group). The study of temporal grouping effects is of particular interest to help determine the precise nature of serial order representation in STM. For example, models relying on ordinal codes such as activation gradients to represent serial order (see, for example, Farrell & Lewandowsky, 2002; Page & Norris, 1998) can accommodate the main effects induced by temporal grouping manipulations, namely the recall advantage and the scalloped serial position curve. However, to account for a wider range of effects induced by temporal grouping (i.e., an increase in interposition errors in addition to the recall advantage and the scalloped appearance of the serial position curve), it is necessary to assume the existence of positional codes. Positional models accommodate the increase in interposition errors by representing serial order in a hierarchical manner. Items are associated with positional markers representing within-group positions, as well as markers representing the position of the groups/items in the sequence (see Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998).
The hierarchical representation of serial order makes the items in grouped sequences more distinctive than in ungrouped ones, accounting for the recall advantage and the multiply-bowed shape of the serial position curve. Moreover, this hierarchical representation of serial order increases the similarity between items in different groups that share the same within-group position, thus accounting for the increase in interposition errors observed in grouped sequences (see Figure 1 for a graphical example).

Figure 1.

Schematic representation of positional markers. Top: ungrouped sequence where six digit items are associated with positional markers representing positions in the sequence (grey shades). Bottom: grouped sequences where items are associated with markers representing the position of items in the sequence (grey shades) and within the groups (blue shades). Darkest and lightest shades represent the start and end of the sequence, respectively. As one can see, the similarity between items at Positions 1 and 4 is low in ungrouped sequences, but the similarity between these items increases in grouped sequences because of the additional marker representing within-group positions.

In the musical domain, a great deal of work has been devoted to the study of the psychophysical and musical components influencing how adult listeners process and maintain musical information in STM (for a review, see Deutsch, 2013a, 2013b). However, little is known about the cognitive mechanisms involved in the short-term maintenance of musical information, and particularly those required to represent and maintain the order of a series of tones. In non-musicians, serial order reconstruction of verbal and musical sequences is characterised by similar serial order effects, suggesting that verbal ordering principles could be extended to the musical domain (Gorin et al., 2018a). In another study using a serial recognition task, researchers observed temporal grouping effects that are comparable to those usually observed in verbal STM tasks, suggesting that the positional markers described in verbal STM models of serial order could play a role in STM for music (Gorin et al., 2016).
More precisely, the authors showed that in non-musicians, the rate of correct serial recognition for matching probes is higher for grouped than for ungrouped sequences, and that recognition as a function of position adopted a shape reflecting the grouping structure used in the experiment, replicating previous results obtained with musicians (Deutsch, 1980). However, the conclusions drawn in Gorin et al. (2018b) are limited as the assessment of temporal grouping on interposition errors was not possible due to the use of a recognition procedure. At the same time, there is evidence for the existence of interposition-like errors in serial order production tasks requiring experts to retrieve and play short musical excerpts on the piano from memory (Mathias et al., 2015). In comparison to shorter musical excerpts, for which the similarity between elements sharing the same metrical accent (strong or weak) in the sequence is reduced, longer musical excerpts are characterised by increased long-distance transpositions between positions with the same metrical accent. Interestingly, this phenomenon can be accounted for by a model of musical sequence production assuming that to-be-produced musical events are represented hierarchically (see Mathias et al., 2015; Palmer & Pfordresher, 2003; Pfordresher et al., 2007). This model represents musical events according to their serial position and their metrical status strength, which is similar to the hierarchical coding of serial order proposed in positional models of verbal STM described above (see Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998). As mentioned earlier, a growing body of evidence shows that benchmark serial phenomena characterising verbal STM are also observed in visuospatial STM tasks (for a review, see Hurlstone et al., 2014), and to some extent in the musical domain as well (Gorin et al., 2018b).
This evidence supports the first account that the processing of serial order information is supported by processes shared across domains. At the same time, some authors consider that the presence of similar ordering phenomena across domains is also compatible with the existence of domain-specific mechanisms, but with functional similarities (see, for example, Logie et al., 2016; Saito et al., 2008). Indeed, observing the same serial order phenomena across STM domains is compatible with both a single domain-general mechanism and with domain-specific mechanisms coding serial order in a similar manner, and only this second account assumes that the existence of functionally similar domain-specific mechanisms can account for both differences and similarities across domains (Logie et al., 2016). Another account would be that serial order mechanisms are shared across modalities (e.g., auditory or visual) but not specific domains (e.g., verbal, visual and musical). In other words, we could envisage that both verbal and musical materials similarly draw on auditory STM resources, the latter being underpinned by auditory domain-general processes responsible for coding serial order information for both types of material. This is in line with a recent proposal from Hartley et al. (2016) suggesting that cross-domain sequential principles are responsible for processing order information but function in parallel with domain-specific mechanisms responsible for perceptual input. They proposed a stimulus-driven mechanism responsible for processing and encoding order information in auditory–verbal sequences based on the activity of neuronal oscillators tracking amplitude variations of the speech envelope at different timescales. Interestingly, it has been suggested that the encoding of rhythmic features in both speech and music could be governed by a similar stimulus-driven oscillatory mechanism (see, for example, Musacchia et al., 2014). 
Thus, considering the evidence for domain-specificity in processing musical information (Peretz & Coltheart, 2003; Zatorre et al., 2002), a more parsimonious account would be that domain-specific features interact with domain-general ordering mechanisms (see Majerus, 2013). To summarise, the present study aimed at investigating the effects of temporal grouping on immediate serial reconstruction of tone sequences. Through the comparison of the temporal grouping effects observed for tone sequences with those reported in the verbal STM literature, our goal was to (1) improve our understanding of the mechanisms underlying the representation of serial order in musical STM and (2) address the question of the domain-generality of serial order processes in STM. We conducted a first preregistered experiment comparing the forward reconstruction of serial order information between ungrouped 6-tone sequences and the same sequences grouped in two groups of three items. Based on the results obtained in that first experiment, a non-preregistered follow-up online experiment requiring the serial recall of 6-letter grouped and ungrouped sequences was conducted to allow a direct comparison with the data obtained in Experiment 1. Because a ceiling effect limited the comparison of temporal grouping effects in the musical (Experiment 1) and verbal (Experiment 2) domains, another non-preregistered online experiment was conducted to address this ceiling effect. Overall, these experiments support the close similarity between the temporal grouping effects observed in the verbal and musical domains.

Experiment 1: forward reconstruction of musical order

Method

Sampling plan

There is currently a trend in the field of psychological sciences favouring the use of Bayesian statistical techniques to design experiments and make statistical inferences. Bayesian statistics provide several advantages (for a review, see Dienes, 2016; Wagenmakers et al., 2018). For instance, Bayesian statistical analyses allow the monitoring of statistical evidence during data collection, are not influenced by the intention with which data are collected, and are not sensitive to optional stopping rules (Berger & Berry, 1988; Rouder, 2014). With these considerations in mind, we used the following sampling plan for Experiment 1 (for a similar rationale in determining sampling plan, see Wagenmakers et al., 2015). We first recruited 20 participants and conducted the planned analyses. If these analyses (see the “Analysis plan” section for more details) all yielded a strong level of statistical evidence for either the alternative (H1) or the null (H0) hypothesis, with a Bayes factor (BF) of 10 or more, data collection would be stopped. If that criterion was not met for at least one of our planned analyses, we would recruit more participants while monitoring BF values. In other words, we ran the same analyses after each batch of five participants and continued until we reached strong statistical evidence (for H0 or H1) in all the planned analyses. However, due to resource limitations, we planned to stop data collection after the recruitment of 50 participants, even if the criterion of statistical evidence had not been met for all the planned analyses.
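The stopping rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the procedure actually run by the authors: the function name `should_stop` and its parameters are hypothetical, and the Bayes factors are assumed to be expressed as BF10 values, so that evidence for H0 corresponds to BF10 ≤ 1/10.

```python
def should_stop(bf10_values, n_participants, threshold=10.0, n_max=50):
    """Decide whether data collection stops (illustrative sketch only).

    bf10_values: one BF10 per planned analysis (hypothetical input format).
    Stop when every analysis shows strong evidence for H1 (BF10 >= threshold)
    or for H0 (BF10 <= 1 / threshold), or when the resource cap is reached.
    """
    all_conclusive = all(
        bf >= threshold or bf <= 1.0 / threshold for bf in bf10_values
    )
    return all_conclusive or n_participants >= n_max


# Start with 20 participants, then add batches of five while undecided.
print(should_stop([12.0, 0.05], 20))  # both analyses conclusive
print(should_stop([12.0, 2.0], 20))   # one analysis still inconclusive
print(should_stop([12.0, 2.0], 50))   # cap of 50 participants reached
```

Under these assumptions, the three calls return `True`, `False`, and `True`, respectively.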

Participants

The experiment was approved by the ethics committee of the Faculty of Psychology and Sciences of Education of the University of Geneva. Fifty-eight first-year psychology students from the University of Geneva took part in Experiment 1 in exchange for partial course credit. The final sample was composed of 50 participants (45 females; age in years: M = 21.78, SD = 1.95; education level in years: M = 13.00, SD = 1.12; musical theory learning in years: M = 0.35, SD = 0.85; musical practice in years: M = 0.69, SD = 1.04) after the exclusion of eight participants who did not meet the inclusion criteria (see the demographic data file on the OSF repository associated with this manuscript for more details).

Inclusion and exclusion criteria

As we were interested in musical STM for serial order processing in participants with no musical expertise, participants must have had no more than 3 years of experience in studying music theory or practising a musical instrument (including singing) at the time of the experiment. We excluded participants with neurological or speech disorders (e.g., dyslexia) from the sample. Finally, we excluded from the analysis the data of any participant whose performance was equal to or lower than the .17 chance level in at least one of the experimental conditions. To adhere to the sampling plan, excluded participants were replaced by recruiting other participants.

Stimuli

The stimuli consisted of 60 sequences of six tones. To reduce the possibility that using a limited set of six tones could increase proactive interference, we used a set of 14 different tones consisting of all the diatonic steps of the C major scale (ranging from C4 to B5). The tones were pure sine waves generated with Audacity (Audacity Team, 2017) and saved as .wav files, each lasting for 500 ms with a rise and fall period of 10 ms. The tone sequences were generated using pseudo-random permutations following three rules adapted from previous studies on verbal STM for serial order (see, for example, Hartley et al., 2016): (1) no more than two consecutive tones that are also consecutive in the tone set (e.g., C4–E4–G4 or B4–D5–F5 was not legal); (2) no more than two consecutive intervals in the same direction (e.g., C4 E4 D5 G4 was permitted but not C4 E4 D5 F5); and (3) no tone at the same serial position in successive trials. As the tones used cover two octaves, we constrained interval sizes to a maximum of seven semitones to avoid the presence of unfamiliar large intervals. We also ensured that the sequences were highly related to a major scale. In other words, each sequence had a maximum key correlation of at least .70 with the tone distribution profile of at least one of the major scales. The maximum key correlation was determined using the Krumhansl-Schmuckler key-finding algorithm (Krumhansl, 1990). To have matched stimuli between the two grouping conditions, we reused the 30 sequences from the ungrouped trials but played them in reverse serial order and presented them from last to first in the grouped trials. To prevent unwanted effects resulting from the use of a fixed set of tone sequences, a new set of pseudo-randomly created tone sequences was generated in advance for each participant.
To ensure that each created sequence was used in both an ungrouped and a grouped trial, even-numbered participants had the ungrouped and grouped sequences corresponding to the grouped and ungrouped sequences, respectively, of the preceding odd-numbered participant in the experiment.
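Some of the constraints listed above lend themselves to a short Python sketch. The code below is a hypothetical illustration, not the authors' generation script: it checks only the interval-direction rule, the seven-semitone limit, and the rule against repeating a tone at the same serial position across successive trials. The first rule and the key-correlation criterion are left out. The MIDI note numbering and all function names are assumptions made for this example.

```python
# MIDI note numbers (an assumed encoding) for the 14 diatonic steps
# of the C major scale from C4 (60) to B5 (83).
TONE_SET = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77, 79, 81, 83]


def intervals(seq):
    """Signed interval (in semitones) between each pair of successive tones."""
    return [b - a for a, b in zip(seq, seq[1:])]


def respects_direction_rule(seq):
    """True if there are never three consecutive intervals in one direction."""
    ivs = intervals(seq)
    for a, b, c in zip(ivs, ivs[1:], ivs[2:]):
        if (a > 0 and b > 0 and c > 0) or (a < 0 and b < 0 and c < 0):
            return False
    return True


def respects_interval_limit(seq, max_semitones=7):
    """True if no interval exceeds the maximum size in semitones."""
    return all(abs(iv) <= max_semitones for iv in intervals(seq))


def no_shared_positions(seq, previous_seq):
    """True if no tone occupies the same serial position in successive trials."""
    return all(a != b for a, b in zip(seq, previous_seq))


# The direction-rule examples from the text: C4 E4 D5 G4 vs. C4 E4 D5 F5.
print(respects_direction_rule([60, 64, 74, 67]))  # permitted
print(respects_direction_rule([60, 64, 74, 77]))  # three rising intervals
```

A generator built on these checks would draw candidate permutations from `TONE_SET` and keep only those that pass every test.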

Experimental design

The experiment was based on a 2-factor within-participants design. The two types of sequences were presented in two different blocks, with the ungrouped sequences always presented first. This was done to prevent the prior presentation of grouped sequences from encouraging the use of subjective grouping strategies in ungrouped trials (for a similar procedure, see Farrell & Lewandowsky, 2004; Hartley et al., 2016). For ungrouped trials, the tones were presented at a regular pace.

Procedure

The procedure consisted of the auditory presentation of 60 trials in total. Stimuli were played at a comfortable listening level through headphones connected to a portable workstation. Each trial began with a countdown from 3 to 1 displayed at the centre of the computer screen at a pace of 500 ms. The tone sequence was played consecutively on a blank screen displayed for 500 ms. In ungrouped trials, the tones were presented with a regular interstimulus interval (ISI) of 150 ms. In grouped trials, the ISI was 75 ms for within-group items (Positions 1–2, 2–3, 4–5, and 5–6) and 450 ms between the items forming the group boundary (Positions 3–4). Immediately after the presentation of a sequence, a virtual keyboard was displayed on the screen and the participants used the touch screen to reconstruct the sequence. The participants were forced to reconstruct the sequences in forward serial order. To do this, they had to find and validate the tone corresponding to the first position, then proceed to the second position, and so on until the whole sequence was reconstructed. The virtual keyboard used to reconstruct the tone sequences is illustrated in Figure 2. A layer of six white keys representing the six tones heard in the to-be-reconstructed sequence was displayed horizontally on the screen. The tones were organised in ascending order, from the lowest on the left to the highest on the right. Each time a key was pressed on the touch screen, the corresponding tone was played through the headphones. Touching a key activated the associated tone and changed the colour of the key to green (see panels 1, 5, 7, or 10 in Figure 2). Once the participant retrieved the tone for the current position and activated the key, they had to press the “validate” button to proceed to the next position (see panels 4, 6, 8, or 12 in Figure 2).
After a tone had been assigned to a position, the corresponding key changed to grey to indicate that the tone could not be used anymore, and the auditory feedback for that key was turned off. It was possible to change the “active” tone before validating a position (see panels 10–12 in Figure 2) but not once the position was validated. If for any position the participant did not remember the corresponding tone or did not want to guess, it was possible to answer “I don’t know” by selecting the “?” button before validating the position (see panel 11 in Figure 2). Finally, at any time during the reconstruction process, participants had the opportunity to hear the sequence reconstructed up to that point (see panel 9 in Figure 2).
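The presentation timing described in the procedure can be worked through with a short Python sketch. It assumes that each ISI is measured from the offset of one 500-ms tone to the onset of the next; the function names are hypothetical. Under that assumption, both conditions have the same total duration, because the grouped condition redistributes the silent time rather than adding to it.

```python
TONE_MS = 500  # duration of each tone, from the Stimuli section


def onsets(isis, tone_ms=TONE_MS):
    """Onset time (ms) of each tone, assuming ISI = offset-to-onset silence."""
    times = [0]
    for isi in isis:
        times.append(times[-1] + tone_ms + isi)
    return times


def total_duration(isis, tone_ms=TONE_MS):
    """Time from the onset of the first tone to the offset of the last."""
    return onsets(isis, tone_ms)[-1] + tone_ms


UNGROUPED_ISI = [150, 150, 150, 150, 150]
GROUPED_ISI = [75, 75, 450, 75, 75]  # long pause at the 3-4 group boundary

print(onsets(UNGROUPED_ISI))  # [0, 650, 1300, 1950, 2600, 3250]
print(onsets(GROUPED_ISI))    # [0, 575, 1150, 2100, 2675, 3250]
print(total_duration(UNGROUPED_ISI), total_duration(GROUPED_ISI))  # 3750 3750
```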

Figure 2.

Graphical representation of the functioning of the serial order reconstruction task for tone sequences in Experiment 1. See the main text for more details about its functioning.

Hypotheses

The experiment had the following aims: (1) to better understand the nature of ordering mechanisms in musical STM through the study of temporal grouping effects in non-musicians, which in turn would allow us (2) to assess the domain-generality hypothesis of serial order in STM. To achieve this, we compared recall performance for ungrouped and grouped 6-tone sequences, focusing on serial recall accuracy, the shape of the serial position curves, response latencies, and the rates of interposition errors. According to the domain-generality hypothesis of serial order STM, we predicted higher recall accuracy for grouped than for ungrouped sequences. We also predicted the presence of a multiply-bowed serial position curve for grouped sequences. Finally, we expected to observe more interposition errors in grouped than in ungrouped sequences.

Analysis plan

We used the open-source program JASP (version 0.14, JASP Team, 2018) with default settings for all the planned analyses (described below) and the exploratory analyses reported. For Bayesian t-tests, the prior was represented as a Cauchy distribution with an r scale of 0.707. For Bayesian analysis of variance (BANOVA), the prior also consisted of a Cauchy distribution, with an r scale of .5 and 1 for fixed and random effects, respectively.

Recall accuracy and serial position curve

We analysed serial position curves by averaging the recall accuracy as a function of serial position and grouping condition for each participant. Then, we performed a 2 × 6 repeated-measures BANOVA, with a 2-level type of sequence factor (ungrouped vs. grouped) and a 6-level serial position factor (from 1 to 6). In case of an interaction between the two factors (i.e., the full model is the best model and is supported by a BF of at least 10, relative to the second-best model), we assessed the presence of mini-primacy and mini-recency effects in grouped sequences by comparing recall accuracy between Positions 1 and 2 (H1: 1 > 2), Positions 2 and 3 (H1: 2 < 3), Positions 4 and 5 (H1: 4 > 5), and Positions 5 and 6 (H1: 5 < 6) via Bayesian paired samples t-tests.
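As a rough illustration of how a serial position curve can be derived for one participant and one condition, the Python sketch below scores each position as correct when the reconstructed tone matches the target tone at that position. The strict position-by-position scoring, the use of `None` for "I don't know" responses, and the function name are assumptions made for this example; the scoring details are not spelled out in the text.

```python
def accuracy_by_position(targets, responses):
    """Proportion of correct responses at each serial position.

    targets, responses: lists of equal-length sequences (one pair per trial).
    A response of None (assumed coding for "I don't know") counts as incorrect.
    """
    n_positions = len(targets[0])
    correct = [0] * n_positions
    for target, response in zip(targets, responses):
        for pos in range(n_positions):
            if response[pos] is not None and response[pos] == target[pos]:
                correct[pos] += 1
    return [c / len(targets) for c in correct]


# Two toy 3-item trials: one with items 2 and 3 swapped, one fully correct.
print(accuracy_by_position([[1, 2, 3], [1, 2, 3]],
                           [[1, 3, 2], [1, 2, 3]]))  # [1.0, 0.5, 0.5]
```

The per-participant curves for the grouped and ungrouped conditions would then feed the 2 × 6 repeated-measures analysis.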

Transposition gradients

We analysed transposition gradients by computing the proportion of transposition errors as a function of displacement separately for each condition and for each participant. To achieve this, we performed a 2 × 10 repeated-measures BANOVA with a 2-level type of sequence factor (ungrouped vs. grouped) and a 10-level displacement distance factor (from –5 to 5, excluding 0). If the full model turned out to be the best model (i.e., BF > 10 compared with the second-best model), we analysed the interaction by focusing on the rate of adjacent displacements and interposition errors (see the next analysis for more details).
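A transposition gradient of this kind can be sketched as follows in Python. Two choices here are assumptions rather than details taken from the text: displacement is defined as output position minus input position (so positive values mean an item was placed too late), and proportions are computed relative to all transposition errors. Function names are hypothetical.

```python
from collections import Counter


def transposition_distances(target, response):
    """Signed displacement of every misplaced item in one trial."""
    distances = []
    for out_pos, item in enumerate(response):
        if item is None or item not in target:
            continue  # omissions ("I don't know") are not transpositions
        distance = out_pos - target.index(item)
        if distance != 0:
            distances.append(distance)
    return distances


def transposition_gradient(trials, max_distance=5):
    """Proportion of transposition errors at each distance (-5..5, no 0)."""
    counts = Counter()
    for target, response in trials:
        counts.update(transposition_distances(target, response))
    total = sum(counts.values())
    return {
        d: (counts[d] / total if total else 0.0)
        for d in range(-max_distance, max_distance + 1)
        if d != 0
    }


# One toy trial in which the items from Positions 3 and 4 are exchanged.
trial = (list("ABCDEF"), list("ABDCEF"))
print(transposition_distances(*trial))  # [-1, 1]
```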

Interposition errors and adjacent displacement rates

The rate of interposition errors and displacements to adjacent serial positions was determined by calculating the proportion of errors involving between-group displacement of items keeping their initial within-group position (i.e., absolute distance of three positions) and the proportion of serial order transpositions characterised by an absolute displacement distance of one serial position among all serial order errors and separately for each type of sequence (ungrouped vs. grouped). Then, the two grouping conditions were compared based on the observed rate of interposition errors (H1: interpositions in grouped sequences > interpositions in ungrouped sequences) and adjacent displacement (H1: adjacent displacement in grouped sequences < adjacent displacement in ungrouped sequences) via Bayesian paired samples t-tests.
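The two rates defined above reduce to simple counts once each order error is expressed as a signed displacement (output position minus input position, an assumed convention). The Python sketch below takes such a list of displacements and returns the proportion of interpositions (absolute distance equal to the group size of three) and of adjacent displacements (absolute distance of one) among all order errors. The function name is hypothetical.

```python
def interposition_and_adjacent_rates(distances, group_size=3):
    """Return (interposition rate, adjacent rate) among all order errors.

    distances: signed displacements of the order errors in one condition.
    With two groups of three items, an interposition keeps the within-group
    position, i.e. it has an absolute displacement equal to group_size.
    """
    n_errors = len(distances)
    if n_errors == 0:
        return 0.0, 0.0
    n_interpositions = sum(1 for d in distances if abs(d) == group_size)
    n_adjacent = sum(1 for d in distances if abs(d) == 1)
    return n_interpositions / n_errors, n_adjacent / n_errors


# Four toy order errors: two adjacent, one interposition, one other.
print(interposition_and_adjacent_rates([1, -1, 3, 2]))  # (0.25, 0.5)
```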

Results

Planned analyses

The 2 × 6 repeated-measures BANOVA performed on recall accuracy as a function of serial position (1–6) and grouping condition (grouped vs. ungrouped), revealed that the best model is the model with the two main effects (see Figure 3a). This model is preferred over the second best, full model by a factor of 1.80 (see “Serial position curves” rows in Table 1). As the preference was characterised only by anecdotal evidence, we conducted an analysis of effect. This was done with JASP via a method averaging evidence across all the models containing the effect of interest. The data provided decisive evidence in favour of the presence of a serial position effect (BFInclusion = ∞), very strong evidence in favour of a grouping effect (BFInclusion = 31.28), and anecdotal evidence in favour of the presence of the interaction (BFInclusion = 2.15). As initially planned, we did not analyse mini-primacy and mini-recency effects in grouped sequences as the presence of the interaction was not supported by the data.
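The factor of 1.80 can be recovered from the BF10 column of Table 1, because Bayes factors computed against a common null model are transitive: the evidence for model A over model B equals BF10(A) divided by BF10(B). The short Python sketch below illustrates this arithmetic with the tabulated values; small deviations from the reported figures can arise from the rounding of the table entries.

```python
def bf_between(bf10_a, bf10_b):
    """Bayes factor for model A over model B, both expressed against the null."""
    return bf10_a / bf10_b


# Serial position curves: main-effects model vs. full model (Table 1).
main_vs_full = bf_between(1.42e82, 7.91e81)
print(round(main_vs_full, 2))  # 1.8

# Response latencies: full model vs. main-effects model (Table 1).
full_vs_main = bf_between(8.28e109, 2.20e102)
print(f"{full_vs_main:.2e}")  # about 3.76e+07, close to the reported 3.77e7
```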

Figure 3.

(a) Serial position curves, (b) transposition gradients, and (c) response latencies as a function of the type of grouping, from Experiment 1. Error bars represent confidence intervals computed on data corrected for between-subject variability (Morey, 2008), following Baguley's (2012, formula 8) recommendations.

Table 1.

Results of the Bayesian repeated-measures analyses of variances for the serial position curve, transposition gradients, and response latencies from Experiment 1.

Analysis	Models	P(M)	P(M | data)	BFM	BF10	Error %
Serial position curves	Null model (incl. subject)	0.20	4.42e⁻⁸³	1.77e⁻⁸²	1.00
	Condition + Position	0.20	0.63	6.80	1.42e⁸²	1.23
	Condition + Position + Condition × Position	0.20	0.35	2.15	7.91e⁸¹	1.25
	Position	0.20	0.02	0.09	4.72e⁸⁰	1.50
	Condition	0.20	6.97e⁻⁸³	2.79e⁻⁸²	1.58	6.19
Transposition gradients	Null model (incl. subject)	0.20	0.00	0.00	1.00
	Condition + Distance + Condition × Distance	0.20	0.99	648.69	∞	1.18
	Distance	0.20	5.73e⁻³	0.02	∞	0.78
	Condition + Distance	0.20	3.96e⁻⁴	1.58e⁻³	∞	1.23
	Condition	0.20	0.00	0.00	0.07	6.26
Response latencies	Null model (incl. subject)	0.20	1.12e⁻¹¹⁰	4.83e⁻¹¹⁰	1.00
	Condition + Position + Condition × Position	0.20	1.00	1.55e⁸	8.28e¹⁰⁹	1.23
	Condition + Position	0.20	2.65e⁻⁸	1.06e⁻⁷	2.20e¹⁰²	1.22
	Position	0.20	2.61e⁻¹¹	1.04e⁻¹⁰	2.16e⁹⁹	1.50
	Condition	0.20	5.82e⁻¹¹⁰	2.33e⁻¹⁰⁹	4.82	6.17

BF: Bayes factor.

All models include subject and for each analysis models are compared with the null model; Condition: temporal grouping effect; Position: serial position effect; Distance: transposition distance effect.

The 2 × 10 repeated-measures BANOVA performed on the proportion of transposition errors as a function of transposition distance (−5 to 5, excluding 0) and grouping condition (grouped vs. ungrouped) revealed that the best model to explain the data is the full model (see Figure 3b). This model is preferred over the second-best model, containing only the effect of distance, by a factor of 173.36, representing decisive support for the best model (see “Transposition gradients” rows in Table 1). Given the clear support for an interaction between grouping condition and transposition distance, we compared the rate of adjacent transpositions and interpositions between the two grouping conditions as initially planned. The directed Bayesian paired samples t-test comparing the rate of interposition errors between the two grouping conditions (H1: grouped > ungrouped) provided anecdotal evidence in favour of the null model (BF01 = 2.16). Next, we compared the rate of adjacent transpositions between the two conditions via another Bayesian paired samples t-test (H1: grouped < ungrouped). The results provided decisive evidence in favour of the presence of fewer adjacent transpositions in grouped than in ungrouped trials (BF10 = 623.10).

Exploratory analyses

Since the present study focused on exploring the nature of serial order representation in musical STM, it is critical to ensure that contour was not the dominant component in representing the sequences. Contour is a critical aspect of melodic representation, particularly in non-experts (see Dowling, 1978; Dowling & Tillmann, 2014). Thus, it is possible that the participants focused more on contour than on item positions. If the grouping manipulation boosted recall performance, it may have done so only through contour-based representations. We therefore re-scored recall performance by considering an interval as correct when its direction (up or down) was the same as that of the corresponding interval in the target sequence. Next, we compared the rate of above-chance correct recall for the item position and contour scoring methods (subtracting chance levels of 0.17 and 0.5 from the item and contour scores, respectively). The results of an undirected Bayesian paired samples t-test conducted on chance-corrected item position and contour scores provided decisive evidence (BF10 = 9.22e5) in favour of better performance when using the item position (M = 0.24, SD = 0.11) than the contour scoring method (M = 0.18, SD = 0.09).

To gain a better idea of the origin of the decrease in adjacent transposition errors in grouped sequences, we compared the rates of within-group and between-group displacements between the two grouping conditions, the latter being divided into interpositions, group-boundary displacements, and other between-group displacements (see Table 2). Exploratory comparisons performed via undirected Bayesian paired samples t-tests provided moderate evidence for an absence of difference between the two conditions regarding the rate of interpositions (BF01 = 3.69), within-group transpositions (BF01 = 6.48), and other between-group transpositions (BF01 = 3.02). Interestingly, the results revealed decisive evidence for a difference between the two conditions in the rate of displacements involving group boundaries (BF10 = 153.29).
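The two scoring methods compared in this analysis can be sketched as follows. This is an illustration only: pitches are represented by arbitrary numbers (higher means higher pitch), the function names are hypothetical, and the chance levels are the values reported in the text.

```python
ITEM_CHANCE = 0.17     # chance level for item-position scoring (from the text)
CONTOUR_CHANCE = 0.5   # chance level for contour scoring (from the text)


def item_score(target, response):
    """Proportion of items recalled at their correct serial position."""
    return sum(t == r for t, r in zip(target, response)) / len(target)


def contour(pitches):
    """Direction of each successive interval: +1 up, -1 down, 0 repeated."""
    return [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]


def contour_score(target, response):
    """Proportion of intervals whose direction matches the target sequence."""
    target_contour, response_contour = contour(target), contour(response)
    matches = sum(t == r for t, r in zip(target_contour, response_contour))
    return matches / len(target_contour)


# Hypothetical trial: the second and third tones are exchanged at recall.
target = [60, 62, 64, 65, 67, 69]
response = [60, 64, 62, 65, 67, 69]
corrected_item = item_score(target, response) - ITEM_CHANCE
corrected_contour = contour_score(target, response) - CONTOUR_CHANCE
```

Subtracting the respective chance level puts the two scores on a comparable above-chance scale before they are compared.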

Table 2.

Proportions of within- and between-group transposition errors, as a function of grouping condition, from Experiment 1.

Grouping type	Within groups	Between groups: boundary	Between groups: interpositions	Between groups: others
Ungrouped	.40(.05)	.10(.03)	.15(.04)	.35(.07)
Grouped	.40(.07)	.08(.04)	.15(.04)	.37(.10)

Values in parentheses are standard deviations.
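The four error categories in Table 2 can be made concrete with a short sketch for lists grouped in threes. The operational definitions below are assumptions made for illustration, since the text does not spell them out: a "boundary" error is taken to be a one-step move across the group boundary, and an "interposition" is a move to the same within-group position in another group.

```python
from collections import Counter

CATEGORIES = ("within", "boundary", "interposition", "other")


def classify_transposition(input_pos, output_pos, group_size=3):
    """Classify one displaced item (0-based positions) into an error category."""
    if input_pos == output_pos:
        raise ValueError("not a transposition: positions are identical")
    same_group = input_pos // group_size == output_pos // group_size
    if same_group:
        return "within"
    # Between-group errors from here on.
    if input_pos % group_size == output_pos % group_size:
        return "interposition"   # same within-group position, other group
    if abs(input_pos - output_pos) == 1:
        return "boundary"        # assumed: adjacent move across the boundary
    return "other"


def error_proportions(displacements, group_size=3):
    """Proportion of each category among (input_pos, output_pos) pairs."""
    counts = Counter(
        classify_transposition(i, o, group_size) for i, o in displacements
    )
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}
```

Under these assumptions the four proportions sum to 1 for each participant and condition, as do the rows of Table 2.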

Finally, we took advantage of the changes introduced in the task, which made response behaviours more comparable with those characterising verbal serial recall, to perform an exploratory analysis of response latencies. This analysis is of interest because temporal grouping exerts an important effect on the pattern of recall timing, which is well accommodated by a two-dimensional representation of positional information (Lewandowsky & Farrell, 2008). In ungrouped sequences, response timing is characterised by a long latency for the initiation of recall, followed by an inverted U-shaped response timing (Farrell & Lewandowsky, 2004). For grouped sequences, an additional long latency is observed at the beginning of temporal groups, reflecting the temporal structure of the sequence (Farrell, 2008; Maybery et al., 2002). To determine the presence of such a pattern in the present study, we performed a BANOVA on the log of response latency (i.e., timing relative to the previous response, or relative to the last presented tone for the first response) for correct responses as a function of serial position (1–6) and grouping condition (grouped vs. ungrouped). The results revealed that the full model is the best model (see Figure 3c), preferred over the second best model containing only the effect of serial position by a factor of 3.77e7, representing decisive evidence supporting the presence of the two main effects and their interaction (see “Response latencies” rows in Table 1).
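The latency measure described above can be sketched as follows. The sketch assumes that each response has a timestamp on the same clock as the last presented tone; whether the reference point is the onset or the offset of that tone is not stated in the text, so `last_tone_time` is a placeholder, and the function name is hypothetical.

```python
import math


def log_latencies(last_tone_time, response_times):
    """Log of each response latency.

    The first latency is measured from the last presented tone, and every
    later latency from the previous response.
    """
    reference_times = [last_tone_time] + list(response_times[:-1])
    return [
        math.log(response - reference)
        for response, reference in zip(response_times, reference_times)
    ]


# Hypothetical trial (times in seconds): slow initiation, then faster responses.
latencies = log_latencies(0.0, [2.0, 2.5, 3.0])
```

In the analysis reported here, only latencies for correct responses entered the BANOVA, so a filter on response accuracy would be applied before averaging by serial position.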

Discussion

Experiment 1 aimed to better understand the nature of serial order representations in musical STM. To achieve this goal, we tested whether, with tone sequences, temporal grouping exerts the same effects on recall accuracy, transposition errors, and response latencies as those reported with verbal material. We presented participants with ungrouped tone sequences and grouped tone sequences consisting of two groups of three items. The evidence that temporal grouping increased recall accuracy was strong. The effect of grouping on the shape of the serial position curve was anecdotal, with only limited scalloping. Analysis of response latencies showed a typical inverted U-shaped profile with a long latency for the first output item in the ungrouped condition, whereas we observed an increase in latency for the first output item in each group in the grouped sequences (for similar results in the verbal domain, see Farrell, 2008; Maybery et al., 2002). However, while temporal grouping reduced the rate of adjacent transpositions for items at group boundaries, a typical pattern in verbal STM for serial order (Henson, 1999; Maybery et al., 2002), we observed evidence against an increase in interposition errors in grouped sequences. This experiment confirmed, using a serial recall procedure, the results of Gorin et al. (2018b) that temporal grouping provides an advantage in short-term recognition of musical stimuli. The pattern of grouping effects observed in this experiment is very similar to what is typically reported for similar verbal STM tasks: grouping induces scalloping of the serial position curve and provides a recall advantage (Frankish, 1985; Hitch et al., 1996; Ryan, 1969a), leads to a decrease in adjacent transpositions (Maybery et al., 2002), and response latency is longer at the beginning of groups (Farrell, 2008; Maybery et al., 2002). 
However, we did not observe the classical increase in interposition errors, which is a benchmark of temporal grouping and is considered evidence for the existence of two-dimensional positional markers coding the positions of items within the groups and the positions of the groups or items in the sequence, respectively (Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998). The results reported here mirror those observed with visuospatial material, where benchmarks of temporal grouping effects were also observed, except for the increase in interposition errors (Hurlstone, 2019). This difference was accounted for by proposing a model of serial order that codes positional information in a slightly different way depending on the type of material. For verbal information, two-dimensional markers code group positions in the sequence and item positions within the groups. For visuospatial material, the two-dimensional markers code group and item positions in the sequence. A straightforward account of the results reported here would be to assume that the same positional coding scheme is used for visuospatial and musical material, but that the increase in interposition errors in grouped sequences is specific to the positional coding scheme used for verbal information. At the same time, the observation of interposition errors in the verbal domain is limited to a very specific context where the items are presented in a sequence of three groups of three items (e.g., Hartley et al., 2016; Henson, 1996; Hurlstone, 2019; Ng & Maybery, 2002, 2005; Ryan, 1969b). To the best of our knowledge, in the literature on temporal grouping effects with verbal sequences of six items (e.g., two groups of three items, see Farrell, 2008; Hitch et al., 1996; Maybery et al., 2002; Parmentier & Maybery, 2008), there is no study reporting an increase in interposition errors in grouped sequences. 
Consequently, inferring the nature of serial order representation in the musical domain based on the assumption that, in the verbal domain, grouping sequences of nine or six items in groups of three should lead to the same pattern of grouping effects may represent a shortcoming. Thus, it is possible that the absence of an increase in interposition errors with musical material is related to the use of 6-item sequences rather than to the presence of different positional coding schemes in the verbal and musical domains. If this is the case, we should observe the same effect with verbal material as seen in the present experiment. To explore this possibility, we conducted an online study in which participants had to recall sequences of letters in serial order and in which we manipulated phonological similarity (similar vs. dissimilar) and the type of grouping (ungrouped vs. grouped).

Experiment 2: forward serial recall of verbal order

This experiment was conducted to determine whether the absence of an increase in interposition errors in grouped musical sequences observed in Experiment 1 was due to the use of 6-item sequences or to a different positional representation compared with the verbal domain. To this aim, we conducted an online experiment requiring participants to recall visually presented letter lists. The first half of the experiment presented participants with ungrouped sequences and the other half with grouped ones. Moreover, to take into account the fact that with musical material performance can be negatively impacted by tonal proximity (Williamson et al., 2010), half of the trials presented sequences composed of phonologically similar letters (e.g., D–G–C–T–P–V) and the other half presented dissimilar letters (e.g., R–L–K–M–F–S). This experiment was conducted during the COVID-19 outbreak in Spring 2020. Due to organisational constraints, we had to allow all participants to take part in the study to ensure the validation of their course credits. Thus, no specific criteria were applied to participants before they took part in the study. Consequently, the sampling plan consisted in letting as many students as possible take part in the study. The URL of the experiment was shared with the participants via a forum used in one of their first-year psychology courses. Exclusion criteria were applied only once the data collection period had ended. The experiment was approved by the ethics committee of the Faculty of Psychology and Sciences of Education of the University of Geneva. A total of 101 first-year psychology students from the University of Geneva participated in this online experiment in exchange for partial course credit. After the exclusion of 15 participants who met the exclusion criteria, the final sample was composed of 86 participants (gender: 66 females, 19 males and 1 other; age in years: M = 22.27, SD = 6.03).

Exclusion criteria

We excluded participants with any learning or neurological disorder as well as those not fluent in French.

The stimuli consisted of 160 six-letter sequences. Half of the sequences were composed of phonologically similar letters, created by drawing six letters at random, without replacement, from the pool B, C, D, G, P, T, and V. The other half were composed of phonologically dissimilar letters, created by drawing six letters, without replacement, from the pool X, H, J, L, K, Q, and S. When generating the sequences, we ensured that the same letter did not occur at the same serial position in consecutive trials and that all the sequences were unique. Finally, as in the previous experiments, a new set of 180 sequences was generated for each participant.

Each trial started with a countdown from 3 to 1. The countdown was displayed in blue on a white background with a sans-serif font and a font size of 30. Each digit was presented in the centre of the screen for a duration of 500 ms, followed by a blank screen of 100 ms. Immediately after the countdown, the six letters were presented sequentially in the centre of the screen. Letters were displayed in black on a white background with a sans-serif font and a font size of 40. Each letter was presented for a duration of 500 ms, followed by a blank screen lasting for 100 ms. In grouped trials, an additional pause of 500 ms was added between the third and fourth items. Directly after the presentation of the last item, a response field represented by an array of six horizontal lines displayed from left to right was shown on the screen. Participants were required to recall the sequence by entering the letters in their order of presentation using the keyboard of their computer. The response field was automatically populated with participants’ answers without any possibility to correct their response. Only letters could be entered in the response field. 
Once the participant had typed six letters, a message inviting them to start the next trial was shown on the screen. The experiment was separated into two blocks. Participants were presented with ungrouped sequences in the first block and with grouped sequences in the second block. The order of the trials presenting phonologically similar and dissimilar letters was random. Each block started with four training trials, two presenting phonologically similar and two presenting dissimilar letter sequences. During the training session, participants received feedback regarding the accuracy of their response, but not during the experimental trials. The task was programmed with lab.js, a free and open-source online study builder (Henninger et al., 2019). Then, the experiment was exported to and hosted on a protected server of the University of Geneva. The management of online data collection was performed with JATOS, an open-source and free online studies manager (Lange et al., 2015).

The data obtained so far show that temporal grouping does not lead to increased interpositions when using musical material, suggesting that different positional codes are required to represent verbal and musical serial order in STM. As there is no evidence in the verbal STM literature that grouping sequences in two groups of three items leads to an increase in interposition errors, an alternative interpretation would be to consider that this effect is specific to the use of longer grouping structures (e.g., 3 × 3 structure). If the latter is true and similar positional codes underlie serial order representation in verbal and musical STM, we should observe the same pattern of data with verbal material as observed in Experiment 1 with musical material. In other words, we should observe a recall advantage, a scalloped serial position curve, and response latency with peaks at the beginning of groups for grouped sequences, but no increase in interposition errors. 
If the former interpretation is true, we should observe the same pattern with an additional increase in interposition errors for grouped sequences. Regarding the effect of phonological similarity, no predictions were made before running the experiment, except for the expectation that recall accuracy should be worse for phonologically similar sequences. The manipulation of phonological similarity was implemented in this experiment only to take into account the fact that there is an inherent effect of pitch proximity when using tone sequences, which is considered a musical proxy of phonological similarity (Williamson et al., 2010). As in the previous experiment, the analyses were performed with JASP (JASP Team, 2018), using the same default values for priors and applying the same analysis plan. For each type of analysis (i.e., serial position curves, transposition gradients, and response latencies), data from trials presenting phonologically dissimilar and similar letters were analysed separately.
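The stimulus constraints described for this experiment (six letters drawn without replacement from a seven-letter pool, no letter repeated at the same serial position on consecutive trials, all sequences unique) can be sketched with simple rejection sampling. This is an illustrative reconstruction, not the code used in the study (the task itself was built with lab.js); the function name and the `rng` parameter are assumptions, and the consecutive-trial constraint is applied here within a single pool only.

```python
import random

SIMILAR_POOL = list("BCDGPTV")      # phonologically similar letters (from the text)
DISSIMILAR_POOL = list("XHJLKQS")   # phonologically dissimilar letters (from the text)


def generate_sequences(pool, n_sequences, length=6, rng=None):
    """Draw unique sequences without replacement from `pool`, rejecting any
    candidate that repeats a letter at the same serial position as the
    immediately preceding sequence."""
    if rng is None:
        rng = random.Random()
    sequences = []
    seen = set()
    while len(sequences) < n_sequences:
        candidate = tuple(rng.sample(pool, length))
        if candidate in seen:
            continue  # every sequence must be unique
        if sequences and any(a == b for a, b in zip(candidate, sequences[-1])):
            continue  # same letter at the same position as on the previous trial
        sequences.append(candidate)
        seen.add(candidate)
    return sequences


# Hypothetical usage: ten similar-letter sequences with a fixed seed.
example = generate_sequences(SIMILAR_POOL, 10, rng=random.Random(1))
```

Rejection sampling is adequate here because a seven-letter pool yields 5,040 possible six-letter orderings, far more than the number of sequences needed per participant.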

Serial position curves

We computed the proportion of correct recall as a function of serial position and temporal grouping across all the dissimilar trials for each participant. We then performed a 2 × 6 repeated-measures BANOVA with serial position (1–6) and grouping condition (grouped vs. ungrouped) as factors (see top-left of Figure 4). The results revealed that the best model was the model with the two main effects, preferred over the second best, the full model, by a factor of 4.44 (see “Serial position curves” rows in Table 3). This was confirmed by an analysis of effects that provided decisive evidence for the two main effects (Grouping: BFInclusion = 1.43e14; Position: BFInclusion = 1.43e14), but anecdotal evidence against the presence of an interaction (BFInclusion = 0.90).
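The "preferred by a factor of" statements used throughout these analyses follow from the model-comparison tables: because every model is compared with the same null model, the Bayes factor between two models is the ratio of their BF10 values. The BFM column expresses the change from prior to posterior model odds. The sketch below uses the rounded values printed in Table 3, so it recovers the reported figures only approximately; the function names are illustrative.

```python
def relative_bayes_factor(bf10_a, bf10_b):
    """Bayes factor of model A over model B, given each model's BF10
    against a common null model."""
    return bf10_a / bf10_b


def model_odds_change(prior_prob, posterior_prob):
    """BFM: posterior model odds divided by prior model odds."""
    posterior_odds = posterior_prob / (1 - posterior_prob)
    prior_odds = prior_prob / (1 - prior_prob)
    return posterior_odds / prior_odds


# Serial position analysis, dissimilar trials (rounded values from Table 3).
main_effects_vs_full = relative_bayes_factor(7.05e87, 1.59e87)   # about 4.43
bfm_main_effects = model_odds_change(0.20, 0.82)                  # about 18.2
```

The small discrepancies with the reported 4.44 and 17.76 come from the rounding of the tabulated inputs.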

Figure 4.

Top panels: serial position curve; middle panels: transposition gradients; bottom panels: response latencies. Left and right parts of the figure depict data from phonologically dissimilar and similar trials, respectively (Experiment 2). Error bars represent confidence intervals computed on data corrected for between-subject variability (Morey, 2008), following Baguley's (2012, formula 8) recommendations.
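The correction for between-subject variability mentioned in the caption can be illustrated with a sketch of the normalisation commonly attributed to Cousineau and Morey (2008): each participant's scores are centred on the grand mean, and the resulting standard errors are inflated by a factor that depends on the number of within-subject conditions. This is a generic sketch under that assumption and does not claim to reproduce Baguley's (2012) formula 8 exactly; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_scores(data):
    """Remove between-subject variability: subtract each participant's mean
    and add the grand mean. `data` holds one row per participant and one
    value per within-subject condition."""
    grand_mean = mean(value for row in data for value in row)
    return [[value - mean(row) + grand_mean for value in row] for row in data]


def corrected_standard_errors(data):
    """Standard error per condition on normalised data, with the
    sqrt(J / (J - 1)) correction for J within-subject conditions."""
    n_participants = len(data)
    n_conditions = len(data[0])
    correction = sqrt(n_conditions / (n_conditions - 1))
    normalised = normalized_scores(data)
    errors = []
    for c in range(n_conditions):
        column = [row[c] for row in normalised]
        errors.append(correction * stdev(column) / sqrt(n_participants))
    return errors
```

A confidence interval half-width is then obtained by multiplying each standard error by the appropriate critical t value for n − 1 degrees of freedom, which is left to the reader's statistics library of choice.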

Table 3.

Results of the Bayesian repeated-measures analyses of variance for the serial position curve, transposition gradients, and response latencies for phonologically dissimilar sequences from Experiment 2.

Analysis	Models	P(M)	P(M | data)	BFM	BF10	Error %
Serial position	Null model (incl. subject)	0.20	1.16e⁻⁸⁸	4.63e⁻⁸⁸	1.00
	Condition + Position	0.20	0.82	17.76	7.05e⁸⁷	1.23
	Condition + Position + Condition × Position	0.20	0.18	0.90	1.59e⁸⁷	1.26
	Position	0.20	1.99e⁻¹⁷	7.98e⁻¹⁷	1.72e⁷¹	1.49
	Condition	0.20	9.95e⁻⁷⁸	3.98e⁻⁷⁷	8.59e¹⁰	6.18
Transposition gradients	Null model (incl. subject)	0.20	4.45e⁻²³⁴	1.78e⁻²³³	1.00
	Distance	0.20	0.93	51.09	2.08e²³³	1.94
	Condition + Distance	0.20	0.07	0.31	1.61e²³²	1.27
	Condition + Distance + Condition × Distance	0.20	8.12e⁻⁴	3.25e⁻³	1.82e²³⁰	1.32
	Condition	0.20	3.70e⁻²³⁵	1.48e⁻²³⁴	0.08	6.25
Response latencies	Null model (incl. subject)	0.20	3.70e⁻²⁵⁶	1.48e⁻²⁵⁵	1.00
	Condition + Position + Condition × Position	0.20	1.00	1.17e⁹	2.70e²⁵⁵	1.26
	Condition + Position	0.20	3.41e⁻⁹	1.36e⁻⁸	9.22e²⁴⁶	1.24
	Position	0.20	2.42e⁻¹³	9.67e⁻¹³	6.53e²⁴²	1.54
	Condition	0.20	9.88e⁻²⁵⁶	3.95e⁻²⁵⁵	2.67	6.17

BF: Bayes factor. All models include subject and, for each analysis, models are compared with the null model. Condition: temporal grouping effect; Position: serial position effect; Distance: transposition distance effect.

The same analysis was performed with data from trials presenting phonologically similar letters, revealing that the best model was the full model and was preferred over the second best model by a factor of 1.67 (see top-right of Figure 4). Given the ambiguous evidence for preferring the best model over the second best model (see “Serial position curves” rows in Table 4), we performed an analysis of effects. The results yielded decisive evidence in favour of the two main effects (Grouping: BFInclusion = 2.70e11; Position: BFInclusion = 6.67e13) and moderate evidence in favour of the existence of an interaction (BFInclusion = 6.70).

Table 4.

Results of the Bayesian repeated-measures analyses of variances for the serial position curve, transposition gradients, and response latencies for phonologically similar sequences from Experiment 2.

Analysis	Models	P(M)	P(M | data)	BFM	BF10	Error %
Serial position	Null model (incl. subject)	0.20	2.61e⁻¹⁰⁹	1.05e⁻¹⁰⁸	1.00
	Condition + Position + Condition × Position	0.20	0.63	6.70	2.40e¹⁰⁸	1.26
	Condition + Position	0.20	0.37	2.39	1.43e¹⁰⁸	1.24
	Position	0.20	2.46e⁻¹²	9.84e⁻¹²	9.41e⁹⁶	1.51
	Condition	0.20	4.72e⁻¹⁰³	1.89e⁻¹⁰²	1.81e⁶	6.20
Transposition gradients	Null model (incl. subject)	0.20	0.00	0.00	1.00
	Condition + Distance + Condition × Distance	0.20	0.49	3.78	∞	1.30
	Distance	0.20	0.48	3.65	∞	1.94
	Condition + Distance	0.20	0.04	0.15	∞	1.27
	Condition	0.20	0.00	0.00	0.08	6.25
Response latencies	Null model (incl. subject)	0.20	3.97e⁻²⁸⁰	1.59e⁻²⁷⁹	1.00
	Condition + Position + Condition × Position	0.20	1.00	8.54e⁶	2.52e²⁷⁹	1.26
	Condition + Position	0.20	4.64e⁻⁷	1.86e⁻⁶	1.17e²⁷³	1.25
	Position	0.20	4.67e⁻⁹	1.87e⁻⁸	1.18e²⁷¹	1.54
	Condition	0.20	1.88e⁻²⁸⁰	7.52e⁻²⁸⁰	0.47	6.17

BF: Bayes factor. All models include subject and, for each analysis, models are compared with the null model. Condition: temporal grouping effect; Position: serial position effect; Distance: transposition distance effect.

Note that for the analysis of transposition errors we removed the participants who produced no error in at least one of the four experimental conditions, leading to a sample of 77 participants. For each participant, we computed the proportion of errors, among all the errors, as a function of absolute transposition distance and temporal grouping across all the dissimilar trials. Then, we analysed the data with a 2 × 5 repeated-measures BANOVA with absolute transposition distance (1–5) and grouping condition (grouped vs. ungrouped) as factors (see middle-left of Figure 4). The results provided strong evidence in favour of the best model containing only the effect of distance, which was preferred over the second best model with the two main effects by a factor of 12.92 (see “Transposition gradients” rows in Table 3). The same analysis was repeated with data from trials with phonologically similar letters (see middle-right of Figure 4). This revealed that the best model was the full model, but it was preferred over the second best model, containing only the effect of distance, by a factor of only 1.02 (see “Transposition gradients” rows in Table 4). As the results were ambiguous, we performed an analysis of effects that revealed decisive and moderate evidence supporting the presence of an effect of distance (BFInclusion = ∞) and an interaction between distance and grouping (BFInclusion = 3.78), respectively. 
Given the moderate support for the interaction, we analysed the rate of adjacent transpositions and interposition errors with directed Bayesian paired samples t-tests (adjacent errors: H1 = ungrouped > grouped; interpositions: H1 = ungrouped < grouped), as in the previous experiment. We obtained strong evidence against both an increase in interposition errors (BF01 = 12.21) and a decrease in adjacent transpositions (BF01 = 25.02) in grouped trials. Then, as in the previous experiment, we analysed the rate of within-group and between-group transposition errors, distinguishing for the latter between interposition errors, group-boundary transpositions, and other between-group transpositions (all comparisons involved undirected Bayesian paired samples t-tests with default priors). As shown in Table 5, there is strong evidence that temporal grouping in dissimilar trials induced an increase in within-group transpositions but a decrease in transpositions involving items at the group boundary. At the same time, there was moderate evidence supporting an absence of difference between the rates of interposition errors and other between-group transpositions. Regarding similar trials, the results reported in Table 6 show the same pattern as for dissimilar trials, except that there was strong evidence for a difference in the rate of other between-group transpositions.

Table 5.

Proportions of within- and between-group transposition errors, as a function of grouping condition, for dissimilar trials from Experiment 2.

Grouping type	Within groups	Between groups: boundary	Between groups: interpositions	Between groups: others
Ungrouped	.68(.23)	.13(.17)	.06(.08)	.13(.15)
Grouped	.80(.21)	.05(.09)	.05(.09)	.10(.15)
Pairwise-comparisons (ungrouped vs. grouped)	BF₁₀ = 29.38	BF₁₀ = 34.82	BF₀₁ = 6.46	BF₀₁ = 3.63

BF: Bayes factor.

Values in parentheses are standard deviations. Pairwise-comparisons were conducted with undirected Bayesian paired samples t-tests.

Table 6.

Proportions of within- and between-group transposition errors, as a function of grouping condition, for similar trials from Experiment 2.

Grouping type	Within groups	Between groups: boundary	Between groups: interpositions	Between groups: others
Ungrouped	.67(.15)	.13(.10)	.07(.06)	.13(.10)
Grouped	.79(.19)	.07(.13)	.06(.08)	.08(.08)
Pairwise-comparisons (ungrouped vs. grouped)	BF₁₀ = 147.54	BF₁₀ = 46.95	BF₀₁ = 7.53	BF₁₀ = 63.81

BF: Bayes factor.

Values in parentheses are standard deviations. Pairwise-comparisons were conducted with undirected Bayesian paired samples t-tests.


Response latencies

For each participant, we determined the mean response latency for correct recall in dissimilar trials as a function of temporal grouping and serial position. The data were next analysed via a 2 × 6 repeated-measures BANOVA with serial position (1–6) and grouping condition (grouped vs. ungrouped) as factors (see bottom-left of Figure 4). The results yielded decisive evidence in favour of the full model containing the two main effects and their interaction, this model being preferred over the second best model by a factor of 2.93e8 (see Table 3). The same analysis was performed with similar trials, leading to the same outcome (see bottom-right of Figure 4): the full model was the best model and was preferred over the second best by a factor of 2.16e6 (see Table 4).

In Experiment 2, we observed that regardless of the phonological similarity of the material, grouped sequences were better recalled and characterised by a scalloped serial position curve compared with ungrouped sequences. In addition, the typical pattern of response latencies with a latency peak for the first item of the second group was found. However, in line with the results reported in Experiment 1 with musical material, no increase in interposition errors was observed in grouped sequences for either phonologically similar or dissimilar trials. At the same time, it should be noted that performance was close to ceiling and that, in such a context, it is difficult to exclude the possibility that the absence of an increase in interposition errors in the grouped sequences is simply due to the fact that the overall number of errors was too low. To determine whether the lack of increase in interpositions is due to a ceiling effect or is specific to the 2 × 3 grouping structure used in Experiment 2, we conducted an additional experiment replicating the procedure used in Experiment 2 but with an end-of-list distractor aimed at reducing recall performance while keeping the same sequence structure.

Experiment 3: serial recall of verbal order with end-of-list distractor task

The goal of this experiment was to test whether the absence of an increase in interpositions in grouped sequences in Experiment 2 was due to the very low number of errors induced by a ceiling effect or specific to the use of lists of 6 items grouped by three. The procedure was the same as in Experiment 2, except that the presentation of each list was followed by a parity judgement task asking participants to judge whether numbers presented on the screen were even or odd. The purpose of this distracting task was to reduce the precision of the recall—and therefore increase the number of ordering errors—while keeping the same grouping structure as in Experiments 1 and 2. Due to the COVID-19 pandemic situation, the experiment was conducted entirely online. As with Experiment 2, the sampling design was to let as many students and non-students from our participant pool take part in the study as possible. The experiment was approved by the ethics committee of the Faculty of Psychology of UniDistance Suisse. Participants were recruited through the UniDistance Suisse participant pool, which is composed mainly of German-speaking psychology students and German-speaking non-students interested in participating in experiments. Students received partial course credit for their participation and non-students participated in the experiment on a voluntary basis. A total of 79 participants completed the online experiment. After excluding 14 participants who met the exclusion criteria, the final sample consisted of 55 participants (gender: 47 females and 8 males; age in years: M = 35.83, SD = 9.43). We excluded participants with any learning or neurological disorder as well as those who were not fluent in German. Participants were also excluded from the analysis based on their performance in the end-of-list distracting task, to ensure that they were actively performing the task. 
Therefore, any participant with less than 60% accuracy in the end-of-list distraction task was excluded from the analysis. The stimuli were the same as in Experiment 2, but with two notable exceptions. First, due to the addition of a distraction task at the end of the list, the duration of a trial was increased compared with Experiment 2. Therefore, in order to keep the task to a similar duration as in Experiment 2, the total number of lists presented to the participant was 102 (25% phonologically similar and ungrouped, 25% phonologically similar and grouped, 25% phonologically dissimilar and ungrouped, and 25% phonologically dissimilar and grouped). Second, because the participants were German speakers, the phonologically dissimilar letters consisted of V, Y, X, Z, J, and Q, and the phonologically similar letters consisted of B, C, D, G, P, and T. The procedure was exactly the same as in Experiment 2, except for the addition of the end-of-list distractor. After the last item was presented, a blank screen was presented for 1,000 ms, followed by eight digits presented in the centre of the screen (700 ms on and 200 ms off). Participants were instructed to press the S key as quickly as possible when the digit presented was even, and to press L when the digit presented on the screen was odd. They were informed that they could press the keys during the presentation of the numbers as well as during the blank screen after each number was presented. The numbers were randomly selected with replacement. After the end-list distractor, the recall procedure proceeded as described in Experiment 2. During the training session, participants received feedback after each trial regarding the number of letters correctly recalled and the number of correct parity judgements. No feedback was given during the experimental trials. The task was programmed with lab.js, a free and open-source online study builder (Henninger et al., 2019), and implemented on a protected server with PHP. 
Participants accessed the experiment with a custom URL.

The data from Experiments 1 and 2 support the view that temporal grouping has similar effects on musical and verbal STM. It is noteworthy that the observed pattern in both domains indicates that for lists of 6 items grouped into threes, there is no increase in interposition errors, contrary to what would be predicted from the serial order models that best account for temporal grouping effects in STM (see, for example, Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998). At the same time, the presence of a ceiling effect in recall accuracy in Experiment 2 limits this interpretation for the verbal domain. By adding an end-of-list distractor, this experiment aimed to confirm the data from Experiment 2, namely that verbal lists of 6 items grouped into threes do not lead to an increase in interposition errors, as also observed in Experiment 1 with musical material. In other words, this experiment aimed to test whether verbal and musical STM are supported by common ordering mechanisms. The experiment also aimed to verify that the observation of increased interposition errors in recall of grouped lists is characteristic of longer sequences and/or sequences with more groups (e.g., a 3 × 3 grouping structure). If this hypothesis is correct, we would expect to observe the usual temporal grouping effects, except for the increase in interposition errors. As in Experiment 2, there was no specific prediction regarding the phonological similarity effect and its interaction with other factors, except that recall accuracy should be worse for phonologically similar sequences. As a reminder, this manipulation was introduced to have a closer comparison with musical material for which there is an inherent tonal proximity effect (Williamson et al., 2010). 
As in the previous experiment, the data were analysed using JASP (version 0.14, JASP Team, 2018) with the same default values for priors and applying the same analysis plan. For each analysis (i.e., serial position curves, transposition gradients, and response latencies), data from trials presenting phonologically dissimilar and similar letters have been analysed separately. We calculated for each participant the proportion of correct recalls as a function of serial position and temporal grouping first for phonologically dissimilar trials. The data were then submitted to a 2 × 6 repeated-measures BANOVA with serial position (1–6) and grouping condition (grouped vs. ungrouped) factors (see top-left of Figure 5). The results revealed that the best model was the one with the two main effects only, preferred over the second best model (full model) by a factor of 32.76. This result represents strong evidence in favour of an effect of grouping on recall accuracy and on serial position, but no interaction between the two factors (see “Serial position curves” rows in Table 7).

Figure 5.

Top panels: serial position curves; middle panels: transposition gradients; bottom panels: response latencies. Left and right parts of the figure depict data from phonologically dissimilar and similar trials, respectively (Experiment 3). Error bars represent confidence intervals computed on data corrected for between-subject variability (Morey, 2008), following Baguley's (2012, formula 8) recommendations.
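The correction for between-subject variability mentioned in the caption can be illustrated with the commonly used Cousineau normalisation combined with Morey's (2008) correction factor. This is a sketch under assumptions: the function name `corrected_sem` and the toy data are hypothetical, and the code has not been checked against Baguley's (2012) formula 8. A confidence interval would be obtained by multiplying the returned standard errors by the appropriate t critical value.

```python
from math import sqrt
from statistics import mean, stdev


def corrected_sem(scores):
    """Within-subject standard errors, one per condition.

    scores: list of per-participant lists, one value per within-subject
    condition (all lists the same length).
    """
    n = len(scores)        # number of participants
    c = len(scores[0])     # number of within-subject conditions
    grand = mean(v for row in scores for v in row)
    # Cousineau normalisation: remove each participant's own mean,
    # then add back the grand mean.
    normalised = [[v - mean(row) + grand for v in row] for row in scores]
    # Morey (2008) factor compensating for the variance removed above.
    factor = sqrt(c / (c - 1))
    return [factor * stdev(col) / sqrt(n) for col in zip(*normalised)]


# Toy data: participants differ in overall level but show the same effect,
# so the corrected standard errors are zero.
toy = [[1, 2], [3, 4], [5, 6]]
print(corrected_sem(toy))
```

The toy example shows the point of the correction: differences in overall level between participants no longer inflate the error bars around the condition means.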

Table 7.

Results of the Bayesian repeated-measures analyses of variance for the serial position curve, transposition gradients, and response latencies for phonologically dissimilar sequences from Experiment 3.

Analysis	Models	P(M)	P(M | data)	BF_M	BF₁₀	Error %
Serial position	Null model (incl. subject)	0.20	8.76e⁻⁴⁸	3.50e⁻⁴⁷	1.00
	Grouping + Position	0.20	0.97	131.05	1.11e⁺⁴⁷	1.23
	Grouping + Position + Grouping × Position	0.20	0.03	0.12	3.38e⁺⁴⁵	1.24
	Position	0.20	6.31e⁻¹²	2.52e⁻¹¹	7.20e⁺³⁵	1.42
	Grouping	0.20	6.17e⁻⁴⁰	2.47e⁻³⁹	7.04e⁺⁷	6.14
Transposition gradients	Null model (incl. subject)	0.20	6.72e⁻¹¹¹	2.69e⁻¹¹⁰	1.00
	Distance	0.20	0.90	37.66	7.44e⁻¹¹¹	1.90
	Grouping + Distance	0.20	0.09	0.37	7.85e⁻¹¹⁰	1.27
	Grouping + Distance + Grouping × Distance	0.20	0.01	0.04	6.47e⁻¹⁰⁹	1.31
	Grouping	0.20	6.83e⁻¹¹²	2.73e⁻¹¹¹	10.23	6.26
Response latencies	Null model (incl. subject)	0.20	9.27e⁻⁸³	3.71e⁻⁸²	1.00
	Grouping + Position + Grouping × Position	0.20	0.95	77.88	1.03e⁺⁸²	1.23
	Grouping + Position	0.20	0.05	0.21	5.27e⁺⁸⁰	1.23
	Position	0.20	1.18e⁻⁵	4.73e⁻⁵	1.28e⁺⁷⁷	1.48
	Grouping	0.20	2.13e⁻⁸¹	8.51e⁻⁸¹	22.95	6.14

BF: Bayes factor.

All models include subject and, for each analysis, models are compared with the null model. Grouping: temporal grouping effect; Position: serial position effect; Distance: transposition distance effect.

The same analysis was performed with data from trials with phonologically similar letters, leading to the same pattern of data as for phonologically dissimilar letters, with the best model being the model with the two main effects, which was preferred to the full model by a factor of 68.68 (see top-right of Figure 5 and “Serial position curves” rows in Table 8).

Table 8.

Results of the Bayesian repeated-measures analyses of variance for the serial position curve, transposition gradients, and response latencies for phonologically similar sequences from Experiment 3.

Analysis	Models	P(M)	P(M | data)	BF_M	BF₁₀	Error %
Serial position	Null model (incl. subject)	0.20	1.69e⁻⁸¹	6.74e⁻⁸¹	1.00
	Grouping + Position	0.20	0.99	274.68	5.85e⁺⁸⁰	1.23
	Grouping + Position + Grouping × Position	0.20	0.01	0.06	8.51e⁺⁷⁸	1.26
	Position	0.20	2.51e⁻⁶	1.01e⁻⁵	1.49e⁺⁷⁵	1.49
	Grouping	0.20	5.55e⁻⁷⁹	2.22e⁻⁷⁸	329.14	6.17
Transposition gradients	Null model (incl. subject)	0.20	1.14e⁻¹⁴⁰	4.57e⁻¹⁴⁰	1.00
	Distance	0.20	0.91	40.36	7.96e⁺¹³⁹	1.91
	Grouping + Distance	0.20	0.09	0.38	7.54e⁺¹³⁸	1.27
	Grouping + Distance + Grouping × Distance	0.20	4.02e⁻³	0.02	3.52e⁺¹³⁷	1.31
	Grouping	0.20	1.17e⁻¹⁴¹	4.64e⁻¹⁴¹	0.10	6.26
Response latencies	Null model (incl. subject)	0.20	4.75e⁻⁶⁴	1.90e⁻⁶³	1.00
	Grouping + Position + Grouping × Position	0.20	0.50	3.93	1.04e⁺⁶³	1.24
	Position	0.20	0.30	1.72	6.33e⁺⁶²	1.47
	Grouping + Position	0.20	0.20	1.02	4.29e⁺⁶²	1.24
	Grouping	0.20	1.45e⁻⁶⁴	5.80e⁻⁶⁴	0.31	6.17

BF: Bayes factor.

All models include subject and, for each analysis, models are compared with the null model. Grouping: temporal grouping effect; Position: serial position effect; Distance: transposition distance effect.

Prior to statistical analysis of transposition errors, participants who produced no order errors in at least one of the four experimental conditions were removed. After the removal of these participants, the transposition error analysis was conducted on a final sample of 51 participants. We calculated for each participant the proportion of errors, as a function of absolute distance shift and temporal grouping, among all order errors in the phonologically dissimilar condition. We then analysed the data with a 2 × 5 repeated-measures BANOVA with absolute transposition distance (1–5) and grouping condition (grouped vs. ungrouped) as factors (see middle-left of Figure 5). The results provided strong evidence in favour of the model containing only the distance effect as the best model, which was preferred to the second best model with both main effects by a factor of 10.56 (see “Transposition gradients” rows in Table 7). The same analysis was repeated on data from trials with phonologically similar letters, leading to similar results to those obtained with phonologically dissimilar letters (see middle-right of Figure 5). The results provided strong evidence that the best model was the one with only a main effect of distance, preferred to the second best model with both main effects by a factor of 10.56 (see “Transposition gradients” rows in Table 8). As in previous experiments, we also analysed the rate of within-group versus between-group transposition errors. For the latter, we distinguished between interposition errors, transpositions at the group boundary, and other between-group transpositions (all comparisons involved undirected Bayesian paired samples t-tests with the default prior as provided in JASP). As shown in Table 9, the phonologically dissimilar lists showed overall moderate evidence for an absence of difference between the two grouping conditions with respect to the different types of transposition errors.
Regarding phonologically similar lists (see Table 10), we obtained decisive evidence of a decrease in transpositions at the group boundary and moderate evidence for an absence of increase in interposition errors in grouped sequences.
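The error taxonomy used in Tables 9 and 10 can be made concrete with a short sketch for a 2 × 3 list. The category definitions below are assumptions based on common usage in the grouping literature, not definitions taken from this section: an interposition keeps its within-group position but moves to the other group, and a boundary transposition is an adjacent move across the group boundary. The names `classify` and `error_proportions` are hypothetical.

```python
from collections import Counter

GROUP_SIZE = 3  # 2 x 3 structure: positions 1-3 and 4-6
CATEGORIES = ("within", "boundary", "interposition", "other")


def classify(presented_pos: int, recalled_pos: int) -> str:
    """Classify one response by where the item was presented and recalled."""
    if presented_pos == recalled_pos:
        return "correct"
    group_in = (presented_pos - 1) // GROUP_SIZE
    group_out = (recalled_pos - 1) // GROUP_SIZE
    if group_in == group_out:
        return "within"
    # Assumed definition: same within-group position, different group.
    if (presented_pos - 1) % GROUP_SIZE == (recalled_pos - 1) % GROUP_SIZE:
        return "interposition"
    # Assumed definition: adjacent move that crosses the group boundary.
    if abs(presented_pos - recalled_pos) == 1:
        return "boundary"
    return "other"


def error_proportions(response):
    """Proportion of each error type among all order errors in one response.

    response[k] is the presented position of the item recalled at output
    position k + 1. Returns None when the response contains no order error.
    """
    counts = Counter(classify(p, k + 1) for k, p in enumerate(response))
    n_errors = sum(counts[c] for c in CATEGORIES)
    if n_errors == 0:
        return None
    return {c: counts[c] / n_errors for c in CATEGORIES}


# Items 3 and 4 exchanged: both errors cross the group boundary.
print(error_proportions([1, 2, 4, 3, 5, 6]))
# Items 1 and 4 exchanged: both errors are interpositions.
print(error_proportions([4, 2, 3, 1, 5, 6]))
```

Returning `None` for an error-free response mirrors the fact that proportions among order errors are undefined when no order error occurred, which is why participants without order errors in a condition cannot enter this analysis.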

Table 9.

Proportions of within- and between-group transposition errors, as a function of grouping condition, for phonologically dissimilar trials from Experiment 3.

Grouping type	Within groups	Between groups: boundary	Between groups: interpositions	Between groups: others
Ungrouped	.54 (.17)	.08 (.08)	.15 (.15)	.23 (.13)
Grouped	.58 (.22)	.07 (.15)	.18 (.18)	.18 (.13)
Pairwise comparisons (ungrouped vs. grouped)	BF01 = 4.33	BF01 = 5.41	BF01 = 2.83	BF01 = 1.43

BF: Bayes factor.

Values in parentheses are standard deviations. Pairwise-comparisons were conducted with undirected Bayesian paired samples t-tests.

Table 10.

Proportions of within- and between-group transposition errors, as a function of grouping condition, for phonologically similar trials from Experiment 3.

Grouping type	Within groups	Between groups: boundary	Between groups: interpositions	Between groups: others
Ungrouped	.51 (.12)	.09 (.05)	.14 (.06)	.26 (.10)
Grouped	.57 (.18)	.05 (.04)	.16 (.10)	.22 (.09)
Pairwise comparisons (ungrouped vs. grouped)	BF10 = 1.22	BF10 = 229.75	BF01 = 3.36	BF10 = 2.52

BF: Bayes factor.

Values in parentheses are standard deviations. Pairwise-comparisons were conducted with undirected Bayesian paired samples t-tests.

For each participant, we averaged the log of response latency in milliseconds for each correct recall in dissimilar trials as a function of temporal grouping and serial position. The data were then analysed via a 2 × 6 repeated-measures BANOVA with serial position (1–6) and grouping condition (grouped vs. ungrouped) as factors (see bottom-left of Figure 5). The results provided decisive evidence in favour of the full model containing the two main effects and their interaction, this model being preferred to the second best model with the two main effects by a factor of 19.47 (see “Response latencies” rows in Table 7). Response latencies for phonologically similar lists were analysed in the same way, yielding anecdotal evidence (BF10 = 1.65) in favour of the full model (best model) when compared with the second best model containing only an effect of serial position (see “Response latencies” rows in Table 8). Given the ambiguous evidence regarding the presence of an interaction between serial position and grouping on response latency, we conducted an analysis of effects that provided moderate evidence for the presence of such an interaction (BFInclusion = 3.93, see also Figure 5).
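The inclusion Bayes factor reported above can be approximated from the model probabilities in Table 8. The sketch below assumes the inclusion Bayes factor is computed across all five models, as the change from prior to posterior odds of including the interaction; whether this matches the exact JASP setting used by the authors is an assumption, and the function name `inclusion_bf` is hypothetical.

```python
def inclusion_bf(models):
    """Inclusion Bayes factor for one effect, computed across all models.

    models: list of (prior, posterior, includes_effect) tuples, where the
    priors and the posteriors each sum to 1 over the model set.
    """
    prior_in = sum(prior for prior, _, inc in models if inc)
    post_in = sum(post for _, post, inc in models if inc)
    prior_odds = prior_in / (1 - prior_in)
    post_odds = post_in / (1 - post_in)
    return post_odds / prior_odds


# Response latencies, phonologically similar trials (rounded Table 8 values).
# Only the full model contains the Grouping × Position interaction.
latency_models = [
    (0.20, 4.75e-64, False),  # Null model (incl. subject)
    (0.20, 0.50, True),       # Grouping + Position + Grouping × Position
    (0.20, 0.30, False),      # Position
    (0.20, 0.20, False),      # Grouping + Position
    (0.20, 1.45e-64, False),  # Grouping
]

bf = inclusion_bf(latency_models)
print(f"Inclusion BF for the interaction: about {bf:.1f}")
```

With the rounded posteriors the result is about 4, in line with the reported 3.93. Because only one model contains the interaction, the inclusion Bayes factor here coincides with that model's BF_M entry in Table 8.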
By introducing an end-of-list distracting task, Experiment 3 aimed to test whether the absence of an increase in interposition errors in the recall of 6-letter grouped lists in Experiment 2 was due to a ceiling in recall or to the specific 2 × 3 grouping structure used. The experiment also sought to determine whether the lack of increase in interposition errors in the 2 × 3 grouped sequences observed in Experiment 1 was specific to the musical domain or whether it is a more general feature of STM that extends to the verbal domain as well. The end-of-list distractor had the expected effect of reducing recall accuracy relative to Experiment 2, especially for phonologically similar lists, which are of particular interest for comparison with the musical domain. We replicated the usual pattern of temporal grouping effects, but again observed an absence of increase in interposition errors. These results are in line with those reported in Experiments 1 and 2, suggesting that the absence of an increase in interposition errors in the recall of 6-letter lists grouped into two groups of three items was not due to a ceiling effect but might be related to the 2 × 3 grouping structure used in the experiment. This supports the view that musical and verbal STM are characterised by similar temporal grouping effects, which suggests the presence of similar ordering mechanisms in both domains, while also indicating the presence of boundary conditions for observing increased interposition errors in the recall of grouped sequences from STM.

General discussion

The goal of the present series of experiments was to determine whether the temporal grouping effects predicted by positional theories of serial order in verbal STM (see, for example, Brown et al., 2000; Burgess & Hitch, 1999; Henson, 1998) can be extended to the musical domain (Gorin et al., 2018a). In a first experiment, non-musicians were required to reconstruct the serial order of 6-tone sequences in a forward manner. The results showed that grouped sequences were overall better recalled than ungrouped sequences and that the former were characterised by a scalloped-shape recall curve reflecting the grouping structure used in the experiment. Response latencies adopted a classical inverted U-shape with longer latency for the first item in the list, as well as for the first item in the groups in grouped sequences. We did not observe an increase in interposition errors in the recall of temporally grouped musical sequences, but we reported a small decrease in adjacent transposition errors in grouped sequences, reflecting a decrease of transpositions involving items at group boundaries. Since interposition errors in 6-item grouped sequences are not well documented in the verbal STM literature, we conducted an online experiment requiring participants to serially recall grouped and ungrouped 6-letter sequences (Experiment 2) to compare with the observations from the musical domain. The pattern observed was similar to that observed in Experiment 1 but the conclusions were limited by the presence of a ceiling effect at recall. In a last online experiment (Experiment 3), we asked participants to perform a task similar to that in Experiment 2 while introducing an end-of-list distractor to reduce the ceiling effect.
Even in the absence of a ceiling effect, we reproduced the same pattern of data as observed in Experiments 1 and 2, supporting the view that it is a general phenomenon that grouping 6-item sequences into groups of three is characterised by benchmark grouping effects but without an increase in interposition errors. The experiments reported here provide additional evidence supporting the claim that temporal grouping effects observed in the verbal domain of STM could be extended to the musical domain as well (Gorin et al., 2018b). First, we obtained clear evidence from all experiments that presenting participants with 6-item verbal and musical sequences grouped by three leads to a recall advantage compared with the recall of the same but ungrouped sequences. This replicates the recall advantage for grouped sequences observed with verbal (Farrell & Lewandowsky, 2004; Frankish, 1985; Hartley et al., 2016; Hitch et al., 1996; Ng & Maybery, 2002, 2005; Ryan, 1969b) and non-verbal materials (Hurlstone, 2019; Hurlstone & Hitch, 2015, 2018; Parmentier et al., 2004). Second, in all three experiments, the serial position curve for grouped sequences was characterised by a scalloped appearance reflecting the 2 × 3 grouping structure used in the present study. It is noteworthy that while the recall of grouped sequences showed a scalloped serial position curve, the interaction between serial position and grouping that characterises the scalloped shape was less strong than usually observed with longer grouped sequences (see, for example, Hartley et al., 2016; Ryan, 1969a). Indeed, the scalloping in our study was mainly limited to that of the first group, and this was similarly the case for musical and (phonologically similar) verbal sequences. This pattern is nonetheless in line with previous studies on temporal grouping with non-verbal sequences of similar length and grouping structure (Hurlstone & Hitch, 2018; Parmentier et al., 2004).
Third, the use of a forward reconstruction-of-order procedure with musical material in Experiment 1 allowed us to demonstrate that the recall of musical material from STM is characterised by the same inverted U-shaped profile, but with a long latency for the first output position. In addition, grouped musical sequences showed an additional latency peak for the first item of each temporal group. Although the latencies were on different timescales, the same pattern was reproduced with 6-item verbal sequences in Experiments 2 and 3. This corroborates previous findings in the verbal (Farrell & Lewandowsky, 2004; Maybery et al., 2002; Parmentier & Maybery, 2008) and non-verbal (Hurlstone & Hitch, 2015, 2018; Parmentier et al., 2004) domains of STM regarding the profile of response latencies and the influence of temporal grouping on these latencies in STM tasks. Finally, across all verbal and musical STM experiments, we observed that temporal grouping had no or only a limited influence on the pattern of transpositions. More importantly, we did not observe any increase in interposition errors. While this is in contradiction with the usually reported effect of temporal grouping on the pattern of transposition errors in the verbal domain (Henson, 1996, 1999; Ng & Maybery, 2002, 2005; Ryan, 1969b), we reported the same absence of effect of temporal grouping on interposition errors for the serial recall of 6-letter sequences in Experiments 2 and 3. Importantly, while data from Experiment 2 limit any interpretation of transposition patterns because of the presence of a ceiling effect at recall, the comparison of data from Experiment 1 and phonologically similar verbal sequences from Experiment 3 (mimicking the pitch proximity inherent to musical sequences) clearly supports the view that, in both the verbal and musical domains, grouping 6-item sequences by groups of three does not increase the proportion of interposition errors compared with ungrouped sequences.

Implication for theories of serial order STM

The observation of key grouping effects in forward reconstruction of tone sequences, as well as the reproduction of the same pattern of data in verbal and musical tasks across the current set of experiments, favours the view that representing serial order in musical and verbal STM could be supported by similar mechanisms. In the verbal domain, temporal grouping effects are well accommodated by models assuming that serial order is represented based on positional markers coding items or groups for their position in the sequence and within the groups (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008). Consequently, the category of models assuming a hierarchical representation of serial order based on positional markers (Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998; Hurlstone, 2019) represents a good candidate to account for the effects reported in the musical and verbal STM tasks described in the present study, and suggests that serial order representation across these two domains is domain-general. At the same time, the absence of increase in interposition errors in grouped sequences is challenging for STM models assuming a hierarchical representation of serial order (Brown et al., 2000; Burgess & Hitch, 1999; Hartley et al., 2016; Henson, 1998; Lewandowsky & Farrell, 2008). The ability of these models to account for grouping effects (Frankish, 1985, 1989; Hartley et al., 2016; Henson, 1996; Hitch et al., 1996; Maybery et al., 2002; Ng & Maybery, 2002, 2005; Ryan, 1969a, 1969b) relies on the hierarchical representation of positional information. However, the consequence of using a hierarchical representation of serial order is that any model implementing that mechanism should predict an increase in interposition errors in grouped sequences, even with shorter sequences.
As it is not clear from previous research whether the absence of increased interpositions is typical of the recall of 6-item sequences grouped with a 2 × 3 structure (see Farrell, 2008; Hitch et al., 1996; Maybery et al., 2002; Parmentier & Maybery, 2008), it is possible that this specific grouping structure represents a particular case. In some positional models (e.g., Brown et al., 2000; Henson, 1998), terminal positions are represented with greater distinctiveness. Thus, the positional codes of the two groups in a 2-group structure are more distinctive than, for instance, those of the second and third groups in a 3-group structure. It is then possible that a 2 × 3 grouping structure represents a special case in which both groups occupy terminal positions (i.e., there is no medial group), which then prevents the occurrence of interposition errors due to the increased distinctiveness between the groups. Further modelling work would be required to explore this account. The analysis of interposition errors in grouped sequences is useful to better understand the mechanisms representing serial order in STM and to study the nature of these mechanisms across different domains. Hurlstone (2019) showed that the recall of visuospatial and verbal sequences grouped with a 3 × 3 structure is characterised by different patterns of transposition errors. To explain the absence of increase in interposition errors in the visuospatial domain, the author suggested that positional information may be represented differently for visuospatial information. In the present study, the absence of increase of interpositions does not seem to be linked to the STM domain, but rather appears specific to the 2 × 3 grouping structure used.
Consequently, contrary to the comparison between the visuospatial and verbal domains, for which the same grouping structure leads to different patterns of transposition errors and suggests the existence of different ordering codes (Hurlstone, 2019; see also Soemer & Saito, 2016, for a similar claim), the comparison in the present study rather supports the existence of similar mechanisms for ordering musical and verbal information while highlighting the fact that the observation of increased interposition errors is dependent on the type of grouping pattern. Although the pattern of temporal grouping effects is similar across the musical and verbal domains in this study, it does not disprove evidence for domain-specificity of serial order in STM (Hurlstone, 2019; Logie et al., 2016; Saito et al., 2008; Soemer & Saito, 2016). Indeed, the results are also compatible with the view that domain-specific but functionally similar mechanisms for the retention of serial order exist across different domains (Logie et al., 2016). Further studies are then required to distinguish more precisely between the domain-general and domain-specific theories of serial order in the verbal and musical domains of STM. Investigating the effect of cross-modal interference of order between musical and verbal domains in a dual-task setting may be of great interest to tackle that question (Depoorter & Vandierendonck, 2009; Vandierendonck, 2016).

Methodological advances in studying musical STM for serial order

This series of experiments extends previous work on the development of a tool to study serial order phenomena in musical STM (Gorin et al., 2018a, 2018b). To address the question of the domain-generality of serial order mechanisms in STM, it is critical to use memory tasks having the same ordering requirements across domains. Gorin et al. (2018a) showed that using the same task as in Experiments 1A and 1B, recall of tone sequences in non-musicians was characterised by error patterns and sequence length effects similar to those reported in verbal STM tasks. They also reported that the presence of serial position effects was characterised by smaller primacy and recency effects compared with what is usually reported with verbal tasks. In Experiments 1A and 1B, we reproduced the observation of serial position effects characterised by primacy and recency in ungrouped tone sequences, as well as typical transposition gradients, as observed with verbal material (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008). In Experiment 2, participants had to reconstruct the sequence in forward serial order, making the task closer to the typical procedure used in verbal serial recall tasks. We also used a larger number of tones, instead of always using the same six tones, to reduce intertrial interference. This new procedure led to a clear improvement in recall accuracy compared with Experiments 1A and 1B as well as to more pronounced serial position effects. We also replicated the typical transposition gradients and were able to analyse response latencies, the latter being characterised by a shape similar to what is usually reported in the verbal domain (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008).
The presence of the same pattern of movement errors and forward recall serial position effects as reported with verbal material (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008) in our musical STM task in Experiment 2 supports the reliability of this task to study serial order phenomena in the musical domain, and opens the way to systematic comparison between order phenomena observed in the musical and other domains. At the same time, it is important to note that the procedure developed in Experiment 2 still has some critical differences compared with verbal reconstruction tasks. For the latter, participants are asked to reconstruct the serial order of sequences of items (e.g., words, letters, and digits) that are re-presented at recall. The same principle was implemented in our musical task. However, even though the tones were organised from the lowest (left) to the highest (right) at recall to simplify the procedure, the participants had to search for the correct tones by clicking on the different items. Compared with verbal reconstruction tasks, for which there is direct access to the items at recall, this procedure inevitably created more interference. This could partially explain the poorer performance in the musical domain, in addition to the fact that participants were non-musicians and required more time to complete a trial. In turn, this would explain, at least in part, the presence of overall longer response latencies compared with the verbal domain.

Future directions

Future research should focus on adapting verbal serial reconstruction tasks to match the procedure used in Experiment 1. We could imagine an experiment in which participants would be asked to perform a serial order reconstruction task, as described in Experiment 1, with either tones or auditory consonants. This represents a more direct comparison between the two domains as the two tasks would be exactly the same, except for the stimuli, which would allow a firmer conclusion regarding the generality of serial order phenomena in the verbal and musical domains of STM. It is important to note that the order phenomena characterising verbal STM tasks (Hurlstone et al., 2014; Lewandowsky & Farrell, 2008) are usually reported when testing a population of adults that are highly familiar with the memoranda (e.g., letters, digits, and words). In other words, one could consider that verbal order phenomena in STM reflect the behaviour of verbal experts in maintaining the order of verbal information (for language-based accounts of serial order processing in verbal STM, see Acheson & MacDonald, 2009; Majerus, 2013; Schwering & MacDonald, 2020). Consequently, comparing the effect of grouping in musical STM with non-musicians to the same effect in the verbal domain with verbal experts may represent a sub-optimal comparison. A more optimal strategy to assess the domain-general hypothesis of serial order would be to explore the effect of grouping on the reconstruction of 9-tone sequences (e.g., 3-item group structure) in musicians. Using a melodic dictation recall method, Deutsch (1980) showed a positive effect of temporal grouping on recall accuracy of 12-tone sequences in musicians, as well as scalloped serial position curves. In addition, it has been shown that the recall from long-term memory of melodies played on a piano is characterised by interposition-like errors in sequences with a strong metrical structure (Mathias et al., 2015).
Assuming the feasibility of asking musicians to reconstruct 9-tone sequences using the procedure described in Experiment 1, and considering the data from Deutsch (1980) and Mathias et al. (2015), further studies exploring grouping effects in musical STM in musicians are required to provide a more stringent test of the domain-generality hypothesis of positional markers in STM. The observation of an absence of increase in interposition errors in recalling 2 × 3 grouped sequences, consistently with both musical and verbal material (even in the absence of a ceiling effect), supports the potential existence of boundary conditions for observing an effect of temporal grouping on transposition errors in STM. While addressing that matter is outside the scope of the current paper, that observation places new constraints on models of serial order. In addition, studying more systematically the factors (e.g., sequence length, group sizes, number of groups), and their interactions, driving the increase in interposition errors in serial recall may help shed new light on our understanding of serial order representation in STM. Recent work by Kowialiewski et al. (2021) has shown that sequences of six words grouped into pairs are characterised by an increase in interposition errors compared with the same ungrouped sequences. This indicates that the observation of an increase in interposition errors in grouped sequences seems to depend more on the number of groups in the sequence than on the length of the sequence.

Conclusion

We observed benchmark temporal grouping effects in serial order reconstruction tasks with tone sequences, except for the typical effect of grouping on interposition errors. This pattern was replicated with the serial recall of verbal sequences comparable to the musical material used in the first experiments. The results overall support the view that positional markers described in verbal models of STM to represent serial order (e.g., Brown et al., 2000; Burgess & Hitch, 1999; Henson, 1998) could be extended to the musical domain as well. Further research is nonetheless required to determine whether direct support for positional markers can be witnessed with longer musical sequences in musicians.

43 in total

1. Structure and function of auditory cortex: music and speech.

Authors: Robert J. Zatorre; Pascal Belin; Virginia B. Penhune
Journal: Trends Cogn Sci Date: 2002-01-01 Impact factor: 20.229

2. An endogenous distributed model of ordering in serial recall.

Authors: Simon Farrell; Stephan Lewandowsky
Journal: Psychon Bull Rev Date: 2002-03

3. Grouping in short-term verbal memory: is position coded temporally?

Authors: Honey L H Ng; Murray T Maybery
Journal: Q J Exp Psychol A Date: 2002-04

4. Perceptual organization and precategorical acoustic storage.

Authors: Clive Frankish
Journal: J Exp Psychol Learn Mem Cogn Date: 1989-05 Impact factor: 3.051

5. How is the serial order of a spatial sequence represented? Insights from transposition latencies.

Authors: Mark J Hurlstone; Graham J Hitch
Journal: J Exp Psychol Learn Mem Cogn Date: 2014-12-01 Impact factor: 3.051

Review 6. Verbal working memory and language production: Common approaches to the serial ordering of verbal information.

Authors: Daniel J Acheson; Maryellen C MacDonald
Journal: Psychol Bull Date: 2009-01 Impact factor: 17.737

7. lab.js: A free, open, online study builder.

Authors: Felix Henninger; Yury Shevchenko; Ulf K Mertens; Pascal J Kieslich; Benjamin E Hilbig
Journal: Behav Res Methods Date: 2022-04

8. "Just Another Tool for Online Studies" (JATOS): An Easy Solution for Setup and Management of Web Servers Supporting Online Studies.

Authors: Kristian Lange; Simone Kühn; Elisa Filevich
Journal: PLoS One Date: 2015-06-26 Impact factor: 3.240

9. Turning the hands of time again: a purely confirmatory replication study and a Bayesian analysis.

Authors: Eric-Jan Wagenmakers; Titia F Beek; Mark Rotteveel; Alex Gierholz; Dora Matzke; Helen Steingroever; Alexander Ly; Josine Verhagen; Ravi Selker; Adam Sasiadek; Quentin F Gronau; Jonathon Love; Yair Pinto
Journal: Front Psychol Date: 2015-04-24

10. Language repetition and short-term memory: an integrative framework.

Authors: Steve Majerus
Journal: Front Hum Neurosci Date: 2013-07-12 Impact factor: 3.169