Gabriella Eördegh1, Kálmán Tót2, Ádám Kiss2, Szabolcs Kéri2, Gábor Braunitzer3, Attila Nagy2. 1. Faculty of Health Sciences and Social Studies, University of Szeged, Szeged, Hungary. 2. Department of Physiology, Faculty of Medicine, University of Szeged, Szeged, Hungary. 3. Nyírő Gyula Hospital, Laboratory for Perception & Cognition and Clinical Neuroscience, Budapest, Hungary.
Abstract
It has been demonstrated earlier in healthy adult volunteers that visually and multisensory (audiovisual) guided equivalence learning are similarly effective. Thus, these processes seem to be independent of stimulus modality. The question arises as to whether this phenomenon can also be observed in healthy children and adolescents. To assess this, visual and audiovisual equivalence learning was tested in 157 healthy participants younger than 18 years of age, in both a visual and an audiovisual paradigm consisting of acquisition, retrieval and generalization phases. Performance during the acquisition phase (building of associations) was significantly better in the multisensory paradigm, but there was no difference between the reaction times (RTs). Performance during the retrieval phase (where the previously learned associations are tested) was also significantly better in the multisensory paradigm, and RTs were significantly shorter. On the other hand, transfer (generalization) performance (where hitherto not learned but predictable associations are tested) was not significantly enhanced in the multisensory paradigm, while RTs were somewhat shorter. Linear regression analysis revealed that all the studied psychophysical parameters in both paradigms showed significant correlation with the age of the participants. Audiovisual stimulation enhanced acquisition and retrieval as compared to visual stimulation only, regardless of whether the subjects were above or below 12 years of age. Our results demonstrate that multisensory stimuli significantly enhance association learning and retrieval in the context of sensory guided equivalence learning in healthy children and adolescents. However, the audiovisual gain was significantly higher in the cohort below 12 years of age, which suggests that audiovisually guided equivalence learning is still in development in childhood.
Equivalence learning is a specific kind of associative learning in which two discrete and often different percepts are linked together. Catherine E. Myers and coworkers developed a learning paradigm (the Rutgers Acquired Equivalence Test, also known as the fish-face paradigm) that can be applied to investigate visually guided equivalence learning [1]. A significant advantage of this test is that the brain regions associated with successful performance in each phase of the test are well established [1, 2]. The test can be divided into two main phases. The first one is the acquisition phase, which depends on the fronto-striatal [3] (cortex-basal ganglia) loops. Here the participants’ task is to associate two different visual stimuli based on feedback information about the correctness of the choices. After the acquisition phase, once the participants have learned the associations, the test phase ensues. The test phase assesses memory retrieval regarding the learned associations (retrieval) and also tests if the subject is able to generalize from the known associations, that is, to recognize hitherto not seen but predictable stimulus pairs (generalization or transfer). During the test phase, which primarily depends on the hippocampi and the mediotemporal lobes [3], no feedback is given about the correctness of the choices. Earlier studies have pointed out that both the basal ganglia and the hippocampi are fundamentally involved in visual associative learning [1-3], and they receive not only visual but also multisensory information [4-7]. Multisensory integration can be observed from the cellular to the behavioral level [5, 8–11]. To explore whether multisensory (audiovisual) information could facilitate the effectiveness of sensory guided equivalence learning, we developed and validated a new multisensory (audiovisual) equivalence learning test with the same structure as the original (visual) Rutgers Acquired Equivalence Test [12, 13].
In a previous study involving 151 healthy adult volunteers, we demonstrated that visual and multisensory guided associative learning are similarly effective. Thus, these processes are independent of stimulus modality in healthy adults, but it is not known if the same applies to children and adolescents. Concerning the development of multisensory integration in childhood, the available data are controversial, and they strongly depend on stimulus modality and the studied cognitive function. The literature distinguishes between two main types of multisensory integration: the integration of different modalities and the integration of redundant stimulus features (e.g., spatial or temporal integration). The integration of different modalities is not detectable until 8 to 10 years of age in the auditory and tactile modalities [14, 15], and audiovisual integration is suboptimal (but detectable) until 11 to 12 years of age [16-21]. Therefore, in this study we also sought to investigate if there was a difference in participants’ performance depending on whether they were above or below 12 years of age.
Materials and methods
Subjects
Altogether 167 healthy children and adolescents were involved in the study. The participants were recruited on a voluntary basis, received no compensation for their participation, and were free to quit at any time without any consequence. The volunteers and their parents were informed about the aims and procedures of the study, and their medical history was taken with emphasis on neurological, otological, psychiatric or chronic somatic disorders. Volunteers with such disorders in their history were not eligible for the study. Any regularly taken medication was recorded. The volunteers were also tested with the Ishihara plates to exclude color blindness. As all volunteers were under 18 years of age, the informed consent form was signed by their parents on their behalf, as required by law. All volunteers were White, and they were all native speakers of the Hungarian language. The study protocol followed the tenets of the Declaration of Helsinki in all respects, and it was approved by the Ministry of Human Resources (11818-6/2017/EÜIG).
Visual and multisensory associative learning paradigms
The tests were administered on laptops (Lenovo T430, Lenovo Yoga Y500, Samsung Electronics 300e4z/300e5z/300e7z, Fujitsu Siemens Amilo Pro V3505). The subjects were tested in a quiet room, sitting at a standard distance of 57 cm from the laptop screen (the stimuli were equal in size, with a maximum diameter of 5 cm, which corresponds to a 5° angle of view). For the audiovisual test, Sennheiser HD439 over-ear headphones were used to present the auditory stimuli (SPL = 60 dB). The keys X and M were labeled as “left” and “right” on the laptop’s keyboard. The subjects used these keys to indicate their choices in both test paradigms, and used both hands for the responses. The subjects were tested separately, one subject at a time. No time limit was set, and no forced quick responses were expected. Both paradigms consisted of two phases: the acquisition phase and the test phase. The test phase could be further divided into two parts: a retrieval part and a generalization (or transfer) part. During the acquisition phase, the subjects had to learn associations between antecedent and consequent stimuli. This happened through trial-and-error learning. In each trial, one of two consequent stimuli had to be chosen in response to an antecedent stimulus. The subjects indicated their choice by pressing either the “left” or the “right” key on the keyboard, corresponding to the side of the consequent stimulus. The computer provided feedback about the correctness of the response: a green checkmark if the response was correct or a red X if it was incorrect, along with the Hungarian words “helyes” (correct) and “helytelen” (incorrect) (Fig 1).
Fig 1
One trial from the visual (top) and the audiovisual (bottom) paradigms. Computer feedback to correct and incorrect responses (top and bottom, respectively) is also illustrated.
New associations were presented one by one, and the participants had to provide a certain number of correct responses (4, 6, 8, 10, 12) after each new association before being allowed to proceed to the test phase. Thus, the number of trials was not constant in the acquisition phase; it depended on the subjects’ individual performance. In the test phase, the subjects first had to retrieve the already learned associations (the retrieval part of the test phase), then recognize new, hitherto not learned but predictable associations (the generalization or transfer part of the test phase). These new associations were generated according to the previously formed associations that had been applied in the acquisition phase. In the test phase, no feedback was provided about the correctness of the answers. The number of trials was constant in the test phase: a total of 48 trials were presented, of which 36 were already learned (retrieval) and 12 were new associations (generalization or transfer). The applied visual associative test was based on the Rutgers Acquired Equivalence Test [1]. It was rewritten in Assembly for Windows, translated into Hungarian, and slightly modified (with more trials in the test phase, to obtain more accurate information about hippocampal functions) [22], with the written permission of Professor Catherine E. Myers (Rutgers University), head of the research group where the test paradigm was originally developed. The antecedent visual stimuli were four cartoon faces (an adult man, an adult woman, a boy, and a girl; A1, A2, B1, B2), and the consequents were four cartoon schematic fish of different colors but of the same shape (X1, X2, Y1, Y2). Altogether eight pairs could be formed from the antecedent and consequent stimuli.
In each trial (see Fig 1), the subjects saw a face in the middle of the screen and two fish below it, one on the left and one on the right side. During the acquisition phase, the subjects learned a series of antecedent-consequent pairs in a trial-and-error manner. When face A1 or face A2 was shown, the correct choice was fish X1 over fish Y1; however, when face B1 or face B2 appeared on the screen, the correct answer was fish Y1 instead of fish X1. This way, besides the face-fish associations, the participants also learned that face A1 was equivalent to face A2 in terms of their relation to the consequents (fish). New associations were introduced gradually, and they were presented mixed with trials of previously learned associations until the participants had encountered six of the possible eight antecedent-consequent pairs. In the test phase, the participants had to recall these six pairs (retrieval), and the remaining two hitherto not presented combinations were shown as well (generalization or transfer). If the participants successfully learned that A1 and A2 (or B1 and B2) were equivalent regarding their consequents, they could derive the rule and generalize it to previously not learned associations. That is, by generalization, they inferred that consequent X2 (previously associated with antecedent A1) was also associated with antecedent A2, and that consequent Y2 (previously associated with antecedent B1) was also associated with antecedent B2. These new associations were mixed with the old ones, and the subjects were not informed about them. The structure of the audiovisual paradigm was the same as that of the visual paradigm; the only difference was that the subjects had to make associations between auditory (antecedent stimuli, A1, A2, B1, B2) and visual stimuli (consequents, X1, X2, Y1, Y2) [12].
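The equivalence logic described above can be sketched in a few lines of code (a minimal illustration of the pair structure, not the actual test software; all names are our own):

```python
# Sketch of the equivalence structure used in both paradigms.
# Antecedents A1/A2 and B1/B2 are equivalent with respect to their consequents.

# The six pairs explicitly trained during the acquisition phase.
trained = {
    ("A1", "X1"), ("A2", "X1"), ("A1", "X2"),
    ("B1", "Y1"), ("B2", "Y1"), ("B1", "Y2"),
}

# Equivalence classes: antecedents that share the same consequents.
equivalent = {"A1": "A2", "A2": "A1", "B1": "B2", "B2": "B1"}

def generalization_pairs(trained, equivalent):
    """Derive the hitherto unseen but predictable pairs:
    if (a, x) was trained and a' is equivalent to a,
    then (a', x) is predictable."""
    derived = set()
    for antecedent, consequent in trained:
        candidate = (equivalent[antecedent], consequent)
        if candidate not in trained:
            derived.add(candidate)
    return derived

print(sorted(generalization_pairs(trained, equivalent)))
# -> [('A2', 'X2'), ('B2', 'Y2')]
```

The two derived pairs, A2-X2 and B2-Y2, are exactly the combinations withheld during acquisition and probed in the generalization part of the test phase.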
The antecedent stimuli were four clearly distinguishable sounds (a cat’s meow, a starting motor, a guitar note, and a woman saying a Hungarian word; A1, A2, B1, B2), and the consequents were the same four drawn faces as in the visual paradigm (adult man, adult woman, boy, and girl; X1, X2, Y1, Y2). In each trial, the subjects simultaneously heard a sound (SPL = 60 dB) and saw two faces on the left and right sides of the screen. The participants had to learn which face was associated with which sound. Table 1 summarizes the basic structure of the learning tests.
Table 1
Summary of the visual and audiovisual associative learning paradigms.
ACQUISITION
  Shaping:              A1 -> X1; B1 -> Y1
  Equivalence training: A1 -> X1; A2 -> X1; B1 -> Y1; B2 -> Y1
  New consequents:      A1 -> X1; A2 -> X1; A1 -> X2; B1 -> Y1; B2 -> Y1; B1 -> Y2
TEST
  Retrieval:            A1 -> X1; A2 -> X1; A1 -> X2; B1 -> Y1; B2 -> Y1; B1 -> Y2
  Generalization:       A2 -> X2; B2 -> Y2
A, B: antecedents (faces in the visual and sounds in the audiovisual paradigm); X, Y: consequents (fish in the visual and faces in the audiovisual paradigm). For a detailed description, see text.
The subjects completed both equivalence learning tests one after another. To avoid carry-over effects, the tests were administered in a random order across the subjects.
Data analysis
The performance of the participants was characterized with four main parameters: the number of trials necessary for the completion of the acquisition phase (NAT), the association learning error ratio (the ratio of incorrect choices during the acquisition trials, ALER), the retrieval error ratio (RER), and the generalization error ratio (GER). Error ratios were calculated by dividing the number of incorrect responses by the total number of guesses. Reaction times (RTs), defined as the time elapsed between the appearance of the stimuli and the subject’s response, were recorded for each trial in the acquisition, retrieval and generalization parts. RT values deviating more than 3 SD from each participant’s individual average RT were excluded from further analysis. Statistical analysis was performed in Statistica 13.4.0.14 (TIBCO Software Inc., USA). NAT, ALER, RER and GER were compared between the visual and the audiovisual paradigms. As the data were non-normally distributed (Shapiro-Wilk p < 0.05), the Wilcoxon matched-pairs test was used for the hypothesis tests. We also analyzed the multisensory gain and its correlation with the subjects’ age. Gain was defined as the difference in the performance values between the visual (V) and multisensory (M) paradigms; for example, GAIN NAT = MNAT - VNAT. For the correlation analysis, Spearman’s ρ was calculated. Multisensory gain was also compared between the age cohorts (below and above 12 years of age) using the Mann-Whitney U test.
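The derived measures above are simple to compute; the following stdlib-only sketch illustrates them (the function and variable names are ours, not those of the original analysis scripts):

```python
import statistics

def error_ratio(incorrect, total):
    """ALER/RER/GER: incorrect responses divided by all guesses."""
    return incorrect / total

def exclude_rt_outliers(rts):
    """Drop RTs deviating more than 3 SD from a participant's own mean RT."""
    mean = statistics.mean(rts)
    sd = statistics.stdev(rts)
    return [rt for rt in rts if abs(rt - mean) <= 3 * sd]

def multisensory_gain(multisensory_value, visual_value):
    """GAIN = multisensory minus visual (e.g. GAIN NAT = MNAT - VNAT);
    for NAT and the error ratios, negative gain means better
    multisensory performance."""
    return multisensory_value - visual_value

# Hypothetical participant: 5 errors in 60 acquisition trials,
# NAT of 53 (audiovisual) vs. 63 (visual).
print(round(error_ratio(5, 60), 3))   # 0.083
print(multisensory_gain(53, 63))      # -10
```

A negative GAIN NAT, as in this hypothetical example, indicates that fewer trials were needed to complete the audiovisual acquisition phase than the visual one.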
Results
Altogether 167 healthy children and adolescents participated in the study. In three cases, the procedure was stopped for technical reasons. Four participants did not complete either of the two paradigms, and three completed only the visual paradigm. Thus, six percent (10/167) of the participants did not complete the procedure, and their data were not used in the analyses. The data of the remaining 157 volunteers were analyzed (nmale = 65, age: 11.6±3.6 years, range: 5–17.5 years).
Comparison between the performances in the visual and multisensory learning paradigms
The median NAT in the visual paradigm was 63 (range: 41–204, n = 157), while in the audiovisual paradigm it was 53 (range: 41–134, n = 157). The median NAT in the audiovisual paradigm was significantly lower (Z = 5.098, p < 0.001; Fig 2).
Fig 2
Performance in the acquisition phase in the visual and audiovisual paradigms.
NAT: the number of trials needed to complete the acquisition phase; ALER: error ratio in the acquisition phase. Gray: visual; white: audiovisual. The lower margin marks the first quartile and the upper margin the third quartile. The line in the box marks the median. The whiskers below the boxes indicate the 10th percentile and the whiskers above, the 90th percentile. The black dots represent the outliers. **: p<0.01.
The median ALER in the visual paradigm was 0.082 (range: 0–0.34, n = 157), and it was 0.051 in the multisensory paradigm (range: 0–0.36, n = 157). Similarly to the NATs, the ALERs differed significantly between the two paradigms (Z = 4.652, p < 0.001; Fig 2). In contrast to the psychophysical parameters, the RTs did not differ significantly between the two paradigms (Z = 0.050, p = 0.960) in the acquisition phase (AcqRTs). The median RT in the visual paradigm was 1655.811 ms (range: 885.508–4782.44 ms, n = 157), and it was 1695.7 ms in the audiovisual paradigm (range: 1047.479–4573.56 ms, n = 157; see Fig 3).
Fig 3
Reaction times in the visual and multisensory paradigms (in milliseconds).
The reaction times did not differ significantly between the visual and audiovisual paradigms in the acquisition phase. The conventions are the same as in Fig 2.
In the retrieval part of the test phase, the median RER in the visual paradigm was 0.056 (range: 0–0.86, n = 157), and it was 0.028 (range: 0–0.42, n = 157) in the audiovisual one. The difference was significant (Z = 4.812, p < 0.001; Fig 4). Furthermore, retrieval RTs (RER RTs) were significantly shorter in the audiovisual paradigm (Z = 4.452, p < 0.001; Fig 3). The median RT in the visual paradigm was 1869.750 ms (range: 984.625–6103.87 ms, n = 157), and in the audiovisual paradigm it was 1731.171 ms (range: 956.778–4506.25 ms, n = 157; Fig 3).
Fig 4
Performance in the test phase of the visual and audiovisual paradigms.
The test phase can be divided into two parts, retrieval and generalization (see text for details). Performance in these parts is characterized by the retrieval error ratio (RER) and the generalization error ratio (GER). RER differed significantly between the paradigms at p<0.01, while GER did not differ significantly between the paradigms. The conventions are the same as in Fig 2.
The median GER in the visual paradigm was 0.083 (range: 0–1.0, n = 157), and it was also 0.083 in the audiovisual paradigm (range: 0–1.0, n = 157). In contrast to NAT, ALER, and RER, GER did not differ significantly between the two paradigms (Z = 1.006, p = 0.315; Fig 4). Generalization RTs (GER RTs), however, were significantly shorter in the audiovisual paradigm (Z = 3.848, p < 0.001). The median generalization RT in the visual test was 2477.917 ms (range: 1001.167–14796.50 ms, n = 153). In the audiovisual paradigm, it was 2064.167 ms (range: 1004.400–7054.000 ms, n = 152; Fig 3).
The effect of age on performance
Linear regression analysis was performed to analyze the age-dependence of the studied parameters. All the investigated parameters, both in the acquisition and the test phases, showed a significant negative correlation with the age of the participants. That is, performance improved with age in general (see Table 2).
Table 2
Linear regression results of correlation between age and performance (NAT, ALER, RER, GER, RTs) in both paradigms (V: visual, M: multisensory).
Parameter vs age    b*           p
VNAT                -0.421125    0.000000
VALER               -0.473864    0.000000
VAcqRT              -0.647056    0.000000
VRER                -0.239547    0.002514
VRER RT             -0.587382    0.000000
VGER                -0.210698    0.008079
VGER RT             -0.304428    0.000130
MNAT                -0.232480    0.003390
MALER               -0.244188    0.002056
MAcqRT              -0.511363    0.000000
MRER                -0.424617    0.000000
MRER RT             -0.492971    0.000000
MGER                -0.323081    0.000037
MGER RT             -0.401983    0.000000
Performance above and below 12 years of age
Eighty-five of the subjects (54.1%) were younger than 12 years of age, and 72 (45.9%) were 12 years of age or older. Descriptive statistics of their performance are shown in Table 3. As for the acquisition phase (as assessed with NAT and ALER), both cohorts’ performance was superior in the audiovisual paradigm. This was true for the retrieval part of the test phase as well (RER). However, no such difference was observed in either cohort in the generalization part (GER). The results of the hypothesis tests are given in Table 4.
Table 3
Descriptive statistics of performance (NAT, ALER, RER, GER) in both paradigms (V: visual, M: multisensory) below and above 12 years of age.
Parameter    <12 years (N = 85)               >12 years (N = 72)
             Median    Minimum   Maximum      Median    Minimum   Maximum
VNAT         73.000    42.00     204.00       56.500    41.00     136.00
VALER        0.104     0.00      0.34         0.060     0.00      0.19
VRER         0.083     0.00      0.53         0.028     0.00      0.86
VGER         0.250     0.00      1.00         0.083     0.00      1.00
MNAT         55.000    41.00     134.00       50.500    41.00     123.00
MALER        0.056     0.00      0.36         0.042     0.00      0.29
MRER         0.056     0.00      0.42         0.000     0.00      0.17
MGER         0.167     0.00      1.00         0.000     0.00      1.00
Table 4
Between-paradigm comparisons below and above 12 years of age.
Results of the hypothesis tests. The conventions are the same as in Table 3.
Comparison         <12 years (N = 85)       >12 years (N = 72)
                   Z          p             Z          p
VNAT vs. MNAT      4.816559   0.000001      2.111445   0.034735
VALER vs. MALER    4.526674   0.000006      1.787709   0.073824
VRER vs. MRER      3.248503   0.001160      4.098464   0.000042
VGER vs. MGER      0.193603   0.846487      1.661560   0.096602
A comparison of the performance of the two cohorts (below and above 12 years of age) on the studied parameters shows that the older cohort outperformed the younger one in both paradigms and in all parameters. For these comparisons, the Mann-Whitney U test was used. The results are shown in Table 5.
Table 5
Parameter-by-parameter comparison of performance between the two cohorts (below and above 12 years of age).
Results of the hypothesis tests (Mann-Whitney U). The conventions are the same as in Table 3.
Parameter    Z          p          N below 12   N above 12
VNAT         4.370017   0.000012   85           72
VALER        4.850878   0.000001   85           72
VRER         3.748245   0.000178   85           72
VGER         2.680841   0.007344   85           72
MNAT         1.794860   0.072677   85           72
MALER        1.895259   0.058059   85           72
MRER         4.845593   0.000001   85           72
MGER         4.100524   0.000041   85           72
The correlation of multisensory gain with the age of the children
Correlation analysis between multisensory gain and age revealed significant correlations in the acquisition phase. For most parameters, the gain values were below zero, which means that the multisensory error ratios were frequently lower than the visual ones. Descriptive statistics and correlation coefficients are given in Table 6.
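Spearman’s ρ, the rank correlation used for the gain-age analysis, reduces to the Pearson correlation of the rank-transformed values. A minimal stdlib-only sketch (our own implementation for illustration; the study used Statistica, and the data below are hypothetical):

```python
def _ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical GAIN NAT values shrinking in magnitude with age:
ages = [6, 8, 10, 12, 14, 16]
gains = [-20, -16, -9, -8, -4, -1]
print(round(spearman_rho(ages, gains), 2))  # 1.0 for a perfectly monotone relation
```

A positive ρ for GAIN NAT against age, as in this hypothetical monotone example, corresponds to the observed pattern: the (negative) gain moves toward zero as age increases, i.e., the multisensory advantage shrinks.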
Table 6
Descriptive statistics of multisensory gain and correlation coefficients (Spearman’s ρ).
Parameter    Median    Minimum   Maximum    ρ
GAIN NAT     -8.000    -140.0    64.000     0.246756
GAIN ALER    -0.022    -0.2      0.248      0.255976
GAIN RER     -0.028    -0.8      0.333      0.029085
GAIN GER     0.000     -1.0      1.000      -0.054252
Significant correlations (p < 0.05) are marked in light gray. The conventions are the same as in Figs 3 and 4.
Descriptive statistics of the multisensory gains of the two age groups are shown in Table 7. A comparison of the multisensory gain of the two cohorts (below and above 12 years of age) on the studied parameters shows that the older cohort had smaller gains than the younger one, and the differences were significant in the acquisition phase (Table 8).
Table 7
Descriptive statistics of multisensory gain in all parameters below and above 12 years of age.
The conventions are the same as in Figs 3 and 4.
Parameter    <12 years (N = 85)                >12 years (N = 72)
             Median    Minimum   Maximum       Median    Minimum   Maximum
GAIN NAT     -16.000   -140.0    63.000        -4.000    -85.00    64.000
GAIN ALER    -0.045    -0.2      0.231         -0.017    -0.15     0.248
GAIN RER     -0.028    -0.5      0.333         -0.028    -0.83     0.139
GAIN GER     0.000     -0.8      1.000         0.000     -1.00     1.000
Table 8
Comparison of multisensory gains between the two cohorts (below and above 12 years of age).
Results of the hypothesis tests (Mann-Whitney U). The conventions are the same as in Figs 3 and 4.
GAIN    Z             p
NAT     -2.73368261   0.006253
ALER    -2.63680597   0.008369
RER     -0.70808016   0.474929
GER     -1.52008253   0.125694
Discussion
In this study, we investigated the effectiveness of visual and audiovisual equivalence learning in a large sample of healthy children and adolescents. To our knowledge, we are the first to demonstrate that, in contrast to healthy adults, audiovisual information facilitates equivalence learning in healthy children and adolescents. Two sensory guided associative learning tests with the same structure were used, one visual [22] and one audiovisual [12]. Both tests were developed in our laboratory, based on the Rutgers Acquired Equivalence Test [1]. The Rutgers Acquired Equivalence Test was originally developed to dissociate the contributions of the basal ganglia and the hippocampi to visual equivalence learning and transfer. Myers and co-workers [1] found that patients with Parkinson’s disease exhibited poor performance when forming the visual associations, while patients with hippocampal atrophy were characterized by poor transfer. In this way, the authors demonstrated that the basal ganglia and the hippocampi are key structures in associative equivalence acquisition and in the transfer of the equivalence rule to new stimuli, respectively, and that the test is capable of picking up suboptimal function of these structures. Since then, it has become widely recognized in the literature that the basal ganglia have a key role in the association of stimuli [23, 24], while transfer is linked mainly to the hippocampi/medial temporal lobe [3, 25].
The Rutgers paradigm has been applied to learn about associative learning/equivalence learning in various psychiatric and neurological disorders characterized by the dysfunction of the basal ganglia and the hippocampi [22, 26–29] and also in healthy subjects [30, 31]. Since the key brain structures involved in sensory guided associative/equivalence learning (the basal ganglia and the hippocampi) process not only visual but also auditory and combined audiovisual information [4-7], we developed a new multisensory (audiovisual) version of the Rutgers Acquired Equivalence Test to enable the exploration of multisensory guided associative/equivalence learning. We first used this new test to explore this kind of learning in healthy adults [12]. We also compared the results with those obtained with the original visual-only paradigm. The results revealed that performance throughout the test was fairly independent of stimulus modality [12]. The same was true for reaction times. We concluded that the effectiveness of sensory guided associative/equivalence learning does not depend on the modality of the applied stimuli in healthy adults. The findings presented in this study show a different picture. In terms of performance (assessed as error ratios in the various parts of the test), children and adolescents seem to benefit significantly from multimodality in acquisition and retrieval, but not in generalization. Reaction times, however, were significantly shorter in the audiovisual paradigm, even in the generalization part of the test phase. In other words, in the audiovisual paradigm, the subjects performed at approximately the same level as in the visual paradigm, but with significantly shorter reaction times. This all suggests that healthy children and adolescents learn and retrieve associations more efficiently if the stimuli are of different modalities.
Generalization does not seem to be facilitated by multimodality in terms of performance, but the significantly shorter reaction times suggest that a certain level of facilitation is present also in this part of the paradigm. Multisensory integration plays an important role not only in sensory-motor but also in cognitive functions. Bimodal (or multimodal) facilitation can enhance sensory perception [32], object recognition [33, 34], emotional change recognition [35], face and voice recognition [36], and person recognition [37]. Semantic congruence can strengthen multisensory integration [38], but in the case of our stimuli, such a congruency is negligible if it exists at all. Thus, it is safe to assume that in this study multisensory integration facilitated performance without semantic interference. Multisensory integration has been described at various levels of observation. It has been described in detail at the single-cell level [39-42] in both the neocortex [8] and subcortical structures [5, 9, 43]. It is also well documented in various cognitive functions at the behavioral level [10, 11, 44]. Multisensory integration has been shown to influence various cognitive-behavioral parameters such as reaction time, accuracy of answers, or perception thresholds [45-48]. Our results suggest that multisensory integration enhances the learning and retrieval of associations in healthy children and adolescents, and in this sense our results are in agreement with the literature. The reason for the superiority of audiovisual information as input for equivalence learning in children and adolescents but not in adults [12] may be that visually guided equivalence learning is still in development in childhood and adolescence [30], that is, it has not yet reached its optimum. It can be hypothesized that the additional modality enhances the suboptimal performance that is observed in the unimodal paradigm.
By adulthood, however, visual equivalence learning reaches its optimum and there is no further significant development [12], so the beneficial effect of multimodality disappears.

The developmental patterns of multisensory integration depend on the applied modalities and cognitive tasks. For instance, the integration of the auditory and tactile modalities goes through its most significant development between 8 and 10 years of age, while for the auditory and visual modalities this falls between 11 and 12 years of age [14–19]. Incidental category learning is an intriguing exception, as children as young as 6 years of age use audiovisual stimuli efficiently for this cognitive task [49, 50]. In our study, subjects both below and above 12 years of age integrated auditory and visual signals successfully in an equivalence learning task, and the performance of both cohorts was superior in the audiovisual test as compared to the visual test. At the same time, we observed a significant performance improvement with age: subjects above 12 years of age significantly outperformed subjects below 12 years of age in all parameters and in both test paradigms.

Our results demonstrate that multisensory stimuli significantly enhance association learning and retrieval in the context of sensory guided equivalence learning in healthy children and adolescents. Furthermore, our results suggest that audiovisually guided equivalence learning is still in development in childhood and adolescence, which is especially well illustrated by the difference in audiovisual gain between subjects below and above 12 years of age.

25 Feb 2022
PONE-D-21-24459
Multisensory Stimuli Enhance the Effectiveness of Equivalence Learning in Healthy Children and Adolescents
PLOS ONE
Dear Dr. Eördegh,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please address the revisions recommended by both reviewers.
Please submit your revised manuscript by Apr 11 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
• A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
• A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
• An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Bernadette Ann Murphy, PhD
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.
Please note that according to our submission guidelines (http://journals.plos.org/plosone/s/submission-guidelines), outmoded terms and potentially stigmatizing labels should be changed to more current, acceptable terminology. To this effect, please use "White" or "of western European descent" instead of "Caucasian".

3. Thank you for stating the following in the Acknowledgments Section of your manuscript:

"The authors thank András Puszta and Ákos Pertich for their help with data collection. This work was supported by a grant from SZTE ÁOK-KKA Grant No.: 2019/270-62-2. KT was supported by EFOP 3.6.3-VEKOP-16-2017-00009 grant."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"AN: SZTE ÁOK-KKA Grant No.: 2019/270-62-2
Funder: University of Szeged, Faculty of Medicine, Szeged, Hungary
http://www.med.u-szeged.hu/karunkrol/kari-palyazatok/aok-kari-kutatasi-alap-181005
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
KT: EFOP 3.6.3-VEKOP-16-2017-00009
Funder: Goverment of Hungary
https://www.palyazat.gov.hu/efop-363-vekop-16-felsoktatsi-hallgatk-tudomnyos-mhelyeinek-s-programjainak-tmogatsa#
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

4. We note that you have stated that you will provide repository information for your data at acceptance.
Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics.
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for the opportunity to review this article. The topic and work are interesting and provide a novel contribution to the field of learning and sensory processing/multisensory integration. Overall, the article was well written, providing clear details on the methodology used and results yielded. There are a number of sections where further detail could be provided for clarity.

Introduction: The introduction was concise. Although, several lines seem to be missing references, such as those referring to the neurological substrates involved (i.e. lines 50, 55/56, etc.).

Methods:
• Line 84 – how were participants defined as healthy? Were any screening questionnaires etc. administered?
• Line 128 – when you say “it was slightly modified”, how so? Does this refer to translating it to Hungarian? Could more detail be provided or clarity of statement be improved?
• Paragraph starting at line 153 – were the auditory and visual stimuli presented simultaneously? Or, were they presented at different times (i.e. antecedent and consequent)? The timing of the conditions would likely have an effect on how the stimuli are processed, and therefore integrated or not. Line 159 states the faces were presented simultaneously, was this at the same time as the sound as well?
• Line 159 – audiovisual condition is lacking semantic congruence. Could this affect the results found? As literature, such as Laurienti et al., 2004, has shown that multisensory integration is strengthened when stimuli are semantically congruent. I.e. the auditory cues here are words/sounds that are incongruent with the visual cues.
• Line 178 – Reaction Time defined as (RT)? Sometimes use RT but other times use reaction time, check for consistency.
• Line 180 – “RT values over 3SD were excluded”. 3SD over what? Each participant’s individual average RT, the group average, etc.?
I suspect the former, but this isn’t clear.
• Was handedness checked or confirmed? If so, was this proportionally matched between groups? What hand did they respond with, or did they use both hands? Did all participants have to respond with the same hand, or did they use their dominant hand? Line 100 – mention how responses were given (M and X keys), but not many details provided, please elaborate.

Results:
• Line 237 – “leaning” should be “learning”.
• Line 211 – sentence structure – where the “0.051” is placed seems strange, was this supposed to be before it reads “in the multisensory paradigm”?

Discussion:
• Line 308 paragraph – could this be a result of either a) stimulus incompatibility/semantic incongruence or b) the timing of the stimulus, possibly reducing the likelihood of them being integrated as a multisensory condition?
• Line 328 – Interesting, what are some potential implications for this? As previously stated that MSI wasn’t present in children (10-11 years old; line 79). Would be interesting for this to be elaborated on, as it was a main finding.
• Line 289 – discuss that audiovisual information facilitated equivalence learning in children. There were differences between groups (<12 and >12) but in the intro, stated that MSI didn’t occur until 10-11 years old. What do you think resulted in these differences if not the integration of the stimuli? Can this be elaborated on?
• Paragraph from line 332 to 343 – seems out of place with respect to the preceding paragraph. Would be helpful to weave in how this information pertains to this particular study and results itself.
• Line 359 – improvement in performance, using what metric? RT, accuracy, both, etc.?

Reviewer #2: The study compared a visual and a crossmodal version of an association learning test (Rutgers Acquired Equivalence Test) in an impressive sample of 157 children and adolescents.
It was found that RT was consistently faster and that performance (error rate) during acquisition and during retrieval of previously learned associations was better in the crossmodal as compared to the unimodal condition, whereas performance did not differ for generalization to new but predictable associations.

Overall, this is a solid and technically sound contribution. My comments mainly pertain to the conceptualization of multisensory processing and the somewhat arbitrary distinction between children younger and older than 12 years of age:

l. 28 "Performance during the acquisition phase (building of associations), which primarily depends on the function of the basal ganglia": the present study does not assess neural mechanisms, I would thus recommend to attenuate this statement here and in other parts of the manuscript, e.g. "which has been suggested to primarily depend…".

l. 62 and elsewhere "Multisensory information could mean more than the sum of different modalities": The authors seem to allude to findings of superadditivity in neural processing of crossmodal stimuli. However, the wording is unfortunate and it is not clear what superadditivity would mean in the context of the present (behavioral) task?

l. 77ff. "Multisensory integration is not detectable until the ages of 8–10 years in auditory and tactile modalities [13, 14], and audiovisual integration does not appear until the ages of 11–12 years [15-18]": I think that the available literature is in stark contrast to this statement. While the cited studies may show that integration is not necessarily optimal in younger children, it seems outright wrong to assume that this indicates that children up to 12 years do not integrate audiovisual stimuli at all. For examples of adult-like audiovisual integration in accord with Bayesian causal inference principles in children as young as 5 years old, see e.g.
recent work by Rohlf and colleagues (Rohlf et al., 2020, Curr Biol; Rohlf et al., 2021, Multisens Res).

Based on these findings, the presumed boundary at age 12 seems arbitrary. Why not test a correlation with age instead? For example, scatterplots of age versus the difference between visual and crossmodal conditions could be provided for each performance measure.

Based on the equivalence of the present findings in children below and above 12 years, the authors conclude that children and adolescents <18 years benefit from crossmodal condition but adults >18 years do not. This seems arbitrary as well, why should 18 years of age be a cut-off age for the crossmodal benefit in association learning?

Moreover, the type of stimuli used here seem to test a different type of multisensory processing (association learning) than the integration of redundant stimulus features (e.g., spatial or temporal integration) and likely depend on different neural mechanisms. This should be reflected in the discussion of the multisensory literature.

l. 178ff. "Reaction times from the appearance of the stimuli until the participant’s decision were also registered for each answer with millisecond accuracy": RTs were measured with (different) standard computer keyboards which almost certainly introduced delays and jitter to the measured RTs, thus "millisecond accuracy" was likely not achieved.

l. 285 "the effect of multisensory audiovisual stimuli in contrast to clear visual ones": What is meant with "clear" visual stimuli, this sounds as if the audiovisual stimuli were not "clear"?

l. 332ff. lists textbook knowledge about multisensory processing in a very vague form, this paragraph should be revised.

l. 344ff. suggests a ceiling effect in the employed task in adults. Thus, it seems unsurprising if no multisensory enhancement was observed in the previous adult study.

Visual face stimuli were used as antecedent stimuli in the visual condition but as consequent stimuli in the crossmodal condition.
Thus, stimuli differed not only in terms of sensory modality but also in terms of stimulus type between the two conditions. It seems that a better matching crossmodal condition would have been to use voice recordings of male/female adult/child speakers as antecedents and the same fish stimuli as consequents. I wonder whether the change in stimulus types might have had any effects on task difficulty that are independent of whether the stimuli were presented visually or crossmodally (and thus could explain the findings without assuming a multisensory benefit)?

Data availability: Authors indicated that "all relevant data will be within the manuscript and its Supporting Information files". However, they were not in the reviewer's version of the manuscript and thus I was unable to assess this point.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free.
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

19 Apr 2022

First of all, we would like to express our gratitude for the scholarly and highly helpful criticism of the Editorial Board and both Referees, which helped us to improve the quality of our study. We discussed all criticisms and suggestions of the Referees and made changes accordingly. The suggestions are answered below point by point; textual changes are marked in the final manuscript.

Reviewer #1: Thank you for the opportunity to review this article. The topic and work are interesting and provide a novel contribution to the field of learning and sensory processing/multisensory integration. Overall, the article was well written, providing clear details on the methodology used and results yielded. There are a number of sections where further detail could be provided for clarity.

1. Introduction: The introduction was concise. Although, several lines seem to be missing references, such as those referring to the neurological substrates involved (i.e. lines 50, 55/56, etc.).

Answer: We have added the missing references.

Methods:

2. Line 84 – how were participants defined as healthy? Were any screening questionnaires etc. administered?

Answer: We made this paragraph in the Methods clearer: “The participants were recruited on a voluntary basis, received no compensation for their participation, and they were free to quit at any time without any consequence. The volunteers and their parents were informed about the aims and procedures of the study, and their medical history was taken with emphasis on neurological, ontological, psychiatric or chronic somatic disorders. Volunteers with such disorders in their history were not eligible for the study. Any regularly taken medication was recorded.
The volunteers were also tested with the Ishihara plates to exclude color blindness. As all volunteers were under 18 years of age, the informed consent form was signed by their parents, as required by law. All volunteers were of Caucasian descent and all were native speakers of Hungarian. The study protocol followed the tenets of the Declaration of Helsinki in all respects, and it was approved by the Ministry of Human Resources (11818-6/2017/EÜIG).”

3. Line 128 – when you say “it was slightly modified”, how so? Does this refer to translating it to Hungarian? Could more detail be provided or clarity of statement be improved?

Answer: We rephrased the criticized paragraph: “The basis of the applied visual associative test was the Rutgers Acquired Equivalence Test [1]. It was rewritten in Assembly for Windows, translated to Hungarian, and slightly modified (more trials in the test phases, to obtain more accurate information about hippocampal functions) [22] with the written permission of Professor Catherine E. Myers (Rutgers University), head of the research group where the test paradigm was originally developed.”

4. Paragraph starting at line 153 – were the auditory and visual stimuli presented simultaneously? Or, were they presented at different times (i.e. antecedent and consequent)? The timing of the conditions would likely have an effect on how the stimuli are processed, and therefore integrated or not. Line 159 states the faces were presented simultaneously, was this at the same time as the sound as well?

Answer: We added to the manuscript: “In each trial, the subjects simultaneously heard a sound (SPL = 60 dB) through a loudspeaker and saw two faces on the right and left sides of the screen.”

5. Line 159 – audiovisual condition is lacking semantic congruence. Could this affect the results found? As literature, such as Laurienti et al., 2004, has shown that multisensory integration is strengthened when stimuli are semantically congruent. I.e.
the auditory cues here are words/sounds that are incongruent with the visual cues.

Answer: We added to the Discussion: “Semantic congruence can strengthen multisensory integration (Laurienti PJ, Kraft RA, Maldjian JA, Burdette JH, Wallace MT. Semantic congruence is a critical factor in multisensory behavioral performance. Exp Brain Res. 2004 Oct;158(4):405-14. doi: 10.1007/s00221-004-1913-2. PMID: 15221173), but in the case of our stimuli, such a congruency is negligible if it exists at all. Thus, it is safe to assume that in this study multisensory integration facilitated performance without semantic interference. Multisensory integration has been described at various levels of observation.”

6. Line 178 – Reaction Time defined as (RT)? Sometimes use RT but other times use reaction time, check for consistency.

Answer: We introduced the RT abbreviation at the first appearance of “reaction time”.

7. Line 180 – “RT values over 3SD were excluded”. 3SD over what? Each participant’s individual average RT, the group average, etc.? I suspect the former, but this isn’t clear.

Answer: We rephrased the sentence: “RT values over 3 SD of each participant’s individual average RT were excluded from further analysis.”

8. Was handedness checked or confirmed? If so, was this proportionally matched between groups? What hand did they respond with, or did they use both hands? Did all participants have to respond with the same hand, or did they use their dominant hand? Line 100 – mention how responses were given (M and X keys), but not many details provided, please elaborate.

Answer: We specified this: “The keys X and M were labeled as “left” and “right” on the laptop’s keyboard. The subjects used these keys to indicate their choices in both test paradigms. The participants used both hands for the responses.”

Results:

9. Line 237 – “leaning” should be “learning”.

Answer: Corrected.

10.
Line 211 – sentence structure – where the “0.051” is placed seems strange, was this supposed to be before it reads “in the multisensory paradigm”?

Answer: Done.

Discussion:

11. Line 308 paragraph – could this be a result of either a) stimulus incompatibility/semantic incongruence or b) the timing of the stimulus, possibly reducing the likelihood of them being integrated as a multisensory condition?

Answer: We added to the Discussion: “Semantic congruence can strengthen multisensory integration (Laurienti PJ, Kraft RA, Maldjian JA, Burdette JH, Wallace MT. Semantic congruence is a critical factor in multisensory behavioral performance. Exp Brain Res. 2004 Oct;158(4):405-14. doi: 10.1007/s00221-004-1913-2. PMID: 15221173), but in the case of our stimuli, such a congruency is negligible if it exists at all. Thus, it is safe to assume that in this study multisensory integration facilitated performance without semantic interference. Multisensory integration has been described at various levels of observation.”

In each trial, the auditory and the visual stimuli appeared simultaneously: the participant simultaneously heard a sound (SPL = 60 dB) through a loudspeaker and saw two faces on the right and left sides of the screen. Thus, the timing of the stimuli was constant and consistent, and it could not have influenced the results.

12. Line 328 – Interesting, what are some potential implications for this? As previously stated that MSI wasn’t present in children (10-11 years old; line 79). Would be interesting for this to be elaborated on, as it was a main finding.

Answer: We answer this question in our response to the comment on line 289.

13. Line 289 – discuss that audiovisual information facilitated equivalence learning in children. There were differences between groups (<12 and >12) but in the intro, stated that MSI didn’t occur until 10-11 years old. What do you think resulted in these differences if not the integration of the stimuli?
Can this be elaborated on?

Answer: We have rephrased the Discussion: “The reason for the superiority of audiovisual information as input for equivalence learning in children and adolescents but not in adults [12] can be that visually guided equivalence learning is still in development in childhood and adolescence [30], that is, it has not yet reached its optimum. It can be hypothesized that the additional modality enhances the suboptimal performance that is observed in the unimodal paradigm. By adulthood, however, visual equivalence learning reaches its optimum and there is no further significant development [12], so the beneficial effect of multimodality disappears. The developmental patterns of multisensory integration depend on the applied modalities and cognitive tasks. For instance, the integration of the auditory and tactile modalities goes through its most significant development between 8 and 10 years of age, while for the auditory and visual modalities this falls between 11 and 12 years of age [14–19]. Incidental category learning is an intriguing exception, as children as young as 6 years of age use audiovisual stimuli efficiently for this cognitive task [49, 50]. In our study, subjects both below and above 12 years of age integrated auditory and visual signals successfully in an equivalence learning task, and the performance of both cohorts was superior in the audiovisual test as compared to the visual test. At the same time, we observed a significant performance improvement with age: subjects above 12 years of age significantly outperformed subjects below 12 years of age in all parameters and in both test paradigms.”

14. Line 359 – improvement in performance, using what metric?
RT, accuracy, both, etc.?

Answer: We amended the sentence as follows: “At the same time, we observed a significant performance improvement with age: subjects above 12 years of age significantly outperformed subjects below 12 years of age in all parameters and in both test paradigms.”

Reviewer #2: The study compared a visual and a crossmodal version of an association learning test (Rutgers Acquired Equivalence Test) in an impressive sample of 157 children and adolescents. It was found that RT was consistently faster and that performance (error rate) during acquisition and during retrieval of previously learned associations was better in the crossmodal as compared to the unimodal condition, whereas performance did not differ for generalization to new but predictable associations. Overall, this is a solid and technically sound contribution. My comments mainly pertain to the conceptualization of multisensory processing and the somewhat arbitrary distinction between children younger and older than 12 years of age:

1. l. 28 "Performance during the acquisition phase (building of associations), which primarily depends on the function of the basal ganglia": the present study does not assess neural mechanisms, I would thus recommend to attenuate this statement here and in other parts of the manuscript, e.g. "which has been suggested to primarily depend…".

Answer: We have deleted from the Abstract the speculation about the role of neuronal structures in this learning process, and added the missing references to the Introduction section of the manuscript.

2. l. 62 and elsewhere "Multisensory information could mean more than the sum of different modalities": The authors seem to allude to findings of superadditivity in neural processing of crossmodal stimuli.
However, the wording is unfortunate and it is not clear what superadditivity would mean in the context of the present (behavioral) task?

Answer: Superadditivity would mean that performance in the multisensory paradigm is better than in either of the two unimodal (visual or auditory) paradigms. In the absence of a purely auditory associative learning test, we were unable to check this statement. We have therefore deleted the speculations about superadditivity in the learning processes.

3. l. 77ff. "Multisensory integration is not detectable until the ages of 8–10 years in auditory and tactile modalities [13, 14], and audiovisual integration does not appear until the ages of 11–12 years [15-18]": I think that the available literature is in stark contrast to this statement. While the cited studies may show that integration is not necessarily optimal in younger children, it seems outright wrong to assume that this indicates that children up to 12 years do not integrate audiovisual stimuli at all. For examples of adult-like audiovisual integration in accord with Bayesian causal inference principles in children as young as 5 years old, see e.g. recent work by Rohlf and colleagues (Rohlf et al., 2020, Curr Biol; Rohlf et al., 2021, Multisens Res).

Answer: We rephrased the Introduction and inserted the two recommended references: “Concerning the development of multisensory integration in childhood, the available data are controversial and strongly depend on stimulus modality and the studied cognitive function. The literature distinguishes between two main types of multisensory integration: the integration of different modalities and the integration of redundant stimulus features (e.g., spatial or temporal integration). The integration of different modalities is not detectable until 8 to 10 years of age in the auditory and tactile modalities [14, 15], and audiovisual integration is suboptimal (but detectable) until 11 to 12 years of age [16-21].
Therefore, in this study we also sought to investigate if there was a difference in participants' performance depending on whether they were above or below 12 years of age."

4. Based on these findings, the presumed boundary at age 12 seems arbitrary. Why not test a correlation with age instead? For example, scatterplots of age versus the difference between visual and crossmodal conditions could be provided for each performance measure.

Answer: We wrote a new paragraph in the Results section with the title "The effect of age on performance" and reported the requested correlations: "Linear regression analysis was performed to analyze the age-dependence of the studied parameters. All the investigated parameters, both in the acquisition and the test phases, showed a significant negative correlation with the age of the participants. That is, performance improved with age in general (see Table 2)."

5. Based on the equivalence of the present findings in children below and above 12 years, the authors conclude that children and adolescents <18 years benefit from the crossmodal condition but adults >18 years do not. This seems arbitrary as well; why should 18 years of age be a cut-off age for the crossmodal benefit in association learning?

Answer: This is a technical consideration only: in our country (Hungary), this is the border between adolescence and adulthood. Furthermore, the results of the adult and child populations came from different studies. We stated in our previous study (Eördegh et al., 2019) that multisensory stimuli do not enhance performance in adults (over 18 years old). This is not the case in children and adolescents under 18 years of age, where multisensory stimuli do enhance performance. However, the investigation of the correlation between age and performance in the entire data set (under and above 18 years of age together) is the topic of a new analysis and a new manuscript.

6.
Moreover, the type of stimuli used here seems to test a different type of multisensory processing (association learning) than the integration of redundant stimulus features (e.g., spatial or temporal integration), and likely depends on different neural mechanisms. This should be reflected in the discussion of the multisensory literature.

Answer: Multisensory integration is a broad concept, and the term is sometimes used in the literature with less than perfect accuracy. We amended the paragraph with the sentence: "The literature distinguishes between two main types of multisensory integration: the integration of different modalities and the integration of redundant stimulus features (e.g., spatial or temporal integration)."

We added to the Discussion: "Multisensory integration has been shown to influence various cognitive-behavioral parameters such as reaction time, accuracy of answers, or perception thresholds [45-48]. Our results suggest that multisensory integration enhances the learning and retrieval of associations in healthy children and adolescents, and in this sense our results are in agreement with the literature."

7. l. 178ff. "Reaction times from the appearance of the stimuli until the participant's decision were also registered for each answer with millisecond accuracy": RTs were measured with (different) standard computer keyboards, which almost certainly introduced delays and jitter to the measured RTs; thus "millisecond accuracy" was likely not achieved.

Answer: The Reviewer is right; it makes no sense to claim millisecond accuracy. Thank you for this suggestion. We deleted "millisecond accuracy" from the Materials and methods section.

8. l. 285 "the effect of multisensory audiovisual stimuli in contrast to clear visual ones": What is meant by "clear" visual stimuli? This sounds as if the audiovisual stimuli were not "clear".

Answer: We changed the word "clear" to "single".

9. l. 332ff.
lists textbook knowledge about multisensory processing in a very vague form; this paragraph should be revised.

Answer: We have rephrased the criticized paragraph: "Multisensory integration has been described at various levels of observation. It has been described in detail at the single-cell level [39-42] in both the neocortex [8] and in subcortical structures [5, 9, 43]. It is also well documented in various cognitive functions at the behavioral level [10, 11, 44]."

10. l. 344ff. suggests a ceiling effect in the employed task in adults. Thus, it seems unsurprising if no multisensory enhancement was observed in the previous adult study.

Answer: The results of the adult and child populations came from different studies. We stated in our previous study (Eördegh et al., 2019) that multisensory stimuli do not enhance performance in adults (over 18 years old). This is not the case in children and adolescents under 18 years of age, where multisensory stimuli do enhance performance. However, the investigation of the correlation between age and performance in the entire data set (under and above 18 years of age together) is the topic of a new analysis and a new manuscript.

11. Visual face stimuli were used as antecedent stimuli in the visual condition but as consequent stimuli in the crossmodal condition. Thus, stimuli differed not only in terms of sensory modality but also in terms of stimulus type between the two conditions. It seems that a better matching crossmodal condition would have been to use voice recordings of male/female adult/child speakers as antecedents and the same fish stimuli as consequents.
I wonder whether the change in stimulus types might have had any effects on task difficulty that are independent of whether the stimuli were presented visually or crossmodally (and thus could explain the findings without assuming a multisensory benefit)?

Answer: The multisensory paradigm, which we developed and published earlier (see Eördegh et al., 2019), has this arrangement (faces as consequent stimuli in the crossmodal condition). This is a published test that has already been validated in a large healthy human population and in patients with obsessive-compulsive disorder (Pertich et al., 2019). However, it is an interesting question whether other audiovisual stimulus combinations could influence the effectiveness of associative learning. We argue that using the fish instead of the faces would not significantly influence performance.

12. Data availability: The authors indicated that "all relevant data will be within the manuscript and its Supporting Information files". However, they were not in the reviewer's version of the manuscript and thus I was unable to assess this point.

Answer: The statement was: "All relevant data will be within the manuscript and its Supporting Information files, after the acceptance of the manuscript." In this new submission we have provided all of the data (NAT, ALER, RER, GER, RT, ages) in an attached Excel file.

Submitted filename: Response to Reviewers.docx

11 May 2022
PONE-D-21-24459R1
Multisensory Stimuli Enhance the Effectiveness of Equivalence Learning in Healthy Children and Adolescents

PLOS ONE

Dear Dr. Eördegh,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jun 24 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.

Kind regards,

Vanessa Carels
Staff Editor
PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the "Comments to the Author" section, enter your conflict of interest statement in the "Confidential to Editor" section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous.
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have adequately addressed all review comments. This work makes an interesting contribution to the field of equivalence learning and the role of sensory modalities (visual and multisensory) in various age groups.

A few minor English/grammar corrections to note: refer to lines 22, 25, 411, and 525 in the tracked version.

Reviewer #2: The authors have largely addressed the points raised in the reviews, and the manuscript has been improved. However, I have one remaining comment:

Authors now report correlations between age and performance in each performance metric individually (i.e., separately for the V and the AV paradigms), and find, maybe not surprisingly, that performance generally increases with age. However, my original comment pertained to the correlation between age and the size of the multisensory gain (i.e., the performance difference between V and AV paradigms, e.g., MNAT minus VNAT). At least for NAT and ALER, the results presented in Tables 3 and 4 seem to suggest that the multisensory gain becomes smaller in >12 as compared to <12, and I wonder if there would be a significant correlation here?
If yes, this would indicate that the multisensory gain actually decreases and becomes more adult-like over development, which would be different to the current interpretation that there is no difference until age 18.

Similarly, in Table 5 the authors did not directly test whether there is a significant difference between <12 and >12 in the multisensory gain; they only compare V and AV metrics separately between age groups. I think it would be crucial to directly compare the size of the multisensory gain between age groups (note that multisensory gain could be significant in both age groups as shown in Table 4 but still be significantly smaller in >12 than in <12).

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose "no", your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org.
Please note that Supporting Information files do not need this step.
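The age-dependence analysis the authors describe (linear regression of each learning parameter against age, finding significant negative correlations) can be sketched as follows. This is an illustrative sketch on synthetic data, not the study's dataset; the variable names (`age`, `aler`) and the simulated effect sizes are assumptions chosen only to mimic the reported pattern of performance improving with age.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)

# Synthetic data for 157 participants: ages in years, and an error-rate
# parameter (here called ALER for illustration) that decreases with age.
age = rng.uniform(4.0, 18.0, size=157)
aler = 0.40 - 0.015 * age + rng.normal(0.0, 0.05, size=157)

# Ordinary least-squares regression of the learning parameter on age.
res = linregress(age, aler)
print(f"slope={res.slope:.4f}, r={res.rvalue:.3f}, p={res.pvalue:.2e}")
```

A negative slope with a small p-value corresponds to the "significant negative correlation with age" reported for each parameter; in practice one such regression would be run per parameter (NAT, ALER, RER, GER) and per paradigm.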
22 Jun 2022

First of all, we would like to express our gratitude to both Referees for their help in improving the quality of our study. We discussed all criticisms and suggestions of the Referees and made changes accordingly. The suggestions are answered below point by point; textual changes are marked in the final manuscript.

Reviewer #1: The authors have adequately addressed all review comments. This work makes an interesting contribution to the field of equivalence learning and the role of sensory modalities (visual and multisensory) in various age groups.

A few minor English/grammar corrections to note: refer to lines 22, 25, 411, and 525 in the tracked version.

Answer: The English/grammar corrections were performed with the help of a native English speaker.

Reviewer #2: The authors have largely addressed the points raised in the reviews, and the manuscript has been improved. However, I have one remaining comment:

Authors now report correlations between age and performance in each performance metric individually (i.e., separately for the V and the AV paradigms), and find, maybe not surprisingly, that performance generally increases with age. However, my original comment pertained to the correlation between age and the size of the multisensory gain (i.e., the performance difference between V and AV paradigms, e.g., MNAT minus VNAT). At least for NAT and ALER, the results presented in Tables 3 and 4 seem to suggest that the multisensory gain becomes smaller in >12 as compared to <12, and I wonder if there would be a significant correlation here? If yes, this would indicate that the multisensory gain actually decreases and becomes more adult-like over development, which would be different to the current interpretation that there is no difference until age 18.

Similarly, in Table 5 the authors did not directly test whether there is a significant difference between <12 and >12 in the multisensory gain; they only compare V and AV metrics separately between age groups.
I think it would be crucial to directly compare the size of the multisensory gain between age groups (note that multisensory gain could be significant in both age groups as shown in Table 4 but still be significantly smaller in >12 than in <12).

Answer: We have defined the multisensory gain in the Data Analysis section. We calculated gain values for each psychophysical learning parameter (NAT, ALER, RER, GER) and correlated these with age. A significant correlation of multisensory gain with age was found in the acquisition phase. This indicates that the multisensory gain actually decreases as the children age and becomes more adult-like over development. We have added a new paragraph, "Correlation of multisensory gain with age and the size of multisensory gain in the age groups", to the Results section. Two new tables were prepared (Table 6 and Table 7), which contain the descriptive statistics and the detailed correlation results.

We have also compared the multisensory gains between the two age groups (under and above 12 years). The comparison of the multisensory gains of the two age cohorts showed that the older cohort has a smaller gain than the younger one, and the differences are significant in the acquisition phase. Table 8 contains the detailed results of the Mann-Whitney U tests.

We have additionally modified the Abstract and Discussion sections according to these new results.

Submitted filename: Response to Reviewers_2.docx

4 Jul 2022

Multisensory Stimuli Enhance the Effectiveness of Equivalence Learning in Healthy Children and Adolescents

PONE-D-21-24459R2

Dear Dr. Eördegh,

We're pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you'll receive an e-mail detailing the required amendments.
When these have been addressed, you'll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Patrick Bruns
Guest Editor
PLOS ONE

Additional Editor Comments (optional):

I would like to disclose that I had previously served as a reviewer (Reviewer #2) for this manuscript myself. Based on my assessment of the revised manuscript, I am happy to confirm that the remaining minor comments I had raised in the previous round of reviews have now all been addressed. Congratulations on a nice contribution.

Reviewers' comments: None

21 Jul 2022

PONE-D-21-24459R2

Multisensory Stimuli Enhance the Effectiveness of Equivalence Learning in Healthy Children and Adolescents

Dear Dr. Eördegh:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours.
Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Patrick Bruns
Guest Editor
PLOS ONE
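The multisensory-gain analysis requested by Reviewer #2 and reported in the authors' final response (a gain value per participant, correlated with age and directly compared between the two age cohorts with a Mann-Whitney U test) can be sketched as follows. This is an illustrative sketch on synthetic data with assumed effect sizes, not the study's dataset; defining the gain as the visual-minus-audiovisual error rate (so a positive gain means the audiovisual paradigm was better) is one plausible convention, chosen here for illustration.

```python
import numpy as np
from scipy.stats import linregress, mannwhitneyu

rng = np.random.default_rng(1)

# Synthetic error rates in the visual (V) and audiovisual (AV) paradigms
# for 157 participants, with the AV advantage shrinking with age.
age = rng.uniform(4.0, 18.0, size=157)
v_err = 0.35 - 0.010 * age + rng.normal(0.0, 0.04, size=157)
av_err = v_err - (0.10 - 0.005 * age) + rng.normal(0.0, 0.04, size=157)

# Multisensory gain: how much lower the error rate is in the AV paradigm.
gain = v_err - av_err

# Correlation of the gain itself with age (not of each paradigm separately).
res = linregress(age, gain)

# Direct comparison of the gain between the two age cohorts.
young, old = gain[age < 12], gain[age >= 12]
u_stat, p_group = mannwhitneyu(young, old, alternative="two-sided")
print(f"gain~age: slope={res.slope:.4f}, p={res.pvalue:.2e}; "
      f"Mann-Whitney U={u_stat:.0f}, p={p_group:.2e}")
```

A negative slope here corresponds to the reported finding that the gain decreases with age, and a significant Mann-Whitney result corresponds to the younger cohort having a larger gain than the older one; in the study this would be repeated for each parameter (NAT, ALER, RER, GER).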