Rapid Perceptual Learning: A Potential Source of Individual Differences in Speech Perception Under Adverse Conditions?

Tali Rotman¹, Limor Lavie¹, Karen Banai¹.

Abstract

Challenging listening situations (e.g., when speech is rapid or noisy) result in substantial individual differences in speech perception. We propose that rapid auditory perceptual learning is one of the factors contributing to those individual differences. To explore this proposal, we assessed rapid perceptual learning of time-compressed speech in young adults with normal hearing and in older adults with age-related hearing loss. We also assessed the contribution of this learning as well as that of hearing and cognition (vocabulary, working memory, and selective attention) to the recognition of natural-fast speech (NFS; both groups) and speech in noise (younger adults). In young adults, rapid learning and vocabulary were significant predictors of NFS and speech in noise recognition. In older adults, hearing thresholds, vocabulary, and rapid learning were significant predictors of NFS recognition. In both groups, models that included learning fitted the speech data better than models that did not include learning. Therefore, under adverse conditions, rapid learning may be one of the skills listeners could employ to support speech recognition.

Keywords: age-related hearing loss; auditory learning; fast speech; speech recognition

Year: 2020 PMID: 32552477 PMCID: PMC7303778 DOI: 10.1177/2331216520930541

Source DB: PubMed Journal: Trends Hear ISSN: 2331-2165 Impact factor: 3.293

Dynamic matching of incoming speech input with preexisting phonological and lexical representations facilitates speech perception, especially under suboptimal or adverse conditions (Davis & Johnsrude, 2007; Guediche et al., 2014; Mattys et al., 2009; Samuel, 2011). However, such suboptimal conditions also yield substantial individual differences in speech recognition which are only partially explained by various sensory (e.g., hearing acuity), cognitive (e.g., working memory and vocabulary), and demographic factors (Benichov et al., 2012; Bent et al., 2016; Carbonell, 2017; DeCaro et al., 2016; Gordon-Salant & Fitzgibbons, 1997; Humes & Dubno, 2010; Mattys et al., 2012; McLaughlin et al., 2018; Nagaraj, 2017; Wingfield & Tun, 2001) . Another potential source of individual differences in speech recognition which we have recently revealed is perceptual learning (Banai & Lavie, 2020; Karawani et al., 2017; Manheim et al., 2018), defined as experience-induced changes in the perception of stimulus arrays (Green et al., 2018; Samuel & Kraljic, 2009). These studies showed high correlations between rapid perceptual learning of time-compressed speech (TCS) and the recognition of natural-fast speech (NFS) in both younger and older adults. However, these correlations could reflect the contribution of the sensory and cognitive factors already known to associate with speech recognition, and not a unique contribution of learning. Therefore, the goal of the current study was to test the hypothesis that in normal-hearing young adults as well as in older adults with hearing loss, rapid auditory perceptual learning is a unique predictor of speech recognition under adverse conditions while accounting for other sensory and cognitive variables. 
To this end, we assessed rapid learning of TCS and the recognition of NFS in younger and older adults, speech perception in noise (young adults only), and indices of cognitive performance (selective attention, working memory span, vocabulary, and nonverbal intelligence).

Perceptual Learning and the Recognition of Degraded Speech

Samuel and Kraljic (2009) suggested that a primary purpose of perceptual learning for speech in adulthood is to allow listeners to understand degraded speech (speech that deviates from the norm or that is presented under adverse conditions). Consistent with this suggestion, multiple labs reported improved recognition of degraded speech (e.g., noisy, vocoded, time-compressed, accented) following either brief experiences or longer training (e.g., Adank & Janse, 2009; Baese-Berk et al., 2013; Banai & Lavner, 2014; Bradlow & Bent, 2008; Clopper & Pisoni, 2004; Davis et al., 2005; Dupoux & Green, 1997; Golomb et al., 2007; Greenspan et al., 1988; Stacey & Summerfield, 2008). The current study focuses on rapid learning that follows brief encounters with distorted speech. For example, upon first encounter with TCS, substantial improvements occur within minutes of exposure (Altmann & Young, 1993; Dupoux & Green, 1997; Peelle & Wingfield, 2005). The recognition of other forms of degraded speech also improves after brief exposure (Clarke & Garrett, 2004; Davis et al., 2005). However, in studies of perceptual learning, both learning and speech recognition were usually assessed using a single type of speech stimulus (i.e., stimuli were all time-compressed or vocoded). Therefore, the design of those studies makes it hard to determine whether perceptual learning can be viewed as a more general capacity. In our previous studies (Karawani et al., 2017; Manheim et al., 2018), we showed that individual differences in perceptual learning are associated with individual differences in speech recognition assessed with speech materials that are different from those used to assess learning. 
The rapid occurrence of perceptual learning, at least for some forms of degraded speech (e.g., time-compressed and accented speech; Adank & Janse, 2009; Clarke & Garrett, 2004; Dupoux & Green, 1997; Golomb et al., 2007; Gordon-Salant et al., 2010; Peelle & Wingfield, 2005), together with findings that speech perception is inherently dynamic and flexible (Ahissar et al., 2009; Davis & Johnsrude, 2007; Guediche et al., 2014), leads us to propose that perceptual learning is one of the capacities that serve online speech recognition. If this is the case, the specificity of learning to the type of speech to which listeners are exposed is not an obstacle for learning (in the absence of generalization), but rather, this specificity serves the purpose of coping with the exact characteristics of the encountered degraded speech. If, as we argue, perceptual learning is a capacity that serves ongoing speech recognition, rapid learning should explain unique variance in the recognition of different types of distorted speech in addition to the known contributions of other cognitive capacities such as attention, working memory, and vocabulary. When listening to speech, listeners automatically attempt to extract the general meaning of the input (e.g., the identity of a word) and are not attuned to the lower level acoustic and phonetic features of the input (Ahissar et al., 2009; Nahum et al., 2008). However, under adverse conditions, this automatic process often fails, and the failure triggers adaptive processes (Guediche et al., 2014; Sohoglu & Davis, 2016) that allow access to the relevant low-level information. Learning rates and amounts differ across individuals (Manheim et al., 2018; Theodore et al., 2019). We thus argue that because in “good learners” learning is generally quicker and more efficient, they enjoy a general advantage, leading to better perception of degraded speech.
We use the perceptual learning of TCS as an index of rapid learning because most listeners have no experience with this type of speech, making it useful in studying the correlations between learning-as-a-capacity and the recognition of other, more familiar forms of degraded speech (e.g., NFS and speech in noise [SIN]). Although highly compressed speech is initially hard to recognize, recognition accuracy improves quite rapidly with exposure to as few as 10 to 20 time-compressed sentences (e.g., Dupoux & Green, 1997; Golomb et al., 2007). This rapid learning has been shown in young and in older adults as well as in older adults with hearing impairments (Golomb et al., 2007; Manheim et al., 2018; Peelle & Wingfield, 2005). Finally, across age and hearing levels, we previously reported that variance in rapid learning of TCS accounted for unique variance in the recognition of NFS, even after controlling for the known association between the recognition of NFS and TCS (Manheim et al., 2018).

Age-Related Hearing Loss, Perceptual Learning, and the Recognition of Distorted Speech

The recognition of different forms of degraded speech deteriorates with age and age-related hearing loss (Dubno et al., 1984; Gordon-Salant & Fitzgibbons, 1995, 2001; Humes & Christopherson, 1991; Humes et al., 2012; Janse, 2009; Sommers et al., 2020). As with general individual differences in speech recognition, declines in sensory and cognitive factors are not sufficient to explain why some older adults find degraded speech so challenging (Anderson et al., 2013; Peelle & Wingfield, 2016; Wingfield et al., 2005). Rather, it seems that the relative contribution of nonsensory factors may increase with age/hearing loss, perhaps because listeners recruit domain general cognitive resources to compensate for sensory declines (Alain et al., 2004; Bidelman et al., 2019; Peelle & Wingfield, 2016). Therefore, we studied the contribution of perceptual learning to speech recognition separately in each age-group. As for learning, perceptual learning for speech is certainly present in older adults with either normal or impaired hearing (Colby et al., 2018; Golomb et al., 2007; Karawani et al., 2016; Manheim et al., 2018; Neger et al., 2014; Peelle & Wingfield, 2005; Schlueter et al., 2016). However, whereas some reported little or no effects of age on learning (Golomb et al., 2007; Peelle & Wingfield, 2005), others found that learning might slow (Manheim et al., 2018) or diminish (Neger et al., 2014) with age. One possibility is that learning is maintained to a greater extent in situations in which lexical information can be used (Colby et al., 2018; Scharenborg & Janse, 2013). Regardless, older adults who maintain better perceptual learning also tend to maintain better speech perception (Karawani et al., 2017; Manheim et al., 2018). Therefore, we now hypothesize that to the extent that perceptual learning constrains speech perception independent of other factors, this should be true regardless of age/hearing.

Cognition and the Recognition of Distorted Speech

Cognitive and linguistic resources are used to support speech recognition under adverse conditions (Gordon-Salant & Fitzgibbons, 1997; McLaughlin et al., 2018; Pichora-Fuller et al., 1995; Wingfield & Tun, 2001). Therefore, in the current study, we assessed vocabulary, working memory, and inhibitory attention, in addition to speech recognition and rapid auditory learning. While an in-depth review of how cognitive and linguistic factors might contribute to speech perception is beyond the scope of the current article, the general notion is that in favorable listening conditions speech perception is an implicit process, through which the acoustic signal is matched with stored representations in long-term memory. On the other hand, signal degradation results in a mismatch that prevents this automatic process, requiring explicit top-down processing (e.g., the Ease of Language Understanding model, Ronnberg et al., 2013, 2019). Consistent with this notion, there are significant associations between aspects of speech processing and working memory (McLaughlin et al., 2018; Pichora-Fuller et al., 1995; Souza & Arehart, 2015). Attention, and particularly the ability to ignore irrelevant information, is also associated with speech perception (Adank & Janse, 2010; Oberfeld & Klockner-Nowotny, 2016; Tierney et al., 2019). Finally, vocabulary has been repeatedly associated with the recognition of different forms of degraded speech (Banks et al., 2015; Bent et al., 2016; McLaughlin et al., 2018), possibly because it reflects the wealth of previous experiences a listener can draw from to support ongoing speech recognition.

The Current Study

The major goal of this study was to test facets of the hypothesis that perceptual learning (as measured with a TCS task) explains individual differences in independent estimates of speech recognition when hearing level and cognitive factors are considered. In young adults, we estimated, in addition to rapid learning, the recognition of NFS and SIN. In older adults with age-related hearing loss, we repeated the same assessment with two major differences: First, only NFS was tested. Second, speech materials were amplified, and the rate of TCS was slower to obtain an estimate of rapid learning that is as free as possible from sensory effects.

Methods

Older adults with age-related hearing loss and young adults with normal hearing participated in this study. They were tested on indices of hearing, rapid perceptual learning, speech recognition, and cognition. However, given the known effects of age/hearing loss on some of these processes, test protocols differed in the two groups: First, we adapted the TCS task used to assess rapid learning such that slower speech rates were presented to older adults. Older adults were also given more trials on this task. This was done so that the assessments of the relationships between learning on one hand and speech recognition on the other are as clean as possible from the deleterious effects of age/hearing loss on rapid learning of TCS with this learning paradigm (e.g., Manheim et al., 2018). Second, young adults were tested on three conditions of challenging speech: two conditions of NFS and one condition of SIN. However, due to time constraints (both in testing individual participants and in the time devoted to pilot testing to establish appropriate signal-to-noise ratios [SNRs] for our stimuli in noise), in older adults, speech recognition was assessed with a single condition of NFS.

Participants

A total of 101 participants, 55 young adults ages 18 to 34 years (mean age: 24 ± 4) and 46 older adults ages 65 to 89 years (mean age: 75 ± 7), were recruited. Young adults were recruited through advertisements at academic institutions and on social media; older adults were recruited in audiology clinics, at hospitals, and in the community. Inclusion criteria for both younger and older adults were no prior experience with TCS and psychoacoustic testing, high proficiency in Hebrew, normal neurological status (based on self-report), and normal cognitive status (see later). Young adults were paid participants; older adults received either a pack of hearing aid batteries or a personal TV headset as a “thank you” gift. All young adults (20 males, 35 females) had normal hearing (mean four-frequency [0.5, 1, 2, and 4 kHz] pure-tone average [PTA] in both ears 5 ± 3 dB HL, range 1–12 dB), and all had high school education or higher (see later). The older adults all had age-related hearing loss (mean four-frequency PTA in both ears 50 ± 8 dB HL, range 30–70 dB; see Figure 1 for audiograms). The inclusion criterion for the suprathreshold recognition score was ≥ 60%; the actual range was 72% to 100%, and the mean score was 90% ± 7%. Cognitive status was measured with the Hebrew version of the Mini-Mental State Examination (MMSE; Folstein et al., 1975; inclusion criterion: ≥ 24; mean score: 28 ± 2). All but 4 of the older participants had high school education or higher (range 8 to 22 years, mean 14 ± 3). One participant was excluded from the study for not meeting the MMSE inclusion criterion (score ≥ 24).

Figure 1.

Mean Audiograms. Mean thresholds and standard deviations are shown; Older adults (OA) audiogram in full lines, young adults (YA) in dashed lines.

Twenty-three of the older participants had at least 1 year of experience with hearing aids, and 22 had similar hearing but no experience with amplification devices. The two subgroups were balanced in mean age, years of education, and MMSE score. Preliminary analysis suggested that (unaided) hearing, speech perception, learning, and cognition did not differ between the two subgroups. Therefore, for the purpose of the current report, data were collapsed across the two subgroups. Note that these data have been presented at the 2019 International Symposium on Auditory and Audiological Research, and a brief report has been published in the symposium’s proceedings (Rotman et al., 2020). Participants were compensated for their time. All aspects of the study were approved by the ethics committee of the Faculty of Social Welfare and Health Sciences at the University of Haifa (approval number 362/18).

Stimuli

Sixty-two simple Hebrew sentences, five to six words long (adapted from Prior & Bentin, 2006), with a common subject–verb–object grammatical structure were taken from a set of prerecorded Hebrew sentences. Half of the sentences were semantically plausible (e.g., “The young woman braids her long hair”), and the other half were semantically implausible (e.g., “The broken window plays an electric guitar,” which is implausible because windows do not play guitars). Note that plausibility was based on the semantic congruency of the content and not on predictability or grammatical agreement. The presentation order of plausible and implausible sentences was random throughout the different tasks. The sentences used for testing speech perception resembled those used for learning evaluation in their length, grammatical structure, semantic plausibility, and recording method and conditions. All sentences were recorded by three talkers (all female native speakers of Hebrew) in a sound-attenuating booth, sampled at 44 kHz using a built-in MacBook Air microphone, and converted to WAV format. The root-mean-square levels of the sentences were normalized after recording using Audacity audio software.

Speech Perception and Rapid Learning Tasks

Speech perception was evaluated with NFS (both groups) and with speech in 4-talker babble noise (young adults only). NFS was recorded by two different female native speakers of Hebrew. Talker 1 recorded 20 different sentences at an average rate of 221 words/min (SD = 23); Talker 2 recorded 10 sentences at an average rate of 189 words/min (SD = 30). In addition, 10 sentences were recorded by Talker 3 at 113 ± 12 words/min and mixed with 4-talker babble noise (SNR = −5 dB). This SNR was determined based on a previous study in which the same recordings were used with a different cohort of young adults (Banai & Lavie, 2020). Young adults were tested with NFS of the two talkers (1 and 2), 10 sentences each, and with 10 SIN sentences. Older adults were tested with 20 natural-fast sentences of Talker 1 (221 words/min). Talker 2 was added for young adults because pilot testing suggested that Talker 1 (despite being faster) was hardly challenging for young adults. We note that Talker 1 is considered more pleasant and clearer than Talker 2, perhaps because she is a clinical audiologist and is used to speaking clearly. Rapid learning was evaluated with a TCS task. Young adults listened to 10 sentences; older adults listened to 20 sentences. Sentences (recorded by Talker 3) were compressed to 30% of their duration for the young adults and to 50% of their duration for the older group. Time compression was applied in MATLAB, using a Waveform Similarity Overlap-Add (WSOLA) algorithm that modifies the rate while preserving other qualities of the speech stimuli, such as pitch and timbre (Verhelst & Roelands, 1993). Because recognition of TCS declines with age and hearing loss (e.g., Gordon-Salant & Fitzgibbons, 2001; Letowski & Poch, 1996; Wingfield et al., 1985), a fair assessment of learning requires giving older adults speech at a rate that is not so hard as to prohibit learning.
Therefore, based on our previous studies with similar sentences (e.g., Banai & Lavner, 2014; Manheim et al., 2018), speech was compressed to 30% of its natural duration for young adults and to 50% of its natural duration for older adults. In Manheim et al. (2018), we used a compression rate of 40% for older hearing-impaired adults. However, because in the current study the hearing losses of the participants were more severe, and the sample was somewhat older (on average), a small-scale pilot study (8 participants) was conducted to set the compression rate for the older hearing-impaired adults. The 40% rate yielded a floor effect, and the 50% rate was deemed adequate. In addition, the number of words per minute at the 50% compression rate matched the mean rate of the NFS presented to the older group (221 words/min). Note, however, that this adjustment resulted in substantial differences in baseline performance between younger and older adults (see Table 1), which is one of the reasons data were not directly compared across groups.
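
The relation between compression and speech rate can be illustrated with a short calculation. The sketch below is only an approximation under a stated assumption: it takes the 113 words/min reported for Talker 3's sentences in the SIN condition as the uncompressed rate of the TCS sentences, which the text does not state explicitly. The function name is hypothetical.

```python
def compressed_rate(words_per_min: float, duration_fraction: float) -> float:
    """Speech rate after a recording is shortened to `duration_fraction`
    of its original duration (e.g., 0.5 for 50% compression)."""
    if not 0 < duration_fraction <= 1:
        raise ValueError("duration_fraction must be in (0, 1]")
    # The same words are spoken in less time, so the rate scales by 1/fraction.
    return words_per_min / duration_fraction


# Assumption: Talker 3's uncompressed rate for the TCS sentences is taken to be
# the 113 words/min reported for her SIN sentences.
BASE_RATE_WPM = 113

older_rate = compressed_rate(BASE_RATE_WPM, 0.50)    # 50% of original duration
younger_rate = compressed_rate(BASE_RATE_WPM, 0.30)  # 30% of original duration

print(f"Older adults:   ~{older_rate:.0f} words/min")
print(f"Younger adults: ~{younger_rate:.0f} words/min")
```

Under this assumption, the 50% condition works out to roughly 226 words/min, close to the 221 words/min of Talker 1's NFS, whereas the 30% condition is considerably faster.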

Table 1.

Speech Recognition (Proportions Correct) in Older and Younger Adults.

	Older adults	Younger adults
NFS—Talker 1
M (SD) [95% CI]	0.36 (0.18) [0.31, 0.41]	0.93 (0.05) [0.92, 0.94]
Mdn (IQR)	0.34 (0.21–0.48)	0.93 (0.91–0.97)
NFS—Talker 2
M (SD) [95% CI]	–	0.84 (0.07) [0.82, 0.86]
Mdn (IQR)	–	0.85 (0.80–0.90)
SIN
M (SD) [95% CI]	–	0.62 (0.14) [0.59, 0.66]
Mdn (IQR)	–	0.67 (0.54–0.72)
TCS baseline
M (SD) [95% CI]	0.37 (0.22) [0.30, 0.44]	0.08 (0.09) [0.06, 0.11]
Mdn (IQR)	0.36 (0.18–0.52)	0.08 (0–0.1)

Note. NFS = natural-fast speech; CI = confidence interval; IQR = interquartile range; SIN = speech in noise; TCS = time-compressed speech.

Sentences were presented (to both ears) through Sennheiser HD-215 headphones (hearing aids were not used during stimulus presentation), at the most comfortable level (MCL) of each listener. The MCLs were determined in reference to the levels in dB SPL for a 1-kHz pure tone. The headphones’ output was matched to the individual MCL with a sound level meter. After hearing each sentence once, the younger participants were asked to write it down as accurately as possible. Participants in the older group were asked to repeat what they heard after each trial, and the experimenter transcribed their replies. Each sentence could be played only once, and no feedback was provided. Performance was scored off-line. All words, including function words, were counted for scoring. Homophonic spelling errors were accepted as correct, consistent with our previous studies (Banai & Lavner, 2014, 2016; Manheim et al., 2018). Other than this exception, words had to be perfectly reported to count as correct.

Cognitive Measures

A computerized version of the flanker task (Eriksen & Eriksen, 1974) was used as a measure of selective attention. The task was created in SuperLab, according to the parameters presented by Scharenborg et al. (2015). Participants were seated in front of a white computer screen, with headphones on, and were instructed to indicate (by pressing either the “z” or the “/” key on the keyboard) the direction in which the middle symbol (an arrow pointing left or right) in a row of five symbols pointed. There were three types of stimuli: congruent (>>>>> or <<<<<), incongruent (>><>> or <<><<), and neutral (==<== or ==>==). Each of the six sequences (two per type) was presented 12 times in a random order, for a total of 72 trials. Each trial started with a beep (a 400 Hz pure-tone signal) and a fixation cross that remained on the screen for 250 ms. Following the fixation cross, the stimulus was presented for 1,500 ms. Participants were instructed to respond in each trial as quickly and as accurately as possible. The intertrial time was 1,000 ms. Before beginning the test, six practice trials were presented. If needed, another six practice trials were given. The “flanker cost” for each participant was used for statistical analysis. The cost was calculated as the mean logRT (RT = reaction time in ms) of the correct responses in the incongruent trials divided by the mean logRT of the correct responses in the neutral trials. A higher flanker cost (>1) indicates poorer selective attention. Response accuracies were high (young adults: congruent trials: M = 98%, SD = 8; neutral trials: M = 96%, SD = 9; incongruent trials: M = 92%, SD = 11; older adults: congruent: M = 98%, SD = 8; neutral: M = 97%, SD = 9; incongruent: M = 87%, SD = 24). Mean response times in young adults were 1,026 ms (SD = 96) in congruent trials, 1,033 ms (SD = 91) in neutral trials, and 1,118 ms (SD = 98) in incongruent trials.
In older adults, mean response times were 1,199 ms (SD = 112) in congruent trials, 1,199 ms (SD = 105) in neutral trials, and 1,296 ms (SD = 243) in incongruent trials. Three subtests from the Wechsler Adult Intelligence Scale-III in Hebrew were administered: Working memory was tested with the Digit Span (forward and backward) subtest; the Vocabulary subtest was used to evaluate participants’ semantic knowledge and verbal concept formation in Hebrew; and the Block Design subtest was used to evaluate nonverbal reasoning. Administration and scoring followed the test manual; scaled scores are reported later.
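
The flanker cost described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the function name is hypothetical, and the text does not state which logarithm base was used. Because the cost is a ratio of two mean logs, the base cancels out, so the natural log is used here.

```python
import math
from statistics import mean


def flanker_cost(incongruent_rts_ms, neutral_rts_ms):
    """Mean log RT of correct incongruent trials divided by mean log RT of
    correct neutral trials. Both inputs should hold RTs (in ms) of correct
    responses only. Values above 1 mean slower responses on incongruent trials."""
    mean_log_incongruent = mean([math.log(rt) for rt in incongruent_rts_ms])
    mean_log_neutral = mean([math.log(rt) for rt in neutral_rts_ms])
    return mean_log_incongruent / mean_log_neutral


# Hypothetical single-participant data (correct trials only), for illustration.
incongruent = [1100.0, 1150.0, 1120.0, 1090.0]
neutral = [1020.0, 1040.0, 1035.0, 1010.0]

cost = flanker_cost(incongruent, neutral)
print(f"Flanker cost: {cost:.3f}")  # slightly above 1 for these made-up RTs
```

Because RTs of roughly 1,000 ms have log values near 7, even a sizeable RT difference moves the ratio only slightly above 1, which may explain why the costs in this kind of measure cluster tightly around 1.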

Study Administration and Schedule

Young adults were tested at the Auditory Cognition Lab at the University of Haifa or at their home, according to their preference. Older adults were tested in a quiet room at an audiology clinic or at their home, according to their preference. After a short explanation of the experimental procedure, participants were asked to sign a written informed consent form and fill in a short background questionnaire. Older adults participated in one session (with frequent breaks, as needed) in which MMSE and hearing were tested, followed by the speech and cognitive tasks. Younger adults participated in two sessions: In one session, the speech and learning assessments were completed, followed by the cognitive tasks. Hearing was tested in another session, in which participants also transcribed two time-compressed sentences recorded by Talker 3; performance on these sentences was used as the baseline for statistical analysis. Although ideally baseline recognition of TCS should have been estimated prior to the estimation of learning, here this assessment in young adults was conducted as part of the hearing assessment, which took place either before or after the learning assessment. We nevertheless believe that this choice had minimal influence on the findings because, even with substantially longer learning experience, improvement in the recognition of new sentences produced by a different talker is minimal (e.g., Manheim et al., 2018).

Preliminary Data Analysis

Missing Data and Outliers

One older adult failed to meet the inclusion criteria (scored lower than 24 on the MMSE) and was excluded from data analysis. Therefore, analyses of this group are based on the 45 remaining participants. Five younger adults had missing scores on one of the cognitive tests; therefore, sample size for the prediction models of young adults was 50. Otherwise, all data (including outlying data points) were included in data analysis.

Speech Perception

The number of correct words per sentence per condition was counted for each participant, and the proportion of correctly recognized words out of the total was used for data presentation and statistical modeling. Baseline recognition of TCS was defined as performance with two sentences recorded by Talker 3. These sentences were measured as part of the hearing screening in young adults. For older adults, the first two time-compressed sentences were used. The semantic plausibility of the sentences could have influenced performance, especially in older adults (e.g., Sheldon et al., 2008). Therefore, the recognition of plausible and implausible sentences was first evaluated separately in older adults. Recognition of the two types of sentences did not differ significantly for either NFS (plausible sentences: M = 0.36, SD = 0.18, Mdn = 0.35, interquartile range [IQR]: 0.20–0.48; implausible sentences: M = 0.36, SD = 0.18, Mdn = 0.31, IQR: 0.24–0.45, Wilcoxon signed-rank test: W = 495, p = .60, Bayesian comparison: BF10 = 0.13) or TCS (plausible sentences: M = 0.45, SD = 0.25, Mdn = 0.47, IQR: 0.25–0.64; implausible sentences: M = 0.48, SD = 0.24, Mdn = 0.44, IQR: 0.28–0.67, Wilcoxon signed-rank test: W = 349, p = .97, Bayesian comparison: BF10 = 0.055). Consequently, analyses in the results section were conducted with no further consideration of semantic plausibility. Younger adults are supposed to be less influenced by semantic plausibility (Sheldon et al., 2008). In addition, in the current study, young adults were tested with fewer sentences than older adults, so we did not separate plausible and implausible sentences.

Rapid Learning

Learning was defined as the rate of improvement in the recognition of TCS over time. It was quantified as the linear slope of the learning curve over all TCS sentences. Calculation of the slope was based on the linear fit between performance (the proportion of words correctly recognized) in miniblocks of two sentences (approximately 10 words) and the number of the miniblock. Thus, a slope of 0.1 suggests that recognition improved by 10 percentage points per miniblock (approximately 1 word per miniblock or 0.5 words/sentence). The decision to use a linear slope was based on our previous studies of TCS learning (e.g., Banai & Lavner, 2014). Because we aimed to dissociate between recognition and rapid learning and had no separate baseline estimate for older adults, we used their first two sentences as the baseline for data modeling; therefore, for this part of the analysis, slopes were based on 18 sentences only.
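
The slope calculation can be sketched as an ordinary least-squares fit of miniblock accuracy on miniblock number. The code below is an illustration under assumptions rather than the authors' script: the function name and the data are hypothetical, and pooling words across the two sentences of a miniblock (rather than averaging per-sentence proportions) is one plausible reading of the description.

```python
def learning_slope(words_correct, words_total, block_size=2):
    """Least-squares slope of proportion correct per miniblock on miniblock number.

    words_correct / words_total: per-sentence counts, in presentation order.
    block_size: sentences per miniblock (two in the description above).
    Assumption: words are pooled within each miniblock before taking the proportion.
    """
    proportions = []
    for start in range(0, len(words_correct), block_size):
        correct = sum(words_correct[start:start + block_size])
        total = sum(words_total[start:start + block_size])
        proportions.append(correct / total)

    blocks = list(range(1, len(proportions) + 1))
    x_mean = sum(blocks) / len(blocks)
    y_mean = sum(proportions) / len(proportions)
    # Slope = covariance(x, y) / variance(x)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(blocks, proportions))
    denominator = sum((x - x_mean) ** 2 for x in blocks)
    return numerator / denominator


# Hypothetical listener: 10 five-word sentences, gaining about one word per miniblock.
correct_per_sentence = [0, 1, 1, 1, 1, 2, 2, 2, 2, 3]
total_per_sentence = [5] * 10

slope = learning_slope(correct_per_sentence, total_per_sentence)
print(f"Learning slope: {slope:.3f} per miniblock")  # approximately 0.1
```

In this made-up example, miniblock accuracy rises from 0.1 to 0.5 in steps of 0.1, so the fitted slope is about 0.1, matching the interpretation of one additional word per miniblock given in the text.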

Data Availability

Raw data and audio demos can be requested by emailing the corresponding author.

Results

Speech Recognition and Cognition in Younger and Older Participants

Initial exploration of speech perception in the two groups was based on averaged performance across all sentences in a given condition per participant. As shown in Table 1, older adults found the recognition of NFS presented by Talker 1 quite challenging. Although Talker 1 was hardly challenging for younger adults, their performance was challenged to a greater extent by the other talker as well as in the SIN task, as shown in the right column of Table 1. Table 2 shows descriptive statistics for the cognitive indices used in the study as well as PTA hearing thresholds that were used as an index of hearing in subsequent analyses. For the tasks taken from the Wechsler Adult Intelligence Scale, scaled scores in both groups fell well within the normal range as described in the test manual.

Table 2.

Estimates of Hearing and Cognition.

	Older adults	Younger adults
Hearing (PTA4, dB)
M (SD)	50 (8)	5 (3)
Mdn (IQR)	49 (44–56)	5 (3–7)
Vocabulary (scaled score)
M (SD)	9 (2)	12 (2)
Mdn (IQR)	9 (7–11)	12 (11–13)
Digit span (scaled score)
M (SD)	9 (2)	10 (2)
Mdn (IQR)	9 (7–11)	10 (8–12)
Block design (scaled score)
M (SD)	9 (2)	12 (2)
Mdn (IQR)	9 (7–12)	12 (10–14)
Flanker cost
M (SD)	0.99 (0.15)	1.00 (0.03)
Mdn (IQR)	1.01 (1.008–1.016)	1.01 (1.002–1.014)

Note. PTA = pure-tone average; IQR = interquartile range.

As expected from the literature, speech perception was correlated with some cognitive indices (see Table 3 for Pearson correlations). On the other hand, perceptual learning slopes were not correlated with any of the cognitive measures (the highest correlation was r = .2, which was not significant). Nor were they associated with hearing thresholds (r = −.2 in young adults and r = .002 in older adults, neither significant). The correlations between baseline recognition of TCS and the dependent variables led us to include this baseline in the statistical models described later.

Table 3.

Correlations Between Speech Recognition in Younger and Older Adults and Cognition and Learning.

	NFS 1		NFS 2	SIN
	OA	YA	YA	YA
Hearing	−0.50	−0.03	−0.04	0.1
TCS baseline	0.82	0.09	0.26	0.32
Vocabulary	0.47	0.30	0.40	0.35
Working memory	0.44	0.11	0.15	0.27
Block design	0.11	0.14	−0.09	0.27
Attention	−0.02	0.13	0.02	0.17
Slope	0.27	0.47	0.54	0.49

Note. Pearson correlations are shown. NFS 1 = natural-fast speech, Talker 1; NFS 2 = natural-fast speech, Talker 2; SIN = speech in noise; OA = older adults; YA = young adults; Hearing = average PTA; TCS baseline = average of first two sentences of time-compressed speech. Vocabulary, working memory, and block design—scaled scores from the corresponding tests; Attention = cost in the flanker task; Slope = rapid perceptual learning slope. Note that all correlations between rapid learning slopes and the cognitive variables were low (r < .19).

Rapid Perceptual Learning of TCS

To determine whether rapid learning has occurred over the course of a brief encounter with TCS, we tracked recognition accuracy over the course of 10 sentences for young adults and 20 sentences for older adults. Subsequently, the slopes of the learning curves were calculated as explained in the methods. Figure 2 (left column) shows average accuracies of the first five sentences and final five sentences encountered by each group. In young adults, mean recognition accuracies improved from 0.09 in the first 5 sentences to 0.27 in sentences 6 to 10. The group learning slope (M = .065, SD = 0.06, 95% confidence interval [CI] [0.049, 0.080], Figure 2 midcolumn) was significantly positive with a large effect size (t(54) = 8.3, p < .001, Cohen’s d = 1.12 with a 95% CI of 0.78 to 1.45). In older adults, accuracies improved from 0.35 in the first 5 sentences to 0.45 in the next group of 5 and to 0.56 in sentences 16 to 20. Given that compressed speech rates were slower in older adults, it is not surprising that recognition accuracies were quite good. Yet, initial recognition was still low enough to afford improvements. With 20 sentences, learning was significant with a large effect size (M = .026, SD = 0.021, 95% CI [0.019, 0.032], t(44) = 8.11, p < .001, Cohen’s d = 1.21 with a 95% CI of 0.82 to 1.59). These data are consistent with our previous findings in individuals with milder hearing loss (Manheim et al., 2018) and suggest that although learning is present in older adults, it is nevertheless slower than in younger adults with normal hearing. Therefore, in daily situations where older and younger adults are facing similar conditions, perceptual adjustment is expected to be reduced in older adults.
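As a rough illustration of the quantities reported above, the sketch below fits a least-squares slope to one listener's per-sentence accuracies and relates the one-sample t statistic to Cohen's d. The slope procedure described in the methods is not reproduced here, so the ordinary least-squares fit, the function names, and the listener data are all assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(accuracies):
    """Least-squares slope of accuracy against sentence position (1, 2, ...)."""
    n = len(accuracies)
    xs = range(1, n + 1)
    x_bar = (n + 1) / 2
    y_bar = mean(accuracies)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, accuracies))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def one_sample_t(values):
    """Return (t, d) for a test of mean(values) against zero."""
    d = mean(values) / stdev(values)  # Cohen's d for a one-sample design
    return d * sqrt(len(values)), d


def d_from_t(t, n):
    """Recover one-sample Cohen's d from a t statistic and sample size."""
    return t / sqrt(n)


# Hypothetical listener: proportion correct on 10 consecutive TCS sentences.
listener = [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5]
print("slope per sentence:", round(ols_slope(listener), 3))

# Reported t values and group sizes (df + 1) from the text above.
print("young adults d:", round(d_from_t(8.3, 55), 2))   # 1.12
print("older adults d:", round(d_from_t(8.11, 45), 2))  # 1.21
```

Under these assumptions, the reported effect sizes follow from the reported t values via d = t / sqrt(n), which is a quick consistency check on the group statistics.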

Figure 2.

Rapid Learning of Time-Compressed Speech in Younger (Top Row) and Older (Bottom Row) Adults. The leftmost panel in each row shows performance (averaged over blocks of five sentences, not including baseline sentences; thin gray lines mark individual participants). Slopes of the learning curves (see text for details; individual data are shown with gray symbols) are shown in the middle panel of each row. Boxes mark the interquartile range (25th–75th percentile). The thick line within each box marks the median (blue for young adults and green for older adults). + signs mark outlying data points. The rightmost panel on each line shows performance on the final set of five sentences versus performance on the first set. Dashed diagonal lines mark y = x; thus, all symbols above this diagonal indicate learning. The dashed thicker lines (blue and green) show the linear fit between final and initial performance. Note that statistical analysis was based on the models described later, and fits are shown for demonstration only.

Speech Recognition Versus Rapid Perceptual Learning

As shown in Figure 3, speech recognition was associated with perceptual learning (assessed as slope over 10 sentences in young adults and 20 sentences in older adults) in both older and younger adults. Speech recognition in older adults was evaluated with NFS of one talker and in young adults with NFS of two talkers and with SIN.

Figure 3.

Speech Recognition Versus Rapid Learning. Proportions correct are plotted against the rapid learning slopes for younger (top row) and older (bottom row) adults. Left to right: natural-fast speech (NFS) produced by Talker 1, NFS produced by Talker 2, and speech-in-noise (SIN). Dashed lines show linear fits. Note that for the purpose of visual demonstration only, values of the learning slopes were adjusted to partial out the contribution of baseline recognition of time-compressed speech to the observed correlation (for details, see Manheim et al., 2018). Therefore, values on the x axis do not match the learning slopes shown in Figure 2.
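One common way to partial out a covariate for plotting is to regress the variable of interest on the covariate and keep the residuals. The sketch below shows that idea for learning slopes and baseline TCS recognition. The exact adjustment used for Figure 3 is described in Manheim et al. (2018) and may differ; the function name and the data here are hypothetical.

```python
from statistics import mean


def residualize(y, x):
    """Residuals of y after removing its linear dependence on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx          # slope of y on x
    a = y_bar - b * x_bar  # intercept
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: baseline TCS accuracy and learning slope for 6 listeners.
baseline = [0.05, 0.10, 0.20, 0.25, 0.40, 0.50]
slope = [0.09, 0.07, 0.08, 0.05, 0.04, 0.03]

adjusted = residualize(slope, baseline)
print([round(v, 4) for v in adjusted])
```

By construction the adjusted values are uncorrelated with baseline and centered on zero, which is why an adjusted x axis would not match the raw slopes.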

Modeling Speech Recognition as a Function of Hearing, Cognition, and Rapid Learning

To further account for the potential contribution of rapid learning to speech recognition, we modeled speech recognition as a function of rapid learning as well as other measures previously suggested in the literature to correlate with speech recognition (hearing thresholds, vocabulary, working memory, and attention). A series of generalized linear mixed models was run using the lme4 package (Bates et al., 2015) in R (R Core Team, 2019). Random effects included intercepts for participant and sentence; predictors were scaled prior to exporting the raw data to R. Following the recommendations of different sources about the analysis of proportions (e.g., Chen et al., 2017; Dunn & Smyth, 2018), we used binomial regressions with logistic link functions. Three models were constructed for each index of speech recognition: a “null” model with hearing, vocabulary, working memory, and attention; a “baseline speech” model that included the predictors of the “null” model as well as baseline recognition of TCS; and a “full” model that, in addition to the predictors of the previous models, also included the rapid learning slope. Because rapid learning in the current study was assessed using a speech recognition task, and different indices of speech recognition can be correlated (see Table 3), the “baseline speech” models were intended to account for the associations among different indices of speech recognition. The “null,” “baseline speech,” and “full” models were then compared to isolate the unique contribution of learning. Given the differences in data collection and analysis described so far, we did not compare the models of older and younger adults directly.
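The nesting of the three models can be made concrete by writing out their fixed-effect sets. The sketch below builds lme4-style formula strings for each model. The column names (correct, hearing, tcs_baseline, and so on) are hypothetical, and the exact model calls used in the study are not shown in the text, so this illustrates only the structure described above.

```python
# Hypothetical column names; the study's actual variable names are not given.
COVARIATES = ["hearing", "vocabulary", "working_memory", "attention"]

# Random intercepts for participant and sentence, in lme4 formula syntax.
RANDOM_EFFECTS = "(1 | participant) + (1 | sentence)"

MODELS = {
    "null": COVARIATES,
    "baseline speech": COVARIATES + ["tcs_baseline"],
    "full": COVARIATES + ["tcs_baseline", "learning_slope"],
}


def formula(fixed_effects, outcome="correct"):
    """Build an lme4-style formula string for a binomial mixed model."""
    return f"{outcome} ~ " + " + ".join(fixed_effects) + " + " + RANDOM_EFFECTS


for name, fixed in MODELS.items():
    print(f"{name}: {formula(fixed)}")
```

Each model adds exactly one fixed effect to the previous one, so each successive comparison isolates a single predictor: baseline TCS recognition first, then the rapid learning slope.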

Recognition of NFS

Table 4 shows the full model for NFS produced by Talker 1 for the two groups. In this model, vocabulary and rapid learning were significant predictors of natural speech recognition in young adults. Rapid learning contributed about half a word for every 1 SD increase in learning. The “baseline speech” model in which hearing, cognition, and initial recognition of TCS were included was not significantly different from the “null” model (χ2 = 0.18). On the other hand, the full model explained the data significantly better than the “baseline speech” model (χ2 = 17.31, p < .001), suggesting that rapid learning had a unique contribution to speech recognition, beyond that of the other variables.
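Because each model in the sequence adds a single fixed effect, the comparisons above can be read as likelihood-ratio tests with 1 degree of freedom. The sketch below converts a χ2 value into a p value under that assumption; the degrees of freedom are inferred from the model descriptions rather than stated in the text, and the function name is illustrative.

```python
from math import erfc, sqrt


def lrt_p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For X ~ chi-square(1), P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(chi2 / 2))


# Comparisons reported for Talker 1 in young adults.
for label, chi2 in [("baseline speech vs. null", 0.18),
                    ("full vs. baseline speech", 17.31)]:
    print(f"{label}: chi2 = {chi2}, p = {lrt_p_value_df1(chi2):.5f}")
```

Under this reading, χ2 = 0.18 is far from significance, whereas χ2 = 17.31 falls well below p = .001, consistent with the pattern described above.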

Table 4.

Estimates of Natural-Fast Speech Recognition (Talker 1) Based on the “Full” Model.

	Younger adults		Older adults
Predictor	β (SE)	Z	β (SE)	Z
Hearing	0.0148 (0.0995)	0.15	−0.1568 (0.0520)	−3.01**
Vocabulary	0.1993 (0.0931)	2.14*	0.1910 (0.0585)	3.50**
Working memory	0.0386 (0.0937)	0.41	0.0725 (0.0497)	1.46
Attention	0.0584 (0.1076)	0.54	0.0747 (0.0469)	1.59
TCS baseline	−0.15 (0.1100)	−1.41	0.6498 (0.0578)	11.23***
Rapid learning	0.5170 (0.1183)	4.37***	0.2118 (0.0469)	4.52***

Note. TCS = time-compressed speech.

*p < .05. **p < .01. ***p < .001.
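The β values in Table 4 come from binomial models with a logistic link and scaled predictors, so each one is a change in log-odds per 1 SD of the predictor. The sketch below converts a coefficient to an odds ratio and shows the implied shift in recognition probability from a starting probability. The starting probability of .5 is a hypothetical value chosen for illustration, not a quantity taken from the study.

```python
from math import exp, log


def odds_ratio(beta):
    """Multiplicative change in odds for a 1 SD increase in the predictor."""
    return exp(beta)


def shifted_probability(p0, beta):
    """Probability after adding beta to the log-odds of starting probability p0."""
    logit = log(p0 / (1 - p0)) + beta
    return 1 / (1 + exp(-logit))


# Rapid learning coefficients from Table 4.
for group, beta in [("younger adults", 0.5170), ("older adults", 0.2118)]:
    p1 = shifted_probability(0.5, beta)  # 0.5 is an assumed starting point
    print(f"{group}: OR = {odds_ratio(beta):.2f}, p goes from 0.50 to {p1:.2f}")
```

The size of the probability shift depends on the starting point: the same coefficient moves probabilities less when recognition is already near floor or ceiling.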

In older adults, in the “full” model, hearing, vocabulary, initial TCS recognition, and rapid learning were all significant predictors of NFS recognition. In this group, model comparison suggests that the “baseline speech” model was significantly better than the “null” model (χ2 = 52.89, p < .001), and that the “full” model was in turn significantly better than the “baseline speech” model (χ2 = 16.83, p < .001). This suggests that while in older adults with hearing loss the strongest predictor (largest beta) of NFS recognition was recognition of a somewhat different type of distorted speech (TCS), learning was also a significant predictor, and the magnitude of its contribution was similar to those of hearing and vocabulary.

In young adults, modeling NFS of the second talker (slower but harder to recognize) resulted in similar outcomes, with vocabulary and rapid learning as significant predictors (see Table 5). Rapid learning improved recognition by about 6% for every 1 SD increase and vocabulary by about 4%. Model comparison suggested that the “baseline speech” model predicted the data better than the “null” model (χ2 = 4.33, p < .05); the “full” model that included rapid learning fitted the data better than the “baseline speech” model (χ2 = 15.90, p < .001).

Table 5.

Estimates of Natural-Fast Speech (Talker 2) and Speech in Noise in Younger Adults Based on the “Full” Model.

	NFS (Talker 2)		SIN
Predictor	β (SE)	Z	β (SE)	Z
Hearing	0.0159 (0.0778)	0.21	0.1475 (0.0921)	1.60
Vocabulary	0.2273 (0.0735)	3.09**	0.2589 (0.0877)	2.95**
Working memory	0.0667 (0.0735)	0.91	0.2388 (0.0880)	2.71**
Attention	−0.0242 (0.0805)	−0.30	0.1104 (0.0940)	1.21
TCS baseline	0.0534 (0.0858)	0.62	0.1248 (0.0982)	1.27
Rapid learning	0.3696 (0.0880)	4.20***	0.4080 (0.0984)	4.15***

Note. NFS = natural-fast speech; SIN = speech in noise; TCS = time-compressed speech.

**p < .01. ***p < .001.

Recognition of SIN in Young Adults

As shown in Table 5, rapid learning, vocabulary, and working memory were all significant predictors of SIN recognition in the full model. Rapid learning (despite being assessed with a different speech task) improved the recognition of SIN by approximately 0.4 words/1 SD increase in learning. Vocabulary and working memory improved recognition by more than 0.2 words/1 SD increase. The model that included baseline recognition of TCS fitted the data better than the model with only hearing and cognitive variables (χ2 = 7.07, p < .01), but the model that included learning was a better fit than the “baseline speech” model (χ2 = 14.86, p < .001). Under the conditions of the current study, SIN was the hardest speech task for young adults. Therefore, this model demonstrates that dynamic factors such as working memory and learning may come to play a greater role in speech recognition when listeners find it particularly challenging.

Discussion

We investigated the contribution of rapid perceptual learning to speech recognition under adverse conditions. Even after accounting for hearing levels, baseline recognition of TCS, vocabulary, memory, and attention, rapid learning slopes remained significant predictors of speech recognition in both older and younger adults. Other factors also predicted the recognition of degraded speech: Vocabulary significantly predicted the recognition of NFS in the two groups. In older adults with age-related hearing loss, hearing and baseline recognition of TCS also related to the recognition of NFS. Working memory emerged as a significant predictor only for SIN (which was not tested in older adults). Although these findings are correlational and thus do not speak of causality, they at the very least fail to disprove the hypothesis that rapid perceptual learning is one of the factors that support speech recognition under adverse conditions. Two aspects of the findings are noteworthy: First, rapid perceptual learning of TCS was associated with the recognition of other types of speech (NFS and SIN), which in the case of young adults were all produced by different talkers. In both age groups, learning remained a significant predictor even after the correlations between recognition of NFS and TCS were accounted for. Therefore, while we cannot disentangle the causal direction from the current design, we can nevertheless conclude that prediction models that include learning better fit speech recognition data than models that do not include learning. Second, although we did not statistically compare prediction models between groups or speech conditions, the structures of the best fitting models seem qualitatively different across groups and speech tasks (in young adults), consistent with other findings that different types of adverse conditions may result in the recruitment of different processes to support speech recognition (Bent et al., 2016; McLaughlin et al., 2018). 
The major outcome of this study is that independent assessment of perceptual learning remains a significant predictor of perception in both younger and older adults on top of the variance explained by other cognitive processes and hearing thresholds. These findings replicate our previous observations that perceptual learning of TCS is associated with the recognition of NFS in younger and older adults with normal hearing as well as in older adults with age-related hearing loss (Manheim et al., 2018) and extend them to older adults with more severe age-related hearing loss. They also show that the contribution of learning is not entirely mediated by other processes that were already known to be predictive of speech recognition under adverse conditions (e.g., vocabulary). In young adults, the current data further show that the association between rapid perceptual learning and the recognition of degraded speech might extend across different types of acoustic degradation. As discussed later, this association is consistent with the notion that rapid perceptual learning is an individual capacity that supports listening under challenging acoustic conditions. However, because we used the same talker to assess both rapid perceptual learning and SIN recognition, an alternative interpretation is that the association reflects talker familiarity. Previous studies indeed show that under challenging conditions, speech presented by familiar talkers is easier to recognize than speech presented by unfamiliar ones (e.g., Johnsrude et al., 2013; Nygaard & Pisoni, 1998). In the current study, listeners who were better able to learn the characteristics of the talker during the rapid learning phase were also better able to use those characteristics to support recognition of speech presented by the same talker in noise. 
Although this was not the case in a previous study in which the association between learning and the recognition of NFS did not depend on talker familiarity (Manheim et al., 2018), it could be that effects of talker familiarity require longer to emerge. Indeed, effects of talker familiarity were previously documented for highly familiar talkers such as spouses or college professors (Johnsrude et al., 2013; Newman & Evers, 2007) or after intensive training (Kreitewolf et al., 2017; Nygaard & Pisoni, 1998), whereas in the current study, previous experience with the familiar talker was limited to ten sentences. That learning might be a capacity involved in speech perception under adverse conditions is consistent with the notion that dynamic processes (variously termed perceptual learning, adaptation, adjustment, and recalibration) are recruited to support speech perception when the received signal is insufficient to automatically match existing lexical representations due to either listener or environmental challenges. This is by no means a new notion (Ahissar et al., 2009; Bidelman et al., 2019; Davis & Johnsrude, 2007; Guediche et al., 2014; Ronnberg et al., 2013), but our findings imply that this dynamic process could be an individual characteristic that is not specific to any single condition. Furthermore, given the across-listeners differences in learning, it seems that different listeners greatly differ in their ability to employ it in service of their speech recognition. Finally, findings such as ours may help bridge the gap between the hallmark specificity of perceptual learning (Eisner & McQueen, 2005; Green et al., 2018) and its potential role in speech perception under ecological conditions (Samuel & Kraljic, 2009). If learning is recruited online whenever adverse conditions are encountered, specificity should not matter for individuals with good learning, whereas individuals with poor learning will be at a constant disadvantage. 
Furthermore, if this is the case, this could be one of the reasons why speech recognition in some adverse conditions (such as the presence of competition from other talkers or babble noise) is so challenging, because an inspection of the learning data reported in previous studies suggests that perceptual learning in noise may be slower to emerge than the learning of highly time-compressed speech (Burk et al., 2006; Karawani et al., 2016; Schlueter et al., 2016). Consistent with previous findings (Manheim et al., 2018; Schlueter et al., 2016), in the current study, learning was present but seemed reduced in individuals with age-related hearing loss, even though their initial performance was quite good. Specifically, we found that both learning slopes and effect sizes were numerically smaller in older adults despite their having more favorable starting conditions and more trials to learn. Nevertheless, as in young adults, learning and vocabulary were still significant predictors of NFS recognition. Furthermore, despite the restricted range of hearing deficits in this group (see Figure 1), both hearing thresholds and baseline recognition of TCS were also significant predictors. The contribution of learning can thus be separated from the contribution of sensory and perceptual challenges to speech perception that are associated with age-related hearing loss (e.g., Benichov et al., 2012; Gordon-Salant & Fitzgibbons, 1997; Humes & Christopherson, 1991; Humes & Roberts, 1990; Pichora-Fuller et al., 1995), captured here by baseline recognition of TCS. Nevertheless, under environmental conditions, with no special allowances and no extra time, older adults with hearing loss are doubly challenged by both sensory and cognitive declines and declines in rapid learning. 
In the current data, the positive contributions of perceptual learning and vocabulary were similar in size to the negative contribution of hearing loss, perhaps providing another reason for why individual older adults with similar audiograms are so different in their perceptual profiles. Listeners who maintain good learning into old age might be better able to compensate for adversity by employing rapid learning, whereas those with poor learning are less able to do so. Although tests in the current study were certainly audible, all listeners were tested in unaided conditions, in which sensory factors associated with the hearing loss could have been more dominant than if assessment had been conducted with hearing aids. Therefore, further studies are required to determine whether prediction models change when listeners are using their hearing aids and whether rapid learning is associated with adaptation to new hearing aids and/or cochlear implants. Further studies are also needed to determine why learning declines with age and whether the course of decline can be affected by various interventions. The current outcomes about the associations between individual differences in speech recognition and cognitive and linguistic factors are generally in line with those of previous studies (Banks et al., 2015; Bent et al., 2016; McLaughlin et al., 2018). They show that different cognitive processes might have differential contributions to speech recognition under adverse conditions. Specifically, whereas young adults’ vocabulary associated with the recognition of both NFS and speech-in-noise, working memory associated with the recognition of speech-in-noise only. Likewise, in older adults, vocabulary contributed to the account of individual differences in the recognition of NFS, but the contribution of the other cognitive variables was insignificant. 
Although the interpretation of insignificant outcomes is complicated, we are not the first to report a lack of association between speech recognition on one hand and working memory or attention on the other. For example, in a study on individual differences in the recognition of different types of distorted speech, vocabulary consistently predicted performance across the different challenging conditions, whereas working memory predicted only some of the conditions (McLaughlin et al., 2018). Likewise, consistent with the current findings, flanker costs did not relate to the intelligibility of the distorted speech conditions used by Bent et al. (2016). The authors of these studies suggested that different abilities may help overcome particular types of deviation from canonical speech and that certain challenges may afford the recruitment of specific cognitive resources. Therefore, for the present findings, one possibility is that NFS was not sufficiently challenging to engage working memory in young adults (see Table 1), but this cannot explain why working memory and NFS recognition were not associated in older adults. Alternatively, the fast speech rates used in the current study could have resulted in a mismatch between the representation of the incoming stimuli and available linguistic representations. According to the Ease of Language Understanding Model, when such a mismatch occurs, the input fails to automatically activate existing long-term representations, and additional working memory resources are recruited in an attempt to resolve the mismatch (e.g., Ronnberg et al., 2019). However, rapid speech rates may have resulted in a mismatch that was too large to overcome, in which case recruiting working memory would have been unhelpful. In contrast, in noise the signal is masked, resulting in uncertainty about the correct match, in which case working memory could support speech recognition, leading to the correlation we observed. 
Finally, this study had a number of limitations. First and foremost, our design makes it impossible to disentangle cause and effect, and it could be that poorer speech processing interferes with perceptual learning. We nevertheless note that had this been the case, larger learning slopes should have been observed in older and not in younger adults in the current study, because, given the levels of time compression we selected, young adults started out with substantially poorer recognition of TCS. Yet, despite poorer starting performance, young adults had larger learning slopes. Second, as explained earlier, there were a number of differences between the test conditions of older and younger adults which make a direct statistical comparison between the groups impossible. Therefore, additional studies in which a single prediction model can be attempted for all participants regardless of age and hearing status and with a larger range of challenging speech conditions are required. Third, sensory factors that could contribute to speech recognition (other than hearing thresholds) were not included in the current study. Instead, we relied on the baseline estimate of TCS recognition as a statistical control. As such factors could contribute to rapid auditory learning, they should be explored in future studies. Nevertheless, we believe these limitations are not sufficient to undermine the conclusion that rapid speech learning could be one of the factors contributing to individual variability in speech recognition under adverse conditions.

67 in total

1. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician.

Authors: M F Folstein; S E Folstein; P R McHugh
Journal: J Psychiatr Res Date: 1975-11 Impact factor: 4.791

2. Perceptual learning of time-compressed and natural fast speech.

Authors: Patti Adank; Esther Janse
Journal: J Acoust Soc Am Date: 2009-11 Impact factor: 1.840

3. Selected cognitive factors and speech recognition performance among young and elderly listeners.

Authors: S Gordon-Salant; P J Fitzgibbons
Journal: J Speech Lang Hear Res Date: 1997-04 Impact factor: 2.297

4. Afferent-efferent connectivity between auditory brainstem and cortex accounts for poorer speech-in-noise comprehension in older adults.

Authors: Gavin M Bidelman; Caitlin N Price; Dawei Shen; Stephen R Arnott; Claude Alain
Journal: Hear Res Date: 2019-08-27 Impact factor: 3.208

5. Individual Differences in Distributional Learning for Speech: What's Ideal for Ideal Observers?

Authors: Rachel M Theodore; Nicholas R Monto; Stephen Graham
Journal: J Speech Lang Hear Res Date: 2019-12-16 Impact factor: 2.297

6. Speech-in-speech perception, nonverbal selective attention, and musical training.

Authors: Adam Tierney; Stuart Rosen; Fred Dick
Journal: J Exp Psychol Learn Mem Cogn Date: 2019-10-03 Impact factor: 3.051

7. Auditory Perceptual Learning in Adults with and without Age-Related Hearing Loss.

Authors: Hanin Karawani; Tali Bitan; Joseph Attias; Karen Banai
Journal: Front Psychol Date: 2016-02-03

8. Implicit Talker Training Improves Comprehension of Auditory Speech in Noise.

Authors: Jens Kreitewolf; Samuel R Mathias; Katharina von Kriegstein
Journal: Front Psychol Date: 2017-09-14

9. Individual differences in selective attention predict speech identification at a cocktail party.

Authors: Daniel Oberfeld; Felicitas Klöckner-Nowotny
Journal: Elife Date: 2016-08-31 Impact factor: 8.140

10. Speech perception under adverse conditions: insights from behavioral, computational, and neuroscience research.

Authors: Sara Guediche; Sheila E Blumstein; Julie A Fiez; Lori L Holt
Journal: Front Syst Neurosci Date: 2014-01-03
