
Age-Related Temporal Processing Deficits in Word Segments in Adult Cochlear-Implant Users.

Zilong Xie¹, Casey R Gaskins¹, Maureen J Shader¹, Sandra Gordon-Salant¹, Samira Anderson¹, Matthew J Goupell¹.

Abstract

Aging may limit speech understanding outcomes in cochlear-implant (CI) users. Here, we examined age-related declines in auditory temporal processing as a potential mechanism that underlies speech understanding deficits associated with aging in CI users. Auditory temporal processing was assessed with a categorization task for the words dish and ditch (i.e., identify each token as the word dish or ditch) on a continuum of speech tokens with varying silence duration (0 to 60 ms) prior to the final fricative. In Experiments 1 and 2, younger CI (YCI), middle-aged CI (MCI), and older CI (OCI) users participated in the categorization task across a range of presentation levels (25 to 85 dB). Relative to YCI, OCI required longer silence durations to identify ditch and exhibited reduced ability to distinguish the words dish and ditch (shallower slopes in the categorization function). Critically, we observed age-related performance differences only at higher presentation levels. This contrasted with findings from normal-hearing listeners in Experiment 3 that demonstrated age-related performance differences independent of presentation level. In summary, aging in CI users appears to degrade the ability to utilize brief temporal cues in word identification, particularly at high levels. Age-specific CI programming may potentially improve clinical outcomes for speech understanding performance by older CI listeners.


Keywords: aging; cochlear implant; presentation level; temporal processing

Year: 2019 PMID: 31808373 PMCID: PMC6900735 DOI: 10.1177/2331216519886688

Source DB: PubMed Journal: Trends Hear ISSN: 2331-2165 Impact factor: 3.293

Introduction

Time-varying information contains important cues for speech perception (Rosen, 1992). For example, the duration of a silent interval can influence the identification of final fricative–affricate contrasts, such that listeners’ perception of a word can change from a fricative (e.g., dish) to an affricate (e.g., ditch) as the silence duration increases (Dorman, Raphael, & Isenberg, 1980). Even when spectral information is severely degraded, such as in the case of listening through a cochlear implant (CI), temporal cues can provide relatively robust information to support listeners’ recognition of consonants, vowels, words, and sentences (Friesen, Shannon, Baskent, & Wang, 2001; Shannon, Zeng, Kamath, Wygonski, & Ekelid, 1995).
As people age, their speech understanding abilities tend to decrease, particularly in adverse conditions such as in the presence of background noise (e.g., Dubno, Dirks, & Morgan, 1984; Frisina & Frisina, 1997) or when the target speech signals are distorted by time compression, reverberation, and interruption (e.g., Gordon-Salant & Fitzgibbons, 1993). One potential mechanism underlying these speech understanding difficulties is a decline in the ability to process temporal properties of sounds with advancing age (Füllgrabe, Moore, & Stone, 2015; Gordon-Salant, Fitzgibbons, & Yeni-Komshian, 2011; Schneider & Pichora-Fuller, 2001). Age-related declines in temporal processing abilities have been documented across a variety of tasks in individuals with normal hearing (NH) and hearing impairment (HI) (Fitzgibbons & Gordon-Salant, 1995; Gordon-Salant & Fitzgibbons, 1999; Gordon-Salant, Yeni-Komshian, & Fitzgibbons, 2008; Gordon-Salant, Yeni-Komshian, Fitzgibbons, & Barrett, 2006; Humes, Kewley-Port, Fogerty, & Kinney, 2010; Pichora-Fuller & Souza, 2003; Roque, Gaskins, Goupell, Anderson, & Gordon-Salant, 2019; Snell, 1997; Strouse, Ashmead, Ohde, & Grantham, 1998). For example, Gordon-Salant et al. (2008) showed that older adults exhibited reduced ability to identify and discriminate word contrasts based on a variety of temporal duration cues such as dish–ditch (silence duration) and buy–pie (voice onset time).
Mechanistically, the temporal processing deficits associated with aging are likely to result from encoding differences in the auditory system. For example, animal studies have shown that aging is associated with the loss of spiral ganglion cells in the auditory periphery (Makary, Shin, Kujawa, Liberman, & Merchant, 2011; Otte, Schuknecht, & Kerr, 1978; Sergeyenko, Lall, Liberman, & Kujawa, 2013), which may limit neural synchronization to temporal features in the acoustic inputs (Lopez-Poveda, 2014; Lopez-Poveda & Barrios, 2013). Both animal and human studies have demonstrated that aging is associated with a deterioration in the encoding of temporal cues at the midbrain level (e.g., Roque et al., 2019; Walton, Frisina, & O’Neill, 1998) as well as at the cortical level (e.g., Hughes, Turner, Parrish, & Caspary, 2010; Roque et al., 2019; Tremblay, Piskosz, & Souza, 2003; Willott, 1991).
The use of a CI as an intervention for hearing difficulties is increasing among the aging population. Both retrospective and prospective evidence indicates that aging may be a factor limiting speech understanding performance in CI users (Blamey et al., 2013; Holden et al., 2013; Kim et al., 2018; Shader et al., 2019; Sladen & Zappler, 2015). In this study, we examined age-related auditory temporal processing deficits as a potential mechanism that underlies the ostensible speech understanding deficits associated with aging in CI users.
The investigation of auditory temporal processing ability seems an appropriate focus considering that: (a) the CI transmits mainly temporal envelope cues (Loizou, 2006), (b) temporal modulation processing ability has been shown to be correlated with speech understanding outcomes in CI users (Fu, 2002), and (c) CI listeners appear to rely mostly on temporal cues to categorize speech sounds (Winn, Chatterjee, & Idsardi, 2012).
To date, there is limited direct evidence regarding age-related temporal processing differences in CI users. Recently, Goupell et al. (2017) utilized vocoded word stimuli to simulate listening through CI processors and examined age-related differences in the ability to use temporal cues to categorize word segments (i.e., a dish–ditch continuum with varying duration of a silent interval before the fricative noise). They found that older NH listeners needed longer silence durations to discriminate the word contrasts compared with younger NH listeners, suggesting age-related temporal processing deficits with spectrally degraded speech signals. Similar evidence has demonstrated that aging may decrease older NH listeners’ ability to use temporal envelope cues of vocoded stimuli for tasks such as phoneme recognition (Schvartz, Chatterjee, & Gordon-Salant, 2008), fundamental frequency discrimination (Schvartz-Leyzac & Chatterjee, 2015), and gender identification (Schvartz & Chatterjee, 2012).
The purpose of this study was to examine age-related changes in temporal processing across younger (YCI; <45 years old), middle-aged (MCI; between 45 and 64 years old), and older (OCI; >64 years old) adult CI listeners. Temporal processing ability was quantified as the performance on a speech categorization task based on silence duration cues (dish–ditch continuum; Gordon-Salant et al., 2006, 2008; Goupell et al., 2017; Roque et al., 2019).
The rationale for investigating speech categorization is that this type of task may target a specific problem in speech understanding with a CI (Winn, Won, & Moon, 2016). Particularly, performance on the speech categorization task may be more closely related to speech understanding performance than traditional psychophysical tasks that target temporal processing (e.g., temporal modulation detection; Winn et al., 2016). We hypothesized that the ability to categorize dish versus ditch would diminish with increasing age.

Experiment 1: Processing of Silence Duration Cues in Word Segments at a Fixed Presentation Level in CI Listeners

Methods

Participants

Three groups of adult CI users participated in this experiment: seven YCI (7 ears; 21.0 to 42.5 years, mean age and standard deviation [SD] = 30.5 ± 9.3), 19 MCI (19 ears; 45.5 to 63.6 years, mean age and SD = 56.2 ± 5.3), and 12 OCI (12 ears; 65.4 to 81.0 years, mean age and SD = 72.4 ± 5.0). All participants were native speakers of American English. We screened participants with the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005) to ensure normal or near-normal cognitive function (≥22 out of 30 possible points) (Cecato, Martinelli, Izbicki, Yassuda, & Aprahamian, 2016; Dupuis et al., 2015). The MoCA data were missing for 1 YCI (CCE), 4 MCI (CBI, CBM, CCC, and CCH), and 2 OCI (CAN and CBY). Details of their demographic information are provided in Table 1. The effect of age group was not significantly different for duration of deafness, Kruskal–Wallis, χ2(2) = 1.198, p = .549, or duration of CI use, Kruskal–Wallis χ2(2) = 5.160, p = .076. Written informed consent was obtained from all participants. All materials and procedures were approved by the Institutional Review Board at the University of Maryland. All participants received monetary compensation for their participation.

Table 1.

Demographic Information for the CI Participants.

Experiment no.	Stimulus presentation method	Age-group	Subject code	Sex	Age (years)	CI ear	DoD (years)	CI use duration (years)	CI processor	Etiology
1, 2	DAI	YCI	CAR	M	24.0	Right	1	6.0	Cochlear-Freedom	Genetic
2	DAI	YCI	CAT	M	28.7	Left	12	6.9	Cochlear-Freedom	Unknown
1, 2	DAI	YCI	CAT	M	27.8	Right	8	9.6	Cochlear-Freedom	Genetic
1, 2	DAI	YCI	CBP	F	36.1	Left	3	16.1	Cochlear-Nucleus 5	Unknown
1, 2	Headphone	YCI	CBQ	M	21.0	Right	3	18.0	Advanced Bionics-Harmony	Genetic
1, 2	Headphone	YCI	CBU	M	41.2	Left	39	1.1	Advanced Bionics-Naida	Unknown
1, 2	DAI	YCI	CBW	M	42.5	Right	1	16.3	Cochlear-Nucleus 6	Cogan's Syndrome
1, 2	DAI	YCI	CCE	F	21.0	Right	2	19.0	Cochlear-Nucleus 6	Unknown
2	DAI	YCI	CCM	F	35.2	Left	<1	2.2	MED-EL-Opus2	Ototoxicity
2	DAI	YCI	CCM	F	35.2	Right	<1	1.2	MED-EL-Sonnet	Ototoxicity
2	DAI	YCI	CCV	M	31.5	Right	<1	29.5	Cochlear-Nucleus 6	Bacterial meningitis
1, 2	DAI	MCI	CAJ	F	63.6	Right	47	16.0	Cochlear-Nucleus 5	Genetic
2	DAI	MCI	CAQ	F	57.7	Right	17	0.7	Cochlear-Nucleus 6	Ménière disease
1, 2	DAI	MCI	CAQ	F	57.7	Left	17	0.7	Cochlear-Nucleus 6	Ménière disease
2	DAI	MCI	CAW	M	53.4	Right	17	4.0	Cochlear-Nucleus 5	Unknown
1, 2	DAI	MCI	CAW	M	53.4	Left	2	7.1	Cochlear-Nucleus 5	Unknown
1, 2	DAI	MCI	CAX	M	53.3	Left	<1	4.3	Cochlear-Nucleus 5	Unknown
2	DAI	MCI	CAY	F	59.0	Left	5	6.0	Cochlear-Nucleus 5	Unknown
1, 2	DAI	MCI	CAY	F	57.6	Right	<1	9.6	Cochlear-Nucleus 5	Unknown
1, 2	DAI	MCI	CBA	F	54.1	Left	31	2.3	Cochlear-Nucleus 5	Stickler's Syndrome
1, 2	DAI	MCI	CBF	M	57.1	Left	5	5.3	Cochlear-Nucleus 6	Hereditary
1, 2	DAI	MCI	CBG	F	61.2	Right	2	3.9	Cochlear-Nucleus 5	Rh factor
1	DAI	MCI	CBH	F	61.4	Right	13	3.4	Cochlear-Nucleus 5	Nerve Damage
1, 2	DAI	MCI	CBI	M	56.1	Left	<1	1.5	MED-EL-Opus2	Unknown
1, 2	DAI	MCI	CBJ	F	52.7	Right	33	2.7	MED-EL-Opus2	Genetic
1, 2	DAI	MCI	CBK	F	56.0	Left	5	6.0	Cochlear-Freedom	Unknown
1, 2	Headphone	MCI	CBM	F	50.5	Left	13	4.5	Advanced Bionics-Harmony	Unknown
1, 2	DAI	MCI	CBN	M	50.4	Right	2	8.4	Cochlear-Nucleus 5	Rubella
1, 2	DAI	MCI	CBR	F	62.1	Left	54	2.1	Cochlear-Nucleus 6	Premature birth
1, 2	DAI	MCI	CBV	F	62.8	Left	11	3.8	Cochlear-Nucleus 5	Enlarged Vestibular Aqueduct Syndrome
1, 2	Headphone	MCI	CCC	F	63.0	Right	5	8.0	Advanced Bionics-Naida Q70	Unknown
1	Headphone	MCI	CCD	M	45.5	Right	24	21.6	Cochlear-Nucleus 6	Unknown
1	Headphone	MCI	CCH	M	48.7	Right	0	<1	Advanced Bionics-Naida Q90	Unknown
2	DAI	MCI	CCL	F	51.7	Left	<1	7.7	Cochlear-Nucleus 6	Ménière disease
2	DAI	MCI	CCL	F	51.7	Right	2	1.7	Cochlear-Nucleus 6	Ménière disease
2	DAI	OCI	CAB	F	71.3	Left	36	19.9	Cochlear-Nucleus 5	Unknown
2	DAI	OCI	CAB	F	71.3	Right	44	13.1	Cochlear-Nucleus 5	Unknown
2	DAI	OCI	CAD	M	76.7	Left	3	13.6	Cochlear-Nucleus 6	Unknown
2	DAI	OCI	CAD	M	76.7	Right	7	8.5	Cochlear-Nucleus 5	Unknown
2	Headphone	OCI	CAK	M	69.9	Left	22	12.9	Cochlear-Nucleus 6	Sinus surgery
1, 2	DAI	OCI	CAK	M	68.6	Right	<1	10.5	Cochlear-Nucleus 6	Unknown
1, 2	DAI	OCI	CAM	F	70.5	Right	6	4.5	Cochlear-Nucleus 6	Unknown
1, 2	DAI	OCI	CAN	F	75.4	Right	?	9.8	Cochlear-Freedom	Unknown
1, 2	DAI	OCI	CAO	F	70.1	Right	5	5.1	Cochlear-Nucleus 5	Measles
1, 2	DAI	OCI	CBB	M	81.0	Right	2	2.0	Cochlear-Nucleus 5	Sudden SNHL
2	DAI	OCI	CBC	F	77.8	Right	17	1.8	Cochlear-Nucleus 6	Hereditary, measles
1, 2	DAI	OCI	CBC	F	76.8	Left	<1	2.8	Cochlear-Nucleus 6	Hereditary, measles
1, 2	DAI	OCI	CBD	M	79.3	Right	<1	5.3	Cochlear-Freedom	Unknown
1	Headphone	OCI	CBS	F	65.8	Right	15	0.6	Advanced Bionics-Naida	Unknown
1, 2	DAI	OCI	CBT	F	73.9	Right	11	2.9	Cochlear-Nucleus 5	Unknown
1, 2	Headphone	OCI	CBY	M	69.5	Right	12	15.5	Advanced Bionics-Naida	Unknown
2	DAI	OCI	CCA	M	75.4	Right	1	4.3	Cochlear-Nucleus 6	Antibiotics, aging
2	DAI	OCI	CCA	M	75.4	Left	61	2.0	Cochlear-Nucleus 6	Measles
2	DAI	OCI	CCI	F	65.4	Right	8	3.4	Cochlear-Nucleus 6	Otosclerosis, measles, mumps, chicken pox
1, 2	DAI	OCI	CCI	F	65.4	Left	2	14.4	Cochlear-Nucleus 6	Otosclerosis, measles, mumps, chicken pox
2	Headphone	OCI	CCJ	M	72.5	Right	4	19.0	Advanced Bionics-Harmony	Unknown
1, 2	Headphone	OCI	CCJ	M	72.5	Left	15	7.8	Advanced Bionics-Harmony	Unknown
2	DAI	OCI	CDG	M	73.0	Left	<1	5.0	Cochlear-Nucleus 6	Virus

Note. DoD = duration of deafness; DAI = direct audio input; YCI = younger cochlear-implant users; MCI = middle-aged cochlear-implant users; OCI = older cochlear-implant users; CI = cochlear-implant; ? = Unknown.


Stimuli

Stimuli for the current experiment have been described in previous studies (Gordon-Salant et al., 2006; Goupell et al., 2017). The stimuli consisted of a seven-step continuum of speech tokens that varied the silence duration so that the endpoints were perceived as the words “dish” and “ditch.” The continuum was created as follows: An adult American male speaker produced the words dish and ditch in isolation. A hybrid ditch was then created by replacing the burst and frication in the ditch token with the frication portion of the dish token. The silence duration in the hybrid ditch was varied parametrically over seven equal 10-ms steps from 60 to 0 ms, resulting in a stimulus set that spanned a perceptual continuum from ditch (60-ms silence duration) to dish (0-ms silence duration). Figure 1 displays the waveforms and spectrograms of the endpoint stimuli (i.e., 0- and 60-ms silence duration) from the continuum.

Figure 1.

Spectrograms (a; top row) and waveforms (b; bottom row) for the endpoint stimuli from the stimulus continuum. Left: The 0-ms silence duration stimulus, perceived as dish. Right: The 60-ms silence duration stimulus, perceived as ditch. Black triangles on the left panels indicate the onset of the frication portion. Black triangles on the right panels indicate the silence period before the frication portion.

Design

This experiment consisted of 280 trials (7 stimuli × 40 repetitions) that were usually divided into four blocks of 70 trials each (7 stimuli × 10 repetitions). For two participants, the trials were divided into more than four blocks (5–6 blocks) to reduce fatigue. The order of the trials was randomized in each block, which was different for each listener. Participants could take short breaks between blocks. The stimulus intensities were adjusted for individual participants so that the sound was set at a level that participants reported to be comfortably soft while maintaining good discrimination between the endpoint dish–ditch stimuli (i.e., 0- and 60-ms silence duration). In other words, participants chose a level at which they thought they could best discriminate the two words.
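The trial structure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB scripts; the function name `build_blocks` and the `seed` argument are hypothetical.

```python
import random

# Seven-step continuum: silence durations of 0 to 60 ms in 10-ms steps.
SILENCE_DURATIONS_MS = [0, 10, 20, 30, 40, 50, 60]


def build_blocks(n_blocks=4, reps_per_block=10, seed=None):
    """Return n_blocks lists of silence durations, each shuffled independently.

    With the defaults, each block holds 7 stimuli x 10 repetitions = 70 trials,
    so four blocks give the 280 trials of Experiment 1.
    """
    rng = random.Random(seed)  # hypothetical: one seed per listener
    blocks = []
    for _ in range(n_blocks):
        block = SILENCE_DURATIONS_MS * reps_per_block
        rng.shuffle(block)
        blocks.append(block)
    return blocks


blocks = build_blocks(seed=1)
total_trials = sum(len(block) for block in blocks)
```

Using a different seed for each listener would give each listener a different randomized order within every block.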

Procedure

Participants were tested individually in a sound-attenuating booth (Industrial Acoustics, Inc., Bronx, NY). The participant’s task was to respond whether they heard “dish” or “ditch” after each stimulus presentation. Participants self-initiated each trial by clicking a box reading “Begin Trial” on the screen. The stimulus, randomly selected from the seven-step continuum, was then presented. Participants responded by clicking a box on the left or right side of the screen, corresponding to “dish” and “ditch,” respectively. Participants were encouraged to guess if they were unsure and had unlimited time to respond. After their response, the task moved on to the next trial. No feedback was provided. Before testing, participants received a training task, wherein they were required to identify the endpoint stimuli, 0- and 60-ms silence duration, as dish or ditch, respectively. The trial procedure was identical to the main experiment except that participants were provided with correct answer feedback after each response. Ten repetitions of each stimulus were presented. An accuracy of at least 90% was required before proceeding to the main experiment. All participants established stable performance and achieved the 90% goal within 15 mins, and most achieved near-perfect word identification performance on the training task. Stimulus presentation and response collection were controlled with custom scripts in MATLAB (MathWorks, Natick, MA). The stimuli were presented through participants’ clinical processors with their everyday settings as set by their audiologists. We chose direct audio input (DAI) to directly deliver stimuli to the processor monaurally (see Table 1). Note that newer generations of CI sound processors are moving away from the inclusion of DAI. When we encountered participants without the ability to use DAI, headphones (Sennheiser HD650s) were used to deliver stimuli to the processor. 
The testing (including training and breaks) was usually completed within 1 to 1.5 hr.

Psychometric function analysis

The percentage of dish responses along the dish–ditch continuum was calculated. A logistic function implemented in the psignifit toolbox (Wichmann & Hill, 2001) in MATLAB (MathWorks, Natick, MA) was used to fit the psychometric function for the percentage data of dish responses from each condition for each listener. Two metrics were calculated from the psychometric function: 50% crossover point and slope. The 50% crossover point was quantified as the silence duration (in ms) that corresponds to 50% of dish responses. In some cases, no crossover point was found between 0- and 60-ms silence duration. In such cases, the curve fit was extrapolated out of the range of 0- to 60-ms and then the crossover point was estimated. Crossover points were adjusted to be in the range of 0 to 70 ms, such that values larger than 70 ms or smaller than 0 ms were set to be 70 ms or 0 ms, respectively. The slope was quantified as the maximum slope that occurred over the entire function or the percentage change in dish responses per unit change in silence duration (%/ms). The upper limit of slope considered appropriate was set at 7.5%/ms because that would be a single step change from 100% to 0% dish responses (Goupell et al., 2017). If the slope of the function was backward, such that longer silence duration was associated with more dish responses, we considered the output metric values (crossover point and slope) to be inappropriate. In addition, if the data from a given condition (e.g., uniform 100% or 0% dish responses along the dish–ditch continuum) failed to be fitted by the logistic function, we also considered those data inappropriate. For both scenarios, we applied the following procedures to set the metric values: If the percentage of dish responses for any step along the dish–ditch continuum was not higher than 50%, this means that a 50% crossover point may not occur even at the shortest possible silence duration (i.e., 0 ms). 
In such cases, we set the crossover point to be 0 ms. Otherwise, we set it to be 70 ms. The slope was set to be 0%/ms for both of these patterns of data. In the current experiment, data from all participants were successfully fitted by the logistic function and the fitted functions were in the appropriate direction such that longer silence durations were associated with fewer dish responses.
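The rules above for deriving the crossover point and slope can be illustrated with a minimal Python sketch. It is not the psignifit procedure used in the study: it assumes a plain two-parameter logistic function without lapse or guess rates, fitted by Newton–Raphson maximum likelihood. The function names and the example counts are hypothetical, and the fallback rule reflects one reading of the text (crossover of 0 ms when no step exceeds 50% dish responses, 70 ms otherwise).

```python
import math


def fit_logistic(x, k, n, max_iter=100, tol=1e-8):
    """Fit P(dish | x) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    x: silence durations (ms); k: dish counts; n: trials per step.
    Returns (b0, b1), or None if the fit does not converge.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ki, ni in zip(x, k, n):
            z = max(-700.0, min(700.0, b0 + b1 * xi))  # avoid exp overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = ni * p * (1.0 - p)
            r = ki - ni * p
            g0 += r
            g1 += xi * r
            h00 += w
            h01 += xi * w
            h11 += xi * xi * w
        det = h00 * h11 - h01 * h01
        if det < 1e-12:  # flat likelihood, e.g., uniform 100% or 0% responses
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            return b0, b1
    return None


def summarize(x, k, n):
    """Return (crossover in ms, slope in %/ms) following the rules in the text."""
    pct_dish = [100.0 * ki / ni for ki, ni in zip(k, n)]
    fit = fit_logistic(x, k, n)
    if fit is None or fit[1] >= 0:  # failed fit or backward function
        crossover = 0.0 if max(pct_dish) <= 50.0 else 70.0
        return crossover, 0.0
    b0, b1 = fit
    crossover = min(70.0, max(0.0, -b0 / b1))  # clamp to 0-70 ms
    slope = min(7.5, 100.0 * abs(b1) / 4.0)    # maximum slope, capped at 7.5 %/ms
    return crossover, slope


# Hypothetical listener: dish responses out of 40 trials at each step.
durations = [0, 10, 20, 30, 40, 50, 60]
trials = [40] * 7
crossover, slope = summarize(durations, [39, 38, 34, 22, 9, 3, 1], trials)
```

For a logistic function, the steepest point lies at the 50% crossover, where the slope equals |b1|/4 in proportion per ms, which is why the sketch reports 100·|b1|/4 in %/ms.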

Statistical analysis

We performed separate Kruskal–Wallis tests on the crossover point and slope data to examine the aging effect, with age group (YCI, MCI, or OCI) as the between-subjects factor. The crossover point and slope data from one MCI participant (CAW) consisted of 30 out of 40 repetitions for each stimulus due to computer errors, but were included in the analysis. We also performed similar analyses to examine the aging effect on the demographic data (duration of deafness and duration of CI use). Missing (unknown) values in the demographic data were excluded and values specified as “<1” were treated as 1. We applied the same procedures to analyze the demographic data for Experiment 2.
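As an illustration of this analysis, the sketch below computes a tie-corrected Kruskal–Wallis H statistic for three groups in plain Python and applies the stated handling of demographic values (unknown entries dropped, "<1" treated as 1). The function names and the example values are hypothetical and are not the study data. The p value uses the closed form exp(−H/2), which holds only for the chi-square distribution with 2 degrees of freedom, that is, for exactly three groups.

```python
import math


def clean_durations(raw):
    """Drop unknown values ('?') and treat '<1' as 1, as described in the text."""
    cleaned = []
    for value in raw:
        if value == "?":
            continue
        cleaned.append(1.0 if value == "<1" else float(value))
    return cleaned


def kruskal_wallis_3(groups):
    """Return (H, p) for exactly three groups, with midranks and tie correction."""
    if len(groups) != 3:
        raise ValueError("closed-form p value is valid only for three groups")
    pooled = sorted((v, g) for g, values in enumerate(groups) for v in values)
    n_total = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        rs * rs / len(values) for rs, values in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    h /= 1.0 - tie_term / (n_total ** 3 - n_total)
    p = math.exp(-h / 2.0)  # chi-square survival function for df = 2
    return h, p


# Hypothetical values, for illustration only.
h_stat, p_value = kruskal_wallis_3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
example_clean = clean_durations(["<1", "?", "12"])
```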

Results

Figure 2(a) displays the percentage of dish responses as a function of silence duration. Responses at the continuum endpoints indicate that all three age groups (YCI, MCI, and OCI) accurately discriminated the words dish and ditch. Performance was comparable across groups at each silence duration. Figure 2(b) and (c) show mean crossover points and slopes of the performance functions across the three groups, respectively. The effect of age group was not statistically significant for the crossover point, Kruskal–Wallis χ2(2) = 2.608, p = .272, or the slope, Kruskal–Wallis χ2(2) = 1.853, p = .396. Therefore, while Experiment 1 showed that CI listeners could discriminate the words dish and ditch, it failed to reveal age-related temporal processing deficits.

Figure 2.

Results for YCI (blue/squares), MCI (green/circles), and OCI (red/triangles) groups at a fixed presentation level in Experiment 1. (a) Mean percentage of trials that participants reported dish responses for the dish–ditch continuum. The continuum consisted of seven stimuli with the silence duration parametrically varied from 0 to 60 ms. Error bars denote ±1 standard deviation. (b) and (c) Mean crossover point and slope of the psychometric functions. Error bars denote 95% confidence intervals. YCI = younger cochlear-implant users; MCI = middle-aged cochlear-implant users; OCI = older cochlear-implant users.

Discussion

This experiment examined age-related changes in the processing of temporal cues (silence duration) in word segments in CI users. The results failed to reveal age-related differences in auditory temporal processing. The null findings from the current experiment warrant further investigation considering: (a) the vast literature on age-related temporal processing deficits in NH and HI listeners (Fitzgibbons & Gordon-Salant, 1995; Gordon-Salant & Fitzgibbons, 1999; Gordon-Salant et al., 2006, 2008; Humes et al., 2010; Pichora-Fuller & Souza, 2003; Roque et al., 2019; Snell, 1997; Strouse et al., 1998); and (b) the current experiment may lack statistical power to reveal age group differences due to a relatively small sample size for YCI (n = 7) listeners. The finding of this experiment is in contrast to Goupell et al. (2017) who demonstrated age-related temporal processing deficits in word segments in NH listeners. They recruited younger and older NH listeners to complete a categorization task on the dish–ditch continuum, which was similar to the current experiment. To simulate hearing through CI processors, the stimuli were vocoded using tonal carriers. They found later crossover points (i.e., longer silence duration) and shallower slopes with the vocoded stimuli for older than younger NH listeners. Regarding the findings of age-related temporal processing differences, the discrepancy between the current experiment and prior work may be a result of stimulus level differences. Here, stimulus intensities were set at a level that participants reported to be comfortably soft while maintaining good perceived discrimination between the endpoint dish–ditch stimuli. The levels used in the current experiment may be lower than those from prior studies (≥ 65 dB SPL) that revealed age-related temporal processing differences with a similar categorization task on the dish–ditch continuum (Gordon-Salant et al., 2006, 2008; Goupell et al., 2017; Roque et al., 2019). 
Therefore, in Experiment 2, we manipulated stimulus presentation levels and further explored age-related changes in the processing of temporal cues in word segments in CI listeners.

Experiment 2: Processing of Silence Duration Cues in Word Segments as a Function of Presentation Level in CI Listeners

Nine YCI (11 ears; 21.0 to 42.5 years, mean age and SD = 31.3 ± 7.5), 17 MCI (21 ears; 50.4 to 63.6 years, mean age and SD = 56.4 ± 4.3), and 15 OCI (22 ears; 65.4 to 81.0 years, mean age and SD = 73.1 ± 4.2) listeners took part in the current experiment. Among these participants, 13 participants were tested separately in both ears (Table 1). While the behavioral performance of the two ears in one individual cannot be completely independent, it can nonetheless be distinguishable given differing characteristics of the electrode–neuron interface, and therefore we assumed that they could be treated independently. Seven YCI (7 ears), 16 MCI (16 ears), and 11 OCI (11 ears) also participated in Experiment 1. All participants were native speakers of American English and were screened with the MoCA, similar to Experiment 1. The MoCA data were missing for 1 YCI (CCE), 3 MCI (CBI, CBM, and CCC), and 2 OCI (CAN and CBY). Their demographic information is provided in Table 1. The effect of age group was not significantly different for duration of deafness, Kruskal–Wallis χ2(2) = 2.825, p = .244, or duration of CI use, Kruskal–Wallis χ2(2) = 5.296, p = .071. Stimuli were identical to those for Experiment 1. The stimulus presentation levels were parametrically changed between nominal values of 25 and 85 dB in equal steps of 10 dB. We approximated that these level values reflect the stimulus levels that participants actually received in dB SPL. However, it is not possible to confirm such an approximation. This was because for the stimulation methods adopted here (DAI or headphone), we were not able to calibrate the stimulus levels received by the sound processor. We used participants’ clinical sound processors with their everyday settings and the levels that they normally used. Hence, their processors determined the actual levels of the electrical stimulation. 
We therefore report level in this experiment without a true reference because a reference is not possible in our experimental setup. Because the lower intensities (e.g., 25 dB) may be below participants’ thresholds, we selected the lowest presentation level individually: before the experiment, we asked each participant to identify the level (among the seven possible levels) at which they could just hear the sample word dish. Table 2 lists the proportion of participants/ears per age group at each presentation level. We conducted 3 (age group: YCI, MCI, and OCI) × 2 (participants/ears that could hear the sample word dish: Yes or No) χ2 tests of independence. The number of participants/ears that could versus could not hear the sample word dish did not differ significantly across age groups at 25 dB, χ2(2) = 3.254, p = .197, but differed significantly across groups at 35 dB, χ2(2) = 6.716, p = .035. Post hoc analysis showed that significantly fewer MCI than YCI or OCI participants/ears could hear the sample word dish at 35 dB (p = .01, uncorrected).
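The χ2 tests of independence can be reproduced approximately with a short Python sketch. The counts below are not given in the text; they are inferred here from the percentages in Table 2 and the numbers of ears per group (11 YCI, 21 MCI, 22 OCI), so they should be read as an assumption. The function name is hypothetical, and the p value again relies on the closed form for 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: YCI, MCI, OCI; columns: could hear "dish" (yes), could not (no).
# Counts inferred from Table 2 percentages (assumption, not reported directly).
table_25_db = [[1, 10], [3, 18], [0, 22]]
table_35_db = [[10, 1], [12, 9], [19, 3]]

chi2_25, df_25 = chi_square_independence(table_25_db)
chi2_35, df_35 = chi_square_independence(table_35_db)
p_25 = math.exp(-chi2_25 / 2.0)  # valid because df = 2
p_35 = math.exp(-chi2_35 / 2.0)
```

With these inferred counts, the statistics agree with the reported values to rounding (about 3.25 at 25 dB and 6.72 at 35 dB).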

Table 2.

Proportion of Participants/Ears Per Age Group at Each Presentation Level in Experiment 2.

Age-group	25 dB	35 dB	45 dB	55 dB	65 dB	75 dB	85 dB
YCI	9.09	90.91 (10;10)	100	100	100	100	100
MCI	14.29	57.14 (16.67;0)	95.24	100 (4.76;0)	100 (9.52;0)	100 (4.76;0)	100 (4.76;19.05)
OCI	0	86.36 (10.53;0)	100	100	100	100 (4.55;4.55)	100 (22.73;0)

Note. Number(s) in parentheses represent the proportion of participants/ears (relative to the total number of participants/ears who completed the task at that level for that age group) that were fitted by a logistic function in a backward direction (first number) and that failed to be fitted by the logistic function (second number). Zeros were included in the parentheses for illustration purposes. YCI = younger cochlear-implant users; MCI = middle-aged cochlear-implant users; OCI = older cochlear-implant users. The following data were fitted by logistic functions in the backward direction: (a) 35 dB: one YCI (CBQ-right ear), two MCI (CAY-right ear, CBK-left ear), and two OCI (CAO-right ear, CBB-right ear); (b) 55 dB: one MCI (CAW-left ear); (c) 65 dB: two MCI (CAW-left ear, CAW-right ear); (d) 75 dB: one MCI (CBN-right ear) and one OCI (CBT-right ear); and (e) 85 dB: one MCI (CBN-right ear). The following data failed to be fitted by the logistic functions: (a) 35 dB: one YCI (CCM-right ear); (b) 75 dB: one OCI (CBC-left ear); and (c) 85 dB: four MCI (CAQ-right ear, CAY-left ear, CAY-right ear, CBA-left ear) and five OCI (CAD-right ear, CBB, CBC-left ear, CBC-right ear, CBT-right ear).

Each presentation level consisted of 140 trials (7 stimuli × 20 repetitions). Trials from all levels were mixed together and divided into four blocks. Each block was composed of 35 trials (7 stimuli × 5 repetitions) at each presentation level. The trial order was randomized for each block in each listener. For five participants/ears, the trials were divided into more than four blocks (5, 7, or 10 blocks) to reduce fatigue. Participants could take short breaks between blocks. The testing (including training and breaks) was usually completed within 2 hr. The experimental procedures were identical to those detailed in Experiment 1.
All participants achieved near-perfect word identification performance on the training task. The training task was administered at 65 dB. The same psychometric function analysis as detailed in Experiment 1 was applied to the percentage of dish responses along the dish–ditch continuum calculated from each condition for each listener. Table 2 shows the proportion of participants/ears that were fitted by a logistic function in a backward direction or that failed to be fitted by the logistic function. To ensure balanced numbers of participants at each condition, we focused the statistical analysis on presentation levels from 45 to 85 dB. Data from four ears (CAD-left ear, CBC-right ear, CBU-right ear, and CCJ-left ear) consisted of 15 to 19 repetitions of each stimulus due to computer errors but were included in the analysis. We also reanalyzed the data after converting presentation levels to sensation levels. Please refer to the Online Appendix A for details. As shown in Table 1, the two ears of 13 participants were tested separately. The results from each ear (see Online Appendix B) were treated as independent observations consistent with previous studies (e.g., Bierer & Litvak, 2016; Donaldson, Rogers, Johnson, & Oh, 2015). Linear mixed-effects modeling implemented via the lme4 package (Bates, Maechler, Bolker, & Walker, 2014) in R version 3.5.1 (R Core Team, 2013) was used to fit the data of crossover point and slope, respectively. In the model, age group (YCI, MCI, or OCI) and presentation level (45, 55, 65, 75, or 85 dB) were included as the fixed effects. By-participant/ear intercept was included as a random effect to account for baseline performance differences for all the ears across participants. In that sense, two ears from the same participant were allowed to have different performance baselines. Both fixed-effects factors were treated as categorical variables. 
We systematically removed fixed effects that did not contribute significantly to the model (p > .05) to reduce the risk of overfitting the data by comparing each simpler model to the more complex model using the likelihood ratio test (Baayen, Davidson, & Bates, 2008). We present results from the simplest, best-fitting model in the results section. Significance values for the fixed effects in the optimal model were computed using the analysis of variance function in the lmerTest package (Kuznetsova, Brockhoff, & Christensen, 2017). Post hoc analysis for significant fixed effects, if necessary, was carried out using the lsmeans function of the lsmeans package (Lenth, 2016). Multiple comparisons were corrected by controlling false discovery rate (Benjamini & Hochberg, 1995). Descriptive statistics, if reported, represent mean ± SD. Figure 3(a) displays the percentage of dish responses as a function of silence duration. While the three age groups (YCI, MCI, and OCI) were able to discriminate the words dish and ditch, the MCI and OCI groups had longer crossover durations and shallower slopes (i.e., shallower curves for the percentage of dish responses as a function of silence duration) than the YCI listeners at higher presentation levels but not at lower levels.
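The false discovery rate correction cited above (Benjamini & Hochberg, 1995) can be sketched as follows. This is an illustrative pure-Python version of the step-up adjustment, not the code used in the study (the analysis was run in R); the p-values in the usage line are hypothetical and do not come from the data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for three pairwise comparisons (illustration only).
print([round(p, 3) for p in benjamini_hochberg([0.01, 0.04, 0.03])])  # [0.03, 0.04, 0.04]
```

A comparison would then be declared significant when its adjusted p-value falls below the chosen threshold (.05 in this study).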

Figure 3.

Results for YCI (blue/squares), MCI (green/circles), and OCI (red/triangles) groups as a function of presentation level (45, 55, 65, 75, and 85 dB) in Experiment 2. (a) Mean percentage of trials that participants reported dish responses for the dish–ditch continuum. The continuum consisted of seven stimuli with the silence duration parametrically varied from 0 to 60 ms. Error bars denote ±1 standard deviation. (b) and (c) Mean crossover point and slope of the psychometric functions. Error bars denote 95% confidence intervals. YCI = younger cochlear-implant users; MCI = middle-aged cochlear-implant users; OCI = older cochlear-implant users.

Figure 3(b) shows the mean crossover points of the performance functions for the three groups across presentation levels 45 to 85 dB. The main effect of age group was not significant, F(2, 51.1) = 2.436, p = .098. The main effect of presentation level was significant, F(4, 203.4) = 11.17, p < .001. The interaction between age group and presentation level also was significant, F(8, 203.4) = 2.458, p = .015. Post hoc analysis of the interaction revealed that the crossover point for the OCI group was significantly later than that for the YCI group at 85 dB (OCI: 60.6 ms ± 13.5 vs. YCI: 33.4 ms ± 13.4; p < .001). No other comparisons between the age groups (YCI vs. MCI or MCI vs. OCI) were statistically significant (p > .13 in all cases). This suggests that OCI listeners needed longer silence durations than YCI listeners to change their percept from dish to ditch when the stimuli were presented at 85 dB.

Figure 3(c) shows the mean slopes of the performance functions for the three groups across presentation levels 45 to 85 dB. The main effect of age group was not statistically significant, F(2, 50.6) = 2.44, p = .097. The main effect of presentation level was significant, F(4, 202.7) = 9.878, p < .001. The interaction between age group and presentation level was significant, F(8, 202.8) = 2.426, p = .016. 
Post hoc analysis of the interaction revealed that the slope was significantly shallower for the OCI group than that for the YCI group at 75 dB (OCI: 1.71%/ms ± 1.05 vs. YCI: 3.68%/ms ± 2.01; p = .015) and 85 dB (OCI: 0.66%/ms ± 1.17 vs. YCI: 2.98%/ms ± 1.5; p = .004). No other comparisons between the age groups (YCI vs. MCI or MCI vs. OCI) were statistically significant (p > 0.11 in all cases). This suggests that OCI listeners found it more difficult to discriminate dish and ditch than YCI listeners at 75 and 85 dB. Furthermore, we recoded presentation levels into sensation levels and reanalyzed the data, the details of which are reported in Online Appendix A. We found similar patterns of results (comparing Figure 3 for presentation levels and Figure S1 for sensation levels). There were minor differences regarding the significance of the age group effects for the metric of crossover point, which was significant when using the presentation levels but not significant when using sensation levels. Experiment 2 demonstrated that OCI listeners, compared with YCI listeners, required longer silence durations to identify ditch and exhibited reduced ability to distinguish the words dish and ditch (i.e., shallower slopes). Interestingly, such age-related performance differences were dependent on the presentation level, such that the differences emerged only at higher levels (≥ 75 dB; Figure 3). The age-related performance differences in CI users found here concur with extensive prior work using unprocessed stimuli in NH and HI listeners (Fitzgibbons & Gordon-Salant, 1995; Gordon-Salant & Fitzgibbons, 1999; Gordon-Salant et al., 2006, 2008; Humes et al., 2010; Pichora-Fuller & Souza, 2003; Roque et al., 2019; Snell, 1997; Strouse et al., 1998), as well as with our recent work using vocoded stimuli in NH listeners (Goupell et al., 2017). 
The level dependence of age-related performance differences explains the lack of age-related differences in word identification suggested in Experiment 1, considering that comfortably soft levels were used in that experiment. In the General Discussion, we speculate on the mechanisms related to the effect of presentation level in CI users. The calibration of stimulus levels received by the sound processor via DAI or headphone is nontrivial. Multiple factors may affect stimulus levels, including differences in the CI device programming approach across audiologists, volume settings on the CI device chosen by individual participants, and variability in the determination of loudness perception. Unfortunately, many of these factors were difficult to control. This posed limitations on the current experiment; specifically, we could not confirm that the actual levels were consistent across participants. However, these design limitations were inevitable given the methods of stimulus presentation required for CI users. We developed Experiment 2 as a follow-up to Experiment 1. As a result, we tried to keep the experimental parameters in Experiment 2 as close as possible to Experiment 1, including the stimulation choices. Nevertheless, this study is important as an initial step to characterize the role of sound level on age-related changes in temporal processing in CI users. Even though we could not precisely manipulate the stimulus presentation levels, our findings suggest that stimulus level may exert a large effect on auditory temporal processing in older CI users. This argument was corroborated by the alternative analysis focusing on sensation levels; in that analysis, the age-related performance differences in word identification persisted in older CI users (see Figure S1 in Online Appendix). Together, these data and the current approach are informative for designing similar types of experiments that aim to control presentation levels and mapping of the processors. 
In summary, the current experiment showed that the ability to process silence duration cues in word segments appears to decrease at higher stimulus presentation levels particularly for older CI users. The literature with NH listeners also suggests some evidence of a small amount of performance decline at higher-than-normal sound levels on speech perception tasks (e.g., Liu, 2008; Molis & Summers, 2003; Studebaker, Sherbecoe, McDaniel, & Gwaltney, 1999). Therefore, Experiment 3 aimed to examine the extent to which NH listeners exhibit performance decline with increasing stimulus presentation level for the dish–ditch contrast.

Experiment 3: Processing of Silence Duration Cues in Sine-Vocoded Word Segments as a Function of Presentation Level in NH Listeners

Sixteen younger NH (YNH, 18.6 to 26.2 years, mean age and SD = 21.5 ± 2.1) and 11 older NH (ONH, 65.1 to 78.1 years, mean age and SD = 68.9 ± 3.6) listeners were tested. NH was defined as pure tone thresholds ≤ 25 dB HL at octave frequencies from 250 to 4000 Hz. Figure 4 displays the average thresholds for both groups. The threshold data for 8000 Hz were missing for one YNH participant. All listeners were native speakers of American English. We screened ONH listeners with the MoCA (Nasreddine et al., 2005) to ensure normal or near-normal cognitive function (Cecato et al., 2016; Dupuis et al., 2015). Their MoCA scores ranged from 25 to 30, with 9 out of 10 participants scoring above 26. The MoCA data were missing for one ONH participant.

Figure 4.

Mean pure tone thresholds in dB HL (re: ANSI 2018) for YNH (blue/square) and ONH (red/triangle) groups in Experiment 3. The horizontal dashed line indicates 25 dB HL. Error bars denote ±1 standard deviation. YNH = younger normal-hearing listeners; ONH = older normal-hearing listeners.

Stimuli consisted of the unprocessed, nonvocoded seven-step dish–ditch continuum used in Experiments 1 and 2, as well as vocoded dish–ditch continua with two, four, or eight contiguous channels. Details of the vocoding processes can be found in Goupell et al. (2017). Each unprocessed stimulus was forward-backward bandpass filtered into the corresponding number of channels (two, four, or eight) with cut-off frequencies logarithmically spaced from 200 to 8000 Hz. Note that some of the stimulus energy is above the guaranteed region of NH at 4000 Hz. This upper frequency value was chosen to match the previous study (Goupell et al., 2017) and to match the frequency range presented to the CI listeners. It is important to remember that age-related changes in performance for the NH listeners in this experiment could be partially a result of hearing loss above 4 kHz. Temporal envelopes were extracted from each channel and low-pass filtered at 400 Hz. The extracted envelopes modulated tonal carriers; these modulated sine tones were then summed to create the final vocoded stimulus. Thus, there were four stimulus types that were characterized by varying numbers of channels (two, four, and eight channels, and unprocessed).

Similar to Experiment 2, the possible presentation levels were parametrically changed between 25 and 85 dB SPL in equal steps of 10 dB. Before the experiment, we selected the lowest presentation level individually by querying the level (among the seven possible levels) at which the listeners could just hear the unprocessed sample word dish. At 25 dB SPL, 13 (81.3%) YNH and 3 (27.3%) ONH participants completed the tasks (Table 3). 
At 35 dB SPL and above, all participants completed the tasks. We conducted 2 (age group: YNH and ONH) × 2 (participants/ears that could hear the sample word dish: Yes or No) χ2 tests of independence. We found that the number of participants who could versus could not hear the sample word dish differed significantly between groups at 25 dB SPL, χ2(1) = 7.867, p = .015, such that fewer ONH than YNH participants could hear the sample word dish at 25 dB SPL.
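As a worked check of the test statistic reported above, the 2 × 2 table can be rebuilt from the counts given in the text (13 of 16 YNH and 3 of 11 ONH participants completed the task at 25 dB SPL). The sketch below computes the Pearson χ2 without continuity correction; that choice is an assumption, as is the function name `chi_square_2x2`. The uncorrected p-value it returns (about .005) is smaller than the reported p = .015, which may reflect a correction or test variant that this section does not specify.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: YNH, ONH. Columns: could hear dish at 25 dB SPL, could not.
chi2, p = chi_square_2x2(13, 3, 3, 8)
print(round(chi2, 3))  # 7.867
```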

Table 3.

Proportion of Participants Per Age Group at Each Presentation Level in Experiment 3.

Age group	Number of channels	Presentation level (dB SPL)
		25	35	45	55	65	75	85
ONH	Unprocessed	27.27 (100;0)	100 (45.45;0)	100 (18.18;0)	100	100	100	100
	Eight channels	27.27 (33.33;0)	100 (54.55;0)	100 (9.09;0)	100	100	100	100
	Four channels	27.27 (100;0)	100 (63.64;9.09)	100 (36.36;18.18)	100	100 (9.09;0)	100	100
	Two channels	27.27 (66.67;0)	100 (36.36;18.18)	100 (27.27;18.18)	100 (45.45;18.18)	100 (45.45;18.18)	100 (18.18;18.18)	100 (27.27;18.18)
YNH	Unprocessed	81.25 (7.69;0)	100 (6.25;0)	100	100	100	100	100
	Eight channels	81.25 (15.38;0)	100 (6.25;0)	100	100	100	100	100
	Four channels	81.25 (23.08;0)	100 (12.5;0)	100 (6.25;0)	100	100 (6.25;0)	100	100
	Two channels	81.25 (23.08;0)	100 (12.5;6.25)	100 (18.75;0)	100	100 (12.5;0)	100 (6.25;0)	100 (0;6.25)

Note. Values are percentages. Number(s) in parentheses represent the proportion of participants (relative to the total number of participants who completed the task at that level for that age group) whose data were fitted by a logistic function in a backward direction (first number) and whose data could not be fitted by the logistic function (second number). Zeros were included in the parentheses for illustration purposes. YNH = younger normal-hearing listeners; ONH = older normal-hearing listeners.

Each presentation level consisted of 280 trials (7 stimuli × 4 channels × 10 repetitions). Trials from all presentation levels were mixed together and divided into five blocks. Each block was composed of 56 trials (7 stimuli × 4 channels × 2 repetitions) at each presentation level. The trials from one YNH participant were divided into four blocks. The trial order was randomized for each block in individual listeners. Participants could take short breaks between blocks. The testing (including training and breaks) was usually completed within 2.5 hr. The testing procedures were identical to those detailed in Experiment 1 except for two modifications. First, the stimuli were presented monaurally through one ER2 insert earphone (Etymotic, Elk Grove Village, IL) to the right ear in YNH listeners and to the better ear in ONH listeners. Better ear was defined as the ear with better averaged audiometric thresholds across 500, 1000, 2000, and 4000 Hz. Second, in the training task, participants were presented with the endpoint stimuli (0- and 60-ms silence duration) in both unprocessed and vocoded (16 channels) speech modes at 65 dB SPL. All participants achieved an accuracy of at least 90% on the training task; hence, none were excluded from the experiment. 
The same psychometric function analysis as detailed in Experiment 1 was applied to the percentage of dish responses along the dish–ditch continuum calculated from each condition for each listener. Table 3 shows the proportion of participants who were fitted by a logistic function in a backward direction or who failed to be fitted by the logistic function. To be consistent with Experiment 2, we focused the statistical analysis on presentation levels from 45 to 85 dB SPL. Data from one ONH participant consisted of nine (out of 10) repetitions of each stimulus due to computer errors but were included in the analysis. We also reanalyzed the data after converting presentation levels to sensation levels. Please refer to Online Appendix A for details. The linear mixed-effects modeling implemented via the lme4 package (Bates et al., 2014) in R version 3.5.1 (R Core Team, 2013) was used to fit the data of 50% crossover points and slopes. In the model, age group (YNH or ONH), number of channels (two, four, and eight channels, or unprocessed) and presentation level (45, 55, 65, 75, or 85 dB SPL) were included as the fixed effects, and by-participant intercept was included as a random effect to account for baseline performance differences. All the fixed-effects factors were treated as categorical variables. We adopted similar approaches to those detailed in Experiment 2 to determine the significance of fixed effects and to conduct post hoc analyses. Descriptive statistics, if reported, represent mean ± SD. Figure 5(a) displays the percentage of dish responses as a function of silence duration. While both YNH and ONH groups were able to discriminate the words dish and ditch, the ONH group had longer crossover durations and shallower slopes (i.e., shallower curve for the percentage of dish responses as a function of silence duration) than the YNH participants across presentation levels and channels.
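The psychometric analysis referred to above (a logistic fit to the percentage of dish responses, from which a 50% crossover point and a slope are derived) can be illustrated with a minimal sketch. The exact procedure is detailed in Experiment 1 and is not reproduced in this section, so everything below is an assumption made for illustration: the two-parameter logistic form, the coarse least-squares grid search and its ranges, the 10-ms spacing of the seven continuum steps, and the reading of a fit "in a backward direction" as one whose rate parameter has the reversed sign. The function names are hypothetical.

```python
import math

def percent_dish(silence_ms, crossover_ms, rate):
    """Logistic curve: % dish responses falls as silence duration grows when rate > 0."""
    return 100.0 / (1.0 + math.exp(rate * (silence_ms - crossover_ms)))

def fit_logistic(silence_ms, pct_dish):
    """Coarse least-squares grid search over crossover (0 to 60 ms) and rate (per ms)."""
    best = None
    for i in range(121):                  # crossover: 0, 0.5, ..., 60 ms
        x0 = i * 0.5
        for j in range(-100, 101):        # rate: -1.00, -0.99, ..., 1.00 per ms
            k = j / 100.0
            sse = sum((percent_dish(x, x0, k) - y) ** 2
                      for x, y in zip(silence_ms, pct_dish))
            if best is None or sse < best[0]:
                best = (sse, x0, k)
    _, x0, k = best
    return {
        "crossover_ms": x0,               # silence duration at 50% dish responses
        "slope_pct_per_ms": 25.0 * k,     # |dP/dx| at the crossover is 100 * k / 4
        "backward": k < 0,                # assumed meaning of a backward-direction fit
    }

# Synthetic listener (illustration only): crossover at 30 ms, rate 0.2 per ms.
steps = [0, 10, 20, 30, 40, 50, 60]
data = [percent_dish(x, 30.0, 0.2) for x in steps]
fit = fit_logistic(steps, data)
print(fit["crossover_ms"], fit["backward"])
```

Under this parameterization, a later crossover point means that a longer silence is needed before the percept changes from dish to ditch, and a shallower slope means a less sharply defined category boundary, which is how the two metrics are interpreted in the results.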

Figure 5.

Results for YNH (blue/squares) and ONH (red/triangles) groups as a function of presentation level (45, 55, 65, 75, and 85 dB SPL) in Experiment 3. (a) Mean percentage of trials that participants reported dish responses for the dish–ditch continuum. The rows (from top to bottom) show data for unprocessed and vocoded stimuli (8, 4, and 2 channels). The continuum consisted of seven stimuli with the silence duration parametrically varied from 0 to 60 ms. Error bars denote ±1 standard deviation. (b) and (c) Mean crossover points (b) and slopes (c) of the psychometric functions. As a comparison, we displayed results for YCI (transparent blue/squares) and OCI (transparent red/triangles) groups from Experiment 2. The columns (from left to right) show data for unprocessed and vocoded stimuli (8, 4, and 2 channels). Error bars denote 95% confidence intervals. YNH = younger normal-hearing listeners; ONH = older normal-hearing listeners; YCI = younger cochlear-implant users; OCI = older cochlear-implant users.

Figure 5(b) shows the mean crossover points of the performance functions for the two groups as a function of presentation level for unprocessed and vocoded stimuli. The main effect of age group was significant, F(1, 25) = 7.911, p = .009. The main effect of channel was significant, F(3, 503) = 5.435, p = .001. The interaction between channel and age group also was significant, F(3, 503) = 3.434, p = .017. Post hoc analysis of the interaction revealed that the crossover points were significantly later for the ONH compared with the YNH group for unprocessed (ONH: 43.29 ms ± 15.51 vs. YNH: 29.0 ms ± 12.1; p = .012), eight-channel (ONH: 36.59 ms ± 22.41 vs. YNH: 26.08 ms ± 12.21; p = .049), and four-channel (ONH: 45.05 ms ± 21.05 vs. YNH: 31.33 ms ± 17.12; p = .014) stimuli. No significant age-group difference was observed for two-channel stimuli (ONH: 39.33 ms ± 33.52; YNH: 35.76 ms ± 23.69; p = .453). 
These results indicate that ONH listeners needed longer silence durations to change their percept from dish to ditch relative to YNH listeners, but such age-related differences disappear for stimuli with fewer vocoded channels. The main effect of presentation level was significant, F(4, 503) = 49.762, p < .001. Post hoc analysis showed that the crossover points became significantly earlier as the presentation level increased up to 75 dB SPL (p < .01 in all cases). The comparison between 75 and 85 dB SPL was not statistically significant (p = .355). These results suggest that participants needed shorter silence durations to change their percept from dish to ditch with increasing levels. Figure 5(c) shows the mean slope of the performance functions for the two groups as a function of presentation level for unprocessed and vocoded stimuli. All main effects were significant: age group, F(1, 25) = 6.773, p = .015, channel, F(3, 491) = 79.595, p < .001, and presentation level, F(4, 491) = 24.379, p < .001. The interaction between channel and age group was significant, F(3, 491) = 3.786, p = .010. Post hoc analysis revealed that the slopes were significantly shallower for the ONH group than those for the YNH group when listening to eight-channel (ONH: 2.46%/ms ± 1.59 vs. YNH: 4.43%/ms ± 2.31; p = .006) and two-channel (ONH: 0.12%/ms ± 1.39 vs. YNH: 1.99%/ms ± 1.94; p = .007) stimuli but not when listening to unprocessed (ONH: 3.43%/ms ± 1.91 vs. YNH: 4.39%/ms ± 2.18; p = .149) or four-channel (ONH: 2.57%/ms ± 2.24 vs. YNH: 3.55%/ms ± 2.01; p = .144) stimuli. The interaction between channel and presentation level was significant, F(12, 491) = 2.108, p = .015. Post hoc analysis revealed that for the unprocessed stimuli, the slopes became significantly steeper at presentation levels of 65 to 85 dB SPL compared with 45 dB SPL (p < .01 in all cases) and at the presentation level of 85 dB SPL compared with 55 dB SPL (p = .001). 
For the eight-channel stimuli, the slopes were significantly steeper at presentation levels of 65 to 85 dB SPL compared with presentation levels of 45 to 55 dB SPL (p < .01 in all cases). For the four-channel stimuli, the slopes were steepest for presentation levels at 65 to 75 dB SPL, followed by that at 55 dB SPL, and least steep for that at 45 dB SPL (p < .05 in all cases). The slope at 85 dB SPL was steeper than that at 45 dB SPL (p < .001). For two-channel stimuli, the slopes were not significantly different between presentation levels (p > .2 in all cases). These results suggest that while it generally became easier to discriminate dish and ditch with increasing presentation level, the level benefit is dependent on the spectral resolution of the stimuli. Furthermore, we recoded presentation levels into sensation levels and reanalyzed the data, the details of which are reported in the Online Appendix. We found similar patterns of results (comparing Figure 5 for presentation levels and Figure S2 for sensation levels). There were minor differences regarding the significance of the age group × sound level interaction effects for the metric of slope, which were not significant when using the presentation levels but were significant when using sensation levels. Experiment 3 demonstrated age-related temporal processing deficits in NH listeners with unprocessed and vocoded stimuli, a finding which concurs with Goupell et al. (2017) using a similar paradigm. Importantly, Experiment 3 extended this prior study by varying stimulus presentation levels. The current results suggest that while the ability to utilize temporal cues for word identification in NH listeners generally improves with increasing presentation levels for both age groups, there are consistent age-related temporal processing deficits that may not significantly change across levels. 
Previous studies have used a similar paradigm to assess age-related changes in temporal processing at different fixed levels. There are noted performance differences across these studies, but they consistently revealed age-related declines in temporal processing (85 dB SPL, Gordon-Salant et al., 2006, 2008; 65 dB SPL, Goupell et al., 2017; 75 dB SPL, Roque et al., 2019). Our findings of enhanced temporal processing (i.e., shorter crossover durations and steeper slopes) with higher presentation levels help explain performance differences across these prior studies. Importantly, the consistent age-related differences in temporal processing across presentation levels as demonstrated in this study further reinforce the hypothesis of age-related temporal processing deficits that can be inferred from these past investigations. Findings from this experiment contrast with those from CI users in Experiment 2, wherein age-related performance differences occurred at higher presentation levels (≥75 dB) but not at lower levels (<75 dB) (Figure 3). The discrepancy in the level effects between CI (Experiment 2) and NH (Experiment 3) listeners was further evidenced by the fact that the ability to distinguish dish and ditch generally improves with elevating sound levels in NH listeners (Figure 5) but may plateau or become worse at intermediate sound levels in CI listeners (Figure 3). Note that due to the approximate calibration of stimuli for the CI processors, the range of sound levels actually received by the participants was not necessarily comparable between CI and NH listeners. Nevertheless, these results highlight the differential sensitivity to level changes between NH and CI listeners. This argument is consistent with the findings that CI listeners demonstrate a much smaller dynamic range (Skinner, Holden, Holden, Demorest, & Fourakis, 1997; Zeng et al., 2002) and abnormal loudness growth (Zhang & Zeng, 1997) compared with NH listeners. 
What are the mechanisms underlying the disparities in level effects between NH listeners and CI users, particularly regarding their influence on the age-related changes in word identification based on temporal cues? Here, we offer some plausible explanations. The first explanation may lie in the differences between acoustic and electric hearing despite the fact that we simulated CI hearing via vocoding. In electric hearing, higher stimulus presentation levels could induce larger current spread that reduces spectral resolution (Eisen & Franck, 2005), which in turn, may disrupt the processing of temporal modulation cues (Oxenham & Kreft, 2014). However, such change in spectral resolution with increasing levels in electric hearing was not systematically accounted for in the results observed for listeners with acoustic hearing for unprocessed or vocoded stimuli in this study. Alternatively, speech temporal cues might have already been lost or been distorted by CI signal processing or preprocessing (e.g., automatic gain control, AGC) at high presentation levels, which was not an issue for the NH listeners. Another explanation is that speech understanding in NH listeners can exhibit either no change or a small drop in performance with increasing presentation levels (Miranda & Pichora-Fuller, 2002). We did not observe a decrease in the perception of temporal cues with increasing levels in our NH data. Unlike in NH listeners, a CI processor directly excites spiral ganglia and bypasses stimulus encoding at the cochlear level. The auditory system beyond the cochlea (auditory nerve and central nervous system) may have undergone significant pathological changes due to deafness in the CI listeners (e.g., Middlebrooks, 2018; Shepherd & Hardie, 2001). For example, previous studies suggest that hearing impairment results in the loss of auditory fibers (e.g., Middlebrooks, 2018; Webster & Webster, 1981), which may lead to decreased neural synchrony to auditory stimuli. 
Decreased neural synchrony may be one of the mechanisms underlying the reduced ability to process temporal cues with increasing levels in CI listeners. For instance, Miranda and Pichora-Fuller (2002) demonstrated that the introduction of temporal jitter to the speech stimuli, a simulation of neural desynchrony, resulted in a decline of word recognition performance at high presentation levels in younger NH listeners. With aging, the retrocochlear neural substrates may be subjected to further pathological changes (Hughes et al., 2010; Makary et al., 2011; Otte et al., 1978; Roque et al., 2019; Sergeyenko et al., 2013; Tremblay et al., 2003; Walton et al., 1998; Willott, 1991), which might exacerbate performance decline with increasing levels in older CI users. In later sections, we further discuss the possible mechanisms related to the effect of presentation level in CI users.

General Discussion

Overview of Results

This study demonstrated that CI listeners can identify words based on a temporal contrast (silence duration) and that aging in CI users appears to be associated with decreased ability to utilize brief temporal cues in word segments (Figure 3). These findings concur with the vast literature on age-related temporal processing deficits with acoustic hearing in NH and HI listeners (Fitzgibbons & Gordon-Salant, 1995; Gordon-Salant & Fitzgibbons, 1999; Gordon-Salant et al., 2006, 2008; Goupell et al., 2017; Humes et al., 2010; Pichora-Fuller & Souza, 2003; Roque et al., 2019; Snell, 1997; Strouse et al., 1998). This study extends these findings to CI users and suggests that age-related declines in word identification based on temporal cues in CI users are dependent on stimulus presentation levels, such that older CI users demonstrate reduced performance in the utilization of brief temporal cues for word identification at higher levels (Figure 3).

Level Dependency of Age-related Temporal Processing Deficits in CI Users: Potential Explanations

This study (Experiment 2) revealed that age interacts with sound level to affect auditory temporal processing for temporally based word contrasts in CI listeners. Our finding on the processing of silence duration cues in word segments for older CI users (Figure 3) is reminiscent of a previous study that showed that in some CI listeners, performance on syllable identification tasks decreased at a higher stimulus level (Franck, Xu, & Pfingst, 2003). What are the mechanisms underlying the level dependency of age-related declines in word identification based on temporal cues in CI users? Following the discussion of level effects in Franck et al. (2003), we speculate that for electric hearing with a CI, on the one hand, increasing levels may produce positive effects on the encoding of temporal cues. For example, the number of auditory nerve fibers responding to the stimulus may increase with higher intensities, which leads to more faithful encoding of stimulus temporal features (Lopez-Poveda, 2014; Lopez-Poveda & Barrios, 2013). On the other hand, increasing levels may cause negative effects on the encoding of temporal cues. For example, as discussed earlier, higher levels could reduce spectral resolution due to larger current spread, which, in turn, may interfere with the processing of temporal modulation cues (Oxenham & Kreft, 2014). In addition, the processing of the speech sounds through the CI sound processor may lead to partial loss or distortion of speech cues, especially at higher levels. For instance, many CI programming parameters (e.g., amplitude mapping, AGC, microphone sensitivity, input dynamic range) may affect the transmission of temporal cues and other cues, such as intensity. These competing effects associated with stimulus level may interact with younger and older CI listeners differently considering the age-related changes at the level of the spiral ganglia or above. 
For example, the potential positive effects of increasing levels (e.g., more responding auditory nerve fibers) may be diminished in older CI users due to the loss of auditory nerve fibers with aging. Specifically, the low spontaneous-rate fibers, which are important for the encoding of temporal cues at higher intensities, may be more affected by aging (Bharadwaj, Verhulst, Shaheen, Liberman, & Shinn-Cunningham, 2014; Schmiedt, Mills, & Boettcher, 1996). It is also possible that the level dependency of age-related declines in word identification based on temporal cues occurs because the three age groups (YCI, MCI, and OCI) are mapped differently. While possible, this explanation seems not to adequately address our findings for the following reasons. First, if the three groups were mapped differently, we should observe systematic differences between the groups across levels. Instead, we found age-group performance differences only at higher levels (≥75 dB; Figure 3). Furthermore, as shown in Table 2, the proportion of participants who can hear the word dish at 25 or 35 dB did not significantly differ between the YCI and OCI groups. Second, at the time of this study, we were unaware of any empirical studies advocating for or implementing age-customized CI fitting procedures. Third, if such an approach to mapping does occur, it would have to be implemented across numerous clinicians and clinical sites, as our CI listeners were recruited from across the DC-Baltimore metropolitan area and the US. Rather, it would be more likely that clinicians followed roughly similar procedures to map CI patients of different ages (Wolfe & Schafer, 2014), despite anecdotal evidence that lower stimulation rates could be used in OCI listeners than are used in YCI listeners. Furthermore, both analyses with presentation levels and sensation levels pointed to similar patterns of age-related performance differences at higher levels (Figure 3 and Figure S1). 
To summarize, we posit that the level dependency of age-related declines in word identification based on temporal cues in CI listeners may be attributed to limits of the CI device to process speech cues, declines in temporal processing with aging, or the interaction between these two factors. Assuming that CI listeners of different age groups are mapped similarly, it may be reasonable to propose that age-related changes play an important role in the observed age effects on word identification. Future studies are needed to elucidate the exact mechanisms underlying changes in temporal processing of older CI listeners.

Data From Both Ears in Bilateral CI Users: How to Handle?

In Experiment 2, we collected data from both ears in 13 bilateral CI users. Currently, there is no consensus on how to handle data from the two ears of a single CI listener. Some studies averaged data from the two ears and treated the result as coming from a single listener (e.g., Feng & Oxenham, 2018). This approach seems reasonable if we assume that temporal processing is predominantly affected by central factors that are common to both ears. Other studies treated data from each ear as independent observations (e.g., Bierer & Litvak, 2016; Donaldson et al., 2015). This approach seems more appropriate if there may also be significant ear-specific contributions to temporal processing. We adopted the latter approach based on the assumption that ear-specific factors (e.g., auditory nerve survival) may significantly contribute to the age-related differences in temporal processing. A close inspection of data from these bilateral CI users (see Online Appendix B) shows that there are indeed between-ear differences in discriminating the words dish and ditch. Future research may systematically evaluate between-ear differences in temporal processing in bilateral CI users to better understand ear-specific peripheral and central contributions to age-related temporal processing deficits.
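The two ways of handling bilateral data described above can be sketched as follows. This is a minimal illustration in Python, not the analysis used in this study: the listener IDs, the per-ear crossover values (in ms of silence), and the function names are all hypothetical and chosen only to show how the two approaches differ in what counts as one observation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (listener_id, ear, crossover_ms).
# The values are invented for illustration and are not data from this study.
records = [
    ("CI01", "left", 28.0),
    ("CI01", "right", 36.0),
    ("CI02", "left", 41.0),
    ("CI02", "right", 43.0),
    ("CI03", "right", 30.0),  # a unilateral user contributes a single ear
]


def average_within_listener(rows):
    """Approach 1: average the ears so each listener contributes one value."""
    by_listener = defaultdict(list)
    for listener, _ear, value in rows:
        by_listener[listener].append(value)
    return {listener: mean(values) for listener, values in by_listener.items()}


def ears_as_observations(rows):
    """Approach 2: keep every ear as its own observation."""
    return {(listener, ear): value for listener, ear, value in rows}


def between_ear_differences(rows):
    """Absolute left-right difference for listeners with data from both ears."""
    by_listener = defaultdict(dict)
    for listener, ear, value in rows:
        by_listener[listener][ear] = value
    return {
        listener: abs(ears["left"] - ears["right"])
        for listener, ears in by_listener.items()
        if "left" in ears and "right" in ears
    }


per_listener = average_within_listener(records)   # 3 observations
per_ear = ears_as_observations(records)           # 5 observations
ear_gaps = between_ear_differences(records)       # bilateral listeners only
```

With the first approach, any between-ear difference is averaged away before analysis. With the second, it is retained, but the two ears of one listener are then counted as separate observations even though they share central factors.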

Limitations of This Study and Future Recommendations

As discussed earlier, our manipulation of sound levels in CI users may not have been as precise as intended, likely due to the variability introduced by the sound processors of individual CI participants. Future studies may consider the following approaches to manipulate and examine sound level effects more rigorously. First, the same (research) processor with similar programming parameters may be used to deliver sounds to the CI across participants. This could potentially minimize the variability from individual clinical processors. Second, at least two methods may be used to match sound levels across CI participants. Loudness judgments may be used to choose sound levels that are rated at the same loudness (as is standard in the field of CI research; e.g., Bierer & Litvak, 2016; Donaldson et al., 2015; Feng & Oxenham, 2018; Friesen et al., 2001; Fu, 2002). Electrophysiological measures may also be used to facilitate the matching of sound levels across CI participants. For example, individual levels can be matched by adjusting them to elicit brainstem responses (wave V) with equal amplitudes (Gordon, Abbasalipour, & Papsin, 2016). Finally, it may be necessary to estimate the electrode–neuron interface with available measures such as the electrically evoked compound action potential and computerized tomography (DeVries, Scheperle, & Bierer, 2016; Verbist, Frijns, Geleijns, & Van Buchem, 2005). This is because the electrode–neuron interface is considered a significant contributor to performance variability in CI listeners (Bierer, 2010; DeVries et al., 2016; Long et al., 2014) and may mediate the sound level effects.
Indeed, the current approach of measuring temporal processing via clinical sound processors with everyday settings may not be optimal. As mentioned earlier, many CI device-related factors (e.g., amplitude mapping, AGC, microphone sensitivity, input dynamic range) may impair the transmission of speech cues, especially at varying signal levels. Those device-related factors were not systematically manipulated in this study. Hence, the limitations of CI processors in preserving temporal cues in speech may obscure the genuine temporal processing limits of older CI listeners. Many previous studies have adopted single- or multi-channel direct stimulation approaches to assess temporal processing abilities, with the potential advantage of bypassing the front-end processing that may interact with temporal cue perception. Further research may utilize both approaches (clinical sound processors and direct stimulation) to provide converging evidence regarding limitations in temporal processing associated with aging in CI listeners.
This study demonstrated age-related temporal processing deficits with only one temporally based speech contrast (i.e., the dish–ditch continuum). This may limit the generalizability of our findings to other speech contrasts based on temporal cues. Nevertheless, our study and other studies have utilized the dish–ditch continuum and revealed age-related declines in temporal processing across diverse populations (NH, HI, and CI listeners; Gordon-Salant et al., 2006, 2008; Goupell et al., 2017; Roque et al., 2019). This suggests that this temporal contrast may represent a highly sensitive paradigm to reveal age-related temporal processing differences. Here, the dish–ditch continuum was presented as isolated words. Prior work suggests that the ability to process temporal cues may change when temporal contrasts are presented in sentential contexts (Gordon-Salant et al., 2008). Given that temporally based contrasts are typically embedded in sentences during natural speech processing, future studies could use the dish–ditch contrast embedded in sentences to reexamine the effects of aging and presentation levels on temporal processing in CI users.
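To make concrete what a categorization function on the dish–ditch continuum looks like, the sketch below models the proportion of ditch responses as a logistic function of silence duration. The logistic form, the 10-ms step size, and all parameter values are assumptions made for illustration; they are not the fitted functions or the data reported in this study. A larger crossover corresponds to needing a longer silence duration to report ditch, and a smaller slope corresponds to a shallower, less categorical function.

```python
import math


def prop_ditch(silence_ms, crossover_ms, slope):
    """Illustrative logistic categorization function.

    Returns the modeled proportion of 'ditch' responses for a given silence
    duration. `crossover_ms` is the 50% point and `slope` sets the steepness.
    Both parameters are hypothetical, not estimates from this study.
    """
    return 1.0 / (1.0 + math.exp(-slope * (silence_ms - crossover_ms)))


# Silence durations spanning 0 to 60 ms; the 10-ms step is an assumption.
continuum = list(range(0, 61, 10))

# Hypothetical parameter sets: an earlier crossover with a steeper slope,
# and a later crossover with a shallower slope.
younger = [prop_ditch(s, crossover_ms=30, slope=0.30) for s in continuum]
older = [prop_ditch(s, crossover_ms=40, slope=0.15) for s in continuum]

for s, y, o in zip(continuum, younger, older):
    print(f"{s:2d} ms  younger={y:.2f}  older={o:.2f}")
```

Under these assumed parameters, the second curve reaches the 50% point at a longer silence duration and rises more gradually, which is the qualitative pattern described for older listeners.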

Implications for Programming in Adult CI Users

For CI programming, an important parameter is stimulus level. Our results suggest that stimulus level and age may interact in their effect on word identification based on temporal cues such as silence duration, such that word identification performance may be diminished for older CI users at higher levels. This interaction may be attributed to limitations of the CI device in preserving temporal cues, age-related changes in temporal processing, or the combination of these factors. Therefore, clinical CI fitting procedures may need to include age as a potential variable. Strategies to manipulate CI processor parameters related to stimulus level (e.g., amplitude mapping function, AGC time constants and threshold, microphone sensitivity, input dynamic range, electrode thresholds and comfortable levels) may need to take into account the preservation of temporal cues.

Conclusions

Older NH and CI listeners, relative to their younger counterparts, appear to exhibit reduced ability to utilize brief temporal cues in word identification. Importantly, such age-related performance differences appear to be independent of presentation level for NH listeners but may emerge only at high presentation levels for CI listeners. These results suggest that clinicians may want to consider age-specific CI device settings to improve speech understanding in older CI listeners.

Supplemental material, TIA886688 Supplemental Material for Age-Related Temporal Processing Deficits in Word Segments in Adult Cochlear-Implant Users by Zilong Xie, Casey R. Gaskins, Maureen J. Shader, Sandra Gordon-Salant, Samira Anderson and Matthew J. Goupell in Trends in Hearing

65 in total

1. Assessment of Spectral and Temporal Resolution in Cochlear Implant Users Using Psychoacoustic Discrimination and Speech Cue Categorization.

Authors: Matthew B Winn; Jong Ho Won; Il Joon Moon
Journal: Ear Hear Date: 2016 Nov/Dec Impact factor: 3.570

2. Age-related differences in identification and discrimination of temporal cues in speech segments.

Authors: Sandra Gordon-Salant; Grace H Yeni-Komshian; Peter J Fitzgibbons; Jessica Barrett
Journal: J Acoust Soc Am Date: 2006-04 Impact factor: 1.840

3. Rollover effect of signal level on vowel formant discrimination.

Authors: Chang Liu
Journal: J Acoust Soc Am Date: 2008-04 Impact factor: 1.840

4. Age-related loss of activity of auditory-nerve fibers.

Authors: R A Schmiedt; J H Mills; F A Boettcher
Journal: J Neurophysiol Date: 1996-10 Impact factor: 2.714

5. Assessing the Electrode-Neuron Interface with the Electrically Evoked Compound Action Potential, Electrode Position, and Behavioral Thresholds.

Authors: Lindsay DeVries; Rachel Scheperle; Julie Arenberg Bierer
Journal: J Assoc Res Otolaryngol Date: 2016-02-29

6. Spiral ganglion neuron loss following organ of Corti loss: a quantitative study.

Authors: M Webster; D B Webster
Journal: Brain Res Date: 1981-05-11 Impact factor: 3.252

7. Processing of broadband stimuli across A1 layers in young and aged rats.

Authors: Larry F Hughes; Jeremy G Turner; Jennifer L Parrish; Donald M Caspary
Journal: Hear Res Date: 2009-09-20 Impact factor: 3.208

8. Factors affecting open-set word recognition in adults with cochlear implants.

Authors: Laura K Holden; Charles C Finley; Jill B Firszt; Timothy A Holden; Christine Brenner; Lisa G Potts; Brenda D Gotter; Sallie S Vanderhoof; Karen Mispagel; Gitry Heydebrand; Margaret W Skinner
Journal: Ear Hear Date: 2013 May-Jun Impact factor: 3.570

9. Cochlear neuropathy and the coding of supra-threshold sound.

Authors: Hari M Bharadwaj; Sarah Verhulst; Luke Shaheen; M Charles Liberman; Barbara G Shinn-Cunningham
Journal: Front Syst Neurosci Date: 2014-02-21

10. Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech. (Review)

Authors: Enrique A Lopez-Poveda
Journal: Front Neurosci Date: 2014-10-30 Impact factor: 4.677

8 in total

1. Stimulus context affects the phonemic categorization of temporally based word contrasts in adult cochlear-implant users.

Authors: Zilong Xie; Samira Anderson; Matthew J Goupell
Journal: J Acoust Soc Am Date: 2022-03 Impact factor: 1.840

2. Open-Set Phoneme Recognition Performance With Varied Temporal Cues in Younger and Older Cochlear Implant Users.

Authors: Maureen J Shader; Bomjun J Kwon; Sandra Gordon-Salant; Matthew J Goupell
Journal: J Speech Lang Hear Res Date: 2022-02-08 Impact factor: 2.674

3. Neural Adaptation of the Electrically Stimulated Auditory Nerve Is Not Affected by Advanced Age in Postlingually Deafened, Middle-aged, and Elderly Adult Cochlear Implant Users.

Authors: Shuman He; Jeffrey Skidmore; Sara Conroy; William J Riggs; Brittney L Carter; Ruili Xie
Journal: Ear Hear Date: 2022-01-03 Impact factor: 3.562

4. Access to semantic cues does not lead to perceptual restoration of interrupted speech in cochlear-implant users.

Authors: Brittany N Jaekel; Sarah Weinstein; Rochelle S Newman; Matthew J Goupell
Journal: J Acoust Soc Am Date: 2021-03 Impact factor: 1.840

5. Aging Effects on Cortical Responses to Tones and Speech in Adult Cochlear-Implant Users.

Authors: Zilong Xie; Olga Stakhovskaya; Matthew J Goupell; Samira Anderson
Journal: J Assoc Res Otolaryngol Date: 2021-07-06

6. Head Shadow, Summation, and Squelch in Bilateral Cochlear-Implant Users With Linked Automatic Gain Controls.

Authors: Taylor A Bakal; Kristina DeRoy Milvae; Chen Chen; Matthew J Goupell
Journal: Trends Hear Date: 2021 Jan-Dec Impact factor: 3.293

7. Impact of Aging and the Electrode-to-Neural Interface on Temporal Processing Ability in Cochlear-Implant Users: Gap Detection Thresholds.

Authors: Maureen J Shader; Sandra Gordon-Salant; Matthew J Goupell
Journal: Trends Hear Date: 2020 Jan-Dec Impact factor: 3.293

8. The Sensitivity of the Electrically Stimulated Auditory Nerve to Amplitude Modulation Cues Declines With Advanced Age.

Authors: William J Riggs; Chloe Vaughan; Jeffrey Skidmore; Sara Conroy; Angela Pellittieri; Brittney L Carter; Curtis J Stegman; Shuman He
Journal: Ear Hear Date: 2021 Sep/Oct Impact factor: 3.562
