Parameter-Specific Morphing Reveals Contributions of Timbre to the Perception of Vocal Emotions in Cochlear Implant Users.

Celina I von Eiff^1,2, Verena G Skuk^1,2,3,4,5,6, Romi Zäske^1,2,3,4,5,6, Christine Nussbaum^1,2, Sascha Frühholz^4,5, Ute Feuer^6, Orlando Guntinas-Lichius^3, Stefan R Schweinberger^1,2.

Abstract

OBJECTIVES: Research on cochlear implants (CIs) has focused on speech comprehension, with little research on perception of vocal emotions. We compared emotion perception in CI users and normal-hearing (NH) individuals, using parameter-specific voice morphing.
DESIGN: Twenty-five CI users and 25 NH individuals (matched for age and gender) performed fearful-angry discriminations on bisyllabic pseudoword stimuli from morph continua across all acoustic parameters (Full), or across selected parameters (F0, Timbre, or Time information), with other parameters set to a noninformative intermediate level.
RESULTS: Unsurprisingly, CI users as a group showed lower performance in vocal emotion perception overall. Importantly, while NH individuals used timbre and fundamental frequency (F0) information to equivalent degrees, CI users were far more efficient in using timbre (compared to F0) information for this task. Thus, under the conditions of this task, CIs were inefficient in conveying emotion based on F0 alone. There was enormous variability between CI users, with low performers responding close to guessing level. Echoing previous research, we found that better vocal emotion perception was associated with better quality of life ratings.
CONCLUSIONS: Some CI users can utilize timbre cues remarkably well when perceiving vocal emotions.

Year: 2022 PMID: 34999594 PMCID: PMC9197138 DOI: 10.1097/AUD.0000000000001181

Source DB: PubMed Journal: Ear Hear ISSN: 0196-0202 Impact factor: 3.562

INTRODUCTION

Hearing loss can be a disabling condition. Severe hearing impairment increases the risk of depression (Kim et al. 2017), is linked with cognitive decline (Ray et al. 2018), and is associated with an annual global cost of 980 billion dollars (World Health Organization 2021). Cochlear implants (CIs), hearing prostheses designed to functionally replace damaged parts of the inner ear, are a highly successful way to treat severe hearing loss. This technology has advanced to a great extent in the last decades. Refinements in the processing strategies of the devices enabled striking improvements in transferring speech (Wilson et al. 1991) so that today, CIs can promote recovery of remarkable speech understanding abilities (Peterson et al. 2010; Jiam et al. 2017). However, CIs still show crucial limitations in transmitting paralinguistic sounds, such as music or social aspects in voices (for music perception, see Limb & Roy 2014; Thomas & Tripathi 2014). This is thought to be partially due to the limited number of stimulation electrodes. Even though the devices make use of tonotopic representation in the cochlea, only between 6–12 and 22 electrodes are used, depending on the specific CI device. In fact, most devices only cover the frequency range between 200 Hz and 7 kHz within the normal human hearing range between 20 Hz and 20 kHz (Kirtane et al. 2010; Peterson et al. 2010). Thus, CIs have a reduced spectral resolution (Wilson & Dorman 2008), impoverishing sound representation (Moore & Shannon 2009). Other variations in technology, such as the specific processing strategies of the devices (specifically, ACE versus MP3000 strategies; cf. Agrawal et al. 2012, 2013), also seem to affect CI users’ ability to perceive nonverbal signals, and emotional prosodies in particular. Accordingly, CIs degrade prosodic cues because of their constraints in extraction, processing, and transmission of pitch and timbre cues (Kong et al. 2004; Galvin et al. 2007; Xu et al. 2009; Kang et al. 2010).

Underlining the limitations of CIs in transmitting paralinguistic social-communicative vocal signals, previous research suggested general deficits in CI users when perceiving emotions (e.g., Luo et al. 2007; Schorr et al. 2009; Agrawal et al. 2013; See et al. 2013; Volkova et al. 2013; Wiefferink et al. 2013; Jiam et al. 2017; Kim & Yoon 2018; Paquette et al. 2018; Tinnemore et al. 2018; Waaramaa et al. 2018), age (Skuk et al. 2020), or gender (e.g., Fu et al. 2004, 2005; Kovacić & Balaban 2009, 2010; Meister et al. 2009, 2016; Li & Fu 2011; Massida et al. 2013; Fuller et al. 2014a; Hazrati et al. 2015; Gaudrain & Baskent 2018; Skuk et al. 2020) in other people’s voices. Actually, not only do CI users perform less well than normal-hearing (NH) individuals in perceiving nonverbal social cues, they also seem to employ different perceptual strategies: for example, while NH individuals rely on both timbre and F0 when discriminating a speaker’s sex (Skuk & Schweinberger 2014), CI users rely more on F0 alone (Massida et al. 2013; Fuller et al. 2014a; Skuk et al. 2020). In addition, a recent study indicates that CI users can efficiently use F0 cues for speaker segregation to support speech perception in multitalker situations with noise-vocoded speech maskers (Meister et al. 2020; but see also Stickney et al. 2004).

Perceiving vocal emotions is essential to the accurate understanding of other human beings’ messages (Frick 1985; Scherer 1986; Banse & Scherer 1996). A large body of research suggests that the neurocognitive mechanisms for perceiving and producing vocal emotions are tightly interwoven (Frühholz & Schweinberger 2021), and combined research on perception and production is considered increasingly important in CI research (Jiam et al. 2017).
Importantly, both abilities are highly relevant for daily communication, and impairments in the perception and production of vocal emotion often cause extensive ramifications on both social interactions and development (Trainor et al. 2000). Considering the relevance of vocal emotion perception, its tight connection to quality of life does not seem surprising: in fact, whereas there is only a weak relationship between life quality and speech understanding abilities in CI users (Huber 2005), perceived quality of life and vocal emotion perception skills are distinctively and positively correlated (Schorr et al. 2009; Luo et al. 2018). Yet, despite its importance, vocal emotion perception in CI users remains relatively understudied, especially when compared to speech comprehension. Some studies suggest large interindividual differences between CI users in their ability to perceive vocal emotions—with some CI users’ performance approximating the level of NH individuals (Chatterjee et al. 2015; Jiam et al. 2017). On average, NH individuals perform better in vocal emotion recognition than CI users even if CI-simulated voice stimuli are presented (Chatterjee et al. 2015; Gilbers et al. 2015). Various factors may influence interindividual differences in CI users (Jiam et al. 2017). For example, early auditory access to the variability of speech seems to be crucial to prevent deprivation in children and to promote speech intelligibility (Artières et al. 2009; Schorr et al. 2009). It is interesting that the performance of children who were congenitally deaf and early implanted was similar to that in late-implanted CI users who had experienced normal hearing early in life (Chatterjee et al. 2015). Identifying specific acoustic parameters relevant for CI users’ vocal emotion recognition, Gilbers et al. (2015) reported a bias toward pitch range cues in CI users, whereas NH individuals seem to rely more on mean pitch than pitch range. 
Other researchers suggested that CI users may rely on tempo-information and intensity (Luo et al. 2007; Kalathottukaren et al. 2015). In the present study, we planned to gain information on the relative impact of specific acoustic parameters on the perception of vocal emotions in CI users. We also aimed at a detailed quantification of individual differences in CI users’ performance. In particular, we planned to compare CI users and NH listeners not only regarding their overall performance but also regarding their reliance on specific acoustic cues to recognize vocal emotion. To quantify the specific acoustic parameters utilized for task performance, we applied a parameter-specific voice morphing approach based on the TANDEM-STRAIGHT algorithm (Kawahara et al. 2013), extending our previous research on the perception of speaker gender and speaker age in CI users (Skuk et al. 2020). Considering the increased scientific attention to the relevance of social-communicative abilities for daily functioning, we additionally assessed the relationship between the CI users’ perceived quality of life (Guyatt et al. 1993; Hinderink et al. 2000) and their ability to perceive emotional expression in voices. Based on previous published findings regarding related topics (i.e., use of F0 and timbre cues by CI users in the context of vocal age and gender perception), we hypothesized that CI users rely more on F0 information in emotion perception, while NH individuals can efficiently use both F0 and timbre information. We furthermore hypothesized that perceived quality of life would be possibly related to vocal emotion perception in CI users.

MATERIALS AND METHODS

Participants

Twenty-five (14 female) CI users between 20 and 83 years old (M = 61.0, SD = 17.0) and 25 (14 female) individuals with NH abilities aged between 19 and 81 years old (M = 63.6, SD = 16.4), matched to CI users by age and gender, participated in this study. All CI users and 10 NH individuals were recruited locally and tested in the Cochlear Implant Rehabilitation Centre Thuringia in Erfurt, Germany. Fifteen NH individuals were tested at the Friedrich Schiller University Jena, Germany, and these participants received a small financial reimbursement to compensate them for local travel expenses. All participants were native German speakers; none reported a neurological or psychiatric diagnosis. CI users (for details see Table 1) reported no other otologic disorders and had either bilateral implants or unilateral implants together with severe to profound hearing loss in the nonimplanted ear. The NH individuals were recruited based on self-report of normal hearing, did not report any hearing disorders, and none was using a hearing aid.

TABLE 1.

Demographic characteristics of all CI users

CI user	Performance rank	Sex	Age (yr)	Civil status	Pre-/post-deaf	Age at deafness (yr)	Age at first CI (yr)	Mode of hearing	Left CI: wear time (hr)	Left CI: manufacturer*	Left CI: processor	Right CI: wear time (hr)	Right CI: manufacturer*	Right CI: processor
11	4 (HP)	Female	61	Widowed	Post	40	60	CI-left	12–16	Cochlear	CP910	/	/	/
12	21 (LP)	Female	47	Single	Pre	0	45	CI-bi	> 16	Advanced Bionics	Naida Q90	> 16	Advanced Bionics	Naida Q90
13	20 (LP)	Female	76	Widowed	Post	19	70	CI-bi	12–16	MED-EL	OPUS2	12–16	MED-EL	Sonnet
14	12 (HP)	Male	68	Married	Post	40	67	CI-right	/	/	/	> 16	Cochlear	CP910
15	19 (LP)	Female	56	Married	Pre	0	54	CI-bi	12–16	Cochlear	CP910	12–16	Cochlear	CP910
16	7 (HP)	Female	50	Divorced	Post	3	46	CI-bi	> 16	MED-EL	OPUS2	> 16	MED-EL	Sonnet
17	17 (LP)	Male	68	Married	Post	66	68	CI-bi	12–16	Cochlear	Kanso	12–16	Cochlear	Kanso
18	11 (HP)	Male	67	Married	Post	40	66	CI-left	12–16	Cochlear	CP910	/	/	/
19	6 (HP)	Female	57	Divorced	Post	12	55	CI-bi	12–16	Advanced Bionics	Naida Q90	12–16	Advanced Bionics	Naida Q90
20	10 (HP)	Female	81	Widowed	Post	65	81	CI-right	/	/	/	> 16	MED-EL	Sonnet
21	1 (HP)	Male	64	Married	Post	59	63	CI-right	/	/	/	12–16	MED-EL	Sonnet
22	24 (LP)	Male	41	Single	Post	3	41	CI-right	/	/	/	8–12	Advanced Bionics	Naida Q90
23	18 (LP)	Female	69	Married	Post	40	69	CI-right	/	/	/	8–12	Cochlear	Kanso
24	15 (LP)	Male	67	Single	Post	50	53	CI-bi	12–16	MED-EL	OPUS2	0–4	MED-EL	Sonnet
25	9 (HP)	Female	76	Widowed	Post	62	74	CI-bi	12–16	MED-EL	Sonnet	12–16	MED-EL	Sonnet
26	23 (LP)	Female	51	Divorced	Post	25	50	CI-bi	12–16	Cochlear	CP910	12–16	Cochlear	CP910
27	16 (LP)	Female	80	Married	Post	40	79	CI-right	/	/	/	8–12	Cochlear	CP910
28	14 (LP)	Male	27	Single	Pre	0	25	CI-bi	12–16	MED-EL	Sonnet	12–16	MED-EL	Sonnet
29	25 (LP)	Female	36	Single	Pre	0	29	CI-bi	8–12	Cochlear	Kanso	8–12	Cochlear	CP910
30	13 (HP)	Male	76	Married	Post	44	75	CI-right	/	/	/	12–16	Cochlear	CP910
31	2 (HP)	Female	72	Married	Post	50	72	CI-left	12–16	Cochlear	Kanso	/	/	/
32	22 (LP)	Male	83	Married	Post	20	83	CI-left	8–12	Cochlear	CP910	/	/	/
33	8 (HP)	Male	20	Widowed	Pre	0	2	CI-bi	12–16	Advanced Bionics	Harmony	12–16	Advanced Bionics	Harmony
34	3 (HP)	Male	77	Single	Post	74	76	CI-left	4–8	MED-EL	Sonnet	/	/	/
35	5 (HP)	Female	54	Married	Post	3	53	CI-left	12–16	Advanced Bionics	Naida Q90	/	/	/
M			61.0			30.20	58.24
SD			17.0			25.06	19.37
Minimum			20			0	2
Maximum			83			74	83
N	25	25	25	25	25	25	25	25	25	25	25	25	25	25

*CI Manufacturers: Advanced Bionics GmbH, Max-Eyth-Str. 20, 70736 Fellbach-Oeffingen, Germany; Cochlear Headquarters, 1 University Avenue, Macquarie University, NSW, 2109, Australia; MED-EL Elektromedizinische Geräte Gesellschaft m.b.H., Fürstenweg 77a, 6020 Innsbruck, Austria.

CI, cochlear implant; CI-bi, bilateral implanted CI user; CI-left, unilateral implanted CI user who was fitted with the CI on the left ear; CI-right, unilateral implanted CI user who was fitted with the CI on the right ear; HP, high-performing subgroup of CI users (n = 13); LP, low-performing subgroup of CI users (n = 12); post = postlingually deafened; pre = prelingually deafened.

Voice Stimuli

We selected all original audio recordings (sampling rate = 44.1 kHz) from a database that was similar to the one described in Frühholz et al. (2015). The database consists of recordings of eight different bisyllabic, five-letter, and phonetically balanced pseudowords, spoken by eight vocal actor portrayals (four female) in 10 different emotional expressions (neutral, anger, fear, happiness, disgust, sadness, achievement, pain, pleasure, and surprise). For the present study, we used four different pseudowords (/belam/, /namil/, /molen/, /loman/), spoken by four speakers (two female) with two different emotional expressions (fearful and angry). This subset of emotions and stimuli was chosen based on both high classification rates and low confusability between selected emotions in a pilot study, in which 10 NH raters performed a 10-alternative-forced choice task (mean correct classification 86.6%). The criterion for selection of an individual stimulus was a minimum performance of 60% correct for a given stimulus. Please note that anger and fear are both high-arousal negative emotions, while they are still characterized by systematic differences in acoustical parameters (cf. Table 2).

TABLE 2.

Acoustic characteristics of stimuli used as continuum endpoints

	Female speakers		Male speakers		Paired t test; anger vs. fear*
	Anger	Fear	Anger	Fear	t(15)	p
F0 mean (in Hz)	366	288	264	205	3.69	0.002
F0 SD (in Hz)	67.1	19.0	45.6	26.8	6.50	0.001
F0 intonation (in Hz)	259	79	171	95	6.99	0.001
F0 glide (in Hz)	–60	3	–41	12	–3.08	0.007
Formant dispersion (in Hz)	910	1043	819	1074	–5.61	0.001
Alpha ratio	1.0	2.3	1.1	2.28	–11.16	0.001
HNR	12.8	21.3	7.3	19	–11.44	0.001
Duration (in ms)	885	760	731	854	0.02	0.988

All acoustical parameters were adapted from McAleer et al. (2014) and extracted using Praat software (Boersma 2018). For F0 extraction, pitch ranges were set to 170–600 Hz for female and 100–370 Hz for male stimuli. F0 intonation = F0max–F0min; F0 glide = F0End–F0Start; Formant dispersion = ratio between consecutive formant means (from F1 to F4, maximum formant frequency set to 5 kHz, window length 0.025 s); alpha ratio (a measure of the spectral slope) = ratio of mean energy within low (0–1 kHz) and high frequencies (1–5 kHz), computed from the long-term average spectrum; HNR was extracted with the cross-correlation method (mean value; time step = 0.01 s; min pitch = 75 Hz; silence threshold = 0.1, periods per window = 1.0).

*Including both male and female speakers.

HNR, harmonics-to-noise ratio.
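The two contour-based F0 measures defined in the table note can be illustrated with a few lines of Python. This is only a sketch of the stated definitions (F0 intonation = F0max–F0min; F0 glide = F0End–F0Start). It assumes the F0 contour is already available as a list of voiced-frame values in Hz, and the example contour is invented for illustration rather than taken from the stimuli.

```python
def f0_intonation(contour_hz):
    """F0 intonation as defined in the table note: F0max - F0min."""
    return max(contour_hz) - min(contour_hz)


def f0_glide(contour_hz):
    """F0 glide as defined in the table note: F0End - F0Start."""
    return contour_hz[-1] - contour_hz[0]


# Hypothetical contour (Hz) for illustration only; not real stimulus data.
example_contour = [300.0, 350.0, 280.0]
print(f0_intonation(example_contour), f0_glide(example_contour))
```

A falling contour gives a negative glide, which is the direction of the anger values reported in Table 2.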

To create the experimental stimuli for this study, we applied a parameter-specific voice morphing approach to the selected original recordings, using the speech analysis, modification, and resynthesis framework TANDEM-STRAIGHT (Kawahara et al. 2013). TANDEM-STRAIGHT dissects a speech signal in source and filter information; STRAIGHT-based morphing generates highly naturally sounding synthesized voices (for further information, cf. Skuk & Schweinberger 2014; Kawahara & Skuk 2019). We systematically manipulated individual acoustic parameters along a fearful-angry morph continuum, while keeping the respective other acoustic parameters constant at an intermediate 50% morph level (ML). Thus, the relative effects of specific acoustic cues on the perception of vocal emotion expression could be quantified. In the F0 morph type condition, solely the parameter F0 was varied, while the other TANDEM-STRAIGHT acoustic parameters aperiodicity (AP), formant frequencies (FF), spectrum level (SL), and Time (T) were all kept at a 50% ML. 
Conversely, AP, FF, and SL were considered to reflect timbre, in line with previous research (Skuk & Schweinberger 2014) and were systematically varied in the morph type condition Timbre, while F0 and T were kept constant. In the morph type condition Time, we set F0, AP, FF, and SL at a 50% ML, while only T was varied. Note that the Time (T) parameter does not only reflect overall duration but rather interpolations of individual time anchor positions in the stimuli (for more details, cf. Skuk & Schweinberger 2014; Kawahara & Skuk 2019). Finally, in the morph type condition Full, all five parameters were varied. Morphed test voices were created at six MLs in steps of 20% from 0/100% (anger/fear; equivalent to a fearful voice) to 100/0% morph (equivalent to an angry voice). Please refer to Table 2 for stimulus characteristics of the continuum endpoints. Altogether, 384 stimuli (four speakers × four pseudowords × four morph types × six MLs) were presented in the experiment. Mean duration was 808 ms (SD = 89 ms, range: 540 to 1017 ms).
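The stimulus design described above can be written down as a small enumeration, which makes the count of 384 stimuli and the per-parameter morph weights easy to check. The following Python sketch is illustrative only. It treats the morph level as the percentage of anger, sets non-varied parameters to 50%, and uses invented speaker labels. It does not call TANDEM-STRAIGHT and is not the authors' stimulus-generation code.

```python
from itertools import product

# The five TANDEM-STRAIGHT parameters named in the text.
PARAMETERS = ("F0", "AP", "FF", "SL", "T")

# Parameters that vary in each morph type; all others stay at the 50% level.
MORPH_TYPES = {
    "Full": {"F0", "AP", "FF", "SL", "T"},
    "F0": {"F0"},
    "Timbre": {"AP", "FF", "SL"},
    "Time": {"T"},
}

MORPH_LEVELS = (0, 20, 40, 60, 80, 100)  # percent anger (0 = fearful endpoint)
SPEAKERS = ("f1", "f2", "m1", "m2")      # hypothetical speaker labels
PSEUDOWORDS = ("belam", "namil", "molen", "loman")


def parameter_weights(morph_type, morph_level):
    """Return the percent-anger weight applied to each acoustic parameter."""
    varied = MORPH_TYPES[morph_type]
    return {p: (morph_level if p in varied else 50) for p in PARAMETERS}


def build_design():
    """Enumerate all stimuli as (speaker, pseudoword, morph type, ML, weights)."""
    return [
        (spk, word, mtype, ml, parameter_weights(mtype, ml))
        for spk, word, mtype, ml in product(
            SPEAKERS, PSEUDOWORDS, MORPH_TYPES, MORPH_LEVELS
        )
    ]


design = build_design()
print(len(design))
print(parameter_weights("Timbre", 80))
```

In this representation, a Timbre stimulus at the 80% level carries 80% anger on AP, FF, and SL while F0 and T remain at 50%, so any systematic response change across levels can be attributed to timbre alone.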

Experimental Setting

All participants performed the experiment using the same technical equipment. This included a Lenovo ThinkPad R500 notebook with a 32-bit operating system, an Intel Core Duo Mobile processor T5870 (2.0 GHz), 800 MHz FSB, 2 MB L2-Cache, and a 39.1 cm (15.4′′) TFT display. Voice stimuli were presented binaurally at a presentation level of approximately 70 dB SPL, as measured with a Brüel and Kjær Precision Sound Level Meter Type 2206, using two Logitech loudspeakers (230 V ~ 50 Hz 40 mA). All participants were tested individually in a sound-attenuated room (~4 m²). They were sitting on a comfortable chair 1 m away from the notebook monitor. Loudspeakers were placed next to the monitor.

Procedure

Experimental sessions lasted about 60 minutes for CI users and 30 minutes for NH individuals. Both groups filled in a number of paper-and-pencil questionnaires, including a written self-report questionnaire on demographic data. CI users further answered questions regarding their personal experience with their CIs and subjective causes of hearing loss. In addition, the CI users filled in a version of the 60-item Nijmegen Cochlear Implant Questionnaire* (NCIQ; Hinderink et al. 2000) to evaluate quality of life related to hearing loss. Subsequently, the 384 voice stimuli were presented in a computer experiment programmed with E-Prime 2.0. Unilateral CI users were asked to turn off or remove any hearing aids in the contralateral ear for the duration of this experiment to avoid the contribution of residual hearing. Note that bilaterally implanted CI users were not tested with each CI independently but in the bilateral condition only. While performing the experiment, each CI user was using the processor he or she usually used in daily routine. Experimental instructions were shown on the monitor at the beginning of the experiment to avoid possible interference from the experimenter’s voice. Participants were asked to listen carefully to each voice and to decide as accurately and as fast as possible whether it sounded rather fearful or angry, using the keyboard (“D” for fearful and “L” for angry, German layout). Twenty initial practice trials were presented to ensure that all instructions were fully understood. After the experimenter had ensured that the participant had no remaining questions, experimental trials were presented in six blocks of 64 trials each. All voices were presented once in random order. Self-paced breaks were allowed after each block. A trial started with a black screen with reminders of response labels (“fearful D,” “angry L”) in the upper left and right corners, respectively. 
After 500 ms, a green fixation cross appeared for 500 ms and was replaced by a green question mark, the onset of which coincided with the onset of a voice stimulus. For practice trials, only unambiguous fearful or angry voices (i.e., ML 0% or ML 100%) were presented and participants received automatic feedback about the accuracy of their previous response. For experimental trials, all MLs were included and no feedback was given. In case participants failed to respond within 3000 ms following voice offset, the words “Zu langsam! Bitte reagieren Sie schneller!” (“Too Slow! Please respond faster!”) appeared for 1000 ms. Mean duration of the computer experiment was approximately 28 minutes (M = 27.85 minutes, SD = 8.60 minutes) for CI users and 33 min (M = 32.58 minutes, SD = 15.46 minutes) for NH individuals. The study was approved by the Ethics Committee of Jena University Hospital (Reference Number 5282-10/17).

RESULTS

Here, we only report results that were of primary interest for the aim of this study. Further documents (including Supplemental Figures and Tables, Analysis Scripts, and Raw Data) can be found in the associated OSF Repository (https://osf.io/8pc4y/) (http://links.lww.com/EANDH/A984).

Statistical Analysis

Statistical analyses were performed using R (R Core Team 2020). Both errors of omission (no key press; 0.68% of experimental trials) and trials with individual reaction times < 200 ms (measured from voice onset; 0.04%) were excluded from analyses. We used Epsilon corrections for heterogeneity of covariances throughout where appropriate (Huynh & Feldt 1976) but did not otherwise test for distribution assumptions for performing analyses of variance (ANOVAs) due to the remarkable robustness of ANOVAs to violations from normality (cf. Schmider et al. 2010).
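The two exclusion rules stated above (no key press, or a reaction time below 200 ms from voice onset) amount to a simple trial filter. The sketch below is in Python rather than the R used for the reported analyses. The trial record layout (`rt_ms`, with `None` marking an omission) is an assumption made for illustration and is not taken from the published analysis scripts.

```python
MIN_RT_MS = 200  # threshold from the text, measured from voice onset


def is_valid_trial(trial):
    """Keep a trial only if a key was pressed and the RT is at least 200 ms."""
    rt = trial["rt_ms"]  # None encodes an error of omission (hypothetical coding)
    return rt is not None and rt >= MIN_RT_MS


def exclude_invalid(trials):
    """Return the trials that survive both exclusion rules."""
    return [t for t in trials if is_valid_trial(t)]


def percent_excluded(trials):
    """Percentage of trials removed by the exclusion rules."""
    return 100.0 * (len(trials) - len(exclude_invalid(trials))) / len(trials)


# Invented example records; not data from the study.
example = [
    {"listener": "ci_01", "rt_ms": 1240, "response": "angry"},
    {"listener": "ci_01", "rt_ms": None, "response": None},
    {"listener": "ci_01", "rt_ms": 150, "response": "fearful"},
]
print(len(exclude_invalid(example)))
```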

Vocal Emotion Recognition Performance Is Impaired in Cochlear Implant Users Compared to Normal-Hearing Individuals

We performed an initial 4 × 6 × 2 × 2 × 2 mixed ANOVA on the proportion of “angry”-responses, with within-subject factors morph type (MType: F0, Full, Timbre, Time), ML (0%, 20%, 40%, 60%, 80%, 100%), speaker sex (SpSex: female, male), and between-subject factors listener sex (LSex: female, male) and listener group (LGroup: CI, NH). As this ANOVA did not reveal any main effects or interactions involving LSex (all ps ≥ 0.063), we collapsed data across LSex for all subsequent analyses. It is important to note that the ANOVA showed several two- and three-way interactions involving LGroup (cf. Table 3), revealing significant differences between the CI and the NH group. Relevant three-way interactions involving LGroup included LGroup × MType × ML, F(15,720) = 8.432, p < 0.001, ε = 0.703, η2 = 0.149 (Fig. 1), and LGroup × ML × SpSex, F(5,240) = 6.208, p < 0.001, ε = 0.898, η2 = 0.115. Please note that, as we were not particularly interested in effects of SpSex, reports of effects and interactions involving this factor only appear in Table 3 and Supplemental Material 4.1.2, 4.2, and 5 (http://links.lww.com/EANDH/A984).

TABLE 3.

Results of the 4 × 6 × 2 × 2 ANOVA on the proportion of “angry”-responses with the factors MType, ML, SpSex, and LGroup

Main effects and interactions	F	df	p	η_p²	ε_HF
ML	185.743	5, 240	< 0.001	0.795	0.478
SpSex	11.686	1, 48	0.001	0.196
LGroup × ML	24.909	5, 240	< 0.001	0.342	0.478
MType × ML	54.647	15, 720	< 0.001	0.532	0.703
LGroup × MType × ML	8.432	15, 720	< 0.001	0.149	0.703
LGroup × ML × SpSex	6.208	5, 240	< 0.001	0.115	0.898
MType × ML × SpSex	2.802	15, 720	< 0.001	0.055

ANOVA, analysis of variance; df, degrees of freedom; LGroup, listener group; ML, morph level; MType, morph type; SpSex, speaker sex.
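As a consistency check on Table 3, partial eta squared can be recovered from each F value and its degrees of freedom with the standard identity η_p² = (F × df1) / (F × df1 + df2). The Python sketch below applies that identity to the uncorrected degrees of freedom listed in the table. It assumes the effect-size column is partial eta squared, as the column header indicates.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (effect, F, df_effect, df_error) as listed in Table 3.
TABLE_3 = [
    ("ML", 185.743, 5, 240),
    ("SpSex", 11.686, 1, 48),
    ("LGroup x ML", 24.909, 5, 240),
    ("MType x ML", 54.647, 15, 720),
    ("LGroup x MType x ML", 8.432, 15, 720),
    ("LGroup x ML x SpSex", 6.208, 5, 240),
    ("MType x ML x SpSex", 2.802, 15, 720),
]

for effect, f_value, df1, df2 in TABLE_3:
    print(effect, round(partial_eta_squared(f_value, df1, df2), 3))
```

The recomputed values agree with the tabulated effect sizes to three decimals.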

Fig. 1.

The proportion of “angry”-responses for different morph levels and morph types used in the experiment, separately for normal-hearing individuals and CI users. Note that steeper slopes represent better performance. Error bars represent SEM. Best-fitting cumulative Gaussian functions are also shown. CI, cochlear implant; SEM, standard error of the mean.

As expected, and as indicated by steeper gradients of classification performance across MLs, visual inspection of Figure 1 suggests that vocal emotion recognition performance generally is much more accurate in NH individuals than in CI users.
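The link between slope and performance in Figure 1 can be made concrete by fitting a cumulative Gaussian to the proportion of “angry”-responses across morph levels: a smaller fitted standard deviation means a steeper curve. The Python sketch below uses a plain least-squares grid search as a stand-in, since the fitting procedure used for the figure is not described in this section. The function names, the integer grid, and the synthetic data are all assumptions for illustration.

```python
import math

LEVELS = (0, 20, 40, 60, 80, 100)  # morph levels in percent anger


def cumulative_gaussian(x, mu, sigma):
    """Cumulative normal distribution with mean mu and SD sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_cumulative_gaussian(levels, proportions):
    """Least-squares grid search over integer mu (0..100) and sigma (1..200)."""
    best = None
    for mu in range(0, 101):
        for sigma in range(1, 201):
            sse = sum(
                (cumulative_gaussian(x, mu, sigma) - p) ** 2
                for x, p in zip(levels, proportions)
            )
            if best is None or sse < best[0]:
                best = (sse, mu, sigma)
    return best[1], best[2]


# Synthetic response curves (not study data): one steep, one shallow.
steep = [cumulative_gaussian(x, 50, 20) for x in LEVELS]
shallow = [cumulative_gaussian(x, 50, 80) for x in LEVELS]
print(fit_cumulative_gaussian(LEVELS, steep))
print(fit_cumulative_gaussian(LEVELS, shallow))
```

A listener responding near guessing level would produce an almost flat curve, which such a fit captures as a very large sigma.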

Cochlear Implant Users Make Disproportional Use of Timbre Information

Another observation from Figure 1 is that while F0 and timbre cues appear to make virtually identical contributions to performance in normal hearers, timbre cues appear to be more efficiently processed in CI users. To follow up significant interactions with LGroup (cf. Table 3), we performed subsequent statistical analyses separately for both groups. First, we assessed pairwise differences between morph types within the group of CI users at a global level by performing three separate 2 × 6 repeated-measures ANOVAs with the factors MType and ML. For these, we contrasted adjacent morph types with each other, where “adjacent” was defined by decreasing performance levels between morph types. The three pairwise comparisons thus involved contrasts between Full and Timbre, Timbre and F0, and F0 and Time. For all these contrasts, the ANOVAs revealed significant main effects of ML, Fs(5,120) ≥ 7.566, ps < 0.001, ε ≤ 0.801, η2 ≥ 0.240, but no significant main effects of MType, Fs(1,24) ≤ 1.596, ps ≥ 0.219, η2 ≤ 0.062. Importantly, interactions of MType × ML were found for the contrast between Full and Timbre, F(5,120) = 2.979, p = 0.014, η2 = 0.110, Timbre and F0, F(5,120) = 4.200, p = 0.005, ε = 0.725, η2 = 0.149, and F0 and Time, F(5,120) = 3.039, p = 0.022, ε = 0.789, η2 = 0.112. Accordingly, performance was best for Full morphs, and the results demonstrated a superior performance for Timbre morphs compared to F0 morphs in CI users. Time cues were less efficient than F0 cues to solve the task. Finally, further analyses (cf. Supplemental Material 4.1.1.1, http://links.lww.com/EANDH/A984) confirmed that while effects of morph type were significant for the more extreme MLs (0%, 20%, 80%, and 100%), they were not significant for the intermediate MLs, as expected.

Timbre and F0 Cues Are Equally Efficient in Normal-Hearing Individuals

Analogous to the analysis performed for the CI users, we first assessed pairwise differences between morph types within NH individuals by computing three separate 2 × 6 repeated-measures ANOVAs with the factors MType and ML. Again, we contrasted morph types Full and Timbre, Timbre and F0, and F0 and Time. For all these contrasts, significant main effects of ML, Fs(5,120) ≥ 72.162, ps < 0.001, η2 ≥ 0.750, were found. For the contrast between Timbre and F0, we moreover found a significant main effect of MType, F(1,24) = 4.712, p = 0.040, η2 = 0.164; there was no significant main effect of MType for the contrasts between Full and Timbre and between F0 and Time, Fs(1,24) ≤ 1.311, ps ≥ 0.263, η2 ≤ 0.052. Most importantly, the ANOVAs revealed interactions of MType × ML for the contrast between Full and Timbre, F(5,120) = 27.120, p < 0.001, η2 = 0.531, and between F0 and Time, F(5,120) = 75.299, p < 0.001, η2 = 0.758, but not between Timbre and F0, F(5,120) = 0.278, p = 0.924, η2 = 0.011. These results confirm the impression from Figure 1: while in CI users, timbre cues were more efficient than F0 cues to solve the task, NH individuals made equally efficient use of timbre and F0 cues. Finally, further analyses (cf. Supplemental Material 4.1.1.2, http://links.lww.com/EANDH/A984) confirmed that NH listeners’ performance for F0 and Timbre morphs did not differ significantly from each other at any ML. In addition, and at variance with the results for CI users, these analyses indicated some sensitivity of NH listeners to Full morphs at the intermediate 40% and 60% MLs, which should contain only relatively ambiguous vocal emotional information.

High-Performing Cochlear Implant Users Rely on Timbre Almost As Efficiently As Normal-Hearing Individuals Do, But Still Perform Lower When Having to Rely on F0

Since a visual inspection of the individual Gaussian fits on the proportion of “angry”-responses indicated considerable individual differences between CI users (see Supplemental Material 4.3.2.1, http://links.lww.com/EANDH/A984), the CI group was separated into two performance groups (PerfGroups), using as a cutoff the median of DEVall, the deviation of individual CI performance from the average performance of the NH group: the high-performing CI users (n = 13) and the low-performing CI users (n = 12). The performance measure deviation (DEV) indicates how much a CI user’s performance deviates from the average performance of the NH individuals. The smaller the DEV for a given CI user, the more similar her/his performance is to the average performance of NH individuals. In that sense, smaller DEV scores indicate better performance (for a similar approach, see Fuller et al. 2014a; Skuk et al. 2020). For each CI user, we calculated DEV as follows: (1) For each stimulus of the experiment, we calculated how “angry” it was perceived to be on average across all NH individuals, that is, stimAngAVG. (2) Then, for each CI user and stimulus separately, we subtracted the performance of the CI user from stimAngAVG and took the absolute value of the result to obtain a difference measure for each stimulus that is independent of the polarity of the difference. (3) The DEV for a given CI user is then the mean of these absolute differences across all stimuli. We calculated DEV for all stimuli of all morph types together (DEVall) and also separately for the stimuli of individual morph types (that is, DEVFull, DEVF0, DEVTimbre, DEVTime).

A 4 × 6 × 2 mixed ANOVA on the proportion of “angry”-responses with the within-subject factors MType and ML and the between-subject factor PerfGroup (high-performing CI, low-performing CI) revealed main effects of PerfGroup, F(1,23) = 9.316, p = 0.006, η_p² = 0.288, and ML, F(5,115) = 40.239, p < 0.001, ε = 0.698, η_p² = 0.636. These were qualified by several interactions (cf. Table 4) that were not tested post hoc, as the two CI PerfGroups were expected to differ significantly from one another.
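The three DEV steps and the median split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names, the nested-dictionary data layout, and the toy numbers are hypothetical, and ties at the median are assigned to the high-performing group (which, with 25 distinct scores, yields the 13/12 split reported above).

```python
from statistics import mean, median

def stim_ang_avg(nh_responses):
    """Step 1: mean "angry" proportion per stimulus across all NH listeners.

    nh_responses maps listener id -> {stimulus id: proportion of "angry"}.
    """
    stimuli = list(next(iter(nh_responses.values())))
    return {s: mean(r[s] for r in nh_responses.values()) for s in stimuli}

def dev(ci_response, nh_avg, stimuli=None):
    """Steps 2 and 3: mean absolute difference from the NH average.

    Pass a subset of stimuli to obtain a morph-type-specific score
    (e.g., only the Timbre stimuli for DEVTimbre).
    """
    stimuli = list(nh_avg) if stimuli is None else list(stimuli)
    return mean(abs(nh_avg[s] - ci_response[s]) for s in stimuli)

def median_split(dev_scores):
    """Split CI users at the median DEV; smaller DEV means better performance.

    Assumption: users exactly at the median count as high-performing.
    """
    cutoff = median(dev_scores.values())
    high = sorted(u for u, d in dev_scores.items() if d <= cutoff)
    low = sorted(u for u, d in dev_scores.items() if d > cutoff)
    return high, low

# Hypothetical toy data: two NH listeners, three CI users, two stimuli.
nh = {"nh1": {"s1": 0.0, "s2": 1.0}, "nh2": {"s1": 0.2, "s2": 0.8}}
ci = {
    "ci1": {"s1": 0.1, "s2": 0.9},
    "ci2": {"s1": 0.5, "s2": 0.5},
    "ci3": {"s1": 0.3, "s2": 0.7},
}

nh_avg = stim_ang_avg(nh)
dev_scores = {user: dev(resp, nh_avg) for user, resp in ci.items()}
high_group, low_group = median_split(dev_scores)
```

In this toy example, the user who always responds at 0.5 (pure guessing) receives the largest DEV and ends up in the low-performing group.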

TABLE 4.

Two-way interactions of the 4 × 6 × 2 ANOVA on the proportion of “angry”-responses with the factors MType, ML, and PerfGroup, including both the high-performing and the low-performing CI users

Main effects and interactions	F	df	p	η_p²	ε_HF
MType × ML	10.823	15, 345	< 0.001	0.320	0.738
PerfGroup × ML	15.066	5, 115	< 0.001	0.396	0.698

ANOVA, analysis of variance; CI, cochlear implant; df, degrees of freedom; ML, morph level; MType, morph type; PerfGroup, performance group.
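As a consistency check, the partial eta squared values reported for this ANOVA can be recovered from each F value and its degrees of freedom using the standard identity η_p² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the two main effects and the two interactions of Table 4; the function name is illustrative. Because the Huynh-Feldt ε scales both degrees of freedom by the same factor, the correction does not change η_p².

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (effect, F, df_effect, df_error, reported partial eta squared)
reported = [
    ("PerfGroup", 9.316, 1, 23, 0.288),
    ("ML", 40.239, 5, 115, 0.636),
    ("MType x ML", 10.823, 15, 345, 0.320),
    ("PerfGroup x ML", 15.066, 5, 115, 0.396),
]

recomputed = {
    name: round(partial_eta_squared(f, df1, df2), 3)
    for name, f, df1, df2, _ in reported
}
```

All four recomputed values agree with the reported ones to three decimals.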

However, visual inspection of Figure 2 suggests that the high-performing CI users exhibited a pattern of results similar to the NH individuals, while the low-performing CI users seemed to be close to guessing level.

Fig. 2.

The proportion of “angry”-responses for different morph levels and morph types used in the experiment, separately for the normal-hearing individuals, the high-performing CI users (n = 13), and the low-performing CI users (n = 12). Note that steeper slopes represent better performance. Error bars represent SEM. Best-fitting cumulative Gaussian functions are also shown. CI, cochlear implant; SEM, standard error of the mean.

To compare the performance of the high-performing CI users with that of NH individuals, we calculated a 4 × 6 × 2 × 2 mixed ANOVA on the proportion of “angry”-responses with the within-subject factors MType, ML, and SpSex and the between-subject factor PerfGroup (high-performing CI, NH). Because we were interested only in group differences here (refer to Table 5 for the full display of effects), we focused on the interaction PerfGroup × MType × ML, F(15,540) = 2.840, p = 0.001, ε = 0.759, η_p² = 0.073.

TABLE 5.

Results of the explorative 4 × 6 × 2 × 2 ANOVA on the proportion of “angry”-responses with the factors MType, ML, SpSex, and PerfGroup, including the high-performing CI users and the normal-hearing individuals

Main effects and interactions	F	df	p	η_p²	ε_HF
ML	284.578	5, 180	< 0.001	0.888	0.637
SpSex	6.594	1, 36	0.015	0.155
PerfGroup × ML	7.298	5, 180	< 0.001	0.169	0.637
MType × ML	69.354	15, 540	< 0.001	0.658	0.759
ML × SpSex	2.360	5, 180	0.042	0.062
MType × SpSex	2.705	3, 108	0.049	0.070
PerfGroup × MType × ML	2.840	15, 540	0.001	0.073	0.759
PerfGroup × ML × SpSex	3.651	5, 180	0.004	0.092
MType × ML × SpSex	2.976	15, 540	0.001	0.076	0.796

ANOVA, analysis of variance; CI, cochlear implant; df, degrees of freedom; ML, morph level; MType, morph type; PerfGroup, performance group; SpSex, speaker sex.

We tested this interaction post hoc by comparing each MType between high-performing CI users and NH individuals. To this end, we calculated four separate ANOVAs, one per MType, with the factors ML and PerfGroup (PerfGroup: high-performing CI, NH). Importantly, we found no differences between high-performing CI users and NH individuals for timbre and timing (cf. Figure S8 in the Supplemental Material, http://links.lww.com/EANDH/A984), as all main effects and interactions involving PerfGroup were nonsignificant (ps ≥ 0.217). However, some differences between high-performing CI users and NH individuals were found for Full and F0 (cf. Figure S8 in the Supplemental Material, http://links.lww.com/EANDH/A984; refer to Table 6 for full statistics).

TABLE 6.

Results of the 6 × 2 ANOVAs for the morph types F0 and Full on the proportion of “angry”-responses with the factors ML and PerfGroup, including the high-performing CI users and the normal-hearing individuals

Main effects and interactions	F	df	p	η_p²	ε_HF
Full
ML	405.728	5, 180	< 0.001	0.919	0.646
PerfGroup × ML	6.684	5, 180	< 0.001	0.157	0.646
F0
ML	80.518	5, 180	< 0.001	0.691	0.643
PerfGroup × ML	8.037	5, 180	< 0.001	0.183	0.643

ANOVA, analysis of variance; CI, cochlear implant; df, degrees of freedom; ML, morph level; PerfGroup, performance group.

For F0, independent-sample two-tailed t tests revealed significant differences between high-performing CI users and NH individuals, reflecting better performance for NH listeners, for 20%, 60%, 80%, and 100% MLs, |ts(74)| ≥ 2.514, ps ≤ 0.014, but no significant differences for 0% and 40% ML, |ts(74)| ≤ 1.473, ps ≥ 0.145. For Full, better performance for NH listeners was found for 0%, 80%, and 100% MLs, |ts(74)| ≥ 2.696, ps ≤ 0.009, with no significant differences for the more ambiguous 20%, 40%, and 60% MLs, |ts(74)| ≤ 1.940, ps ≥ 0.056. In summary, the results indicate that high-performing CI users used timbre about as efficiently as NH individuals did. For Time, both high-performing CI users and NH individuals were close to guessing level. Full and F0, however, were used to a lesser extent by high-performing CI users than by NH individuals. Please also refer to Supplemental Material 4.3.1 (http://links.lww.com/EANDH/A984) for analyses of the slopes of the cumulative Gaussians, which mirrored and thus supported these results.
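To illustrate what a slope analysis of cumulative Gaussian fits involves, the sketch below fits a cumulative Gaussian to the proportion of “angry”-responses across the six morph levels and derives the slope at the curve's midpoint. It is a hedged illustration only: the fitting procedure used in the Supplemental Material is not described here, so a plain least-squares grid search stands in for it, the function names are hypothetical, the data are synthetic, and the sketch assumes that “angry”-responses increase with morph level.

```python
import math

MORPH_LEVELS = [0, 20, 40, 60, 80, 100]  # morph levels in percent

def cumulative_gaussian(x, mu, sigma):
    """Cumulative normal distribution with midpoint mu and spread sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def fit_cumulative_gaussian(levels, proportions):
    """Least-squares grid search over mu (0..100) and sigma (1..200).

    Assumption: a coarse integer grid replaces whatever optimizer
    was used for the published fits.
    """
    best_params, best_sse = None, float("inf")
    for mu in range(0, 101):
        for sigma in range(1, 201):
            sse = sum(
                (cumulative_gaussian(x, mu, sigma) - p) ** 2
                for x, p in zip(levels, proportions)
            )
            if sse < best_sse:
                best_params, best_sse = (mu, sigma), sse
    return best_params

def slope_at_midpoint(sigma):
    """Slope of the fitted curve at x = mu, in proportion per morph-level percent."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# Synthetic listener generated from known parameters (mu = 50, sigma = 20).
synthetic = [cumulative_gaussian(x, 50, 20) for x in MORPH_LEVELS]
mu_hat, sigma_hat = fit_cumulative_gaussian(MORPH_LEVELS, synthetic)
slope_hat = slope_at_midpoint(sigma_hat)
```

A smaller fitted sigma gives a steeper slope, which corresponds to better discrimination; a listener responding near guessing level produces a nearly flat function and therefore a large sigma.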

The Cochlear Implant Users’ Ability to Perceive Vocal Emotions Is Positively Correlated With Quality of Life

To explore relations between the CI users’ ability to perceive vocal emotion and perceived quality of life, the performance measures DEVall, DEVFull, DEVF0, DEVTimbre, and DEVTime were correlated with the scores of the NCIQ (i.e., the NCIQ total score and its subscores). Considering the directed hypothesis on the relationship between vocal emotion perception and quality of life (see Introduction), correlations were tested one-tailed. Because smaller DEV scores correspond to better performance, we expected negative correlations. As expected, DEVall was negatively related to the total score of the NCIQ, r = –0.390, p = 0.027, n = 25, indicating that the CI users’ ability to perceive the emotional expression in voices and perceived quality of life are related. Furthermore, DEVFull, r = –0.375, p = 0.032, n = 25, and DEVTimbre, r = –0.404, p = 0.023, n = 25, were also associated with the total score of the NCIQ. Surprisingly, neither DEVF0, r = –0.320, p = 0.059, n = 25, nor DEVTime, r = –0.319, p = 0.060, n = 25, was significantly related to the total score. We also observed significant relations between DEV scores and each of the NCIQ’s subdomains except for self-esteem; at a descriptive level, relations tended to be most prominent for the subscales of advanced sound perception, speech production, and activity limitations (please refer to Supplemental Material 4.4.1, http://links.lww.com/EANDH/A984 for full details).
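The one-tailed p values above can be checked from r and n alone, using the standard transformation t = r √((n − 2) / (1 − r²)) with n − 2 degrees of freedom. The sketch below does this with the Python standard library, integrating the Student t density numerically with Simpson's rule. The function names are illustrative, and this is a plausibility check on the reported statistics, not the authors' analysis code.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if t >= 0 else 0.5 - area

def one_tailed_p_negative(r, n):
    """One-tailed p for the directed hypothesis of a negative correlation."""
    return t_cdf(t_from_r(r, n), n - 2)

p_dev_all = one_tailed_p_negative(-0.390, 25)  # DEVall vs. NCIQ total
p_dev_f0 = one_tailed_p_negative(-0.320, 25)   # DEVF0 vs. NCIQ total
```

With n = 25, the DEVall correlation falls below the 0.05 threshold, whereas the DEVF0 correlation narrowly misses it, in line with the values reported above.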

DISCUSSION

At a general level, our findings are in line with earlier reports that CI users perform lower in vocal emotion recognition than NH individuals (e.g., Luo et al. 2007; Schorr et al. 2009; Agrawal et al. 2013; See et al. 2013; Volkova et al. 2013; Wiefferink et al. 2013; Jiam et al. 2017; Kim & Yoon 2018; Paquette et al. 2018; Tinnemore et al. 2018; Waaramaa et al. 2018). However, at a more specific level, when considering contributions of fundamental frequency, timbre, and timing cues to the perception of vocal emotion in CI users, the current findings represent an intriguing contrast to previous research: Our results suggest a greater contribution of timbre than F0, at variance with former reports—originating from gender perception tasks—that CI users cannot dependably use timbre (Fuller et al. 2014a). Fuller et al. (2014a) proposed two possible explanations for the poor usage of timbre. First, CI users could be unable to perceive timbre because its representation is not transferred to the auditory nerve due to large excitation fields of adjacent electrodes. Second, timbre representations—even when partially present in the neuronal code—may be too weak or too distorted to be reliably used. Our findings, however, suggest that some CI users can actually rely on timbre just as well as NH individuals can, at least under the conditions of the present experiment. This shows that CI systems, in principle, can efficiently transfer the acoustic parameters defining timbre (here, FF, spectral level information, and AP; cf. Skuk et al. 2020). In addition, recent research using harmonic complex tones suggests an interdependence between aspects of timbre processing (here, spectral slope) and F0 in both NH listeners and CI users (Luo et al. 2019). Concerning the present study on voices, the transmission of individual acoustic parameters defining voice timbre and their combined contributions to emotion perception will require further research. 
Moreover, since the neuronal representation of timbre will inevitably be distorted by the CI, our findings could suggest a remarkable degree of neuronal plasticity in the afferent pathway or the auditory cortex of a substantial proportion of CI users, enabling them to efficiently process timbre in emotional voices.

In the present experiment, vocal emotion recognition performance based on timing cues alone was virtually at chance level. In our opinion, the most likely explanation for this finding is that timing cues were largely uninformative for the specific emotional contrast (fearful versus angry) we tested in this study (cf. Table 2). Crucially, timing cues were used neither by CI users nor by NH listeners. As such, these results are not necessarily in contradiction with other studies proposing that CI users rely more on tempo information than on other cues such as pitch (Kalathottukaren et al. 2015). Thus, timing cues may well be informative for CI users’ recognition and discrimination of other emotional categories or contrasts (e.g., happy versus sad).

Several previous studies using gender perception tasks claim that fundamental frequency is the most robust and salient acoustic parameter for CI users or CI simulations in NH listeners (e.g., Fuller et al. 2014a; Fuller et al. 2014b). Our results, however, suggest difficulties in processing F0 information in some vocal emotions, as even the subgroup of high-performing CI users was at a significant disadvantage compared to NH individuals when judging emotions based on F0 cues alone. Of interest, in a vocal emotion task, Gilbers et al. (2015) reported a bias toward F0 range cues in CI users, whereas mean F0 constitutes a more salient cue for NH individuals. It is important to note that both mean F0 and F0 range were potentially diagnostic for the present emotion contrast (cf. Table 2).
Several other studies investigated pitch perception irrespective of vocal emotions, thus only allowing restricted comparisons with the present study. Sucher and McDermott (2007) studied CI users’ ability to perceive changes in pitch (with a range of 98 to 740 Hz) in complex musical stimuli and observed poor pitch change perception in CI users, broadly in line with the present findings. It is not entirely clear how these results can be reconciled with other findings of relatively preserved pitch perception with a CI (e.g., Fuller et al. 2014b). One possibility is that F0 differences in the present emotion contrast were less salient than F0 differences for other emotions (Banse & Scherer 1996) or for other social signals. For instance, mean F0 in female voices (200 to 220 Hz) is almost twice as high as mean F0 in male voices (100 to 120 Hz; for a review, see Simpson 2009), and F0 alone is efficiently used by CI users in the perception of speaker gender (Skuk et al. 2020).

Overall, findings that inform about the degree to which F0 or timbre information can be perceived by CI users are somewhat inconsistent across studies. One way to account for such discrepancies is by considering the influence of other factors such as the nature of the auditory stimuli (e.g., vowels, words, sentences) or the type of social signal (e.g., speech comprehension, emotion perception, gender perception). For instance, Meister et al. (2016; see also Meister et al. 2020) compellingly argued that the ability of CI users to utilize timbre, while limited for brief stimuli, is relatively preserved for sentences with their larger phonetic variability and suprasegmental information. In our view, more research is also needed to determine the role of the social signal for voice perception in CI users (Schweinberger et al. 2020).
The present data, for example, strongly suggest that the ability to use timbre information can be relatively preserved in a vocal emotion task, even for brief stimuli (bisyllabic pseudowords). The pattern of results from the present experiment also potentially forms a double dissociation relative to the pattern discovered by Skuk et al. (2020). Specifically, Skuk et al. (2020) found CI users’ ability to perceive speaker gender in brief bisyllabic stimuli to be exclusively based on F0, with minimal or no use of timbre, which is directly opposite to the present pattern for vocal emotion perception. Thus, it seems important to consider the type of social signal in tasks that assess nonverbal voice perception abilities in CI users. There is a need for more systematic evidence regarding interactive contributions of stimulus type and social signal to the use of F0 and timbre cues by CI users.

Overall, our data show that some CI users can efficiently process timbre in emotional voices beyond what would be expected based on earlier findings, and despite the fact that CIs degrade prosodic information (e.g., Nakata et al. 2012), probably partially due to their small number of electrodes. We are intrigued by recent evidence that rehabilitation programs may improve the perception of prosody in CI users (Vandali et al. 2015). In their review, Jiam et al. (2017) discuss the possibility that auditory training might transfer to enhance vocal emotion recognition in CI users (e.g., Krull et al. 2012; for a review, see Nussbaum & Schweinberger 2021) but emphasize that much further research into the potential of such training is warranted. This seems particularly relevant because there is increasing evidence that vocal emotion recognition skills in CI users are positively linked to quality of life, both in children (Schorr et al. 2009) and adults, as shown in both the present study and a recent report (Luo et al. 2018).
As a perspective, further research on multimodal emotion perception in CI users also seems promising, especially when considering how current models of face and voice processing emphasize the multisensory nature of emotions (e.g., Young et al. 2020). Differences in CI technology alone may be insufficient to explain the present striking degree of individual differences. It seems more likely that the degree of reorganization triggered by the individual history of sensory deprivation (Ponton et al. 2000; Gordon et al. 2011) promotes speech-related facial processing through cross-modal plasticity, allowing more efficient audiovisual integration after cochlear implantation (e.g., Rouger et al. 2012). Last but not least, further research should aim at delineating the perceptual abilities and strategies that CI users employ when perceiving different types of (social) signals. Ultimately, a better understanding of possibilities and limitations of CI users to perceive different auditory cues and social signals might promote not only an improvement of CI design but also the development of tailor-made perceptual training programs. Together, such a focus on nonverbal aspects of the voice might further enhance social communication and, ultimately, quality of life for CI users. In conclusion, when comparing vocal emotion perception in CI users and NH individuals using parameter-specific voice morphing (Skuk & Schweinberger 2014), CI users were far more efficient in using timbre than F0 information in the present experiment. We also observed an enormous degree of interindividual variability; a subgroup of high-performing CI users relied on timbre cues virtually as efficiently as NH individuals did while showing evidence for reduced usage of F0 information. Thus, in the context of the present vocal emotion task, CIs seem inefficient in conveying emotion based on F0 alone. 
Our results challenge many earlier findings by demonstrating that CI users actually can efficiently use timbre cues in some situations. Moreover, they form a potential double dissociation with a consistent previous pattern of results for voice gender perception, in which CI users exhibit efficient use of F0 but inefficient use of timbre. Accordingly, the current results could indicate that the type of social signal needs to be considered when assessing F0 and timbre perception skills in CI users. The ability to perceive vocal emotions was associated with quality of life. As a perspective, the present findings could inform both perceptual training interventions and improvements in CI technology and ultimately could contribute to enhancing CI users’ social-emotional communication skills.

ACKNOWLEDGMENTS

The authors would like to thank all participants for their time and cooperation in this investigation. Thanks go also to Bettina Kamchen and Kathrin Rauscher for various forms of support during the study.

OPEN PRACTICES

This manuscript qualifies for an Open Data Badge. The data have been made publicly available at https://osf.io/8pc4y/. More information about the Open Practices Badges can be found at https://journals.lww.com/ear-hearing/pages/default.aspx.

63 in total

1. Talker-identification training using simulations of binaurally combined electric and acoustic hearing: generalization to speech and emotion recognition.

Authors: Vidya Krull; Xin Luo; Karen Iler Kirk
Journal: J Acoust Soc Am Date: 2012-04 Impact factor: 1.840

2. Vocal emotion recognition by normal-hearing listeners and cochlear implant users.

Authors: Qian-Jie Fu; John J Galvin
Journal: Trends Amplif Date: 2007-12

3. Interaction Between Pitch and Timbre Perception in Normal-Hearing Listeners and Cochlear Implant Users.

Authors: Xin Luo; Samara Soslowsky; Kathryn R Pulling
Journal: J Assoc Res Otolaryngol Date: 2018-10-30

4. Association of Cognition and Age-Related Hearing Impairment in the English Longitudinal Study of Ageing.

Authors: Jaydip Ray; Gurleen Popli; Greg Fell
Journal: JAMA Otolaryngol Head Neck Surg Date: 2018-10-01 Impact factor: 6.223

5. Children with bilateral cochlear implants identify emotion in speech and music.

Authors: Anna Volkova; Sandra E Trehub; E Glenn Schellenberg; Blake C Papsin; Karen A Gordon
Journal: Cochlear Implants Int Date: 2013-03

6. Speech intonation and melodic contour recognition in children with cochlear implants and with normal hearing.

Authors: Rachel L See; Virginia D Driscoll; Kate Gfeller; Stephanie Kliethermes; Jacob Oleson
Journal: Otol Neurotol Date: 2013-04 Impact factor: 2.311

Review 7. Technological, biological, and acoustical constraints to music perception in cochlear implant users.

Authors: Charles J Limb; Alexis T Roy
Journal: Hear Res Date: 2013-05-07 Impact factor: 3.208

Review 8. Nonverbal auditory communication - Evidence for integrated neural systems for voice signal production and perception.

Authors: Sascha Frühholz; Stefan R Schweinberger
Journal: Prog Neurobiol Date: 2020-11-12 Impact factor: 11.685

9. Normal-Hearing Listeners' and Cochlear Implant Users' Perception of Pitch Cues in Emotional Speech.

Authors: Steven Gilbers; Christina Fuller; Dicky Gilbers; Mirjam Broersma; Martijn Goudbeek; Rolien Free; Deniz Başkent
Journal: Iperception Date: 2015-10-18

10. Discrimination of Voice Pitch and Vocal-Tract Length in Cochlear Implant Users.

Authors: Etienne Gaudrain; Deniz Başkent
Journal: Ear Hear Date: 2018 Mar/Apr Impact factor: 3.570

1 in total

1. Enhancing socio-emotional communication and quality of life in young cochlear implant recipients: Perspectives from parameter-specific morphing and caricaturing.

Authors: Stefan R Schweinberger; Celina I von Eiff
Journal: Front Neurosci Date: 2022-08-25 Impact factor: 5.152
