Cochlear Implant Rate Pitch and Melody Perception as a Function of Place and Number of Electrodes.

Vijay Marimuthu¹, Brett A Swanson², Robert Mannell¹.

Abstract

Six Nucleus cochlear implant recipients participated in a study investigating the effect of place of stimulation on melody perception using rate-pitch cues. Each stimulus was a pulse train delivered on either a single electrode or multiple electrodes sequentially. Four spatial stimulation patterns were used: a single apical electrode, a single mid electrode, a pair of electrodes (apical and mid), and 11 electrodes (from apical to mid). Within one block of trials, all stimuli had the same spatial stimulation pattern, with pulse rate varying from 131 to 262 pps. An additional pulse rate range of 262 to 523 pps was tested with the single-electrode stimuli. Two experimental procedures were used: note ranking; and a modified melodies test with backwards and warp modification. In each trial of the modified melodies test, a familiar melody and a version with modified pitch were presented (in random order), and the subject's task was to select the unmodified melody. There were no significant differences in performance for stimulation on 1, 2, or 11 electrodes, implying that recipients were unable to combine temporal information from different places in the cochlea to give a stronger pitch cue. No advantage of apical electrodes was found: at the lower pulse rates, there were no significant differences between electrodes; and at the higher pulse rates, scores on the apical electrode dropped more than those on the mid electrode.

Keywords: cochlear implants; melody perception; pitch

Year: 2016 PMID: 27094028 PMCID: PMC4871214 DOI: 10.1177/2331216516643085

Source DB: PubMed Journal: Trends Hear ISSN: 2331-2165 Impact factor: 3.293

Cochlear implants can provide three different types of perceptual cues that are all commonly referred to as “pitch.” Place pitch is the percept associated with the place of stimulation, with apical electrodes having a low place pitch, and basal electrodes a high place pitch. Rate pitch is the percept associated with pulse rate. Modulation pitch is the percept associated with the frequency of amplitude modulation applied to a high-rate carrier pulse train. Rate pitch and modulation pitch have similar perceptual characteristics (Kong, Deeks, Axon, & Carlyon, 2009; Landsberger, 2008; McDermott, 2004; McDermott & McKay, 1997; McKay, McDermott, & Clark, 1994), and both can be categorized as forms of temporal pitch. In this article, a ranking task is defined as a task that requires the subject to order the stimuli along a perceptual scale (e.g., a two-interval, two-alternative forced choice task, where the subject is asked to select the stimulus that has the higher pitch). In contrast, a discrimination task is defined as a task where the subject merely has to detect a difference between stimuli, without having to apply any ordering to them (e.g., a four-interval, four-alternative forced choice task, where the subject is asked to select the stimulus that differs from the rest; McDermott, 2004). Neither of these tasks demonstrates that the percept can convey a melody, which is a more restrictive definition of pitch (Plack & Oxenham, 2005). Cochlear implant rate pitch has been shown to satisfy a strict melody-based definition of pitch (McDermott & McKay, 1997; Pijl & Schwarz, 1995a, 1995b). The first research question addressed by the present study was whether rate-pitch perception would improve as more electrodes were stimulated. This could be due to the pitch information being carried on more nerve fibers and then being combined by a central processor. 
Alternatively, it could simply be due to some electrodes being more effective than others, and stimulating more electrodes improved the chances of including better electrodes. An electrode’s effectiveness may depend on its physical distance from the modiolus and the number of surviving nerve fibers in its vicinity (Long et al., 2014). Venter and Hanekom (2014) investigated rate pitch on 1, 2, 5, 9, or 18 electrodes. Using a discrimination task, they found a systematic improvement in performance at rates above 300 Hz as the number of electrodes increased. However, using a pitch-ranking task, the number of electrodes had no effect on performance. Similar results were previously obtained by Carlyon, Deeks, and McKay (2010), who found no significant difference between single-channel stimuli and seven-channel stimuli in a pitch-ranking task, despite three of the four listeners scoring better with the seven-channel stimuli at one or more high rates (≥450 Hz) in a discrimination task. The differing outcomes from discrimination and ranking tasks show that the choice of experimental procedure is important. One possible explanation for these results is that at high rates, a change in pulse rate may produce no change in pitch, but a change in some other perceptual quality (which may have no counterpart in normal hearing). Recently, Penninger, Kludt, Büchner, and Nogueira (2015) compared rate pitch ranking performance with 1, 3, 6, and 11 electrodes. Increasing the number of electrodes had no effect at 100 or 300 Hz but provided a significant improvement in scores at 500 Hz. However, a possible confounding factor was that in the multiple-electrode stimuli, the electrodes were stimulated in a random order on each cycle, causing variations in the interpulse intervals, as illustrated in Figure 1. All interpulse intervals in the 500 Hz single-channel stimulus were 2000 µs. In contrast, the interpulse intervals in the 500 Hz 11-channel stimulus ranged from 1,330 to 2,670 µs. 
In the model for temporal pitch perception proposed by Carlyon, van Wieringen, Long, Deeks, and Wouters (2002), pitch is determined by a weighted sum of the auditory nerve interspike intervals, with the longest intervals receiving more weight. The interspike intervals depend on the interpulse intervals of the stimuli. Because pitch-ranking performance degrades as pulse rate increases (i.e., as interpulse intervals decrease), the performance advantage of the 11-channel stimuli may merely have been due to them containing longer interpulse intervals than the single-channel stimuli. This advantage increased with fundamental frequency, because the time taken to emit a burst of 11 pulses was constant (670 µs) and so became a larger proportion of the fundamental period (6.7% at 100 Hz, 20% at 300 Hz, and 33% at 500 Hz). This may explain why the only significant difference in scores found by Penninger et al. (2015) was between the 11-channel and single-channel stimuli at 500 Hz.
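
The interval arithmetic in the preceding paragraph can be checked with a short sketch. It is illustrative only and not taken from either study: the function names are hypothetical, and it assumes that the burst length is measured from the first to the last pulse onset (10 steps of 67 µs for 11 channels) and that a given electrode can occupy any position within the burst in each period.

```python
def interpulse_interval_range_us(f0_hz, n_channels=11, onset_step_us=67.0):
    """Return (min, max) interpulse interval on one electrode, in microseconds,
    when the electrode order within each burst is re-randomized every period.

    Assumption: the burst spans (n_channels - 1) onset steps.
    """
    period_us = 1e6 / f0_hz
    burst_us = (n_channels - 1) * onset_step_us
    # Shortest interval: the electrode moves from the last slot to the first.
    # Longest interval: the electrode moves from the first slot to the last.
    return period_us - burst_us, period_us + burst_us


def burst_fraction(f0_hz, n_channels=11, onset_step_us=67.0):
    """Burst duration expressed as a fraction of the fundamental period."""
    return (n_channels - 1) * onset_step_us * f0_hz / 1e6
```

Under these assumptions, the 500 Hz case gives the 1,330 to 2,670 µs range quoted above, and the burst fractions at 100, 300, and 500 Hz come out close to the quoted 6.7%, 20%, and 33%.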

Figure 1.

Electrodograms of some stimuli used by Penninger et al. (2015). Each pulse is represented by a vertical line, with the horizontal position indicating the start time (onset) of the pulse and the vertical position indicating the channel number. The upper panel shows one period of the 500 Hz single-channel stimulus, with interpulse intervals of 2000 µs, as indicated. The lower panel shows the 500 Hz 11-channel stimulus, in which the order of the electrodes was different in each period. The time between pulse onsets was 67 µs. In this figure, the electrode order was chosen to demonstrate the maximum (2670 µs) and minimum (1330 µs) interpulse intervals; with randomized order, the interpulse intervals would range between these extremes.

The second research question was whether rate-pitch perception would be better on apical electrodes than on mid electrodes. In normal hearing, only sounds that contain resolved harmonics evoke a strong pitch sensation, and it appears that a particular relationship is required between the place and temporal cues (Loeb, White, & Merzenich, 1983; Oxenham, Bernstein, & Penagos, 2004). Schatzer et al. (2014) asked eight Med-El cochlear implant recipients to perform an adaptive pitch-ranking procedure, where the pulse rate on a single electrode was adjusted to match the pitch of a reference acoustic tone presented to their other ear, which had near-normal hearing. For acoustic tones with frequencies below 300 Hz, subjects gave the most consistent matches for deeply inserted electrodes. Schatzer et al. speculated that “at shallower insertion depths, the increasing mismatch between temporal rate and tonotopic place of stimulation may be too large to elicit a reliable low pitch percept” (p. 34). However, other studies have not supported this view. Baumann and Nobbe (2004) found no difference between apical and basal electrodes in a pulse rate discrimination task. Kong et al. (2009) found no consistent pattern in rate-pitch ranking scores between apical, mid, and basal electrodes. Pijl and Schwarz (1995b) found no difference between apical, mid, and basal electrodes in interval recognition, or interval labeling, or closed-set melody recognition at pulse rates of 200 pps and lower.

Many previous studies have focussed on the upper limit of rate pitch perception (Carlyon et al., 2010; Kong et al., 2009; Macherey, Deeks, & Carlyon, 2011; Penninger et al., 2015; Venter & Hanekom, 2014). Instead, the present study measured performance at lower fundamental frequencies, where rate pitch is most salient and more relevant to the perception of voice pitch and music. A notable improvement over some previous studies is that the present study explicitly tested melody perception to evaluate pitch in its strictest musical sense.

The two melodies used in the present study are represented in Table 1 (also see Figure 4). The pitch of each note is specified as the number of semitones above a reference note. Choosing a different reference note, which adds (or subtracts) a fixed number of semitones to every note in the melody, is known as transposing the melody. Transposition does not affect the identity of the melody; thus, a melody is really defined by its intervals, the differences in pitch between consecutive notes. The contour of a melody is defined as the directions of the pitch changes between consecutive notes (up, down, or no change).

Table 1.

The Two Familiar Melodies Used in the Modified Melodies Test.

Melody	Old Macdonald												Twinkle Twinkle Little Star
Duration (beats)	1	1	1	1	1	1	2	1	1	1	1	2	1	1	1	1	1	1	2	1	1	1	1	1	1	2
Pitch (semitones)	5	5	5	0	2	2	0	9	9	7	7	5	0	0	7	7	9	9	7	5	5	4	4	2	2	0
Intervals (semitones)		0	0	−5	+2	0	−2	+9	0	−2	0	−2		0	+7	0	+2	0	−2	−2	0	−1	0	−2	0	−2
Contour		0	0	−1	+1	0	−1	+1	0	−1	0	−1		0	+1	0	+1	0	−1	−1	0	−1	0	−1	0	−1

Note. For each note, the duration is specified as the number of beats and the pitch is specified as the number of semitones above a reference note. The intervals are calculated by taking the pitch of a note (in semitones) and subtracting the pitch of the preceding note. The contour is the direction of the pitch changes, that is, the sign of each interval; +1 indicates a pitch increase, −1 indicates a pitch decrease, and 0 indicates no change in pitch.
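
The interval and contour rows of Table 1 follow mechanically from the pitch row, as the following minimal sketch shows. The function and constant names are hypothetical and chosen only for illustration; the pitch values are copied from Table 1.

```python
# Pitch of each note in semitones above the reference note (from Table 1).
OLD_MACDONALD = [5, 5, 5, 0, 2, 2, 0, 9, 9, 7, 7, 5]
TWINKLE = [0, 0, 7, 7, 9, 9, 7, 5, 5, 4, 4, 2, 2, 0]


def intervals(pitches):
    """Pitch of each note minus the pitch of the preceding note."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Sign of each interval: +1 (up), -1 (down), or 0 (no change)."""
    return [(i > 0) - (i < 0) for i in intervals(pitches)]


def transpose(pitches, semitones):
    """Shift every note by the same number of semitones."""
    return [p + semitones for p in pitches]
```

Because transposition adds the same constant to every pitch, `intervals(transpose(m, k))` equals `intervals(m)` for any `k`, which is the sense in which a melody is defined by its intervals.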

Figure 4.

Melodies used in the backwards melodies procedure. The top row shows the original melodies, “Old MacDonald” (left) and “Twinkle Twinkle Little Star” (right). The bottom row shows the corresponding backwards melodies. Time (in beats) is indicated on the horizontal axis. The vertical axis shows fundamental frequency on a logarithmic scale (i.e., linear in semitones), with note names indicated.

A melody discrimination task was considered unsuitable for the present study, because a reference melody and its variant may sound different from each other but neither may be recognizable. A familiar melody recognition task has the issue that rhythmic cues alone may be sufficient to identify the melodies. Even without rhythmic cues, Dowling and Fujitani (1971) found that normal-hearing listeners scored far better than chance in a closed-set familiar melody recognition task when the pitch of each melody was changed while preserving its contour. Thus, a high score on a melody recognition task does not necessarily imply accurate perception of musical intervals. Instead, the modified melodies test (Swanson, 2008; Swanson, Dawson, & McDermott, 2009) was chosen for the present study, to measure perception of melodic contour and interval size (as explained in the Methods section). A pitch-ranking task was also included to provide a point of comparison with previous studies.

Methods

Subjects

Six postlingually deafened adult cochlear implant recipients participated in the study (Table 2). Their mean age was 64.3 years (standard deviation [SD] = 14.2), and they had an average of 6.8 years of implant use (SD = 2.9). The study was approved by the human research ethics committees of Macquarie University and the Sydney South West Area Health Service.

Table 2.

Subject Details.

Subject ID	S1	S2	S3	S4	S5	S6
Gender	M	F	M	F	F	M
Age	71	70	65	65	37	78
Years of implant use	4.5	4	7	12	7	6
Etiology	Progressive	Progressive	Progressive	Sudden (Ear Infection)	Ototoxicity	Progressive
Implant type	CI24 RE (ST)	CI24 RE (ST)	CI24 RE (CA)	CI24M	CI24M	CI24 RE (CA)
Pulse rate (Hz)	900	500	900	900	900	900
Processor	CP810	CP810	Freedom	Freedom	Freedom	CP810
Processing strategy	ACE	ACE	ACE	ACE	ACE	ACE

Note. ACE = advanced combination encoder.

Stimuli

All stimuli consisted of biphasic, monopolar (MP1 + 2) pulses, with a phase width of 25 µs and an interphase gap of 8 µs. Each stimulus can be thought of as a melody (a sequence of musical notes), where each note was a pulse train on 1, 2, or 11 electrodes, with the pulse rate on each electrode equal to the fundamental frequency of the note. A single note (one beat) was 300 ms in duration. The same set of electrodes was stimulated in all of the notes within one melody, and in all of the melodies within one block of trials, so that the only cue to pitch was pulse rate. Two ranges of pulse rate were used, as shown in Table 3.

Table 3.

Rate Conditions.

Rate Condition	Note range	Frequency range (Hz)
Octave 3	C3–C4	131–262
Octave 4	C4–C5	262–523
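
The frequency ranges in Table 3 are consistent with equal-tempered tuning. The sketch below assumes a reference of A4 = 440 Hz, which the article does not state explicitly; the function name and the note-offset table are hypothetical.

```python
# Semitone offset of each natural note above C within an octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency_hz(name, octave, a4_hz=440.0):
    """Equal-tempered fundamental frequency of a note such as ("C", 3).

    Assumption: A4 = 440 Hz (standard concert pitch).
    """
    # A4 lies 9 semitones above C4.
    semitones_from_a4 = NOTE_OFFSETS[name] + 12 * (octave - 4) - 9
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)
```

Rounded to the nearest integer, this reproduces the pulse rates in Table 3 for C3, C4, and C5, treating the pulse rate in pps as equal to the fundamental frequency in Hz.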

The four spatial patterns are described in Table 4, and representative stimuli are shown in Figure 2. Electrodes in the Nucleus system are labeled from E22 at the apex to E1 at the base. Some recipients did not use all 22 electrodes, so the stimuli were specified in terms of channel numbers, with the electrodes in use numbered consecutively, starting from 1 for the most apical electrode in use. Channel 1 was E22 for all subjects. Channel 11 was E12 for all subjects except S4, who used E11 (because electrode E13 was deactivated).

Table 4.

Spatial Conditions.

Spatial condition	Number of electrodes	Channel numbers stimulated
Apex	1	1
Mid	1	11
Pair	2	1, 11
Scan	11	1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

Note. Channels are numbered from 1 at the apex.

Figure 2.

Electrodograms of stimuli used in the present study, showing the four spatial patterns: Apex, Mid, Pair, and Scan. Each panel shows two notes, C3 (131 pps) and A3 (220 pps).

In the multiple-electrode stimuli (Pair and Scan), electrodes were stimulated in apical to basal order, and consecutive pulses were separated by the minimal time delay of 12 µs; thus, the time between consecutive pulse onsets was 70 µs. The wide spacing of the two electrodes in the Pair stimuli reduced the chance of any neuron responding to both electrodes, and so the resulting interspike intervals should have been primarily related to the fundamental period (Marozeau, McDermott, Swanson, & McKay, 2015; McKay & McDermott, 1996). For the Scan stimuli, a particular neuron may be stimulated by several neighboring electrodes; thus, the resulting interspike intervals will be closer to the desired fundamental period if the delays between the electrodes are minimized. Venter and Hanekom (2014) referred to this timing pattern as burst mode.

The set of spatial patterns comprising Apex, Pair, and Scan was intended to investigate the hypothesis that rate pitch performance would improve as the number of electrodes increased. The inclusion of the Mid spatial pattern allowed performance with Pair to be compared to performance with each of its two component electrodes alone. For example, if performance with Pair was better than Mid, but no better than Apex, it would imply that performance with multiple electrodes was determined by performance on the best electrode, rather than the number of electrodes stimulated. It also allowed a comparison between Apex and Mid to investigate whether the place of stimulation affected single-channel performance. Each of the four spatial patterns was tested at Octave 3 pulse rates. In addition, the single-electrode stimuli (Apex and Mid) were also tested at Octave 4 pulse rates to investigate possible interactions between pulse rate and place. Thus, there were a total of six conditions.

Stimuli were delivered under the control of a computer, using a Freedom processor connected via a Programming Pod. The software utilized the Nucleus Implant Communicator (NIC) library, Nucleus MATLAB Toolbox (NMT), and Python.
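
The 70 µs onset spacing follows from the pulse parameters given earlier in this section. The sketch below is an illustration only: it does not use the NIC or NMT libraries, the function names are hypothetical, and it assumes that the 12 µs delay runs from the end of one biphasic pulse to the start of the next.

```python
PHASE_WIDTH_US = 25      # each of the two phases
INTERPHASE_GAP_US = 8    # gap between the two phases
MIN_DELAY_US = 12        # assumed: end of one pulse to start of the next


def onset_step_us():
    """Time between consecutive pulse onsets within a burst."""
    pulse_duration = 2 * PHASE_WIDTH_US + INTERPHASE_GAP_US  # 58 us
    return pulse_duration + MIN_DELAY_US


def scan_onsets_us(rate_pps, n_periods, channels=range(1, 12)):
    """List of (onset time in us, channel) for a burst-mode stimulus.

    Channels are stimulated in apical-to-basal order (1, 2, ..., 11)
    at the start of every fundamental period.
    """
    period_us = 1e6 / rate_pps
    step = onset_step_us()
    return [
        (k * period_us + i * step, ch)
        for k in range(n_periods)
        for i, ch in enumerate(channels)
    ]
```

Passing `channels=[1]`, `[11]`, or `[1, 11]` gives the corresponding timing for the Apex, Mid, and Pair patterns under the same assumptions.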

Experimental Procedures

Two experimental procedures were used: note ranking and two variants of the modified melodies test (backwards melodies and warped melodies, as described later). The subjects were familiar with the procedures from previous studies. No trial-by-trial feedback was provided, but at the end of a block of trials, subjects were told the overall score for that block if they inquired. Each subject attended two or three sessions. Each session lasted for up to 2 hr, with approximately 10 min break after an hour.

Loudness adjustment

The clinical units for current in the Nucleus cochlear implant system are referred to as current levels; each current level step is approximately a 2% increase in current (0.16 dB). The current level on each electrode was initially set equal to the maximum comfortable level (MCL) of that electrode in the subject’s usual advanced combination encoder (ACE) map (Figure 3). Subjects were first presented with a 131-pps pulse train on their Apex electrode (i.e., stimulus “C3-Apex”). The current level was adjusted until the stimulus was comfortably loud. The remaining C-note stimuli (C3-Mid, C3-Pair, C3-Scan, C4-Apex, and C4-Mid) were then adjusted to be equal in loudness to C3-Apex. When adjusting the loudness of a multiple-electrode stimulus, a single current level offset was applied to all activated electrodes to maintain the profile of current levels from the subject’s ACE map.

Figure 3.

MCLs for the subjects’ usual ACE map, for the 11 channels used in the stimuli. MCLs = maximum comfortable levels; ACE = advanced combination encoder.

At these pulse rates, loudness changes relatively slowly with pulse rate. In the present study, it was found that little or no current level difference was required to loudness balance the notes C and A (an interval of 9 semitones) within one spatial condition (e.g., C3-Apex and A3-Apex). For comparison, Busby and Clark (1997) found little change in loudness, with constant current level, as pulse rate increased from 71 to 250 pps (an increase of 19 semitones); and Macherey et al. (2011) reported a decrease in MCL of 1.4 dB per decade increase in pulse rate (equivalent to two Nucleus current levels for a 9-semitone increase). Furthermore, it was not practical to loudness balance all of the notes that could be produced in the warped melodies procedure. Therefore, all of the notes in one octave for each condition of the modified melodies procedures were presented at the same current level. The three experimental procedures used the same set of current levels.
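
The conversion from 1.4 dB per decade to roughly two current levels can be reproduced as follows. This is a back-of-envelope sketch with hypothetical function names; it takes the step size of 0.16 dB per current level from the text above and treats it as exact.

```python
import math

DB_PER_CURRENT_LEVEL = 0.16  # approximate step size quoted in the text


def semitones_to_decades(semitones):
    """Convert a pitch interval in semitones to decades of pulse rate."""
    return semitones / 12.0 * math.log10(2.0)


def mcl_change_current_levels(semitones, db_per_decade=1.4):
    """MCL change, in current levels, for a given increase in pulse rate."""
    return db_per_decade * semitones_to_decades(semitones) / DB_PER_CURRENT_LEVEL
```

For a 9-semitone increase, the result is close to two current levels, in line with the equivalence stated in the text.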

Modified Melodies Test

In each trial of the modified melodies test (Swanson, 2008; Swanson et al., 2009), the name of a familiar melody was displayed to the subject, and its opening phrase was presented twice. In one of the presentations, randomly selected, the pitch was deliberately modified. Two types of pitch modification were used, backwards and warp, as explained later. The rhythm was left intact. The subjects’ task was to select the unmodified melody (a two-alternative forced choice task). In other words, the subject had to decide which of the two presentations best matched their memory of the familiar melody. On each trial, a value was randomly selected from the set 0, 1, 2, or 3 semitones, and both the correct and modified versions of the melody were transposed by this amount. This was intended to reduce the likelihood of subjects learning to identify the correct melody by some idiosyncratic characteristic of its notes.

Trials alternated between two melodies, “Old MacDonald” and “Twinkle Twinkle Little Star,” as defined in Table 1. All subjects confirmed that they were familiar with these melodies, often from their childhood, prior to the onset of severe to profound deafness. Both melodies spanned a range of 9 semitones, and with the addition of up to 3 semitones of transposition, the total range was 12 semitones (an octave; Table 3).

In the modified melodies test with backwards modification (referred to as the backwards melodies procedure), the melody notes were reversed in time. This alters the contour of the melodies. To preserve the rhythm of the original melody, repeated notes were first replaced by a single note to give a merged note sequence, as shown in Table 5. The merged note sequence was then reversed in time, and finally, the appropriate notes were split into repeated notes. The resulting melodies are plotted in Figure 4. In both melodies, the final note was the same as the first note, so the pitch of the first note was not a cue. A block of trials comprised 16 trials (2 melodies × 8 repetitions).

Table 5.

Derivation of the Backwards Melodies, Where the Numbers Represent the Number of Semitones Above the Lowest Note in the Sequence.

Melody	Old MacDonald	Twinkle Twinkle
Merged note sequence	[5, 0, 2, 0, 9, 7, 5]	[0, 7, 9, 7, 5, 4, 2, 0]
Reversed merged note sequence	[5, 7, 9, 0, 2, 0, 5]	[0, 2, 4, 5, 7, 9, 7, 0]
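
The derivation in Table 5 can be sketched in a few lines. The merge and reverse steps follow the description in the text; the final split step is an assumption, since the article only says that "the appropriate notes" were split. Here each reversed note is repeated according to the original pattern of repeats, so that the repeated-note rhythm is unchanged. Function names are hypothetical.

```python
from itertools import groupby

OLD_MACDONALD = [5, 5, 5, 0, 2, 2, 0, 9, 9, 7, 7, 5]
TWINKLE = [0, 0, 7, 7, 9, 9, 7, 5, 5, 4, 4, 2, 2, 0]


def merge_repeats(pitches):
    """Collapse runs of repeated notes; return (merged notes, run lengths)."""
    runs = [(pitch, len(list(group))) for pitch, group in groupby(pitches)]
    return [p for p, _ in runs], [n for _, n in runs]


def backwards_melody(pitches):
    """Reverse the merged note sequence, then re-split into repeated notes.

    Assumption: the k-th note of the reversed sequence is repeated as many
    times as the k-th note of the original merged sequence.
    """
    merged, run_lengths = merge_repeats(pitches)
    reversed_merged = merged[::-1]
    result = []
    for pitch, count in zip(reversed_merged, run_lengths):
        result.extend([pitch] * count)
    return result
```

The merged and reversed sequences produced this way match the two rows of Table 5, and the backwards melody has the same number of notes as the original.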

In the modified melodies test with warp modification (referred to as the warped melodies procedure), each note was shifted in pitch by applying a piecewise linear input–output function, as shown in Figure 5. The amount of pitch shift was controlled by a warp factor, which specified the slope of the initial segment of the input–output function. The resulting melodies are plotted in Figure 6. The highest and lowest notes in the melody were unchanged, and the pitch range of the modified melody was the same as the original melody. The warped melodies contained mistuned notes lying between the notes of the musical scale, which cannot be represented in standard musical notation. The main feature of the warp modification is that it changes the sizes of the musical intervals, while keeping the melodic contour intact. For warp factors less than 1, the intermediate notes were shifted downwards in pitch and thus the mean pitch was lower than the original. Conversely, for warp factors greater than 1, the intermediate notes were shifted upwards in pitch, and the mean pitch was higher. Each block of trials utilized two warp factors, one the reciprocal of the other, with their order randomized, so that a subject who responded purely on the basis of the mean pitch would score 50%. A block of trials comprised 16 trials (2 melodies × 2 warp factors × 4 repetitions). Subjects were tested with progressively more difficult pairs of warp factors, starting with warp factors 10 and 0.1, then 4 and 0.25, then 2 and 0.5, giving a total of 48 trials. In several cases, testing was stopped early when the subject scored at chance levels.

Figure 5.

Note input–output relationship for the warp modification with warp factors 2.0 (red) and 0.5 (blue). The dashed line indicates no warp (warp factor = 1) and is used as a visual reference.

Figure 6.

Melodies used in the warped melodies procedure. The middle row shows the original melodies, “Old MacDonald” (left) and “Twinkle Twinkle Little Star” (right). The upper rows show the warped melodies with warp factors greater than one and the lower rows show the warped melodies with warp factors less than one. Axes are the same as Figure 4.

Note Ranking

The goal was to design a pitch-ranking task that would be in some sense comparable to, or at least relevant to, the melody tasks. In the modified melodies test, both melodies spanned a 9-semitone range (C to A), and the set of interval sizes was [1, 2, 5, 7, 9] semitones, with the most common interval being 2 semitones. In the note-ranking task, each note from the set [C, D, G, A] was paired with the remaining notes to form six pairs, as shown in Table 6, giving interval sizes in the set [2, 5, 7, 9] semitones. In each trial, a pair of notes (X, Y) was presented in a three-note melody, either [X, X, Y] or [Y, Y, X]. The subject’s task was to categorize the melody as either rising pitch or falling pitch (a two alternative forced choice procedure). Phrasing the question this way (rather than asking the subject to select the stimulus with the higher pitch) was intended to focus attention on the melodic context, for consistency with the modified melodies test. This phrasing was also used by Houtsma and Smurzynski (1990).

Table 6.

Note Sequences Used in Note-ranking Task.

Note pair	C, D	G, A	D, G	C, G	D, A	C, A
Interval (semitones)	2	2	5	7	7	9
Rising melody (notes)	[0, 0, 2]	[7, 7, 9]	[2, 2, 7]	[0, 0, 7]	[2, 2, 9]	[0, 0, 9]
Falling melody (notes)	[2, 2, 0]	[9, 9, 7]	[7, 7, 2]	[7, 7, 0]	[9, 9, 2]	[9, 9, 0]
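
The six note pairs and their three-note melodies in Table 6 can be generated from the four-note set, as in the following sketch. The names are hypothetical, and the pitch values are semitones above C, as in the table.

```python
from itertools import combinations

# Semitones above C for the four notes used in the note-ranking task.
NOTES = {"C": 0, "D": 2, "G": 7, "A": 9}


def note_pairs():
    """Every pairing of the four notes, with its rising and falling melody."""
    pairs = []
    # NOTES is listed in ascending pitch order, so `low` is always below `high`.
    for (low_name, low), (high_name, high) in combinations(NOTES.items(), 2):
        pairs.append({
            "pair": (low_name, high_name),
            "interval": high - low,
            "rising": [low, low, high],    # [X, X, Y]
            "falling": [high, high, low],  # [Y, Y, X]
        })
    return pairs
```

Six pairs, each presented in two directions with four repetitions, account for the 48 trials in a block.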

The amplitude of the final note was randomly increased or decreased by two current levels to minimize the possible use of loudness cues. The notes were the same duration and played at the same tempo as in the modified melodies test. A block of trials comprised 48 trials (6 note pairs × 2 directions × 4 repetitions).

Results

The results were divided into two subsets for analysis. The first subset of results addressed the hypothesis that stimulating more electrodes would improve pitch performance. It comprised the four spatial patterns (Apex, Mid, Pair, and Scan) at pulse rates in Octave 3, over six subjects. The second subset of results comprised the four single-channel stimuli (Apex and Mid, at pulse rates in Octaves 3 and 4), addressing the issue of rate–place interaction. The Octave 3 Apex and Octave 3 Mid scores appeared in both subsets. The second subset contained only five subjects, as subject S2 did not undertake the Octave 4 pulse rate conditions. The note-ranking scores were aggregated based on the interval size; for example, the “Rank 2” label represents the sum of the scores for note pairs (C, D) and (G, A). For the warped melodies procedure, scores of reciprocal warp factors were summed; for example, the “Warp 10” label represents the sum of the scores for warp factor 10 and warp factor 0.1. As the experiments used two-alternative forced choice procedures, the results (by definition) follow a binomial distribution. Figure 7 (top) is a normal probability plot (sometimes called a Q–Q plot) of the entire set of results, demonstrating a large deviation from the straight line that indicates normality, primarily due to the large number of 100% scores. The arcsine transformation is sometimes applied to proportion-correct data to improve its normality, but it had little benefit on this data set, as shown in Figure 7 (bottom). Although analysis of variance is tolerant of moderate deviations from normality, it is not appropriate when the variance of subgroups differs substantially (“heteroscedasticity”; McDonald, 2014); note that the binomial distribution has zero variance for p = 1.0. Instead, two types of statistical analysis that require fewer assumptions were conducted.

Figure 7.

Normal probability plot of the entire set of percent-correct scores (top) and of the arcsine-transformed scores (bottom).

The first analysis addressed the question of whether an individual subject’s performance differed across stimuli. For this purpose, a contrast set was defined as the set of scores for a single subject, under a single procedure condition (e.g., S1 Rank 9), for the four stimuli in a result subset (each contrast set is plotted as a group of bars in Figures 8 and 12). The sample variance was used to quantify the spread of scores in each contrast set. Monte Carlo simulation was used to calculate the probability that this observed spread in scores was merely due to random fluctuation (Simon, 1997; Swanson, 2008). For each simulation run, four random numbers were generated from the binomial distribution with the same number of trials as that contrast set and with the probability of success for each trial given by the mean proportion correct for that contrast set averaged over the four stimuli (which corresponds to the null hypothesis that the probability of success in a trial was the same for all four stimuli). The sample variance for each simulation run was calculated, and finally a p value for the significance of the spread of scores for that contrast set was estimated as the proportion of 25,000 simulation runs that had a sample variance equal to or larger than the observed sample variance.
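
A minimal sketch of this Monte Carlo procedure is given below. It is not the authors' code: the function names, the fixed random seed, and the use of Python's standard `random` module are assumptions made for illustration. Scores are expressed as numbers of correct trials.

```python
import random


def spread(scores):
    """Integer quantity proportional to the sample variance of the scores.

    n * sum(x^2) - (sum x)^2 equals n * (n - 1) * sample variance, so
    comparing it between sets of the same size is equivalent to comparing
    sample variances, without floating-point rounding.
    """
    n = len(scores)
    return n * sum(s * s for s in scores) - sum(scores) ** 2


def spread_p_value(scores, n_trials, n_runs=25000, seed=0):
    """Estimate the probability that the observed spread arose by chance.

    scores   -- number correct for each stimulus in one contrast set
    n_trials -- number of trials behind each score
    """
    rng = random.Random(seed)
    # Null hypothesis: one common success probability for all stimuli.
    p_null = sum(scores) / (len(scores) * n_trials)
    observed = spread(scores)
    hits = 0
    for _run in range(n_runs):
        # One binomial draw per stimulus, built from Bernoulli trials.
        simulated = [
            sum(rng.random() < p_null for _trial in range(n_trials))
            for _stimulus in scores
        ]
        if spread(simulated) >= observed:
            hits += 1
    return hits / n_runs
```

A contrast set in which all four scores are identical has zero spread and therefore yields a p value of 1, whereas a set such as 16, 8, 16, and 9 correct out of 16 (hypothetical values) yields a very small p value.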

Figure 8.

Scores for individual subjects with Octave 3 pulse rates (“C3” indicates Octave 3). Each contrast set (group of bars) that had a significant spread of scores is indicated by “*” (p < .05) or “**” (p < .01).

Figure 12.

Scores for individual subjects with single-electrode stimuli. “C3” indicates Octave 3, “C4” indicates Octave 4. Each contrast set (group of bars) that had a significant spread of scores is indicated by “*” (p < .05) or “**” (p < .01).

The second analysis applied the nonparametric Friedman test, using the MATLAB Statistics Toolbox (The MathWorks, Inc). The results for each type of procedure were analyzed separately (i.e., note ranking and modified melodies). The percent-correct scores were arranged in a matrix with four columns (one for each stimulus type), where each row was a contrast set. Within each row (contrast set), the four scores were assigned ranks from 1 (lowest score) to 4 (highest score). Tied scores were assigned the average of the ranks that would have been assigned had the scores differed slightly. Next, the mean rank of each column was found. The null hypothesis was that there was no difference in performance between the stimulus types, in which case the expected mean rank of each column would be 2.5 (the mean of [1, 2, 3, 4]). The Friedman test provided the probability (p value) that the observed difference in mean ranks (or greater) would occur by chance if the null hypothesis was true. Pairwise differences between stimuli were subsequently examined with Tukey’s honestly significant difference criterion (using multcompare in MATLAB). The mean ranks and comparison intervals were plotted such that two mean ranks differed significantly (p < .05) if their comparison intervals did not overlap (Figures 10 and 14). Because the scores were not normally distributed, and there were many instances of 100% scores, the mean may not be the best measure of group performance. Therefore, the group median results are also shown (Figures 9 and 13).
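
The ranking step that precedes the Friedman test can be illustrated in plain Python. This sketch covers only the within-row ranking with averaged ties and the column mean ranks; it does not compute the Friedman p value or the Tukey comparison intervals, which the study obtained from MATLAB. Function names are hypothetical.

```python
def average_ranks(row):
    """Rank scores from 1 (lowest) upwards; tied scores share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of scores tied with position i.
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mean_ranks(matrix):
    """Mean rank of each column, where each row is one contrast set."""
    ranked = [average_ranks(row) for row in matrix]
    return [sum(column) / len(ranked) for column in zip(*ranked)]
```

Because the ranks in every row sum to 10, the four column mean ranks always average 2.5; the test asks whether individual columns depart from that value by more than chance would allow.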

Figure 10.

Mean of score ranks (Friedman test) for note ranking (left) and modified melodies (backwards and warped) procedures (right), with Octave 3 pulse rates.

Figure 14.

Mean of score ranks (Friedman test) for note ranking and modified melodies procedures, with single-electrode stimuli.

Figure 9.

Group mean and median scores with Octave 3 pulse rates. The error bars indicate the standard error of the mean.

Figure 13.

Group mean and median scores with single-electrode stimuli. The error bars indicate the standard error of the mean.

Octave 3 Results

Figure 8 shows the scores for each subject, for each procedure, for the four spatial patterns at pulse rates in Octave 3. Two contrast sets showed a significant spread of scores (S1 Rank 7 and S6 Warp 10); however, if there was actually no difference between the four spatial patterns, then finding 2 significant (p < .05) contrast sets out of 46 contrast sets (i.e., 4.4%) is expected by chance. Figure 9 shows the corresponding group mean and median scores.
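The chance argument above can be checked with a short binomial calculation. The sketch below assumes that the 46 contrast sets are independent and that each has a 5% false-positive probability under the null hypothesis; neither assumption is stated in the text, and the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """P(X >= k) for X ~ Binomial(n, alpha), i.e. k or more false positives in n tests."""
    return sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k, n + 1))


# Expected number of "significant" contrast sets if the null hypothesis holds everywhere.
expected_count = 46 * 0.05

# Probability of seeing 2 or more significant contrast sets out of 46 by chance alone.
p_two_or_more = prob_at_least(2, 46)
```

Under those assumptions the expected count is 46 × 0.05 = 2.3 contrast sets, and the probability of observing two or more is well above one half, so the two significant contrast sets are unremarkable.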

Note-ranking results

Ceiling effects were evident for the note-ranking procedure for intervals of 5 or more semitones, with subjects S3, S4, S5, and S6 scoring 100%, yielding group median scores of 100%. The Friedman test found no significant differences between the spatial patterns, as shown in Figure 10, where the mean of the score ranks was close to 2.5 for each stimulus. The group mean and median scores were almost equal for each spatial pattern. Given the lack of difference between spatial patterns, it was appropriate to sum the scores across spatial patterns for further analysis. According to a binomial test, the pooled scores for the 2-semitone interval were significantly greater than chance for subjects S2 (p = .02), S3 (p < .0001) and S4 (p < .0001). As expected, scores improved with increasing interval size. Figure 11 shows the group mean note-ranking scores for the four spatial patterns, replotted as a function of interval size, together with a Weibull-shaped psychometric function that was fitted to the group mean scores averaged across the four spatial patterns. With hindsight, intervals of 3 and 4 semitones would have been more sensitive test conditions; however, all spatial patterns were well fitted by a single psychometric curve, and there was little scope for the performance of the four spatial patterns to differ at the untested intervals.
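The text does not give the equation of the fitted curve. One common Weibull form for a two-alternative task, assumed here purely for illustration, rises from a 50% chance level toward 100% as the interval grows. The parameter values and the name `weibull_2afc` below are hypothetical and are not the fitted values from Figure 11.

```python
from math import exp


def weibull_2afc(interval, threshold, slope, chance=0.5):
    """Proportion correct for an interval size (in semitones) under a Weibull model.

    threshold: interval at which performance reaches about 82% when chance is 0.5.
    slope: steepness of the curve (assumed parameterization, not from the article).
    """
    return chance + (1 - chance) * (1 - exp(-((interval / threshold) ** slope)))


# Hypothetical parameters, chosen only to show the shape of the curve.
curve = {st: round(100 * weibull_2afc(st, threshold=3.0, slope=2.0), 1) for st in range(0, 8)}
```

With a curve of this shape, performance changes fastest near the threshold interval, which is why intermediate intervals such as 3 and 4 semitones are the most sensitive conditions when scores at 2 semitones are low and scores at 5 semitones are at ceiling.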

Figure 11.

Group mean note-ranking scores with Octave 3 pulse rates, as a function of interval size, and a psychometric function fitted to the group mean scores averaged across stimuli.

Modified melodies results

For the modified melodies procedures, some ceiling effects were again apparent, with perfect scores obtained by subject S5 for backwards melodies and subject S3 for Warp 10. In contrast, subject S2 scored at chance levels for Warp 10 and so was not tested at the more difficult warp factors. The four spatial patterns had almost equal group mean and median scores, except at Warp 2, where Pair and Scan were highest. However, the Friedman test found no significant differences between the spatial patterns, as shown in Figure 10, where the mean of the score ranks was again close to 2.5 for each stimulus.

Single-Channel Results

Figure 12 shows the scores for the single-channel stimuli. Of the 38 contrast sets, 7 (18%) showed a significant (p < .05) spread of scores, more than would be expected by chance. In five of these contrast sets (S4 Back, S5 Back, S5 Warp 10, S5 Warp 4, and S6 Warp 4), the Octave 4 Apex score was lower than the others. Subject S6 had three contrast sets with a highly significant spread (p < .01), with the Octave 4 scores generally lower than the Octave 3 scores. Figure 13 shows the corresponding group mean and median scores.

For the note-ranking procedure, ceiling effects were again apparent, with subjects S3, S4, and S5 scoring 100% for intervals of 5 or more semitones, yielding group median scores of 100%. The Friedman test found no significant differences between the four stimuli, as shown in Figure 14(a). Given the lack of difference between stimuli, it was again appropriate to sum the scores across stimuli for further analysis. According to a binomial test, the pooled scores for the 2-semitone interval were significantly greater than chance for all subjects except S6 (S1: p = .046; S3: p < .001; S4: p < .001; S5: p = .002). Figure 15 shows the group mean note-ranking scores for the four stimuli, replotted as a function of interval size, together with a Weibull-shaped psychometric function that was fitted to the group mean scores averaged across the four stimuli. The Octave 4 Apex and Octave 4 Mid mean scores lie below the fitted curve for intervals of 5 or more semitones, but this is almost entirely attributable to a single subject, S6.

Figure 15.

Group mean note-ranking scores with single-electrode stimuli, as a function of interval size, and a psychometric function fitted to the group mean scores averaged across stimuli.

Subjects S1 and S6 were not tested at Warp 2 at Octave 4 pulse rates, as their corresponding scores at Warp 4 were at chance levels. The Octave 4 Apex stimuli had the lowest group mean and median scores in all modified melodies conditions. The Friedman test found a significant difference between stimuli (p < .001). Subsequent pairwise comparisons indicated that the Octave 4 Apex stimuli had significantly lower mean rank than both Octave 3 stimuli (p < .05), as shown in Figure 14(b).

To determine the size of the performance differences (which is not provided by the Friedman test), the mean score for each subject for the four stimuli was averaged across the modified melodies conditions, as shown in Figure 16(a). Comparing Octave 4 scores, the group mean score was 7.6 percentage points lower on the Apex electrode than the Mid electrode, and the group median score was 22 percentage points lower. Examining the change in scores on each electrode as the pulse rates increased from Octave 3 to Octave 4, as shown in Figure 16(b), four of the five subjects (S1, S4, S5, and S6) had a greater drop in scores on the Apex electrode than the Mid electrode. In contrast, S3 showed negligible change in scores on both electrodes. Averaged across subjects, scores on the Apex electrode dropped by 13 percentage points more than scores on the Mid electrode.

Figure 16.

Scores for individual subjects with single-electrode stimuli, averaged across modified melodies procedures, and group mean and median scores.

Discussion

Both note-ranking and backwards melodies procedures rely on contour judgments. Note ranking asked the subject whether the melodic contour was rising or falling. The backwards-melodies procedure required subjects to detect an incorrect melodic contour, that is, to detect changes in the ranking of successive notes. The mean scores for backwards melodies were generally between those for 2- and 5-semitone note ranking, which is consistent with the sizes of the intervals being judged. For example, referring to Figure 4, the first interval in Old MacDonald is −5 semitones, compared to +2 semitones in the backwards version; detecting this difference should be easier than 2-semitone note ranking (where the task is to decide whether the step was +2 or −2 semitones), but more difficult than 5-semitone note ranking. The warped melodies procedure required subjects to judge interval sizes. As can be seen in Figure 6, for the larger warp factors, all the notes approached the extremes of the range (C and A), so that the intervals approached either 0 or 9 semitones; the warped melody could be identified by the absence of intermediate intervals or perhaps simply by the absence of the intermediate notes. Another difference between the procedures is that each note-ranking trial contained a single pitch change, while each modified melodies trial provided multiple opportunities to detect an unexpected interval. But perhaps the most important difference is that the modified melodies task required more than the ability to discriminate between the two presentations; subjects needed to compare each presentation to an internal memory template of the melody. Although the correct version of the melody was presented in every trial, its perception via rate pitch was unlikely to perfectly match the recipient’s memory of the acoustically presented melody from their childhood.
Nevertheless, all subjects obtained high scores on at least one modified melodies condition (all except S2 scored 100% at least once). The cochlear implant recipients in the present study showed relatively good rate-pitch perception ability. For example, they achieved a mean score of 98% for 5-semitone note ranking with the Octave 3 stimuli (Notes D and G; 147 and 196 pps), while those tested by Kong et al. (2009) obtained a mean score of about 85% (Nucleus recipients) and 70% (Med-El recipients) for ranking intervals of 5.2 semitones (35% frequency difference) at a base rate of 100 pps. However, the rate-pitch resolution of the present subjects was substantially worse than the 1-semitone steps used in Western melodies.
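The interval sizes quoted above follow from the equal-tempered relation between a rate ratio and a number of semitones (12 semitones per octave, so one semitone is a ratio of 2^(1/12)). The short Python check below uses only values from the text; the function names are hypothetical.

```python
from math import log2


def semitones_between(rate_low, rate_high):
    """Interval in equal-tempered semitones between two pulse rates (or frequencies)."""
    return 12 * log2(rate_high / rate_low)


def ratio_for_semitones(semitones):
    """Rate ratio corresponding to an interval of the given number of semitones."""
    return 2 ** (semitones / 12)


# Notes D and G in Octave 3, as used for 5-semitone note ranking.
d_to_g = semitones_between(147, 196)

# The 5.2-semitone interval used by Kong et al. (2009).
kong_ratio = ratio_for_semitones(5.2)

# The octave spanned by the Octave 3 pulse rates.
octave_3_span = semitones_between(131, 262)
```

The D to G pair (147 and 196 pps) comes out at just under 5 semitones, a 5.2-semitone interval corresponds to a rate ratio of about 1.35 (the 35% difference quoted for Kong et al., 2009), and 131 to 262 pps spans exactly 12 semitones.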

Effect of Place

The present study was not designed to test the hypothesis that a place–rate match was a prerequisite for good pitch perception. The electrode locations were not available, but they likely corresponded to characteristic frequencies higher than the pulse rates used in this experiment (131–523 Hz). For comparison, McDermott, Sucher, and Simpson (2009) reported that, in a group of 13 Nucleus recipients, the location of the most apical electrode corresponded to characteristic frequencies in the range of 600 to 800 Hz. Instead, the present study investigated whether recipients’ performance on rate-pitch tasks in a melodic context varied between apical and mid electrodes. No difference in performance was found between apical and mid electrodes at Octave 3 pulse rates (131–262 Hz). It is difficult to argue that this was due to the apical electrode in the Nucleus electrode array not being inserted deeply enough, because the same result was seen with deeply inserted Med-El electrodes by Baumann and Nobbe (2004) and Kong et al. (2009). The Octave 3 results are consistent with cochlear implant rate pitch and place pitch being orthogonal perceptual dimensions, as suggested by a multidimensional scaling study by Tong, Blamey, Dowell, and Clark (1983). In a discrimination task with small concurrent place and rate changes, McKay, McDermott, and Carlyon (2000) found results consistent with optimal processing of independent observations. Pijl (1997) presented a reference pulse train and asked recipients to adjust the pulse rate of a comparison pulse train until its pitch matched that of the reference. Although recipients were most accurate when the two pulse trains were on the same electrode, the mean error was still less than a semitone when the two pulse trains differed substantially in electrode place. If place pitch and rate pitch were completely independent, then performance on rate pitch tasks might be equally good on any electrode. 
At Octave 4 pulse rates, the larger drop in performance on the apical electrode is consistent with the hypothesis of Kong et al. (2009) that the upper limit of temporal pitch is higher for more basal electrodes; however, the present study did not attempt to measure this upper limit. Although Macherey et al. (2011) found that a lower MCL was associated with a higher upper limit of rate pitch, no relationship was found between the MCLs of the present subjects and the amount of performance drop; on the contrary, the present subjects exhibited a trend for lower MCLs at the apex (Figure 3).

Effect of Number of Electrodes

The present study provided no evidence that rate pitch perception improves with an increasing number of electrodes, at least for pulse rates in Octave 3 (131–262 Hz). This null result is consistent with the results of previous studies described in the first part of the article. Both Carlyon et al. (2010) and Venter and Hanekom (2014) found that rate pitch ranking performance was independent of the number of electrodes. They found advantages for multiple electrodes only in discrimination tasks at pulse rates higher than 262 Hz, where it seems possible that the cue being used for discrimination was not pitch in its strict melodic sense. Similarly, Penninger et al. (2015) found an effect of number of electrodes only at 500 Hz (not at 100 or 300 Hz) and even then it may have been an artifact of the variation in interpulse intervals. The two hypothetical mechanisms that could have led to better temporal pitch perception on multiple electrodes appear unfounded. First, it appears that cochlear implant recipients are unable to combine rate-pitch cues from different places in the cochlea to give a stronger cue. Second, there was little evidence that some electrodes were more effective than others at conveying temporal pitch cues. At Octave 3 pulse rates, not only were apical and mid electrodes equally effective, but indeed all 11 electrodes used appeared equally effective. This occurred despite substantial variation in MCL across electrodes in four subjects (Figure 3); conversely, subjects S3 and S4 had fairly flat profiles. A lower MCL is thought to reflect a higher density of and closer proximity to excitable nerve fibers; however, it appears that this was not an important factor in rate-pitch performance at Octave 3 pulse rates.

Implications for Cochlear Implant Sound Coding

In the established continuous interleaved sampling (CIS) and ACE sound coding strategies, electrodes are stimulated at a constant pulse rate (typically at least 500 pps), with the current level on each electrode derived from the envelope of a corresponding band-pass filter. These strategies can provide a modulation pitch cue, but it is often ineffective because the modulation is typically shallow and not aligned in time across electrodes (Geurts & Wouters, 2001). Modest improvements in pitch perception have been demonstrated with coding strategies that estimate the fundamental frequency (F0) of the incoming sound and then explicitly modulate the envelope deeply on multiple channels (Green, Faulkner, & Rosen, 2004; Milczynski, Chang, Wouters, & van Wieringen, 2012; Vandali & van Hoesel, 2011, 2012). This approach is here referred to as F0M. A key feature is that the temporal pitch cue on each electrode is aligned in time, making it robust against current spread. Vandali and van Hoesel (2012) also found that pitch-ranking scores for harmonic tones processed by F0M were not significantly different from pitch-ranking scores for pulse rates on a single electrode, which is consistent with the present results. Thus, it appears that rate pitch perception on a single electrode is a good indicator of the pitch perception that can be obtained with sounds processed by F0M. In the fine structure processing (FSP) coding strategy (Arnoldner et al., 2007), the pulse timing on the most apical electrodes is derived from the zero crossings of the corresponding band-pass filters, with the intent of providing rate-pitch cues. The remaining electrodes are stimulated as in CIS. A recent review found no evidence that FSP enables better performance on pitch perception tasks than CIS (Wouters, McDermott, & Francart, 2015). Indeed, FSP provided poorer scores than CIS on a melody discrimination task (Arnoldner et al., 2007).
This may be due to the pulse rate and pulse timing differing across the electrodes, so that current spread causes the neurons to experience an inconsistent mixture of the temporal cues. Schatzer et al. suggested that cochlear implant sound coding strategies should only apply rate-pitch cues to the most deeply inserted electrodes. On the contrary, the present study confirms that recipients can use rate-pitch cues equally well across a wide range of electrodes to make judgments on melodic contour and musical interval size. The present results certainly do not suggest any advantage of apical electrodes.

31 in total

1. Coding of the fundamental frequency in continuous interleaved sampling processors for cochlear implants.

Authors: L Geurts; J Wouters
Journal: J Acoust Soc Am Date: 2001-02 Impact factor: 1.840

2. Correct tonotopic representation is necessary for complex pitch perception.

Authors: Andrew J Oxenham; Joshua G W Bernstein; Hector Penagos
Journal: Proc Natl Acad Sci U S A Date: 2004-01-12 Impact factor: 11.205

3. Pulse rate discrimination with deeply inserted electrode arrays.

Authors: Uwe Baumann; Andrea Nobbe
Journal: Hear Res Date: 2004-10 Impact factor: 3.208

4. Melody recognition and musical interval perception by deaf subjects stimulated with electrical pulse trains through single cochlear implant electrodes.

Authors: S Pijl; D W Schwarz
Journal: J Acoust Soc Am Date: 1995-08 Impact factor: 1.840

5. Stimulating on multiple electrodes can improve temporal pitch perception.

Authors: Richard T Penninger; Eugen Kludt; Andreas Büchner; Waldo Nogueira
Journal: Int J Audiol Date: 2015-01-29 Impact factor: 2.117

6. Contour, interval, and pitch recognition in memory for melodies.

Authors: W J Dowling; D S Fujitani
Journal: J Acoust Soc Am Date: 1971-02 Impact factor: 1.840

7. Spatial cross-correlation. A proposed mechanism for acoustic pitch perception.

Authors: G E Loeb; M W White; M M Merzenich
Journal: Biol Cybern Date: 1983 Impact factor: 2.086

8. Pitch percepts associated with amplitude-modulated current pulse trains in cochlear implantees.

Authors: C M McKay; H J McDermott; G M Clark
Journal: J Acoust Soc Am Date: 1994-11 Impact factor: 1.840

9. Psychophysical studies evaluating the feasibility of a speech processing strategy for a multiple-channel cochlear implant.

Authors: Y C Tong; P J Blamey; R C Dowell; G M Clark
Journal: J Acoust Soc Am Date: 1983-07 Impact factor: 1.840