Carolyn McGettigan, Stuart Rosen, Sophie K Scott.
Abstract
Noise-vocoding is a transformation which, when applied to speech, severely reduces spectral resolution and eliminates periodicity, yielding a stimulus that sounds "like a harsh whisper" (Scott et al., 2000, p. 2401). This process simulates a cochlear implant, where the activity of many thousand hair cells in the inner ear is replaced by direct stimulation of the auditory nerve by a small number of tonotopically arranged electrodes. Although a cochlear implant offers a powerful means of restoring some degree of hearing to profoundly deaf individuals, the outcomes for spoken communication are highly variable (Moore and Shannon, 2009). Some variability may arise from differences in peripheral representation (e.g., the degree of residual nerve survival) but some may reflect differences in higher-order linguistic processing. To investigate this possibility, we used noise-vocoding to explore speech recognition and perceptual learning in normal-hearing listeners tested across several levels of the linguistic hierarchy: segments (consonants and vowels), single words, and sentences. Listeners improved significantly on all tasks across two test sessions. In the first session, individual differences analyses revealed two independently varying sources of variability: one lexico-semantic in nature and implicating the recognition of words and sentences, and the other an acoustic-phonetic factor associated with words and segments. However, consequent to learning, by the second session there was a more uniform covariance pattern concerning all stimulus types. A further analysis of phonetic feature recognition allowed greater insight into learning-related changes in perception and showed that, surprisingly, participants did not make full use of cues that were preserved in the stimuli (e.g., vowel duration).
We discuss these findings in relation to cochlear implantation, and suggest auditory training strategies to maximize speech recognition performance in the absence of typical cues.
Keywords: cochlear implants; individual differences; speech perception
Year: 2014 PMID: 24616669 PMCID: PMC3933978 DOI: 10.3389/fnsys.2014.00018
Source DB: PubMed Journal: Front Syst Neurosci ISSN: 1662-5137
Figure 1. Equation used to estimate psychometric functions describing the relationship between number of bands and speech intelligibility. α, alpha; β, beta; γ, gamma; λ, lambda. “x” in this study was the log of the number of channels in the noise vocoder.
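The equation itself is not reproduced in this text. As a rough illustration only, the sketch below assumes a commonly used four-parameter form, ψ(x) = γ + (1 − γ − λ)·F(x; α, β), with a logistic F. The roles given to the parameters (α as midpoint, β as spread, γ as lower asymptote, λ as lapse rate), the natural-log base, and all function names are assumptions rather than details confirmed by the source.

```python
import math


def logistic(x, alpha, beta):
    """Assumed logistic core F(x; alpha, beta).

    Equals 0.5 when x == alpha; a smaller beta gives a steeper curve.
    """
    return 1.0 / (1.0 + math.exp(-(x - alpha) / beta))


def psychometric(x, alpha, beta, gamma, lam):
    """Assumed form: psi(x) = gamma + (1 - gamma - lam) * F(x; alpha, beta).

    gamma is treated as the lower asymptote and lam as the lapse rate,
    so the curve runs from gamma up to (1 - lam).
    """
    return gamma + (1.0 - gamma - lam) * logistic(x, alpha, beta)


def proportion_correct(n_channels, alpha, beta, gamma=0.0, lam=0.0):
    """Predicted proportion correct for a given number of vocoder channels.

    x is the log of the channel count, as in the caption; the natural log
    used here is an assumption because the base is not stated.
    """
    return psychometric(math.log(n_channels), alpha, beta, gamma, lam)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the curve's shape.
    for channels in (1, 2, 4, 8, 16, 32):
        print(channels, round(proportion_correct(channels, 2.0, 0.5), 3))
```

Under these assumptions, the curve passes through the midpoint between its two asymptotes at x = α, which is why α is a natural candidate for a threshold-style summary of performance.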
Feature matrix for IT analysis of the Consonants task.
| Feature | | | | | | | | | | | | ʃ | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Voicing | + | + | − | + | + | − | + | + | + | − | − | − | − | + | + | + | + |
| Manner | plos | plos | fric | plos | aff | plos | app | nas | nas | plos | fric | fric | plos | fric | app | app | fric |
| Place | bil | alv | lad | vel | paa | vel | alv | bil | alv | bil | alv | paa | alv | lad | lav | pal | alv |
For Voicing, the ‘+’ and ‘−’ signs correspond to present and absent voicing, respectively. For Manner, plos, plosive; fric, fricative; aff, affricate; app, approximant; nas, nasal. For Place, bil, bilabial; alv, alveolar; lad, labiodental; paa, postalveolar; vel, velar; lav, labialized velar; pal, palatal.
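A feature matrix like this one is typically used to pool a stimulus-by-response confusion matrix into feature categories before measuring how much feature information listeners received. The sketch below assumes the classic relative-information-transfer measure (transmitted information divided by stimulus entropy). The source does not spell out its exact computation, so the function names, the stimulus labels `s1` to `s4`, and the counts are all hypothetical.

```python
import math
from collections import defaultdict


def collapse_by_feature(confusions, feature_of):
    """Pool (stimulus, response) counts into (feature value, feature value) counts."""
    pooled = defaultdict(float)
    for (stim, resp), count in confusions.items():
        pooled[(feature_of[stim], feature_of[resp])] += count
    return dict(pooled)


def relative_information_transfer(pooled):
    """Transmitted information T(x; y) in bits, divided by stimulus entropy H(x)."""
    total = sum(pooled.values())
    p_stim = defaultdict(float)
    p_resp = defaultdict(float)
    for (s, r), count in pooled.items():
        p_stim[s] += count / total
        p_resp[r] += count / total
    transmitted = sum(
        (count / total) * math.log2((count / total) / (p_stim[s] * p_resp[r]))
        for (s, r), count in pooled.items()
        if count > 0
    )
    stimulus_entropy = -sum(p * math.log2(p) for p in p_stim.values() if p > 0)
    return transmitted / stimulus_entropy if stimulus_entropy > 0 else 0.0


# Hypothetical voicing labels for four hypothetical stimuli.
voicing = {"s1": "+", "s2": "+", "s3": "-", "s4": "-"}

# Hypothetical counts: s1 and s2 are always swapped, s3 and s4 are always correct.
# Identity is often wrong, but the voicing category is always preserved.
swapped = {("s1", "s2"): 10, ("s2", "s1"): 10, ("s3", "s3"): 10, ("s4", "s4"): 10}

# Hypothetical counts: responses are spread evenly over all four options.
uniform = {(s, r): 5 for s in voicing for r in voicing}

voicing_swapped = relative_information_transfer(collapse_by_feature(swapped, voicing))
voicing_uniform = relative_information_transfer(collapse_by_feature(uniform, voicing))
```

In this toy case the swapped pattern still transmits the voicing feature in full, while the uniform pattern transmits none of it. That difference is what a feature-level analysis can reveal and an overall percent-correct score cannot.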
Feature matrix for IT analysis of the Vowels task.
| Feature | | | | | | | | | | | | əʊ | | ɔ | | ɔɪ | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Height | no | cm-fc | o | om | c | nc-m | cm | nc | o-nc | om | o | m-nc | c | om | o-nc | om-nc | om |
| Backness | f | f-nf | b | f | f | nf-c | f | nf | f-nf | c | b | c-nb | b | b | f-nb | b-nf | b |
| Roundedness | n | n | n | n | n | n | n | n | n | n | y | ny | y | y | ny | yn | n |
| Length | s | l | l | l | l | l | s | s | l | l | s | l | l | l | l | l | s |
| Diphthong? | n | y | n | n | n | y | n | n | y | n | n | y | n | n | y | y | n |
For Height, o, open; no, near-open; om, open-mid; m, mid; cm, close-mid; nc, near-close; c, close. For Backness, b, back; nb, near-back; c, central; nf, near-front; f, front. For Roundedness, y, rounded and n, unrounded. For Length, s, short and l, long. For Diphthong, y, diphthong and n, monophthong. Dashes indicate the separation of the diphthong descriptions into monophthongal elements, in temporal order.
Figure 2. Logistic curves describing group performance on the speech recognition tasks for (A) open-set tasks (sentences and words) and (B) segment tasks (consonants and vowels). Error bars show 95% confidence limits around α.
Figure 3. Mean TNBs (Threshold Number of Bands) for speech recognition across the five tasks, and across the two test sessions. Error bars show ±1 standard error of the mean.
Pearson's correlation coefficients between the five tasks in the experiment, across the two testing sessions.
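| Session 1 | BKB | IEEE | Words | Cons | Vowels |
|---|---|---|---|---|---|
| BKB | – | 0.356 | 0.259 | 0.003 | −0.100 |
| IEEE | | – | 0.323 | 0.069 | −0.056 |
| Words | | | – | 0.417 | 0.331 |
| Cons | | | | – | 0.302 |
| Vowels | | | | | – |

| Session 2 | BKB | IEEE | Words | Cons | Vowels |
|---|---|---|---|---|---|
| BKB | – | 0.277 | 0.333 | 0.299 | 0.236 |
| IEEE | | – | −0.025 | 0.393 | 0.296 |
| Words | | | – | 0.015 | 0.057 |
| Cons | | | | – | 0.317 |
| Vowels | | | | | – |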
Cons, Consonants;
p < 0.10,
p < 0.05.
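For readers who want to see how a pairwise table of this kind is built, the following is a minimal sketch of Pearson's r computed for every pair of tasks from per-listener scores. The function names and the example scores are hypothetical. It does not reproduce the study's data or its significance testing.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence has zero variance")
    return sxy / math.sqrt(sxx * syy)


def correlation_table(scores):
    """Upper-triangle correlations for a dict of task name -> per-listener scores.

    Every list must hold the same listeners in the same order.
    """
    tasks = list(scores)
    return {
        (a, b): pearson_r(scores[a], scores[b])
        for i, a in enumerate(tasks)
        for b in tasks[i + 1:]
    }


# Hypothetical per-listener thresholds for three tasks (not the study's data).
example_scores = {
    "TaskA": [4.1, 5.0, 3.8, 6.2, 4.9],
    "TaskB": [4.4, 5.3, 4.0, 6.0, 5.1],
    "TaskC": [7.9, 6.5, 8.2, 7.0, 6.8],
}
example_table = correlation_table(example_scores)
```

Only the upper triangle is computed because the matrix is symmetric, which matches the layout of the table above.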
Results of factor analyses on individual TNBs (Threshold Number of Bands).
| BKB | 0.605 | |
| IEEE | 0.593 | |
| Words | 0.705 | 0.469 |
| Consonants | 0.558 | |
| Vowels | 0.562 | |
| BKB | 0.520 | 0.344 |
| IEEE | 0.545 | |
| Words | 0.946 | |
| Consonants | 0.642 | |
| Vowels | 0.491 | |
Only factor loadings over 0.3 are shown.
Figure 4. Results of the group IT analysis on consonant perception for (A) Session 1 and (B) Session 2.
Figure 5. Results of the IT analysis on consonant perception, using individual participant data. For each feature, the darker bars show the results for Session 1, and the paler bars show the results for Session 2. Error bars show ±1 standard error of the mean.
Figure 6. Results of the group IT analysis on vowel perception for (A) Session 1 and (B) Session 2.
Figure 7. Results of the IT analysis on vowel perception, using individual participant data. For each feature, the darker bars show the results for Session 1, and the paler bars show the results for Session 2. Error bars show ±1 standard error of the mean. BACK, backness; ROUND, roundedness.