Clémence Bayard, Cécile Colin, Jacqueline Leybaert.
Abstract
Speech perception in both hearing and deaf people involves an integrative process between auditory and lip-read information. To disambiguate the information provided by the lips, manual cues from Cued Speech may be added. Cued Speech (CS) is a system of manual aids developed to help deaf people understand speech clearly and completely through vision alone (Cornett, 1967). Within this system, both labial and manual information remain ambiguous as lone input sources; perceivers therefore have to combine the two types of information to obtain one coherent percept. In this study, we examined how audio-visual (AV) integration is affected by the presence of manual cues, and on which form of information (auditory, labial, or manual) CS receivers primarily rely. To address this issue, we designed an experiment using AV McGurk stimuli (audio /pa/ and lip-reading /ka/) produced with or without manual cues. The manual cue was congruent with either the auditory information, the lip information, or the expected fusion. Participants were asked to repeat the perceived syllable aloud. Their responses were then classified into four categories: audio (when the response was /pa/), lip-reading (when the response was /ka/), fusion (when the response was /ta/), and other (when the response was something other than /pa/, /ka/, or /ta/). Data were collected from hearing-impaired individuals who were experts in CS (all of whom had either cochlear implants or binaural hearing aids; N = 8), hearing individuals who were experts in CS (N = 14), and hearing individuals who were completely naïve to CS (N = 15). Results confirmed that, like hearing people, deaf people can merge auditory and lip-read information into a single unified percept. Without manual cues, McGurk stimuli induced the same percentage of fusion responses in both groups.
Results also suggest that manual cues can modify AV integration and that their impact differs between hearing and deaf people.
Keywords: Cued Speech; audio-visual speech integration; cochlear implant; deafness; multimodal speech perception
Year: 2014 PMID: 24904451 PMCID: PMC4032946 DOI: 10.3389/fpsyg.2014.00416
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Figure 1. Cues in French Cued Speech: hand-shapes for consonants and hand placements for vowels. Adapted from http://sourdsressources.wordpress.com.
CS-deaf group characteristics.
| 1 | 17 | At birth | Unknown | 2 |
| 2 | 21 | 3 years | 3 | 3 |
| 3 | 21 | At birth | 2 | 3 |
| 4 | 14 | At birth | 3 | 2 |
| 5 | 24 | At birth | 3 | 2 |
| 6 | 21 | At birth | 5 | 2 |
| 7 | 16 | At birth | 8 | 2 |
| 8 | 17 | 2 years | 16 | 14 |
Indicates participants with cochlear implants.
TERMO scores by group and participant for Audio-Only, Visual-Only, AV, and Visual with CS (V + CS) cue conditions.
| Group / Participant | Audio | Visual | AV | V + CS | AV − V | (V + CS) − V |
| DEAF CS | 60.38 (22.30) | 35.88 (11.61) | 77.00 (18.40) | 94.75 (5.01) | 41.13 (22.34) | 58.89 (12.29) |
| 1 | 82 | 24 | 82 | 94 | 58 | 70 |
| 2 | 12 | 29 | 41 | 100 | 12 | 71 |
| 3 | 71 | 35 | 94 | 88 | 59 | 53 |
| 4 | 59 | 35 | 88 | 100 | 53 | 65 |
| 5 | 65 | 47 | 94 | 88 | 47 | 41 |
| 6 | 71 | 29 | 76 | 94 | 47 | 65 |
| 7 | 47 | 59 | 59 | 100 | 0 | 41 |
| 8 | 76 | 29 | 82 | 94 | 53 | 65 |
| HEARING CS | 100 (0.00) | 38.14 (9.92) | 100 (0.00) | 88.43 (3.69) | 61.86 (9.2) | 50.29 (9.92) |
| 1 | 100 | 47 | 100 | 88 | 53 | 41 |
| 2 | 100 | 41 | 100 | 88 | 59 | 47 |
| 3 | 100 | 41 | 100 | 88 | 59 | 47 |
| 4 | 100 | 24 | 100 | 82 | 76 | 58 |
| 5 | 100 | 29 | 100 | 88 | 71 | 59 |
| 6 | 100 | 41 | 100 | 82 | 59 | 41 |
| 7 | 100 | 35 | 100 | 88 | 65 | 53 |
| 8 | 100 | 59 | 100 | 94 | 41 | 35 |
| 9 | 100 | 18 | 100 | 94 | 82 | 76 |
| 10 | 100 | 35 | 100 | 88 | 65 | 53 |
| 11 | 100 | 41 | 100 | 94 | 59 | 53 |
| 12 | 100 | 41 | 100 | 88 | 59 | 47 |
| 13 | 100 | 41 | 100 | 88 | 59 | 47 |
| 14 | 100 | 41 | 100 | 88 | 59 | 47 |
| HEARING CONTROL | 99.60 (1.55) | 36.27 (8.39) | 100 (0.00) | 42.33 (10.91) | 63.73 (8.39) | 6.07 (12.70) |
| 1 | 100 | 41 | 100 | 47 | 59 | 6 |
| 2 | 100 | 47 | 100 | 47 | 53 | 0 |
| 3 | 100 | 41 | 100 | 47 | 59 | 6 |
| 4 | 100 | 29 | 100 | 35 | 71 | 6 |
| 5 | 100 | 47 | 100 | 59 | 53 | 12 |
| 6 | 100 | 35 | 100 | 35 | 65 | 0 |
| 7 | 100 | 18 | 100 | 47 | 82 | 29 |
| 8 | 100 | 29 | 100 | 53 | 71 | 24 |
| 9 | 100 | 35 | 100 | 41 | 65 | 6 |
| 10 | 100 | 41 | 100 | 53 | 59 | 12 |
| 11 | 100 | 29 | 100 | 53 | 71 | 24 |
| 12 | 100 | 29 | 100 | 24 | 71 | −5 |
| 13 | 94 | 41 | 100 | 24 | 59 | −17 |
| 14 | 100 | 35 | 100 | 29 | 65 | −6 |
| 15 | 100 | 47 | 100 | 41 | 53 | −6 |
Standard deviations are indicated in parentheses.
Indicates participants with cochlear implants.
Figure 2. Stimulus sample. Video frame of the lip-reading with congruent cue condition (A), the audio-only condition (B), and the audio with congruent cue condition (C).
Stimulus composition of congruent control conditions.
| Condition | /pa/ stimulus | /ta/ stimulus | /ka/ stimulus |
| Audio only | A /pa/ | A /ta/ | A /ka/ |
| Lip-reading only | LR /pa/ | LR /ta/ | LR /ka/ |
| Audio + CS cue | A /pa/ + CS cue coding / | A /ta/ + CS cue coding /m, | A /ka/ + CS cue coding / |
| Lip-reading + CS cue | LR /pa/ + CS cue coding / | LR /ta/ + CS cue coding /m, | LR /ka/ + CS cue coding / |
| Audio visual | A /pa + LR /pa/ | A /ta/ + LR /ta/ | A /ka/ + LR /ka/ |
| AV + CS cue | A /pa/ + LR /pa/ + CS cue coding / | / | / |
Because each CS cue codes several phonemes, the phoneme congruent with the auditory or lip-read information is indicated in bold.
The composition of McGurk stimuli in incongruent conditions.
| Condition | Audio | Lip-reading | CS cue |
| Baseline condition | pa | ka | / |
| Audio condition | pa | ka | |
| Lip-reading condition | pa | ka | |
| Fusion condition | pa | ka | ma, |
Because each CS cue codes several phonemes, the phoneme congruent with the auditory information, the lip-read information, or the expected fusion is indicated in bold.
Mean percentages of correct responses for all groups in Audio-Only and Audio + CS cue conditions.
| Syllable | Deaf CS: A | Deaf CS: A + CS | Hearing CS: A | Hearing CS: A + CS | Control: A | Control: A + CS |
| /pa/ | 85 (18.2) | 93 (12.5) | 100 (0) | 98 (2.4) | 98 (2.1) | 95 (7.1) |
| /ta/ | 62 (21.9) | 70 (23.9) | 100 (0) | 98 (0) | 100 (0) | 100 (0) |
| /ka/ | 59 (29.2) | 93 (9.4) | 100 (0) | 100 (0) | 100 (0) | 100 (0) |
Standard deviations are indicated in parentheses.
Mean percentages of correct responses for all groups in Lip-reading-Only and Lip-reading + CS cue conditions.
| Syllable | Deaf CS: LR | Deaf CS: LR + CS | Hearing CS: LR | Hearing CS: LR + CS | Control: LR | Control: LR + CS |
| /pa/ | 68 (18.8) | 100 (0) | 71 (18.7) | 91 (9.9) | 91 (10.7) | 77 (17.8) |
| /ta/ | 52 (27.1) | 85 (18.2) | 38 (27.8) | 69 (36.9) | 46 (24) | 38 (24.4) |
| /ka/ | 22 (14.6) | 89 (15.6) | 8 (11.0) | 69 (22.9) | 14 (13.5) | 52 (24.9) |
Standard deviations are indicated in parentheses.
Mean percentages of correct responses for all groups in Audio + Lip-reading (LR) and Audio + LR + CS cue conditions.
| Stimulus | Deaf CS | Hearing CS | Hearing control |
| Audio /pa/ + LR /pa/ | 100 (0) | 100 (0) | 100 (0) |
| Audio /ta/ + LR /ta/ | 64 (27.1) | 100 (0) | 100 (0) |
| Audio /ka/ + LR /ka/ | 62 (26.0) | 100 (0) | 100 (0) |
| Audio /pa/ + LR /pa/ + CS /pa/ | 100 (0) | 100 (0) | 100 (0) |
Standard deviations are indicated in parentheses.
Mean percentages of each kind of response (audio, lip-reading, fusion and other) for all groups in incongruent conditions.
| Response | Deaf CS | Hearing CS | Hearing control |
| BASELINE CONDITION (no cue) | | | |
| Resp. audio /pa/ | 8 (14.6) | 17 (20.5) | 27 (28.9) |
| Resp. lip-reading /ka/ | 2 (3.6) | 1 (2.4) | 1 (2.1) |
| Resp. fusion /ta/ | 81 (24) | 78 (20.7) | 70 (29.3) |
| Other response | 9 (10.4) | 2 (4.3) | 2 (2.1) |
| AUDIO CUE CONDITION | | | |
| Resp. lip-reading /ka/ | 2 (3.6) | 0 (0) | 1 (2.1) |
| Resp. fusion /ta/ | 20 (27.1) | 21 (22.5) | 57 (32.9) |
| Other response | 60 (31.2) | 18 (21.5) | 5 (5.8) |
| LIP-READING CUE CONDITION | | | |
| Resp. audio /pa/ | 2 (3.6) | 20 (21.1) | 35 (33.4) |
| Resp. fusion /ta/ | 25 (22.9) | 33 (24.1) | 61 (30.4) |
| Other response | 13 (18.7) | 6 (7.9) | 2 (2.1) |
| FUSION CUE CONDITION | | | |
| Resp. audio /pa/ | 0 (0) | 16 (23.7) | 35 (33.8) |
| Resp. lip-reading /ka/ | 0 (0) | 0 (0) | 1 (2.1) |
| Other response | 9 (10.4) | 9 (13.8) | 3 (3.9) |
Standard deviations are indicated in parentheses. Audio, lip-reading, or fusion responses congruent with the CS cue information are indicated in bold.
Figure 3. CS perception models. (A) Sequential model with late integration of the manual cue; (B) sequential model with early integration of the manual cue; (C) simultaneous model with early integration of the manual cue.