Sarah E Gutz, Hannah P Rowe, Victoria E Tilton-Bolowsky, Jordan R Green.
Abstract
Mask-wearing during the COVID-19 pandemic has prompted a growing interest in the functional impact of masks on speech and communication. Prior work has shown that masks dampen sound, impede visual communication cues, and reduce intelligibility. However, more work is needed to understand how speakers change their speech while wearing a mask and to identify strategies to overcome the impact of wearing a mask. Data were collected from 19 healthy adults during a single in-person session. We investigated the effects of wearing a KN95 mask on speech intelligibility, as judged by two speech-language pathologists, examined speech kinematics and acoustics associated with mask-wearing, and explored KN95 acoustic filtering. We then considered the efficacy of three speaking strategies to improve speech intelligibility: Loud, Clear, and Slow speech. To inform speaker strategy recommendations, we related findings to self-reported speaker effort. Results indicated that healthy speakers could compensate for the presence of a mask and achieve normal speech intelligibility. Additionally, we showed that speaking loudly or clearly (and, to a lesser extent, slowly) improved speech intelligibility. However, using these strategies may require increased physical and cognitive effort and should be used only when necessary. These results can inform recommendations for speakers wearing masks, particularly those with communication disorders (e.g., dysarthria) who may struggle to adapt to a mask but can respond to explicit instructions. Such recommendations may further help non-native speakers and those communicating in a noisy environment or with listeners with hearing loss.
Keywords: Compensation; Face masks; Speaker adaptation; Speaker strategies
Year: 2022 PMID: 35907167 PMCID: PMC9339031 DOI: 10.1186/s41235-022-00423-4
Source DB: PubMed Journal: Cogn Res Princ Implic ISSN: 2365-7464
Stimuli and measures for each protocol and condition
| Protocol | Conditions | Stimulus | Outcome measure |
|---|---|---|---|
| KN95 mask acoustic profile | Mannequin: “No Mask”, “Mask” | Computer-generated white noise | Signal attenuation (Mask minus No Mask); 1/3 octave band analysis |
| Human | Human: “No Mask”, “Mask Only” | Sustained /a/ | Phonatory measures (LHR, duration, F0, shimmer, jitter, HNR) |
| | Human: “No Mask”, “Mask Only”, “Clear + Mask”, “Loud + Mask”, “Slow + Mask” | VAS survey | Speaker effort |
| | | SIT | Transcription intelligibility |
| | | Story read task | Formant measures (F1 and F2 range) |
| | | Spoken paragraph | Kinematic measures (jaw ROM and speed; head ROM and speed) |
| Phonatory compensation to KN95 mask | Human + mannequin: “Masked Human” (maskless mannequin), “Masked Mannequin” (maskless human) | Human-produced sustained /a/ from “No Mask” and “Mask Only” conditions | Phonatory measures (LHR, duration, F0, shimmer, jitter, HNR) |
Fig. 1. KN95 mask attenuation of white noise: 1/3 octave band analysis. Note: Intensity attenuation of KN95 mask on white noise, presented by 1/3 octave bands. Negative values indicate lower intensity when played through the mask compared to noise not played through the mask. Red dashed line = low/high-frequency cutoff A; green dashed line = low/high-frequency cutoff B
KN95 acoustic profile
| Spectrum section | Lower frequency | Upper frequency | Intensity difference, mean (dB) | Intensity difference, SD (dB) |
|---|---|---|---|---|
| Full spectrum | 80 Hz | 16 kHz | −7.47 | 4.60 |
| Low frequencies A | 80 Hz | 4 kHz | −4.16 | 5.91 |
| High frequencies A | 4 kHz | 16 kHz | −8.56 | 3.45 |
| Low frequencies B | 80 Hz | 2.5 kHz | −0.36 | 3.35 |
| High frequencies B | 2.5 kHz | 16 kHz | −8.76 | 3.48 |
| Mask resonance | 178 Hz | 269 Hz | +9.35 | 3.73 |
Average intensity difference between the mask and no mask conditions for white noise played through the mannequin speaker, calculated as mask minus no mask. Lower and upper frequencies indicate the boundaries of each spectrum section
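The "mask minus no mask" difference per 1/3 octave band can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names are hypothetical, the band edges assume the common base-2 definition of a one-third-octave band, and the band powers are made-up numbers rather than study data.

```python
import math

def third_octave_band(center_hz):
    """Return (lower, upper) edges of a base-2 one-third-octave band (assumed definition)."""
    half_width = 2.0 ** (1.0 / 6.0)
    return center_hz / half_width, center_hz * half_width

def level_db(power):
    """Convert a band power in arbitrary linear units to a level in dB."""
    return 10.0 * math.log10(power)

def attenuation_db(mask_power, no_mask_power):
    """Per-band intensity difference in dB, calculated as mask minus no mask."""
    return [level_db(m) - level_db(n) for m, n in zip(mask_power, no_mask_power)]

# Hypothetical band powers for three bands (illustrative only, not study data).
no_mask = [1.0, 1.0, 1.0]
mask = [2.0, 1.0, 0.1]
diffs = attenuation_db(mask, no_mask)
edges_1k = third_octave_band(1000.0)
```

With this sign convention, a negative value means the band is quieter through the mask, and a positive value (as in the mask resonance row) means it is louder.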
Functional impact—intelligibility and effort for human speakers
| Condition | Intelligibility mean (%) | Intelligibility SD (%) | Speaker effort mean | Speaker effort SD |
|---|---|---|---|---|
| Mask only | 89.38 | 12.39 | 0.36 | 0.27 |
| No mask | 84.86 | 14.84 | 0.09 | 0.16 |
| Clear + Mask | 94.25 | 12.87 | 0.74 | 0.21 |
| Loud + Mask | 96.20 | 5.26 | 0.67 | 0.24 |
| Slow + Mask | 92.50 | 11.91 | 0.74 | 0.27 |
Descriptive statistics (mean and standard deviation, SD) for functional measures of transcription intelligibility and speaker effort
Fig. 2. Effect sizes of functional measures relative to mask only. Note: Standardized beta coefficients for transcription intelligibility (blue) and speaker effort (red) for each condition relative to the Mask Only condition. Negative effect size indicates lower values relative to the Mask Only condition; positive effect size indicates higher values relative to the Mask Only condition. Error bars indicate 95% confidence interval. Large effect: |Beta| ≥ 0.95; medium effect: |Beta| ≥ 0.55; small effect: |Beta| ≥ 0.25
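The effect-size thresholds stated in the figure note can be applied mechanically. The helper below is a small sketch: the function name is hypothetical, and the "negligible" label for |Beta| < 0.25 is an assumption, since the note does not name that range.

```python
def effect_size_label(beta):
    """Classify a standardized beta using the thresholds given in the figure note."""
    magnitude = abs(beta)
    if magnitude >= 0.95:
        return "large"
    if magnitude >= 0.55:
        return "medium"
    if magnitude >= 0.25:
        return "small"
    # Label for values below the smallest threshold is an assumption.
    return "negligible"

# Illustrative inputs only; these are not coefficients reported in the study.
labels = [effect_size_label(b) for b in (1.2, -0.6, 0.3, 0.1)]
```

The sign of the coefficient is ignored for classification, matching the absolute-value thresholds in the note; direction is read separately from the sign.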
Mechanism of change—phonatory measures for human speakers
| Measure | Mask only, mean | Mask only, SD | No mask, mean | No mask, SD |
|---|---|---|---|---|
| Low frequencies (dB) | 77.20 | 5.07 | 76.82 | 4.04 |
| High frequencies (dB) | 36.24 | 5.53 | 38.78 | 4.82 |
| Low/high ratio (dB) | 40.96 | 4.39 | 38.04 | 4.82 |
| Duration (s) | 19.63 | 9.32 | 20.33 | 9.78 |
| F0 (Hz) | 189.60 | 43.93 | 184.32 | 45.24 |
| Shimmer (%) | 0.04 | 0.01 | 0.04 | 0.01 |
| Jitter (%) | 0.004 | 0.001 | 0.004 | 0.002 |
| HNR (dB) | 22.26 | 3.14 | 21.01 | 2.70 |
Descriptive statistics (mean and standard deviation, SD) for phonatory measures produced by humans. Low frequencies: 80 Hz–4 kHz; high frequencies: 4–10 kHz
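The tabulated low/high ratio means are consistent with a simple difference of the low- and high-frequency levels in dB. That reading is inferred from the numbers in the table rather than stated in the text, so the sketch below should be taken as a consistency check under that assumption; the function name is hypothetical.

```python
def low_high_ratio_db(low_db, high_db):
    """Low/high ratio expressed in dB, assumed here to be a difference of levels."""
    return low_db - high_db

# Mean levels copied from the table above (human speakers).
mask_only_lhr = round(low_high_ratio_db(77.20, 36.24), 2)
no_mask_lhr = round(low_high_ratio_db(76.82, 38.78), 2)
```

Under this assumption the two values reproduce the tabulated means of 40.96 dB (Mask only) and 38.04 dB (No mask).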
Fig. 3. Effect size of select phonatory measures. Note: Standardized beta coefficients for low/high ratio (blue) and harmonic-to-noise ratio (HNR, red) of Masked Mannequin relative to Masked Human (top) and human-produced No Mask relative to Mask Only (bottom). Negative effect size indicates lower values relative to the reference condition; positive effect size indicates higher values relative to the reference condition. Error bars indicate 95% confidence interval. Large effect: |Beta| ≥ 0.95; medium effect: |Beta| ≥ 0.55; small effect: |Beta| ≥ 0.25
Mechanism of change—phonatory measures played through a mannequin
| Measure | Masked Human, mean | Masked Human, SD | Masked Mannequin, mean | Masked Mannequin, SD |
|---|---|---|---|---|
| Low frequencies (dB) | 76.23 | 4.13 | 74.26 | 4.90 |
| High frequencies (dB) | 42.23 | 6.76 | 30.74 | 4.55 |
| Low/high ratio (dB) | 34.00 | 4.62 | 43.52 | 4.35 |
| Duration (s) | 19.63 | 9.32 | 20.33 | 9.78 |
| F0 (Hz) | 189.26 | 43.73 | 185.10 | 45.13 |
| Shimmer (%) | 0.04 | 0.01 | 0.04 | 0.02 |
| Jitter (%) | 0.006 | 0.001 | 0.005 | 0.004 |
| HNR (dB) | 21.69 | 4.29 | 20.92 | 3.13 |
Descriptive statistics (mean and standard deviation, SD) for phonatory measures produced by humans and played through the mannequin. Low frequencies: 80 Hz–4 kHz; high frequencies: 4–10 kHz
Mechanism of change—formant measures for human speakers
| Condition / sex | F1 range mean (Hz) | F1 range SD (Hz) | F2 range mean (Hz) | F2 range SD (Hz) |
|---|---|---|---|---|
| Mask only | 177.24 | 83.78 | 490.10 | 262.61 |
| No mask | 179.73 | 88.62 | 350.73 | 162.61 |
| Clear + Mask | 199.22 | 119.40 | 735.30 | 346.12 |
| Loud + Mask | 137.19 | 79.96 | 371.03 | 224.35 |
| Slow + Mask | 161.61 | 109.90 | 598.28 | 274.15 |
| Female | 204.23 | 90.58 | 566.30 | 305.32 |
| Male | 77.95 | 48.42 | 348.90 | 194.01 |
Descriptive statistics (mean and standard deviation, SD) for F1 range and F2 range
Fig. 4. Effect sizes of formant measures relative to mask only. Note: Standardized beta coefficients for F1 range (blue) and F2 range (red) for each condition relative to the Mask Only condition. Negative effect size indicates lower values relative to the Mask Only condition; positive effect size indicates higher values relative to the Mask Only condition. Error bars indicate 95% confidence interval. Large effect: |Beta| ≥ 0.95; medium effect: |Beta| ≥ 0.55; small effect: |Beta| ≥ 0.25
Mechanism of change—kinematic measures for human speakers
| Condition | Jaw ROM mean (mm³) | Jaw ROM SD (mm³) | Jaw speed mean (mm/s) | Jaw speed SD (mm/s) |
|---|---|---|---|---|
| Mask only | 93.80 | 104.05 | 32.36 | 12.76 |
| No mask | 220.80 | 157.35 | 50.96 | 28.99 |
| Clear + Mask | 225.31 | 210.69 | 36.23 | 15.81 |
| Loud + Mask | 193.24 | 245.32 | 36.68 | 11.96 |
| Slow + Mask | 212.36 | 345.86 | 26.78 | 12.17 |
Descriptive statistics (mean and standard deviation, SD) for kinematic measures for the jaw (top) and head (bottom)
Fig. 5. Effect sizes of kinematic measures relative to mask only. Note: Standardized beta coefficients for jaw ROM (purple), jaw speed (green), head ROM (blue), and head speed (red) for each condition relative to the Mask Only condition. Negative effect size indicates lower values relative to the Mask Only condition; positive effect size indicates higher values relative to the Mask Only condition. Error bars indicate 95% confidence interval. Large effect: |Beta| ≥ 0.95; medium effect: |Beta| ≥ 0.55; small effect: |Beta| ≥ 0.25