Jasper Ooster, Melanie Krueger, Jörg-Hendrik Bach, Kirsten C Wagener, Birger Kollmeier, Bernd T Meyer.
Abstract
Speech audiometry in noise based on sentence tests is an important diagnostic tool to assess listeners' speech recognition threshold (SRT), i.e., the signal-to-noise ratio corresponding to 50% intelligibility. The clinical standard measurement procedure requires a professional experimenter to record and evaluate the response (expert-conducted speech audiometry). The use of automatic speech recognition enables self-conducted measurements with an easy-to-use speech-based interface. This article compares self-conducted SRT measurements using smart speakers with expert-conducted laboratory measurements. With smart speakers, the absolute presentation level, potential errors from the automated response logging, and the room acoustics cannot be controlled. We investigate the differences between highly controlled measurements in the laboratory and smart speaker-based tests for young normal-hearing (NH) listeners as well as for elderly NH, mildly hearing-impaired, and moderately hearing-impaired listeners in low, medium, and highly reverberant room acoustics. For the smart speaker setup, we observe an overall bias in the SRT result that depends on the hearing loss. The bias ranges from +0.7 dB for elderly moderately hearing-impaired listeners to +2.2 dB for young NH listeners. The intrasubject standard deviation is close to the clinical standard deviation (0.57/0.69 dB for the young/elderly NH listeners compared with 0.5 dB observed for clinical tests, and 0.93/1.09 dB for the mildly/moderately hearing-impaired listeners compared with 0.9 dB). For detecting a clinically elevated SRT, the speech-based test achieves an area under the curve value of 0.95 and therefore seems promising for complementing clinical measurements.
Keywords: hearing screening; matrix sentence test; smart home; speech audiometry
Year: 2020 PMID: 33272109 PMCID: PMC7720343 DOI: 10.1177/2331216520970011
Source DB: PubMed Journal: Trends Hear ISSN: 2331-2165 Impact factor: 3.293
Figure 1. Overview of the Smart Speaker Measurement Application.
Statistics of the Four Subject Groups Who Participated in the Evaluation.
| | Young normal-hearing | Normal-hearing | Mild hearing loss | Moderate hearing loss |
|---|---|---|---|---|
| | Max. one frequency at 20 dB HL | PTA | PTA | PTA |
| N (f/m) | 16 (12/4) | 9 (5/4) | 11 (5/6) | 10 (3/7) |
| Age | 23 ± 4 years | 61 ± 6 years | 63 ± 6 years | 62 ± 10 years |
| PTA | 0 ± 5 dB HL | 10 ± 9 dB HL | 31 ± 5 dB HL | 46 ± 6 dB HL |
Note. PTA = Pure Tone Average.
Figure 2. Individual Audiograms of the Better Hearing Ear of Our Subjects (Gray Lines) Together With the Average Audiogram for the Respective Subject Group (Black Lines). Note the different y-axis for the young normal-hearing listeners. The two dashed lines in the moderately hearing-impaired panel show the audiograms of two subjects whose data had to be discarded from the analysis, as explained in the Results section.
Measurement Sequence During One of the Two Sessions for Each Subject.
| Setting | Measurement |
|---|---|
| Room settings A | Training list 1 |
| | Training list 2 |
| | Test list 1 |
| | Test list 2 |
| Room settings B | Test list 3 |
| | Test list 4 |
| Room settings C | Test list 5 |
| | Test list 6 |
| Isolated booth | Reference |
Note. While the reference measurement with the clinical setup was always performed at the end of each session, the order of the room characteristics of the CAS during the smart speaker measurement was randomly chosen for each subject, i.e., each setting (Living Room/Poor Classroom/Concert Hall) corresponds to A, B, or C. CAS = Communication Acoustic Simulator.
Figure 3. Bland–Altman Plot for Visualizing the Agreement Between Automated and Regular Test Conduction. The figure compares SRT differences between smart speaker and clinical measurement (measured in the same session) to the average of these two values. Data are shown for different subject groups (labeled with different shapes) and room characteristics (differentiated by color). In addition, the average difference between the two measuring methods (gray solid line) and the 95% percentiles (gray dashed lines) are shown. The young-NH subjects only conducted experiments in the living room condition. SRT = speech recognition threshold; NH = normal-hearing; HL = hearing loss.
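The quantities behind a Bland–Altman plot can be sketched in a few lines. The function name `bland_altman` and the SRT values below are hypothetical and are not taken from the article. The sketch also assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations), which may differ from the 95% percentile boundaries drawn in the figure.

```python
from statistics import mean, stdev


def bland_altman(smart_speaker_srt, clinical_srt):
    """Compute Bland-Altman points and summary lines for paired SRTs (dB SNR)."""
    if len(smart_speaker_srt) != len(clinical_srt):
        raise ValueError("each smart speaker SRT needs a paired clinical SRT")
    pairs = list(zip(smart_speaker_srt, clinical_srt))
    # y-axis: difference between the two methods for each pair
    diffs = [smart - clinic for smart, clinic in pairs]
    # x-axis: average of the two methods for each pair
    avgs = [(smart + clinic) / 2 for smart, clinic in pairs]
    bias = mean(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    return {
        "points": list(zip(avgs, diffs)),
        "bias": bias,
        # Assumption: normal-approximation limits of agreement
        "lower": bias - 1.96 * spread,
        "upper": bias + 1.96 * spread,
    }


# Made-up SRT values in dB SNR, for illustration only
smart = [-5.0, -4.0, -6.0, -3.0]
clinic = [-7.0, -5.0, -7.0, -5.0]
result = bland_altman(smart, clinic)
print(result["bias"], result["lower"], result["upper"])
```

A positive `bias` here means the smart speaker SRT is higher (worse) than the clinical one, matching the sign convention of the biases reported in the abstract.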
Figure 4. Bias and Intrasubject Standard Deviations for All Elderly, Age-Matched Subjects for Different Room Configurations. The bias relates to the difference between clinical and automated measurements. A positive bias refers to a lower (better) SRT in the clinical measurement. SNR = signal-to-noise ratio.
Figure 5. Bias and Intrasubject Standard Deviations in the Living Room Settings for the Different Subject Groups. Since the young NH subject group only conducted one measurement session with one reference measurement, the intrasubject standard deviation for this subject group in the reference setup cannot be estimated. NH = normal-hearing; HL = hearing loss; SNR = signal-to-noise ratio.
Figure 6. Violin Plot of the ASR System’s Performance From the Smart Speaker. The individual data points denote error rates of single measurement lists, each with 20 presented sentences; the width of the violin denotes the normalized histogram of the error rates. The median value and the interquartile range are denoted by white dots and the gray line, respectively. NH = normal-hearing; HL = hearing loss.
Figure 7. Sensitivity and Specificity for Analyzing How Well a Potential Decision Threshold Is Suited for Providing a Binary Screening Decision. The curves shown here are derived from criteria that are used to quantify hearing loss. (a): dB SNR (the 95% percentile boundary from the young NH data measured with the reference setup). (b): PTA > 25 dB HL. (c): A hearing loss of 30 dB or higher in at least one audiogram frequency between 500 Hz and 4 kHz (which is an indication for a hearing aid in Germany). The dashed black line shows the 95% percentile boundary from the young NH data measured with the smart speaker. AUC = area under the curve; SRT = speech recognition threshold; SNR = signal-to-noise ratio.
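As a minimal sketch of how sensitivity, specificity, and AUC relate to a decision threshold on the SRT, consider the following. The function names, the labels, and the SRT values are hypothetical. The sketch assumes that a listener is flagged when the measured SRT exceeds the threshold and that both classes are present in the data. It is not the authors' evaluation code.

```python
def sensitivity_specificity(srts, impaired, threshold):
    """Screening rule (assumed): flag a listener when SRT > threshold."""
    tp = sum(1 for s, y in zip(srts, impaired) if y and s > threshold)
    fn = sum(1 for s, y in zip(srts, impaired) if y and s <= threshold)
    tn = sum(1 for s, y in zip(srts, impaired) if not y and s <= threshold)
    fp = sum(1 for s, y in zip(srts, impaired) if not y and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(srts, impaired):
    """AUC as the probability that a randomly chosen impaired listener
    has a higher SRT than a randomly chosen unimpaired one (ties count 0.5)."""
    pos = [s for s, y in zip(srts, impaired) if y]
    neg = [s for s, y in zip(srts, impaired) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up SRTs (dB SNR) and reference labels, for illustration only
srts = [-7.0, -6.5, -6.0, -5.0, -4.0, -2.0]
impaired = [False, False, False, True, False, True]

sens, spec = sensitivity_specificity(srts, impaired, threshold=-5.5)
area = auc(srts, impaired)
print(sens, spec, area)
```

Sweeping `threshold` over the observed SRT range traces out sensitivity and specificity curves of the kind shown in the figure, while the AUC summarizes the separation independently of any single threshold.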