Lionel Fontan, Libio Gonçalves Braz, Julien Pinquier, Michael A Stone, Christian Füllgrabe.
Abstract
Automatic speech recognition (ASR), when combined with hearing-aid (HA) and hearing-loss (HL) simulations, can predict the aided speech-identification performance of persons with age-related hearing loss. ASR can thus be used to evaluate different HA configurations, such as combinations of insertion-gain functions and compression thresholds, in order to optimize HA fitting for a given person. The present study investigated whether, after fixing compression thresholds and insertion gains, a random-search algorithm could be used to optimize time constants (i.e., attack and release times) for 12 audiometric profiles. The insertion gains were either those recommended by the CAM2 prescription rule or those optimized using ASR, while compression thresholds were always optimized using ASR. For each audiometric profile, the random-search algorithm was used to vary time constants with the aim of maximizing ASR performance. A HA simulator and a HL simulator were used, respectively, to amplify and to degrade speech stimuli according to the input audiogram. The resulting speech signals were fed to an ASR system for recognition. For each audiogram, 1,000 iterations of the random-search algorithm were used to find the time-constant configuration yielding the highest ASR score. To assess the reproducibility of the results, the random-search algorithm was run twice. Optimizing the time constants significantly improved the ASR scores when CAM2 insertion gains were used, but not when using ASR-based gains. Repeating the random search yielded similar ASR scores, but different time-constant configurations.
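The random-search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform sampling, the search ranges for attack and release times, and the toy scoring function are all assumptions made for illustration. In the study, the score would come from the HA simulator, the HL simulator, and the ASR system applied in sequence.

```python
import random
from typing import Callable, Optional, Tuple


def random_search(
    score_fn: Callable[[float, float], float],
    n_iter: int = 1000,
    at_range_ms: Tuple[float, float] = (1.0, 100.0),    # assumed attack-time range
    rt_range_ms: Tuple[float, float] = (10.0, 2000.0),  # assumed release-time range
    seed: Optional[int] = None,
) -> Tuple[Tuple[float, float], float]:
    """Return the (attack, release) pair with the highest score and that score."""
    rng = random.Random(seed)
    best_cfg = (at_range_ms[0], rt_range_ms[0])
    best_score = float("-inf")
    for _ in range(n_iter):
        # Draw one candidate configuration uniformly at random (an assumption).
        attack = rng.uniform(*at_range_ms)
        release = rng.uniform(*rt_range_ms)
        score = score_fn(attack, release)
        # Keep the candidate only if it beats the best score seen so far.
        if score > best_score:
            best_cfg, best_score = (attack, release), score
    return best_cfg, best_score


def toy_score(attack_ms: float, release_ms: float) -> float:
    """Hypothetical stand-in for the HA simulation -> HL simulation -> ASR chain."""
    return -((attack_ms - 10.0) ** 2) / 1e3 - ((release_ms - 500.0) ** 2) / 1e5


if __name__ == "__main__":
    # Two runs with different seeds, mirroring the repeated search in the study.
    for seed in (0, 1):
        cfg, score = random_search(toy_score, n_iter=1000, seed=seed)
        print(f"seed={seed}: attack={cfg[0]:.1f} ms, release={cfg[1]:.1f} ms, score={score:.4f}")
```

Running the search twice with different seeds, as in the loop above, is one simple way to check whether the best score and the best configuration are each reproducible.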
Keywords: age-related hearing loss (ARHL); attack time; automatic speech recognition (ASR); compression speed; hearing aids (HAs); random search (RS); release time
Year: 2022; PMID: 35368250; PMCID: PMC8969748; DOI: 10.3389/fnins.2022.779062
Source DB: PubMed; Journal: Front Neurosci; ISSN: 1662-453X; Impact factor: 4.677
FIGURE 1. Components of the OPRA-RS optimization chain, with associated input data (in italics) and output data (right panel). The parameters randomized by the RS algorithm are highlighted in red.
FIGURE 2. Audiograms used as an input for the simulation of hearing loss. Corresponding pure-tone averages (PTAs) for frequencies between 0.5 and 4 kHz are shown in the right panel. Figure reproduced from Gonçalves Braz et al. (2022).
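A pure-tone average is the arithmetic mean of hearing thresholds over a set of audiometric frequencies. The sketch below assumes the four octave frequencies 0.5, 1, 2, and 4 kHz, a common choice that the caption does not confirm, and uses made-up threshold values that are not taken from the study.

```python
from typing import Dict, Sequence


def pure_tone_average(
    thresholds_db_hl: Dict[int, float],
    freqs_hz: Sequence[int] = (500, 1000, 2000, 4000),  # assumed PTA frequencies
) -> float:
    """Mean hearing threshold (dB HL) over the selected audiometric frequencies."""
    return sum(thresholds_db_hl[f] for f in freqs_hz) / len(freqs_hz)


# Hypothetical audiogram (frequency in Hz -> threshold in dB HL), illustration only.
example_audiogram = {250: 20, 500: 25, 1000: 30, 2000: 45, 4000: 60, 8000: 70}
example_pta = pure_tone_average(example_audiogram)  # (25 + 30 + 45 + 60) / 4 = 40.0
```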
FIGURE 3. ASR scores with or without optimization of time constants, using the insertion gains recommended by CAM2 (left panel) or OPRA-RS (right panel) for the 12 audiograms. Thick, dark horizontal lines inside the boxes represent median values. The bottom and top whiskers represent the 0th and 100th percentiles (i.e., the minimum and maximum), while the bottom and top limits of the boxes represent the 25th and 75th percentiles.
FIGURE 4. Distribution of the attack times (ATs) and release times (RTs) yielding the highest ASR performance when using the insertion gains recommended by CAM2 (left panel) or OPRA-RS (right panel) for the 12 audiograms. Otherwise as Figure 3.