Sam Denys1, Jan De Laat2, Wouter Dreschler3, Michael Hofmann1, Astrid van Wieringen1, Jan Wouters1. 1. KU Leuven, Department of Neurosciences, ExpORL, Leuven, Belgium. 2. Leiden University Medical Centre, Department of Audiology, Leiden, the Netherlands. 3. Academic Medical Centre, Department of ENT, Clinical and Experimental Audiology, Amsterdam, the Netherlands.
Abstract
A language-independent automated self-test on tablet based on masked recognition of ecological sounds, the Sound Ear Check (SEC), was developed. In this test, 24 trials of eight different sounds are randomly presented in a noise that was spectrally shaped according to the average frequency spectra of the stimulus sounds, using a 1-up 2-down adaptive procedure. The test was evaluated in adults with normal hearing and hearing loss, and its feasibility was investigated in young children, who are the target population of this test. Following equalization of perceptual difficulty across sounds by applying level adjustments to the individual tokens, a reference curve with a steep slope of 18%/dB was obtained, resulting in a test with a high test-retest reliability of 1 dB. The SEC sound reception threshold was significantly associated with the averaged pure tone threshold (r = .70), as well as with the speech reception threshold for the Digit Triplet Test (r = .79), indicating that the SEC is susceptible to both audibility and signal-to-noise ratio loss. Sensitivity and specificity values on the order of magnitude of ∼70% and ∼80% to detect individuals with mild and moderate hearing loss, respectively, and ∼80% to detect individuals with slight speech-in-noise recognition difficulties were obtained. Homogeneity among sounds was verified in children. Psychometric functions fitted to the data indicated a steep slope of 16%/dB, and test-retest reliability of sound reception threshold estimates was 1.3 dB. A reference value of -9 dB signal-to-noise ratio was obtained. Test duration was around 6 minutes, including training and acclimatization.
A language-independent automated self-test on tablet based on masked recognition of ecological sounds, the Sound Ear Check (SEC), was developed. In this test, 24 trials of eight different sounds are randomly presented in a noise that was spectrally shaped according to the average frequency spectra of the stimulus sounds, using a 1-up 2-down adaptive procedure. The test was evaluated in adults with normal hearing and hearing loss, and its feasibility was investigated in young children, who are the target population of this test. Following equalization of perceptual difficulty across sounds by applying level adjustments to the individual tokens, a reference curve with a steep slope of 18%/dB was obtained, resulting in a test with a high test-retest reliability of 1 dB. The SEC sound reception threshold was significantly associated with the averaged pure tone threshold (r = .70), as well as with the speech reception threshold for the Digit Triplet Test (r = .79), indicating that the SEC is susceptible to both audibility and signal-to-noise ratio loss. Sensitivity and specificity values on the order of magnitude of ∼70% and ∼80% to detect individuals with mild and moderate hearing loss, respectively, and ∼80% to detect individuals with slight speech-in-noise recognition difficulties were obtained. Homogeneity among sounds was verified in children. Psychometric functions fitted to the data indicated a steep slope of 16%/dB, and test-retest reliability of sound reception threshold estimates was 1.3 dB. A reference value of -9 dB signal-to-noise ratio was obtained. Test duration was around 6 minutes, including training and acclimatization.
In their expert consensus article, the European Federation of Audiology Societies
recommends the implementation of preschool- and school-age hearing screening,
targeting mainly children between 4 and 7 years of age, in addition to the
systematic hearing screening of newborns (Piotrowska & Skarzynski, 2012). The
importance of having a safety net to capture children with hearing loss that has a
delayed onset or that is acquired later in life (for a review, see Smith, Bale, & White,
2005) has been widely recognized. This is especially important because
hearing loss negatively impacts on various development domains (Bess, Dodd-Murphy
& Parker, 1998; Porter,
Sladen, & Ampah, 2013), and its prevalence is expected to rise with
age (Fortnum, Summerfield,
Marshall, Davis, & Bamford, 2001).Selecting the most appropriate screening test for a given population is crucial.
Preschool- and school-age hearing screening are commonly based on pure tone
audiometry or otoacoustic emission testing, but applied procedures are not
universal. More important, performing such tests in screening environments can be
quite problematic, given the absence of a sound-proof booth or quiet room and the
susceptibility of both tests to excessive external noise (Prieve, Schooling, Venediktov, & Franceschini,
2015). As such, these methods have variable reliability as well as
variable sensitivity and specificity to detect mild hearing loss. Furthermore,
examiner experience level and training might influence accuracy.A growing trend has been the use of automated suprathreshold self-tests based on
speech perception in noise for adult hearing screening (De Sousa, Swanepoel, Moore, & Smits,
2018; Folmer,
Vachhani, Mcmillan, Kidd, & Feeney, 2017; Jansen et al., 2010; Jansen, Luts, Dejonckere, van Wieringen, &
Wouters, 2014; Koole
et al., 2016; Potgieter, Swanepoel, Myburgh, & Smits, 2018a; Smits & Houtgast, 2005; Smits, Merkus, & Houtgast,
2006; Vercammen,
Goossens, Wouters, & van Wieringen, 2017; Vlaming, MacKinnon, Jansen, & Moore,
2014; Watson, Kidd,
Miller, Smits, & Humes, 2012), for example, for online hearing
screening and screening via mobile devices (Stenfelt, Janssen, Schirkonyer, & Grandori,
2011). One commonly used test is the Digit Triplet Test (DTT), in which
combinations of three digits are presented in speech-shaped noise at different
signal-to-noise ratios (SNRs) using an adaptive test procedure. The test has been
translated to several languages (Potgieter, Swanepoel, Myburgh, Hopper, & Smits, 2015; Willberg et al., 2016;
Zokoll et al., 2012,
2013), its
development principles have been consolidated (Akeroyd et al., 2015), and the test can now
be considered an efficient reference for speech-in-noise screening. It has high
measurement reliability and accurately distinguishes between adults with normal
hearing (NH) and (mild) hearing loss, based on pure tone audiometry, with reported
sensitivity and specificity values of >80%. Because of its suprathreshold nature,
the impact of ambient noise and calibration is significantly reduced. The use of
this test for school-age hearing screening in Flanders has also been explored and
has led to the implementation of the DTT for the systematic hearing screening of
fifth grade (since school year 2018–2019 moved to sixth grade) elementary and third
grade secondary school children by school health services since the school year
2016–2017 (Denys et al.,
2018; Guérin, Van Hoeck, Denys, Wouters, & Hoppenbrouwers, 2017).However, whereas reliable SRTs can be measured in adults and children from higher
grades, its feasibility in younger children at school entry is not trivial (Witton, Talcott, & Henning,
2017). For instance, the test requires the ability to direct and maintain
attention as well as to remember digit combinations and knowledge of their written
representations. Altogether, the DTT might be too challenging for young children at
school-entry age.An adaptive self-test that uses nonspeech sounds with easily recognizable pictures
might offer a solution and potentially pave ways for international usage, given its
complete language independence and uniformity. The idea of using ecological sound
stimuli is not new and has already been suggested many years ago as a promising
alternative to pure tones when the aim is to screen young children (Myers, Letowski, Abouchacra, Kalb,
& Haas, 1996). More recently, such a test—the Frequency-specific
Animal Sound Test—was described and validated in young children by Nolte et al. (2016). It is
an adaptive self-test that randomly presents four filtered animal sounds (two low
frequency and two high frequency sounds) to the right and left ear consecutively and
allows determination of hearing thresholds for the low and high frequency range.
However, as it is a threshold test in quiet, it remains challenging to deal with
ambient noise levels and calibration aspects. To our knowledge, no previous studies
have investigated the use of a suprathreshold test using ecological sounds.Such a test, the Sound Ear Check (SEC) was developed. Should this test have
acceptable reliability, discriminate accurately between individuals with and without
hearing loss, and proves itself feasible in young children, it might hold potential
to be a useful common reference across many countries in the future given its
language independence. Consequently, it might allow for efficient estimation of
childhood hearing loss incidence worldwide and, as such, an evidence base for the
need of (pre)school-age hearing screening programs.In what follows, the development, evaluation, and validation of the SEC in adults and
its feasibility in children are described. All experiments conducted were approved
by the Ethics Committee of the University Hospitals of Leuven and Amsterdam.
Part I. Development of the SEC
The first part of this article concerns the development of the SEC in adults.From a database of 31 sounds, eight were selected based on international
recognizability and imageability, that is, the sounds had to be able to be depicted
with a simple drawing and thought to be recognizable for children.[1] Also, the spectral content of the selected sounds needed to cover a wide
frequency range. To characterize the spectrotemporal content of the sounds, an
analysis of their spectrotemporal properties, including an exploratory factor
analysis, was conducted. The results of these analyses are given in Appendix. A
spectrally matched masker was generated using MATLAB, by filtering a white noise
with the average sound spectrum. The sounds were rescaled to the average root mean
square (RMS) of the noise, using commercially available sound editing software.In a first group of participants, psychometric curves were measured for all sounds by
presenting them at a number of fixed SNRs, and level adjustments were carried out to
the individual sound tokens to obtain a homogeneous material in terms of equal
perceptual difficulty among items. Reducing the spread of sound-specific sound
reception thresholds (SRTs) is necessary to obtain a reference curve with a steep
slope, which is important to achieve the highest possible test reliability.In a second group of NH participants, homogeneity was verified, and a reference curve
was obtained. Also, two adaptive test versions were evaluated with respect to their
convergence and test–retest reliability.
Methods
Participants
Group 1 consisted of 11 young female students, aged between 18 and 22 years.
Group 2 consisted of ten participants (4 males, 6 females), aged between 22
and 53 years. With air conduction hearing thresholds for octave frequencies
between 250 and 8000 Hz ≤25 dB HL, all participants had NH.
Materials and procedures
All measurements were conducted in a sound-proof room. First, pure tone air
conduction thresholds were measured using standard clinical procedures,
using a Madsen Orbiter 922 or Madsen Midimate 622 audiometer connected to a
TDH-39 headphone, calibrated yearly according to ISO-389 standards.Second, the sounds were presented in noise at fixed SNRs. Stimuli were played
back via a PC, using APEX 3.1 software (Francart et al., 2017; Francart, van Wieringen,
& Wouters, 2008) and were routed to an external soundcard
(Fireface UCX RME) connected to a HDA-200 headphone, calibrated to the sound
masker at 80 dB sound pressure level (SPL) with a Brüel & Kjaer sound
level meter 2260 and a Brüel & Kjaer artificial ear 4153, using the flat
plate.Participants were seated in front of a computer screen and were instructed to
identify the sounds from a closed set of response alternatives, that is, a
matrix with line images of the sounds in black and white (see Figure 1). Guessing
was encouraged at challenging SNRs. Testing was conducted monaurally (the
sounds-in-noise were presented to the right ear) and was preceded by a
training phase. Obligatory breaks were included to avoid fatigue. In the
training phase, every sound was presented three times at 0 dB SNR with
random presentation order, and participants received feedback on their
performance.
Figure 1.
Screenshot of the Sound Ear Check. The final set of pictures is
shown, as some of the original pictures were revised after
evaluation in young children.
Screenshot of the Sound Ear Check. The final set of pictures is
shown, as some of the original pictures were revised after
evaluation in young children.During the test phase, no feedback was given. The sounds were presented at
SNRs of −9, −11, −13, −15, and −17 dB. Per SNR, every sound was presented
six times with random presentation order, and each SNR was tested twice,
resulting in 12 presentations per participant at each SNR. The order of SNRs
tested was counterbalanced across participants. The noise started 500 ms
before the target stimulus and ended 500 ms after each presentation and was
a randomly selected segment from the noise file (1 min). Its level was fixed
at 65 dB SPL.The same measurement protocol was followed by Group 2 participants. After the
assessment at fixed SNRs, Group 2 participants also performed two tests in
which the sounds were presented adaptively: one using a 1-up 1-down
procedure (targeting the SNR at which there is a probability of ≈50% for
correct identification, i.e., SRT50) and one using a 1-up 2-down
procedure (targeting the SNR at which there is a probability of ≈ 71% for
correct identification, i.e., SRT71). The level of the sounds was
varied in steps of 2 dB against a fixed background noise level of 65 dB SPL,
with SNR = 0 dB as starting value. Every sound was presented four times
randomly, leading to a test of 32 trials. Each test was conducted twice
(test–retest). Half of the participants started with the 1-up 1-down
procedure, and the other half started with the 1-up 2-down procedure.
Results and Discussion
Level adjustments (Group 1)
Psychometric functions were fitted to responses for each sound to estimate
its SRT and slope by means of a logistic regression, using the following
formula: with SI = speech intelligibility, = guess level (i.e., 1/8), s = slope, SRT = sound
reception threshold, and SNR = signal-to-noise ratio. Fitted SRTs and slopes
per sound (with standard error [SE]) are presented in Table 1.
Table 1.
SRT50s and Slopes Per Sound Before and After the
Application of Level Adjustments.
Before optimization (Group 1,
N = 11)
After optimization (Group 2,
N = 10)
Stimulus
SRT ± SE (dB SNR)
Slope ± SE (%/dB SNR)
SRT ± SE (dB SNR)
Slope ± SE (%/dB SNR)
Baby
−12.2 ± 0.1
17 ± 1
−12.8 ± 0.2
17 ± 2
Horn
−12.1 ± 0.1
17 ± 1
−12.6 ± 0.2
20 ± 2
Dog
−12.2 ± 0.1
23 ± 2
−12.2 ± 0.2
20 ± 2
Cat
−12.1 ± 0.1
14 ± 1
−12.7 ± 0.2
17 ± 2
Piano
−13.5 ± 0.1
14 ± 1
−13.3 ± 0.2
13 ± 1
Phone
−12.0 ± 0.1
16 ± 1
−11.7 ± 0.2
12 ± 1
Bell
−12.0 ± 0.1
13 ± 1
−12.9 ± 0.2
17 ± 2
Bird
−12.4 ± 0.1
27 ± 2
−12.3 ± 0.2
18 ± 2
Average (SD)
−12.3 (0.5)
18 (5)
−12.6 (0.5)
17 (3)
Note. SRTs and slopes result from nonlinear
regression fits per sound. SRT = sound reception threshold;
SE = standard error; SNR = signal-to-noise ratio.
SRT50s and Slopes Per Sound Before and After the
Application of Level Adjustments.Note. SRTs and slopes result from nonlinear
regression fits per sound. SRT = sound reception threshold;
SE = standard error; SNR = signal-to-noise ratio.Based on the data of Group 1 participants, the level of each sound was
adjusted to the mean SRT across all sounds. Level adjustments were applied
to the majority of sounds but were rather small (−1.2 to 0.3 dB).Originally, the SEC was homogenized with nine stimuli and included also the
sound of a trumpet, which was removed because of confusions with “piano”
that were above chance level (acoustically, both sounded quite alike), and a
relatively shallow slope of its psychometric function in comparison with the
other stimuli. Level adjustments applied in these previous homogenization
stages can explain why adjustments applied were rather small.
Reference curve, convergence to the SRT, and test–retest reliability[2] (Group 2)
For Group 2 participants, a psychometric function per participant was also
fitted to obtain a reference curve. Precise psychometric fits could be
obtained based on the measurements with fixed SNRs, with SEs below 0.5 dB
for SRT estimates, and below 7%/dB for slope estimates (data not shown).
Across Group 2 participants, a fitted SRT of −12.5 dB SNR
(SE = 0.004 dB) and a steep slope of 18%/dB
(SE = 0.001) were obtained. The reference psychometric
curve is shown in Figure
2.
Figure 2.
Reference psychometric curve for normal-hearing adults
(N = 10) for the Sound Ear Check. Dots
indicate predicted participant scores at fixed SNRs. The black
line results from a nonlinear regression fit across all
participants. The dotted line represents the chance level.
SNRs = signal-to-noise ratios.
Reference psychometric curve for normal-hearing adults
(N = 10) for the Sound Ear Check. Dots
indicate predicted participant scores at fixed SNRs. The black
line results from a nonlinear regression fit across all
participants. The dotted line represents the chance level.SNRs = signal-to-noise ratios.Per participant a psychometric curve was also fitted based on the pooled data
of the adaptive measurements to estimate the SNRs for a 50% and 71%
probability of correct identification. For each adaptive procedure, we
investigated SNR convergence to these fitted SRTs, to determine the number
of initial trials to exclude when computing the SRT. These adaptively
determined SRTs (averaged across test and retest per participant) were then
compared with the fitted SRTs, using Wilcoxon signed-rank tests. SRT
convergence was reached at Trial 8 for the 1-up 1-down procedure
(Mdn = −0.4 dB, W = 7,
p = .89) and at Trial 9 for the 1-up 2-down procedure
(Mdn = −0.8 dB, W = 7,
p = .89). As such, the SRT50 and
SRT71 can be determined as the SNR average from Trials 8 and
9 to Trial (?) 33, respectively, when the SRT needs to be
estimated from an adaptive procedure. Trial 33 is an imaginary trial: It is
not presented as an actual trial, but its SNR can be determined based on the
response for Trial 32 (the final trial). SRTs determined as such were
−14.5 dB SNR (SD = 1.0 dB) and −12.6 dB SNR
(SD = 1.1 dB), on average, for the 1-up 1-down and 1-up
2-down procedure, respectively. A significant difference
(Mdn of difference = −0.7 dB, W = −41,
p < .05) with respect to the fitted SRT50
(M = −13.7 dB SNR, SD = 1.1 dB), but
not (Mdn of difference = 0.3 dB, W = 21,
p = .24) with respect to the fitted SRT71
(M = −12.8 dB SNR, SD = 1.0 dB), was
found compared with SRTs calculated from the adaptive procedures.Per procedure, the test–retest reliability was calculated as the standard
deviation (SD) of the differences between test- and retest-SRTs,[3] divided by (Smits
& Houtgast, 2005; Smits, Kapteyn, & Houtgast,
2004). Test–retest reliability was also calculated taking only a
subset of trials into account, to simulate a test with 24 instead of 32
trials. Test–retest reliability was 0.9 dB for the 1-up 1-down procedure and
0.5 dB for the 1-up 2-down procedure. Simulating a shorter test, averaging
up to imaginary Trial “25” did not seem to affect SRT estimates
(SRT50: M = −14.5 dB SNR,
SD = 1.1 dB; SRT71:
M = −12.5 dB SNR, SD = 1.0 dB) nor
test–retest reliability (SRT50: 1.0 dB; SRT71:
0.5 dB).
Part II. Evaluation and Validation of the SEC
The second part of this article concerns the evaluation and validation of the SEC in
adults with NH and hearing loss and involves a comparison with the DTT. For both
tests, test–retest reliability was investigated, and the relation between the SRT
and pure tone thresholds was studied. The sensitivity and specificity of both tests
for detecting mild and moderate degrees of hearing loss were examined. SEC-SRTs were
also related to DTT-SRTs, and the sensitivity and specificity of the SEC to detect
individuals with impaired speech perception, based on different pass/fail-criteria
for the DTT, were explored.Although homogenization was aimed at maximizing the slope of the sound material at
50%, a 1-up 2-down adaptive procedure was chosen for further evaluation because it
seemed to result in more precise adaptive SRT estimates and more accurate
convergence to the SRT (see Part I). Also, targeting higher recognition
probabilities (a 1-up 2-down procedure converges to a recognition probability
of ≈ 71%) agrees more with the DTT’s convergence point of 79%, which makes the
comparison between DTT and SEC outcomes, as presented in this article, more
theoretically sound. Furthermore, targeting higher recognition probabilities is more
appropriate for testing younger children (Witton et al., 2017). As shortening the
test did not significantly reduce its test–retest reliability, a test with 24 trials
was used.In total, 135 participants were recruited from existing lab databases and via
word of mouth. Participants with NH, defined as hearing thresholds ≤25 dB HL
for all octave frequencies between 250 and 8000 Hz and for interoctave
frequencies of 3000 and 6000 Hz, were considered as the reference NH group.
Other participants (one or more hearing threshold(s) >25 dB HL) were
classified as having hearing loss and formed the hearing-impaired (HI)
group. Hearing loss etiologies varied and included genetic as well as
age-related and noise-related causes, but all participants included had a
sensorineural hearing loss, as we used an air bone gap averaged over the
frequencies 500–4000 Hz ≤15 dB HL as an inclusion criterion. Two (out of
135) participants were excluded because of a larger air bone gap in the test
ear. One participant with a known diagnosis of auditory neuropathy was also
excluded.In total, 132 participants were tested, of which 44 had NH, and 88 had
hearing loss. The distribution of their hearing thresholds is shown in Figure 3.
Reference-group participant’s ages ranged between 18 and 69 years
(M = 40 years, SD = 14 years). With
only 10 males, this group consisted primarily of female participants. Ages
of HI participants ranged between 18 and 76 years (M = 60
years, SD = 14 years). In this group, 45 participants were
female and 43 were male.
Figure 3.
Distribution of the pure tone thresholds (in dB HL) per frequency
(in Hz) for participants with hearing loss. Each black line
represents a different percentile. Note. Only
data of the test-ear are shown.
Distribution of the pure tone thresholds (in dB HL) per frequency
(in Hz) for participants with hearing loss. Each black line
represents a different percentile. Note. Only
data of the test-ear are shown.Measurements were conducted in a sound-proof booth for the majority of
participants. NH participants were occasionally tested in their home
environment in a quiet room. First, bilateral pure tone audiometry was
conducted. Thresholds were measured using the Hughson Westlake method (Carhart & Jerger,
1959). If needed, masking was applied using the method of Hood (1960). Pure
tone audiometry was performed, using a Madsen Orbiter 922 or Madsen Midimate
622 audiometer connected to a TDH-39 headphone for the measurement of air
conduction thresholds, and a B71 bone vibrator for bone conduction,
calibrated yearly according to ISO-389 standards.Second, the Flemish DTT and SEC screening tests were administered. The test
platform used consisted of a 7-inch Samsung Galaxy Tab 4 tablet, connected
to DD45 transducers embedded in Peltor caps. The test setup was calibrated
to the noises of the test materials at 80 dB SPL with a Bruël & Kjaer
Sound level meter 2260 and a Brüel & Kjaer artificial ear 4153 using the
flat plate.Half of the participants started with the SEC, and the other half started
with the DTT to prevent order effects. The tests were administered
monaurally, that is, the right ear for NH participants (all
thresholds ≤ 25 dB HL) and the better ear (based on the average hearing
threshold for the frequencies 500, 1000, and 2000 Hz) for HI participants.
Each test was conducted twice (test–retest) and was preceded by a short
training test with the exact same test procedure, but consisting of only 12
trials. No feedback was given during training. For the SEC, an additional
acclimatization phase was implemented which allowed the participant to learn
the set (i.e., link the sounds to the images), presenting the sounds in
noise at an easy SNR of + 10 dB SNR. During this phase, whenever a sound was
correctly linked to its image, the image disappeared. Whenever a sound was
incorrectly linked to an image, the sound was repeated until correctly
linked with its image. Both training and acclimatization (for the SEC only)
stimuli were presented bilaterally. Prior to the tests, participants were
orally instructed and informed that they had to guess at challenging
SNRs.The Flemish DTT (Akeroyd
et al., 2015; Jansen, Luts, Dejonckere, van Wieringen, & Wouters, 2013)
consisted of 27 triplets (combinations of three monosyllabic digits: one
[e:n], two [twe:], three [dri:], four [vi:r], five [vɛif], six [zɛs], and
eight [ɑxt], uttered by a Female native Flemish speaker), which were
randomly chosen from one of ten equivalent test lists and were presented at
variable SNRs in a spectrally matched speech-shaped noise fixed at 65 dB
SPL, which was interrupted in between triplet presentations. Starting at an
SNR of −2 dB, the first 5 trials were scored using a triplet scoring
procedure (i.e., all three digits had to be identified correctly), with a
fixed SNR step size of 2 dB, ensuring a rapid descent to the SRT. From the
sixth triplet onward, the SNR step size varied depending on the number of
digits identified correctly, using the equation: , where i is the number of digits
recognized correctly (Brand & Kollmeier, 2002; Denys, Hofmann, van Wieringen, &
Wouters, 2019). Feedback on the position of the presented digit
was given in real time by means of a highlighting check box. Responses were
given via a numeric touch pad on the tablet.The SEC consisted of eight sounds that were randomly presented (each sound
was presented three times) in a spectrally matched sound-shaped noise fixed
at 65 dB SPL, which was interrupted in between trials. The first sound was
presented at an SNR of 0 dB. The SNR on subsequent trials was varied in
fixed steps of 2 dB using a 1-up 1-down procedure for the first 7 trials and
a 1-up 2-down procedure from Trial 8 onward (Levitt, 1971). A matrix with
stimulus illustrations (Figure 1) was shown on the tablet, and a response was given by
tapping the image corresponding to the identified sound.A pure tone average (PTA) based on the thresholds for frequencies ranging between
250–8000 Hz, including 3000 and 6000 Hz (in dB HL), was calculated from the
audiogram of the test ear and was used for most of the analyses. The main
outcome measure of the suprathreshold tests is the SRT (in dB SNR). This was
calculated as the averaged SNR of the last 22 (DTT) and 17 (SEC) trials,
including a nonpresented (imaginary) trial for which the SNR could be calculated
based on the response of the last presented item. Test results were
automatically transferred to a database server over a secure connection using
Wi-Fi. The age of the participant was also recorded and included in the
analyses.Prior to the analyses, four participants were excluded from the reference group
because of deviant performances on the SEC (N = 4) or the DTT
(N = 1) compared with their peers, based on SRTs
deviating ≥ 2 SDs from the mean (Ramus et al., 2003).
Exploratory results and learning effect
In an exploratory data analysis, we investigated a potential effect of
session (test or retest) on the SEC-SRT and DTT-SRT by conducting a repeated
measures analysis of variance (ANOVA).Per test, a repeated measures ANOVA was conducted on the SRT data with
session (test or retest) as a within-subject variable and group (NH or HI)
as a between-subject factor. Despite training (and acclimatization for the
SEC), a significant session effect was found both for the SEC,
F(1, 126) = 73.73, p < .01, and the
DTT, F(1, 126) = 5.54, p < .05, with
retest-SRTs being significantly different (better) with values of 1.2 (SEC)
and 0.3 (DTT) dB, presumably due to learning. This effect was independent of
group, as shown by nonsignificant interaction effects between session and
group, both for the SEC, F(1, 126) = 0.00,
p = .95, and for the DTT, F(1,
126) = 0.04, p = .83.Although an SRT improvement of 0.3 dB for the DTT can be considered
negligible, given that it is much smaller than the measurement error, an
improvement of 1.2 dB, as seen for the SEC, could be clinically relevant.
Such an effect was not observed in the development phase, where test- and
retest-SRTs for the adaptive 1-up 2-down procedure differed only by 0.2 dB.
It is likely that participants from the development phase were trained by
the assessment at fixed SNRs, which always preceded the adaptive
measurements.The more pronounced learning effect found for the SEC compared with the DTT
suggests that participants needed more training to become familiar with the
nonspeech sounds presented in the test as compared with the digits. Indeed,
people are highly familiar with digits. Presumably, it requires more
training to familiarize with the spectrotemporal properties of the sounds,
which can be considered content learning. This has implications for test
duration. However, with 2 min 33 s (SD = 26 s), the average
test duration was significantly lower with 1 min 17 s,
t(39) = −17.99, p < .01, as compared
with test duration for the DTT (3 min 50 s, with SD = 30 s)
in NH participants.To gain more insight into the learning effect, four monaural (right ear) SEC
tests were consecutively performed by ten naïve NH participants (aged
between 28 and 57 years old) who did not participate in this study. A
Friedman ANOVA indicated a significant main effect of test iteration
(χ2 = 18.09, p < .01). Nonparametric post
hoc pairwise comparisons indicated a significant difference between the
first and second test iteration (p < .05), but not
between further test assessments. SRTs were on average −9.2
(SD = 1.3), −11.1 (SD = 0.8), −11.5
(SD = 1.2), and −11.7 (SD = 1.1) dB
SNR for the first up to the fourth test iteration. The mean improvement of
1.9 dB after the first test is larger than the learning effect of 1.3 dB
that was found in our reference group. Presumably, the acclimatization and
training tests did partly counter the learning effect, but this testing was
not long enough (only 12 trials) to fully eliminate it.All subsequent analyses, in which SRTs were involved, were done on retest
data, but the same pattern of results was obtained when those analyses were
repeated for test data.
Reference values and test–retest reliability
Reference values were determined in NH participants
(N = 40). A reference SRT (±2 SD) of −11.3 dB SNR (±2 dB)
for the SEC was obtained. This reference approached the expected value,
based on the reference curve obtained in the development phase (−11.5 dB
SNR) for a recognition probability of ≈71%.For the DTT, the reference value of−11.1 (±1.6) dB SNR obtained in this study
agrees with previously reported values of−11.5 and−11.1 dB SNR for young
(20–30 years) and middle-aged (40–60 years) adults (Vercammen et al., 2017),
respectively, and the original reference value of −11.7 dB SNR found by
Jansen et al.
(2013) in young adults. Although the DTT version used in this
study applied a digit scoring procedure, comparisons to formerly used
versions with triplet scoring can be made as both scoring methods target the
same probability of recognizing three digits (79%). As such, SRTs resulting
from these adaptive procedures with different scoring methods are
comparable.The test–retest reliability of SRT measurements was calculated using a method
that balances out learning effects (Smits et al., 2004). As such,
reliability was calculated as the SD of the differences between test- and
retest-SRTs, divided by . Bootstrapping, based on N = 1,000
samples, was performed to obtain 95% CIs, allowing comparisons of
test–retest reliability across tests and groups. SRTs could be estimated
with low test–retest error. With 0.8 dB (95% CI [0.6, 1.0 dB]) for the SEC
and 0.5 dB (95% CI [0.4, 0.6 dB]) for the DTT, test–retest reliability for
the SEC was significantly worse (p < .05) in NH
participants, as indicated by the nonoverlapping error bars. The value of
0.8 dB found for the SEC was slightly worse when compared with values
obtained in the development phase, but still acceptable. Differences in the
measurement setup (pc with external sound card in sound-proof booth in
development phase versus tablet with internal sound card in quiet room in
evaluation phase) as well as differences in age between participants from
the development and evaluation phases might partly explain this finding.The more favorable reliability values obtained with the DTT in NH
participants are likely due to the larger number of trials used for the
calculation of the DTT-SRT (22 vs. 17 for the SEC). With DTT-SRTs
recalculated as the average SNR of trials 7 to 23 (i.e., 17 trials),
test–retest reliability slightly decreased from 0.5 to 0.7 dB (95% CI [0.5,
0.9]) in reference-group participants.Taking also HI participants into account, test–retest reliability for both
tests decreased, but did not differ significantly, and was 1.0 (95% CI [0.9,
1.1 dB]) for the SEC, and 0.9 (95% CI [0.7, 1.0 dB]) for the DTT. A similar
test–retest reliability of 0.5 dB for the DTT with digit scoring in NH
participants was reported by Denys et al. (2019), which decreased
to 0.8 dB when taking HI adults into account, who had a similar age
distribution, and also with the value of 0.8 dB found by Jansen in their
adult noise-exposed study sample (Jansen et al., 2013), with slightly
younger participants (22–59 years). The shallower slopes of the psychometric
functions of HI participants can explain this drop in test reliability.
Also, as they were on average somewhat older, concentration fluctuations
from test to retest might further explain the lower test reliability when
including HI participants.
Predicting PTA from SEC- and DTT-SRTs
Pearson correlations between all variables in both NH and HI participants
were explored to choose relevant predictors for subsequent regression
models.These were conducted to investigate the SRT-PTA relation for SEC and DTT
(this section) and the mutual relations between SEC-results and DTT-results
(next section) in the HI group. Data were evaluated with respect to the
assumptions of regression analysis (no multicollinearity, homoscedasticity,
linearity, independence, and normal distribution (Kolmogorov–Smirnov test of
residuals).The SRT-PTA relation is shown in Figure 4 for SEC and DTT. Strong and
significant correlations between SRT and PTA were found, with
r = .70 (p < .01) for the SEC, and
r = .86 (p < .01) for the DTT.
Figure 4.
Scatter plot of the SRT versus the PTA0.25–8kHz for
the SEC (black dots) and the DTT (gray dots). Open circles
represent data from normal-hearing adults
(N = 40), and filled circles represent data
from adults with hearing loss (N = 88). The
lines result from simple linear regression fits to the data
obtained from participants with hearing loss. The dotted lines
represent the 95% CI of the fits. Only retest-SRTs are
shown.
In HI participants (N = 88), multiple linear regression
analyses were conducted to predict the PTA0.25–8kHz from the
SEC-SRT and DTT-SRT. Age was added to the models, given its known relation
with hearing thresholds and the results from our univariate exploratory
correlational analysis (Table 2). A significant model was obtained using SEC-SRT and age
as predictor variables, F(2, 85) = 36.70,
p < .01, with an R2 of
.46. SEC-SRT (β = .68, p < .01), but not age (β = −.04,
p = .63), was a significant contributor to the model. A
significant regression model was also obtained using the DTT-SRT and age,
F(2, 85) = 104.99, p < .01, with an
R2 of .71. The DTT-SRT contributed
significantly to the model (β = .86, p < .01) but age
did not (β = .11, p = .08).
Table 2.
Exploratory Correlational Analysis of Outcome Variables.
Age
PTA0.25–8kHz
SEC-SRT
DTT-SRT
Age
–
0.44**
0.28
0.47**
PTA0.25–4kHz
−0.23*
–
−0.02
0.34*
SEC-SRT
−0.11
0.70**
–
0.42**
DTT-SRT
−0.25*
0.86**
0.79**
–
Note. Pearson correlations above the
diagonal were obtained in normal-hearing participants
(N = 40). Values below the diagonal
were obtained in hearing-impaired participants
(N = 88). PTA = pure tone average;
SEC-SRT = sound reception threshold of Sound Ear Check;
DTT-SRT = speech reception threshold for Digit Triplet
Test.
Significant correlations are indicated by an asterisk (*),
with *p < .05 and
**p < .01.
Exploratory Correlational Analysis of Outcome Variables.Note. Pearson correlations above the
diagonal were obtained in normal-hearing participants
(N = 40). Values below the diagonal
were obtained in hearing-impairedparticipants
(N = 88). PTA = pure tone average;
SEC-SRT = sound reception threshold of Sound Ear Check;
DTT-SRT = speech reception threshold for Digit Triplet
Test.Significant correlations are indicated by an asterisk (*),
with *p < .05 and
**p < .01.The strong correlation of r = .86 between the DTT-SRT and
the averaged hearing threshold agrees with findings previously reported for
the Flemish DTT with triplet or digit scoring, for which correlations
between the DTT-SRT and the PTA2,3,4,6kHz (Jansen et al., 2013) or
PTA1,2,4kHz (Denys et al., 2019) of
r = .86 have been found.Although the correlation of r = .70 between the SEC-SRT and
the PTA found in this study is in line with correlations between
r = .60 and r = .75 reported in
literature for sentence tests or CVC tests (Jansen et al., 2014), it is lower
compared with the correlation value of r = .86 found for
the DTT. One factor limiting the correlation obtained for the SEC might stem
from the narrower spectral content of the SEC sounds, which only included
frequencies up to 5000 Hz. Another explanation could be that the SEC and DTT
measure spectrotemporal processing abilities with a different weighting.
Indeed, correlational and post hoc regression analyses showed that the DTT
explained variance in SEC-SRTs in addition to variance explained by the PTA
alone. Although both the DTT and the PTA have an overlap in the spectral
domain, temporal information (i.e., amplitude modulation) is absent in pure
tones but is an important factor for speech discrimination. It is possible
that the SEC is more sensitive to temporal aspects of hearing, whereas the
DTT loads more heavily on spectral resolution. Some studies have shown
that—when accounting for audibility—better amplitude modulation thresholds
are obtained by HI participants when compared with NH participants, a
phenomenon that has been linked to the loss of fast-acting cochlear
compression and loudness recruitment, resulting in a perceptual
magnification of amplitude fluctuations (Schlittenlacher & Moore, 2016;
Wallaert, Moore,
Ewert, & Lorenzi, 2017). It is possible that the sounds of
the SEC are recognized mainly based on their temporal content and that HI
persons are quite proficient in discriminating between the different sounds
because of their increased sensitivity to amplitude modulation. This might
explain why the variation in SEC-SRTs is rather small over the wide range of
hearing loss degrees, resulting in a lower correlation between the SEC-SRT
and the PTA.
Predicting DTT-SRT from PTA, age, and SEC-SRT
The relation between the SEC-SRT and DTT-SRT is shown in Figure 5. A strong and significant
positive correlation of r = .79
(p < .01) was found between the SEC-SRT and the DTT-SRT.
Figure 5.
Scatterplot of the SEC-SRT versus the DTT-SRT. Open circles
represent NH adults (N = 40), and filled
circles represent HI participants (N = 88). The
black line results from a simple linear regression fit to the
data of HI participants. Dotted lines represent the 95% CI of
the fit. Only retest-SRTs are shown.
SEC-SRT = sound reception threshold for the Sound Ear Check;
DTT-SRT = speech reception threshold for the Digit Triplet Test;
SNR = signal-to-noise ratio; NH = normal-hearing;
HI = hearing-impaired.
Scatter plot of the SRT versus the PTA0.25–8kHz for
the SEC (black dots) and the DTT (gray dots). Open circles
represent data from normal-hearing adults
(N = 40), and filled circles represent data
from adults with hearing loss (N = 88). The
lines result from simple linear regression fits to the data
obtained from participants with hearing loss. The dotted lines
represent the 95% CI of the fits. Only retest-SRTs are
shown.SRT = sound reception threshold (SEC)/speech reception threshold
(DTT); PTA = pure tone average; SEC = Sound Ear Check;
DTT = Digit Triplet Test.A significant multiple linear regression model was obtained, predicting the
DTT-SRT from the SEC-SRT, PTA0.25–8kHz and age,
F(3, 84) = 124.08, p < .01, with an
R2 of .82 in HI participants.
PTA0.25–8kHz and age were added to the model, based on the
results from our univariate exploratory correlational analysis. All
predictors contributed significantly to the model independently (SEC-SRT:
β = .41, p < .01; PTA0.25–8 kHz: β = .54,
p < .01; age: β = −.15,
p < .01), with PTA0.25–8kHz being the most
important predictor.
Sensitivity and specificity
Receiver operating characteristic analyses were conducted to investigate the
sensitivity and specificity of the SEC and the DTT to detect mild and
moderate hearing loss based on the results of all participants
(N = 128). For these analyses, PTA0.5–2kHz
and PTA1–4kHz were calculated, as they are often used clinically
to determine the degree of hearing impairment. Mild and moderate hearing
loss was defined as a PTA ≥ 20 and ≥40 dB HL. A sensitivity and specificity
analysis was also conducted for the SEC to detect impaired speech
perception, defined as a failed DTT. Pass/fail-criterion of −10.3 and
−9.5 dB SNR was chosen, that is, 1 and 2 SDs higher than the mean SRT of the
reference group, respectively. The areas under the curve (AUCs), reflecting
the degree of accurate diagnostic classification, were compared for both
tests, and optimal referral criteria were determined with respect to the
trade-off between sensitivity and specificity. AUCs, cutoff values, and
sensitivity and specificity estimates are summarized in Table 3.
Table 3.
Diagnostic Accuracy of the Sound Ear Check (SEC) and Digit
Triplet Test (DTT).
SEC
DTT
Hearing loss criterion
AUC
Pass/fail-criterion (dB SNR)
Sensitivity (%)
Specificity (%)
AUC
Pass/fail-criterion (dB SNR)
Sensitivity (%)
Specificity (%)
PTA0.5–2kHz ≥ 20 dB HL
0.78** [0.70, 0.86]
−10.6
69 (57–80)
71 (58–83)
0.91** [0.86, 0.96]
−9.8
85 (74–92)
84 (72–92)
PTA1–4kHz ≥ 20 dB HL
0.81** [0.74, 0.88]
−10.6
74 (62–84)
73 (60–83)
0.92** [0.88, 0.97]
−9.4
88 (78–95)
87 (76–94)
PTA0.5–2kHz ≥ 40 dB HL
0.92** [0.86, 0.97]
−9.3
82 (57–96)
81 (73–88)
0.98** [0.96, 1.00]
−3.6
94 (71–100)
96 (91–99)
PTA1–4kHz ≥ 40 dB HL
0.86** [0.77, 0.95]
−9.6
77 (59–90)
79 (70–87)
0.97** [0.93, 1.00]
−7.2
94 (79–99)
96 (90–99)
DTT-SRT ≥ −10.3 dB SNR
0.88** [0.92, 0.94]
−10.7
78 (68–86)
83 (69–92)
–
–
–
–
DTT-SRT ≥ −9.5 dB SNR
0.86** [0.80, 0.92]
−10.6
78 (66–87)
78 (66–88)
–
–
–
–
Note. Values between square brackets
indicate 95% confidence intervals. AUC = area under curve;
PTA = pure tone average; DTT-SRT = speech reception
threshold for the DTT; SNR = signal-to-noise ratio.
Asterisks (**) indicate significance, with
p < .01.
Diagnostic Accuracy of the Sound Ear Check (SEC) and Digit
Triplet Test (DTT).Note. Values between square brackets
indicate 95% confidence intervals. AUC = area under curve;
PTA = pure tone average; DTT-SRT = speech reception
threshold for the DTT; SNR = signal-to-noise ratio.Asterisks (**) indicate significance, with
p < .01.Large and significant (p < .01) AUCs were obtained for
both tests, with CIs far from chance level (0.50). A significantly better
diagnostic accuracy was found for the DTT, as indicated by the
nonoverlapping 95% CIs, to detect a mild hearing loss. For moderate hearing
loss, both tests have equal diagnostic accuracy, given overlapping error
bars.With appropriate cutoff values, sensitivity and specificity values ≈ 70% for
the SEC were obtained to detect a mild hearing loss. The test demonstrated
higher sensitivity and specificity values ≈ 80% to detect moderate hearing
loss or (slightly) impaired speech understanding in noise. A pass/fail
criterion of −10.6 dB SNR appears to discriminate best between listeners
with NH and mild hearing problems, both with respect to audibility and SNR
loss.For the DTT, high sensitivity and specificity values of ≈85% to 90% and ≈ 95%
were obtained to detect individuals with a mild (PTA > 20 dB HL) and
moderate (PTA > 40 dB HL) hearing loss, respectively. To detect a mild
hearing loss, optimal pass/fail-criteria were −9.8 (PTA0.5–2 kHz)
and −9.4 (PTA1–4 kHz) dB SNR. To detect a moderate hearing loss,
optimal criteria were −3.6 (PTA0.5–2 kHz) and −7.2
(PTA1–4 kHz) dB SNR. Similar pass/fail-criteria of −10 and
−7.1 dB SNR resulted in comparable sensitivity and specificity values ≈90%
to detect individuals with a mild and moderate high frequency hearing loss,
defined as a PTA2,3,4,6 > 10 dB HL and
PTA2,3,4,6 > 40 dB HL, respectively, in Jansen et al. (2013). Similarly,
Vercammen et al.
(2017) demonstrated a high diagnostic accuracy of the test to
detect middle-aged individuals with hearing loss, defined as a
PTA1,2,4,8 > 25 dB HL with a pass/fail-criterion of
−9.2 dB SNR. In sum, the diagnostic accuracy of the DTT with digit scoring
is comparable with literature published previously, with only small
differences in pass/fail-criteria resulting in similar sensitivity and
specificity values to detect hearing loss, quasi irrespective of hearing
loss definition and in various study samples (Folmer et al., 2017; Koole et al., 2016;
Potgieter,
Swanepoel, & Smits, 2018b; Potgieter et al., 2018a; Qi, Zhang, Fu, & Li,
2018).
Part III. Feasibility of the SEC in Children
The third part of this article concerns an evaluation of the SEC in children with
school-entry ages. In a first group of children, the images and sounds used were
evaluated with respect to their recognizability. In a second group of children, the
test was evaluated in terms of its test–retest reliability and procedural
performance (stability, convergence, and test duration). Also, perceptual
homogeneity among sounds was verified.For the evaluation of sounds and images used, 21 third grade kindergarten
school children were recruited (children aged 5–6 years, Group 1). Because
we did not want development problems, such as problems in the domains of
language development and learning, confounding the results, seven children
were excluded based on a fail on one or more skill domains assessed by the
Screening Instrument For Targeting Educational Risk questionnaire (Anderson, 1989),[4] filled in by the teacher, or based on concerns expressed by the
parents. In addition, one child with a documented hearing loss, reported by
the parents, was excluded. The remaining 13 children (eight boys, five
girls) all passed a pure tone audiometry screening. Their ages ranged from
64 to 80 months (M = 74 months, SD = 4
months). Four children were recruited from an international school and had a
non-Belgian origin (Indian, Danish, Italian, and American).The feasibility of the SEC test as an automated self-test was investigated in
second group of third grade kindergarten and first grade elementary school
children (children aged 5–7 years, Group 2). For this purpose, 32 children
were recruited of whom two were excluded because they failed a pure tone
audiometry screening. For two children, the data could not be retrieved. The
remaining 28 children (11 boys, 17 girls) passed a pure tone hearing
screening and had ages ranging between 65 and 87 months
(M = 77 months, SD = 6 months). All were
of Belgian origin.All children were tested at their schools in a quiet room. A pure tone
screening was conducted using the protocol currently used by Flemish school
health services for the systematic hearing screening of first grade
elementary school children. Pure tone stimuli were either presented via a
Madsen Midimate 622 portable audiometer connected to calibrated TDH-39
headphones or via APEX 3.1 software (Francart et al., 2008, 2017) installed on a
laptop through an external soundcard connected to calibrated HDA-200
headphones. For both ears, a pure tone of 30 dB HL at 1000 and 4000 Hz must
be detected to pass the screening.The evaluation of SEC sounds and images (Group 1) was done as follows: First,
the sounds were played back at a comfortable level via the loudspeaker of a
laptop using VLC media player in random order, and the child was asked to
associate the sound with any referent (free association task). Second, we
randomly showed images of the sounds and asked to name them (picture naming
task). Third, we again played back the sounds at random and asked the child
to link it to one of the pictures (linking task). Finally, the child was
asked to make a drawing of the different sounds (drawing task). No feedback
was given on their responses during any of the earlier tasks. Based on the
data obtained in these children, the original images for the SEC test were
fine-tuned and resulted in the final selection as shown in Figure 1.Following oral instructions, a test–retest of the SEC test was conducted by
the second group of children (Group 2). Materials and procedures used were
identical to those described in Part II, but now both the left and right ear
were tested, and the bilateral training test, following acclimatization,
consisted of 24 trials (instead of 12) to counter the learning effect (cf.
Part II). The start-ear was randomized across participants, and the ear
order was reversed for the retest. Also, at retest, training and
acclimatization phases were skipped. Test and retest assessments were
conducted at separate days, with about one week between sessions.
Evaluation of sounds and images
The number of children correctly associating, naming, and linking the
pictures and sounds are given in Table 4. Overall, “baby,” “piano,”
“bird,” and “horn” were the most problematic. Children seemed to be less
familiar with the sounds of the “baby” and the “piano,” with correct
association for only 2/13 and 6/13 children, respectively. For “baby,” the
picture was unclear for 6/13 children, and the majority of children (7/13)
linked the sound to the wrong image. Confusions with “dog,” “cat,” and
“bird” were observed. These confusions were in line with frequently
appearing animal associations for this sound (e.g.,
rhino, wolf,
monkey, seagull) and its original image (e.g.,
fish, dolphin). Based on drawings of the children, the
image was changed from a crying baby head to a full-bodied crying baby. The
picture for “piano,” on the other hand, seemed clear for all children, and
12/13 were able to link this sound correctly to the image. All children
seemed to be familiar with the sound of the “bird,” but its image was
designated as duck or pigeon by 7/13
children. Despite adequate sound-image linking, the image was changed to
agree more with its highly pitched corresponding bird sound. “Horn”
performed moderately across tasks with correct linking, naming, and
association by 10–11/13 children and was frequently linked to the wrong
image (“bell” or “piano”). As children often associated the sound with a
car, the image was changed accordingly (from horn to
car).
Table 4.
Number of Children (out of 13) Correctly Associating, Naming, and
Linking Sound Ear Check Sounds and Pictures.
Stimulus
Association
Naming
Linking
Baby
2
7
6
Horn
11
11
10
Dog
13
13
13
Cat
10
13
12
Piano
6
13
12
Phone
11
13
13
Bell
12
13
12
Bird
13
6
13
Number of Children (out of 13) Correctly Associating, Naming, and
Linking Sound Ear Check Sounds and Pictures.
Evaluation of the SEC
Prior to analysis, the results from five children with one or more unreliable
staircases were excluded, that is, staircases in which the ceiling-SNR
of + 10 dB was reached (two measurements) or staircases that lead to a
positive SRT (three measurements).To investigate the feasibility of the test in young children, first, SRTs,
test–retest reliability, test stability, and test duration were
investigated. A repeated measures ANOVA was conducted to investigate effects
of session and ear on the SRT. Because counterbalancing start-ears could
have obscured a possible improvement of the SRT due to learning or a
decrease of the SRT due to attention loss, a repeated measures ANOVA was
also conducted with test order as a within-subject factor. Also, we examined
the within-subject SDs across the SNRs used for the calculation of the SRT,
which indicates the stability of individual measurements, and investigated
effects of ear, session, and test order by means of repeated measures
ANOVAs. When the assumption of sphericity was violated, degrees of freedom
were Greenhouse-Geisser corrected. With respect to test
duration, a paired-samples t test was conducted to
investigate differences from test to retest, thereby discarding training and
acclimatization phases.SRTs and SDs are shown in Figure 6. A repeated measures ANOVA indicated a significant main
effect of ear, F(1, 22) = 7.18,
p < .05, but not session, F(1,
22) = 2.41, p = .14, nor a significant session x
ear interaction, F(1, 22) = 0.12,
p = .74. The difference between ears was 0.7 dB, with
better SRTs for the left ear (−9.4 vs. −8.7 dB SNR, on average). Albeit a
small left–right difference, this finding is surprising given the study
design with counterbalanced start-ears, which would balance out learning or
attention effects. A repeated measures ANOVA indicated no significant effect
of test order, F(3, 66) = 0.81, p = .49,
indicating that the full-length training list eliminated the learning effect
observed in adults (Part II). Hence, the SRTs of repeated tests were
averaged per participant and yielded a reference SRT (±2 SD) of −9 (±2.8) dB
SNR. This value differs from the reference of ≈−11 dB SNR obtained in NH
adults by ≈2 dB, which is in line with child–adult differences reported in
the literature for speech-in-noise tests (Leibold, 2017).
Figure 6.
Boxplots of the children’s SRT (upper panel) and test stability
(SD = standard deviation, lower panel) for left (crosses) and
right (open circles) ears.
Scatterplot of the SEC-SRT versus the DTT-SRT. Open circles
represent NH adults (N = 40), and filled
circles represent HI participants (N = 88). The
black line results from a simple linear regression fit to the
data of HI participants. Dotted lines represent the 95% CI of
the fit. Only retest-SRTs are shown.SEC-SRT = sound reception threshold for the Sound Ear Check;
DTT-SRT = speech reception threshold for the Digit Triplet Test;
SNR = signal-to-noise ratio; NH = normal-hearing;
HI = hearing-impaired.Within-subject SDs were on average 1.9 dB, with a SD of 0.3 dB, which is on
the order of the step size used in the adaptive procedure. No significant
effects of session, F(1, 22) = 2.54,
p = .12;, ear, F(1, 22) = 0.59,
p = .45; session x ear,
F(1, 22) = 0.12, p = .74; or test
order, F(2.1, 46.17) = 1.53, p = .23, were
found. Test–retest reliability was quantified by taking the RMS of the SDs
of the SRT across test repetitions per participant. A measurement error of
1.3 dB was found. This value is higher than the 0.8 dB found in adults, but
still acceptable.Test durations recorded were on average 3 min 4 s and 3 min 15 s for test and
retest, respectively, with SDs of 22 and 24 s. The small difference of 11 s
was statistically significant according to a paired-samples
t test, t(22) = −2.55,
p < .05. Acclimatization and training took on
average 2 min 33 s (SD = 43 s). Hence, total test duration
would be 5 min 42 s (SD = 53 s), that is, about 6 min.Based on pooled data across sessions and ears, a stimulus-response matrix was
constructed, as well as an approximation of the overall and item-specific
psychometric curves to investigate confusions and verify the convergence of
the adaptive procedure and homogeneity of tokens used. Psychometric curves
were fitted using Formula (1), but with SNRs normalized to the participants’
SRT.A confusion matrix with prominent diagonal scores ranging between 60% (for
“bell”) and 83% (for “bird”) was obtained. Except for “baby,” which was
often confused with “cat” (17%), all confusions were well below chance level
(12.5%), with the majority (85%) below 5%.Psychometric curve fitting yielded an SRT71 of 0 dB SNR,
indicating that the adaptive procedure adequately converged to the threshold
(SNRs were normalized to the SRT). A slope of 15.9%/dB was obtained, which
is close to the slope of 18%/dB obtained in adults and which is in
accordance to the measurement reliability of 1.3 dB. This points to a highly
balanced material in terms of perceptual equity. Indeed, the SD across SRT
fits for the individual tokens was only 0.4 dB and fitted SRTs for the
individual tokens ranged over 1.4 dB. Except for “bell,” SRT fits for the
individual tokens deviated by ≤ 0.5 dB from the average SRT across sound
tokens. These findings validate the level adjustments applied in the
homogenization experiment (Part I).
Conclusions and Future Research Perspectives
A language-independent hearing screening test based on masked recognition of
ecological sounds, the SEC, was developed. Following perceptual equation of
difficulty across stimuli, a psychometric curve with a steep slope was obtained,
resulting in a highly precise test. SEC outcomes obtained from adults were
significantly associated with the averaged hearing threshold and moderately high
sensitivity and specificity values to detect mild hearing loss were found. However,
the DTT outperformed the SEC, with respect to its association with the audiogram and
diagnostic accuracy to detect mild hearing loss. Higher sensitivity and specificity
values were found to detect individuals with (slightly) impaired speech
understanding in noise.The main advantage of the SEC is its language independence. The test can potentially
be applied to participants who are not native speakers or not fluent in the local
language. This advantage suggests that the SEC could be used as an international
standard hearing screening test and is possibly applicable in countries that have no
validated speech-in-noise test available. Another extremely relevant application is
the framework of school-age hearing screening for which universal standards are
nonexistent and the current methods lack precision. This study showed that the SEC
might be feasible for this purpose, given its high reliability and stability.
Although one training test is required to eliminate the learning effect, test
durations are appropriate for school-age hearing screening.Nevertheless, the high proportion of children (5/28) aged 5 to 7 years performing the
SEC unreliably is still a barrier for wide application. It might be that
instructions are not sufficiently clear, that the current test takes too long or is
not sufficiently engaging, or that the use of eight sounds is too taxing for some
children. Furthermore, whereas the use of ecological sounds makes the test language
independent, it does not automatically bypass potential cultural effects on their
recognition. Therefore, the sound material used could be revised. Currently, not all
stimuli are of equal length (zero-padded noise-stimulus-noise blocks are used).
Also, problematic stimuli could be removed or replaced, preferably taking into
account familiarity across cultures/countries. Most important, the relatively poor
sensitivity and specificity values obtained in adults for the detection of mild
hearing loss hinder broad adoption of the current test format, especially because
children with mild hearing losses are also being missed by currently implemented
school-age hearing screening programs.Validation of the test in young children with different degrees and types of hearing
loss is currently ongoing in a multicenter study steered by the European Federation
of Audiology Societies Working Group on School-Age Hearing Screening and is a
collaborative effort between partners from different European countries. The results
of that study will allow a revision of the test and fuel follow-up research
comparing the SEC test with the locally used school-entry hearing screening methods
across multiple countries. Ultimately, a uniform screen, for example, the SEC, at
school entry may provide the opportunity of linking the results of the school-age
screen with newborn hearing screening outcomes. This provides the potential of an
accurate quantification of the incidences of acquired and late-onset childhood
hearing loss and, as such, form an evidence base for the need of school-age hearing
screening.
Table 5.
Factor Loadings Per Sound in the Spectral and Temporal
Domain.
Authors: Melanie A Zokoll; Kirsten C Wagener; Thomas Brand; Michael Buschermöhle; Birger Kollmeier Journal: Int J Audiol Date: 2012-07-04 Impact factor: 2.117
Authors: Tytti Willberg; Michael Buschermöhle; Ville Sivonen; Antti A Aarnisalo; Heikki Löppönen; Birger Kollmeier; Aarno Dietz Journal: Acta Otolaryngol Date: 2016-04-28 Impact factor: 1.494
Authors: Elien Van den Borre; Sam Denys; Lea Zupan; Jan A P M de Laat; Nina Božanić Urbančič; Astrid van Wieringen; Jan Wouters Journal: Trends Hear Date: 2022 Jan-Dec Impact factor: 3.496