
Profiles and predictors of onset based differences in vocal characteristics of adults with auditory neuropathy spectrum disorder (ANSD).

Prateek Lokwani¹, Prashanth Prabhu¹, Kavassery Venkateswaran Nisha¹.

Abstract

Purpose: Onset-based differences are understudied in Auditory Neuropathy Spectrum Disorder (ANSD) in dimensions such as voice, which is addressed in the study. The study aimed to profile and predict the best metrics of onset-related differences in acoustic vocal characteristics of early and late-onset ANSD patients.
Methods: Thirty-one participants (15 early-onset and 16 late-onset) aged 15-30 years and diagnosed with ANSD were included in the study. Each participant recorded a sustained phonation of the vowel /i/ using an Android-based smartphone of a specified configuration and sent the recording to the experimenter by email. Acoustic parameters (fundamental frequency, formant frequencies, jitter, shimmer, harmonic-to-noise ratio, cepstral peak prominence (CPP), and pitch sigma) were analysed using Praat software.
Results: The early-onset group showed a significantly higher (p < 0.05) fundamental frequency and lower F2 and F3 of /i/ than the late-onset group, which can be explained by differences in the pathophysiology of the disorder. Although the differences were not statistically significant, mean perturbations (jitter and shimmer), harmonic-to-noise ratio, cepstral peak prominence, and pitch sigma were more affected in the early-onset group, reflecting reduced auditory feedback and periodicity in their voice samples. Discriminant analysis identified F2, F3, and CPP as the most sensitive metrics for onset-based group differences in voice characteristics.
Conclusions: The findings from the study highlight the role of acoustical voice evaluation (especially CPP, F2 & F3) in verifying the onset of ANSD. The insights from the onset-based differences seen in vocal characteristics can indirectly help audiologists in deciding the management options for ANSD.


Keywords: Acoustic analysis; Audiologists; Auditory neuropathy spectrum disorder; Phonation; Voice

Year: 2022 PMID: 36249919 PMCID: PMC9547112 DOI: 10.1016/j.joto.2022.08.001

Source DB: PubMed Journal: J Otol ISSN: 1672-2930

Introduction

Ever since its first description by Starr et al. (1996), auditory neuropathy spectrum disorder (ANSD) has captivated the attention of audiologists worldwide due to its heterogeneity. Every aspect of the disorder presents an array of heterogeneous manifestations, including its onset (Berlin et al., 2010; Boudewyns et al., 2016; De Siati et al., 2020; Jijo and Yathiraj, 2012; Kumar and Jayaram, 2006; Shivashankar et al., 2003; Zhang et al., 2016), prevalence (Kumar and Jayaram, 2006; Mittal et al., 2012; Penido and Isaac, 2013; Rance, 2005; Vignesh et al., 2016), aetiology (Berlin et al., 2003, 2010; Draper and Bamiou, 2009; Prabhu et al., 2012; Rance et al., 1999), pathophysiology (Nikolopoulos, 2014; Rance and Starr, 2015), symptomatology (Berlin et al., 2010; Prabhu et al., 2012; Rance, 2005) and rehabilitative options (Nikolopoulos, 2014; Norrix and Velenovsky, 2014; Rance and Starr, 2015). The onset-based distinctions in ANSD are often associated with its aetiology, symptoms, and pathophysiology. Early-onset ANSD is usually secondary to hyperbilirubinemia (Berlin et al., 2010; Rance et al., 1999), ototoxic drugs, low birth weight, low APGAR scores, anoxia and positive family history (Berlin et al., 2003). In contrast, Prabhu et al. (2012) reported that adults with late-onset ANSD did not have any pre-, peri-, or postnatal causes; instead, some predisposing factors were associated with the disorder. These factors include exposure to toxic chemicals (pesticides) and toxic solvents (xylene), low socioeconomic status, and hormonal variations during puberty (Draper and Bamiou, 2009). Other aetiologies associated with late-onset ANSD are temperature-dependent changes (Cianfrone et al., 2006), hereditary sensory and motor neuropathy (Leonardis et al., 2000), Charcot-Marie-Tooth disease (Rance et al., 2012), and mutations in genes such as AUNA1, PCDH9, OTOF, DFN895, GJB2 and AUNX1 (Manchaiah et al., 2011).
The clinical symptoms seen in late-onset patients are vertigo, headache, tinnitus, defective vision, and difficulty in understanding speech (Prabhu et al., 2012), whereas early-onset patients exhibit difficulty in understanding speech that is disproportionate to the degree of hearing loss, difficulty hearing in noise (Kraus et al., 2000; Rance et al., 2007), tinnitus (Chandan et al., 2013; Prabhu and Chandan, 2014) and vestibular problems (Hu et al., 2020; Prabhu and Jamuar, 2017). Late-onset patients show a rising configuration of hearing loss, which could be pathophysiologically linked to more affected apical nerve fibres (Jijo and Yathiraj, 2012; Kumar and Jayaram, 2006). In contrast, early-onset ANSD patients show a flat loss, with pathophysiological bearings related to the degradation of apical fibres followed by the basilar region (Kumar and Jayaram, 2006).
Although the onset-based heterogeneity in ANSD patients is usually explored using the above-cited manifestations, the studies on late-onset ANSD were primarily retrospective designs that included only the target (late-onset) population, which limits the scope for comparison with early-onset manifestations. While onset-related distinctions are often described for explanatory purposes in these studies, a direct inference cannot be made, as they lack the experimental control provided by prospective designs. Further, the late-onset diagnosis in these cases depends on the patient complaints documented in the case history (Berlin et al., 2010). If a patient reports the onset of symptoms in late adulthood, the lack of audiological reports from childhood limits what can be known about auditory function in the earlier years. In addition, questions regarding the efficacy of newborn hearing screening and the primary infrastructure for audiological testing in developing countries (Gupta et al., 2015; McPherson, 2012), where late-onset cases are reported, make research comparing late- and early-onset characteristics even more challenging. To date, no study has systematically explored such onset-based group differences (late vs. early-onset). Thus, the present study aimed to profile ANSD onset-based differences (late vs. early-onset) in a relatively understudied dimension, i.e., voice characteristics. The study also aimed to identify the metrics that best predict such differences in voice characteristics between the groups.
The motivation for the study is derived from the findings of Maruthy et al. (2019) on deviant voice characteristics in adults (age range 17–30 years) with long-standing late-onset ANSD. They reported increased roughness, breathiness, and strain, along with increased pitch and reduced loudness, in the voice of adults with late-onset ANSD compared to normal age-matched individuals. In contrast, studies have reported that individuals with childhood-onset hearing problems show high variability of the fundamental frequency, excessive intonation and pitch variation, increased loudness, and irregularities in resonance (Evans and Deliyski, 2007). A number of reports describe the voice characteristics of individuals with hearing loss (Campisi et al., 2006; Coelho et al., 2015; Wirz et al., 1981), but these findings cannot be generalized directly to the ANSD group because of the different pathophysiology and duration of the disorder. Based on this evidence, we hypothesize that late-onset ANSD patients are likely to show less deviant voice characteristics than the early-onset ANSD group. This study would be the first of its kind to describe the onset-related vocal manifestations in early and late-onset ANSD patients.
The specific objectives of the study were to compare the differences (if any) in acoustic voice characteristics (fundamental frequency, formants, harmonic-to-noise ratio, jitter, shimmer, cepstral peak prominence, and pitch sigma) between the early and late-onset groups with ANSD and predict best metrics of onset-related differences.

Materials and methods

Participants

A total of 31 participants aged 15–30 years who reported to the All India Institute of Speech and Hearing, Mysuru, India, and were diagnosed with bilateral auditory neuropathy spectrum disorder (ANSD) by certified audiologists were considered for the study. The criteria adopted to diagnose ANSD in the Audiology clinic were those recommended by Starr et al. (2000): absent or abnormal ABRs (delayed in latency or attenuated in amplitude), presence (average or robust amplitude) of OAEs, and absent middle ear reflexes. Based on the clinical records, the diagnosis of ANSD was confirmed by a neurologist using clinical examination and Computerized Axial Tomography/Magnetic Resonance Imaging. The participants were divided into two groups based on the onset of the ANSD symptoms: early-onset (n = 15, 11 females, four males, mean age = 22.33 y ± 4.18) and late-onset (n = 16, 12 females, four males, mean age = 22.78 y ± 4.20). A cut-off criterion of 12 years was used for the group segregation, based on the recommendations of the Centers for Disease Control and Prevention (CDC). Participants who were diagnosed with ANSD in childhood (6–10.2 years), with the problem reported at birth (reported and assessed at the institute between 2005 and 2011), were considered for the former group, while the latter group comprised adults who were diagnosed with ANSD at the age of >12 years, with no complaints of auditory deficits in childhood (reported and assessed between 2013 and 2020). Care was taken to include in the late-onset group only participants whose onset was less than five years earlier, as long-standing ANSD adversely affects vocal characteristics (Maruthy et al., 2019). Also, to rule out any language problems, the Clinical Evaluation of Language Fundamentals (CELF-4) (Semel and Wiig, 1980) was administered to the late-onset patients, who were included only if their language skills were age-appropriate.
The recordings from three participants in the early-onset group were excluded because they did not fulfill the noise-free criterion for inclusion (excessive background noise). Table 1 shows the demographic and audiological details, comprising the degree (Clark, 1980; Goodman, 1965) and configuration (Pittman and Stelmachowicz, 2003) of hearing loss, of all the participants included in the study.

Table 1

Demographic details of all the participants along with their audiological characteristics.

		Early-onset				Late-onset
S. No.	Ear	Age (in y)	Gender	Degree	Configuration	Age (in y)	Gender	Degree	Configuration
1	Right	17.2	Female	Severe	Flat	20.6	Female	MS	Rising
1	Left	17.2	Female	Severe	Flat	20.6	Female	Moderate	Rising
2	Right	15.5	Female	MS	Flat	20.5	Male	Moderate	Irregular
2	Left	15.5	Female	MS	Flat	20.5	Male	Moderate	Rising
3	Right	21.4	Male	MS	Rising	16.2	Male	Minimal	Rising
3	Left	21.4	Male	MS	Irregular	16.2	Male	Minimal	Rising
4	Right	28.5	Female	Severe	Flat	18.4	Female	Minimal	Rising
4	Left	28.5	Female	Severe	Flat	18.4	Female	Normal	–
5	Right	21.5	Female	MS	Flat	20.1	Male	Normal	–
5	Left	21.5	Female	Severe	Flat	20.1	Male	Minimal	Rising
6	Right	24.4	Female	Severe	Flat	21.8	Female	Minimal	Flat
6	Left	24.4	Female	Severe	Flat	21.8	Female	Moderate	Rising
7	Right	23	Female	MS	Flat	27.10	Female	Moderate	Flat
7	Left	23	Female	MS	Flat	27.10	Female	Moderate	Rising
8	Right	26.4	Male	MS	Rising	27.2	Female	Moderate	Rising
8	Left	26.4	Male	Moderate	Rising	27.2	Female	Moderate	Rising
9	Right	22.7	Female	MS	Flat	16.4	Female	Mild	Flat
9	Left	22.7	Female	MS	Flat	16.4	Female	Moderate	Rising
10	Right	21.8	Female	Severe	Flat	26.8	Female	Minimal	Flat
10	Left	21.8	Female	Severe	Flat	26.8	Female	Mild	Flat
11	Right	23.6	Female	Profound	Flat	23.5	Female	Minimal	Rising
11	Left	23.6	Female	Severe	Sloping	23.5	Female	Normal	–
12	Right	22.1	Male	MS	Rising	21.5	Female	Moderate	Rising
12	Left	22.1	Male	MS	Rising	21.5	Female	Moderate	Rising
13	Right	24.8	Male	Severe	Flat	29.9	Female	Mild	Rising
13	Left	24.8	Male	Severe	Flat	29.9	Female	Moderate	Rising
14	Right	20.5	Female	Moderate	Flat	26.8	Female	MS	Flat
14	Left	20.5	Female	MS	Flat	26.8	Female	MS	Rising
15	Right	21.6	Female	Severe	Flat	26.0	Male	Mild	Flat
15	Left	21.6	Female	Profound	Flat	26.0	Male	Minimal	Rising
16	Right	–	–	–	–	21.5	Female	Normal	–
16	Left	–	–	–	–	21.5	Female	Moderate	Rising

Note: MS- moderately severe.


Informed consent and ethical considerations

Informed consent was obtained from all the participants through Google Forms, where each participant was briefly informed about the objective of the study and the need for it. The anonymity of the participants was maintained throughout the study. The willingness of any patient to participate in the study did not affect their routine audiological assessment and other evaluations. All procedures performed in this study adhered to the bio-behavioral research standards (Venkatesan and Basavaraj, 2009) framed by the institutional ethical review board, whose permission was obtained for the study.

Procedure

The participants short-listed after the screening of medical records were contacted by telephone to assess their language skills (as discussed under the inclusion criteria) and voice characteristics. They were asked to record a sustained phonation of the vowel /i/ for at least 5 s, with three trials, and to send the recorded voice samples by email. The vowel /i/ was chosen for sustained phonation because of its high discriminatory potential (due to its high-frequency components) in detecting deviations/perturbations in voice. To facilitate understanding of the task, a recorded video of the instructions, along with a sustained phonation sample produced by an Indian male speaker, was sent to the participants. The participants were asked to keep the smartphone microphone 6 cm from the mouth (described to them as two-thirds of the length of the index finger). Smartphones meeting a minimum configuration (Android 4, CPU frequency >1.3 GHz) were used for recording (Manfredi et al., 2017). Uloza et al. (2015) showed that smartphones are reliable in recording and assessing acoustic voice parameters. Sustained phonation was chosen over connected speech because connected speech may display more fluctuations when recorded from a smartphone. The studies cited above also used sustained vowel phonation rather than connected speech to assess voice quality. Online data collection was adopted because of the need for social distancing and alternative assessment procedures during the COVID-19 crisis. Alternatives to conventional voice assessment (Jannetts et al., 2019; Lin et al., 2012; Maryn et al., 2017) have become increasingly useful during the COVID-19 pandemic, as they offer both accessibility and safety.
To validate the online recordings against conventional voice sample recordings, a pilot study comprising voice samples of five normal adults (18–25 years) was carried out using both methods. The adults were asked to phonate /a/. A smartphone with Android 8 and a CPU frequency of 2.05 GHz was used for the online recording, while the offline analyses were carried out using the Computerized Speech Lab module (CSL; Kay Elemetrics, 1996). The vocal parameters used in the current study were compared between the two recording modes using the Mann-Whitney U test, which showed no statistically significant difference (p > 0.05) between the recordings on any of the parameters considered (fundamental frequency: |z| = 0.31, p = 0.75; F1: |z| = 0.10, p = 0.92; F2: |z| = 0.32, p = 0.75; F3: |z| = 0.94, p = 0.35; jitter: |z| = 0.11, p = 0.92; shimmer: |z| = 1.05, p = 0.29; HNR: |z| = 0.73, p = 0.47; CPP: |z| = 0.23, p = 0.22; pitch sigma: |z| = 0.45, p = 0.65). To monitor the environmental noise, an Android-based application, Sound Meter, developed by Smart Tools Company (Ibekwe et al., 2016), was used at the participants' end. The experimenter supervised each online recording session live through a video call. The participants were also asked to send the environmental noise data for the entire recording, which the experimenter analysed before including the voice sample. Samples with environmental noise less than 45 dB SPL were included for analysis (Lebacq et al., 2017).

Voice analyses

Although voice samples were obtained from 31 participants, three samples were excluded due to background noise (>60 dB SPL). The vocal characteristics were analysed from the remaining 28 clearly recorded waveforms, which came from 12 individuals with early-onset and 16 individuals with late-onset ANSD. The noise-free voice samples were subjected to acoustic analyses. The acoustic parameters of voice were assessed using Praat software (Boersma and Weenink, 2010), where fundamental and formant frequencies, jitter, shimmer, harmonic-to-noise ratio, cepstral peak prominence, and pitch sigma were computed and compared between the groups. Burris et al. (2014) concluded that the fundamental frequencies and formants generated by Praat were reliable, accurate, and comparable to the values obtained using other software packages such as WaveSurfer (Sjölander and Beskow, 2005), TF32 (Milenkovic, 2010), and the Computerized Speech Lab module (CSL; Kay Elemetrics, 1996). The segment of each recording with the most stable waveform was extracted and analysed. Fundamental frequency (F0), the first three formant frequencies (F1, F2, & F3), and pitch sigma (standard deviation of F0) were computed for each recording. Jitter, shimmer, and harmonic-to-noise ratio (HNR) were calculated using the Point-process option in Praat. Cepstral peak prominence (CPP) was obtained from the spectrum of the waveform.
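As an illustration of what the perturbation measures capture, the following is a minimal Python sketch of local jitter, local shimmer, and pitch sigma computed from per-cycle values. It is not the Praat implementation used in the study. The function names and input values are hypothetical, and the definitions assumed here are the common "local" ones (mean absolute difference between consecutive cycles divided by the mean), with pitch sigma taken as the sample standard deviation of the per-cycle F0.

```python
from statistics import mean, stdev


def local_jitter(periods):
    """Mean absolute difference between consecutive periods, relative to the mean period (%)."""
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    return 100.0 * mean(diffs) / mean(periods)


def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive peak amplitudes, relative to the mean amplitude (%)."""
    diffs = [abs(b - a) for a, b in zip(amplitudes[:-1], amplitudes[1:])]
    return 100.0 * mean(diffs) / mean(amplitudes)


def pitch_sigma(periods):
    """Sample standard deviation of the per-cycle fundamental frequency (Hz)."""
    f0_values = [1.0 / p for p in periods]
    return stdev(f0_values)


# Hypothetical glottal cycle durations (s) and peak amplitudes (arbitrary units)
example_periods = [0.0040, 0.0041, 0.0040, 0.0041]
example_amplitudes = [0.50, 0.52, 0.50, 0.52]

print(local_jitter(example_periods))
print(local_shimmer(example_amplitudes))
print(pitch_sigma(example_periods))
```

A perfectly periodic signal gives zero on all three measures, so larger values indicate less stable phonation.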

Statistical analyses

The data obtained were subjected to statistical analyses using the IBM Statistical Package for the Social Sciences (SPSS) version 25.0 (SPSS Inc., Chicago). The Shapiro-Wilk test of normality was used to check whether the data were normally distributed. Multivariate Analysis of Variance (MANOVA) was carried out for the parametric data, while the Mann-Whitney U test was used to compare the differences (if any) in vocal characteristics between the groups when the data followed a non-normal distribution. Partial eta squared (ηp²) was noted wherever significant differences were observed in parametric tests. Fisher Discriminant Analysis (FDA) was used to identify the set of voice measures that most effectively distinguished the onset-based groups.
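For readers unfamiliar with the non-parametric comparison, the sketch below shows how a Mann-Whitney U statistic can be converted to a z value with the normal approximation. It is an illustration only and not the SPSS procedure used in the study: it assigns average ranks to ties but omits the tie and continuity corrections that statistical packages may apply, and the function name and jitter values are hypothetical.

```python
import math


def mann_whitney_z(x, y):
    """z statistic for the Mann-Whitney U test (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values so they share an average rank
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u_x - mean_u) / sd_u


# Hypothetical jitter values (%) for two groups
early = [1.2, 0.9, 2.4, 1.8, 3.1]
late = [0.8, 1.1, 1.5, 0.7, 2.0]
print(abs(mann_whitney_z(early, late)))
```

The absolute value of z is what is reported as |z| for jitter and shimmer; values near zero indicate heavily overlapping rank distributions.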

Results

Shapiro-Wilk test showed that all the measures, except jitter and shimmer, adhered to normal distribution (p > 0.05). The descriptive statistics comprising the mean for fundamental and formant frequencies along with standard deviation are shown in Fig. 1. On visual examination, the early-onset group exhibited higher fundamental frequency and lower formant frequencies (F2 and F3) when compared to the late-onset group. This was statistically verified with MANOVA, as seen in Table 2.

Fig. 1

Box plots depicting median (center line), along with interquartile range (error bars) of (A) Fundamental frequency F0, (B) first formant: F1, (C) second formant: F2, and (D) third formant: F3 of /i/ sustained vowel phonations for early-onset and late-onset ANSD groups. The individual data points for the fundamental frequency (F0) and the first three formants are also indicated on the corresponding plots. Graphs marked with asterisks indicate the presence of a statistically significant difference (∗p < 0.05).

Table 2

Results of inferential statistical test (Mann-Whitney U and Independent t-test) for comparison of group differences in measures of fundamental and formant frequencies, harmonic-to-noise ratio (HNR), perturbations (jitter, shimmer), cepstral peak prominence (CPP) and Pitch sigma.

Acoustic Parameter	Inferential Statistics results
F₀	F(1, 26) = 5.96, p = 0.02, ηp² = 0.19
F₁	F(1, 26) = 1.89, p = 0.44, ηp² = 0.07
F₂	F(1, 26) = 6.10, p = 0.02, ηp² = 0.20
F₃	F(1, 26) = 8.02, p = 0.01, ηp² = 0.24
Jitter	|z| = 0.33, p = 0.74
Shimmer	|z| = 0.19, p = 0.85
HNR	F(1, 26) = 0.47, p = 0.50, ηp² = 0.02
CPP	F(1, 26) = 1.56, p = 0.22, ηp² = 0.06
Pitch Sigma	F(1, 26) = 2.31, p = 0.14, ηp² = 0.09

The early-onset group had higher perturbations for the sustained phonation of /i/, as seen in Fig. 2. On the other hand, a relatively lower HNR was recorded in the voice samples of the early-onset group. It was also observed that CPP was higher for the late-onset group, whereas pitch sigma was higher for the early-onset group. However, none of these differences reached statistical significance, as shown in Table 2.
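The effect sizes in Table 2 can be related to the F statistics through the standard identity for partial eta squared, F × df_effect / (F × df_effect + df_error). The short Python sketch below applies it to three of the reported F values. The function and variable names are illustrative, and because the published F values are rounded, the recomputed effect sizes would only be expected to agree with the table to within about 0.01.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F value, reported partial eta squared) taken from Table 2; df = (1, 26)
reported = {"F0": (5.96, 0.19), "F2": (6.10, 0.20), "F3": (8.02, 0.24)}

for name, (f_value, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, 1, 26)
    print(name, round(eta, 3), "reported:", eta_reported)
```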

Fig. 2

Box plots depicting median (center line), along with interquartile range (error bars) for (A) Frequency perturbation (Jitter), (B) Amplitude perturbation (Shimmer), (C) Harmonic-to-Noise ratio (HNR), (D) Cepstral Peak Prominence (CPP) and (E) Pitch Sigma for /i/ sustained phonations of early-onset and late-onset ANSD groups.

The parameters that showed significant onset-based group differences (F0, F2 and F3 of /i/) are illustrated in Fig. 3 using phonation samples from two female participants, one from each onset group. The color-coded bands in the spectrogram correspond to bands of acoustic energy. On visual inspection, F0 differs clearly between the two samples (Fig. 3A). The energy bands corresponding to F2 and F3 are also located differently in the two samples (Fig. 3B). Further, the mean F2 and F3 for the early-onset group were 2108 and 2875 Hz, respectively, whereas they were higher (2529 and 3187 Hz) for the late-onset group. However, no statistically significant difference was observed in the first formant of the vowel /i/.

Fig. 3

Spectrograms of /i/ sustained phonation of a female patient of early-onset (blue, left) and late-onset (black, right) ANSD. (A) Shows the distinction in F0 and (B) shows the distinction in F2 and F3 between the two groups.

Discriminant analyses identifying the optimal measure sensitive to ANSD onset-based group differences

Fisher discriminant analysis (FDA) identified the fundamental frequency, formants, harmonic-to-noise ratio, and perturbations as the measures that best distinguish the ANSD onset-based groups in voice. The canonical discriminant function (DF), which statistically clustered the behavioral measures that segregated the onset-based ANSD groups, accounted for 100% of the variance (Wilks' lambda, λ(14) = 0.366, χ² = 20.60, p = 0.02). However, an examination of the weights (canonical coefficients) for each measure indicated that CPP, followed by F3 and F2, was weighted most heavily on DF1, as reflected in Table 3. The corresponding structure matrix, which indicates the pooled within-groups correlations between the discriminating variables and the standardized canonical DF, is also shown in the same table. The canonical DF obtained in the study based on the weights (Table 3) is summarized below:
DF1 = (0.91 × CPP) + (0.84 × F3) + (0.73 × F2) + (0.66 × Pitch Sigma) + (0.64 × Shimmer) + (0.17 × HNR) − (0.24 × F1) − (0.44 × F0) − (0.66 × Jitter)

Table 3

Contribution (weights) of auditory tests for group membership prediction of onset based ANSD groups.

Discriminating Variable	Weights	Structure matrix
F0	−0.44	−0.34
F1	−0.24	−0.14
F2	0.73	0.37
F3	0.84	0.50
Jitter	−0.66	−0.17
Shimmer	0.64	−0.01
HNR	0.17	−0.10
CPP	0.91	0.19
Pitch Sigma	0.66	−0.23

Each participant's score on the discriminant function was calculated by multiplying the standardized canonical DF coefficient by the individual's score on each study measure and summing these products. The resulting frequency (y-axis) for each discriminant score (x-axis) is shown in Fig. 4. It is clear from Fig. 4 that the DF separates the early-onset ANSD group from the late-onset ANSD group, which emerged as two distinct clusters concentrated on either side of the reference line.
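The score calculation described above amounts to a weighted sum. The following Python sketch uses the weights from Table 3 and assumes, as is conventional for standardized canonical coefficients, that each measure has first been converted to a z-score; the participant values shown are hypothetical and the function name is illustrative.

```python
# Standardized canonical discriminant function coefficients from Table 3
WEIGHTS = {
    "F0": -0.44,
    "F1": -0.24,
    "F2": 0.73,
    "F3": 0.84,
    "Jitter": -0.66,
    "Shimmer": 0.64,
    "HNR": 0.17,
    "CPP": 0.91,
    "Pitch Sigma": 0.66,
}


def discriminant_score(z_scores):
    """Sum of coefficient x standardized score over all nine measures."""
    return sum(WEIGHTS[name] * z_scores[name] for name in WEIGHTS)


# Hypothetical participant: z-scored measures (assumption, not study data)
participant = {
    "F0": -0.5, "F1": 0.1, "F2": 0.8, "F3": 1.0, "Jitter": -0.2,
    "Shimmer": 0.0, "HNR": 0.3, "CPP": 0.6, "Pitch Sigma": -0.4,
}
print(discriminant_score(participant))
```

Scores falling on opposite sides of the cut-off line are assigned to different onset groups.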

Fig. 4

Bar graphs representing the Discriminant Function scores for the segregation of both the groups. The dotted line is the reference for cut-off scores between the groups on the discriminant function.

The error rate of the FDA (indicating the accuracy of classification) was assessed by comparing the case-wise statistics of participants' DF scores against their original pre-verified group, as shown in Table 4. An overall accuracy of 85.42% in classification was seen, indicative of the clear segregation of the groups based on the weights obtained in the FDA.

Table 4

Accuracy of discriminant function analyses comparing predicted and original group memberships. Total participants (number count) is tabulated with the corresponding percentage in parentheses.

Original Group	Predicted Group Membership
Original Group	Early-onset	Late-onset	Total
Early-onset	91.7% (11)	8.3% (1)	100% (12)
Late-onset	6.3% (1)	93.8% (15)	100% (16)
Total	100% (12)	100% (16)	100% (28)
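The row percentages in Table 4 follow directly from the classification counts, as the Python sketch below illustrates. The data structure and function names are illustrative. The overall proportion computed here is simply the share of correctly classified cases among the tabulated counts; it need not coincide with an overall accuracy obtained by a different procedure (for example a cross-validated estimate, which is an assumption and not something the source specifies).

```python
# Classification counts from Table 4: original group -> predicted group -> count
COUNTS = {
    "Early-onset": {"Early-onset": 11, "Late-onset": 1},
    "Late-onset": {"Early-onset": 1, "Late-onset": 15},
}


def row_percentages(row):
    """Percentage of an original group assigned to each predicted group."""
    total = sum(row.values())
    return {predicted: 100.0 * n / total for predicted, n in row.items()}


def overall_accuracy(counts):
    """Percentage of all tabulated cases whose predicted group equals the original group."""
    correct = sum(counts[group][group] for group in counts)
    total = sum(sum(row.values()) for row in counts.values())
    return 100.0 * correct / total


for group, row in COUNTS.items():
    print(group, row_percentages(row))
print(overall_accuracy(COUNTS))
```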


Discussion

The present study aimed to delineate the differences in vocal characteristics of early and late-onset ANSD using objective acoustical measures. Among the few available retrospective studies, the existence of late-onset ANSD is documented in case studies by only a few researchers (Berlin et al., 2010; De Siati et al., 2020; Jijo and Yathiraj, 2012; Kumar and Jayaram, 2006). Thus, the current study is the first of its kind to examine onset-based group differences in vocal characteristics using a prospective research design. The strength of the study is the precise control of variables at the start of the study. The participants were age- and gender-matched between the groups to reduce the effect of confounding variables. All the participants in the late-onset ANSD group passed language screening, which helped the experimenter confirm the appropriateness of their inclusion, as the presence of early ANSD (even of a milder degree) is known to adversely affect language outcomes (Rance et al., 2012). Control was also exercised over the recording of voice samples, with environmental noise checked beforehand using mobile applications. The check between the Android-based voice recording and the conventional voice recording using the Computerized Speech Lab (CSL) during the pilot is another strength of the study. The combination of these experimental controls further consolidates the results obtained, apart from providing the flexibility to conduct such studies during the COVID-19 pandemic. The results of MANOVA showed that the fundamental frequency of /i/ was higher in the early-onset group. This finding is in line with previous studies of acoustic features in long-standing hearing loss (Evans and Deliyski, 2007; Maruthy et al., 2019).
These results are attributed to poor laryngeal control, greater laryngeal muscular tension, or impaired internal auditory feedback. The fundamental frequency is the acoustic correlate of pitch which, when affected, impacts the social wellbeing of the individual and can be detected perceptually with voice quality rating scales. The results of MANOVA also showed that the second and third formants (F2 & F3) of the vowel /i/ were significantly higher in the late-onset ANSD group than in the early-onset group. This finding suggests that the higher formants are more sensitive in detecting ANSD onset-based differences in the production of sounds. It could be considered a secondary effect of group differences in the pathophysiology of the disorder. Pathophysiologically, patients with early-onset ANSD present a flat audiogram (equally impaired perception across all frequencies), whereas those with late-onset ANSD exhibit a rising hearing loss with less impaired high-frequency perception (Kumar and Jayaram, 2006). Impaired high-frequency perception in the early-onset ANSD group, which occurs at a relatively younger age, places them at a disadvantage in the perception of the F2 and F3 formants of /i/. This perception-related disadvantage can be postulated to transfer to production as well. The production-related deficits originating from the perceptual disadvantage can be explained by behaviorist learning theory (Watson, 1913), which holds that the learning of vocal sound productions occurs through environmental conditioning, feedback, and the strengthening of behavior through repeated actions. According to this theory, the feedback received on the perception of a sound is strengthened through repeated productions.
The altered/distorted feedback in individuals with ANSD (Maruthy et al., 2019) from childhood onwards (early-onset) can lead to a deficit in the precise relay of vocal production to the auditory cortex. Thus, high-frequency sound productions, though normal in the early stages, are reinforced through the long-term vicious loop of feedback and altered perception in the early-onset ANSD group, resulting in an altered recalibration of high-frequency perception. Deficits in the perception of high-frequency sounds in groups with early-onset hearing loss, including those with even mild to moderate sensorineural hearing loss, are documented in the literature (Evans and Deliyski, 2007). The relative lack of frequency shifts in late-onset ANSD (as opposed to the early-onset group) is indicative of the delayed onset in this group; an earlier onset would otherwise have affected their voice characteristics, especially the higher formant frequencies. Although the differences were not significant, the early-onset ANSD group showed more deviation in pitch perturbation (jitter), amplitude perturbation (shimmer), CPP and pitch sigma, which could also be the result of persistently poor auditory feedback (Maruthy et al., 2019). Complementary to this, the reduced harmonic-to-noise ratio in the early-onset group is indicative of less periodicity in their voice. The Fisher discriminant analysis (FDA) revealed that F2, F3, and CPP of the vowel /i/ were the best predictors of the group differences (Table 3). This adds diagnostic value to the lowered F2 and F3 and reduced CPP seen in the early-onset group compared to the late-onset group. The presence of such indicators should alert audiologists to reflect on the possible onset of the disorder, which in turn can inform their choice of rehabilitation.
While cochlear implants may be advisable in early-onset ANSD (Fei et al., 2011; Kontorinis et al., 2014), hearing aids (Barman et al., 2016; Jijo and Yathiraj, 2013) or assistive listening devices can be advocated as the first line of rehabilitation in the late-onset group. CPP emerged as an important metric in the FDA (Table 3) despite not being sensitive to the group differences on MANOVA (Table 2), while F0, though sensitive to group differences on MANOVA (Table 2), did not show high distinguishing power in the FDA. This finding may be due to the participants having a restricted range of F0 (range: 128.14–363.37 Hz; Fig. 1) compared to the formant frequencies (F1 range: 241.00–605.00 Hz; F2 range: 975.00–2979.00 Hz; F3 range: 2488.00–3655.00 Hz; Fig. 1). CPP, on the other hand, had a relatively wide range of values (range: 15.12–28.41 dB; Fig. 2) among the perturbation-related measures (Jitter range: 0.12–10.27%; Shimmer range: 0.65–0.95%; HNR range: 21.65–35.73 dB; and Pitch Sigma range: 1.13–11.76 Hz; Fig. 2). The FDA may have underestimated the importance of F0 because of the limited spread of these data.

Conclusions

The findings from the study highlight the role of acoustical voice evaluation in verifying the onset of ANSD. Based on discriminant analyses, the study points to key vocal indicators (CPP, F2 & F3) that can distinguish the two onset-based ANSD groups (early from late). The insights from the onset-based differences seen in vocal characteristics can help audiologists in deciding the management options for ANSD.

Conflicts of interest and source of funding

There is no conflict of interest to disclose. This research received no funding.

Author contributions

PL was involved in designing the work, data collection, data analysis, and drafting the article. PP was involved in designing of work, data analysis, critical reading of the article, and final approval of the version to be published. KVN was involved in designing the work, data collection, data analysis, drafting the article, critical reading, and final approval of the version of the article.

Data availability statement

The data that support the findings of the study are not publicly available, as they contain information that could compromise the privacy of research participants, but they are available from the corresponding author (KVN).

42 in total

1. Quantitative and descriptive comparison of four acoustic analysis systems: vowel measurements.

Authors: Carlyn Burris; Houri K Vorperian; Marios Fourakis; Ray D Kent; Daniel M Bolt
Journal: J Speech Lang Hear Res Date: 2014-02 Impact factor: 2.297

2. Acoustic voice analysis of prelingually deaf adults before and after cochlear implantation.

Authors: Maegan K Evans; Dimitar D Deliyski
Journal: J Voice Date: 2006-09-06 Impact factor: 2.009

3. Maximal Ambient Noise Levels and Type of Voice Material Required for Valid Use of Smartphones in Clinical Voice Research.

Authors: Jean Lebacq; Jean Schoentgen; Giovanna Cantarella; Franz Thomas Bruss; Claudia Manfredi; Philippe DeJonckere
Journal: J Voice Date: 2017-03-18 Impact factor: 2.009

4. Cochlear implantation in children with auditory neuropathy spectrum disorders.

Authors: Georgios Kontorinis; Simon K W Lloyd; Lise Henderson; Deanne Jayewardene-Aston; Kerri Milward; Iain A Bruce; Martin O'Driscoll; Kevin Green; Simon R M Freeman
Journal: Cochlear Implants Int Date: 2014-05

5. Auditory neuropathy.

Authors: A Starr; T W Picton; Y Sininger; L J Hood; C I Berlin
Journal: Brain Date: 1996-06 Impact factor: 13.501

6. Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening.

Authors: Virgilijus Uloza; Evaldas Padervinskis; Aurelija Vegiene; Ruta Pribuisiene; Viktoras Saferis; Evaldas Vaiciukynas; Adas Gelzinis; Antanas Verikas
Journal: Eur Arch Otorhinolaryngol Date: 2015-07-11 Impact factor: 2.503