
Reliability of Acoustic Measures in Dysphonic Patients With Glottic Insufficiency and Healthy Population: A COVID-19 Perspective.

Seung Jin Lee¹, Min Seok Kang², Young Min Park², Jae-Yol Lim³.

Abstract

OBJECTIVES: The COVID-19 pandemic has affected the voice assessment protocols for dysphonic patients. In this study, we compared the acoustic measures of a healthy population and of patients with dysphonia due to glottic insufficiency between the pandemic period, when face masks were required, and the prepandemic period, when masks were not essential. The clinical reliability of the acoustic measures with and without face masks was explored.
METHODS: A total of 120 patients (age = 42.3 ± 11.9 yrs) with glottic insufficiency, such as unilateral vocal fold palsy (UVFP) and sulcus vocalis, and 40 healthy individuals (age = 40.5 ± 11.2 yrs) were enrolled during the pandemic period. Age- and gender-matched cohorts of 120 patients and 40 healthy individuals who underwent voice assessment without face masks before the pandemic were enrolled as prepandemic controls. Acoustic measures and overall severity estimates of vowel and speech samples were compared, including cepstral peak prominence (CPP), L/H spectral ratio (SR), their standard deviations (σ), F0, jitter percent (Jitt), shimmer percent (Shim), noise-to-harmonic ratio (NHR), Cepstral Spectral Index of Dysphonia (CSID), and Acoustic Psychometric Severity Index of Dysphonia (APSID).
RESULTS: Both the patient and healthy cohorts showed higher SR_V and SR_S but lower CSID_V during the pandemic compared to the prepandemic period. F0 of the healthy male controls was higher during the pandemic than during the prepandemic period, while their CSID_S was lower during the pandemic period. The pandemic patient cohort showed lower σSR_S compared to the prepandemic patient cohort. When the acoustic measures of patients were compared to those of the healthy cohort, the patient cohort showed lower CPP and σCPP_S but higher σCPP_V, Jitt, Shim, and NHR during both the pandemic and prepandemic periods. Overall, the area under the curve (AUC) of the acoustic measures and overall severity estimates was similar between the mask and non-mask groups, although the AUC of the SR measures was poor.
CONCLUSIONS: Wearing face masks during the pandemic did not compromise the overall reliability of the acoustic analysis in patients with glottic insufficiency, suggesting the current protocol of acoustic analysis can be carried out reliably while wearing a mask to ensure safety in the pandemic era.

Keywords: Acoustic analysis; COVID-19; Cepstral analysis; Face mask; Personal protective equipment; Voice disorders

Year: 2022 PMID: 35835646 PMCID: PMC9273473 DOI: 10.1016/j.jvoice.2022.06.013

Source DB: PubMed Journal: J Voice ISSN: 0892-1997 Impact factor: 2.300

INTRODUCTION

Because coronavirus disease 2019 (COVID-19) is highly contagious, medical staff and patients with dysphonia have to wear personal protective equipment, including face masks, to prevent potential viral transmission in most voice clinics.1, 2, 3 Patients who visit the clinic multiple times and medical staff who treat patients repeatedly must wear a mask to avoid a cumulative risk of infection. Because voice assessment procedures, except for visual inspection of the larynx, are carried out while wearing face masks, the speech signals produced by the patients may be affected. Therefore, the clinical reliability of voice signals and acoustic measures obtained while wearing masks has received considerable attention. If the acoustic measures differed between the masked and unmasked conditions, it would be inappropriate to apply the protocols and interpretation used in prepandemic settings to voice assessments. In this context, recent studies have investigated the effect of wearing face masks on various aspects of speech production. Acoustically, masks function as a low-pass filter on acoustic signals, although discrepancies among different types of masks have been reported.6, 7, 8 It was also shown that each type of mask serves as a low-pass filter attenuating the high frequencies from 2 to 7 kHz. Another study demonstrated that the most substantial attenuation was above 4 kHz, although the cloth masks showed significant variation. The attenuation effect was also reported to be the greatest for the translucent mask with a transparent plastic window, followed by the FFP2 mask and the surgical mask. In that study, cepstral peak prominence (CPP) decreased, while the acoustic voice quality index increased for the masked condition, implying that the estimated severity of voice disorders increased when wearing masks.
However, most studies were simulations or were conducted with healthy populations, and few data exist regarding the effects of face masks on the outcomes of voice assessments for dysphonic patients.11, 12, 13, 14, 15, 16 Although most studies showed that face masks did not affect the acoustic correlates of vocal quality, it is still questionable whether speech samples produced while wearing masks are clinically helpful in patients with dysphonia. Typically, patients with glottic insufficiencies show breathy vocal quality and subsequent noise in the high frequencies, which results in decreased CPP.17, 18, 19, 20 Studies have shown that the CPP measures have excellent diagnostic utility for screening dysphonia compared to traditional acoustic parameters such as jitter percent (Jitt), shimmer percent (Shim), and noise-to-harmonic ratio (NHR). Moreover, there are estimates of overall severity derived from the spectral and cepstral measures. These indices include the Cepstral Spectral Index of Dysphonia (CSID) and the Acoustic Psychometric Severity Index of Dysphonia (APSID). As these indices are based on acoustic measures, it should also be tested whether their clinical usefulness is affected by wearing face masks. There are clinical cases of vocal pathologies in which pre- and post-treatment assessments are not performed under identical conditions in terms of face mask usage. Considering the fluctuation of the pandemic situation and the potential emergence of variants, guidelines for the use of face masks can vary across the voice lab sessions of a given patient undergoing long-term follow-up. Suppose, for instance, that a patient has undergone a preoperative voice lab assessment with a face mask and a postoperative assessment without a mask. In that case, it is difficult to accurately interpret subtle changes after surgery until it is determined which acoustic measurements change significantly due to wearing the mask.
In addition, the clinical usefulness of acoustic measures as a screening tool must be investigated using real-world clinical data from both healthy and patient cohorts. In this study, we aimed to investigate the changes in acoustic measures of voice quality in healthy populations as well as patients with glottic insufficiencies according to the use of face masks during the pandemic period and to compare them with prepandemic measures. Furthermore, the clinical reliability of the measurements and their usefulness for diagnosing dysphonia were explored for both the masked and unmasked conditions.

MATERIALS AND METHODS

Participants

This study was retrospective in nature and was approved by the Institutional Review Board of Gangnam Severance Hospital (3-2020-0497). A total of 120 patients with glottic insufficiency (45 males and 75 females, age = 42.3 ± 11.9 yrs) and 40 normal adults (15 males and 25 females, age = 40.5 ± 11.2 yrs) who underwent voice assessment wearing KF-80 or KF-94 masks (the number representing the filtration rate) were included. Both types are certified by the Korean Food and Drug Administration and were distributed to the public by the Korean government. During the pandemic period, all participants were asked to wear the masks firmly. Age- (within 3 yrs), gender-, and diagnosis-matched cohorts of 120 patients (45 males and 75 females, age = 41.7 ± 11.7 yrs) and 40 normal adults (15 males and 25 females, age = 39.2 ± 10.4 yrs) who underwent voice assessment without face masks before the pandemic were also enrolled. The diagnosis for each patient was made by a laryngologist. Diagnoses of the patient group in each condition included unilateral vocal fold palsy (38.1%), sulcus vocalis (18.8%), and unilateral vocal fold paresis (18.1%). Patients with a history of surgical or behavioral intervention for dysphonia or with incomplete assessments were excluded from the study. For each group, there was no difference in the grade, roughness, breathiness, asthenia, and strain (GRBAS) scores between the masked and unmasked conditions (P > 0.05).

Voice outcome measurement

Voice recordings were made using the Computerized Speech Lab (Model 4150B; KayPENTAX, Lincoln Park, NJ; CSL). A dynamic microphone (SM48, SHURE, Niles, IL) was stably positioned 10 cm away from the patients’ lips (prepandemic condition) or face mask (pandemic condition) during the recording sessions. Participants were asked to produce 4-second-long vowel samples and Korean passage-reading speech samples. The Korean standard passage ‘Ga-eul’ (Autumn) consists of 194 words and 368 syllables. In Korean clinics and research studies, it is widely used for auditory-perceptual evaluation and cepstral analysis. The second sentence was trimmed from each passage-reading sample. Participants were also asked to complete the Korean version of the Voice Activity and Participation Profile (K-VAPP) questionnaire. For acoustic analysis, both the vowel and speech samples (the second sentence of each passage-reading sample) were used. Using the Multi-Dimensional Voice Program (Model 5105, KayPENTAX, Lincoln Park, NJ, U.S.A.; MDVP), traditional acoustic measures such as Jitt, Shim, and NHR were measured. In addition to these traditional acoustic measures, CPP, SR, and the standard deviations (σ) of CPP and SR for the vowel and sentence samples were measured using the Analysis of Dysphonia in Speech and Voice (Model 5109, KayPENTAX, Lincoln Park, NJ, U.S.A.; ADSV) program. Estimates of the overall severity were also calculated using the cepstral measures. Using the ADSV program, the CSID was calculated for the vowel and sentence samples. The APSID was also calculated using the CPP measures and the self-perceived severity (Severity) score of the K-VAPP.
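
As a rough illustration of what the perturbation measures named above quantify, the following sketch computes a jitter-like and a shimmer-like percentage from cycle-by-cycle values. It assumes the common "mean absolute cycle-to-cycle difference relative to the mean" definition; it does not reproduce MDVP's pitch extraction or its exact algorithm, and the function name and sample values are hypothetical.

```python
from statistics import mean


def perturbation_percent(values):
    """Mean absolute difference between consecutive cycles,
    expressed as a percentage of the mean value (assumed definition)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical cycle-by-cycle data (not taken from the study).
periods_ms = [5.0, 5.1, 4.9, 5.0]            # glottal cycle durations
peak_amplitudes = [0.80, 0.78, 0.82, 0.80]   # peak amplitude per cycle

jitter_percent = perturbation_percent(periods_ms)
shimmer_percent = perturbation_percent(peak_amplitudes)
print(f"jitter-like: {jitter_percent:.2f}%, shimmer-like: {shimmer_percent:.2f}%")
```

Both measures depend on reliable cycle detection, which is why strongly aperiodic (breathy) voices are difficult to analyze this way.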

Statistical analyses

A two-way analysis of variance (groups × conditions) and independent t tests were performed for each acoustic parameter. Receiver operating characteristic (ROC) curve analyses were performed for each condition, and the areas under the curve (AUC) of the independent ROC curves were compared between the two conditions. The significance level was set at 0.05. Statistical analyses were performed using IBM SPSS statistics software for Windows, version 25.0 (IBM Corp., Armonk, NY) and the MedCalc® Statistical Software 20.009 (MedCalc Software Ltd, Ostend, Belgium).
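
The comparison of AUCs from two independent cohorts can be sketched as a z test on the difference of the two areas. The sketch below is an assumption-laden illustration, not the software's procedure: the text does not state which standard-error estimator was used, so the Hanley and McNeil approximation is swapped in here, and all function names and data are hypothetical.

```python
import math


def auc_mann_whitney(pos_scores, neg_scores):
    """AUC as the probability that a positive case scores higher than
    a negative case (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil formula;
    an assumed choice, other estimators such as DeLong's exist)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc * auc)
           + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    return math.sqrt(var)


def compare_independent_aucs(auc_a, se_a, auc_b, se_b):
    """Two-sided z test for the difference between two independent AUCs."""
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return z, p


# Hypothetical scores where higher means more abnormal (not study data).
auc_masked = auc_mann_whitney([3, 4, 5, 6], [1, 2, 3, 4])
auc_unmasked = auc_mann_whitney([2, 4, 5, 6], [1, 2, 3, 5])
z, p = compare_independent_aucs(
    auc_masked, hanley_mcneil_se(auc_masked, 4, 4),
    auc_unmasked, hanley_mcneil_se(auc_unmasked, 4, 4),
)
print(f"z = {z:.3f}, p = {p:.3f}")
```

For measures where lower values indicate dysphonia (such as CPP), the scores would be negated before computing the AUC. Because the standard-error estimator is an assumption, z and P values from this sketch need not match those reported by dedicated statistical software.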

RESULTS

Descriptive data and comparisons of the acoustic measures of patients with glottic insufficiency and the healthy population between the prepandemic and pandemic cohorts are presented in Table 1. Both the patient and healthy cohorts showed higher SR_V (P < 0.001) and SR_S (P < 0.001) but lower CSID_V (P = 0.028 and 0.003 for patients and controls, respectively) during the pandemic compared to the prepandemic period. F0 of the healthy male controls was higher during the pandemic than during the prepandemic period (P = 0.031), while their CSID_S was lower for the pandemic period (P = 0.008). The pandemic patient cohort showed lower σSR_S compared to the prepandemic patient cohort (P = 0.039).

TABLE 1

Comparison of the Acoustic Measures and Overall Severity Estimates Between the Pandemic and Prepandemic Cohorts

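Parameters	Sex	Patients (N = 240), pandemic (N = 120)	Patients (N = 240), prepandemic (N = 120)	Healthy (N = 80), pandemic (N = 40)	Healthy (N = 80), prepandemic (N = 40)	P value, pandemic vs. prepandemic: patients	P value, pandemic vs. prepandemic: healthy	P value, pandemic vs. prepandemic: total	P value, between groups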
CPP_V		8.969 ± 3.560	8.449 ± 3.507	12.471 ± 2.129	12.362 ± 2.088	0.256	0.818	0.453	<0.001‡
CPP_S		4.942 ± 1.743	4.889 ± 1.593	7.026 ± 0.843	6.600 ± 1.147	0.806	0.062	0.227	<0.001‡
σCPP_V		1.228 ± 0.860	1.207 ± 0.831	0.711 ± 0.350	0.798 ± 0.551	0.853	0.403	0.743	<0.001‡
σCPP_S		3.122 ± 0.760	3.097 ± 0.824	3.684 ± 0.419	3.783 ± 0.451	0.803	0.312	0.693	<0.001‡
SR_V		35.521 ± 5.555	31.404 ± 6.281	38.812 ± 5.042	32.920 ± 3.685	<0.001‡	<0.001‡	<0.001‡	0.003†
SR_S		34.250 ± 3.569	31.002 ± 4.079	35.994 ± 3.035	31.473 ± 2.951	<0.001‡	<0.001‡	<0.001‡	0.019*
σSR_V		1.784 ± 0.825	1.690 ± 0.561	1.646 ± 0.552	1.644 ± 0.613	0.302	0.987	0.582	0.296
σSR_S		9.421 ± 1.112	9.716 ± 1.094	9.197 ± 1.287	9.468 ± 0.767	0.039*	0.256	0.045*	0.095
F0	M	128.275 ± 28.609	132.374 ± 35.423	122.816 ± 25.697	106.420 ± 11.017	0.547	0.031*	0.328	0.013*
F0	F	199.917 ± 38.795	200.350 ± 40.593	190.500 ± 18.302	198.188 ± 16.639	0.947	0.127	0.485	0.320
Jitt		2.515 ± 2.460	3.066 ± 3.002	0.884 ± 1.113	0.786 ± 0.515	0.121	0.616	0.468	<0.001‡
Shim		6.396 ± 5.526	6.529 ± 4.514	3.055 ± 1.322	3.244 ± 0.940	0.838	0.462	0.777	<0.001‡
NHR		0.169 ± 0.922	0.185 ± 0.126	0.132 ± 0.033	0.125 ± 0.020	0.247	0.219	0.721	<0.001‡
Grade		2.113 ± 0.669	2.162 ± 0.605	0.412 ± 0.192	0.388 ± 0.212	0.544	0.582	0.863	<0.001‡
Severity		5.817 ± 3.272	6.350 ± 2.907	0.225 ± 0.480	0.300 ± 0.687	0.183	0.573	0.384	<0.001‡
CSID_V		16.328 ± 20.332	22.189 ± 20.828	-4.419 ± 9.992	2.823 ± 11.386	0.028*	0.003†	0.007†	<0.001‡
CSID_S		10.615 ± 21.012	14.556 ± 21.887	-9.570 ± 9.638	-3.073 ± 11.590	0.156	0.008†	0.038*	<0.001‡
APSID		47.327 ± 25.292	49.663 ± 22.590	10.177 ± 9.605	14.308 ± 13.515	0.451	0.119	0.247	<0.001‡

Values are presented in mean ± standard deviation.

* P < 0.05.

† P < 0.01.

‡ P < 0.001.

Abbreviations: APSID, acoustic psychometric severity index of dysphonia; CPP, cepstral peak prominence; CSID, cepstral spectral index of dysphonia; F0, fundamental frequency; F, female; Grade, Grade of the GRBAS scale; Jitt, jitter percent; M, male; NHR, noise-to-harmonic ratio; S, sentence production; Shim, shimmer percent; Severity, Self-perceived severity score of the Korean version of the voice activity and participation profile; SR, L/H spectral ratio; V, vowel production; σ, standard deviation.

A two-way ANOVA showed that SR_V (P < 0.001) and SR_S (P < 0.001) were higher, while σSR_S was lower (P = 0.045), for the pandemic cohort compared to the prepandemic cohort. Except for the SR and σSR, the other acoustic measures did not differ between the pandemic and prepandemic cohorts. Among the overall severity estimates, CSID_V (P = 0.007) and CSID_S (P = 0.038) were lower for the pandemic cohort compared to the prepandemic cohort, while the APSID did not differ between the pandemic and prepandemic cohorts. Between groups, the patient group showed lower CPP_V (P < 0.001), CPP_S (P < 0.001), σCPP_S (P < 0.001), SR_V (P = 0.003), and SR_S (P = 0.019) but higher σCPP_V (P < 0.001), Jitt (P < 0.001), Shim (P < 0.001), and NHR (P < 0.001) than the normal group, irrespective of the pandemic or prepandemic period. As for males, F0 of the patient group was higher than that of the normal group (P = 0.013). The patient group showed higher CSID_V, CSID_S, and APSID than the normal group (P < 0.001). There was no interaction effect for any of the parameters above. Results of the ROC curve analysis for the acoustic measures and overall severity estimates are presented in Table 2 and Figures 1 to 3.
Among the acoustic parameters, the AUCs of the CPP (ranging from .797 to .868) and σCPP (ranging from .650 to .769) were high in both pandemic and prepandemic cohorts. There was no significant difference in the AUCs between the two cohorts. On the other hand, the AUCs of SR (ranging from .521 to .668) and σSR (ranging from .529 to .576) were poor in both cohorts. For SR_V, the AUC of the pandemic cohort was significantly larger than that of the prepandemic cohort (P = 0.048). The AUCs of the traditional acoustic measures and overall severity estimates were not different between the cohorts (P > 0.05).

TABLE 2

Comparison of the ROC Curve Analysis of the Acoustic Measures Between the Pandemic and Prepandemic Cohorts

Parameters	Conditions	AUC	95% CI	Youden index J	Criterion	Sensitivity (%)	Specificity (%)	Z	P value
CPP_V	Pandemic	.797	.727−.857	.467	10.318	59.17	87.50	0.715	0.474
CPP_V	Prepandemic	.833	.766−.887	.558	9.718	65.83	90.00	0.715	0.474
CPP_S	Pandemic	.868	.805−.916	.608	5.946	68.33	92.50	0.925	0.355
CPP_S	Prepandemic	.824	.756−.880	.558	5.834	70.83	85.00	0.925	0.355
σCPP_V	Pandemic	.700	.622−.769	.342	0.982	46.67	87.50	0.784	0.433
σCPP_V	Prepandemic	.650	.571−.724	.358	0.901	50.83	85.00	0.784	0.433
σCPP_S	Pandemic	.731	.655−.798	.383	3.353	58.33	80.00	0.667	0.505
σCPP_S	Prepandemic	.769	.696−.832	.433	3.614	68.33	75.00	0.667	0.505
SR_V	Pandemic	.668	.589−.740	.308	35.516	53.33	77.50	0.197	0.048*
SR_V	Prepandemic	.537	.456−.616	.217	30.265	41.67	80.00	0.197	0.048*
SR_S	Pandemic	.644	.565−.718	.258	36.273	70.83	55.00	1.730	0.083
SR_S	Prepandemic	.521	.440−.600	.142	28.628	24.17	90.00	1.730	0.083
σSR_V	Pandemic	.532	.452−.612	.108	1.431	60.83	50.00	0.049	0.961
σSR_V	Prepandemic	.529	.448−.608	.125	1.184	90.00	22.50	0.049	0.961
σSR_S	Pandemic	.542	.462−.621	.133	7.686	95.83	17.50	0.455	0.649
σSR_S	Prepandemic	.576	.496−.654	.208	9.397	60.83	60.00	0.455	0.649
Jitt	Pandemic	.795	.724−.854	.500	1.072	65.00	85.00	1.558	0.119
Jitt	Prepandemic	.871	.809−.919	.575	0.833	87.50	70.00	1.558	0.119
Shim	Pandemic	.794	.722−.853	.542	3.743	66.67	87.50	0.467	0.641
Shim	Prepandemic	.817	.748−.873	.558	4.345	68.33	87.50	0.467	0.641
NHR	Pandemic	.634	.554−.709	.300	0.151	40.00	90.00	1.809	0.071
NHR	Prepandemic	.742	.667−.808	.483	0.139	63.33	85.00	1.809	0.071
CSID_V	Pandemic	.827	.760−.882	.592	1.974	87.17	85.00	0.372	0.710
CSID_V	Prepandemic	.808	.739−.866	.542	13.133	64.17	90.00	0.372	0.710
CSID_S	Pandemic	.801	.731−.860	.525	1.525	60.00	92.50	0.452	0.652
CSID_S	Prepandemic	.777	.705−.839	.492	-0.205	76.67	72.50	0.452	0.652
APSID	Pandemic	.935	.885−.968	.742	19.295	84.17	90.00	0.237	0.813
APSID	Prepandemic	.928	.876−.963	.758	28.245	83.33	92.50	0.237	0.813

* P < 0.05.

Abbreviations: APSID, acoustic psychometric severity index of dysphonia; CPP, cepstral peak prominence; CSID, cepstral spectral index of dysphonia; F0, fundamental frequency; F, female; Grade, Grade of the GRBAS scale; Jitt, jitter percent; M, male; NHR, noise-to-harmonic ratio; S, sentence production; SR, L/H spectral ratio; Shim, shimmer percent; Severity, Self-perceived severity score of the Korean version of the voice activity and participation profile; V, vowel production; σ, standard deviation.
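
The Youden index J in Table 2 equals sensitivity + specificity − 1; for instance, the CPP_V pandemic row gives 0.5917 + 0.8750 − 1 ≈ 0.467. The following is a minimal sketch of how a criterion maximizing J could be found by sweeping candidate cutoffs. The function name, the tie-breaking rule (first maximum wins), and the sample values are hypothetical and are not taken from the study or from any particular software.

```python
def youden_best_cutoff(patient_scores, healthy_scores, lower_is_abnormal=True):
    """Return (J, cutoff, sensitivity, specificity) for the cutoff that
    maximizes J = sensitivity + specificity - 1."""
    best = None
    for cutoff in sorted(set(patient_scores) | set(healthy_scores)):
        if lower_is_abnormal:
            # e.g. CPP: values at or below the cutoff are flagged as dysphonic
            true_pos = sum(s <= cutoff for s in patient_scores)
            true_neg = sum(s > cutoff for s in healthy_scores)
        else:
            # e.g. jitter: values at or above the cutoff are flagged
            true_pos = sum(s >= cutoff for s in patient_scores)
            true_neg = sum(s < cutoff for s in healthy_scores)
        sens = true_pos / len(patient_scores)
        spec = true_neg / len(healthy_scores)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best


# Hypothetical CPP-like values in dB (not study data).
patients = [6.1, 7.4, 8.0, 9.2, 10.5, 11.0]
healthy = [9.8, 11.5, 12.0, 12.6, 13.1, 14.0]
j, cutoff, sens, spec = youden_best_cutoff(patients, healthy)
print(f"J = {j:.3f} at cutoff {cutoff} (sens {sens:.2f}, spec {spec:.2f})")
```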

FIGURE 1

Comparison of the ROC curves of the cepstral measures between the pandemic and prepandemic cohorts. CPP, cepstral peak prominence; S, sentence production; V, vowel production; σ, standard deviation.

FIGURE 2

Comparison of the ROC curves of the traditional acoustic measures and overall severity estimates between the pandemic and prepandemic cohorts. Jitt, jitter percent; NHR, noise-to-harmonic ratio; Shim, shimmer percent.

FIGURE 3

Comparison of the ROC curves of the overall severity estimates between the pandemic and prepandemic cohorts. APSID, acoustic psychometric severity index of dysphonia; CSID, cepstral spectral index of dysphonia; S, sentence production; V, vowel production.

The correlation matrix between the acoustic measures and the auditory-perceptual estimations in the pandemic and prepandemic cohorts is presented in Figure 4. For both the pandemic and prepandemic cohorts, CPP of the vowel and sentence samples showed moderate to high negative correlation with CSID, APSID, and Grade, ranging from -0.63 to -0.93. On the other hand, traditional acoustic measures showed low to high positive correlation with the overall severity estimates, ranging from 0.42 to 0.78.

FIGURE 4

Correlation matrix between the acoustic measures and the auditory-perceptual estimations of the pandemic versus prepandemic cohorts. APSID, acoustic psychometric severity index of dysphonia; CPP, cepstral peak prominence; CSID, cepstral spectral index of dysphonia; Grade, grade of the GRBAS scale; Jitt, jitter percent; NHR, noise-to-harmonic ratio; S, sentence production; Shim, shimmer percent; V, vowel production.

DISCUSSION

Patients need to wear face masks to prevent potential respiratory particle emission and subsequent viral transmission during voice assessment and therapy sessions in voice clinics. Face masks are known to degrade speech perception, discrimination, and intelligibility in environmental noise, inducing a substantial increase in self-perception of vocal changes, vocal effort, and communication stress.28, 29, 30, 31 Moreover, they increased perceived vocal symptoms and difficulties in coordinating speech and breathing during speech production, especially for professional voice users. The effect of wearing masks on speech production and voice parameters needs to be investigated in order to interpret voice outcomes measured in different conditions. In this study, the majority of the voice parameters did not differ between the masked and unmasked conditions, although some spectral ratio measures differed between the conditions. Moreover, the current results showed that the clinical usefulness of the measures for detecting glottic insufficiency was not compromised by wearing face masks. As traditional acoustic measurement is based on the periodicity of voice samples, chaotic voice samples of breathy vocal quality, especially severe ones, are not easy to analyze acoustically with reliability. Instead, cepstral measures are recommended for acoustic analysis of vowel prolongation and continuous speech samples. In previous studies of healthy populations or simulated situations, acoustic correlates of vocal quality, including the CPP, maximum phonation time, F0, Jitt, Shim, and harmonic-to-noise ratio (HNR), were not statistically different between the masked and unmasked conditions.6, 7, 8, 11, 12, 13 Although one study showed that Jitt and Shim increased when wearing the mask, CPP remained unchanged, which is in accordance with the current results.
Furthermore, the cutoff scores of the CPP measures were similar to those of a previous Korean study (9.9995 dB for sustained vowel, 7.668 dB for running speech), which was performed during the prepandemic period. Together with the results of the ROC curve analysis, these results suggest that the clinical usefulness and reliability of acoustic measures were not compromised by wearing the mask in the pandemic cohort, at least for patients with glottic insufficiencies. It is difficult to link research results from laboratory settings to changes in the assessment procedures or interpretation of voice parameters in clinics because the various acoustic correlates of vocal quality are based on different frequency ranges. For example, the L/H spectral ratio is the ratio of the spectral energy of lower to higher frequencies relative to a 4 kHz cutoff, although the reference frequency can vary. Similarly, the harmonic-to-noise ratio can be measured based on different frequencies, for example, 500 Hz, 1,500 Hz, and 2,500 Hz. On the other hand, when calculating the CPP using the default setting of the ADSV, the cepstral peak is located by scanning the data in the cepstral array corresponding to the frequency range of 60 to 300 Hz. H1-H2 and H1-A1 are acoustic correlates of breathy vocal quality, which are based on the spectral tilt between harmonics and formant frequencies below 1 kHz. Thus, the frequency range used to calculate each parameter should be carefully considered when predicting potential influences of the low-pass filtering effect of face masks. Significant main effects of conditions were observed for the SR measures. A higher SR for the pandemic condition may indicate that the estimated severity of voice disorders appeared somewhat lower when wearing a face mask because of the decreased spectral energy at high frequencies.
This is in accordance with previous studies that reported an attenuation effect of face masks on the spectral energy of the high frequencies and an increase in the low-to-high spectral ratio. A study reported that the most substantial attenuation was above 4 kHz, and the L/H ratio is calculated based on the spectral energy of the lower frequencies compared to that of the higher frequencies over 4 kHz. The CSID is calculated with the CPP and L/H ratio measures derived from the vowel and sentence samples. On the other hand, the APSID is based on the CPP and the Severity score reported by the patients using the K-VAPP questionnaire. Consequently, the CSID of the vowel and sentence production differed between the conditions, because the CSID is partly based on the SR measures, which differed between the conditions. More precisely, the lower CSID in the pandemic condition implies that the high-frequency noise induced by the glottic insufficiency was filtered by the face masks, which lowered the severity of voice disorders estimated by the CSID. These results indicate that one should be cautious in clinics and research when directly comparing the L/H ratio and CSID measures obtained in the masked condition to those obtained in the unmasked condition. Specifically, severity may be underestimated in the masked condition, and this should be taken into account. Under identical conditions in terms of the use of face masks, we can still rely on repeatedly measured CSID values, because the AUCs did not differ between the masked and unmasked conditions. Although the L/H ratio increased when wearing the face masks, the significance of the gap between the conditions should not be overestimated, because the AUC of the SR and σSR was poor in both conditions (<.70). This is in accordance with the Korean studies pertaining to the clinical usefulness of the cepstral and spectral measures in detecting dysphonia.
Rather, the AUCs of the other acoustic measures including the CPP, Jitt, Shim, NHR, and overall severity estimates were similar between the two conditions. These results imply that the major acoustic parameters, including the cepstral and traditional parameters, are useful, reliable, and comparable in both the masked and unmasked conditions. Overall, the APSID was higher compared to the CSID_V and CSID_S of the identical participants. This means that the overall severity estimated by the APSID was higher than that estimated by the CSID. This is not surprising because the APSID reflects the self-perceived severity of the patients themselves. Assuming that the severity of voice disorder perceived by the patient is the highest, 20 points out of 100 will be added compared to the case where the voice is perceived as normal. A similar trend was observed in the original study of the APSID development. The AUC of the APSID was higher than that of both CSID_V and CSID_S, and the AUC of the CSID in the pandemic condition was slightly higher than that in the prepandemic condition. This is consistent with the slightly elevated AUC of the SR during the pandemic period. This also suggests that the APSID, which reflects a patient-reported outcome measure, may be more useful than the SR and CSID, for which the acoustic energy is affected by wearing the mask. As the Severity score is obtained using only one item, it can be easily used for remote voice therapy and continuous voice evaluation with or without a mask in the era of COVID-19 and its variants.
This study has several limitations due to the retrospective study design. First, direct comparison of the identical participants between the conditions was not possible because of the potential ethical problem. Aerodynamic measures such as subglottal pressure and mean air flow rate were not included in the study, because the aerodynamic assessment was not feasible while wearing face masks. Second, the sample size of the normal group was relatively small compared to the patient group. Patients with various voice disorders other than glottic insufficiency should also be further investigated. Third, the current data do not reflect the effect of face masks on the perception of voice quality in patients with dysphonia. Lastly, face-to-mask gaps in daily communication could be another variable that affects the acoustic measures.
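
The argument in the Discussion that a low-pass filter raises the L/H spectral ratio can be illustrated with a toy calculation. The sketch below assumes the ratio is expressed in dB as 10·log10 of low-band over high-band spectral energy with a 4 kHz cutoff; it uses a naive DFT on a synthetic two-tone signal and does not reproduce the ADSV implementation. The function name, the signal, and the halving of the high-frequency amplitude are all hypothetical.

```python
import cmath
import math


def lh_ratio_db(signal, sample_rate, cutoff_hz=4000.0):
    """Low-to-high spectral energy ratio in dB (assumed definition),
    using a naive DFT over the positive-frequency bins."""
    n = len(signal)
    low = high = 0.0
    for k in range(1, n // 2):  # skip DC, stop below the Nyquist bin
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        energy = abs(coeff) ** 2
        if k * sample_rate / n < cutoff_hz:
            low += energy
        else:
            high += energy
    return 10.0 * math.log10(low / high)


def two_tone(high_amplitude, sample_rate=16000, n=512):
    """Synthetic signal: a 250 Hz tone plus a weaker 6 kHz component."""
    return [math.sin(2 * math.pi * 250 * i / sample_rate)
            + high_amplitude * math.sin(2 * math.pi * 6000 * i / sample_rate)
            for i in range(n)]


unmasked = lh_ratio_db(two_tone(0.10), 16000)
# Hypothetical mask effect: high-frequency amplitude halved.
masked = lh_ratio_db(two_tone(0.05), 16000)
print(f"unmasked: {unmasked:.2f} dB, masked: {masked:.2f} dB")
```

In this toy case the ratio rises when the high-frequency component is attenuated, which is the direction of change reported for SR in the masked cohort; the magnitude here is arbitrary and carries no clinical meaning.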

CONCLUSION

In conclusion, most acoustic correlates of vocal quality remained largely unaffected by wearing face masks, although some spectral ratio measures suggested decreased severity. The clinical reliability of the acoustic analysis in patients with glottic insufficiency was not compromised by wearing the masks, especially for the CPP measures. The current results also indicated that the current protocol of acoustic analysis could be carried out while wearing a face mask to ensure safety in the pandemic era and under fluctuating conditions. Direct comparison between the acoustic measurements before and during the pandemic is possible, although caution should be exercised with the overall severity estimates derived from the spectral ratio measures.

Author contributions

Seung Jin Lee: study design, acquisition of data, analysis and interpretation of data, manuscript writing; Min Seok Kang: acquisition of data, manuscript writing; Young Min Park: study design, manuscript writing; Jae-Yol Lim: study design, analysis and interpretation of data, manuscript writing. All authors reviewed and approved the manuscript.

Declaration of Competing Interest

There are no potential conflicts of interest to disclose.

28 in total

Review 1. Chaos in voice, from modeling to measurement.

Authors: Jack J Jiang; Yu Zhang; Clancy McGilligan
Journal: J Voice Date: 2005-06-20 Impact factor: 2.009

2. Voice Differences When Wearing and Not Wearing a Surgical Mask.

Authors: Maria Luisa Fiorella; Giada Cavallaro; Vincenzo Di Nicola; Nicola Quaranta
Journal: J Voice Date: 2021-03-09 Impact factor: 2.009

3. Acoustic voice characteristics with and without wearing a facemask.

Authors: Duy Duong Nguyen; Patricia McCabe; Donna Thomas; Alison Purcell; Maree Doble; Daniel Novakovic; Antonia Chacon; Catherine Madill
Journal: Sci Rep Date: 2021-03-11 Impact factor: 4.379

4. Respiratory Particle Emission During Voice Assessment and Therapy Tasks in a Single Subject.

Authors: Lauren Timmons Sund; Neel K Bhatt; Elisabeth H Ference; Wihan Kim; Michael M Johns
Journal: J Voice Date: 2020-10-22 Impact factor: 2.009

Review 5. Voice Therapy in the Context of the COVID-19 Pandemic: Guidelines for Clinical Practice.

Authors: Adrián Castillo-Allendes; Francisco Contreras-Ruston; Lady Catherine Cantor-Cutiva; Juliana Codino; Marco Guzman; Celina Malebran; Carlos Manzano; Axel Pavez; Thays Vaiano; Fabiana Wilder; Mara Behlau
Journal: J Voice Date: 2020-08-07 Impact factor: 2.009

6. Are Acoustic Markers of Voice and Speech Signals Affected by Nose-and-Mouth-Covering Respiratory Protective Masks?

Authors: Youri Maryn; Floris L Wuyts; Andrzej Zarowski
Journal: J Voice Date: 2021-02-16 Impact factor: 2.009

7. Effects of face masks on acoustic analysis and speech perception: Implications for peri-pandemic protocols.

Authors: Michelle Magee; Courtney Lewis; Gustavo Noffs; Hannah Reece; Jess C S Chan; Charissa J Zaga; Camille Paynter; Olga Birchall; Sandra Rojas Azocar; Angela Ediriweera; Katherine Kenyon; Marja W Caverlé; Benjamin G Schultz; Adam P Vogel
Journal: J Acoust Soc Am Date: 2020-12 Impact factor: 1.840

8. Assessment of Use and Fit of Face Masks Among Individuals in Public During the COVID-19 Pandemic in China.

Authors: Xiangbin Pan; Xi Li; Pengxu Kong; Lin Wang; Rundi Deng; Bin Wen; Luoxi Xiao; Honglin Song; Yi Sun; Hongmei Zhou; Jiang Lu; Yang Wang; Qiuzhe Guo; Lin Duo; Chengye Sun
Journal: JAMA Netw Open Date: 2021-03-01

9. COVID-19: Acoustic Measures of Voice in Individuals Wearing Different Facemasks.

Authors: Ashwini Joshi; Teresa Procter; Paulina A Kulesz
Journal: J Voice Date: 2021-06-19 Impact factor: 2.009