
Effect of Compression Ratio on Perception of Time Compressed Phonemically Balanced Words in Kannada and Monosyllables.

Prashanth Prabhu¹, Mirale Jagadish Sujan¹, Satish Rakshith¹.

Abstract

The present study examined the perception of time-compressed speech and the effect of compression ratio for phonemically balanced (PB) word lists in Kannada and for monosyllables. The test was administered to 30 individuals with normal hearing at compression ratios of 40%, 50%, 60%, 70% and 80% for both stimulus types. Speech identification scores for time-compressed speech decreased as the compression ratio increased. Scores were better for monosyllables than for PB words, especially at higher compression ratios. The study provides speech identification scores at different compression ratios for PB words and monosyllables in individuals with normal hearing. Scores did not vary with gender at any compression ratio for either stimulus type. The same test material needs to be administered to the clinical population with central auditory processing disorder for clinical validation of the present results.


Keywords: auditory closure; compression ratio; monosyllables; phonemically balanced words; time compressed speech

Year: 2015 PMID: 26557363 PMCID: PMC4627116 DOI: 10.4081/audiores.2015.128

Source DB: PubMed Journal: Audiol Res ISSN: 2039-4330

Introduction

Auditory processing involves several auditory processes, namely sound localization and lateralization; auditory discrimination; auditory pattern recognition; temporal aspects of audition; auditory performance in competing acoustic signals; and auditory performance with degraded acoustic signals.[1-3] An abnormality in one or more of these processes is referred to as (central) auditory processing disorder [(C)APD], provided the difficulties are not due to problems in language, cognition or other factors.[1] These auditory processes are usually assessed using a battery of tests. Baran[4] classified the test battery for (C)APD into five categories, namely dichotic tests, auditory temporal processing and patterning tests, binaural interaction tests, monaural low-redundancy tests, and electrophysiologic tests. Auditory closure abilities are usually assessed using monaural low-redundancy tests. One of the common monaural low-redundancy tests is the time-compressed speech test.[5] Time-compressed speech is also used to assess temporal processing.[6] Time compression reduces the extrinsic redundancy of speech materials because the rate of speech is altered without distorting the intensity or frequency of speech.[7] Listeners with normal intrinsic redundancy use their auditory closure skills to fill in the missing information in time-compressed speech.
However, individuals with (C)APD, who have reduced intrinsic redundancy, perform poorly, suggesting auditory closure deficits.[7] The extent of auditory closure deficits can be determined by varying the degree of degradation, that is, the loss of extrinsic redundancy, through changes in the rate of time compression.[3] Time-compressed speech material is typically used to assess auditory processing disorders or individuals with neurological deficits.[8] Time-compressed speech is also useful for understanding auditory processing in aging populations.[9-11] It is also reported that training with time-compressed speech enhances the ability to recognize speech even with compression and degradation of the speech signal.[12] Considering these applications, it is essential to develop time-compressed speech materials in all languages for further exploration of these areas. Only a few studies report time-compressed speech in Indian languages.[13-15] A time-compressed speech test was developed in Kannada for children by Kumar.[13] However, no studies have been reported on time-compressed speech in adults with normal hearing in the Kannada language. In addition, no Indian studies have assessed the perception of time-compressed monosyllables. There is also a need to develop normative scores for non-sense monosyllables, as they are not language specific. Non-language-specific normative scores would help to evaluate a larger clinical population speaking languages in which time-compressed speech tests have not yet been developed. Thus, the aim of the present study was to determine speech identification scores (SIS) for time-compressed speech using phonemically balanced (PB) words in Kannada and non-sense monosyllables.
It is well established that speech perception scores decrease as the degree of time compression (compression ratio) increases.[16,17] Scores are reported to reduce gradually from 0% to 60% time compression, with more difficulty experienced at 70% and above.[18] There are reports which suggest that speech perception varies with the type of stimuli used for different degrees of time compression.[5] Thus, in the present study, we examined whether speech identification scores vary between phonemically balanced words and monosyllables at different compression ratios. The study would help in determining the appropriate compression ratio to be used in the clinical population. In addition, the results would be useful for understanding the variation in perception of time-compressed speech across stimuli. Thus, the aim of the study was to determine SIS for time-compressed speech at different compression ratios for PB words in Kannada and monosyllables in normal hearing individuals. The main objectives were to determine SIS for time-compressed PB words in Kannada and monosyllables at compression ratios of 40%, 50%, 60%, 70% and 80%, and thus to establish whether there is an optimum compression ratio to be used for the clinical population. In addition, the study compared the SIS for PB words in Kannada and monosyllables across compression ratios and examined whether gender affects SIS for time-compressed PB words and monosyllables.

Materials and Methods

Participants

Thirty individuals (15 males and 15 females) aged 17-30 years (mean age: 19.2 years) participated in the study. All the participants had pure tone thresholds between 0 and 15 dB HL from 250 Hz to 8000 Hz in each ear. All the participants had normal middle ear function. None of the participants reported a history of ototoxic drug use, long- or short-term exposure to high-level noise, or otological/neurological disease. All the participants obtained normal speech perception in noise (SPIN) scores. Informed consent was obtained from all the participants. All tests were carried out in sound-treated audiometric rooms with noise levels within the permissible limits of ANSI S3.1-1999 (R 2013; SAI Global Ltd., Sydney, Australia).

Procedure

The air conduction (AC) and bone conduction (BC) thresholds were estimated using the modified Hughson-Westlake procedure.[19] AC thresholds were obtained for pure tones from 250 Hz to 8 kHz and BC thresholds from 250 Hz to 4 kHz at octave frequencies. Speech identification scores were obtained for phonemically balanced words developed for adults in Kannada by Yathiraj and Vijayalakshmi.[20] Immittance evaluation consisted of tympanometry with a 226-Hz probe tone and acoustic reflex threshold testing at 500, 1000, 2000 and 4000 Hz (ipsilateral and contralateral) using a calibrated middle ear analyzer (GSI Tympstar V 2.0; Grason-Stadler, Eden Prairie, MN, USA). All the participants had normal tympanograms with reflexes present at all frequencies (ipsilateral and contralateral) in both ears. SPIN was administered to all the participants at 0 dB SNR.

Speech identification scores for time compressed speech

The phonemically balanced word list with 25 words developed by Sreela and Devi[21] and the list of 20 monosyllables developed by Mayadevi[22] were recorded by a female native Kannada speaker. The recorded stimuli were normalized so that all the words had the same intensity. A 1 kHz calibration tone was recorded prior to the lists to monitor the VU meter. The four lists of PB words and 2 lists of monosyllables were randomized using random tables to make eight lists. The PB word and monosyllable lists were time compressed by shortening them digitally with the pitch synchronous overlap and add method[23] using PRAAT software (http://www.praat.org).[24] The PB words and monosyllables were time compressed at compression ratios of 40%, 50%, 60%, 70% and 80%. The stimuli were routed from a computer to the CD/tape input of a PIANO diagnostic audiometer (Inventis srl, Padova, Italy) and presented through TDH-50 headphones with MX-41/AR cushions. The test material was presented at 40 dB SL (re: speech recognition threshold). The speech identification scores were determined at each compression ratio for both stimuli. The participants were instructed to give a written response and to guess the speech items if they were not very clear. Half of the participants were tested in the right ear first and the other half in the left ear first to avoid an ear effect. The order of stimulus presentation was randomized for all the lists to avoid practice and order effects.
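The relation between compression ratio and stimulus duration can be made concrete with a short sketch. It assumes the convention, common in the time-compressed speech literature but not stated explicitly here, that a compression ratio of X% removes X% of the original duration. The function name and the 500 ms example duration are hypothetical, and the sketch only computes target durations; it does not perform the pitch synchronous overlap and add resynthesis carried out in PRAAT.

```python
def compressed_duration(original_ms: float, compression_pct: float) -> float:
    """Return the duration that remains after time compression.

    Assumption: compression_pct is the percentage of the original
    duration that is removed (e.g. 60 means 40% of the duration remains).
    """
    if not 0 <= compression_pct < 100:
        raise ValueError("compression_pct must be in [0, 100)")
    return original_ms * (1 - compression_pct / 100)


# Hypothetical 500 ms word at the compression ratios used in the study.
for ratio in (40, 50, 60, 70, 80):
    print(f"{ratio}% compression -> {compressed_duration(500, ratio):.0f} ms")
```

Under this assumption, a stimulus at 80% compression keeps only one fifth of its original duration, which illustrates why the highest ratios leave so little temporal information.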

Data analysis

The correct responses were counted for both stimuli at each compression ratio and converted to percentage scores. Statistical analysis was carried out using the Statistical Package for the Social Sciences (SPSS; IBM Corp., Armonk, NY, USA) version 17. Paired samples t-tests and mixed analysis of variance (ANOVA) were used to analyze the data.
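As a minimal sketch of the scoring and the paired comparison described above, the following Python code converts raw counts to percentage scores and computes a paired-samples t statistic. The analysis in the study was run in SPSS; the function names and the raw scores below are made up for illustration, and only the list lengths (25 PB words, 20 monosyllables) come from the study.

```python
import math
import statistics


def percent_score(n_correct: int, n_items: int) -> float:
    """Convert a raw count of correct responses to a percentage."""
    return 100.0 * n_correct / n_items


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up raw scores for five listeners at one compression ratio.
mono = [percent_score(c, 20) for c in (19, 18, 20, 18, 19)]  # 20 monosyllables
pb = [percent_score(c, 25) for c in (22, 21, 23, 20, 22)]    # 25 PB words
t, df = paired_t(mono, pb)
print(f"t({df}) = {t:.2f}")
```

The t statistic is then compared against the t distribution with n - 1 degrees of freedom to obtain the P value, a step that statistical packages perform automatically.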

Results

The results showed that scores decreased with increasing compression for both stimuli. The mean and SD of SIS obtained for PB words and monosyllables at different compression ratios are shown in Figure 1. For PB words, SIS was around 90% at the 50% compression ratio; for monosyllables, SIS was around 90% at the 60% compression ratio. Paired samples t-tests showed that scores were significantly higher (P<0.001) for monosyllables than for PB words at all compression ratios.

Figure 1.

Mean and standard deviation for phonemically balanced (PB) words and monosyllables across different compression ratios.

Mixed ANOVAs were carried out with compression ratio as a within-subject factor and gender as a between-subject factor. The results showed a significant main effect of compression ratio for both PB words and monosyllables. Bonferroni-corrected multiple comparisons showed significant differences (P<0.001) between all the compression ratio conditions for both stimuli. There was no significant main effect of gender and no significant interaction between gender and compression ratio. The SIS across gender for different compression ratios for PB words is shown in Figure 2. The SIS across gender for different compression ratios for monosyllables is shown in Figure 3.
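To illustrate the size of the multiple-comparison correction, the sketch below enumerates the pairwise comparisons among the five compression ratios and derives a Bonferroni-adjusted threshold. The family-wise alpha of 0.05 is an assumption, since the article does not state the level that was used.

```python
from itertools import combinations

ratios = (40, 50, 60, 70, 80)          # compression ratios (%) in the study
pairs = list(combinations(ratios, 2))  # every pairwise comparison
alpha = 0.05                           # assumed family-wise error rate
adjusted_alpha = alpha / len(pairs)    # Bonferroni-adjusted threshold

print(len(pairs), adjusted_alpha)
```

Five conditions give ten pairwise comparisons, so each comparison is tested against a threshold ten times stricter than the assumed family-wise level. Equivalently, each unadjusted P value can be multiplied by the number of comparisons.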

Figure 2.

Mean and standard deviation for phonemically balanced (PB) words across gender for different compression ratios.

Figure 3.

Mean and standard deviation for monosyllables across gender for different compression ratios.

The difference in mean scores between time-compressed monosyllables and PB words at each compression ratio was calculated, and the results are shown in Figure 4. The difference was 5% at 40% and 50% compression and 6% at 60% compression. However, it was 13.6% at 70% compression and 25% at 80% compression.

Figure 4.

Difference in speech identification scores (SIS) between monosyllables and phonemically balanced words for different compression ratios.

Discussion

The results of the study show a reduction in speech identification scores with increasing compression ratio, which is in agreement with previous studies on time-compressed speech.[16,17] Scores were highest at lower compression ratios for both types of stimuli. This suggests that at lower compression ratios, the participants were able to use auditory closure abilities to guess the items correctly.[25-27] The speech identification scores were high (around 90%) and least variable (lowest standard deviation) at the 60% compression ratio for monosyllables and the 50% compression ratio for PB words. The results are in agreement with previous studies which report that the average correct response was around 82% at the most stable compression ratio in normal hearing adults.[16,18,28] Thus, the study provides normative scores at different compression ratios for PB words and monosyllables in individuals with normal hearing. The results of the present study showed no effect of gender on the perception of time-compressed speech. Lau[29] also reported that performance did not vary across gender on a Cantonese time-compressed speech test. Bhargavi et al.,[15] Sujitha[14] and Kumar[13] also reported no gender difference at different compression ratios in children aged 7-12 years on tests of time-compressed speech in Indian languages. Thus, the results of the present study agree with previous reports suggesting that there is no significant difference in scores for time-compressed speech between males and females at different compression ratios. This result suggests that there is no need for separate normative values for males and females for time-compressed speech tests with PB words and monosyllables. The speech identification scores for time-compressed speech were expected to be better for PB words than for monosyllables, considering the higher redundancy of PB words.
However, the opposite result was obtained in our study: scores were better for monosyllables than for PB words at all compression ratios, and the difference was greater at higher compression ratios. The monosyllables used in this study were non-sense consonant-vowel syllables that lacked consonant clusters, whereas the PB words did have consonant clusters. As observed, after time compression the consonant clusters were absent, which reduced the important temporal cues and may have led to more errors for PB words. This effect of loss of temporal cues on SIS is reported to increase as the compression ratio increases. Rabelo and Schochat[5] also reported that scores were better for monosyllables than for disyllables in Brazilian time-compressed speech. They also attributed the poorer speech identification scores for disyllables to the loss of consonant clusters.

Conclusions

The study provides data on the perception of time-compressed speech for PB words in Kannada and monosyllables across different compression ratios. The results showed no difference in scores between males and females for either stimulus type at any compression ratio. The results also support the use of monosyllables at higher compression ratios. The same test material now needs to be administered to the clinical population with (C)APD for clinical validation of the present results.

8 in total

1. Temporal processing in the aging auditory system.

Authors: A Strouse; D H Ashmead; R N Ohde; D W Grantham
Journal: J Acoust Soc Am Date: 1998-10 Impact factor: 1.840

2. Dissociations in perceptual learning revealed by adult age differences in adaptation to time-compressed speech.

Authors: Jonathan E Peelle; Arthur Wingfield
Journal: J Exp Psychol Hum Percept Perform Date: 2005-12 Impact factor: 3.332

3. Psychoacoustic tests for central auditory processing: normative data.

Authors: Rafi Shemesh
Journal: J Basic Clin Physiol Pharmacol Date: 2008

4. Time compressed speech--a perspective.

Authors: D B Orr
Journal: J Commun Date: 1968-09

5. Effects of time compression and time compression plus reverberation on the intelligibility of Northwestern University Auditory Test No. 6.

Authors: R H Wilson; J P Preece; D L Salamon; J L Sperry; S P Bornstein
Journal: J Am Acad Audiol Date: 1994-07 Impact factor: 1.664

6. Spontaneous segmentation in normal and in time-compressed speech.

Authors: A Wingfield; K A Nolan
Journal: Percept Psychophys Date: 1980-08

7. Recognition of time-compressed and natural speech with selective temporal enhancements by young and elderly listeners.

Authors: Sandra Gordon-Salant; Peter J Fitzgibbons; Sarah A Friedman
Journal: J Speech Lang Hear Res Date: 2007-10 Impact factor: 2.297

8. Time-compressed speech test in Brazilian Portuguese.

Authors: Camila Maia Rabelo; Eliane Schochat
Journal: Clinics (Sao Paulo) Date: 2007-06 Impact factor: 2.365
