Kazuo Ueda, Yoshitaka Nakajima.
Abstract
The peripheral auditory system functions like a frequency analyser, often modelled as a bank of non-overlapping band-pass filters called critical bands; 20 bands are necessary to simulate the frequency resolution of the ear within the ordinary frequency range of speech (up to 7,000 Hz). A far smaller number of filters has nevertheless seemed sufficient to re-synthesise intelligible speech sentences from the power fluctuations of the speech signals passing through them; the number and frequency ranges of the bands needed for efficient speech communication, however, are still unknown. We derived four common frequency bands (covering approximately 50–540, 540–1,700, 1,700–3,300, and above 3,300 Hz) from factor analyses of spectral fluctuations in eight different spoken languages/dialects. The analyses robustly led to three factors common to all the languages/dialects investigated: the low & mid-high factor, related to the two separate frequency ranges of 50–540 and 1,700–3,300 Hz; the mid-low factor, related to the range of 540–1,700 Hz; and the high factor, related to the range above 3,300 Hz. This suggests a language universal.
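As a rough illustration of the band layout described in the abstract, the sketch below maps a frequency to one of the four bands and pools per-channel power into them. The band edges and factor names come from the abstract; the function names, the treatment of a boundary frequency as belonging to the upper band, and the pooling by simple summation are assumptions made for illustration, not the authors' procedure.

```python
from bisect import bisect_right

# Approximate band edges in Hz, as stated in the abstract.
BAND_EDGES_HZ = [540, 1700, 3300]
LOWER_LIMIT_HZ = 50  # approximate lower edge of the lowest band

# Factor associated with each band, as described in the abstract.
FACTOR_OF_BAND = ["low & mid-high", "mid-low", "low & mid-high", "high"]


def band_index(freq_hz):
    """Return 0..3 for the band containing freq_hz.

    Assumption: a frequency exactly on a boundary goes to the upper band.
    """
    if freq_hz < LOWER_LIMIT_HZ:
        raise ValueError("frequency below the lowest band")
    return bisect_right(BAND_EDGES_HZ, freq_hz)


def pool_power(centre_freqs_hz, powers):
    """Sum per-channel power into the four bands (hypothetical helper)."""
    pooled = [0.0] * 4
    for freq, power in zip(centre_freqs_hz, powers):
        pooled[band_index(freq)] += power
    return pooled
```

For example, `band_index(1000)` falls in the second band (540–1,700 Hz), which `FACTOR_OF_BAND` associates with the mid-low factor.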
Year: 2017 PMID: 28198405 PMCID: PMC5309770 DOI: 10.1038/srep42468
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Analysed speech samples.
| Languages/dialects | Number of sentences | Number of speakers (female) | Number of speakers (male) | Overall duration of utterances (s) | Mean duration per utterance (s) |
|---|---|---|---|---|---|
| American English | 86 | 10 | 10 | 4,123.2 | 2.4 |
| British English | 200 | 5 | 5 | 4,038.5 | 2.0 |
| Cantonese | 58 | 5 | 5 | 1,131.7 | 2.0 |
| French | 200 | 5 | 5 | 3,533.2 | 1.8 |
| German | 200 | 5 | 5 | 3,707.0 | 1.9 |
| Japanese | 200 | 5 | 5 | 5,041.3 | 2.5 |
| Mandarin | 78 | 5 | 5 | 1,834.9 | 2.4 |
| Spanish | 136 | 5 | 5 | 2,918.1 | 2.1 |
| Total | 1,158 | 45 | 45 | 26,327.8 | 2.1 |
Speech samples were extracted from the database (ref. 11).
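The mean durations in the table can be cross-checked with a short calculation. The sketch below assumes that every speaker uttered every sentence, so that the number of utterances is sentences × (female + male speakers); the table does not state this explicitly, but the listed values are consistent with it. The row values are copied from the table, and the helper name is hypothetical.

```python
# Rows copied from the table:
# (language, sentences, female speakers, male speakers, overall duration in s, listed mean in s)
ROWS = [
    ("American English", 86, 10, 10, 4123.2, 2.4),
    ("British English", 200, 5, 5, 4038.5, 2.0),
    ("Cantonese", 58, 5, 5, 1131.7, 2.0),
    ("French", 200, 5, 5, 3533.2, 1.8),
    ("German", 200, 5, 5, 3707.0, 1.9),
    ("Japanese", 200, 5, 5, 5041.3, 2.5),
    ("Mandarin", 78, 5, 5, 1834.9, 2.4),
    ("Spanish", 136, 5, 5, 2918.1, 2.1),
]


def mean_duration(sentences, female, male, total_s):
    """Mean duration per utterance, assuming each speaker read each sentence."""
    utterances = sentences * (female + male)
    return total_s / utterances


# Recompute the last column and round to one decimal place, as in the table.
recomputed = {
    name: round(mean_duration(n, f, m, total_s), 1)
    for name, n, f, m, total_s, _listed in ROWS
}

total_sentences = sum(row[1] for row in ROWS)
total_duration_s = sum(row[4] for row in ROWS)
```

Under this assumption, each recomputed mean matches the listed value, and the summed sentence count and duration agree with the Total row to within rounding.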
Figure 1. Factor loadings plotted against the centre frequency of critical bands.
(a) Three-factor analysis. (b) Four-factor analysis. The thick lines represent factor loadings derived from the merged data across the eight languages/dialects; the colours of the thick lines distinguish the factors. The thin lines show the results for individual languages/dialects without distinguishing factors: American English (pink), British English (dark green), Cantonese (purple), French (sky blue), German (black), Japanese (blue), Mandarin (yellow), and Spanish (olive green). The broken lines are the counterparts of the solid lines of the same colours, obtained with a filter bank shifted up by half a critical bandwidth (Supplementary Table S1). The cumulative contributions ranged from 33 to 41% (a) and from 40 to 47% (b), depending on the analysed data set and the filters used. One division of the horizontal axis corresponds to 0.5 critical bandwidth, with the two sets of centre frequencies alternating. Orange vertical lines represent schematic frequency boundaries estimated from the crossover frequencies of the curves.
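One way to estimate a crossover frequency between two factor-loading curves is linear interpolation between the sampled centre frequencies. The sketch below is only an illustration: the caption does not say how the crossover frequencies were obtained, the function name is hypothetical, and interpolating on a linear Hz axis (rather than a critical-band scale) is an assumption.

```python
def crossover_frequencies(freqs_hz, loadings_a, loadings_b):
    """Return the frequencies at which two loading curves cross.

    The curves are sampled at the same centre frequencies; crossings
    between samples are located by linear interpolation (an assumption).
    """
    diffs = [a - b for a, b in zip(loadings_a, loadings_b)]
    crossings = []
    for i in range(len(diffs) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0.0:
            # The curves coincide exactly at a sampled frequency.
            crossings.append(freqs_hz[i])
        elif d0 * d1 < 0.0:
            # Sign change: interpolate the zero of the difference.
            t = d0 / (d0 - d1)
            crossings.append(freqs_hz[i] + t * (freqs_hz[i + 1] - freqs_hz[i]))
    if diffs and diffs[-1] == 0.0:
        crossings.append(freqs_hz[-1])
    return crossings


# Made-up loadings for illustration only; these are not values from the figure.
example = crossover_frequencies(
    [400, 500, 600, 700],
    [0.8, 0.6, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.7],
)
```

With these made-up values, the two curves cross once, midway between the 500 Hz and 600 Hz samples.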