Parham Mokhtari1,2, Hiroaki Kato3, Hironori Takemoto4, Ryouichi Nishimura5, Seigo Enomoto3, Seiji Adachi6, Tatsuya Kitamura7.
Abstract
Humans can externalise and localise sound-sources in three-dimensional (3D) space because approaching sound waves interact with the head and external ears, adding auditory cues by (de-)emphasising the level in different frequency bands depending on the direction of arrival. While virtual audio systems reproduce these acoustic filtering effects with signal processing, huge memory-storage capacity would be needed to cater for many listeners because the filters are as unique as the shape of each person's head and ears. Here we use a combination of physiological imaging and acoustic simulation methods to confirm and extend previous studies that represented these filters by a linear combination of a small number of eigenmodes. Based on previous psychoacoustic results we infer that more than 10, and as many as 24, eigenmodes would be needed in a virtual audio system suitable for many listeners. Furthermore, the frequency profiles of the top five eigenmodes are robust across different populations and experimental methods, and the top three eigenmodes encode familiar 3D spatial contrasts: along the left-right, top-down, and a tilted front-back axis, respectively. These findings have implications for virtual 3D-audio systems, especially those requiring high energy-efficiency and low memory-usage such as on personal mobile devices.
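The "linear combination of a small number of eigenmodes" mentioned in the abstract can be illustrated with a minimal principal component analysis (PCA) sketch. This is not the authors' code: the function names `fit_eigenmodes` and `reconstruct` are hypothetical, the data are random stand-ins rather than measured or simulated HRTFs, and the choice of log-magnitude spectra as input is an assumption.

```python
import numpy as np

def fit_eigenmodes(spectra):
    """PCA via SVD. spectra: (n_observations, n_frequencies) array (assumed log-magnitude)."""
    mean = spectra.mean(axis=0)
    # Rows of vt are orthonormal eigenmodes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt

def reconstruct(spectra, mean, modes, k):
    """Approximate each spectrum by the mean plus a weighted sum of the top-k eigenmodes."""
    weights = (spectra - mean) @ modes[:k].T   # shape (n_observations, k)
    return mean + weights @ modes[:k]

# Synthetic stand-in data (NOT real HRTFs): 200 spectra over 64 frequency bins,
# generated from 8 hidden components plus a little noise.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 64))
spectra += 0.01 * rng.normal(size=spectra.shape)

mean, modes = fit_eigenmodes(spectra)
# RMS reconstruction error for increasing numbers of retained eigenmodes.
errors = [np.sqrt(np.mean((spectra - reconstruct(spectra, mean, modes, k)) ** 2))
          for k in (1, 4, 10, 24)]
```

In this sketch only the `k` weights per spectrum need to be stored once the shared mean and eigenmodes are known, which is the kind of memory saving the abstract alludes to.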
Year: 2019 PMID: 31097764 PMCID: PMC6522525 DOI: 10.1038/s41598-019-43967-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Correspondence between PCA and DCT representations. Each error distribution (n = 47,500) is shown by its median and 1st & 3rd quartiles. White boxes: all-inclusive PCA. Black boxes: cross-validation PCA (holding out one individual at a time). Red vertical thick lines: DCT with 8, 16, and 32 coefficients. Pale red dotted lines and patches: horizontal extension of DCT results to closest-matching PCA error distributions (i.e., to 4, 10, and 24 eigenmodes respectively).
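The comparison in Figure 1 can be sketched roughly as follows, under several assumptions that the caption does not confirm: that the DCT is applied along the frequency axis and truncated to its first `m` coefficients, that the error is a per-spectrum RMS residual, and that cross-validation refits the PCA on all remaining individuals. The helper names (`dct_matrix`, `dct_errors`, `loo_pca_errors`, `quartiles`) are hypothetical and the data are synthetic.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    c[0] /= np.sqrt(2.0)
    return c

def dct_errors(x, m):
    """Residual after keeping only the first m DCT coefficients of each row of x."""
    c = dct_matrix(x.shape[1])[:m]
    return x - (x @ c.T) @ c

def loo_pca_errors(data, k):
    """data: (n_subjects, n_directions, n_freqs). Hold out one subject at a time."""
    residuals = []
    for s in range(data.shape[0]):
        train = np.delete(data, s, axis=0).reshape(-1, data.shape[2])
        mean = train.mean(axis=0)
        _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
        test = data[s] - mean
        # Project the held-out subject onto eigenmodes learned without that subject.
        residuals.append(test - (test @ vt[:k].T) @ vt[:k])
    return np.stack(residuals)

def quartiles(residuals):
    """1st quartile, median, and 3rd quartile of the per-spectrum RMS error."""
    rms = np.sqrt(np.mean(residuals ** 2, axis=-1)).ravel()
    return np.percentile(rms, [25, 50, 75])

# Synthetic stand-in data (NOT real HRTFs): 5 subjects, 40 directions, 32 bins.
rng = np.random.default_rng(1)
basis = rng.normal(size=(6, 32))
data = rng.normal(size=(5, 40, 6)) @ basis + 0.05 * rng.normal(size=(5, 40, 32))

pca_q = quartiles(loo_pca_errors(data, k=6))
dct_q = quartiles(dct_errors(data.reshape(-1, 32), m=8))
```

Matching each DCT error distribution to the PCA distribution with the closest quartiles, as the caption describes, would then amount to repeating `loo_pca_errors` over a range of `k` and comparing the resulting quartile triples.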
Figure 2. Top five eigenmodes of human HRTFs. Data from two previous studies shown in black dotted[1] and solid[2] lines (reproduced from Fig. 1 of ref.[2]). Our data shown in red (all-inclusive analysis) and pale blue (19 separate analyses leaving out one person at a time).
Eigenmode correlations with previous studies.
| Eigenmode | K&W (a) | K&W (b) | M&G (a) | M&G (b) |
|---|---|---|---|---|
| 1 | 0.93 | 0.97 | 0.91 | 0.96 |
| 2 | 0.94 | 0.99 | 0.86 | 0.97 |
| 3 | 0.79 | 0.98 | 0.72 | 0.95 |
| 4 | 0.57 | 0.99 | 0.62 | 0.88 |
| 5 | 0.78 | 0.99 | 0.70 | 0.93 |
| mean | 0.80 | 0.98 | 0.76 | 0.94 |
Correlation coefficients between the top five eigenmodes and those of two previous studies (K&W[1], M&G[2]). (a) Before any frequency scaling. (b) After optimal frequency scaling[12] (+8.6%[1] and +7.8%[2]) to maximise the mean correlation.
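One way to picture the "optimal frequency scaling" step is a grid search over a single stretch factor applied to the reference eigenmodes, choosing the factor that maximises the mean Pearson correlation across modes. The sketch below is an assumption about the general idea only: the exact procedure of ref.[12], the direction in which the scaling is applied, and the interpolation method are not given in this record, and the names `scaled_correlation` and `best_scale` are hypothetical.

```python
import numpy as np

def scaled_correlation(freqs, ours, theirs, scale):
    """Pearson r between `ours` and `theirs` after stretching the frequency axis of `theirs`."""
    # A feature at frequency f in `theirs` is moved to f * scale, then resampled on `freqs`.
    stretched = np.interp(freqs, freqs * scale, theirs)
    return np.corrcoef(ours, stretched)[0, 1]

def best_scale(freqs, ours_modes, theirs_modes, scales):
    """Grid search for the single scale that maximises the mean correlation over modes."""
    means = [np.mean([scaled_correlation(freqs, a, b, s)
                      for a, b in zip(ours_modes, theirs_modes)])
             for s in scales]
    i = int(np.argmax(means))
    return scales[i], means[i]

# Synthetic check (NOT real eigenmodes): `ours` is `theirs` stretched by +8%.
freqs = np.linspace(1000.0, 14000.0, 200)
theirs = np.stack([np.sin(2 * np.pi * n * freqs / 14000.0) for n in (1, 2, 3)])
ours = np.stack([np.interp(freqs, freqs * 1.08, t) for t in theirs])

scales = np.round(np.arange(0.90, 1.2001, 0.01), 2)
scale, mean_r = best_scale(freqs, ours, theirs, scales)
```

Because the sign of a principal component is arbitrary, a real comparison might also need to flip eigenmodes before correlating them; that step is omitted here.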
Figure 3. Spatial maps of top three eigenmodes. Hammer projection of mean weights on each eigenmode, at all 1250 farfield locations. Black reference lines indicate the median plane (vertical line), the frontal plane (horizontal line), and the horizontal plane (curved lines). The color scale extends to ±1 standard-deviation.
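For readers unfamiliar with the Hammer projection used in Figure 3, the standard equal-area formula is sketched below. How the authors map source azimuth and elevation onto longitude and latitude is not stated in this record, so the argument convention here is an assumption, and the function name `hammer` is hypothetical.

```python
import math

def hammer(lon, lat):
    """Hammer equal-area projection; lon in [-pi, pi], lat in [-pi/2, pi/2], in radians."""
    denom = math.sqrt(1.0 + math.cos(lat) * math.cos(lon / 2.0))
    x = 2.0 * math.sqrt(2.0) * math.cos(lat) * math.sin(lon / 2.0) / denom
    y = math.sqrt(2.0) * math.sin(lat) / denom
    return x, y

# The whole sphere maps into an ellipse with semi-axes 2*sqrt(2) and sqrt(2).
centre = hammer(0.0, 0.0)
right_edge = hammer(math.pi, 0.0)
top = hammer(0.0, math.pi / 2.0)
```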