Weijian Hu1, Kaiwei Wang1, Kailun Yang2, Ruiqi Cheng1, Yaozu Ye1, Lei Sun1, Zhijie Xu3.
Abstract
In recent years, with the development of depth cameras and scene detection algorithms, a wide variety of electronic travel aids for visually impaired people have been proposed. However, it is still challenging to convey scene information to visually impaired people efficiently. In this paper, we propose three different auditory-based interaction methods, i.e., depth image sonification, obstacle sonification and path sonification, which convey raw depth images, obstacle information and path information, respectively, to visually impaired people. The three sonification methods are compared comprehensively through a field experiment attended by twelve visually impaired participants. The results show that the sonification of high-level scene information, such as the direction of the pathway, is easier to learn and adapt to, and is more suitable for point-to-point navigation. In contrast, through the sonification of low-level scene information, such as raw depth images, visually impaired people can understand the surrounding environment more comprehensively. Furthermore, no single interaction method is best suited for all participants in the experiment, and visually impaired individuals need a period of time to find the most suitable interaction method. Our findings highlight the features of and the differences between the three scene detection algorithms and the corresponding sonification methods. The results provide insights into the design of electronic travel aids, and the conclusions can also be applied in other fields, such as the sound feedback of virtual reality applications.
Keywords: electronic travel aid; scene detection; sonification; visually impaired people
Year: 2020 PMID: 32517134 PMCID: PMC7309097 DOI: 10.3390/s20113222
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Flow chart of the proposed assistance system. We propose three sonification methods and corresponding scene detection algorithms to convert scene information to sound.
Figure 2. Right-handed spherical coordinate system used to represent the spatial location of sound sources.
Figure 3. Image sonification. (a) Raw color image. (b) Raw depth image, where invalid depth values are indicated by dark blue. (c) The result of superpixel segmentation on the color image. (d) Downsampled depth image guided by the superpixels in (c). (e) Schematic of image sonification.
Parameter mapping rules of image sonification. Parameters of superpixel are mapped to the parameters of sine waves. The origin of the image coordinate system is at the top left corner of raw depth images at the resolution of .
| Data Parameter | Sound Parameter | Mapping Function |
|---|---|---|
| Superpixel vertical coordinate | Pitch | {(0, −6 st), (240, +6 st)} |
| Superpixel horizontal coordinate | Azimuth | |
| Depth value | Loudness | {(0.2 m, 5 dB), (1.5 m, −50 dB), (3.5 m, −70 dB)} |
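
The mapping functions in the table above are listed as sets of (input, output) breakpoints. One plausible reading, assumed here rather than stated in this excerpt, is linear interpolation between adjacent breakpoints with clamping outside the listed range. The Python sketch below illustrates that reading for the depth-to-loudness and vertical-coordinate-to-pitch rows. The function name `piecewise_linear` and the constant names are hypothetical.

```python
from bisect import bisect_right


def piecewise_linear(x, points):
    """Linearly interpolate between sorted (x, y) breakpoints.

    Values outside the breakpoint range are clamped to the nearest
    endpoint. Both the interpolation and the clamping are assumptions.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # index of the first breakpoint strictly above x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Breakpoints copied from the image sonification table.
DEPTH_M_TO_LOUDNESS_DB = [(0.2, 5.0), (1.5, -50.0), (3.5, -70.0)]
ROW_TO_PITCH_ST = [(0, -6.0), (240, 6.0)]

if __name__ == "__main__":
    for depth_m in (0.2, 1.0, 2.5, 4.0):
        level_db = piecewise_linear(depth_m, DEPTH_M_TO_LOUDNESS_DB)
        print(f"depth {depth_m} m -> loudness {level_db:.1f} dB")
    print("row 120 -> pitch", piecewise_linear(120, ROW_TO_PITCH_ST), "st")
```

Under this reading, nearer superpixels sound louder, and anything beyond 3.5 m stays at the quietest listed level.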
Figure 4. Obstacle sonification. (a) Raw depth image. (b) The results of ground and obstacle detection, where green masks represent the ground, and the detected obstacles are highlighted by red bounding boxes. (c) Sonification scheme, where obstacles are indicated by blue striped blocks and the obstacle distance of each section is indicated by red bold lines.
Parameter mapping rules of obstacle sonification. Sections in the polar coordinate system are mapped to the parameters of instruments.
| Data Parameter | Sound Parameter | Mapping Function |
|---|---|---|
| Section index | Timbre | |
| Section index | Azimuth | |
| Section distance | Loudness | {(0.2 m, 0 dB), (1.5 m, −40 dB), (3.5 m, −70 dB)} |
| Section distance | Tempo | {(0.2 m, 0.1 s), (1.5 m, 1 s), (3.5 m, 2 s)} |
| Section distance | Pitch | {(0.2 m, +6 st), (3.5 m, −6 st)} |
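
The loudness and pitch values in the table above are given in decibels (dB) and semitones (st). As a minimal sketch of how such values could be turned into synthesis parameters, the code below applies the standard conversions from dB to linear amplitude and from semitones to a frequency ratio. Treating 0 dB as unit gain, and reading the tempo column as the interval in seconds between successive notes, are assumptions. All function names are hypothetical.

```python
def db_to_gain(level_db):
    """Convert a relative level in dB to a linear amplitude factor (0 dB -> 1.0)."""
    return 10.0 ** (level_db / 20.0)


def semitones_to_ratio(offset_st):
    """Convert a pitch offset in semitones to a frequency ratio (12 st -> 2.0)."""
    return 2.0 ** (offset_st / 12.0)


def interval_to_rate(interval_s):
    """Convert a note interval in seconds to notes per second (assumed meaning of tempo)."""
    return 1.0 / interval_s


def to_synthesis_parameters(loudness_db, interval_s, pitch_st):
    """Bundle the converted values for one section of the obstacle sonification."""
    return {
        "gain": db_to_gain(loudness_db),
        "notes_per_second": interval_to_rate(interval_s),
        "frequency_ratio": semitones_to_ratio(pitch_st),
    }


if __name__ == "__main__":
    # Table values at the nearest (0.2 m) and farthest (3.5 m) breakpoints.
    print("0.2 m:", to_synthesis_parameters(0.0, 0.1, 6.0))
    print("3.5 m:", to_synthesis_parameters(-70.0, 2.0, -6.0))
```

With these conversions, the ±6 st range in the table spans one octave in total.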
Figure 5. Path sonification. (a) Traversable distance curves, where green masks represent the detected ground, and the blue and red curves represent the raw and the smoothed traversable distance curves, respectively. The red arrow denotes the most-traversable direction. (b) Depending on the most-traversable direction, either a flute sound or a water drop sound is played.
Parameter mapping rules of path sonification. Pathway information is mapped to the parameters of flute and water drop sound.
| Data Parameter | Sound Parameter | Mapping Function |
|---|---|---|
| Most-traversable direction | Flute azimuth | |
| Most-traversable direction | Flute pitch | |
| Most-traversable direction | Flute loudness | |
| Traversable distance ahead | Water drop loudness | |
| Traversable distance ahead | Water drop tempo | {(1.5 m, 0.5 s), (2 m, 1 s), (3.5 m, 2 s)} |
Figure 6. Field experiments. (a) Hardware implementation: a pair of smart glasses with an RGB-D camera and a bone conduction headphone. (b) A participant training in the field with our help. (c) Diagram and panorama image of the experiment field, where obstacles are represented by small squares.
Basic information of volunteers.
| Subject ID | Gender | Age | Vision |
|---|---|---|---|
| 1 | female | 18 | total blind |
| 2 | female | 18 | total blind |
| 3 | male | 17 | low vision |
| 4 | male | 18 | total blind |
| 5 | male | 18 | total blind |
| 6 | male | 19 | total blind |
| 7 | female | 19 | total blind |
| 8 | female | 17 | low vision |
| 9 | male | 17 | low vision |
| 10 | male | 18 | total blind |
| 11 | female | 19 | total blind |
| 12 | male | 17 | total blind |
Experiment results. S1, S2 and S3 represent image sonification, obstacle sonification and path sonification respectively, and W represents the white cane method.
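| Subject ID | Training Time (S1) | Training Time (S2) | Training Time (S3) | Mean Completion Time (S1) | Mean Completion Time (S2) | Mean Completion Time (S3) | Mean Completion Time (W) | Number of Failures (S1) | Number of Failures (S2) | Number of Failures (S3) | Number of Failures (W) | Most Favorite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|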
| 1 | 5 | 11 | 15 | 48.0 | 55.3 | 32.7 | 37.0 | 0/2 | 0/0 | 0/0 | 0/0 | S2 |
| 2 | 6 | 12 | 11 | 28.4 | 30.2 | 39.2 | 31.2 | 1/0 | 1/0 | 0/0 | 0/0 | S3 |
| 3 | 6 | 9 | 12 | 34.1 | 36.4 | 36.2 | 42.3 | 0/0 | 0/0 | 0/0 | 0/0 | S2 |
| 4 | 6 | 10 | 14 | 38.1 | 36.7 | 28.9 | 30.7 | 0/0 | 0/0 | 0/0 | 0/0 | S3 |
| 5 | 5 | 11 | 13 | 67.9 | 66.2 | 62.6 | 52.2 | 0/0 | 0/0 | 1/0 | 0/0 | S2 |
| 6 | 5 | 13 | 12 | 35.6 | 33.0 | 35.2 | 33.5 | 0/0 | 0/0 | 0/0 | 0/0 | W |
| 7 | 5 | 12 | 17 | 36.8 | 37.6 | 42.4 | 32.8 | 2/0 | 2/0 | 0/0 | 0/0 | W |
| 8 | 6 | 15 | 15 | 33.4 | 36.0 | 36.3 | 32.5 | 0/0 | 0/0 | 0/0 | 0/0 | S1 |
| 9 | 6 | 11 | 18 | 56.8 | 47.1 | 57.5 | 47.7 | 0/1 | 0/0 | 1/0 | 0/0 | S1 |
| 10 | 7 | 9 | 16 | 52.9 | 60.1 | 47.6 | 36.0 | 0/0 | 2/0 | 0/0 | 0/0 | S3 |
| 11 | 5 | 11 | 17 | 38.2 | 36.2 | 41.2 | 24.9 | 0/0 | 0/0 | 0/0 | 0/0 | W |
| 12 | 7 | 12 | 14 | 33.2 | 35.1 | 34.0 | 31.6 | 1/0 | 0/0 | 0/0 | 0/0 | S3 |
| Sum | / | / | / | / | / | / | / | 4/3 | 5/0 | 2/0 | 0/0 | / |
| Mean | 5.8 | 11.3 | 14.5 | 41.9 | 42.5 | 41.2 | 37.1 | / | / | / | / | / |
| SD | 0.8 | 1.7 | 2.2 | 11.3 | 11.3 | 9.7 | 8.2 | / | / | / | / | / |
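
The Mean and SD rows for training time, and the distribution of the "Most Favorite" column, can be recomputed from the per-subject rows. The sketch below uses values transcribed from the table above. Using the sample standard deviation (`statistics.stdev`) is an assumption about how the SD row was computed, and the variable and function names are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Training time per subject (subjects 1 to 12), transcribed from the table.
TRAINING_TIME = {
    "S1": [5, 6, 6, 6, 5, 5, 5, 6, 6, 7, 5, 7],
    "S2": [11, 12, 9, 10, 11, 13, 12, 15, 11, 9, 11, 12],
    "S3": [15, 11, 12, 14, 13, 12, 17, 15, 18, 16, 17, 14],
}

# "Most Favorite" column, subjects 1 to 12.
MOST_FAVORITE = ["S2", "S3", "S2", "S3", "S2", "W", "W", "S1", "S1", "S3", "W", "S3"]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    for method, times in TRAINING_TIME.items():
        m, sd = summarize(times)
        print(f"{method}: mean = {m:.2f}, SD = {sd:.2f}")
    print(Counter(MOST_FAVORITE).most_common())
```

Under the sample-SD assumption, the recomputed training-time values agree with the table's Mean and SD rows after rounding to one decimal place. The preference counts are four for S3, three each for S2 and W, and two for S1.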
Figure 7. Box chart of the questionnaire results. I1 to I5 indicate the five questionnaire items for evaluating scene representation, navigation, complexity, comfort and overall satisfaction. Scores are on a 7-point Likert scale, where 1 means strongly disagree and 7 means strongly agree. The median and mean values of each item are shown as red lines and diamond markers, respectively.