| Literature DB >> 36081002 |
Jing Wang1,2, Rongfeng Zhao1, Peitong Li1, Zhiqiang Fang1, Qianqian Li1, Yanling Han1, Ruyan Zhou1, Yun Zhang1.
Abstract
Visual prostheses, used to help restore functional vision to the visually impaired, convert captured external images into corresponding electrical stimulation patterns that are delivered by implanted microelectrodes to induce phosphenes and, eventually, visual perception. Detecting and providing useful visual information to the prosthesis wearer under limited artificial vision has been an important concern in the field of visual prostheses. Alongside the development of prosthetic device design and stimulus encoding methods, researchers have explored possible applications of computer vision by simulating visual perception under prosthetic vision. Effective image processing in computer vision is performed to optimize artificial visual information and improve the ability to restore various important visual functions in implant recipients, allowing them to better meet their daily demands. This paper first reviews the recent clinical implantation of different types of visual prostheses and summarizes the artificial visual perception of implant recipients, focusing in particular on its irregularities, such as dropout and distorted phosphenes. Then, the important aspects of computer vision in the optimization of visual information processing are reviewed, and the possibilities and shortcomings of these solutions are discussed. Finally, the development directions and key issues for improving the performance of visual prosthesis devices are summarized.
Keywords: artificial vision; computer vision; dropout and distorted phosphenes; optimization strategy; visual prosthesis
Year: 2022 PMID: 36081002 PMCID: PMC9460383 DOI: 10.3390/s22176544
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
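The image-to-stimulation conversion described in the abstract can be illustrated with a minimal, hypothetical sketch (all names and parameters here are invented for illustration; real stimulus encoders are far more elaborate): block-average a grayscale image down to the electrode grid and quantize each block to a few discrete stimulation amplitude levels.

```python
# Hypothetical sketch of image-to-stimulation-pattern encoding:
# block-average a grayscale image onto an electrode grid, then quantize
# each electrode's drive to a small number of discrete amplitude levels.

def encode_to_grid(image, grid=6, levels=4):
    """image: 2D list of grayscale values in [0, 255].
    Returns a grid x grid list of amplitude levels in {0..levels-1}."""
    h, w = len(image), len(image[0])
    bh, bw = h // grid, w // grid
    pattern = []
    for gy in range(grid):
        row = []
        for gx in range(grid):
            block = [image[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            mean = sum(block) / len(block)
            row.append(min(levels - 1, int(mean / 256 * levels)))
        pattern.append(row)
    return pattern

# A bright square on a dark background maps to high levels in its blocks.
img = [[255 if 4 <= x < 8 and 4 <= y < 8 else 0 for x in range(12)]
       for y in range(12)]
pattern = encode_to_grid(img, grid=6, levels=4)
```

In a real device, each quantized level would then be translated into pulse amplitude or duration for the corresponding electrode.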
Progress on clinical trials of visual prosthetics.
| Implant Site | Visual Prosthesis | Electrode Number | Visual Acuity | Clinical Trial Number | Status |
|---|---|---|---|---|---|
| Epiretinal | Argus® II [ ] | 60 | 20/1260 | NCT03635645 | Received the CE mark in 2011 and FDA approval in 2013; two patients were able to identify a subset of the Sloan letters. |
| | IRIS II [ ] | 150 | NA | NCT02670980 | Ten patients were evaluated on functional visual tasks for up to 3 years. |
| | IMI [ ] | 49 | NA | NCT02982811 | Follow-up of 20 patients with faint light perception for 3 months. |
| Subretinal | Alpha-AMS [ ] | 1500 | 20/546 | NCT03629899 | Received the CE mark in 2013; patients achieved an optimal visual acuity of 20/546. |
| | PRIMA [ ] | 378 | 20/460 | NCT03333954 | Implantation of PRIMA in five patients started in 2017, with 36 months of follow-up. |
| | Suprachoroidal retinal prosthesis [ ] | 49 | NA | NCT05158049 | Seven implant recipients were assessed for vision, orientation, and movement. |
| | Bionic Eye [ ] | 44 | NA | NCT03406416 | The safety of the device was evaluated in 2018 by implantation in four subjects; the electrode–retinal distance increased and impedance remained stable after the procedure, with no side effects. |
| Intracortical (optic cortex) | ORION [ ] | 60 | NA | NCT03344848 | Implantation in six patients without photoreceptor function was approved by the FDA in 2017, and each implant recipient received a 5-year follow-up; data from the relevant trials are not yet publicly available. |
| | ICVP [ ] | 144 | NA | NCT04634383 | Five participants, tested weekly for 1 to 3 years, were assessed for electrical-stimulation-induced visual perception. |
| | CORTIVIS [ ] | 100 | NA | NCT02983370 | After receiving FDA approval, it was implanted in five patients in 2019 for six months. |
Data source: National Center for Biotechnology Information official website.
Figure 1. (A) An individual letter recognition task (dark environment). The display shows the letters in white Century Gothic font on a black background, and the monitor next to it shows the camera view (V) and the array (A) in real time. (B) Illustration of the difference between the electrode activation maps in the standard and scrambled modes when the camera views the letter “N”. In the scrambled mode, the correspondence between the real positions of the phosphenes and the stimulus positions on the array was randomized (adapted with permission from Ref. [62]; copyright 2011 Royal Australian and New Zealand College of Ophthalmologists).
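The scrambled mode in panel (B) can be mimicked by randomly permuting the electrode-to-phosphene correspondence. A minimal sketch, with an invented array size rather than the actual device layout:

```python
import random

def scrambled_mapping(n_electrodes, seed=0):
    """Return a dict electrode -> phosphene index.
    Standard mode is the identity mapping; scrambled mode randomly
    permutes it, as in the scrambled condition of the letter task."""
    targets = list(range(n_electrodes))
    random.Random(seed).shuffle(targets)
    return dict(enumerate(targets))

standard = {e: e for e in range(24)}   # 24 electrodes: a made-up count
scrambled = scrambled_mapping(24, seed=0)
# Every phosphene is still driven by exactly one electrode, but the
# spatial correspondence with the camera view is destroyed.
```

The permutation preserves the number of active phosphenes, which is why performance differences between the two modes isolate the contribution of spatial correspondence.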
Figure 2. Retinal ganglion cells with 90-degree axonal curvature. (A) Diffuse streaky phosphenes produced by stimulation of distant retinal ganglion cells through the overlying axons. (B) Direct stimulation of the retinal ganglion cells beneath the electrodes produces punctate phosphenes (cited from [34]).
Figure 3. Schematic diagram of poor phosphenes induced by CORTIVIS electrode implantation. (A) Simultaneous stimulation of four electrodes arranged in a square may produce the corresponding perception. (B) Immediately after implantation, the induced phosphenes may give a poor percept of objects, such as the letter “E” in the figure; however, appropriate learning and rehabilitation strategies can help to improve the poor perception (adapted from [66]).
Figure 4. Schematic diagram of a caricatured human face. (A) How faces are represented in the brain, and how this explains the improved performance with caricaturing. The dimensions coded on the axes remain unknown, so they might represent the width of the face or other variables. (B) Examples of caricatured faces: facial features are altered, e.g., the higher the degree of caricature, the thicker the lips become. (C) The leftmost image is the face after 60% caricature; the three images from right to left are phosphene images with random dropout at different resolutions (adapted from [78]).
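Caricaturing of the kind shown in Figure 4 is commonly modeled as exaggerating a face's deviation from an average face in some feature space. A hedged sketch of that idea, with placeholder feature dimensions (as the caption notes, the actual coded dimensions are unknown):

```python
def caricature(face, mean_face, degree):
    """Exaggerate a face's deviations from the mean face.
    degree = 0.0 reproduces the face; 0.6 corresponds to a 60% caricature."""
    return [m + (1.0 + degree) * (f - m) for f, m in zip(face, mean_face)]

# Placeholder features, e.g., face width, lip thickness, eye spacing.
mean_face = [10.0, 4.0, 2.0]
face      = [11.0, 5.0, 1.5]
car = caricature(face, mean_face, 0.6)   # each deviation scaled by 1.6
```

Features above the mean are pushed further above it and features below the mean further below, which is why lips that are already thick become thicker with increasing caricature degree.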
Figure 5. Two optimization strategy procedures (cited from [87]).
Figure 6. Principle of the nearest-neighbor-search optimization of Chinese characters, marking the ideal location of the induced phosphene, the location of the actually induced phosphene, the location of the phosphene induced by another electrode, and the shortest distance between them (cited from [87]).
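The nearest-neighbor search sketched in Figure 6 pairs an ideal phosphene position with the closest actually induced phosphene. A minimal illustration with invented coordinates (not taken from [87]):

```python
import math

def nearest_neighbor(ideal, actual_points):
    """Return (index, distance) of the actual phosphene closest to the
    ideal phosphene position -- the shortest distance in Figure 6."""
    dists = [math.dist(ideal, p) for p in actual_points]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]

ideal_pos = (2.0, 2.0)                # where the phosphene should appear
actual = [(2.5, 2.0), (5.0, 1.0)]     # where phosphenes actually appeared
idx, d = nearest_neighbor(ideal_pos, actual)
```

Repeating this search for every stroke point of a character yields the corrected electrode assignment that the recognition experiments evaluate.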
Figure 7. The principle of optical-illusion-based phosphene completion. z is arbitrary Gaussian noise and M is the dropout part; in the binary mask, 0 indicates a dropped-out phosphene position and 1 a retained phosphene position. The remaining symbols denote the input image, the image of ideal phosphenes, the mapping obtained from the Gaussian noise z that comes closest to the real generated image, the optimal solution, and the mapping best suited to represent the dropout part (adapted from [93]).
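The completion scheme in Figure 7 searches the latent noise z for a generator output that best matches the retained phosphenes under the binary mask. A toy sketch of that objective, with a fixed linear "generator" standing in for the trained network of [93] and a crude random search instead of gradient descent:

```python
import random

def masked_loss(gen_out, target, mask):
    """Reconstruction error computed only where mask == 1 (retained
    phosphenes); mask == 0 marks dropped-out positions to be filled in."""
    return sum(m * (g - t) ** 2 for g, t, m in zip(gen_out, target, mask))

def toy_generator(z):
    # Stand-in for G(z); the real model is a trained generative network.
    return [z[0], z[0] + z[1], z[1]]

target = [1.0, 3.0, 2.0]   # ideal phosphene image (flattened)
mask   = [1, 1, 0]         # last phosphene dropped out

# Random search over z for the best-matching generator output.
best_z, best_loss = None, float("inf")
rng = random.Random(0)
for _ in range(5000):
    z = [rng.uniform(-4, 4), rng.uniform(-4, 4)]
    loss = masked_loss(toy_generator(z), target, mask)
    if loss < best_loss:
        best_z, best_loss = z, loss

completed = toy_generator(best_z)  # dropped-out position filled from z
```

Because the loss is evaluated only on retained positions, the generator is free to hallucinate plausible content at the dropped-out positions, which is the "optical illusion" effect the figure describes.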
Image processing optimization on visual prostheses.
| Visual Tasks | Optimization Methods | Array | Distortion | Dataset | Evaluation | Results |
|---|---|---|---|---|---|---|
| Optimization | Significance amplification window [ ] | no | no | self-construction | Subject preference | Subjects selected the significance amplification window as the most helpful method. |
| | VJFR; SFR; MFR [ ] | no | no | self-construction | Recognition accuracy | The recognition accuracies of VJFR-ROI, SFR-ROI, and MFR-ROI were 52.78 ± 18.52%, 62.78 ± 14.83%, and 67.22 ± 14.45%, respectively. |
| Face Recognition | Histogram equalization enhancement [ ] | no | no | self-construction | Algorithm runtime | Real-time face detection at low resolution (30 fps). |
| | Caricatured human face [ ] | yes | yes | 26 faces | Recognition accuracy | Correct recognition rates of 53% and 65% were obtained with old and new faces, respectively. |
| | FaceNet [ ] | no | no | self-construction | Recognition accuracy | The average face recognition accuracy of the subjects exceeded 77.37%. |
| | Sobel edge detection and contrast enhancement techniques [ ] | no | no | self-construction | Recognition accuracy, response time | The average recognition accuracies at 8 × 8, 12 × 12, and 16 × 16 resolutions were 27 ± 12.96%, 56.43 ± 17.54%, and 84.05 ± 11.23%, respectively; the average response times were 3.21 ± 0.68 s, 0.73 s, and 1.93 ± 0.53 s. |
| | F2Pnet [ ] | yes | no | AIRS-PFD | Individual identifiability | Mean individual identifiability of 46% at low resolution with dropout. |
| Character Recognition | NNS and expansion method [ ] | yes | yes | Commonly Used Chinese Character Database | Recognition accuracy | The irregularity index reached 0.4, and the average recognition accuracy of the subjects after using the correction method was over 80%. |
| | Threshold judgment [ ] | no | no | Standardized MNREAD reading test provided by Dr. G.E. Legge | Reading speed | The reading speeds of subjects using 6 × 6 and 8 × 8 resolutions reached 15 words/min and 30 words/min. |
| | Projection and NNS [ ] | yes | yes | Commonly used modern Chinese characters (the first 500 in the statistical table) | Recognition accuracy | The recognition accuracy of subjects using the NNS method exceeded 68%. |
| | Directed phosphenes [ ] | yes | no | N, H, R, S | Recognition accuracy | The average recognition accuracy of the subjects was 65%. |
| | SP [ ] | no | no | 26 English letters; 40 Korean letters | Recognition accuracy | After SP, the character recognition accuracy of the subjects exceeded the passing line (60%). |
| | Checkerboard-style phosphene-guided walking [ ] | no | no | RGB-D camera capture | NA * | NA * |
| | Top-down global contrast significance detection [ ] | no | | self-construction | Percentage of correctly completed tasks (PC), completion time (CT), head movements in degrees (HMID) | The mean PC of subjects in the single task was 88.72 ± 1.41%, mean CT was 41.76 ± 2.9 s, and mean HMID was 575.70 ± 38.53°; the mean PC of subjects in the multitask was 84.72 ± 1.41%, mean CT was 40.73 ± 2.1 s, and mean HMID was 487.38 ± 14.71°. |
| Object Recognition | GBVS and edge detection [ ] | no | no | self-construction | Recognition accuracy | The average recognition accuracy of the subjects was 70.63 ± 7.59% for single-object recognition and 75.31 ± 11.40% for double-target recognition. |
| | Generative adversarial network model [ ] | yes | yes | ETH-80 | Recognition accuracy | The subjects achieved an average recognition accuracy of 80.3 ± 7.7% for all objects. |
| | SIE-OMS [ ] | no | no | Public indoor scenes dataset [ ] | Recognition accuracy | The object recognition correct rate reached 62.78%; the room recognition correct rate reached 70.33%. |
| | Mask-RCNN layers [ ] | no | no | self-construction | Percentage of correctly completed tasks (PC) | Subjects achieved a mean PC of 87.08 ± 1.92% in the object test task in the description scene and a mean PC of 60.31 ± 1.99% in the description scene content test. |
| | InI-based object segmentation [ ] | yes | no | self-construction | NA * | NA * |
| | Improved SAPHE algorithm [ ] | no | no | Captured directly with the camera | Recognition accuracy (RA) | The average RA of the subjects was 86.24 ± 1.88%. |
| | Depth and edge combinations [ ] | no | no | self-construction | Success rate | The success rate for subjects with depth and edge was over 80%. |
* NA indicates that no assessment indicators or experimental assessment results were reported.
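Several table entries combine Sobel edge detection with contrast enhancement before phosphene rendering. A pure-Python sketch of the Sobel gradient magnitude, using the generic textbook 3 × 3 kernels rather than the exact pipeline of the cited study:

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels.
    img: 2D list of grayscale values; border pixels are left at 0."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# Vertical step edge: the response is strong along the boundary column
# and zero in the flat regions on either side.
img = [[0] * 4 + [255] * 4 for _ in range(6)]
edges = sobel_magnitude(img)
```

In the simulated-prosthetic-vision studies, the resulting edge map would then be contrast-stretched and downsampled to the phosphene grid, so that the few available phosphenes are spent on object contours rather than uniform regions.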