Nadia Paraskevoudi, John S. Pezaris.
Abstract
The visual pathway is retinotopically organized and sensitive to gaze position, leading us to hypothesize that subjects using visual prostheses incorporating eye position would perform better on perceptual tasks than with devices that are merely head-steered. We had sighted subjects read sentences from the MNREAD corpus through a simulation of artificial vision under conditions of full gaze compensation and head-steered viewing. With 2000 simulated phosphenes, subjects (n = 23) were immediately able to read under full gaze compensation and were assessed at an equivalent visual acuity of 1.0 logMAR, but were nearly unable to perform the task under head-steered viewing. At the largest font size tested, 1.4 logMAR, subjects read at 59 WPM (50% of normal speed) with 100% accuracy under the full-gaze condition, but at 0.7 WPM (under 1% of normal) with below 15% accuracy under head-steering. We conclude that gaze-compensated prostheses are likely to produce considerably better patient outcomes than those not incorporating eye movements.
Year: 2021 PMID: 34045485 PMCID: PMC8160142 DOI: 10.1038/s41598-021-86996-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Gaze conditions and the problem created by ignoring eye position. In the general case, where the eyes are not used to image the external world in a visual prosthesis, there is potential for the camera axis, typically steered by head position, and the gaze location, determined by the combination of head position and eye position within the head, to be out of agreement. This figure depicts three conditions, by row: the first where the gaze (blue) and camera (green) are aligned, pointing forward (top row); the second where the head and camera are in the same positions, but the eyes have deviated to the left (middle row); and the third where the head and camera have been turned to the left, but the eyes have counter-rotated to the right such that gaze is once again straight ahead (bottom row). The second column from the left shows the image that the camera sees. The third column depicts what is perceived when the prosthesis includes compensation for gaze location (Head and Eye, or Full Gaze), with congruent stimulation resulting in a perception that is spatially stable. The fourth column depicts what is perceived when the prosthesis delivers the image from the uncompensated camera (as in head-steered devices), but the brain continues to apply its knowledge of eye position, creating spatially incongruent conditions whenever the camera and eyes are not pointing in the same direction.
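The camera/gaze geometry in the caption above reduces to simple arithmetic. The following is our own one-dimensional sketch (not code from the paper; hypothetical function names, coordinates in degrees of visual angle): the brain always interprets a retinal location by adding its knowledge of gaze position, so a percept is spatially stable only if the camera axis already includes eye position.

```python
def camera_axis(head, eye, gaze_compensated):
    """Direction the scene camera effectively points (degrees)."""
    return head + eye if gaze_compensated else head

def perceived_location(target, head, eye, gaze_compensated):
    # The camera images the target relative to its own axis, and stimulation
    # places that image at a fixed retinotopic location...
    retinal = target - camera_axis(head, eye, gaze_compensated)
    # ...but the brain always adds its own knowledge of gaze (head + eye).
    return retinal + head + eye

# Middle row of Figure 1: head straight, eyes deviated 5 degrees left.
print(perceived_location(10, 0, -5, gaze_compensated=True))   # 10: stable percept
print(perceived_location(10, 0, -5, gaze_compensated=False))  # 5: shifted by eye deviation
```

With gaze compensation the perceived location equals the true target location for any eye position; without it, the percept is displaced by exactly the eye-in-head deviation, which is the incongruence depicted in the fourth column.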
Figure 2. Population reading accuracy and speed. Reading accuracy (left) and speed (right) are shown for the three viewing conditions, with the population median as lines, the 16/84 percentile range as dark color, and the 5/95 percentile range as light color. Performance is shown across the range of font sizes, measured in percent of correctly read words for accuracy and number of correctly read words per minute for speed. The three conditions are Normal, where text is shown unadulterated on the screen (blacks); Full Gaze, where text is shown in a simulation of phosphene vision with full gaze compensation of the scene camera (reds); and Head Only, where text is shown in phosphene vision with the scene camera steered only by head motions (oranges). In both phosphene-view cases, the phosphene locations are locked to retinotopic location based on the instantaneous measured gaze position; we expect phosphenes for all visual prostheses to be stable in the retinotopic coordinate system (see Introduction). In the Full Gaze condition, the simulated visual prosthesis is assumed to incorporate a gaze tracker that provides rapid compensation for eye movements, translating the scene camera image as if it were being steered by the eyes. In the Head Only condition, gaze compensation of the scene camera is disabled, rendering it steerable only through head motion; this reflects the operation of many contemporary visual prosthesis designs.
Reading accuracy and speed results.
| Measurement | Viewing condition | logMAR 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 |
|---|---|---|---|---|---|---|---|
| Accuracy (% correct) | Normal | | | | | | |
| | Full gaze | | | | | | |
| | Head only | | | | | | |
| Speed (WPM) | Normal | | | | | | |
| | Full gaze | | | | | | |
| | Head only | | | | | | |
Values for reading accuracy in percent correct and reading speed in words per minute (WPM) are given for the three viewing conditions over the population of 23 subjects. Data are presented by font size (logMAR) and given as median values with differences to the 16th and 84th percentiles. See Fig. 2 for a graphical presentation of these data.
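The summary statistic used in the table, a median with differences to the 16th and 84th percentiles, can be computed directly with NumPy. This is a sketch with made-up speed values, not data from the study:

```python
import numpy as np

# Hypothetical per-subject reading speeds (WPM) for one condition and font size.
speeds = np.array([52, 59, 61, 55, 63, 58, 60])

med = np.median(speeds)
p16, p84 = np.percentile(speeds, [16, 84])  # default linear interpolation

# Report as median with asymmetric differences to the 16th/84th percentiles.
print(f"{med:.0f} (+{p84 - med:.1f} / -{med - p16:.1f}) WPM")  # → 59 (+2.1 / -4.1) WPM
```

For a normal distribution the 16th to 84th percentile span corresponds to roughly ±1 standard deviation around the median, which is why this range is a common robust analogue of mean ± SD.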
Figure 3. Performance in first versus second mini-block. Performance on reading accuracy (top row) and reading speed (bottom row) for the three viewing conditions, Normal (black), Full Gaze compensation (red), and Head Only steering (orange), is shown for the first versus the second mini-block in each condition. All conditions had two mini-block presentations, each of which included one trial at every font size. Dots show the performance of matched stimulus conditions on the two mini-blocks. Unfilled circles show the population mean of the scattergrams. For the Normal condition, reading accuracy did not produce meaningful data, as accuracies were all at 100%.
Figure 4. Example gaze and head traces from one subject. Gaze (left column) and head (right column) position traces are shown for one of the most capable subjects overall, reading text at 1.4 logMAR (the largest size) under the three viewing conditions, Normal (top row), Full Gaze (middle row), and Head Only (bottom row). For each trace, the start is shown by an unfilled circle and the end by a filled circle. Trial length, reading accuracy, and reading speed are shown underneath each pair of plots: the first two trials (top two rows) took only a few seconds to complete, while the third trial (bottom row) took over 8 min; for Head Only, the values are substantially better than the population means, but a similar dichotomy in trial completion time was typical across the population. Eye movements during the Normal and Full Gaze conditions follow a typical scan path for reading the three lines of the MNREAD sentences; head position is nearly motionless, also typical, even though the subject was not instructed to hold their head still. Eye movements in the Head Only condition reflect stereotypical looping center-out-and-back movements, while head position reflects attempts to scan the scene. This looping behavior is triggered by an eye movement to foveate a portion of text the subject wishes to view, followed by a realization that the text tracks their gaze location (Fig. 9), and a subsequent saccade back to the center of the screen.
Figure 9. Screen captures from experimental conditions. Example screenshots are shown from the three conditions for an example sentence at the largest font size, 1.4 logMAR. The three columns correspond to the Normal (left column), Full Gaze (middle), and Head Only (right) conditions. The three rows correspond to three different alignment conditions for gaze (green circle) and head (blue cross) positions: the first (top row) contains the stimuli displayed under the three conditions when both the gaze and head positions are straight ahead, at the center of the screen, where the subject is looking at the word /her/; the second (center) contains stimuli when the gaze is left (eyes rotated left) but the head is straight, such that the subject's gaze is on /water/ but the camera remains directed at /her/; the third (bottom) is for the gaze straight (eyes counter-rotated right) and head left, such that the gaze is again on /her/ but the camera is on /water/. When the eyes are deviated from straight ahead in the head, the resulting disagreement between gaze and head position creates spatially incongruent stimuli in the Head Only condition; these were found to be highly disconcerting to subjects and resulted in very poor performance compared to the Full Gaze condition, which is sensitive to eye position.
Figure 8. Phosphene pattern. The phosphene pattern used in this experiment contained 2000 phosphenes in total, spread over the entire visual field in a center-weighted pattern that follows the natural profile of visual acuity from the center of gaze to the periphery [5, 67], as seen in the left image. During the task, the phosphenes that fell on the subject monitor were simulated. As the pattern is gaze-locked in retinotopic coordinates, it shifted with gaze movements during the simulation, and the number of phosphenes falling on the monitor varied accordingly; that number was maximal, at about 1200 phosphenes, when the gaze position was straight ahead, as seen in the middle image. In phosphene-view mode, those phosphenes were used as a filter on the image presented to the subject (see Figs. 7, 9). The central part of the visual field, which carries the highest density of phosphenes, is shown in the right image. For phosphene patterns in this class, which model thalamic visual prostheses under development in our laboratory, the central 10 degrees typically contains about one-quarter of the total phosphenes [51].
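A center-weighted pattern of this kind can be sampled by drawing eccentricities with density falling off toward the periphery. The paper's pattern follows the acuity profile of refs. [5, 67]; the power-law draw below is an ad hoc stand-in chosen purely for illustration, tuned so that roughly a quarter of phosphenes land in the central 10 degrees:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000                                   # total phosphene count, as in the paper

# Hypothetical eccentricity draw over a 90-degree hemifield radius, heavily
# weighted toward the fovea (exponent 2.0 is our assumption, not the paper's).
ecc = 90.0 * rng.random(N) ** 2.0          # eccentricity in degrees
theta = 2 * np.pi * rng.random(N)          # uniform polar angle
x, y = ecc * np.cos(theta), ecc * np.sin(theta)   # phosphene field coordinates

# Central 10 degrees = radius 5 degrees around the center of gaze.
central_frac = np.count_nonzero(ecc < 5.0) / N
print(f"{central_frac:.0%} of phosphenes in the central 10 degrees")
```

With this exponent the expected central fraction is sqrt(5/90) ≈ 24%, close to the "about one-quarter" figure quoted for the laboratory's thalamic-prosthesis patterns, though the true distribution there is derived from the acuity profile rather than a power law.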
Figure 5. Apparatus. An apparatus was constructed to drive the simulation under the various viewing conditions. Subjects are seated at a table on which rests a high-refresh-rate LCD monitor (ASUS PG279Q, running at 100 Hz refresh), a gaze-tracking camera (SR Research EyeLink 1000+), a head-position-tracking camera (TrackHat, opentrack), and a microphone. As part of the head tracker, subjects wear a set of headphones that are used only to hold a three-LED arbor (Natural Point TrackClip Pro) so that it faces the head-tracking camera. Simple sentences are presented on the subject screen in a simulation of artificial vision using phosphenes that are stabilized on the subject's retina based on the instantaneous report from the gaze tracker. A virtual scene camera is steered either by gaze location (Full Gaze condition) or by head position (Head Only condition) in the simulation. As the subjects read the sentences aloud, scores are kept by the experimenter on sheets that are not visible to the subjects. The vision simulation and experimental control are run from a single computer (Lenovo Tiny M710p).
Figure 6. Overview of experimental design and procedure. (Left) Each subject's session started with a coarse assessment of visual acuity using a traditional Snellen chart. This test was to verify that subjects had largely normal vision, rather than to precisely measure their visual acuity, as the experimental requirements were well below ordinary acuity. Then, after subjects were seated in front of the apparatus, a quick calibration of the gaze-position system was performed, typically lasting 2–3 min. Finally, the experiment itself proceeded with the presentation of six mini-blocks in sequence, starting with a control-condition mini-block, followed by four experimental mini-blocks, and closing with a second control mini-block. (Right) The reading task used a trial-based structure in which each trial began with the subject fixating a central point using normal vision. After a brief pause, a simple three-line sentence was presented on the screen in one of the three conditions, depending on the mini-block: the Normal condition using natural vision, and the Full Gaze and Head Only conditions using the phosphene-view simulation to present the stimulus. In both phosphene-view conditions, the phosphenes were stabilized on the retina; in the Full Gaze condition, the simulated scene camera was steered by the combination of eye and head positions, whereas in the Head Only condition, the scene camera was steered only by head position. For all three conditions, the subjects read the sentence aloud as best they could and then fixated the Next Sentence dot to proceed to the next trial.
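The session structure above can be laid out as a simple trial list. The ordering of the four experimental mini-blocks shown here is one arbitrary example (the paper does not specify the order in this caption); each condition appears twice overall, and each mini-block contains one trial per font size:

```python
# Hypothetical sketch of the six-mini-block session from Figure 6.
FONT_SIZES = [0.9, 1.0, 1.1, 1.2, 1.3, 1.4]   # logMAR, per the table and Figure 7

session = (
    ["Normal"]                                          # opening control mini-block
    + ["Full Gaze", "Head Only", "Full Gaze", "Head Only"]  # order assumed, not specified here
    + ["Normal"]                                        # closing control mini-block
)

# One trial per font size within each mini-block.
trials = [(block, size) for block in session for size in FONT_SIZES]
print(len(trials))   # 36 trials: 6 mini-blocks x 6 font sizes
```

This layout matches Figure 3's note that every condition had two mini-block presentations, each including one trial at every font size.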
Figure 7. Stimulus conditions and font sizes. Example renderings of the text used during the simulations are shown for the Normal condition (bottom row), where text is rendered at the native resolution of the screen, and the phosphene-view conditions (top row), where text is shown as seen through a collection of 2000 phosphenes. The central part of the screen, approximately 10 degrees of visual angle across, is depicted for the six different font sizes used. Font sizes were calibrated to correspond to 0.9 logMAR (smallest) through 1.4 logMAR (largest). The central part of the phosphene pattern corresponding to the screen areas rendered here is shown in the upper-left square, with each phosphene appearing as a light blue Gaussian on a white background. When this pattern is used as a filter on the white text/black background, it results in the images in the top row, for gaze and camera positions both at the center of the screen. As the gaze location moves around in either of the phosphene-view conditions, the phosphene pattern is stabilized on the retina, but the scene camera is steered according to the testing condition of the mini-block (Full Gaze or Head Only).
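The filtering idea in the caption above, Gaussian phosphenes acting as windows onto the stimulus image, can be sketched as follows. This is our own illustration with made-up sizes and sigma, not the paper's rendering code; the gaze shift applied to the centers stands in for the retinotopic locking of the pattern:

```python
import numpy as np

def phosphene_mask(shape, centers, sigma=2.0):
    """Sum of 2-D Gaussians, one per phosphene center, clipped to [0, 1]."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    mask = np.zeros(shape)
    for cy, cx in centers:
        mask += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return np.clip(mask, 0.0, 1.0)

def phosphene_view(image, centers, gaze=(0, 0), sigma=2.0):
    """Filter an image through a gaze-locked phosphene pattern.

    The pattern is retinotopic, so every phosphene center shifts with the
    current gaze position before the mask is applied to the stimulus.
    """
    shifted = [(cy + gaze[0], cx + gaze[1]) for cy, cx in centers]
    return image * phosphene_mask(image.shape, shifted, sigma)

# White-text-on-black stimulus stand-in: a uniform bright patch.
stimulus = np.ones((20, 20))
view = phosphene_view(stimulus, centers=[(10, 10)], gaze=(0, 0))
```

With gaze at the origin, the image is visible only under the Gaussian at (10, 10); passing a nonzero `gaze` slides that window across the stimulus, which is exactly why the text appears to track the gaze in the Head Only condition when the camera does not follow the eyes.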