Flies and humans share a motion estimation strategy that exploits natural scene statistics.

Damon A Clark¹, James E Fitzgerald², Justin M Ales³, Daryl M Gohl⁴, Marion A Silies⁴, Anthony M Norcia⁵, Thomas R Clandinin⁴.

Abstract

Sighted animals extract motion information from visual scenes by processing spatiotemporal patterns of light falling on the retina. The dominant models for motion estimation exploit intensity correlations only between pairs of points in space and time. Moving natural scenes, however, contain more complex correlations. We found that fly and human visual systems encode the combined direction and contrast polarity of moving edges using triple correlations that enhance motion estimation in natural environments. Both species extracted triple correlations with neural substrates tuned for light or dark edges, and sensitivity to specific triple correlations was retained even as light and dark edge motion signals were combined. Thus, both species separately process light and dark image contrasts to capture motion signatures that can improve estimation accuracy. This convergence argues that statistical structures in natural scenes have greatly affected visual processing, driving a common computational strategy over 500 million years of evolution.

Year: 2014; PMID: 24390225; PMCID: PMC3993001; DOI: 10.1038/nn.3600

Source DB: PubMed; Journal: Nat Neurosci; ISSN: 1097-6256; Impact factor: 24.884

Introduction

The statistical distribution of light intensities across space is a core feature of any environment[1-3]. These spatial distributions can be sampled over time to extract information about visual motion, a critical behavioral cue for many animals. The dominant computational models of motion processing estimate motion by correlating light intensity between pairs of points separated in space and time or, equivalently, by measuring local motion energy[4,5]. These pair correlations provide information about the direction and speed of moving edges. However, natural scenes contain additional information about motion that these signals do not capture[6-8]. For example, the contrast polarity, being either dark or light, is also a fundamental feature of moving edges yet is explicitly discarded by pair correlations. Here we show that both the fly and human visual systems take advantage of this additional information, available in specific correlations between three points in space and time, to detect motion.

There are two dominant models of motion perception. The first of these is the Hassenstein-Reichardt Correlator (HRC), which computes spatiotemporal correlations directly by multiplying local contrast signals at two points in space, one at a later time than the other. These products are then summed in anti-symmetric fashion to produce an average signal whose sign and amplitude indicate the direction and magnitude of motion (Fig. S1)[4]. Motion energy, a second correlational model, begins with linear, oriented spatiotemporal receptive fields that are sensitive to particular directions of motion (Fig. S1)[5]. Subsequent circuit operations square and then sum these responses to produce a motion signal. Based on neural and behavioral measurements, motion energy models have been favored in vertebrates[9], while HRC models have been favored in invertebrates[10].
Nonetheless, the two models are sometimes mathematically equivalent[5,11], and both ultimately compute correlations only between pairs of points in space and time (see SI). Experiments using a variety of artificial stimuli have demonstrated that both vertebrates and invertebrates can detect motion even when there are no systematic correlations in intensity between pairs of points[12-16]. An optimal motion estimator would incorporate prior statistical information about the environment and its motion, and would compute many types of stimulus correlations to take advantage of higher-order statistics in moving natural scenes[6]. In particular, analysis of optimal estimators suggests that natural luminance asymmetries[2,17] would allow animals to estimate motion using triple correlations[6]. Here we show that two very different visual systems, those of flies and humans, employ triple correlations to estimate motion in a manner that distinguishes light and dark edges. These results suggest that the separate processing of dark and light in the visual pathways of many organisms can increase the fidelity of motion perception.
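
The pairwise computation attributed to the HRC above can be illustrated with a minimal sketch. It assumes discrete time with a one-frame delay standing in for the temporal filters of a real correlator, a binary random pattern as the stimulus, and circular translation. The names `hrc_response` and `translate` are illustrative and do not come from the paper.

```python
import random

def hrc_response(stimulus):
    """Antisymmetric pair correlation, averaged over space and time.

    stimulus[t][x] is the contrast at frame t and position x.
    Assumption of this sketch: the delay is exactly one frame and the two
    inputs are nearest neighbors. Positive output means rightward motion.
    """
    total, count = 0.0, 0
    for t in range(len(stimulus) - 1):
        now, later = stimulus[t], stimulus[t + 1]
        for x in range(len(now) - 1):
            rightward = now[x] * later[x + 1]  # left point earlier, right point later
            leftward = now[x + 1] * later[x]   # right point earlier, left point later
            total += rightward - leftward
            count += 1
    return total / count

def translate(pattern, shift_per_frame, n_frames):
    """Rigidly translate a 1-D pattern with circular wrap-around."""
    n = len(pattern)
    return [[pattern[(x - shift_per_frame * t) % n] for x in range(n)]
            for t in range(n_frames)]

rng = random.Random(0)
pattern = [rng.choice([-1.0, 1.0]) for _ in range(200)]
right = hrc_response(translate(pattern, 1, 20))   # pattern drifts rightward
left = hrc_response(translate(pattern, -1, 20))   # pattern drifts leftward
print(f"rightward drift: {right:+.2f}, leftward drift: {left:+.2f}")
```

The sign of the output follows the drift direction. Multiplying the whole pattern by -1 leaves the output unchanged, which is the sense in which pair correlations discard contrast polarity.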

Results

To demonstrate how the motion of natural scenes generates spatiotemporal correlations, we approximated full-field motion by rigid translations of natural images (Fig. 1ai, 1bi). Minimal motion energy and HRC-based models rely exclusively on information extracted from pairwise correlations across the image. One simple example of this correlation structure is the difference between rightward and leftward correlations (Fig. 1aii). In this case, the local correlation, on average, indicated the correct direction of motion (red areas in Fig. 1bii), but because of the variability within the image[18], this signal also suggested leftward motion in some regions (blue areas in Fig. 1bii). In this example, the standard deviation of the local motion signal, computed across pixels, was 3.6 times the mean. Spatiotemporal averaging can suppress this variability[18], but at the expense of resolution.

Figure 1

Multiple correlations signify natural image motion. Each row presents a comparison between correlational motion signatures. Columns present: (i) context for each comparison; (ii) properties of pairwise motion estimators; (iii) properties of diverging 3-point estimators; and (iv) properties of converging 3-point estimators. (ai) Motion is approximated by the rigid translation of natural images. (aii-aiv) Cartoon of the correlation structure that each estimator detects. (bi) Example natural image. (bii–biv) Pixelwise contributions to motion estimation are highly variable and differ across estimators. (ci) An ensemble of natural images. (cii–civ) The accuracy with which correlations convey motion is examined across this ensemble. The performance of each estimator is quantified through the Pearson’s correlation between the estimator output and the simulated velocity. We linearly combined estimators to quantify the improvements afforded by multiple correlational signals. The numbers above each bar denote the fractional increase with respect to the 2-point estimate. (d) Same as (c), but with signals spatiotemporally filtered to match motion processing in Drosophila. Error bars are standard deviations over cross-validating trials (see Online Methods).
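
The evaluation described in this caption (linearly combining estimators and scoring them by Pearson's correlation with the simulated velocity) can be sketched as follows. This is a toy version under stated assumptions, not the paper's pipeline: the "scene" is a synthetic 1-D periodic image with skewed contrast rather than an image from the natural image database, the weights are fitted and scored on the same trials rather than cross-validated, and the velocity spread of 0.5 pixels per frame is arbitrary. All function names are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def skewed_image(rng, n=128, width=5):
    """Toy periodic 1-D scene: smoothed noise, exponentiated to skew the contrast."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    smooth = [sum(noise[(x + k) % n] for k in range(width)) / math.sqrt(width)
              for x in range(n)]
    intensity = [math.exp(s) for s in smooth]
    mean = sum(intensity) / n
    return [i / mean - 1.0 for i in intensity]

def shifted(image, v):
    """Translate a periodic image by v pixels using linear interpolation."""
    n = len(image)
    out = []
    for x in range(n):
        lo = math.floor(x - v)
        frac = (x - v) - lo
        out.append((1 - frac) * image[lo % n] + frac * image[(lo + 1) % n])
    return out

def estimators(c1, c2):
    """Antisymmetrized pairwise and converging 3-point signals for two frames."""
    n = len(c1)
    pair = conv = 0.0
    for x in range(n):
        x1 = (x + 1) % n
        pair += c1[x] * c2[x1] - c1[x1] * c2[x]
        conv += c1[x] * c1[x1] * (c2[x1] - c2[x])
    return pair / n, conv / n

def fit_weights(p, q, y):
    """Least-squares weights (intercept absorbed by centering) for y ~ wp*p + wq*q."""
    n = len(y)
    mp, mq, my = sum(p) / n, sum(q) / n, sum(y) / n
    spp = sum((a - mp) ** 2 for a in p)
    sqq = sum((b - mq) ** 2 for b in q)
    spq = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    spy = sum((a - mp) * (c - my) for a, c in zip(p, y))
    sqy = sum((b - mq) * (c - my) for b, c in zip(q, y))
    det = spp * sqq - spq ** 2
    return (spy * sqq - spq * sqy) / det, (spp * sqy - spq * spy) / det

rng = random.Random(1)
velocities, pair_out, conv_out = [], [], []
for _ in range(400):
    image = skewed_image(rng)
    v = rng.gauss(0.0, 0.5)  # assumed velocity spread, in pixels per frame
    p, c = estimators(image, shifted(image, v))
    velocities.append(v)
    pair_out.append(p)
    conv_out.append(c)

w_pair, w_conv = fit_weights(pair_out, conv_out, velocities)
combined = [w_pair * p + w_conv * c for p, c in zip(pair_out, conv_out)]
r_pair = pearson(pair_out, velocities)
r_combined = pearson(combined, velocities)
print(f"pairwise only: r = {r_pair:.3f}; pairwise + converging: r = {r_combined:.3f}")
```

Because the weights are fitted in-sample, the combined correlation cannot fall below the pairwise one here. How large the gain is depends on the contrast statistics of the images, which is why the paper evaluates it on natural scenes with cross-validation.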

Here we consider two triple correlation structures involving three points in space and time, which we refer to as diverging (Fig. 1aiii) and converging (Fig. 1aiv). The diverging case incorporates two spatial points at the later time (one point diverges into two points), while the converging case incorporates two spatial points at the earlier time (two points converge onto one point). These two triple correlations maintain the local spatial and temporal resolutions of the comparable pair correlation. Like the pairwise estimator, these triple correlations were highly variable across the image, but their average signified the direction of motion (Fig. 1biii–biv). Importantly, the motion signals provided by these triple correlations incompletely overlapped with the motion signals derived from pair correlations (Fig. 1bii–biv). Thus, 3-point motion signals provide additional information about motion, beyond what can be obtained from the pairwise signal[6]. Because the accuracy of motion estimation is scene-dependent, one must determine whether motion estimators that capture specific spatiotemporal correlations perform reliably across an ensemble of scenes. We used a published natural image database[19] to determine how well pair and triple correlations can predict image velocity (Fig. 1c–d). We considered motion estimators with human-like spatial resolution (see Online Methods, Fig. 1ci) and with Drosophila-like spatiotemporal sampling (see Online Methods, Figs. 1di, S1). In both cases, we approximated the distribution of image velocities with a Gaussian having zero mean, using standard deviations that were 5°/s and 90°/s respectively, comparable to estimated natural speeds[20,21]. The output of local pairwise motion estimators was weakly correlated with the image velocity (Fig. 1cii–dii). This correlation was improved by averaging two pairwise estimators that survey neighboring spatial points (Fig. 1cii–dii). 
The diverging and converging triple correlations were more weakly correlated with the velocity (Fig. 1ciii–civ,diii–div), but typically improved motion estimation when summed with the pairwise estimate using optimal weighting coefficients (Fig. 1ciii–civ,diii–div). Strikingly, for Drosophila-like sampling, the increased accuracy afforded by the converging 3-point motion estimator (Fig. 1div) exceeded that afforded by a neighboring pairwise estimator (Fig. 1dii). Because these two 3-point estimators sampled the same two spatial and temporal points as the pairwise estimator, this improvement is available without sacrificing either spatial or temporal precision. Importantly, these triple correlations require asymmetric contrast distributions for their functionality[6] (Fig. S2), thereby capitalizing on the strong asymmetries present in natural contrast distributions that are absent from many artificial stimuli (Fig. S2 and S3). Triple correlations, unlike pair correlations, can encode whether a moving edge is light or dark (Fig. 2). Light and dark edges are defined by whether light intensity at a single point in space increases (light edge, green in Fig. 2) or decreases (dark edge, violet in Fig. 2) as the edge moves across that point in space. The net pairwise correlation motion signal was positive when either edge type moved to the right and was negative when either edge type moved to the left (Fig. 2a). Because moving light and dark edges induce the same pair correlations, the contrast polarity of the moving edge cannot be discerned from pairwise correlations alone (see also SI). Conversely, 3-point estimators do capture edge polarity information. The diverging 3-point estimator produced a positive signal to rightward-moving light edges and leftward-moving dark edges, and a negative signal to leftward-moving light edges and rightward-moving dark edges (Fig. 2b). 
Thus, triple correlations jointly encode the direction of motion and the contrast polarity of the moving edge. Critically, this joint encoding implies that the contrast polarity of each moving edge can be deduced in local regions of space from triple correlations once the direction of motion is determined. For example, when the motion is rightward, positive diverging triple correlations signal the presence of a light edge and negative diverging triple correlations signal the presence of a dark edge. We observed similar response patterns in pair and triple correlations when we simulated moving edges by translating natural scenes (Fig. 2c). Triple correlations are able to improve motion estimates in natural scenes (Fig. 1) precisely because in natural scenes, light and dark moving edges generate asymmetric triple correlation signals (see SI).
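
The joint encoding of direction and polarity can be checked on idealized edges with a short sketch. It assumes binary contrast (+1 light, -1 dark), nearest-neighbor points, and a one-frame delay; the overall sign of each 3-point signal is a convention of this sketch, chosen so that the diverging signal matches the sign pattern described in the text. The function names are illustrative.

```python
def moving_edge(direction, polarity, n_x=40, n_t=10):
    """Space-time contrast of one edge.

    direction: +1 rightward, -1 leftward. polarity: +1 light edge, -1 dark edge.
    Points the edge has already swept take the contrast `polarity`.
    """
    start = n_x // 4 if direction > 0 else 3 * n_x // 4
    frames = []
    for t in range(n_t):
        edge = start + direction * t
        frames.append([polarity if direction * (x - edge) <= 0 else -polarity
                       for x in range(n_x)])
    return frames

def net_signals(frames):
    """Rightward-minus-leftward pair, diverging and converging correlations."""
    pair = diverging = converging = 0
    for t in range(len(frames) - 1):
        a, b = frames[t], frames[t + 1]  # earlier and later frames
        for x in range(len(a) - 1):
            pair += a[x] * b[x + 1] - a[x + 1] * b[x]
            # Diverging: one point at the earlier time, two at the later time.
            diverging += a[x] * b[x] * b[x + 1] - a[x + 1] * b[x + 1] * b[x]
            # Converging: two points at the earlier time, one at the later time.
            converging += a[x] * a[x + 1] * b[x + 1] - a[x + 1] * a[x] * b[x]
    return pair, diverging, converging

results = {}
for direction, d_name in ((1, "rightward"), (-1, "leftward")):
    for polarity, p_name in ((1, "light"), (-1, "dark")):
        results[(d_name, p_name)] = net_signals(moving_edge(direction, polarity))
        pair, div, conv = results[(d_name, p_name)]
        print(f"{d_name:9s} {p_name:5s} edge: pair={pair:+d} diverging={div:+d} converging={conv:+d}")
```

In this sketch the pair signal changes sign only with direction, while both 3-point signals change sign with either direction or polarity, so knowing the direction is enough to read the edge polarity from a triple correlation.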

Figure 2

Triple correlations distinguish between light and dark moving edges. (a) A dark point becomes light when a light edge moves across the visual field (rows 1 and 3), and a light point becomes dark when a dark edge moves across the visual field (rows 2 and 4). We decompose the net pair correlation motion signal into four elements whose frequency of occurrence depends upon the motion. This net pair correlation motion signal reflects the direction of motion (compare rows 1 and 2 to rows 3 and 4) and is insensitive to whether the edge was light or dark (compare row 1 to row 2 or row 3 to row 4). (b) We similarly decompose the net diverging and converging triple correlation into four elements (shown for the diverging triple correlation). The sign of the net diverging triple correlation depends both on the contrast polarity of the edge and on the direction of motion (shown for rightward motion). Thus, triple correlations jointly encode the direction and contrast polarity of a moving edge. (c) Natural motion comprises both moving light edges and moving dark edges. Motion signals are associated with each moving edge, but only the 3-point motion signatures distinguish between edge contrast polarities.

The early insect visual system contains distinct substrates specialized for detecting moving light and dark edges[22,23]. Because triple correlations can improve motion estimation accuracy and can discriminate between the motion of light and dark edges, we next determined whether flies actually detect these signals. Following previous psychophysical approaches[14], we constructed “glider” stimuli that enforce positive or negative correlations of the same form as the pairwise, diverging, and converging correlations shown in Figure 1 (Fig. 3a and S4). Importantly, the 3-point “gliders” contained no net 2-point correlations, and vice versa. Thus, by construction, these glider stimuli separate the motion information contained in 3-point correlations from that specified by 2-point correlations (see Online Methods and Fig. S4).
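
One way to build a pattern with these properties is a parity rule, sketched below for the diverging 3-point case. This is an assumption-laden illustration in the spirit of the glider stimuli cited as [14]; the stimulus generation actually used is specified in the Online Methods, which are not reproduced here. The names `diverging_glider` and `mean_product` are hypothetical.

```python
import random

def diverging_glider(n_x, n_t, parity, rng):
    """Binary (+1/-1) frames in which c[t][x] * c[t+1][x] * c[t+1][x+1] == parity."""
    frames = [[rng.choice([-1, 1]) for _ in range(n_x)]]  # random first frame
    for _ in range(n_t - 1):
        prev = frames[-1]
        row = [rng.choice([-1, 1])]  # random seed value at the left boundary
        for x in range(n_x - 1):
            # Choose the next value so that the 3-point product equals `parity`.
            row.append(parity * prev[x] * row[x])
        frames.append(row)
    return frames

def mean_product(frames, offsets):
    """Average product of contrasts at the given (dt, dx) offsets."""
    max_dt = max(dt for dt, _ in offsets)
    max_dx = max(dx for _, dx in offsets)
    total, count = 0, 0
    for t in range(len(frames) - max_dt):
        for x in range(len(frames[0]) - max_dx):
            product = 1
            for dt, dx in offsets:
                product *= frames[t + dt][x + dx]
            total += product
            count += 1
    return total / count

rng = random.Random(2)
stimulus = diverging_glider(n_x=200, n_t=100, parity=-1, rng=rng)
triple = mean_product(stimulus, [(0, 0), (1, 0), (1, 1)])
pair_right = mean_product(stimulus, [(0, 0), (1, 1)])
pair_left = mean_product(stimulus, [(0, 1), (1, 0)])
print(f"enforced triple correlation: {triple:+.3f}")
print(f"pair correlations (right, left): {pair_right:+.3f}, {pair_left:+.3f}")
```

The enforced triple product equals the chosen parity at every location, while both pair correlations average to approximately zero, so a purely pairwise detector receives no consistent motion signal from such a pattern.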

Figure 3

Drosophila responds to triple correlations. (a) Binary spatiotemporal patterns, glider stimuli with 2- and 3-point contrast correlations, were presented to flies. Space-time plots for each of the 6 gliders, and an uncorrelated stimulus, are shown. (b) During the presentation, we measured flies’ turning in response to each glider. Positive rotational velocities represent turning in the direction of the ‘centroid’ of the pattern (to the right in the space-time plots in (a)). (c) One second periods of glider stimuli were interleaved with uncorrelated stimuli; the timing of the presentation of the gliders is denoted by the thick black bar. Response curves show the mean (solid line) and SEM (shading) over flies. (d) Mean turning velocities were computed for each glider by averaging over 0.5 s of the stimulus (gray bar in (c)). Turning responses are presented for wild-type Drosophila, alongside the predicted response of a Hassenstein-Reichardt Correlator (HRC) to each glider. N=12 in (c) and (d). ‘**’ denotes a difference from 0 at the p<0.01 level (two-tailed t-test); from right to left, the marked p-values are 4.4×10⁻³ (t₁₁=3.6), 6.0×10⁻⁷ (t₁₁=10.2), 8.2×10⁻⁴ (t₁₁=4.6), and 3.7×10⁻⁶ (t₁₁=8.5). Error bars show SEM.

We presented spatially homogeneous glider stimuli on panoramic screens arranged around tethered flies. In this apparatus, flies walked on an air-cushioned ball, which was tracked to monitor fly turning[23,24] (Fig. 3b). Flies respond to visual rotations by turning in the direction of motion (the optomotor response), thereby allowing the movement of the ball to provide a behavioral measure of the fly’s motion percept[25]. As predicted by the HRC, flies turned in one direction when presented with the positive 2-point correlations and in the opposite direction with the negative 2-point correlations (top, Fig. 3c–d)[4,23,26]. Remarkably, flies also turned in response to the diverging and converging 3-point gliders, with responses that approached 20% of the 2-point glider response (bottom, Fig. 3c–d). These 3-point glider stimuli are, by design, very different from natural motion, as they achieve their correlation specificity by averaging out all 2-point correlations, whereas natural motion contains both 2-point and 3-point correlations (Fig. 1). As with the 2-point stimuli, positive and negative correlations in 3-point gliders evoked turns in opposite directions. In other words, simply inverting the contrast of 3-point glider patterns inverted their perceived directions. Neither the canonical HRC nor the motion energy model predicts that flies would respond to 3-point glider stimuli[14] (Fig. 3d, bottom), and a recent modification to the HRC[27] also does not predict the measured responses (Fig. S4d). This behavioral response was not a generic consequence of arbitrary triple correlations in the stimulus, as flies responded only weakly to several other glider stimuli (Fig. S5).

Previous studies demonstrate that the disruption of large monopolar cell (LMC) function causes selective behavioral deficits in response to moving light or dark edges[22-24,28].
We therefore determined how the disruption of LMC function affects behavioral responses to the diverging and converging 3-point gliders by genetically suppressing synaptic output from the three LMCs (L1–L3) that have been associated with motion detection[24] (Fig. 4a). Control strains in which LMC function was normal all responded similarly to wild-type flies for each glider (Fig. 4b). When LMC function was genetically disrupted, responses to various glider stimuli increased, decreased, or inverted compared to the controls (Fig. 4b).

Figure 4

Detection of triple correlations associated with specific pathways in Drosophila. (a) Left: schematic of the inputs to the fly motion processing pathways. Signals from photoreceptors are relayed through the lamina monopolar cells L1, L2, and L3. Right: a temperature-inducible dominant negative suppressor of synaptic transmission (shi) was used to silence L1, L2 and L3 using cell-specific expression of Gal4 (L1 shown, red). (b) We examined the responses of these disrupted motion detectors to 3-point gliders. Responses are plotted relative to the 2-point positive glider response. The two control genotypes (Gal4/+ and +/shi) have all input pathways intact, but contain the genetic constructs for the experimental genotype (Gal4/shi). For the genotypes Gal4/shi, Gal4/+, +/shi, from top to bottom, N = (19, 14, 19), (18, 13, 19), (29, 16, 19), (22, 14, 19), and (17, 15, 19). Error bars are ± SEM. ‘*’ and ‘**’ represent p<0.01 and p<0.001 differences from both control genotypes (two-tailed t-test).

Triple correlations are differentially associated with moving light and dark edges (Fig. 2b), as are the three LMCs (L1–L3) that provide inputs to motion detecting circuits[22-24]. We thus tested whether 3-point correlation responses predicted the relative strength of responses to moving light and dark edges both in wild-type flies and in flies with disrupted motion pathway inputs. To do this, we used an edge selectivity index corresponding to the behavioral response to light edges minus the response to dark edges divided by their sum[22-24]. Each glider forces a particular correlation to occur consistently throughout the visual field (see Online Methods). We thus used the observed behavioral response of each genotype of flies to a glider stimulus to infer its sensitivity to the associated correlation. We then predicted the response of each genotype to moving light and dark edges as the appropriately weighted sum of its 2- and 3-point glider responses (Fig. 5), where the weighting scheme was determined by counting how often each glider’s correlational element appeared in each edge type (Fig. 5a and S6). Remarkably, we observed a high correlation between the edge selectivity predicted by the weighted responses to glider stimuli and the independent measurements of edge selectivity (Fig. 5b). This striking result suggests that these 3-point correlations are integral to edge selectivity in flies.
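
The prediction described above can be written compactly. The notation is introduced here for illustration and is not the paper's: $R_{\mathrm{L}}$ and $R_{\mathrm{D}}$ are the behavioral responses to light and dark edges, $r_k$ is a genotype's measured response to glider $k$, and $w_{k,e}$ is the relative abundance of glider $k$'s correlational element in edge type $e$, obtained by counting.

```latex
% Edge selectivity index (light minus dark, divided by their sum)
\mathrm{ESI} = \frac{R_{\mathrm{L}} - R_{\mathrm{D}}}{R_{\mathrm{L}} + R_{\mathrm{D}}}

% Glider-predicted response to edge type e \in \{\mathrm{L}, \mathrm{D}\},
% summing over the 2- and 3-point glider elements k
\hat{R}_{e} = \sum_{k} w_{k,e}\, r_{k}

% Predicted edge selectivity, compared against the measured ESI
\widehat{\mathrm{ESI}} = \frac{\hat{R}_{\mathrm{L}} - \hat{R}_{\mathrm{D}}}{\hat{R}_{\mathrm{L}} + \hat{R}_{\mathrm{D}}}
```

Any normalization of the weights or responses beyond what is stated in the text is left to the Online Methods.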

Figure 5

Triple correlations predict the edge selectivity of motion pathways. (a) The frequency of correlational elements in a moving edge depends on its contrast polarity and direction (see Fig. 2), and we compute the relative abundance of each correlation from the difference in frequency of each element in rightward versus leftward motion (see Fig. S6). The relative abundances of the four triple correlation elements differ between light and dark edges. (b) We used the relative abundance of each correlational element in each edge type (see also Fig. 2, S6) to weight and sum the response of each genotype to each correlational element (Fig. 4). This generated the glider-predicted responses to each edge type, from which we computed the predicted edge selectivity for each genotype. It correlated highly with the behaviorally measured edge selectivity (see Online Methods). Edge selectivity is computed to be the light minus dark edge responses divided by their sum. Error bars on points are ± SEM.

In the primate visual system, light and dark are differentially processed in a variety of ways[29]. We therefore examined whether humans have separate pathways to process the direction of motion for light and dark edges, and whether triple correlations are involved in human edge contrast selectivity. Instead of genetic manipulation, as in flies, we used differential adaptation. In a first experiment, we measured scalp EEG signals, and in a second experiment, we measured behavioral responses. We designed stimuli that independently manipulate both edge contrast polarity and motion direction[23]. In particular, we generated two complementary “opposing edge” stimuli in which every light edge moved in one direction and every dark edge moved in the opposite direction (Fig. S7, supp. movie M1). We denoted the stimulus where light edges moved to the right as “A” and the stimulus where light edges moved to the left as “B.” These stimuli were balanced for leftward and rightward motion and for positive and negative edge contrast polarity, but they were imbalanced in the compound feature that combines motion direction with edge polarity. Human participants were presented with either stimulus A or stimulus B as an adapting stimulus and then probed with a stimulus that rapidly interleaved short segments of A and B, designated A′ and B′ (Figs. 6a and S7, supp. movie M2), while EEG signals were recorded on the scalp (Fig. 6b). If light and dark edge motions are treated equivalently by neural circuitry, then the probe response should be independent of the adapter. If, however, A and B differentially adapt some neural population, then they would be expected to have complementary effects on the probe response. In particular, as each adapter corresponds to one half of the probe, selective adaptation should create a response that coincides with the rate of alternation between A′ and B′ in the probe. 
Since the two halves of the probe were constructed to be 180° out of phase, the differentially adapted responses should also be 180° out of phase. Indeed, we found that the two adapting stimuli evoked different departures from baseline in the EEG signal (Fig. 6c, S7) (n=7 participants), and the measured phase difference between them was 179±20° (Fig. 6d, S7).
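
The phase comparison can be illustrated with a single-frequency Fourier analysis. The sketch below uses two idealized, noise-free sinusoids at the 3 Hz alternation rate as stand-ins for the adapted responses; the sampling rate, amplitudes, and function names are assumptions of this example, and the actual EEG analysis is described in the Online Methods.

```python
import cmath
import math

def fourier_component(signal, sample_rate_hz, freq_hz):
    """Complex Fourier coefficient of `signal` at one frequency.

    Scaled so that a unit-amplitude sinusoid spanning whole cycles
    yields a coefficient of magnitude 1.
    """
    acc = 0j
    for k, value in enumerate(signal):
        acc += value * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    return 2 * acc / len(signal)

def phase_difference_deg(a, b):
    """Phase of coefficient a relative to b, wrapped to [0, 360)."""
    return math.degrees(cmath.phase(a / b)) % 360.0

sample_rate_hz, probe_hz = 300.0, 3.0
times = [k / sample_rate_hz for k in range(300)]  # one second, three full cycles
# Idealized responses after adapting to A and to B: opposite sign, unequal size.
after_a = [math.sin(2 * math.pi * probe_hz * t + 0.4) for t in times]
after_b = [-0.8 * value for value in after_a]

coef_a = fourier_component(after_a, sample_rate_hz, probe_hz)
coef_b = fourier_component(after_b, sample_rate_hz, probe_hz)
difference = phase_difference_deg(coef_a, coef_b)
print(f"amplitudes: {abs(coef_a):.2f}, {abs(coef_b):.2f}; phase difference: {difference:.1f} deg")
```

Complementary modulation of this kind appears as a phase difference of 180° at the alternation frequency regardless of the two amplitudes, which is why phase is the natural quantity to compare across adapting conditions.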

Figure 6

Humans differentially adapt to moving light and dark edges. (a) Schematic of adapter and probe stimulus paradigm (see Figure S4). Black box denotes the time interval used for analysis. (b) Scalp topography of the amplitude of the response at the A′/B′ alternation rate (3 Hz). The amplitude peaks near the occipital pole. (c) Time average of the response from the peak electrode to the probe stimulus under the two adaptation regimes. Response to the unadapted state obtained by probe presentation without the adapting stimulus has been subtracted from this signal (see Figure S4). The response to the probe shows complementary modulation by the adapting stimuli at the frequency of probe alternation (3 Hz). Gray area represents ± 1 SEM. (d) The within subject difference of phase and amplitude at 3 Hz between the two adapting conditions. Ellipse represents 1 SEM while the shaded wedge indicates the 95% confidence interval for the phase. N = 7 subjects in (c) and (d).

To compare these adaptation effects with those caused by adaptation to drifting sine-wave grating motion[30], we computed an adaptation index that quantified the fractional change in response due to adaptation[30] (see Online Methods). Previous experiments using drifting sine-wave gratings produced an adaptation index of 2.31 ± 0.52 (mean ± SEM). Here we observed an adaptation index of 1.71 ± 0.18 (mean ± SEM). The difference between these adaptation indices is small and not statistically significant, suggesting that these adaptation effects have a similar fractional magnitude. This is remarkable because the adapting stimuli in the current experiment contained no net motion, but generated direction-selective adaptation. Thus, these experiments demonstrate that the human visual system contains neurons that differentially adapt to the motion of light and dark edges, thereby suggesting that neural populations are differentially activated by the motion of these two edge types.

We next investigated whether edge-polarity-selective motion pathways in humans were associated with behavioral responses to triple stimulus correlations, as in flies. By asking participants to classify glider stimuli as moving leftward or rightward, we first reproduced previous psychophysical results demonstrating that humans perceive diverging and converging 3-point gliders as motion[14] (black bars, Figs. 7 and S8c, supp. movie M3). We then adapted edge selective motion pathways using opposing edge stimuli as adapters, and probed the perception of glider motion. Results were aggregated using the relative orientation of the adapter and probe (see Online Methods). These stimuli had no effect on converging glider percepts (gray bars Figs. 7 and S8c). However, these opposing edge adapters inverted the perceived direction of a specific glider (the negative left-diverging glider; Fig. 7).

Figure 7

Adaptation to moving light and dark edges differentially affects the perception of specific 3-point gliders. Subjects were presented all combinations of four types of adapter stimuli (left column), and 8 gliders (right hand panels), and asked to report the direction of perceived glider motion. Results for each of the 8 glider stimuli are shown, grouped by glider. The color of the bar corresponds to the adapting stimulus: static (black), opposing edges (gray), light edges only (green), or dark edges only (magenta). All stimuli were presented mirror-symmetrically, and responses were aligned to the direction shown in the left hand column. ‘*’, ‘**’, and ‘***’ indicate differences between conditions at p=1.6×10⁻³ (t₁₆=5.6), p=8.8×10⁻⁴ (t₁₄=6.2), and p=2.8×10⁻⁶ (t₁₄=10.2) (two-tailed t-test, Bonferroni corrected for 40 comparisons). N=9 subjects for static and opposing edge adaptation; N=7 for light and dark edge adaptation conditions.

As opposing edge adapters contain both light and dark edge motion, this adaptation effect could emerge from moving light edges, dark edges or both. To resolve this ambiguity, we generated two more types of adapting stimuli, each containing either light or dark moving edges. These stimuli were not motion balanced, and some participants reported a strong motion aftereffect that caused all responses to be in the opposite direction to the adapter (n=2 of 9, see Fig. S8d). Thus, we restricted subsequent analyses to those participants whose responses were not dominated by motion aftereffects (n=7 of 9). Light edge adaptation caused perceptual inversion of the same glider affected by opposing edge adaptation (green bars, Fig. 7). In contrast, dark edge adaptation had no effect on glider perception (magenta bars, Fig. 7). Because the light and dark edge adapters have different effects on glider perception, yet contain equivalent net motion signals, this specific pattern of adaptation cannot be explained by a simple motion aftereffect. Thus, these experiments demonstrate that at least one triple correlation is differentially involved in edge polarity-selective motion processing in humans.

Discussion

Our results indicate that both flies and humans extract the motion of light and dark edges via distinct processing pathways in a way that allows both organisms to exploit higher-order statistical correlations that are present in moving natural scenes. We demonstrated this in the detailed pattern of human and fly psychophysical responses, in the profile of human neural responses after adaptation, and in the behavioral responses of flies after genetic manipulation of underlying input channels. Our analysis of natural scenes showed that triple correlations provide information that can improve motion estimates without reducing spatial or temporal resolution. Such odd-ordered correlations are also required for discriminating between edge polarities. Building on previous work showing that visual input circuits in Drosophila are separately specialized for detecting moving light and dark edges[22,23], we showed that flies respond to triple correlations, and that the pattern of these responses predicted edge polarity selectivity across genotypes. Our EEG measurements found a neural correlate of polarity specific motion pathways in humans, consistent with previous psychophysical observations[31,32]. In addition, while humans had previously been shown to perceive motion in 3-point glider stimuli[14], we showed that behavioral responses to specific 3-point correlation stimuli are differentially adapted by light and dark moving edges. Thus, we have shown that higher-order correlations play central roles in distinguishing moving edge polarity in both flies and humans. Models of motion processing that are restricted to pairwise correlations (e.g. the motion energy model) have been extremely successful in explaining both perception and neural activity in areas V1 and MT[9,33,34]. Nevertheless, there is also strong evidence that such models cannot fully capture motion processing in either primates[14,15] or invertebrates[12,13,35]. 
For example, humans perceive robust motion in certain stimuli lacking pairwise correlations (“non-Fourier” motion)[15]. Moreover, experiments in non-human primates have demonstrated that non-Fourier motion can elicit direction-selective behavioral responses in the absence of direction-selective neural responses in area MT [36,37]. Several studies have shown that flies can detect certain non-Fourier motion cues[12,13,35]. This prior work utilized motion stimuli that included quadruple (4-point), or higher even-ordered correlations, but lacked triple correlations. As a result, models that account for these effects specifically detect quadruple correlations[35]. Our current results highlight that triple correlations are important for estimating motion when luminance distributions are asymmetric, as they are in natural scenes[38], and for encoding the contrast polarity of a moving edge. More generally, our results illustrate how non-Fourier motion cues that seem paradoxical in isolation can signify motion in natural environments[6]. Relatively minor changes to existing biological models of motion detection can provide access to high-order motion correlations. For example, the ON and OFF channels in the vertebrate retina treat contrast increments and decrements differently, in terms of both amplitude and kinetics[29]. This asymmetry, were it to be appropriately retained, could enhance motion estimation by giving downstream neurons access to higher-order correlations. Similarly, the existence of ON direction-selective retinal ganglion cells[39] shows that contrast polarity specific motion signals are present already in the retina. Interestingly, our demonstration that triple correlations both improve motion estimation in natural environments and associate with edge polarity specific pathways in fly and human visual systems suggests that brains might utilize separate ON and OFF processing channels to extract complex signatures of natural motion. 
Nevertheless, it is also possible that the brain explicitly computes higher-order correlations. A variety of machine vision algorithms make direct use of higher-order statistics to improve motion estimation[7,8,40]. Here we multiply three signals to build HRC-like models that compute triple correlations. The vertebrate motion energy model computes and squares local spatiotemporal frequency components to compute the motion signal. By extension, multiplying three frequency components can produce the bispectrum, which encodes triple correlations. Thus, simple generalizations of standard models that compute pairwise correlation enable the computation of triple correlations. Dark and light have long been considered to be perceptually distinct[29]. Our work extends studies in primates showing that motion processing has a component that retains information about edge contrast polarity[31,32,41]. The distinct processing of light and dark moving edges may reflect fundamental differences in the statistics of light and dark objects in the world. For example, luminance asymmetries in natural scenes imply that the contrast magnitude of light edges can vastly exceed the contrast magnitude of dark edges[2,17]. More complex asymmetries also exist. For instance, light and dark are correlated with distance in natural scenes, with near elements tending to be brighter than far ones[42]. Further, the asymmetry in magnitude between light and dark in natural scenes implies that light objects tend to have a smaller spatial extent than dark objects. Therefore, light edges tend to be quickly followed by dark ones, and processing light and dark edges separately could reflect ethological differences in the detection of light and dark edges as the leading or lagging edges of moving objects[43]. Overall, light and dark edges are not symmetric in natural environments, and treating each separately can enhance motion estimation. 
We have demonstrated that triple correlations can distinguish the motion of light and dark edges and can improve the accuracy of motion estimation without sacrificing spatiotemporal resolution. Although we have emphasized the utility of triple correlations for wide-field motion estimation, these properties might make triple correlations even more critical in behavioral contexts where light and dark signals hold special significance or where extensive spatiotemporal averaging is not possible. For example, insects track the motion of dark objects[44] and exhibit stereotyped escape responses when presented with looming visual stimuli[45]. The stimuli that elicit these behaviors have strong light-dark edge dependencies: a dark object moving across a light background always consists of a dark leading edge and light lagging edge[43], whereas the edges of looming objects all have the same contrast polarity and move in different directions. Object tracking is also distinct from wide-field motion estimation in humans[15] and could similarly exploit triple correlations to account for object polarity. In addition, depth estimation using binocular disparity has many similarities with motion estimation and also utilizes luminance polarity information through independent disparity-tuned mechanisms for light and dark features[46]. Interestingly, observers better estimate depth when the stimulus contains both light and dark information[46], possibly reflecting the aforementioned correlation between luminance and distance in natural scenes[42]. Consistent with these observations, neurons in macaque V1 show conjoint disparity and luminance tuning that reflects the pattern of environmental correlations between brightness and depth[47].
There is strong selective pressure to compute accurate motion estimates. The statistics of the terrestrial world, combined with biological and physical constraints on neural circuitry[48-50], shape the computations that underpin these motion estimates, resulting in the possibility of similar motion estimation solutions across diverse taxa. The common existence of light and dark edge selective pathways in flies and humans, along with the association of triple correlation computation with those pathways, points to a deep similarity between fly and human motion estimation strategies. Since chordates and arthropods diverged more than 500 million years ago, and their visual systems have very different architectures, it seems unlikely that these similarities are the result of a conserved algorithm derived from a common ancestor. Rather, these commonalities could reflect the purposeful use of high-order statistics to estimate motion, leading flies and humans to converge on remarkably similar computational strategies to process this critical cue.

METHODS

Natural Scene Statistics Analysis Methods

Correlation images

We began with natural images from van Hateren’s database (image size = 1024×1536 pixels, pixel size = one arc minute)[19]. We linearly converted each image to a contrast scale, C_i = (I_i − I_0)/I_0, where C_i is the ith pixel contrast, I_i is its intensity, and I_0 is the image’s average pixel intensity. The pair correlation image (Fig. 1bii), R_i, was the rightward-oriented product C_i(t) C_{i+Δ}(t + δ) minus its mirror-symmetric, leftward-oriented counterpart (Fig. S1c), where R_i denotes the ith pixel of the pair correlation image, C_i(t) denotes the image at time t, summations inside the pixel index denote horizontal shifts, Δ = 10 arcmin, and δ = 30 ms. The velocity of motion was v = Δ/δ ≈ 5.6°/s, so C_i(t) = C_{i+Δ}(t + δ). The diverging (Fig. 1biii) and converging (Fig. 1biv) triple correlation images, D_i and N_i, were computed in the same way from the three-point products C_i(t) C_i(t + δ) C_{i+Δ}(t + δ) and C_i(t) C_{i+Δ}(t) C_{i+Δ}(t + δ), respectively, where D_i and N_i denote the ith pixels of the diverging and converging triple correlation images. Symmetric color axes saturate one standard deviation from 0.
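As a rough illustration of the correlation images described above, the following Python sketch computes pair, diverging, and converging correlations along a single image row under rigid rightward translation. The function names, the wrap-around boundary handling, the example intensities, and the exact form of the mirror-symmetric subtraction (swapping the roles of pixels i and i+Δ) are assumptions made for this example, not details taken from the original analysis.

```python
def to_contrast(intensities):
    """Convert one row of intensities to contrast, C_i = (I_i - I_0) / I_0."""
    mean = sum(intensities) / len(intensities)
    return [(value - mean) / mean for value in intensities]


def correlation_row(c_now, shift):
    """Return (pair, diverging, converging) correlation rows.

    c_now : contrasts C_i(t) along one row.
    shift : offset Delta in pixels; the row translates rightward by `shift`
            pixels per time step, so C_{i+Delta}(t + delta) = C_i(t).
    Boundaries wrap around (an assumption of this sketch).
    """
    n = len(c_now)
    c_next = [c_now[(i - shift) % n] for i in range(n)]  # row at t + delta
    pair, diverging, converging = [], [], []
    for i in range(n):
        j = (i + shift) % n
        a0, b0 = c_now[i], c_now[j]    # C_i(t), C_{i+Delta}(t)
        a1, b1 = c_next[i], c_next[j]  # C_i(t+delta), C_{i+Delta}(t+delta)
        # Rightward-oriented product minus its mirror image (i <-> i+Delta).
        pair.append(a0 * b1 - b0 * a1)
        diverging.append(a0 * a1 * b1 - b0 * b1 * a1)
        converging.append(a0 * b0 * b1 - b0 * a0 * a1)
    return pair, diverging, converging


# Example: a light-dark asymmetric row (hypothetical intensities).
row = to_contrast([1, 1, 1, 9, 1, 1, 2, 1, 1, 1, 5, 1])
pair, diverging, converging = correlation_row(row, shift=1)
```

In this sketch, inverting the contrast of the row leaves the pair correlations unchanged and flips the sign of both triple correlations, which is the sense in which three-point products carry contrast polarity information that two-point products discard.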

Quantitative comparison of motion estimators

To consider the accuracy of motion estimation strategies built from raw image correlations, we converted images to contrast, randomly chose a row from an image, sampled the velocity from a zero-mean Gaussian distribution with 5°/s standard deviation, and computed local pair and triple correlations as above. We performed 10^7 simulations for Fig. 1c and S3 and 10^5 simulations for Fig. S2.
We also considered strategies with HRC-like spatiotemporal sampling. Since pixels were small relative to Drosophila’s sampling, we down-sampled each image to one-degree pixels by averaging. We converted images to contrast and then emulated photoreceptor blurring by filtering across rows with a Gaussian kernel (FWHM = 5.7°). We considered the central row of the filtered image as a one-dimensional image, c(x). Given a randomly chosen image and a velocity drawn from a zero-mean Gaussian with 90°/s standard deviation, we modeled photoreceptor responses by filtering the translating image in time with T and in space with h, where i indexes the photoreceptor, x_i is the position of the ith photoreceptor (spaced 5.1° apart), T is a causal exponential kernel (timescale = 10 ms), and h is a Gaussian kernel (FWHM = 5.7°)[51]. We applied reflective boundary conditions to generate images that covered the visual field. The general HRC multiplies the f-filtered response of one photoreceptor by the g-filtered response of its neighbor and subtracts the mirror-symmetric product (Fig. S1a). We took f(t) to be a low-pass filter and g(t) to be a high-pass filter, where τ = 30 ms[23], and g(t) is comparable to LMC responses[52]. The converging and diverging third-order correlators, N(t) and D(t), used identical filters (Fig. S1d). We performed 4 × 10^5 simulations (duration = 800 ms, time step = 10 ms) and considered each correlator’s final value as its velocity estimate. We estimated the correlation between each estimator and the velocity using 2-fold cross-validation. For each data partitioning, we randomly assigned half of the simulations to the training set (the rest comprised the test set). 
We determined the optimal linear coefficients to combine motion estimators from the training set and evaluated performances on the test set. The percentages in Figs. 1, S2, and S3 are fractional correlation increases, relative to the local 2-point correlator. Figs. 1c, S2, and S3 show means and standard deviations across 100 data partitionings (1000 in Fig. 1d).
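The evaluation procedure above can be sketched in Python with NumPy as follows. This is one possible reading of the description, not the original analysis code: the function names are hypothetical, the linear fit omits an intercept term, column 0 is assumed to hold the local 2-point correlator, and each random partition is scored in both train/test directions.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


def fractional_increase(estimators, velocity, n_partitions=100, seed=0):
    """Cross-validated gain of combined estimators over the 2-point one.

    estimators : array (n_trials, n_estimators); column 0 is assumed to be
                 the local 2-point correlator.
    velocity   : array (n_trials,) of simulated velocities.
    Returns the mean and standard deviation of the gain across partitions.
    """
    rng = np.random.default_rng(seed)
    n_trials = len(velocity)
    gains = []
    for _ in range(n_partitions):
        order = rng.permutation(n_trials)
        halves = (order[: n_trials // 2], order[n_trials // 2:])
        fold_gains = []
        for train, test in (halves, halves[::-1]):
            # Least-squares weights fitted on the training half only
            # (no intercept term in this sketch).
            weights, *_ = np.linalg.lstsq(
                estimators[train], velocity[train], rcond=None)
            combined = pearson(estimators[test] @ weights, velocity[test])
            baseline = pearson(estimators[test, 0], velocity[test])
            fold_gains.append(combined / baseline - 1.0)
        gains.append(np.mean(fold_gains))
    return float(np.mean(gains)), float(np.std(gains))
```

Fitting the weights on one half and scoring on the other keeps the reported gain from being inflated by overfitting when additional estimators are included.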

Drosophila Methods

Strains

Drosophila melanogaster were raised and prepared for testing as previously described[23], grown on molasses-based food in a 12hr-12hr light-dark cycle, and tested for behavior during the 4 hours after lights-on or 4 hours before lights-off. Females were collected on CO2 1–2 days after eclosion, then tested 48–72 hours later, using cold to immobilize them prior to gluing with a UV-cured epoxy. Genotypes are listed in Table S1. All inserts were in the isoD1 background[53] or backcrossed 5 times into that background.

Fly Behavior

Fly responses to different visual stimuli were measured with an apparatus similar to one previously described[23], with modified screens[24]. Three screens, each 3 × 3 cm, were arranged as adjacent sides of a cube with the fly at the center. They subtended from +135° to −135° azimuthally, and from +45° to −45° vertically. Images were relayed from a digital light projector to a coherent fiber-optic bundle[23], and then imaged onto back-projection material that constituted the screens. The angular resolution was identical to the previously described rig[23]; each pixel subtended approximately 1°. The luminance in all glider experiments was 6 cd/m2; edge selectivity data were taken at 18 cd/m2. All flies were tested at a temperature of 34 °C, the restrictive temperature for shibire[54].

Fly Stimuli

All stimuli were presented with a screen refresh rate of 240 Hz, using custom code to generate stimuli[23]. Pixel intensities were gamma-corrected, and all stimuli were drawn on a virtual cylinder about the fly, using software to perform azimuthal angular corrections. All stimuli were presented both clockwise and counter-clockwise; the response was measured as the difference in response between the two directions. Each glider presentation had a different random seed pattern. Each stimulus was presented more than 30 times, and responses averaged to obtain each fly’s response. Presented means and SEMs were computed for the tested flies from each fly’s average response to each stimulus. Binary glider stimuli were created in real time as described in Constructing Gliders (below) and in their original publication[14]. Glider stimuli lasted for 1 second, and were interleaved with 0.5 s periods of uncorrelated updates. All glider pixel updates took place at 40 Hz. The correlations were 1-dimensional, so that each ‘pixel’ subtended 5° horizontally and the entire extent vertically. The reported response is the integral of the response from 0.5 s to 1 s after stimulus onset. Non-responding flies were excluded from the analysis by requiring that flies exceed a threshold response to at least one of the glider stimuli. That threshold was set at 30°/s during the second 0.5 seconds of the glider stimulus. WT flies respond to the positive parity, 2-point glider with turning rates of 140 ± 50°/s (mean±std), so that this threshold excluded very few flies. In most genotypes, including WT, this procedure excluded no flies at all; it excluded 1 of 20 flies of genotype L1/shi and 2 of 19 of genotype L2+L3/shi. Among three sicklier genotypes, it excluded 5 of 27 (L3/shi), 9 of 23 (L3/+), and 9 of 25 (L2/+). 
Inclusion of these non-responding flies tended to bring all responses towards 0 and to add variability to the predicted edge selectivity, due to the small denominator in the normalization procedure (see below). Individual light and dark edge stimuli were created, presented, and analyzed as previously described[23].

Fly Metrics & Statistics

The glider fractional response was computed for each genotype to account for differences in general health, which created variability in the strength of all turning responses. Within a given genotype, all flies’ 2- and 3-point stimulus-induced rotations were computed; then each average was divided by the genotype’s mean response to the 2-point, parity +1 stimulus to compute the response as a fraction of the genotype’s 2-point response. Edge selectivity (Figure 5) was defined as (X_L − X_D)/(X_L + X_D), where X_L is the experimental light edge response and X_D is the experimental dark edge response. When either X_L or X_D averages to close to zero, instances of this metric can become greater than 1 or less than −1, a property that accounts for the error bars overlapping ±1 in Figure 5. X_L and X_D were directly computed for the light and dark edge stimuli by integrating the evoked turning response[23]. We generated predictions for X_L and X_D, denoted X̂_L and X̂_D, as linear combinations of the experimental responses to each glider, where the linear coefficients correspond to the components of each correlation in the edge type (Fig. 5a and S6), X_2+/X_2− are the responses to the positive/negative parity 2-point gliders, X_div+/X_div− are the responses to the positive/negative parity diverging 3-point gliders, and X_conv+/X_conv− are the responses to the positive/negative parity converging 3-point gliders. The glider predicted edge selectivity was (X̂_L − X̂_D)/(X̂_L + X̂_D). Throughout, the statistical significance of differences was judged by a 2-tailed Student’s t-test between experimental and control groups, or between the experiment and null hypothesis for WT data. Significance is reported as the maximum p-value relative to the 2 controls. We did not assess whether the data were normally distributed. We did not use statistical methods to predetermine our sample sizes, but they are comparable to those in the literature[23,24,26,28].
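The metrics in this section can be written out as a small Python sketch. The function names are hypothetical, the numerical inputs are made up for illustration, and the weights passed to the prediction function stand in for the edge-specific coefficients of Fig. 5a and S6, which are not reproduced here.

```python
def fractional_response(mean_response, mean_two_point_positive):
    """Glider response as a fraction of the genotype's mean response
    to the positive-parity 2-point glider."""
    return mean_response / mean_two_point_positive


def predicted_edge_response(glider_responses, weights):
    """Linear prediction of an edge response from glider responses.

    Both arguments are dicts keyed by glider name; the weights are
    placeholders for the edge-specific coefficients (hypothetical here).
    """
    return sum(weights[name] * glider_responses[name] for name in weights)


def edge_selectivity(light, dark):
    """(light - dark) / (light + dark) for light and dark edge responses."""
    return (light - dark) / (light + dark)


print(edge_selectivity(30.0, 10.0))   # 0.5
print(edge_selectivity(10.0, -5.0))   # 3.0, outside the range [-1, 1]
```

The second call illustrates why the metric can exceed ±1: when the two edge responses have opposite signs or nearly cancel, the denominator becomes small relative to the numerator.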

Simulated EMD responses to glider stimuli

Wild-type HRC and ON/OFF model motion detector responses to various gliders (Figs. 3d and S4d) were simulated in Matlab (Natick, MA). The WT HRC parameters were taken directly from a recent study[23], while the model parameters for the ON/OFF model were taken from its publication[27]. Simulations were run on a 1-dimensional array of 61 photoreceptors, with spacing of 5.1° and Gaussian acceptance angle of 5.7° (FWHM)[55]. Spatial resolution of the simulation was 1°, and the simulation ran in 1 ms time steps. Means and standard deviations in figures were computed from 25 instantiations of the gliders. Responses to gliders were computed as the mean response over 2 seconds, excluding an initial second to discard transients. In the case of the ON/OFF model, the response was the mean in the preferred direction minus the mean in the null direction.
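For readers unfamiliar with this kind of simulation, here is a heavily simplified Python sketch of a single HRC unit. It is not the Matlab model used in the study: it uses two photoreceptors instead of 61, omits the Gaussian acceptance angle and the high-pass filter, drives the model with a drifting sinusoidal grating instead of a glider, and borrows the 30 ms low-pass time constant from the natural-scene simulations. All function and parameter names are hypothetical. It does follow the 1 ms time step and the convention of averaging over 2 seconds after discarding an initial second.

```python
import math


def low_pass(signal, tau, dt):
    """First-order low-pass filter (discrete exponential smoothing)."""
    out, state = [], 0.0
    for value in signal:
        state += (dt / tau) * (value - state)
        out.append(state)
    return out


def hrc_mean_response(velocity, wavelength=40.0, spacing=5.1,
                      tau=0.030, dt=0.001, duration=3.0, discard=1.0):
    """Mean HRC output for a drifting sinusoidal grating.

    velocity and wavelength are in deg/s and deg; spacing is the
    photoreceptor separation in deg; times are in seconds.
    """
    n_steps = int(round(duration / dt))
    times = [k * dt for k in range(n_steps)]

    def contrast(x, t):
        return math.sin(2 * math.pi * (x - velocity * t) / wavelength)

    left = [contrast(0.0, t) for t in times]
    right = [contrast(spacing, t) for t in times]
    left_delayed = low_pass(left, tau, dt)
    right_delayed = low_pass(right, tau, dt)
    # Delayed-left times right, minus the mirror-symmetric product.
    output = [ld * r - l * rd for l, r, ld, rd
              in zip(left, right, left_delayed, right_delayed)]
    kept = output[int(round(discard / dt)):]  # drop the initial transient
    return sum(kept) / len(kept)
```

With these settings, `hrc_mean_response(40.0)` is positive and `hrc_mean_response(-40.0)` is negative, reflecting the sign convention produced by the subtraction stage.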

Human EEG Methods

Subjects

Informed consent was obtained from the 7 subjects (4 male, 3 female, ages 24–44 years), who participated in the experiments under protocols approved by the Institutional Review Board of Stanford University. We did not use statistical methods to predetermine our sample sizes, but they are comparable to those in the literature[30].

Stimulus generation

Stimulus generation and signal analysis were performed by in-house software, running on a Macintosh G4 platform. Stimuli were presented in a dark and quiet room on a calibrated CRT monitor at a resolution of 800 × 600 pixels viewed from 125 cm (full-width of 18.0°, full-height of 13.6°), with a 72-Hz vertical refresh rate. There were two types of adapting stimuli. Both consisted of a 2° period vertical opposing edge presented at 6 Hz (resulting in a velocity of 12°/s) with a mean luminance of 71.5 cd/m2 and a 97% Michelson contrast. The difference between adapting stimuli was in the motion direction of the bright and dark edges. The probe stimulus comprised a 4° period vertical opposing edge presented at 3 Hz (resulting in a velocity identical to the adapter of 12°/s) and was identical for all three conditions (no adaptation, adapter type A, adapter type B). The adapting stimulus was viewed for 20 seconds. Immediately after the adaptation the probe stimulus was displayed and reversed edge polarity directions at 3 Hz for 12 seconds. All stimuli space-time plots are shown in Fig. S7, and movies of adapter A and the probe stimulus are shown in movies M1 and M2.

Adaptation Procedure

A session began with a block of 20 probe trials (unadapted) that was followed by 20 trials of an adapt/probe cycle with a randomly chosen adapter (A or B). The adaptation/probe cycle within a block is illustrated in Figure 6a. After a block using the first adapter, the participant was allowed to rest for several minutes in order to dissipate the adaptation effect, after which they were given a block using the alternate adapter.

Attention Control

To control for possible time- or stimulus-dependent effects of attention, participants performed a demanding letter discrimination task at fixation during measurement of the steady-state Visual Evoked Potential (ssVEP)[56].

Spectral signature of stimulus specific adaptation

We used a method that isolates responses of directionally selective neurons in the ssVEP[30]. Briefly, in the unadapted state, the ssVEP after each phase of the probe stimulus is identical. After adaptation the responses of individual neurons that are tuned for the adapting stimulus are reduced. The resulting imbalanced, adapted response creates a signal that has a different temporal sequence, either strong/weak or weak/strong, depending on the adapting stimulus. Therefore, the presence of odd harmonic responses that are 180°-phase shifted after adapting to the different adapters is diagnostic of selective adaptation. Because phase is a circular variable we used the circular statistics toolbox (Matlab) to calculate the 95% confidence intervals for the differences in phase[57].
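The logic of this spectral signature can be checked numerically. In the Python sketch below, one probe cycle is modeled as the same transient response repeated after each reversal and scaled by a separate gain in each half-cycle. The transient shape, the gains, and the function names are hypothetical choices for illustration only.

```python
import cmath
import math


def harmonic(signal, k):
    """Complex Fourier coefficient of one stimulus cycle at harmonic k."""
    n = len(signal)
    return sum(value * cmath.exp(-2j * math.pi * k * i / n)
               for i, value in enumerate(signal)) / n


def probe_cycle(gain_first, gain_second, n_half=60):
    """One probe cycle: the same transient after each reversal, scaled
    by a separate gain in each half-cycle (hypothetical transient shape)."""
    shape = [(i / n_half) * math.exp(-5.0 * i / n_half) for i in range(n_half)]
    return [gain_first * s for s in shape] + [gain_second * s for s in shape]


unadapted = probe_cycle(1.0, 1.0)
after_a = probe_cycle(1.0, 0.6)   # strong/weak sequence
after_b = probe_cycle(0.6, 1.0)   # weak/strong sequence

# Phase difference of the first harmonic between the two adapted states.
phase_shift = abs(cmath.phase(harmonic(after_a, 1) / harmonic(after_b, 1)))
```

In this toy model the unadapted cycle has no first-harmonic component, the two adapted cycles have first harmonics that differ in phase by π, and their second harmonics are identical, because the weak/strong sequence is the strong/weak sequence shifted by half a cycle.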

EEG signal acquisition and source imaging procedure

The electroencephalogram (EEG) was recorded, preprocessed, and analyzed identically to prior experiments[58]. The resulting amplitude spectra of the ssVEP were then evaluated at the first harmonic of the stimulus frequency (3 Hz). Reported averages in Fig. 6 and S7 are from an average of probe responses between 1 and 6 s after the probe began; a time course of the first harmonic response is shown in Fig. S7c.

Adaptation Index

The adaptation index provides a measure of the fractional change in the signal due to adaptation. The amplitude of the first harmonic was divided by the summed amplitude of the first and second harmonic, using the single electrode with the maximum response (shown in Fig. 6b). We then averaged the two adapted conditions and divided by the unadapted case.
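A worked example may make the definition concrete. The Python sketch below follows the verbal description above; the function names and the amplitude values are hypothetical.

```python
def harmonic_ratio(first, second):
    """First-harmonic amplitude divided by the summed amplitude of the
    first and second harmonics."""
    return first / (first + second)


def adaptation_index(adapted_a, adapted_b, unadapted):
    """Average the ratio over the two adapted conditions, then divide by
    the unadapted ratio. Each argument is a (first, second) amplitude pair."""
    adapted = (harmonic_ratio(*adapted_a) + harmonic_ratio(*adapted_b)) / 2
    return adapted / harmonic_ratio(*unadapted)


# Made-up amplitudes: ratios 0.75 and 0.5 after adaptation, 0.25 before.
print(adaptation_index((3.0, 1.0), (1.0, 1.0), (1.0, 3.0)))  # 2.5
```

An index above 1 indicates that adaptation increased the relative weight of the first harmonic compared with the unadapted state.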

Human Psychophysics Methods

Informed consent was obtained from the 9 subjects (5 male, 4 female, ages 23–55 years), who participated in the experiments under protocols approved by the Institutional Review Board of Stanford University. We did not use statistical methods to predetermine our sample sizes, but they are comparable to those in the literature[14].

Stimulus Generation and Presentation

Dynamic stimuli were presented on a HP-1320 video monitor, with 800 × 600 pixel resolution, 120 Hz refresh rate, mean luminance of 48.1 cd/m2, and contrast of 93.5% (Michelson). A 5°-radius aperture surrounded the fixation cross; stimuli were presented within the aperture and mean luminance outside. We used four adapting stimuli (Fig S8): static bars, opposing light and dark edges, light edges, and dark edges. The static adapter was a full contrast square wave with a period of 2°. The opposing edges adapter had the same parameters as the adapter in the EEG experiments: 2° period and 12°/s edge speeds. The single edges consisted of periodic single edges moving across the screen in one direction, repeated in time (see Fig S8). The single edges had identical parameters to the opposing edge stimuli, moving at a speed of 12°/s and with a period of 2°. We created a total of 8 3-point glider stimuli: converging/diverging, ±1 parity, and centroid moving left/right. The gliders were full contrast, updated at 30 Hz, and consisted of square pixels of 0.1° on a side, updated using the glider update rules to constrain the contrast correlations appropriately in each case (see Constructing Gliders, below). Each row of the glider stimulus was independent of the others, but updated with the identical update rule (see movie M3).

Adaptation Protocols

An initial adaptation phase lasted for the first 20 seconds of each experiment. After the initial adaptation, probe glider stimuli were shown in random order for 0.65 seconds, interleaved with 3-second adaptation ‘top ups’, repeating the initial adapter. All adapters were tested along with their mirror-symmetric counterpart, and the results averaged with appropriate sign changes. Each glider was tested 20 times for each adapter, for a total of 40 trials for each subject in the points shown in Fig. 7. The static case contained 20–60 trials for each subject, as it was occasionally run more than once if the subject participated in multiple sessions.

Data analysis

For each adaptation and glider condition, the fraction of trials that were perceived as moving to the right was computed for each subject. Means and standard errors presented in the Figure were computed from the cross-subject data, and significance between conditions was assessed with a 2-sample, unpaired Student’s t-test, Bonferroni-corrected. (The bar plots in Fig. 7 represent a linear transform of fraction-to-the-right. 0 maps to all-left, 0.5 maps to random, and 1 maps to all-right.) We did not assess whether the data were normally distributed.

Constructing gliders

Gliders are binary spatiotemporal stimuli with enforced correlational structures that were previously developed and used to produce motion percepts in humans[14]. These authors describe several properties that make gliders ideal for probing mechanisms of motion detection. First, they are, on average, equiluminant in time and space, averaging to a mean gray. Second, the variance at each time and at each point in space is the same. Third, the enforced correlational structure excludes many other correlational structures. For instance, the four 3-point gliders we produce here exclude all 2-point correlations, and exclude all other 3-point correlations. Gliders can produce more complex, higher-order or more distant correlations[14].
Each glider stimulus begins with a random seed and uses an update rule to determine whether a pixel should be white or black. It iterates the rule across all pixels to produce a full space-time pattern that obeys the correlational pattern determined by the update rule. Figure S4a shows examples of each update rule, and Figure S4c gives examples of each rule’s space-time patterns. Movie M3 shows 4 gliders similar to those used in the human psychophysics experiments. The simplest update rule is uncorrelated, when each pixel is randomly chosen to be black or white, independent of all other pixels’ values. The second simplest pattern is a 2-point update rule. We coded white pixels as having value +1 and black as having −1, matching our standard definition of fractional contrast. The update rule is then C_i(t) C_{i+Δ}(t + δ) = P, where Δ is the pixel spacing, δ is the frame duration, and P is the parity of the pattern, which is equal to +1 for even parity or −1 for odd parity. This is equivalently written as C_{i+Δ}(t + δ) = P C_i(t), since C_i = 1/C_i for our values of C_i. Each new pixel value is the previous time’s adjacent pixel multiplied by 1 or −1. 
For even parity, this rule translates the entire pixel pattern in one direction, while for odd parity, it translates and inverts on each update. The 3-point glider rules are a generalized version of the 2-point rules. The update rule is C_i(t) C_{i+Δ}(t) C_{i+Δ}(t + δ) = P (converging case) and C_i(t) C_i(t + δ) C_{i+Δ}(t + δ) = P (diverging case). Each pixel’s value is updated as a function of its surrounding pixels’ values and is determined by the seed state, so these patterns are similar to Conway’s Game of Life, the origin for the term “glider”[59]. Edge cases can result in undetermined pixel values; in such cases, we seeded with random pixel contrasts. In 3-point gliders, the two different parities are contrast inversions of each other: inverting the contrast on a P = 1 pattern turns it into a P = −1 pattern (Figure S4b). Conversely, inverting the contrast of a 2-point glider does not alter its parity, since all pairwise products remain unchanged.
Supplementary Figure 8. Human psychophysics experimental details. (a–b) Space-time intensity plots of the adapters and gliders. (a) Subjects’ visual systems were adapted with a static adapter, with opposing edge motion, with light edge motion, and with dark edge motion. Each presentation of the adapters had a random spatial phase. See also movie M1 for an example opposing edge stimulus. (b) Space-time plots corresponding to a single row of each glider stimulus (shown after adaptation). All centroids move to the right in the top row and to the left in the bottom row. See also movie M3 for example glider stimuli. (c) Individual subject responses to the glider stimuli following adaptation to the static adapter. Individual subjects’ responses are coded by color. The underlying bar plot shows subject means and SEMs for each glider. (d) Two out of nine subjects perceived an overwhelming motion after-effect after adaptation to light or dark edge motion. 
Shown here is one of these subjects’ responses to the gliders after adaptation to rightward-moving light edges. All responses were to the left, a result that we interpreted as a motion after-effect resulting from net motion in the adapter. The opposing edge adapter avoided this problem.
Supplementary Movie 1. Opposing edges. This stimulus was designed to be equiluminant in time. Light edges move to the right, and dark edges move to the left. In the EEG experiment, this stimulus was full screen. See Fig. S7 for a space-time diagram of this stimulus.
Supplementary Movie 2. Probe edges. This stimulus consisted of two alternating versions of the opposing edge stimuli. Light edges moved to the right for half of the probe and to the left for the other half. Dark and light edges always moved in opposite directions. See Fig. S7 for a space-time diagram of this stimulus.
Supplementary Movie 3. Glider stimuli. Here each 3-pt glider stimulus is presented sequentially: positive diverging, negative diverging, positive converging, and negative converging. All glider centroids moved to the right. Most subjects perceived leftward motion in the second glider and rightward motion in the first, third, and fourth gliders (see Figure 7 and S8).
Supplementary Figure 1. Models of motion estimation. (a) The classical Hassenstein-Reichardt Correlator (HRC). In this paper, h1(x) and h2(x) were modeled as Gaussian spatial acceptance filters (centered on different points in visual space), f(t) was a low-pass filter, and g(t) was a high-pass filter. In essence, the HRC multiplies delayed values of the contrast with current contrast values across two spatial points. The subtraction stage results in mirror symmetry, thereby enabling responses to both rightward and leftward motion. (b) The classical motion energy (ME) model applies several oriented spatiotemporal filters to the visual input. 
These filtered signals are subsequently squared and linearly combined to compute the ‘motion energy.’ (c) To compute the 2- and 3-point correlation images, we computed the product with the rightward orientation (left) and then subtracted the mirror symmetric product with a leftward orientation (center). This results in the images on the right, which are also displayed in Figure 1b. (d) HRC-like diagrams for the diverging and converging 3-point correlators used in Figure 1d.
Supplementary Figure 2. Triple correlations only signify motion when the stimulus is light-dark asymmetric. Each row presents a comparison between correlational motion signatures. Columns present: (i) context for each comparison; (ii) properties of pairwise motion estimators; (iii) properties of diverging 3-point estimators; and (iv) properties of converging 3-point estimators. (ai) Motion is approximated by the rigid translation of images. (aii–aiv) Cartoon of the correlation structure that each estimator detects. (bi) Example sinusoidal grating. (bii) Pair correlations signified motion across the image. (biii–biv) Triple correlations depended on the local phase of the sinusoidal grating and spatially averaged to zero. (ci) Example asymmetric grating. The luminance at each point in space was the luminance of the example sinusoidal grating raised to the tenth power. (cii) Pair correlations varied across the image. (ciii–civ) Triple correlations still depended on the local phase of the grating, but their spatial average was nonzero. (di) Cartoon of an ensemble of sinusoidal gratings that vary in period and phase. (dii–div) The accuracy with which correlations convey motion was examined across this ensemble. The performance of each estimator was quantified through the Pearson’s correlation between the estimator output and the simulated velocity. We linearly combined estimators to quantify the improvements afforded by multiple correlational signals. 
The numbers above each bar denote the fractional increase with respect to the 2-point estimate. Neither spatial averaging nor triple correlations improved the motion estimate. (e) Same as (d), but for asymmetric gratings. In this case, both spatial averaging and triple correlations improved the accuracy of motion estimation. Error bars are standard deviations over cross-validating trials (see Supp. Methods).
Supplementary Figure 3. Correlational motion estimation with spatially varying contrast gain. Rows a–c present a comparison between correlational motion signatures, when (a) the contrast gain is set locally by considering the average luminance over one degree squares of pixels, (b) the contrast gain is set locally by considering the average luminance over five degree squares of pixels, (c) the contrast gain is set globally by considering the average luminance over the full image. Columns apply to rows a–c and present: (i) example local luminance average, which sets the contrast gain; (ii) accuracy of pairwise motion estimators; (iii) accuracy of diverging 3-point estimators; and (iv) accuracy of converging 3-point estimators. Columns (ii)–(iv) are of the same format as in Figs. 1 and S2, and Figs. S3cii–civ are identical to Figs. 1cii–iv. Error bars are standard deviations over cross-validating trials (see Supp. Methods). (d) Contrast histograms when the contrast gain was determined by averaging over one degree squares of pixels (left), five degree squares of pixels (center), or the full visual field (right). The mean (i.e. c̄ = 〈c〉), variance (i.e. 〈(c − c̄)²〉), and third central moment (i.e. 〈(c − c̄)³〉) are shown alongside each histogram.
Supplementary Figure 4. Glider construction and model responses. (a) Diagrams of the update rules that generate glider stimuli (see also Online Methods). 
Given a seed row and a seed column of pixel contrasts (upper left), the glider update rules fill in all remaining pixels, one row at a time, to generate an instantiation of the glider. The red points in the diagrams exemplify the update rule for each glider. The illustrative choices are not special, as any such pixel combinations will obey the update rule by construction. (b) Within a 2-point glider, all 3-point correlations average to 0. Similarly, within a 3-point glider, 2-point correlations (and the other 3-point correlations) average to 0. (c) Example space-time plots of the glider stimuli. (d) The ON/OFF model proposed in Eichner et al.[27] correctly predicted the signs of the 2-point glider responses but did not predict the observed 3-point glider responses. Error bars are SEM, as in Fig. 3.
Supplementary Figure 5. Drosophila respond to several triple correlations involving two points in space and three points in time. The correlation structures for each glider and sample space-time intensity plots are shown at top, and the behavioral responses (relative to the positive 2-pt glider response) are shown below. We found that flies respond less strongly to these gliders than to the diverging and converging gliders (compare to Fig. 3d). We measured statistically significant responses in only 2 of 6 cases (two-tailed t-test, ‘*’ corresponds to p = 2.0×10⁻² (t₁₅ = 2.6) and ‘**’ corresponds to p = 3.7×10⁻³ (t₁₅ = 3.4)). Error bars are SEM and N = 16.
Supplementary Figure 6. We computed the abundance of each correlational element in each edge type. We first counted the number of times that each correlational element appeared in right and left-moving edges of the same polarity. The difference between the rightward and leftward counts provided an edge-specific directional signal (denoted “Net” in the figure). As expected, these directional signals depended on edge type for 3-point correlational elements but not for 2-point correlational elements. 
In particular, the sign of each 3-point directional signal inverted when the edge polarity inverted. In Figure 5, we used these directional signals as linear weighting coefficients to predict Drosophila behavioral responses to moving edges from measured glider responses.
Supplementary Figure 7. EEG experimental details. (a) Space-time plots of the two mirror-symmetric opposing edge adapters. The light and dark edges moved in opposite directions and are highlighted by the green and purple lines, respectively. Each presentation of the adapters had a random spatial phase. See also movie M1 for examples of the adapters. The probe temporally alternated between adapter A and adapter B. The spatial period of the adapting stimuli was doubled in the probe stimulus, but the speed of the edges remained the same. Green and purple lines highlight moving edges and show that the directions of the light and dark edges invert during the two halves of the probe. The probe was always presented with the same spatial phase. See also movie M2 for an example of the probe. (b) Evoked response waveforms for the probe stimuli under unadapted and adapted conditions. Data were only from the time interval shown in Figure 6a (and in panel c below). The data show that responses to identical stimuli were differentially affected by the identity of the adapter. (c) Strength of the first harmonic response as a function of time after the end of the adapting period. Both adapted responses were above baseline for approximately 5 seconds. Error patches on lines represent 1 SEM. (d) Phase-amplitude plots for the adapted EEG responses shown in Figure 6c. Ellipses represent 1 SEM.
Supplementary Table 1. Genotypes of Drosophila lines used in experiments in this paper.
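The glider update rules given under Constructing gliders, together with the seed row and seed column described for Supplementary Figure 4a, can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the stimulus code used in the experiments: the pixel spacing Δ is one pixel, δ is one frame, undetermined pixels in the first column are seeded randomly, and the function names are hypothetical.

```python
import random


def make_glider(kind, parity, n_pixels, n_frames, seed=None):
    """Generate a glider as a list of frames of +1/-1 pixel contrasts.

    kind is "two_point", "converging", or "diverging"; parity is +1 or -1.
    """
    rng = random.Random(seed)
    frames = [[rng.choice((-1, 1)) for _ in range(n_pixels)]]  # seed row
    for _ in range(n_frames - 1):
        prev = frames[-1]
        new = [rng.choice((-1, 1))]  # seed column (undetermined pixel)
        for i in range(n_pixels - 1):
            if kind == "two_point":
                new.append(parity * prev[i])
            elif kind == "converging":
                new.append(parity * prev[i] * prev[i + 1])
            elif kind == "diverging":
                new.append(parity * prev[i] * new[i])
            else:
                raise ValueError(f"unknown glider kind: {kind}")
        frames.append(new)
    return frames


def mean_correlations(frames):
    """Mean 2-point, converging, and diverging products over the pattern."""
    pair = conv = div = count = 0
    for prev, new in zip(frames, frames[1:]):
        for i in range(len(prev) - 1):
            pair += prev[i] * new[i + 1]
            conv += prev[i] * prev[i + 1] * new[i + 1]
            div += prev[i] * new[i] * new[i + 1]
            count += 1
    return pair / count, conv / count, div / count
```

By construction, each glider's own enforced product equals its parity everywhere, while in a large 3-point glider the mean 2-point product stays close to zero. Negating every pixel of a converging or diverging glider flips the sign of its enforced product, in line with the statement that the two parities of a 3-point glider are contrast inversions of each other.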

54 in total

1. Evoked potential and psychophysical analysis of Fourier and non-Fourier motion mechanisms.

Authors: J D Victor; M M Conte
Journal: Vis Neurosci Date: 1992-08 Impact factor: 3.241

Review 2. Fly motion vision.

Authors: Alexander Borst; Juergen Haag; Dierk F Reiff
Journal: Annu Rev Neurosci Date: 2010 Impact factor: 12.449