Niklas Wilming, Selim Onat, José P. Ossandón, Alper Açık, Tim C. Kietzmann, Kai Kaspar, Ricardo R. Gameiro, Alexandra Vormberg, Peter König.
Abstract
We present a dataset of free-viewing eye-movement recordings that contains more than 2.7 million fixation locations from 949 observers on more than 1000 images from different categories. This dataset aggregates and harmonizes data from 23 different studies conducted at the Institute of Cognitive Science at Osnabrück University and the University Medical Center Hamburg-Eppendorf. Trained personnel recorded all studies under standard conditions with homogeneous equipment and parameter settings. All studies allowed for free eye movements and differed in the age range of participants (~7–80 years), stimulus sizes, stimulus modifications (phase scrambled, spatially filtered, mirrored), and stimulus categories (natural and urban scenes, web sites, fractals, pink noise, and ambiguous artistic figures). The size and variability of viewing behavior within this dataset present a strong opportunity for evaluating and comparing computational models of overt attention, and furthermore, for thoroughly quantifying strategies of viewing behavior. This also makes the dataset a good starting point for investigating whether viewing strategies change in patient groups.
Year: 2017 PMID: 28140391 PMCID: PMC5283059 DOI: 10.1038/sdata.2016.126
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
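The dataset is essentially one large table of fixations. As a hedged sketch only (the published dataset's actual file format and column names may differ; `subject_id`, `start_ms`, `end_ms`, and the CSV layout are assumptions here), counting fixations and total viewing time per observer from such an export could look like:

```python
import csv
import io

# Hypothetical CSV export: one row per fixation. Column names are
# illustrative, not the dataset's actual identifiers.
SAMPLE = """subject_id,dataset,x,y,start_ms,end_ms
1,Baseline,512.3,384.0,0,212
1,Baseline,300.1,410.5,250,498
2,Baseline,640.0,480.0,0,180
"""

def fixations_per_subject(handle):
    """Count fixations and summed fixation time (ms) per subject."""
    stats = {}
    for row in csv.DictReader(handle):
        sid = row["subject_id"]
        dur = float(row["end_ms"]) - float(row["start_ms"])
        n, total = stats.get(sid, (0, 0.0))
        stats[sid] = (n + 1, total + dur)
    return stats

stats = fixations_per_subject(io.StringIO(SAMPLE))
```

The same loop generalizes to any per-subject or per-study aggregate once the real column names are substituted.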
Studies and associated metadata.

Key: ‘V. Dur.’=viewing duration, ‘# Obs.’=number of observers, ‘# Fix.’=number of fixations, ‘rec.’=recognition, ‘Discrimin.’=discrimination, ‘FV’=free viewing, ‘GV’=guided viewing, ‘CAT’=content awareness task, ‘IST’=information search task. ‘Students’ indicates that participants were recruited from the student population at the University of Osnabrück but that no further age information is available.

| Study | Data ref. | Stimuli ref. | Categories | Task | V. Dur. | # Obs. | # Fix. | Age (years) |
|---|---|---|---|---|---|---|---|---|
| 3D | 20 | 18 | 23 | Depth rec. | 20s | 14 | 84093 | Students |
| AFC | 2 | 18 | | Image rec. | 5s | 20 | 39358 | Students |
| Age study | 0 | 2 | 7, 8, 10, 11 | Patch rec. | 5s | 58 | 105813 | 7.6, 22.1, 80.6 |
| APP | 6 | 14 | 12 | Object rec. | Var. | 73 | 99101 | 25.6 (SD 8.9) |
| APPC | 7 | 22 | 12 | Object rec. | Var. | 46 | 12866 | 25 (SD 6) |
| Baseline | 3 | 12 | 7, 8, 10, 11 | FV | 6s | 48 | 203772 | 23.1 (19–28) |
| Bias | 11 | 13 | 7, 8, 10, 11 | FV | 6s | 43 | 176391 | 23.1 (19–28) |
| Cross Modal | 16 | 19 | 19 | FV | 6s | 50 | 120261 | Students (23) |
| Cross Modal 2 | 17 | 20 | 20 | FV | 6s | 32 | 31826 | Students (19–36) |
| EEG | 9 | 14 | | FV | 8s | 7 | 70026 | Students |
| Face Discrim. | 18 | 21 | | Discrimin. | 1.5s | 29 | 100448 | 26.6 (SD 4.54, 19–35) |
| Face Learning | 19 | 22 | | Aversive learning | 1.5s | 104 | 145378 | 26.9 (SD 4.14, 20–36) |
| Filtered | 21 | 13 | 7, 8, 24, 25, 26, 27 | FV | 6s | 47 | 83834 | 22.6 (SD 2.3, 19–28) |
| Gap | 22 | 13 | 7, 8 | FV | 6s | 24 | 49208 | 22 (SD 2.5, 19–28) |
| Head Fixed | 15 | | 8, 10 | FV, GV | 6s | 19 | 151338 | 24 (18–41) |
| Memory I | 4 | 15, 16 | 7, 8, 10, 11 | FV | 6s | 45 | 179473 | 24.2 (18–48) |
| Memory II | 5 | 19, 15 | 16 | FV | 6s | 34 | 109830 | 25.9 (19–49) |
| Monocular | 12 | 15, 17 | | FV | 6s | 68 | 282602 | 22.4 (SD 2.98) |
| Patch | 1 | 18 | | Patch rec. | 5s | 35 | 64449 | Students |
| Scaled | 10 | 15, 17, 18 | | FV | 6s | 24 | 166156 | 21.38 (±3.00) |
| Tactile | 8 | 17 | 7, 8, 10, 11, 14 | FV | min 6s | 57 | 358578 | 21.8 (18–29) |
| Webtask | 13 | 21 | 6 | FV, CAT, IST | 6–12s | 48 | 151581 | 41.5 (24–55) |
| Webtask @ School | 14 | | 6 | FV, CAT, IST | 6–12s | 24 | 40553 | 12 (±0.3) |
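The ‘# Obs.’ and ‘# Fix.’ columns support simple per-observer statistics. A minimal sketch, using three study rows copied from the table above (only three shown for brevity):

```python
# (study name, number of observers, number of fixations),
# taken from the studies table.
STUDIES = [
    ("3D", 14, 84093),
    ("Tactile", 57, 358578),
    ("Webtask @ School", 24, 40553),
]

def fixations_per_observer(studies):
    """Mean number of recorded fixations per observer, per study."""
    return {name: n_fix / n_obs for name, n_obs, n_fix in studies}

per_obs = fixations_per_observer(STUDIES)
```

For the 3D study this gives roughly 6007 fixations per observer; comparing such ratios across studies separates large observer pools from long recording sessions.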
Stimulus presentation and recording metadata.

Key: ‘Disp. size’=display size, ‘Img. size’=image / stimulus size in pixels, ‘Img. pos’=upper left corner of the image (x,y, where 0,0=top left of screen; ‘central’=each stimulus was centered on the screen; y=0 is omitted), ‘V. dist.’=viewing distance measured between eyes and screen, ‘PPD’=pixels per degree of visual angle, ‘Sampling freq.’=sampling frequency of gaze position, ‘Val. error’=maximal average validation error before gaze tracking started, ‘EL II’=EyeLink II, ‘EL 1000’=EyeLink 1000.

| Study | Monitor | Resolution | Disp. size | Img. size (pixel) | Img. pos (pixel) | V. dist. | PPD | Sampling freq. | Val. error | Tracker |
|---|---|---|---|---|---|---|---|---|---|---|
| 3D | SeeReal C-s | 640×1024 | 30.9×24.9 | 1280×1024 | 0 | 65 | 19.1 | 250 Hz | 0.3 | EL II |
| AFC | ViewSonic | 1280×960 | 36.2×27.2 | 1280×960 | 0 | 65 | 35.3 | 250 Hz | 0.3 | EL II |
| Age study | ViewSonic | 1280×960 | 35 × 26.5 | 1280×960 | 0 | 65 | 36.3 | 500 Hz | 0.6 | EL 1000 |
| APP | Apple Cinema | 2560×1600 | 23.8×23.8 | variable | central | 60 | 41.6 | 500 Hz | 0.3 | EL II |
| APPC | Apple Cinema | 2560×1600 | 23.8×23.8 | variable | central | 60 | 41.6 | 500 Hz | 0.3 | EL II |
| Baseline | SM1100 | 1280×960 | 29×22 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.3 | EL II |
| Bias | SM1100 | 1280×960 | 28×21 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.3 | EL II |
| Cross Modal | SM1100 | 1024×768 | 28×21 | 1024×768 | 0 | 80 | 36.5 | 500 Hz | 0.3 | EL II |
| Cross Modal 2 | Apple Cinema | 2560×1600 | 44×36 | 1944×1600 | 308, 0 | 60 | 58 | 500 Hz | 0.5 | EL II |
| EEG | U2311Hb | 1920×1080 | 27×46 | 1280×960 | 320, 6 | 60 | 41 | 500 Hz | 0.5 | EL 1000 |
| Face Discrimin. | SM 204B | 1600×1200 | 40.6×30.5 | 1600×1200 | central | 50 | 39 | 250 Hz | 0.3 | EL 1000 |
| Face Learning | SM 204B | 1600×1200 | 40.6×30.5 | 1600×1200 | central | 50 | 39 | 250 Hz | 0.3 | EL 1000 |
| Filtered | SM1100 | 1280×960 | 28×21 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.3 | EL II |
| Gap | SM1100 | 1280×960 | 28×21 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.3 | EL II |
| Head Fixed | BenQ XL2420T | 1920×1080 | 53.8×30.24 | 1440×1080 | 240,0 | 60 | 38 | 500 Hz | 0.55 | EL 1000 |
| Memory I | SM1100 | 1280×960 | 29×22 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.35 | EL II |
| Memory II | Apple Cinema | 2560×1600 | 46.2×28.9 | 2560×1600 | 0 | 80 | 55.4 | 500 Hz | 0.35 | EL II |
| Monocular | Apple Cinema | 2560×1600 | 46.2×28.9 | variable | 0 | 80 | 55.4 | 500 Hz | 0.5 | EL II |
| Patch | ViewSonic | 1280×960 | 36.2×27.2 | 1280×960 | 0 | 65 | 35.3 | 250 Hz | 0.3 | EL II |
| Scaled | Apple Cinema | 2560×1600 | 46.2×28.9 | 640×400–2560×1600 | central | 80 | 55.4 | 500 Hz | 0.3 | EL II |
| Tactile | SM1100 | 1280×960 | 28×21 | 1280×960 | 0 | 80 | 45.6 | 500 Hz | 0.5 | EL II |
| Webtask | SyncMaster 971p | 1280×1024 | 35.56×28.4 | 1272×922 | 4, 86 | 60 | 36 | 500 Hz | 0.3 | EL II |
| Webtask @ School | SyncMaster 971p | 1280×1024 | 35.56×28.4 | 1272×922 | 4, 86 | 60 | 36 | 500 Hz | 0.5 | EL 1000 |
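The PPD column converts pixel measurements into degrees of visual angle (strictly, PPD is only approximately constant near the screen center). For example, the Baseline stimuli (1280×960 px at 45.6 PPD) subtend roughly 28°×21°. A minimal sketch:

```python
def pixels_to_degrees(px: float, ppd: float) -> float:
    """Convert a pixel extent to degrees of visual angle using the
    study's pixels-per-degree value (a small-angle approximation
    that holds near the screen center)."""
    return px / ppd

# Baseline study: 1280x960 px stimuli at 45.6 PPD (from the table above).
width_deg = pixels_to_degrees(1280, 45.6)
height_deg = pixels_to_degrees(960, 45.6)
```

Because PPD differs between studies (19.1 to 58), any cross-study comparison of saccade amplitudes or fixation spread should be done in degrees, not pixels.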
Figure 1: Dataset overview.
(a) Smoothed spatial distribution of fixation locations for each study. The frame indicates the screen borders. (b) Histogram of fixation durations for each study. (c) Scatter plot of the number of observers against the number of images per study. Circle size scales with the number of fixations; the largest study (35×10³ fixations) exceeds the smallest (1.3×10³ fixations) by about a factor of 27. (d) The number of observers per image set. (e) The number of images seen by a given number of observers.
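Maps like those in panel (a) are commonly produced by histogramming fixation locations on the pixel grid and blurring with a Gaussian kernel. A minimal NumPy sketch (the smoothing bandwidth used in the figure is not stated here; `sigma` below is an arbitrary choice):

```python
import numpy as np

def fixation_density(xs, ys, shape, sigma=2.0):
    """Histogram fixation locations on a (height, width) pixel grid,
    then smooth with a separable Gaussian kernel truncated at 3 sigma."""
    h, w = shape
    grid = np.zeros((h, w))
    for x, y in zip(xs, ys):
        if 0 <= x < w and 0 <= y < h:
            grid[int(y), int(x)] += 1
    r = int(3 * sigma)
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()  # normalize so total fixation mass is preserved
    grid = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, grid)
    grid = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, grid)
    return grid
```

With real data, `shape` would be the screen resolution of the respective study and `sigma` roughly one degree of visual angle expressed in pixels via the study's PPD.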
Description of fields in the dataset
- Subject ID
- Dataset
- X-position of fixation
- Y-position of fixation
- Start of fixation in ms
- End of fixation in ms
- Category number
- File number of viewed image
- Fixation number within trial
- Trial number within experiment
- 1 if the trial was the start of a block
- 1 if image is left-right mirrored
- 1 if hands crossed
- Stimulation at trial start (at 150 ms; 1=left, 2=right, 3=bilateral)
- 1 if stimulation occurred later during the trial
- Side of late stimulation (1=left, 2=right, 3=bilateral)
- Time of late stimulation
- Training trials=0, experimental trials=1
- Time of button press upon recognition of shown object
- Time of button press in cases of flipped perception
- Rating of perceptual certainty
- Code for context version shown
- Time until first saccade away from drift position
- Prior knowledge of the stimuli
- 1=ambiguous, 2, 3=unambiguous
- Code for initial perception
- Code for potential second percept
- Age group of the participant (0=children, 1=younger adults, 2=older adults)
- Answer (0=no, 1=yes)
- Is the trial a catch trial? (0=no, 1=yes)
- The central location of the patch in the image from which it is taken
- Is the fixation location within the monitor? (0=no, 1=yes)
- The local luminance contrast modification weight
- The amount of noise added to the phase spectrum
- Block within session (in one block only pictures of one size were shown)
- 1=website, 2=urban, 3=landscape
- Image index
- 1=scaled image, 2=cropped image
- Each session showed 5 blocks/sizes, with a break between sessions
- Presented image size
- Fixation within monitor coordinates
- Fixation within image coordinates of respective size
- The local luminance contrast modification weight
- The amount of noise added to the phase spectrum
- The base image used for modifications
- Time of button press upon recognition of shown object
- Time of button press in cases of flipped perception
- Rating of perceptual certainty
- Prior knowledge of the stimuli
- 1=ambiguous, 2, 3=unambiguous
- Code for initial perception
- Code for potential second percept
- 1=free viewing task, 2=information search task, 3=content awareness task
- 1 if example trial
- Familiarity rating, from 1 (never seen) to 5 (well known)
- 1 if fixation was on the stimulus
- Original category used in the article (1=news, 2=blogs, 3=landing pages, 4=shops, 5=company, 6=information)
- Original file number within a category
- Relevance during the content awareness task, from 1=not relevant to 5=highly relevant
- Search term used during the information search task
- URL displayed during this trial
- User groups that the participant could choose from
- Selected user group (1–5)
- 1=head fixed with chin rest and mouth guard, 2=head free, 3=break between head-fixed and head-free blocks, 4=break
- 1 for guided-viewing trials
- Before (1) or after (5) the aversive learning
- Whether the trial shows a reference face (0 or 180 degrees) or not
- Reference face identity; chain 1 will be associated with an aversive outcome, whereas chain 2 stays neutral (face learning only)
- Angular distances (×100) from the reference faces
- 1 if trial is an oddball
- 1 if aversive outcome is delivered during the trial, otherwise 0
- Before (2), during (3), after (4) aversive learning
- -135, -90, -45, 0, 45, 90, 135, 180: angular distances from the face that will be associated with an aversive outcome; 500=trials with aversive outcome; 1000=oddball trials; 3000=null trials
- 1 if observer performed a perceptual discrimination task before
- Scene number, 24 in total; a given scene is presented with or without depth information in 3 different versions: original, pink noise, white noise
- 1 if image is left-right mirrored
- 0=right-handers, 1=left-handers
- 1=no filter, 2=high-pass filter, 3=low-pass filter
- 1 if mirrored
- 1 if delay
- 1 if image is left-right mirrored
- Gap in ms between disappearance of the drift-correction dot and appearance of the image (0, 300, 600, 900)
- 1=1st presentation of image, 2=2nd presentation, etc.
- 1=fixation on image, 0=fixation beyond image borders
- 1=1st presentation of image, 2=2nd presentation, etc.
- Complexity of image (1=low, 2=middle, 3=high)
- 1=fixation on image, 0=fixation beyond image borders
- 10=visual; 11=audio-visual (top-left); 12=audio-visual (bottom-left); 13=audio-visual (top-right); 14=audio-visual (bottom-right); 21=audio (top-left); 22=audio (bottom-left); 23=audio (top-right); 24=audio (bottom-right)
- 1=visual, 2=audio-visual left, 3=audio-visual right, 4=audio left, 5=audio right
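The fields at the top of this list (subject, dataset, x, y, start, end) are shared across studies, while the remainder are study-specific. As an illustration only (the field names below are assumptions, not the dataset's actual identifiers), a typed per-fixation record with its derived duration could look like:

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    """One fixation record, restricted to the fields shared by all
    studies; names are illustrative, study-specific fields omitted."""
    subject_id: int
    dataset: str
    x: float        # x-position of fixation
    y: float        # y-position of fixation
    start_ms: float # start of fixation in ms
    end_ms: float   # end of fixation in ms
    on_image: bool  # True if fixation fell within the image borders

    @property
    def duration_ms(self) -> float:
        """Fixation duration, derived from start and end timestamps."""
        return self.end_ms - self.start_ms

fix = Fixation(1, "Baseline", 512.0, 384.0, 250.0, 498.0, True)
```

Keeping duration as a derived property rather than a stored field avoids inconsistencies between the timestamps and the duration column.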
Figure 2: Image category overview.
Each panel shows nine example images from a different category.