
Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition.

Stefan Mathe, Cristian Sminchisescu.   

Abstract

Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both visual object and action recognition tasks in computer vision. While the sparse, interest-point based approach to recognition is not inconsistent with visual processing in biological systems that operate in 'saccade and fixate' regimes, the methodology and emphasis in the human and the computer vision communities remain sharply distinct. Here, we make three contributions aiming to bridge this gap. First, we complement existing state-of-the-art, large-scale, annotated dynamic computer vision datasets like Hollywood-2 [1] and UCF Sports [2] with human eye movements collected under the ecological constraints of visual action and scene context recognition tasks. To our knowledge these are the first large human eye tracking datasets to be collected and made publicly available for video (vision.imar.ro/eyetracking; 497,107 frames, each viewed by 19 subjects), unique in terms of their (a) large scale and computer vision relevance, (b) dynamic, video stimuli, (c) task control, as well as free-viewing. Second, we introduce novel dynamic consistency and alignment measures, which underline the remarkable stability of patterns of visual search among subjects. Third, we leverage the significant amount of collected data in order to pursue studies and build automatic, end-to-end trainable computer vision systems based on human eye movements. Our studies not only shed light on the differences between computer vision spatio-temporal interest point image sampling strategies and human fixations, as well as their impact on visual recognition performance, but also demonstrate that human fixations can be accurately predicted and, when used in an end-to-end automatic system that leverages advanced computer vision practice, can lead to state-of-the-art results.

Year:  2015        PMID: 26352449     DOI: 10.1109/TPAMI.2014.2366154

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  6 in total

1.  Stable Gaze Tracking with Filtering Based on Internet of Things.

Authors:  Peng Xiao; Jie Wu; Yu Wang; Jiannan Chi; Zhiliang Wang
Journal:  Sensors (Basel)       Date:  2022-04-20       Impact factor: 3.847

2.  Delving into Egocentric Actions.

Authors:  Yin Li; Zhefan Ye; James M Rehg
Journal:  Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit       Date:  2015-06

3.  Multi-task SonoEyeNet: Detection of Fetal Standardized Planes Assisted by Generated Sonographer Attention Maps.

Authors:  Yifan Cai; Harshita Sharma; Pierre Chatelain; J Alison Noble
Journal:  Med Image Comput Comput Assist Interv       Date:  2018-09-26

4.  SonoEyeNet: Standardized Fetal Ultrasound Plane Detection Informed by Eye Tracking.

Authors:  Y Cai; H Sharma; P Chatelain; J A Noble
Journal:  Proc IEEE Int Symp Biomed Imaging       Date:  2018-05-24

5.  A free database of eye movements watching "Hollywood" videoclips.

Authors:  Francisco M Costela; Russell L Woods
Journal:  Data Brief       Date:  2019-06-04

6.  An Unsupervised Framework for Online Spatiotemporal Detection of Activities of Daily Living by Hierarchical Activity Models.

Authors:  Farhood Negin; François Brémond
Journal:  Sensors (Basel)       Date:  2019-09-29       Impact factor: 3.576

