Moments in Time Dataset: One Million Videos for Event Understanding.

Mathew Monfort, Alex Andonian, Bolei Zhou, Kandan Ramakrishnan, Sarah Adel Bargal, Tom Yan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva.   

Abstract

We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds. Modeling the spatial-audio-temporal dynamics even for actions occurring in 3-second videos poses many challenges: meaningful events do not include only people, but also objects, animals, and natural phenomena; visual and auditory events can be symmetrical in time ("opening" is "closing" in reverse), and either transient or sustained. We describe the annotation process of our dataset (each video is tagged with one action or activity label among 339 different classes), analyze its scale and diversity in comparison to other large-scale video datasets for action recognition, and report results of several baseline models addressing separately, and jointly, three modalities: spatial, temporal and auditory. The Moments in Time dataset, designed to have large coverage and diversity of events in both visual and auditory modalities, can serve as a new challenge to develop models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis.

Year:  2019        PMID: 30802849     DOI: 10.1109/TPAMI.2019.2901464

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


  14 in total

1.  Six dimensions describe action understanding: The ACT-FAST taxonomy.

Authors:  Mark A Thornton; Diana I Tamir
Journal:  J Pers Soc Psychol       Date:  2021-09-30

2.  Social-affective features drive human representations of observed actions.

Authors:  Diana C Dima; Tyler M Tomita; Christopher J Honey; Leyla Isik
Journal:  Elife       Date:  2022-05-24       Impact factor: 8.713

3.  The Mouse Action Recognition System (MARS) software pipeline for automated analysis of social behaviors in mice.

Authors:  Cristina Segalin; Jalani Williams; Tomomi Karigo; May Hui; Moriel Zelikowsky; Jennifer J Sun; Pietro Perona; David J Anderson; Ann Kennedy
Journal:  Elife       Date:  2021-11-30       Impact factor: 8.713

4.  Distinct contributions of functional and deep neural network features to representational similarity of scenes in human brain and behavior.

Authors:  Iris Ia Groen; Michelle R Greene; Christopher Baldassano; Li Fei-Fei; Diane M Beck; Chris I Baker
Journal:  Elife       Date:  2018-03-07       Impact factor: 8.140

5.  Quantifying influence of human choice on the automated detection of Drosophila behavior by a supervised machine learning algorithm.

Authors:  Xubo Leng; Margot Wohl; Kenichi Ishii; Pavan Nayak; Kenta Asahina
Journal:  PLoS One       Date:  2020-12-16       Impact factor: 3.240

6.  Intelligence Is beyond Learning: A Context-Aware Artificial Intelligent System for Video Understanding.

Authors:  Ahmed Ghozia; Gamal Attiya; Emad Adly; Nawal El-Fishawy
Journal:  Comput Intell Neurosci       Date:  2020-12-23

7.  Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown.

Authors:  Silvan Heller; Viktor Gsteiger; Werner Bailer; Cathal Gurrin; Björn Þór Jónsson; Jakub Lokoč; Andreas Leibetseder; František Mejzlík; Ladislav Peška; Luca Rossetto; Konstantin Schall; Klaus Schoeffmann; Heiko Schuldt; Florian Spiess; Ly-Duyen Tran; Lucia Vadicamo; Patrik Veselý; Stefanos Vrochidis; Jiaxin Wu
Journal:  Int J Multimed Inf Retr       Date:  2022-01-26

8.  Larger visual changes compress time: The inverted effect of asemantic visual features on interval time perception.

Authors:  Sandra Malpica; Belen Masia; Laura Herman; Gordon Wetzstein; David M Eagleman; Diego Gutierrez; Zoya Bylinskii; Qi Sun
Journal:  PLoS One       Date:  2022-03-22       Impact factor: 3.240

9.  Horizontal Review on Video Surveillance for Smart Cities: Edge Devices, Applications, Datasets, and Future Trends.

Authors:  Mostafa Ahmed Ezzat; Mohamed A Abd El Ghany; Sultan Almotairi; Mohammed A-M Salem
Journal:  Sensors (Basel)       Date:  2021-05-06       Impact factor: 3.576

10.  You won't believe what this guy is doing with the potato: The ObjAct stimulus-set depicting human actions on congruent and incongruent objects.

Authors:  Yarden Shir; Naphtali Abudarham; Liad Mudrik
Journal:  Behav Res Methods       Date:  2021-02-25