
Region-based Activity Recognition Using Conditional GAN.

Xinyu Li, Yanyi Zhang, Jianyu Zhang, Yueyang Chen, Huangcan Li, Ivan Marsic, Randall S Burd.

Abstract

We present a method for activity recognition that first estimates the activity performer's location and then uses that location together with the input data for recognition. Existing approaches extract features directly from video frames or entire videos and treat the classifier as a black box. Our method first locates the activity in each input video frame by generating an activity mask using a conditional generative adversarial network (cGAN). The generated mask is appended to the color channels of the input image and fed into a VGG-LSTM network for activity recognition. To test our system, we produced two datasets with manually created masks, one containing Olympic sports activities and the other containing trauma resuscitation activities. Our system makes an activity prediction for each video frame and achieves performance comparable to state-of-the-art systems while simultaneously outlining the location of the activity. We show how the generated masks facilitate the learning of features that are representative of the activity rather than of incidental surrounding information.
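The step of appending the generated mask to the color channels can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `append_mask_channel`, the NumPy array layout (frames, height, width, channels), and the single-channel mask with values in [0, 1] are all assumptions made for illustration. The abstract does not specify the tensor layout or how the mask is produced and normalized.

```python
import numpy as np


def append_mask_channel(frames, masks):
    """Stack a single-channel activity mask onto RGB frames.

    Hypothetical helper; layout and dtypes are assumptions.
    frames: array of shape (T, H, W, 3), RGB video frames
    masks:  array of shape (T, H, W), one mask per frame
    returns: float32 array of shape (T, H, W, 4)
    """
    frames = np.asarray(frames, dtype=np.float32)
    masks = np.asarray(masks, dtype=np.float32)
    if frames.ndim != 4 or frames.shape[-1] != 3:
        raise ValueError("frames must have shape (T, H, W, 3)")
    if frames.shape[:3] != masks.shape:
        raise ValueError("masks must have shape (T, H, W) matching frames")
    # Add a trailing channel axis to the mask, then concatenate it
    # after the three color channels.
    return np.concatenate([frames, masks[..., np.newaxis]], axis=-1)
```

Under these assumptions, the resulting four-channel frames would be the per-frame input to a recognition network whose first convolutional layer accepts four input channels instead of three.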


Keywords:  Activity Recognition; Deep Learning; Generative Adversarial Network; Localization

Year:  2017        PMID: 30381807      PMCID: PMC6205507          DOI: 10.1145/3123266.3123365

Source DB:  PubMed          Journal:  Proc ACM Int Conf Multimed


