
Sequential Video VLAD: Training the Aggregation Locally and Temporally.

Youjiang Xu, Yahong Han, Richang Hong, Qi Tian.   

Abstract

Because characterizing videos from both spatial and temporal cues has been shown to be crucial for video analysis, the combination of convolutional neural networks and recurrent neural networks, i.e., recurrent convolution networks (RCNs), is a natural framework for learning spatio-temporal video features. In this paper, we develop a novel sequential vector of locally aggregated descriptors (VLAD) layer, named SeqVLAD, that combines a trainable VLAD encoding process and the RCN architecture into a single framework. In particular, sequential convolutional feature maps extracted from successive video frames are fed into the RCNs to learn soft spatio-temporal assignment parameters, so as to aggregate not only detailed spatial information within individual video frames but also fine motion information across successive video frames. Moreover, we improve the gated recurrent unit (GRU) of RCNs by sharing the input-to-hidden parameters, and propose an improved GRU-RCN architecture named shared GRU-RCN (SGRU-RCN). As a result, SGRU-RCN has fewer parameters and a lower risk of overfitting. In experiments, we evaluate SeqVLAD on the tasks of video captioning and video action recognition. Experimental results on the Microsoft Research Video Description Corpus, the Montreal Video Annotation Dataset, UCF101, and HMDB51 demonstrate the effectiveness and good performance of our method.
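The aggregation described in the abstract can be illustrated with a rough NumPy sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the function name `seq_vlad_sketch`, the parameter names `W`, `Uz`, `Ur`, `Uh`, the use of per-location linear maps in place of the convolutions an RCN would use, the particular GRU update convention, and the intra-normalization followed by L2 normalization (a common VLAD practice that the abstract does not state). The one idea taken from the abstract is that the input-to-hidden projection is computed once per frame and shared by the gates and the candidate state, and that the recurrent state yields soft assignments used to aggregate residuals to cluster centers.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def seq_vlad_sketch(x, centers, W, Uz, Ur, Uh):
    """Hypothetical sketch of sequential soft-assignment VLAD.

    x:        (T, N, D) features for T frames, N spatial locations, D channels
    centers:  (K, D) cluster centers
    W:        (D, K) input-to-hidden weights, shared by both gates and the candidate
    Uz/Ur/Uh: (K, K) hidden-to-hidden weights
    """
    T, N, D = x.shape
    K = centers.shape[0]
    h = np.zeros((N, K))        # recurrent state, one K-vector per location
    vlad = np.zeros((K, D))
    for t in range(T):
        wx = x[t] @ W           # shared input projection, computed once per frame
        z = sigmoid(wx + h @ Uz)                  # update gate
        r = sigmoid(wx + h @ Ur)                  # reset gate
        h_tilde = np.tanh(wx + (r * h) @ Uh)      # candidate state
        h = (1.0 - z) * h + z * h_tilde           # one common GRU convention (assumed)
        a = softmax(h, axis=1)                    # (N, K) soft assignments
        # Accumulate sum_n a[n, k] * (x[t, n] - centers[k]) for every center k.
        vlad += a.T @ x[t] - a.sum(axis=0)[:, None] * centers
    # Assumed normalization: per-center (intra) L2, then global L2.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    v = vlad.ravel()
    return v / (np.linalg.norm(v) + 1e-12)


# Usage with random data (all sizes are arbitrary).
rng = np.random.default_rng(0)
T, N, D, K = 4, 6, 8, 3
x = rng.normal(size=(T, N, D))
centers = rng.normal(size=(K, D))
W = rng.normal(size=(D, K)) * 0.1
Uz, Ur, Uh = (rng.normal(size=(K, K)) * 0.1 for _ in range(3))
v = seq_vlad_sketch(x, centers, W, Uz, Ur, Uh)
print(v.shape)  # (24,)
```

In this sketch, sharing `W` means a single `(D, K)` input matrix replaces the three separate input matrices an unshared GRU would need, which is the kind of parameter reduction the abstract attributes to SGRU-RCN.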

Year: 2018    PMID: 29985134    DOI: 10.1109/TIP.2018.2846664

Source DB: PubMed    Journal: IEEE Trans Image Process    ISSN: 1057-7149    Impact factor: 10.856


  3 in total

1.  Two-Way Affective Modeling for Hidden Movie Highlights' Extraction.

Authors:  Zheng Wang; Xinyu Yan; Wei Jiang; Meijun Sun
Journal:  Sensors (Basel)       Date:  2018-12-03       Impact factor: 3.576

2.  Medical Image Captioning Using Optimized Deep Learning Model.

Authors:  Arjun Singh; Jaya Krishna Raguru; Gaurav Prasad; Surbhi Chauhan; Pradeep Kumar Tiwari; Atef Zaguia; Mohammad Aman Ullah
Journal:  Comput Intell Neurosci       Date:  2022-03-09

3.  OPTICS-based Unsupervised Method for Flaking Degree Evaluation on the Murals in Mogao Grottoes.

Authors:  Pan Li; Meijun Sun; Zheng Wang; Bolong Chai
Journal:  Sci Rep       Date:  2018-10-29       Impact factor: 4.379
