

Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition.

Yang Liu, Zhaoyang Lu, Jing Li, Tao Yang, Chao Yao.

Abstract

Existing deep learning methods for action recognition in videos require a large number of labeled videos for training, and collecting these labels is labor-intensive and time-consuming. For the same action, the knowledge learned from different media types, e.g., videos and images, may be related and complementary. However, due to the domain shifts and heterogeneous feature representations between videos and images, the performance of classifiers trained on images may degrade dramatically when they are directly deployed to videos. In this paper, we propose a novel method, named Deep Image-to-Video Adaptation and Fusion Networks (DIVAFN), to enhance action recognition in videos by transferring knowledge from images using video keyframes as a bridge. DIVAFN is a deep learning model that integrates domain-invariant representation learning and cross-modal feature fusion into a unified optimization framework. Specifically, we design an efficient cross-modal similarity metric to reduce the modality shift among images, keyframes and videos. Then, we adopt an autoencoder architecture whose hidden layer is constrained to be the semantic representation of the action class names. In this way, when the autoencoder is adopted to project the learned features from different domains to the same space, more compact, informative and discriminative representations can be obtained. Finally, the concatenation of the learned semantic feature representations from these three autoencoders is used to train the classifier for action recognition in videos. Comprehensive experiments on four real-world datasets show that our method outperforms some state-of-the-art domain adaptation and action recognition methods.
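The autoencoder and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the deep networks for a tied-weight linear autoencoder solved in closed form, assumes one autoencoder per domain (images, keyframes, videos), and uses random vectors as a stand-in for the semantic representations of the class names. The function name `fit_semantic_autoencoder`, the weight `lam`, and all dimensions are hypothetical.

```python
import numpy as np

def fit_semantic_autoencoder(X, S, lam=1.0):
    """Tied-weight linear autoencoder whose hidden layer is pinned to S.

    Encoder: X @ W ~ S, decoder: S @ W.T ~ X.
    Minimises ||X - S W^T||_F^2 + lam * ||X W - S||_F^2. Setting the
    gradient to zero gives the Sylvester equation A W + W B = C.
    """
    d, k = X.shape[1], S.shape[1]
    A = lam * X.T @ X            # d x d
    B = S.T @ S                  # k x k
    C = (1.0 + lam) * X.T @ S    # d x k
    # Column-major vectorisation:
    # vec(A W) = (I_k kron A) vec(W), vec(W B) = (B^T kron I_d) vec(W)
    M = np.kron(np.eye(k), A) + np.kron(B.T, np.eye(d))
    w = np.linalg.solve(M, C.flatten(order="F"))
    return w.reshape((d, k), order="F")

rng = np.random.default_rng(0)
n_classes, k, n = 4, 3, 40

# Stand-in for class-name embeddings (assumption: random vectors).
class_vectors = rng.normal(size=(n_classes, k))
labels = rng.integers(0, n_classes, size=n)
S = class_vectors[labels]        # one semantic target per sample, n x k

# Synthetic features for three domains with different dimensionalities.
views = {
    name: S @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n, d))
    for name, d in [("image", 6), ("keyframe", 5), ("video", 8)]
}

# One autoencoder per domain, then concatenate the encoded features.
encoders = {name: fit_semantic_autoencoder(X, S) for name, X in views.items()}
fused = np.concatenate([views[name] @ encoders[name] for name in views], axis=1)
print(fused.shape)
```

The `fused` matrix has one row per sample and three semantic-sized blocks per row. Under these assumptions it could be passed to any standard classifier as the training input.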

Year: 2019 PMID: 31831421 DOI: 10.1109/TIP.2019.2957930

Source DB: PubMed Journal: IEEE Trans Image Process ISSN: 1057-7149 Impact factor: 10.856

Cited

2 in total

1. Action Recognition Using Action Sequences Optimization and Two-Stream 3D Dilated Neural Network.

Authors: Xin Xiong; Weidong Min; Qing Han; Qi Wang; Cheng Zha
Journal: Comput Intell Neurosci Date: 2022-06-13

2. Development and Validation of a 3-Dimensional Convolutional Neural Network for Automatic Surgical Skill Assessment Based on Spatiotemporal Video Analysis.

Authors: Daichi Kitaguchi; Nobuyoshi Takeshita; Hiroki Matsuzaki; Takahiro Igaki; Hiro Hasegawa; Masaaki Ito
Journal: JAMA Netw Open Date: 2021-08-02
