Describing Video With Attention-Based Bidirectional LSTM.

Yi Bin, Yang Yang, Fumin Shen, Ning Xie, Heng Tao Shen, Xuelong Li.   

Abstract

Video captioning has attracted broad research attention in the multimedia community. However, most existing approaches rely heavily on static visual information or capture only local temporal knowledge (e.g., within 16 frames), and therefore struggle to describe motions accurately from a global view. In this paper, we propose a novel video captioning framework that integrates a bidirectional long short-term memory (BiLSTM) network and a soft attention mechanism to generate better global representations for videos and to enhance the recognition of lasting motions in videos. To generate video captions, we use another long short-term memory (LSTM) network as a decoder to fully explore global contextual information. The benefits of the proposed method are twofold: 1) the BiLSTM structure comprehensively preserves global temporal and visual information, and 2) the soft attention mechanism enables the language decoder to recognize and focus on the principal targets within complex content. We verify the effectiveness of the proposed framework on two widely used benchmarks, the Microsoft Video Description Corpus (MSVD) and MSR-Video to Text (MSR-VTT), and the experimental results demonstrate the superiority of the proposed approach over several state-of-the-art methods.
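
The soft attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the scoring function, so dot-product scoring is assumed here for brevity, and the names `soft_attention`, `encoder_states`, and `decoder_state` as well as the toy numbers are hypothetical. In a setup like the one described, `encoder_states` would stand for per-frame BiLSTM outputs and `decoder_state` for the current hidden state of the LSTM decoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(encoder_states, decoder_state):
    """Return (weights, context) for one decoding step.

    encoder_states: list of T vectors, one per frame (assumed encoder outputs).
    decoder_state:  one vector of the same dimension.
    Dot-product scoring is an assumption, not taken from the paper.
    """
    # Relevance score of each frame with respect to the decoder state.
    scores = [sum(h_d * s_d for h_d, s_d in zip(h, decoder_state))
              for h in encoder_states]
    # Normalise scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the frame vectors.
    dim = len(encoder_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context


# Toy example: three "frames" with 2-d features (made-up numbers).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 2.0]
weights, context = soft_attention(frames, query)
```

In this toy example the second and third frames score equally against the query and receive most of the weight, so the context vector is dominated by them. This is the sense in which a decoder can "focus" on some frames while still drawing on the whole sequence.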


Year:  2018        PMID: 29993730     DOI: 10.1109/TCYB.2018.2831447

Source DB:  PubMed          Journal:  IEEE Trans Cybern        ISSN: 2168-2267            Impact factor:   11.448


  6 in total

1.  Deep learning approaches based improved light weight U-Net with attention module for optic disc segmentation.

Authors:  R Shalini; Varun P Gopi
Journal:  Phys Eng Sci Med       Date:  2022-09-12

2.  Screening and functional prediction of differentially expressed genes in walnut endocarp during hardening period based on deep neural network under agricultural internet of things.

Authors:  Zhongzhong Guo; Shangqi Yu; Jiazhi Fu; Kai Ma; Rui Zhang
Journal:  PLoS One       Date:  2022-02-24       Impact factor: 3.240

3.  Automatic Multichannel Electrocardiogram Record Classification Using XGBoost Fusion Model.

Authors:  Xiaohong Ye; Yuanqi Huang; Qiang Lu
Journal:  Front Physiol       Date:  2022-04-14       Impact factor: 4.755

4.  Privacy-preserving household load forecasting based on non-intrusive load monitoring: A federated deep learning approach.

Authors:  Xinxin Zhou; Jingru Feng; Jian Wang; Jianhong Pan
Journal:  PeerJ Comput Sci       Date:  2022-08-02

5.  A CTR prediction model based on session interest.

Authors:  Qianqian Wang; Fang'ai Liu; Xiaohui Zhao; Qiaoqiao Tan
Journal:  PLoS One       Date:  2022-08-17       Impact factor: 3.752

6.  DeepCINAC: A Deep-Learning-Based Python Toolbox for Inferring Calcium Imaging Neuronal Activity Based on Movie Visualization.

Authors:  Julien Denis; Robin F Dard; Eleonora Quiroli; Rosa Cossart; Michel A Picardo
Journal:  eNeuro       Date:  2020-08-17
