Re-Caption: Saliency-Enhanced Image Captioning through Two-Phase Learning.

Lian Zhou, Yuejie Zhang, Yugang Jiang, Tao Zhang, Weiguo Fan.   

Abstract

Visual and semantic saliency are important in image captioning. However, single-phase image captioning benefits little from limited saliency without a saliency predictor. In this paper, a novel saliency-enhanced re-captioning framework via two-phase learning is proposed to enhance single-phase image captioning. In the framework, visual saliency and semantic saliency are distilled from the first-phase model and fused with the second-phase model for model self-boosting. The visual saliency mechanism can generate a saliency map and a saliency mask for an image without learning a saliency map predictor. The semantic saliency mechanism sheds some light on the properties of words tagged with the part of speech Noun in a caption. In addition, another type of saliency, sample saliency, is proposed to explicitly compute the saliency degree of each sample, which helps make image captioning more robust. How to combine the above three types of saliency for a further performance boost is also examined. Our framework can treat an image captioning model as a saliency extractor, which may benefit other captioning models and related tasks. The experimental results on both the Flickr30k and MSCOCO datasets show that the saliency-enhanced models can obtain promising performance gains.

Year:  2019        PMID: 31331893     DOI: 10.1109/TIP.2019.2928144

Source DB:  PubMed          Journal:  IEEE Trans Image Process        ISSN: 1057-7149            Impact factor:   10.856


  1 in total

1.  Correlation Analysis of Japanese Literature and Psychotherapy Effects Based on an Equation Diagnosis Algorithm.

Authors:  Zhang Tingting
Journal:  Occup Ther Int       Date:  2022-06-11       Impact factor: 1.565
