
Towards Personalized Image Captioning via Multimodal Memory Networks.

Cesc Chunseong Park, Byeongchang Kim, Gunhee Kim.   

Abstract

We address personalized image captioning, which generates a descriptive sentence for a user's image, accounting for prior knowledge such as her active vocabularies or writing style in her previous documents. As applications of personalized image captioning, we solve two post automation tasks in social networks: hashtag prediction and post generation. The hashtag prediction predicts a list of hashtags for an image, while the post generation creates a natural post text consisting of normal words, emojis, and even hashtags. We propose a novel personalized captioning model named Context Sequence Memory Network (CSMN). Its unique updates over existing memory networks include (i) exploiting memory as a repository for multiple types of context information, (ii) appending previously generated words into memory to capture long-term information, and (iii) adopting CNN memory structure to jointly represent nearby ordered memory slots for better context understanding. For evaluation, we collect a new dataset InstaPIC-1.1M, comprising 1.1M Instagram posts from 6.3K users. We further use the benchmark YFCC100M dataset to validate the generality of our approach. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show that the three novel features of the CSMN help enhance the performance of personalized image captioning over state-of-the-art captioning models.
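The abstract names three features of the CSMN: a memory holding multiple context types, appending previously generated words back into memory, and a CNN over nearby ordered memory slots. The toy sketch below illustrates only these three mechanics; it is not the authors' implementation, and all dimensions, the random placeholder embeddings, and the simple dot-product attention readout are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the three CSMN memory ideas from the abstract
# (not the authors' code; all sizes and embeddings are toy placeholders).

rng = np.random.default_rng(0)
d = 8  # embedding size (assumption)

def conv_memory(memory, W):
    """Feature (iii): a 1D CNN over ordered memory slots, so each output
    slot jointly represents itself and its neighbors.
    W has shape (width, d, d); output slot i mixes slots i-1 .. i+1."""
    width = W.shape[0]
    pad = width // 2
    padded = np.pad(memory, ((pad, pad), (0, 0)))
    return np.tanh(np.stack([
        sum(padded[i + j] @ W[j] for j in range(width))
        for i in range(memory.shape[0])
    ]))

def read(memory, query):
    """Soft attention readout over memory slots (toy dot-product form)."""
    scores = memory @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory

# Feature (i): memory starts as a repository of context slots
# (standing in for image features and the user's prior vocabulary).
memory = rng.standard_normal((5, d))
W = rng.standard_normal((3, d, d)) * 0.1
query = rng.standard_normal(d)

for _ in range(4):  # generate 4 toy "words"
    enc = conv_memory(memory, W)
    word_vec = read(enc, query)
    # Feature (ii): append the generated word back into memory,
    # so later steps can attend to the full output-so-far.
    memory = np.vstack([memory, word_vec])

print(memory.shape)  # -> (9, 8): memory grew from 5 to 9 slots
```

The growing memory is what lets the decoder capture long-term dependencies without a separate recurrent state, which the abstract cites as one of the model's updates over existing memory networks.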

Year:  2018        PMID: 29993735     DOI: 10.1109/TPAMI.2018.2824816

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


Reviews: 1 in total

Review 1.  An Overview of Image Caption Generation Methods.

Authors:  Haoran Wang; Yue Zhang; Xiaosheng Yu
Journal:  Comput Intell Neurosci       Date:  2020-01-09
