Babytalk: understanding and generating simple image descriptions.

Girish Kulkarni, Visruth Premraj, Vicente Ordonez, Sagnik Dhar, Siming Li, Yejin Choi, Alexander C Berg, Tamara L Berg.   

Abstract

We present a system to automatically generate natural language descriptions from images. This system consists of two parts. The first part, content planning, smooths the output of computer vision-based detection and recognition algorithms with statistics mined from large pools of visually descriptive text to determine the best content words to use to describe an image. The second part, surface realization, chooses words to construct natural language sentences based on the predicted content and general statistics from natural language. We present multiple approaches for the surface realization step and evaluate each using automatic measures of similarity to human-generated reference descriptions. We also collect forced-choice human evaluations between descriptions from the proposed generation system and descriptions from competing approaches. The proposed system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.
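
The two-part pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' method or code: the scores, the prior table, the multiplicative scoring rule, the fixed sentence template, and the names `detector_scores`, `text_prior`, `plan_content`, and `realize` are all hypothetical, chosen only to show how text statistics might override a noisy visual prediction before a sentence is produced.

```python
from itertools import product

# Hypothetical detector confidences for one image (made-up numbers).
detector_scores = {
    "object": {"dog": 0.6, "cat": 0.4},
    "attribute": {"wooden": 0.55, "furry": 0.45},
}

# Hypothetical prior mined from descriptive text: how plausible it is
# for an attribute to modify an object (made-up numbers).
text_prior = {
    ("furry", "dog"): 0.7,
    ("furry", "cat"): 0.6,
    ("wooden", "dog"): 0.05,
    ("wooden", "cat"): 0.05,
}


def plan_content(scores, prior):
    """Content planning (toy version): pick the (attribute, object) pair
    that maximises attribute score * object score * text prior."""
    candidates = product(scores["attribute"], scores["object"])
    return max(
        candidates,
        key=lambda pair: scores["attribute"][pair[0]]
        * scores["object"][pair[1]]
        * prior.get(pair, 0.0),
    )


def realize(attribute, obj):
    """Surface realization (toy version): fill a fixed sentence template."""
    article = "an" if attribute[0] in "aeiou" else "a"
    return f"This is a picture of {article} {attribute} {obj}."


if __name__ == "__main__":
    attribute, obj = plan_content(detector_scores, text_prior)
    print(realize(attribute, obj))
```

In this sketch the detector alone would prefer "wooden" (0.55 versus 0.45), but the text prior makes "furry dog" the highest-scoring pair, which is the kind of smoothing the abstract refers to. The abstract mentions several surface realization approaches; the template here is only the simplest possible stand-in.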

Year:  2013        PMID: 22848128     DOI: 10.1109/TPAMI.2012.162

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


  6 in total

1.  Attention based automated radiology report generation using CNN and LSTM.

Authors:  Mehreen Sirshar; Muhammad Faheem Khalil Paracha; Muhammad Usman Akram; Norah Saleh Alghamdi; Syeda Zainab Yousuf Zaidi; Tatheer Fatima
Journal:  PLoS One       Date:  2022-01-06       Impact factor: 3.240

2.  Visual-Text Reference Pretraining Model for Image Captioning.

Authors:  Pengfei Li; Min Zhang; Peijie Lin; Jian Wan; Ming Jiang
Journal:  Comput Intell Neurosci       Date:  2022-01-21

3.  Image Captioning with Bidirectional Semantic Attention-Based Guiding of Long Short-Term Memory.

Authors:  Pengfei Cao; Zhongyi Yang; Liang Sun; Yanchun Liang; Mary Qu Yang; Renchu Guan
Journal:  Neural Process Lett       Date:  2019-01-11       Impact factor: 2.908

4.  Automatic captioning for medical imaging (MIC): a rapid review of literature.

Authors:  Djamila-Romaissa Beddiar; Mourad Oussalah; Tapio Seppänen
Journal:  Artif Intell Rev       Date:  2022-09-17       Impact factor: 9.588

5.  Deep Learning in Medical Imaging: General Overview. (Review)

Authors:  June-Goo Lee; Sanghoon Jun; Young-Won Cho; Hyunna Lee; Guk Bae Kim; Joon Beom Seo; Namkug Kim
Journal:  Korean J Radiol       Date:  2017-05-19       Impact factor: 3.500

6.  An Overview of Image Caption Generation Methods. (Review)

Authors:  Haoran Wang; Yue Zhang; Xiaosheng Yu
Journal:  Comput Intell Neurosci       Date:  2020-01-09