
Cross-Modal Retrieval With CNN Visual Features: A New Baseline.

Yunchao Wei, Yao Zhao, Canyi Lu, Shikui Wei, Luoqi Liu, Zhenfeng Zhu, Shuicheng Yan.   

Abstract

Recently, convolutional neural network (CNN) visual features have demonstrated their power as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from a CNN model pretrained on ImageNet, with more than one million images from 1000 object categories, and used as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of the CNN visual features, a fine-tuning step is performed on each target data set, starting from the ImageNet-pretrained model, using the open-source Caffe CNN library. In addition, we propose a deep semantic matching method to address cross-modal retrieval for samples annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets demonstrate the superiority of CNN visual features for cross-modal retrieval.
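The retrieval step the abstract describes, matching image and text samples after both are projected into a common space, typically reduces to ranking gallery items by cosine similarity of their feature vectors. The sketch below illustrates that ranking step only; it is not the authors' code, and it assumes the CNN visual features (and the text-side features) have already been extracted and projected to the same dimensionality.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def cross_modal_retrieve(query_feats, gallery_feats, k=3):
    """Rank gallery items (e.g. texts) for each query (e.g. CNN image features).

    query_feats:   (n_queries, d) array of projected query features
    gallery_feats: (n_gallery, d) array of projected gallery features
    Returns an (n_queries, k) array of top-k gallery indices per query,
    ordered by decreasing cosine similarity.
    """
    q = l2_normalize(np.asarray(query_feats, dtype=float))
    g = l2_normalize(np.asarray(gallery_feats, dtype=float))
    sims = q @ g.T                      # cosine-similarity matrix
    return np.argsort(-sims, axis=1)[:, :k]
```

In practice the interesting part is how the two modalities are projected into the shared space (the paper's deep semantic matching method); once that is done, evaluation metrics such as mAP are computed over rankings like the one produced here.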

Year:  2016        PMID: 27046859     DOI: 10.1109/TCYB.2016.2519449

Source DB:  PubMed          Journal:  IEEE Trans Cybern        ISSN: 2168-2267            Impact factor:   11.448


  2 in total

1.  A top-down manner-based DCNN architecture for semantic image segmentation.

Authors:  Kai Qiao; Jian Chen; Linyuan Wang; Lei Zeng; Bin Yan
Journal:  PLoS One       Date:  2017-03-24       Impact factor: 3.240

2.  Emotion Recognition of Violin Playing Based on Big Data Analysis Technologies.

Authors:  Liangjun Zou
Journal:  J Environ Public Health       Date:  2022-09-15
