Learning deep hierarchical visual feature coding.

Hanlin Goh, Nicolas Thome, Matthieu Cord, Joo-Hwee Lim.   

Abstract

In this paper, we propose a hybrid architecture that combines the image modeling strengths of the bag of words framework with the representational power and adaptability of learning deep architectures. Local gradient-based descriptors, such as SIFT, are encoded via a hierarchical coding scheme composed of spatial aggregating restricted Boltzmann machines (RBM). For each coding layer, we regularize the RBM by encouraging representations to fit both sparse and selective distributions. Supervised fine-tuning is used to enhance the quality of the visual representation for the categorization task. We performed a thorough experimental evaluation using three image categorization data sets. The hierarchical coding scheme achieved competitive categorization accuracies of 79.7% and 86.4% on the Caltech-101 and 15-Scenes data sets, respectively. The visual representations learned are compact and the model's inference is fast compared with sparse coding methods. The low-level representations of descriptors learned using this method result in generic features that we empirically found to be transferable between different image data sets. Further analysis reveals the significance of supervised fine-tuning when the architecture has two layers of representations as opposed to a single layer.
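The coding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single RBM layer with sigmoid hidden units whose activations serve as the code for one local descriptor, and it assumes max-pooling over a region to form the region-level vector. The names `rbm_encode`, `max_pool`, and `encode_region`, the toy weights, and the choice of pooling are all hypothetical; the spatial aggregation of neighboring descriptors, the sparse and selective regularization, and the supervised fine-tuning are not modeled here.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def rbm_encode(descriptor, weights, hidden_bias):
    """Feed-forward RBM inference for one descriptor (hypothetical helper).

    weights[j] is the weight vector of hidden unit j; the returned list
    holds the activation probability of each hidden unit.
    """
    return [
        sigmoid(b + sum(w * v for w, v in zip(w_j, descriptor)))
        for w_j, b in zip(weights, hidden_bias)
    ]


def max_pool(codes):
    # Assumed pooling choice: keep the strongest response of each
    # hidden unit across all descriptors in the region.
    return [max(column) for column in zip(*codes)]


def encode_region(descriptors, weights, hidden_bias):
    # Code every descriptor, then pool the codes into one vector.
    codes = [rbm_encode(d, weights, hidden_bias) for d in descriptors]
    return max_pool(codes)


if __name__ == "__main__":
    # Toy data: three 4-dimensional "descriptors" and two hidden units.
    toy_descriptors = [
        [0.1, 0.0, 0.3, 0.2],
        [0.0, 0.5, 0.1, 0.0],
        [0.4, 0.1, 0.0, 0.3],
    ]
    toy_weights = [
        [1.0, -1.0, 0.5, 0.0],
        [-0.5, 2.0, 0.0, 1.0],
    ]
    toy_bias = [0.0, -1.0]
    print(encode_region(toy_descriptors, toy_weights, toy_bias))
```

Because inference here is a single matrix-vector product followed by a pointwise nonlinearity, it needs no iterative optimization per descriptor, which is consistent with the abstract's remark that inference is fast compared with sparse coding methods.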

Year:  2014        PMID: 25420244     DOI: 10.1109/TNNLS.2014.2307532

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  1 in total

1.  M-SAC-VLADNet: A Multi-Path Deep Feature Coding Model for Visual Classification.

Authors:  Boheng Chen; Jie Li; Gang Wei; Biyun Ma
Journal:  Entropy (Basel)       Date:  2018-05-04       Impact factor: 2.524
