
Semi-supervised hashing for large-scale search.

Jun Wang, Sanjiv Kumar, Shih-Fu Chang.

Abstract

Hashing-based approximate nearest neighbor (ANN) search in huge databases has become popular due to its computational and memory efficiency. Popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions based on random or principal projections. The resulting hashes are either not very accurate or inefficient. Moreover, these methods are designed for a given metric similarity. In contrast, semantic similarity is usually given in terms of pairwise labels of samples. Supervised hashing methods exist that can handle such semantic similarity, but they are prone to overfitting when labeled data are scarce or noisy. In this work, we propose a semi-supervised hashing (SSH) framework that minimizes empirical error over the labeled set, combined with an information-theoretic regularizer over both labeled and unlabeled sets. Based on this framework, we present three semi-supervised hashing methods: orthogonal hashing, nonorthogonal hashing, and sequential hashing. In particular, the sequential hashing method generates robust codes in which each hash function is designed to correct the errors made by the previous ones. We further show that the sequential learning paradigm can be extended to unsupervised domains where no labeled pairs are available. Extensive experiments on four large datasets (up to 80 million samples) demonstrate the superior performance of the proposed SSH methods over state-of-the-art supervised and unsupervised hashing techniques.
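
The abstract does not spell out the objective, so the following is only a minimal sketch of how the orthogonal variant might look, under stated assumptions: the empirical fit on labeled pairs is taken as the trace term built from a pairwise label matrix S (+1 similar, -1 dissimilar), the regularizer is approximated by the variance of the projections over all data, and the projections are the top eigenvectors of the resulting "adjusted covariance" matrix. The function names, the weight `eta`, and the use of NumPy are illustrative choices, not taken from this record.

```python
import numpy as np

def fit_orthogonal_ssh(X, labeled_idx, S, n_bits, eta=1.0):
    """Learn n_bits orthogonal projections (illustrative sketch).

    X           : (n_samples, n_features) array of labeled and unlabeled data
    labeled_idx : indices of the rows of X that carry pairwise labels
    S           : (l, l) symmetric matrix, +1 for similar pairs, -1 for dissimilar
    eta         : assumed weight of the variance-based regularizer
    """
    mean = X.mean(axis=0)
    Xc = X - mean                      # zero-center so sign() splits the data
    Xl = Xc[labeled_idx]
    # Supervised term from labeled pairs plus unsupervised variance term.
    M = Xl.T @ S @ Xl + eta * (Xc.T @ Xc)
    M = (M + M.T) / 2.0                # guard against round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:n_bits]
    W = eigvecs[:, top]                # orthonormal columns, one per bit
    return mean, W

def hash_codes(X, mean, W):
    """Binary codes: one bit per projection, thresholded at zero."""
    return ((X - mean) @ W > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")
```

In this sketch, search amounts to calling `hash_codes` on the database once, hashing each query the same way, and ranking with `hamming_rank`. The nonorthogonal and sequential variants named in the abstract would change how `W` is learned and are not covered here.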


Year:  2012        PMID: 22331853     DOI: 10.1109/TPAMI.2012.48

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  4 in total

1.  Categorization of Images Using Autoencoder Hashing and Training of Intra Bin Classifiers for Image Classification and Annotation.

Authors:  P Mercy Rajaselvi Beaulah; D Manjula; Vijayan Sugumaran
Journal:  J Med Syst       Date:  2018-06-11       Impact factor: 4.460

2.  High-throughput histopathological image analysis via robust cell segmentation and hashing.

Authors:  Xiaofan Zhang; Fuyong Xing; Hai Su; Lin Yang; Shaoting Zhang
Journal:  Med Image Anal       Date:  2015-11-09       Impact factor: 8.545

3.  Rapid Retrieval of Lung Nodule CT Images Based on Hashing and Pruning Methods.

Authors:  Ling Pan; Yan Qiang; Jie Yuan; Lidong Wu
Journal:  Biomed Res Int       Date:  2016-11-22       Impact factor: 3.411

4.  Privacy-Preserving Patient Similarity Learning in a Federated Environment: Development and Analysis.

Authors:  Junghye Lee; Jimeng Sun; Fei Wang; Shuang Wang; Chi-Hyuck Jun; Xiaoqian Jiang
Journal:  JMIR Med Inform       Date:  2018-04-13
