MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval.

Xin Huang, Yuxin Peng, Mingkuan Yuan.   

Abstract

Cross-modal retrieval has drawn wide interest for retrieval across different modalities (such as text, image, video, audio, and 3-D model). However, existing methods based on deep neural networks often face the challenge of insufficient cross-modal training data, which limits training effectiveness and easily leads to overfitting. Transfer learning is usually adopted to relieve the problem of insufficient training data, but it mainly focuses on knowledge transfer from large-scale datasets as a single-modal source domain (such as ImageNet) to a single-modal target domain. In fact, such large-scale single-modal datasets also contain rich modal-independent semantic knowledge that can be shared across different modalities. Moreover, large-scale cross-modal datasets are very labor-intensive to collect and label, so it is important to fully exploit the knowledge in single-modal datasets to boost cross-modal retrieval. To achieve this goal, this paper proposes a modal-adversarial hybrid transfer network (MHTN), which aims to realize knowledge transfer from a single-modal source domain to a cross-modal target domain and to learn a cross-modal common representation. It is an end-to-end architecture with two subnetworks. First, a modal-sharing knowledge transfer subnetwork is proposed to jointly transfer knowledge from a single modality in the source domain to all modalities in the target domain with a star network structure, which distills modal-independent supplementary knowledge for promoting cross-modal common representation learning. Second, a modal-adversarial semantic learning subnetwork is proposed to construct an adversarial training mechanism between the common representation generator and the modality discriminator, making the common representation discriminative for semantics but indiscriminative for modalities, which enhances cross-modal semantic consistency during the transfer process. Comprehensive experiments on four widely used datasets show the effectiveness of MHTN.
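
The adversarial mechanism between the common representation generator and the modality discriminator can be illustrated with a toy objective. The abstract gives no loss formulas, so the following is only a minimal sketch under assumptions: the function names, the weight `lam`, and the "semantic loss minus weighted modality loss" form are hypothetical and follow a common pattern in adversarial feature learning, not necessarily the formulation used in the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def discriminator_loss(modality_logits, modality_label):
    """The modality discriminator tries to identify the source modality."""
    return cross_entropy(modality_logits, modality_label)

def generator_loss(semantic_logits, semantic_label,
                   modality_logits, modality_label, lam=1.0):
    """Assumed generator objective: be correct about the semantic class
    while making the discriminator's modality loss as large as possible."""
    semantic = cross_entropy(semantic_logits, semantic_label)
    modality = cross_entropy(modality_logits, modality_label)
    return semantic - lam * modality

# Same semantic prediction, two different discriminator outcomes.
semantic_logits = [2.0, 0.1, -1.0]  # hypothetical 3-class semantic scores
fooled = generator_loss(semantic_logits, 0, [0.0, 0.0], 0)    # discriminator is guessing
exposed = generator_loss(semantic_logits, 0, [4.0, -4.0], 0)  # discriminator is confident and right
```

Under this assumed objective, `fooled` is lower than `exposed`: the generator is rewarded when the representation carries no usable modality cue, which corresponds to a representation that is discriminative for semantics but indiscriminative for modalities.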

Year:  2018        PMID: 30530383     DOI: 10.1109/TCYB.2018.2879846

Source DB:  PubMed          Journal:  IEEE Trans Cybern        ISSN: 2168-2267            Impact factor:   11.448


  2 in total

1.  Application of Radar Solutions for the Purpose of Bird Tracking Systems Based on Video Observation.

Authors:  Ksawery Krenc; Dawid Gradolewski; Damian Dziak; Adam Kawalec
Journal:  Sensors (Basel)       Date:  2022-05-11       Impact factor: 3.847

2.  Design of Neural Network Model for Cross-Media Audio and Video Score Recognition Based on Convolutional Neural Network Model.

Authors:  Hongxia Liu
Journal:  Comput Intell Neurosci       Date:  2022-06-13