
Text-Guided Neural Network Training for Image Recognition in Natural Scenes and Medicine.

Zizhao Zhang, Pingjun Chen, Xiaoshuang Shi, Lin Yang.   

Abstract

Convolutional neural networks (CNNs) are widely recognized as the foundation of machine vision systems. Conventionally, CNNs are taught to understand images using training images with human-annotated labels and no additional instructions. In this article, we take a different approach and explore guidance from text for neural network training. We present two versions of attention mechanisms that facilitate interactions between visual and semantic information and encourage CNNs to distill visual features effectively by leveraging semantic features. In contrast to dedicated text-image joint embedding methods, our method supports asynchronous training and inference: a trained model can classify images irrespective of text availability. This characteristic substantially improves the model's scalability to multiple (multimodal) vision tasks. We also apply the proposed method to medical imaging, where it learns from richer clinical knowledge and achieves attention-based interpretable decision-making. With comprehensive validation on two natural and two medical datasets, we demonstrate that our method can effectively use semantic knowledge to improve CNN performance. Our method achieves substantial improvement on the medical image datasets. It also achieves promising performance on multi-label image classification and caption-image retrieval, as well as excellent performance on phrase-based and multi-object localization on public benchmarks.
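
The abstract does not specify the two attention mechanisms, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes plain dot-product attention between a text embedding and CNN feature-map locations, and it assumes that when no text is available the model falls back to uniform (average) pooling. The names `attend`, `features`, and `text_vec` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, text_vec=None):
    """Pool spatial CNN features into one vector (illustrative only).

    features: list of N location vectors, each of length D.
    text_vec: optional length-D text embedding; None means no text.
    """
    if text_vec is None:
        # Assumption: without text, use uniform weights (average pooling).
        weights = [1.0 / len(features)] * len(features)
    else:
        # Weight each location by its dot-product similarity to the text.
        scores = [sum(f * t for f, t in zip(loc, text_vec)) for loc in features]
        weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * loc[d] for w, loc in zip(weights, features)) for d in range(dim)
    ]
    return pooled, weights


# Three feature-map locations with two channels each (toy values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled_with_text, weights_with_text = attend(features, text_vec=[2.0, 0.0])
pooled_without_text, weights_without_text = attend(features)
```

In this toy setup the same pooling function serves both cases: the text embedding, when present, shifts the weights toward the locations it matches, and the function still returns a usable feature vector when text is absent.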


Year:  2021        PMID: 31765305     DOI: 10.1109/TPAMI.2019.2955476

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  3 in total

1.  Interactive thyroid whole slide image diagnostic system using deep representation.

Authors:  Pingjun Chen; Xiaoshuang Shi; Yun Liang; Yuan Li; Lin Yang; Paul D Gader
Journal:  Comput Methods Programs Biomed       Date:  2020-06-27       Impact factor: 5.428

2.  Rule-based automatic diagnosis of thyroid nodules from intraoperative frozen sections using deep learning.

Authors:  Yuan Li; Pingjun Chen; Zhiyuan Li; Hai Su; Lin Yang; Dingrong Zhong
Journal:  Artif Intell Med       Date:  2020-08-09       Impact factor: 7.011

3.  Visual Analysis of College Sports Performance Based on Multimodal Knowledge Graph Optimization Neural Network.

Authors:  Nan Zheng; Meng Sun; Ye Yang
Journal:  Comput Intell Neurosci       Date:  2022-07-01
