Contextual Transformer Networks for Visual Recognition.

Yehao Li, Ting Yao, Yingwei Pan, Tao Mei.   

Abstract

The Transformer with self-attention has revolutionized the NLP field and has recently inspired Transformer-style architecture designs that achieve competitive results in numerous CV tasks. Nevertheless, most existing designs directly employ self-attention over a 2D feature map to obtain the attention matrix based on pairs of isolated queries and keys, leaving the rich contexts among neighboring keys under-exploited. Here we design a novel Transformer-style module, i.e., the Contextual Transformer (CoT) block, for visual recognition. It capitalizes on the contextual information among input keys to guide the learning of a dynamic attention matrix and thus strengthens the capacity of visual representation. Technically, the CoT block first contextually encodes input keys via a 3×3 convolution, leading to a static contextual representation. We further concatenate the encoded keys with the input queries to learn the dynamic multi-head attention matrix through two consecutive 1×1 convolutions. The learnt attention matrix is multiplied by the values to obtain the dynamic contextual representation. The fusion of the static and dynamic contextual representations is finally taken as the output. Our CoT block can readily replace each 3×3 convolution in ResNet architectures, yielding a Transformer-style backbone named Contextual Transformer Networks (CoTNet). Through extensive experiments over a wide range of applications, we validate the superiority of CoTNet as a stronger backbone.
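
The four steps described in the abstract (static context from a 3×3 convolution, attention from two 1×1 convolutions over the concatenated keys and queries, multiplication by values, and fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made for illustration only: a single head instead of multi-head attention, a softmax taken over spatial positions, an element-wise product with the values, fusion by plain addition, ReLU nonlinearities, and the helper names `conv1x1`, `conv3x3`, `init_params`, and `cot_block_sketch`.

```python
import numpy as np


def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def conv3x3(x, w):
    """3x3 convolution with zero padding: w is (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad only the spatial axes
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            # Each kernel tap is a 1x1 convolution of a shifted input window.
            out += conv1x1(xp[:, dy:dy + h, dx:dx + wd], w[:, :, dy, dx])
    return out


def softmax(a, axis):
    """Numerically stable softmax along one axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(channels, rng, reduction=2):
    """Random weights for the sketch (hypothetical shapes and scale)."""
    hidden = max(channels // reduction, 1)
    return {
        "w_key": 0.1 * rng.standard_normal((channels, channels, 3, 3)),
        "w_value": 0.1 * rng.standard_normal((channels, channels)),
        "w_att1": 0.1 * rng.standard_normal((hidden, 2 * channels)),
        "w_att2": 0.1 * rng.standard_normal((channels, hidden)),
    }


def cot_block_sketch(x, params):
    """Simplified CoT-style block on one feature map x of shape (C, H, W)."""
    c, h, w = x.shape
    # Step 1: encode keys with a 3x3 convolution -> static context.
    k_static = np.maximum(conv3x3(x, params["w_key"]), 0.0)
    # Values come from a 1x1 convolution of the input (assumption).
    v = conv1x1(x, params["w_value"])
    # Step 2: concatenate encoded keys with queries (here the input itself)
    # and apply two consecutive 1x1 convolutions to get attention logits.
    qk = np.concatenate([k_static, x], axis=0)            # (2C, H, W)
    hidden = np.maximum(conv1x1(qk, params["w_att1"]), 0.0)
    logits = conv1x1(hidden, params["w_att2"])            # (C, H, W)
    # Step 3: normalise over spatial positions and weight the values.
    att = softmax(logits.reshape(c, h * w), axis=1).reshape(c, h, w)
    k_dynamic = att * v
    # Step 4: fuse static and dynamic contexts (plain sum in this sketch).
    return k_static + k_dynamic
```

Calling `cot_block_sketch(x, init_params(C, rng))` on an array `x` of shape `(C, H, W)` returns an array of the same shape, which is the property that would let such a block stand in for a 3×3 convolution with equal input and output channels.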

Year:  2022        PMID: 35363608     DOI: 10.1109/TPAMI.2022.3164083

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  2 in total

1.  CAFS: An Attention-Based Co-Segmentation Semi-Supervised Method for Nasopharyngeal Carcinoma Segmentation.

Authors:  Yitong Chen; Guanghui Han; Tianyu Lin; Xiujian Liu
Journal:  Sensors (Basel)       Date:  2022-07-05       Impact factor: 3.847

2.  An Attention-Based CoT-ResNet With Channel Shuffle Mechanism for Classification of Alzheimer's Disease Levels.

Authors:  Chao Li; Quan Wang; Xuebin Liu; Bingliang Hu
Journal:  Front Aging Neurosci       Date:  2022-07-11       Impact factor: 5.702
