Contextual Transformer Networks for Visual Recognition.

Yehao Li, Ting Yao, Yingwei Pan, Tao Mei.

Abstract

Transformers with self-attention have revolutionized the NLP field and have recently inspired Transformer-style architecture designs with competitive results in numerous CV tasks. Nevertheless, most existing designs directly employ self-attention over a 2D feature map to obtain the attention matrix based on pairs of isolated queries and keys, leaving the rich contexts among neighboring keys under-exploited. Here we design a novel Transformer-style module, the Contextual Transformer (CoT) block, for visual recognition. It capitalizes on the contextual information among input keys to guide the learning of a dynamic attention matrix and thus strengthens the capacity of visual representation. Technically, the CoT block first contextually encodes input keys via a 3×3 convolution, leading to a static contextual representation. We then concatenate the encoded keys with input queries to learn the dynamic multi-head attention matrix through two consecutive 1×1 convolutions. The learned attention matrix is multiplied by the values to obtain the dynamic contextual representation. The fusion of the static and dynamic contextual representations is finally taken as the output. Our CoT block can readily replace each 3×3 convolution in ResNet architectures, yielding a Transformer-style backbone named Contextual Transformer Networks (CoTNet). Through extensive experiments over a wide range of applications, we validate the superiority of CoTNet as a stronger backbone.
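
The data flow described in the abstract can be sketched in NumPy. This is a minimal illustration and not the authors' implementation. Several details are assumptions that the abstract does not specify: queries and keys are taken to be the input feature map itself, values are a 1×1 projection of it, the attention is single-head and shared across channels, it is normalized with a softmax over each 3×3 neighborhood, and the static and dynamic contexts are fused by a plain sum. All function and parameter names (cot_block, w_key, w_val, w_att1, w_att2) are hypothetical.

```python
import numpy as np

def conv3x3(x, w):
    """Zero-padded 3x3 convolution. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            # Accumulate the contribution of one kernel tap.
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx],
                             xp[:, dy:dy + h, dx:dx + wd])
    return out

def conv1x1(x, w):
    """1x1 convolution, i.e. a per-pixel linear map. w: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def cot_block(x, w_key, w_val, w_att1, w_att2):
    """Simplified CoT-style block (see the assumptions in the lead-in)."""
    _, h, wd = x.shape
    # 1. Static context: encode the keys (here, x itself) with a 3x3 conv.
    k1 = conv3x3(x, w_key)
    # 2. Values: assumed to be a 1x1 projection of the input.
    v = conv1x1(x, w_val)
    # 3. Concatenate encoded keys with queries (here, x itself), then apply
    #    two consecutive 1x1 convs to get 9 attention logits per pixel.
    hidden = np.maximum(conv1x1(np.concatenate([k1, x], axis=0), w_att1), 0.0)
    logits = conv1x1(hidden, w_att2)                      # (9, H, W)
    # Assumed normalization: softmax over the 3x3 neighborhood.
    logits = logits - logits.max(axis=0, keepdims=True)
    att = np.exp(logits)
    att /= att.sum(axis=0, keepdims=True)
    # 4. Dynamic context: attention-weighted sum of neighboring values.
    vp = np.pad(v, ((0, 0), (1, 1), (1, 1)))
    k2 = np.zeros_like(v)
    for dy in range(3):
        for dx in range(3):
            k2 += att[dy * 3 + dx] * vp[:, dy:dy + h, dx:dx + wd]
    # 5. Fusion: assumed here to be a plain sum of both contexts.
    return k1 + k2

# Example with random weights: 4 channels, 5x5 feature map, 8 hidden units.
rng = np.random.default_rng(0)
C, H, W, HID = 4, 5, 5, 8
x = rng.standard_normal((C, H, W))
out = cot_block(
    x,
    w_key=rng.standard_normal((C, C, 3, 3)),
    w_val=rng.standard_normal((C, C)),
    w_att1=rng.standard_normal((HID, 2 * C)),
    w_att2=rng.standard_normal((9, HID)),
)
print(out.shape)  # (4, 5, 5)
```

Because the output has the same shape as the input, a block of this kind can stand in wherever a stride-1, channel-preserving 3×3 convolution is used, which is the property the abstract relies on when swapping it into ResNet.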

Year: 2022 PMID: 35363608 DOI: 10.1109/TPAMI.2022.3164083

Source DB: PubMed Journal: IEEE Trans Pattern Anal Mach Intell ISSN: 0098-5589 Impact factor: 6.226

Cited

2 in total

1. CAFS: An Attention-Based Co-Segmentation Semi-Supervised Method for Nasopharyngeal Carcinoma Segmentation.

Authors: Yitong Chen; Guanghui Han; Tianyu Lin; Xiujian Liu
Journal: Sensors (Basel) Date: 2022-07-05 Impact factor: 3.847

2. An Attention-Based CoT-ResNet With Channel Shuffle Mechanism for Classification of Alzheimer's Disease Levels.

Authors: Chao Li; Quan Wang; Xuebin Liu; Bingliang Hu
Journal: Front Aging Neurosci Date: 2022-07-11 Impact factor: 5.702