Literature DB >> 33562715

An Attention-Enhanced Multi-Scale and Dual Sign Language Recognition Network Based on a Graph Convolution Network.

Authors:  Lu Meng; Ronghui Li

Abstract

Sign language is the most important means of communication for hearing-impaired people, and research on sign language recognition can help hearing people understand it. We reviewed classic sign language recognition methods and found that their accuracy is limited by redundant information, finger occlusion, motion blur, the diverse signing styles of different people, and so on. To overcome these shortcomings, we propose a multi-scale and dual sign language recognition network (SLR-Net) based on a graph convolutional network (GCN). The input data are RGB videos, from which we first extract skeleton data and then use that skeleton data for recognition. SLR-Net consists of three sub-modules: a multi-scale attention network (MSA), a multi-scale spatiotemporal attention network (MSSTA), and an attention-enhanced temporal convolution network (ATCN). MSA allows the GCN to learn dependencies between long-distance vertices; MSSTA can directly learn spatiotemporal features; and ATCN allows the network to better learn long-range temporal dependencies. Three attention mechanisms (multi-scale, spatiotemporal, and temporal) are proposed to further improve robustness and accuracy. In addition, we propose a keyframe extraction algorithm that greatly improves efficiency at the cost of a small loss in accuracy. Experimental results show that our method reaches a 98.08% accuracy rate on the CSL-500 dataset with its 500-word vocabulary. Even on the challenging DEVISIGN-L dataset with a 2000-word vocabulary, it reaches a 64.57% accuracy rate, outperforming other state-of-the-art sign language recognition methods.
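To make the GCN idea concrete, the following is a minimal, hypothetical sketch of a single spatial graph-convolution step over skeleton joints, using degree-normalized neighbor aggregation. This is not the authors' SLR-Net code; the function name, the toy 3-joint adjacency, and the weight shapes are illustrative assumptions only.

```python
import numpy as np

def graph_conv(X, A, W):
    """One spatial graph-convolution step (illustrative, not SLR-Net).
    Aggregates each joint's neighbors via the row-normalized adjacency,
    then projects the features.
    X: (V, C_in) per-joint features
    A: (V, V) adjacency matrix with self-loops
    W: (C_in, C_out) learnable projection weights
    """
    D_inv = np.diag(1.0 / A.sum(axis=1))  # inverse degree matrix
    return D_inv @ A @ X @ W              # normalize, aggregate, project

# Toy 3-joint chain (e.g., shoulder-elbow-wrist) with self-loops.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
X = np.eye(3)            # 3 joints, 3 input channels (one-hot features)
W = np.ones((3, 2))      # project to 2 output channels
out = graph_conv(X, A, W)
print(out.shape)  # (3, 2)
```

In a full skeleton-based recognizer, such spatial steps would be stacked with temporal convolutions over the frame axis, which is the role the abstract assigns to the ATCN sub-module.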

Keywords:  GCN; attention mechanism; keyframes extraction; large-vocabulary; sign language recognition

Year:  2021        PMID: 33562715      PMCID: PMC7915156          DOI: 10.3390/s21041120

Source DB:  PubMed          Journal:  Sensors (Basel)        ISSN: 1424-8220            Impact factor:   3.576


  2 in total

1.  Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos.

Authors:  Oscar Koller; Necati Cihan Camgoz; Hermann Ney; Richard Bowden
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2019-04-15       Impact factor: 6.226

2.  Convolutional and recurrent neural network for human activity recognition: Application on American sign language.

Authors:  Vincent Hernandez; Tomoya Suzuki; Gentiane Venture
Journal:  PLoS One       Date:  2020-02-19       Impact factor: 3.240

  1 in total

1.  Research on the Evaluation of Moral Education Effectiveness and Student Behavior in Universities under the Environment of Big Data.

Authors:  Rui Zhu
Journal:  Comput Intell Neurosci       Date:  2022-07-30
