A Unified Framework for Multilingual Speech Recognition in Air Traffic Control Systems.

Yi Lin, Dongyue Guo, Jianwei Zhang, Zhengmao Chen, Bo Yang.

Abstract

This work addresses robust speech recognition in air traffic control (ATC) by proposing a processing paradigm that integrates multilingual speech recognition into a single framework of three cascaded modules: an acoustic model (AM), a pronunciation model (PM), and a language model (LM). The AM converts ATC speech into a phoneme-based text sequence, which the PM then translates into the word-based sequence that is the ultimate goal of this research. The LM corrects both phoneme- and word-based errors in the decoding results. The AM combines a convolutional neural network (CNN) and a recurrent neural network (RNN) to capture the spatial and temporal dependencies of the speech features and is trained with the connectionist temporal classification (CTC) loss. To cope with radio transmission noise and speaker diversity, a multiscale CNN architecture is proposed to fit the diverse data distributions and improve performance. Phoneme-to-word translation is handled by a proposed machine translation PM with an encoder-decoder architecture. RNN-based LMs are trained to account for the code-switching that characterizes ATC speech by building dependencies with common words. The approach is validated on large amounts of real Chinese and English ATC recordings and achieves a 3.95% label error rate on Chinese characters and English words, outperforming other popular approaches. Its decoding efficiency is comparable to that of an end-to-end model, and its generalizability is validated on several open corpora, making it suitable for real-time use in downstream ATC applications such as ATC prediction and safety checking.
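Two pieces of the pipeline described above lend themselves to a short illustration: turning frame-level CTC outputs into a label sequence, and scoring the result with a label error rate. The sketch below is not the authors' implementation. It assumes greedy (best-path) CTC decoding with blank index 0, and it assumes the label error rate is the usual edit distance between hypothesis and reference, normalized by the reference length, with Chinese characters and English words each counted as one label. All function names are hypothetical.

```python
from itertools import groupby
from typing import Hashable, List, Sequence

BLANK = 0  # assumed index of the CTC blank label (not stated in the abstract)


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse per-frame argmax labels: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(refs: Sequence[Sequence[Hashable]],
                     hyps: Sequence[Sequence[Hashable]]) -> float:
    """Total edit distance over all utterances divided by total reference labels."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_labels = sum(len(r) for r in refs)
    return total_edits / total_labels
```

Under these assumptions, a code-switched reference is simply a list that mixes single Chinese characters and whole English words, so one substituted label in a four-label utterance gives an error rate of 0.25.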

Year: 2021 PMID: 32833649 DOI: 10.1109/TNNLS.2020.3015830

Source DB: PubMed Journal: IEEE Trans Neural Netw Learn Syst ISSN: 2162-237X Impact factor: 10.451

Cited by

2 in total

1. Computation and memory optimized spectral domain convolutional neural network for throughput and energy-efficient inference.

Authors: Shahriyar Masud Rizvi; Ab Al-Hadi Ab Rahman; Usman Ullah Sheikh; Kazi Ahmed Asif Fuad; Hafiz Muhammad Faisal Shehzad
Journal: Appl Intell (Dordr) Date: 2022-06-11 Impact factor: 5.019

2. A proposed artificial intelligence-based real-time speech-to-text to sign language translator for South African official languages for the COVID-19 era and beyond: In pursuit of solutions for the hearing impaired. (Review)

Authors: Milka C Madahana; Katijah Khoza-Shangase; Nomfundo Moroe; Daniel Mayombo; Otis Nyandoro; John Ekoru
Journal: S Afr J Commun Disord Date: 2022-08-19