
Surface Electromyography-Based Recognition, Synthesis, and Perception of Prosodic Subvocal Speech.

Jennifer M Vojtech, Michael D Chan, Bhawna Shiwani, Serge H Roy, James T Heaton, Geoffrey S Meltzner, Paola Contessa, Gianluca De Luca, Rupal Patel, Joshua C Kline.

Abstract

Purpose: This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. The system was evaluated for word recognition, prosodic classification, and listener perception of synthesized speech.

Method: sEMG signals were recorded from the face and neck as speakers with (n = 4) and without (n = 4) laryngectomy subvocally recited (silently mouthed) a speech corpus comprising 750 phrases (150 phrases with variable phrase-level stress). Corpus tokens were then translated into speech via personalized voice synthesis (n = 8 synthetic voices) and compared against phrases produced by each speaker when using their typical mode of communication (n = 4 natural voices, n = 4 electrolaryngeal [EL] voices). Naïve listeners (n = 12) evaluated synthetic, natural, and EL speech for acceptability and intelligibility in a visual sort-and-rate task, as well as phrasal stress discriminability via a classification mechanism.

Results: Recorded sEMG signals were processed to translate sEMG muscle activity into lexical content and categorize variations in phrase-level stress, achieving a mean accuracy of 96.3% (SD = 3.10%) and 91.2% (SD = 4.46%), respectively. Synthetic speech was significantly higher in acceptability and intelligibility than EL speech, also leading to greater phrasal stress classification accuracy, whereas natural speech was rated as the most acceptable and intelligible, with the greatest phrasal stress classification accuracy.

Conclusion: This proof-of-concept study establishes the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function.

Supplemental Material: https://doi.org/10.23641/asha.14558481

Year:  2021        PMID: 33979177      PMCID: PMC8740708          DOI: 10.1044/2021_JSLHR-20-00257

Source DB:  PubMed          Journal:  J Speech Lang Hear Res        ISSN: 1092-4388            Impact factor:   2.297


