
Hey Siri: How Effective are Common Voice Recognition Systems at Recognizing Dysphonic Voices?

Matthew L Rohlfing, Daniel P Buckley, Jacquelyn Piraquive, Cara E Stepp, Lauren F Tracy

Abstract

OBJECTIVES/HYPOTHESIS: Interaction with voice recognition systems, such as Siri™ and Alexa™, is an increasingly important part of everyday life. Patients with voice disorders may have difficulty with this technology, leading to frustration and reduction in quality of life. This study evaluates the ability of common voice recognition systems to transcribe dysphonic voices.
STUDY DESIGN: Retrospective evaluation of "Rainbow Passage" voice samples from patients with and without voice disorders.
METHODS: Participants with (n = 30) and without (n = 23) voice disorders were recorded reading the "Rainbow Passage". Recordings were played, at a standardized intensity and distance, to dictation programs on Apple iPhone 6S™, Apple iPhone 11 Pro™, and Google Voice™. Word recognition scores were calculated as the proportion of correctly transcribed words and were compared to auditory-perceptual and acoustic measures.
RESULTS: Mean word recognition scores for participants with and without voice disorders were, respectively, 68.6% and 91.9% for Apple iPhone 6S™ (P < .001), 71.2% and 93.7% for Apple iPhone 11 Pro™ (P < .001), and 68.7% and 93.8% for Google Voice™ (P < .001). There were strong, approximately linear associations between CAPE-V ratings of overall severity of dysphonia and word recognition score, with correlation coefficients (R²) of 0.609 (iPhone 6S™), 0.670 (iPhone 11 Pro™), and 0.619 (Google Voice™). These relationships persisted when controlling for diagnosis, age, gender, fundamental frequency, and speech rate (P < .001 for all systems).
CONCLUSION: Common voice recognition systems function well with nondysphonic voices but are poor at accurately transcribing dysphonic voices. There was a strong negative correlation between word recognition scores and perceptual ratings of dysphonia severity. As our society increasingly interfaces with automated voice recognition technology, the needs of patients with voice disorders should be considered.
LEVEL OF EVIDENCE: 4
Laryngoscope, 131:1599-1607, 2021.
© 2020 American Laryngological, Rhinological and Otological Society Inc, "The Triological Society" and American Laryngological Association (ALA).
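
The word recognition score described under METHODS (the proportion of correctly transcribed words) can be illustrated with a minimal sketch. The abstract does not state how transcribed words were matched to the passage, so the tokenization rule and the alignment via Python's `difflib.SequenceMatcher` below are assumptions made for illustration, not the authors' procedure. The function name `word_recognition_score` is likewise hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_recognition_score(reference, transcript):
    """Return the fraction of reference words that also appear, in order, in the transcript.

    Assumption: words are aligned with difflib's matching blocks; the study
    may have scored transcripts differently.
    """
    ref_words = tokenize(reference)
    hyp_words = tokenize(transcript)
    if not ref_words:
        raise ValueError("reference passage is empty")
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    # Sum the sizes of all in-order matching runs of words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


if __name__ == "__main__":
    passage = "When the sunlight strikes raindrops in the air"
    dictated = "when the sun light strikes rain drops in the air"
    print(f"{word_recognition_score(passage, dictated):.1%}")
```

Dividing by the number of words in the reference passage keeps the score between 0 and 1, so extra words inserted by the dictation program do not raise it.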

Keywords:  Dysphonia; hoarseness; mobile phone; technology; voice recognition

Year:  2020        PMID: 32949415      PMCID: PMC9009156          DOI: 10.1002/lary.29082

Source DB:  PubMed          Journal:  Laryngoscope        ISSN: 0023-852X            Impact factor:   2.970


