E G Devine, S A Gaehde, A C Curtis. Boston Veterans Administration Medical Center, Boston, Massachusetts 02130, USA. devine.eric@boston.va.gov
Abstract
OBJECTIVE: To compare out-of-box performance of three commercially available continuous speech recognition software packages: IBM ViaVoice 98 with General Medicine Vocabulary; Dragon Systems NaturallySpeaking Medical Suite, version 3.0; and L&H Voice Xpress for Medicine, General Medicine Edition, version 1.2.
DESIGN: Twelve physicians completed minimal training with each software package and then dictated a medical progress note and discharge summary drawn from actual records.
MEASUREMENTS: Errors in recognition of medical vocabulary, medical abbreviations, and general English vocabulary were compared across packages using a rigorous, standardized approach to scoring.
RESULTS: The IBM software was found to have the lowest mean error rate for vocabulary recognition (7.0 to 9.1 percent) followed by the L&H software (13.4 to 15.1 percent) and then Dragon software (14.1 to 15.2 percent). The IBM software was found to perform better than both the Dragon and the L&H software in the recognition of general English vocabulary and medical abbreviations.
CONCLUSION: This study is one of a few attempts at a robust evaluation of the performance of continuous speech recognition software. Results of this study suggest that with minimal training, the IBM software outperforms the other products in the domain of general medicine; however, results may vary with domain. Additional training is likely to improve the out-of-box performance of all three products. Although the IBM software was found to have the lowest overall error rate, successive generations of speech recognition software are likely to surpass the accuracy rates found in this investigation.
Authors: Edward C Callaway; Clifford F Sweet; Eliot Siegel; John M Reiser; Douglas P Beall Journal: J Digit Imaging Date: 2002-04-30 Impact factor: 4.056
Authors: David N Mohr; David W Turner; Gregory R Pond; Joseph S Kamath; Cathy B De Vos; Paul C Carpenter Journal: J Am Med Inform Assoc Date: 2003 Jan-Feb Impact factor: 4.497
Authors: Anthony D Harris; Jessina C McGregor; Eli N Perencevich; Jon P Furuno; Jingkun Zhu; Dan E Peterson; Joseph Finkelstein Journal: J Am Med Inform Assoc Date: 2005-10-12 Impact factor: 4.497
Authors: Linda Dawson; Maree Johnson; Hanna Suominen; Jim Basilakis; Paula Sanchez; Dominique Estival; Barbara Kelly; Leif Hanlen Journal: J Med Syst Date: 2014-05-15 Impact factor: 4.460
Authors: Hanna Suominen; Maree Johnson; Liyuan Zhou; Paula Sanchez; Raul Sirel; Jim Basilakis; Leif Hanlen; Dominique Estival; Linda Dawson; Barbara Kelly Journal: J Am Med Inform Assoc Date: 2014-10-21 Impact factor: 4.497