Tino Haderlein, Elmar Nöth, Anton Batliner, Ulrich Eysholdt, Frank Rosanowski.
Abstract
Objective assessment of intelligibility on the telephone is desirable for voice and speech assessment and rehabilitation. A total of 82 patients after partial laryngectomy read a standardized text which was synchronously recorded by a headset and via telephone. Five experienced raters assessed intelligibility perceptually on a five-point scale. Objective evaluation was performed by support vector regression on the word accuracy (WA) and word correctness (WR) of a speech recognition system, and a set of prosodic features. WA and WR alone exhibited correlations with human evaluation between |r| = 0.57 and |r| = 0.75. The correlation was r = 0.79 for headset and r = 0.86 for telephone recordings when prosodic features and WR were combined. The best feature subset was optimal for both signal qualities. It consists of WR, the average duration of the silent pauses before a word, the standard deviation of the fundamental frequency over the entire sample, the standard deviation of jitter, and the ratio of the durations of the voiced sections and the entire recording.
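The two recognizer-based measures and the correlation with human ratings can be illustrated with a short sketch. It assumes the conventional definitions of word correctness and word accuracy in terms of substitution, deletion, and insertion counts; the abstract does not spell these out, so the formulas, the function names, and the example numbers below are assumptions for illustration only, not the authors' implementation or data.

```python
from math import sqrt

def word_correctness(n_ref, substitutions, deletions):
    """Assumed conventional definition: percentage of reference words
    recognized correctly (insertions are not penalized)."""
    return 100.0 * (n_ref - substitutions - deletions) / n_ref

def word_accuracy(n_ref, substitutions, deletions, insertions):
    """Assumed conventional definition: like word correctness, but
    wrongly inserted words are penalized as well."""
    return 100.0 * (n_ref - substitutions - deletions - insertions) / n_ref

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Hypothetical values for four speakers (NOT data from the study):
# an objective score per speaker and a mean perceptual rating per speaker.
objective_scores = [85.0, 70.0, 55.0, 40.0]
mean_ratings = [1.4, 2.2, 3.0, 4.1]
r = pearson_r(objective_scores, mean_ratings)
print(f"r = {r:.2f}, |r| = {abs(r):.2f}")
```

The sign of r depends on the direction of the rating scale, which the abstract does not state; this is one reason to report |r|, as the abstract does for WA and WR alone.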
Year: 2011 PMID: 21875389 DOI: 10.3109/14015439.2011.607470
Source DB: PubMed Journal: Logoped Phoniatr Vocol ISSN: 1401-5439 Impact factor: 1.487