
Voice Conversion for Persons with Amyotrophic Lateral Sclerosis.

Yunxin Zhao, Mili Kuruvilla-Dugdale, Minguang Song.   

Abstract

Amyotrophic lateral sclerosis (ALS) results in progressive paralysis of voluntary muscles throughout the body. As speech deteriorates, individuals rely on pre-programmed messages available on commercial speech generating devices to communicate using one of the generic electronic voices on the device. To replace these generic voices and restore vocal identity, our aim is to develop personalized voices for people with ALS via the approach of voice conversion. The task is challenging because very few people have large quantities of their premorbid healthy speech recorded. Therefore, we have to rely on small quantities of dysarthric speech concomitant with an individual's disease stage. Further, progressive fatigue prohibits acquisition of large speech datasets and individuals display a range of dysarthria severities resulting from breathing, voice, articulation, resonance, and prosody disturbances. As the first step to address these problems, we use healthy source speakers and propose the approach of combining a structured sparse spectral transform with multiple linear regression-based frequency warping prediction for spectral conversion, and interpolating the transformed spectral frames for speech rate modification. Our experimental data included four healthy source speakers from the ARCTIC dataset, and four target ALS speakers with mild to severe dysarthria, forming 16 speaker pairs. Subjective listening evaluations showed that on average, (i) the proposed approach improved speech intelligibility by about 80% over the target speakers' speech, (ii) the converted voice was 3 times more similar to the target speakers' speech than to the source speakers' speech, and (iii) the converted speech quality was close to the MOS scale "good" relative to the source speakers' speech being "excellent."
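The speech rate modification step mentioned in the abstract, interpolating spectral frames along the time axis, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `change_rate`, the `rate` parameter, and the choice of plain linear interpolation between neighboring frames are assumptions made only for illustration, and each frame is assumed to be a list of spectral feature values.

```python
from typing import List

Frame = List[float]  # one spectral frame (assumed: a list of feature values)


def change_rate(frames: List[Frame], rate: float) -> List[Frame]:
    """Resample a frame sequence in time by linear interpolation.

    Hypothetical helper: rate > 1 yields fewer frames (faster speech),
    rate < 1 yields more frames (slower speech).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not frames:
        return []
    n_in = len(frames)
    n_out = max(1, round(n_in / rate))
    # Degenerate cases: nothing to interpolate between.
    if n_in == 1 or n_out == 1:
        return [list(frames[0]) for _ in range(n_out)]
    step = (n_in - 1) / (n_out - 1)  # spacing of output frames on the input axis
    out: List[Frame] = []
    for j in range(n_out):
        pos = j * step
        lo = min(int(pos), n_in - 2)  # index of the left neighboring frame
        w = pos - lo                  # weight of the right neighboring frame
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[lo + 1])])
    return out


if __name__ == "__main__":
    demo = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]
    slowed = change_rate(demo, 0.5)  # half speed: about twice as many frames
    print(len(slowed), slowed[0], slowed[-1])
```

In this sketch the first and last frames are preserved and intermediate frames are weighted averages of their two nearest neighbors, so the spectral trajectory is stretched or compressed without changing the spectral content of each frame.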


Year: 2019    PMID: 31880571    PMCID: PMC7314644    DOI: 10.1109/JBHI.2019.2961844

Source DB: PubMed    Journal: IEEE J Biomed Health Inform    ISSN: 2168-2194    Impact factor: 5.772


