
Privacy-Preserving Deep Speaker Separation for Smartphone-Based Passive Speech Assessment.

Apiwat Ditthapron, Emmanuel O Agu, Adam C Lammert.

Abstract

Goal: Smartphones can be used to passively assess and monitor patients' speech impairments caused by ailments such as Parkinson's disease, Traumatic Brain Injury (TBI), Post-Traumatic Stress Disorder (PTSD) and neurodegenerative diseases such as Alzheimer's disease and dementia. However, passive audio recordings in natural settings often capture the speech of non-target speakers (cross-talk). Consequently, speaker separation, which identifies the target speaker's speech in audio recordings containing two or more speakers' voices, is a crucial pre-processing step in such scenarios. Prior speech separation methods analyzed raw audio. However, to preserve speaker privacy, passive smartphone recording and machine learning-based speech assessment often operate on derived speech features such as Mel-Frequency Cepstral Coefficients (MFCCs) rather than on raw audio. In this paper, we propose a novel Deep MFCC bAsed SpeaKer Separation (Deep-MASKS) method.
Methods: Deep-MASKS uses an autoencoder to reconstruct MFCC components of an individual's speech from an i-vector, x-vector or d-vector representation of their speech learned during the enrollment period. Deep-MASKS utilizes a Deep Neural Network (DNN) for MFCC signal reconstructions, which yields a more accurate, higher-order function compared to prior work that utilized a mask. Unlike prior work that operates on utterances, Deep-MASKS operates on continuous audio recordings.
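The conditioning idea in the Methods paragraph can be illustrated with a minimal sketch. It is not the authors' architecture: the class name `SpeakerConditionedAutoencoder`, the layer sizes, the tanh activation, and the choice to concatenate the speaker embedding with the encoder output are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows how an MFCC frame from a mixed recording and an enrollment embedding (for example a d-vector) could jointly drive a reconstruction of the target speaker's MFCCs, frame by frame over a continuous recording.

```python
import math
import random


def dense(x, weights, bias, activation=None):
    """Apply one fully connected layer: activation(W @ x + b)."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    if activation == "tanh":
        out = [math.tanh(v) for v in out]
    return out


def init_layer(n_in, n_out, rng):
    """Create random, untrained weights; a real system would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


class SpeakerConditionedAutoencoder:
    """Hypothetical toy model: encode a mixed MFCC frame, append the
    target speaker's embedding, and decode an estimate of the target MFCCs."""

    def __init__(self, n_mfcc=13, embed_dim=8, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.n_mfcc = n_mfcc
        self.embed_dim = embed_dim
        self.enc = init_layer(n_mfcc, hidden_dim, rng)
        self.dec = init_layer(hidden_dim + embed_dim, n_mfcc, rng)

    def reconstruct(self, mixed_frame, speaker_embedding):
        """Estimate the target speaker's MFCC frame from one mixed frame."""
        if len(mixed_frame) != self.n_mfcc:
            raise ValueError("mixed_frame has the wrong number of coefficients")
        if len(speaker_embedding) != self.embed_dim:
            raise ValueError("speaker_embedding has the wrong dimension")
        code = dense(mixed_frame, *self.enc, activation="tanh")
        # Conditioning step (assumed): concatenate the code with the embedding.
        return dense(code + list(speaker_embedding), *self.dec)

    def separate(self, frames, speaker_embedding):
        """Process a continuous recording frame by frame."""
        return [self.reconstruct(f, speaker_embedding) for f in frames]


# Usage with made-up inputs: three identical mixed frames, one enrolled speaker.
model = SpeakerConditionedAutoencoder()
mixed = [[0.1] * 13 for _ in range(3)]
target_embedding = [0.5] * 8
estimate = model.separate(mixed, target_embedding)
```

Because the decoder sees the embedding directly, changing the enrolled speaker changes the reconstruction for the same mixed input, which is the property a separation model of this kind relies on.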
Results: Deep-MASKS outperforms baselines, reducing the Mean Squared Error (MSE) of MFCC reconstruction by up to 44% and the number of additional bits required to represent clean speech entropy by 36%.
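The two quantities reported in the Results can be sketched as follows. The MSE is standard. Reading "additional bits required to represent clean speech" as a relative entropy (Kullback-Leibler divergence, in bits) between a clean-speech distribution and a reconstructed one is an interpretation, not something the abstract states; the function names and the toy numbers below are hypothetical and are not data from the paper.

```python
import math


def mean_squared_error(reference, estimate):
    """Mean squared error between two equal-length coefficient sequences."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def kl_divergence_bits(p, q):
    """D_KL(p || q) in bits for discrete distributions p and q.

    Assumed interpretation: the extra bits needed to encode samples from the
    clean distribution p using a code optimized for the reconstruction q.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                raise ValueError("q must be positive wherever p is positive")
            total += pi * math.log2(pi / qi)
    return total


def relative_reduction(baseline, proposed):
    """Fractional reduction of a metric relative to a baseline value."""
    return (baseline - proposed) / baseline


# Toy values only, chosen to show the arithmetic.
clean_mfcc = [1.0, 2.0, 3.0]
reconstructed_mfcc = [1.0, 2.0, 5.0]
mse = mean_squared_error(clean_mfcc, reconstructed_mfcc)
extra_bits = kl_divergence_bits([0.5, 0.5], [0.25, 0.75])
```

Under this reading, a statement such as "reduces MSE by up to 44%" corresponds to `relative_reduction(baseline_mse, proposed_mse)` reaching 0.44 for the best-performing configuration.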


Impact Statement: The proposed Deep-MASKS mitigates cross-talk in speech encoded as MFCC features, which are widely utilized to preserve voice privacy in passive health assessment and other speech applications on smartphones.

Keywords: Mel-Frequency Cepstrum Coefficients (MFCCs); overlapped speech; speaker representation; speech separation

Year:  2021        PMID: 35402977      PMCID: PMC8940203          DOI: 10.1109/OJEMB.2021.3063994

Source DB:  PubMed          Journal:  IEEE Open J Eng Med Biol        ISSN: 2644-1276


