A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech.

Ahmed M Yousef, Dimitar D Deliyski, Stephanie R C Zacharias, Alessandro de Alarcon, Robert F Orlikoff, Maryam Naghibolhosseini.

Abstract

Investigating phonatory processes in connected speech using high-speed videoendoscopy (HSV) requires accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the "Rainbow Passage." The technique combines an unsupervised machine-learning (ML) approach with active contour modeling (ACM), a combination referred to here as the hybrid approach. The hybrid method was implemented to capture the edges of the vocal folds on HSV kymograms extracted at various cross-sections of the vocal folds during vibration. K-means clustering, the ML component, was first applied to each kymogram to identify the glottal area, which in turn provided an initial contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
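
The clustering-based initialization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic kymogram, the choice of two intensity clusters, and the function names (`make_kymogram`, `kmeans_1d`, `initial_contour`) are all assumptions made for illustration. The sketch clusters pixel intensities with k-means, treats the darker cluster as the glottal area, and reads off the left and right edge in each kymogram row as a starting contour. The subsequent ACM refinement is not shown.

```python
import math
import random


def make_kymogram(n_frames=30, width=40, seed=0):
    """Synthetic kymogram (assumption): one row per frame, dark glottal gap on bright tissue."""
    rng = random.Random(seed)
    center = width // 2
    rows, truth = [], []
    for t in range(n_frames):
        # Half-width of the glottal opening oscillates over time.
        half = round(4 + 3 * math.sin(2 * math.pi * t / 10))
        left, right = center - half, center + half
        row = [(20 if left <= x <= right else 200) + rng.gauss(0, 5)
               for x in range(width)]
        rows.append(row)
        truth.append((left, right))
    return rows, truth


def kmeans_1d(values, k=2, iters=50):
    """Plain k-means on scalar intensities; centroids start evenly spaced over the range."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def initial_contour(kymogram, k=2):
    """Return (left, right) glottal edge columns per row, or None where no dark pixel is found."""
    pixels = [v for row in kymogram for v in row]
    centroids = kmeans_1d(pixels, k)
    dark = min(range(k), key=lambda i: centroids[i])  # darkest cluster = glottal area
    edges = []
    for row in kymogram:
        cols = [x for x, v in enumerate(row)
                if min(range(k), key=lambda i: abs(v - centroids[i])) == dark]
        edges.append((cols[0], cols[-1]) if cols else None)
    return edges


kymogram, truth = make_kymogram()
edges = initial_contour(kymogram)
print(edges[:3])
```

In a pipeline of the kind the abstract describes, per-row edges like these would serve only as the starting contour; an active contour step would then move them toward the true glottal boundary.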

Keywords:  automated machine-learning-based edge detection; connected speech; high-speed videoendoscopy

Year:  2021        PMID: 33717604      PMCID: PMC7954580          DOI: 10.3390/app11031179

Source DB:  PubMed          Journal:  Appl Sci (Basel)        ISSN: 2076-3417            Impact factor:   2.679


