
Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network.

Mona Kirstin Fehling, Fabian Grosch, Maria Elke Schuster, Bernhard Schick, Jörg Lohscheller.

Abstract

The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and quantitative analysis of laryngeal high-speed video (HSV). Quantifying vocal fold vibration patterns requires, as a first step, segmenting the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation. In this work we propose, for the first time, a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. The Dice Coefficient (DC) and the precision of four anatomical landmark positions were used as performance measures. Over all test data, a mean DC of 0.85 was obtained for the glottis, and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The proposed method requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. Thus, it also allows for the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be made freely available to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
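
For readers who want to reproduce the two performance measures named in the abstract, the following is a minimal sketch of how a per-class Dice Coefficient and a mean landmark error in pixels could be computed. It is not the authors' evaluation code. The label encoding (0 = background, 1 = glottis, 2 = right VF, 3 = left VF), the function names, and the use of Euclidean distance for the landmark precision are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def dice_coefficient(pred, truth, label):
    """Dice coefficient for one class label between two equally sized label maps.

    pred and truth are 2D lists of integer class labels (hypothetical encoding).
    """
    intersection = pred_count = truth_count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            pred_count += p == label
            truth_count += t == label
            intersection += (p == label) and (t == label)
    if pred_count + truth_count == 0:
        # Class absent in both maps: counted as perfect agreement (a convention,
        # not something stated in the abstract).
        return 1.0
    return 2.0 * intersection / (pred_count + truth_count)


def mean_landmark_error(pred_points, truth_points):
    """Mean Euclidean distance in pixels between corresponding (x, y) landmarks."""
    distances = [math.dist(p, t) for p, t in zip(pred_points, truth_points)]
    return sum(distances) / len(distances)


# Toy 3x3 frame; assumed labels: 0 background, 1 glottis, 2 right VF, 3 left VF.
truth = [
    [2, 1, 3],
    [2, 1, 3],
    [0, 1, 0],
]
pred = [
    [2, 1, 3],
    [2, 2, 3],  # one glottis pixel mislabeled as right VF
    [0, 1, 0],
]

glottis_dc = dice_coefficient(pred, truth, 1)   # 2*2 / (2+3) = 0.8
right_vf_dc = dice_coefficient(pred, truth, 2)  # 2*2 / (3+2) = 0.8
left_vf_dc = dice_coefficient(pred, truth, 3)   # 2*2 / (2+2) = 1.0

# Two toy landmarks; the second is off by a 3-4-5 triangle, so 5 pixels.
landmark_error = mean_landmark_error([(0, 0), (3, 4)], [(0, 0), (0, 0)])  # 2.5
```

In a sequence-level evaluation such as the one described, these per-frame values would typically be averaged over all frames of a test video, and then over all test videos.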


Year:  2020        PMID: 32040514      PMCID: PMC7010264          DOI: 10.1371/journal.pone.0227791

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


