Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network.

Mona Kirstin Fehling¹, Fabian Grosch¹, Maria Elke Schuster², Bernhard Schick³, Jörg Lohscheller¹.

Abstract

The objective investigation of the dynamic properties of vocal fold vibrations requires the recording and quantitative analysis of laryngeal high-speed video (HSV). Quantifying vocal fold vibration patterns requires, as a first step, segmenting the glottal area in each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation. In this work we propose, for the first time, a procedure to segment fully automatically not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. The Dice Coefficient (DC) and the precision of four anatomical landmark positions were used as performance measures. Over all test data, a mean DC of 0.85 was obtained for the glottis, and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The proposed method requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. Thus, it also allows the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be made freely available to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
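
The abstract names its two performance measures without giving formulas. As a minimal sketch, the Python below assumes the standard Dice definition, DC = 2|A ∩ B| / (|A| + |B|), for one binary mask per structure, and assumes that landmark precision is a mean Euclidean distance in pixels. Neither assumption is confirmed by the abstract. The function names and toy data are hypothetical and are not taken from the authors' code.

```python
import math


def dice_coefficient(pred, truth):
    """Dice overlap of two equally sized binary masks (lists of rows)."""
    intersection = 0
    size_pred = 0
    size_truth = 0
    for row_pred, row_truth in zip(pred, truth):
        for p, t in zip(row_pred, row_truth):
            p, t = bool(p), bool(t)
            intersection += p and t
            size_pred += p
            size_truth += t
    total = size_pred + size_truth
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def mean_landmark_error(pred_points, truth_points):
    """Mean Euclidean distance (in pixels) between paired landmarks."""
    distances = [math.dist(p, t) for p, t in zip(pred_points, truth_points)]
    return sum(distances) / len(distances)


# Toy 3x4 masks: the prediction misses one of four ground-truth pixels.
truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 0]]
pred_mask = [[0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]

dc = dice_coefficient(pred_mask, truth_mask)  # 2 * 3 / (3 + 4)
error = mean_landmark_error([(0, 0), (3, 4)], [(0, 0), (0, 0)])  # (0 + 5) / 2
```

For a multi-class result such as glottis, right vocal fold, and left vocal fold, the same function would be applied to one binary mask per class, which is consistent with the abstract reporting a separate DC for each structure.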

Year: 2020 PMID: 32040514 PMCID: PMC7010264 DOI: 10.1371/journal.pone.0227791

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
