Jeoung Kun Kim1, Yoo Jin Choo2, Gyu Sang Choi3, Hyunkwang Shin3, Min Cheol Chang4, Donghwi Park5.
Abstract
BACKGROUND: Videofluoroscopic swallowing study (VFSS) is currently considered the gold standard for precisely diagnosing and quantitatively investigating dysphagia. However, VFSS interpretation is complex and requires consideration of several factors. Therefore, considering the expected impact on dysphagia management, this study aimed to apply deep learning to automatically detect the presence of penetration or aspiration in the VFSS of patients with dysphagia.
Keywords: Deep Learning; Deglutition; Swallowing Reflex; VFSS
Year: 2022 PMID: 35166079 PMCID: PMC8845107 DOI: 10.3346/jkms.2022.37.e42
Source DB: PubMed Journal: J Korean Med Sci ISSN: 1011-8934 Impact factor: 2.153
Fig. 1. The steps of the modeling process applied in this study.
VFSS = videofluoroscopic swallowing study.
Fig. 2. ROC curves of the model on the validation data. The AUC on the validation dataset of VFSS images for the convolutional neural network model was 0.942 for normal findings, 0.878 for penetration, and 1.000 for aspiration. Both macro-average and micro-average AUCs were calculated: the macro-average AUC was 0.940 and the micro-average AUC was 0.961.
AUC = area under the curve, ROC = receiver operating characteristic, VFSS = videofluoroscopic swallowing study.
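The reported macro-average AUC can be checked from the three per-class values. The sketch below assumes the usual definition of the macro average as the unweighted mean of the per-class (one-vs-rest) AUCs; the source does not state how the average was computed, and the variable names are illustrative only.

```python
# Per-class validation AUCs as reported in the Fig. 2 caption.
per_class_auc = {"normal": 0.942, "penetration": 0.878, "aspiration": 1.000}

# Assumption: macro average = unweighted mean of the per-class AUCs.
macro_auc = sum(per_class_auc.values()) / len(per_class_auc)

print(round(macro_auc, 3))  # 0.94, matching the reported 0.940
```

The micro-average AUC (0.961) cannot be recomputed this way, because it pools every class-versus-rest decision across all validation samples and therefore needs the individual prediction scores, which are not given here.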
Performances of the deep-learning model
| Item | Details |
|---|---|
| Sample size (patients) | 133 (70%) for training, 57 (30%) for validation, total 190 |
| Sample ratio (patients) | Normal: 113 (59.47%); penetration: 32 (16.84%); aspiration: 45 (23.68%) |
| Sample size (images) | 665 (70%) for training, 285 (30%) for validation, total 950 each for high-peak and low-peak images |
| Sample ratio (images), high-peak images | Normal: 690 (72.63%); penetration: 147 (15.47%); aspiration: 213 (22.42%) |
| Sample ratio (images), low-peak images | Normal: 700 (73.68%); penetration: 40 (4.21%); aspiration: 210 (22.11%) |
| CNN model for high-peak images | MobileNet with fine-tuning; SGD optimizer, ReLU activation; data augmentation, dropout, and early stopping to reduce overfitting; input image size 320 × 180 × 3; training accuracy: 100%; validation accuracy: 93.68% |
| CNN model for low-peak images | MobileNet with fine-tuning; SGD optimizer, ELU activation; data augmentation, dropout, and early stopping to reduce overfitting; input image size 320 × 180 × 3; training accuracy: 100%; validation accuracy: 93.68% |
| VFSS classifier performance, high-peak images (individual patient) | Training accuracy: 100%; validation accuracy: 94.74% |
| VFSS classifier performance, low-peak images (individual patient) | Training accuracy: 100%; validation accuracy: 94.74% |
| VFSS integrated classifier performance | Training accuracy: 100%; validation accuracy: 94.74%; validation ROC AUC: normal 0.942, penetration 0.878, aspiration 1.000; validation macro-average ROC AUC 0.940, micro-average ROC AUC 0.961 |
CNN = convolutional neural network, SGD = stochastic gradient descent, ReLU = rectified linear unit, ELU = exponential linear unit, VFSS = videofluoroscopic swallowing study, ROC = receiver operating characteristic, AUC = area under the curve.
The criteria for the integration of the classification results of high-peak and low-peak images
| Classification model | Dysphagia classification criteria |
|---|---|
| Initial classifier for each of the high-peak and low-peak image sets | Normal: NI ≥ 4 |
| | Penetration: NI < 4 and AI = 0 |
| | Aspiration: NI < 4 and AI ≥ 1 |
| Integrated classifier (final decision) | Normal: N = 2 |
| | Penetration: N ≤ 1 and A = 0 |
| | Aspiration: N ≤ 1 and A ≥ 1 |
NI = normal image, AI = aspiration image, N = normal decision, A = aspiration decision.
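The two-stage rule in this table can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and string labels are hypothetical, NI and AI are read as the counts of images labeled normal and aspiration within one image set, and N and A as the counts of normal and aspiration decisions across the high-peak and low-peak classifiers. The five images per set used in the example are inferred from 950 images for 190 patients and are likewise an assumption.

```python
from collections import Counter

NORMAL, PENETRATION, ASPIRATION = "normal", "penetration", "aspiration"


def classify_image_set(image_labels):
    """Initial classifier: decide one image set (high-peak or low-peak)."""
    counts = Counter(image_labels)
    ni = counts[NORMAL]      # NI: images labeled normal
    ai = counts[ASPIRATION]  # AI: images labeled aspiration
    if ni >= 4:
        return NORMAL
    if ai == 0:              # NI < 4 and AI = 0
        return PENETRATION
    return ASPIRATION        # NI < 4 and AI >= 1


def integrate(high_decision, low_decision):
    """Integrated classifier: combine the two per-set decisions."""
    decisions = [high_decision, low_decision]
    n = decisions.count(NORMAL)      # N: normal decisions
    a = decisions.count(ASPIRATION)  # A: aspiration decisions
    if n == 2:
        return NORMAL
    if a == 0:                       # N <= 1 and A = 0
        return PENETRATION
    return ASPIRATION                # N <= 1 and A >= 1


# Hypothetical patient: per-image labels from the two CNN models.
high_peak = [NORMAL, NORMAL, NORMAL, PENETRATION, ASPIRATION]
low_peak = [NORMAL] * 5
final = integrate(classify_image_set(high_peak), classify_image_set(low_peak))
print(final)  # aspiration
```

Under this reading, a single aspiration image in a set with fewer than four normal images is enough to label the patient as aspiration, so the rule favors the more severe finding.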
Characteristics of patients with dysphagia who were included in this study
| Characteristics | Values |
|---|---|
| Age, yr | 66.83 ± 15.47 |
| Sex, male:female | 92:88 |
| Normal:penetration:aspiration | 113 (59.47):32 (16.84):45 (23.68) |
| Cause | |
| Stroke | 92 (48.42) |
| Spinal cord injury, cervical level | 16 (8.42) |
| Parkinson's disease | 15 (7.89) |
| Motor neuron disease | 19 (10.00) |
| Dementia | 23 (12.11) |
| Deconditioning | 25 (13.16) |
Values are presented as mean ± SD or number (%).