Junya Wu¹, Tianshu Zhou², Yufan Guo³, Yu Tian¹, Yuting Lou³, Hua Ru², Jianhua Feng³, Jingsong Li¹,².
Abstract
A clinical diagnosis of tic disorder involves several complex processes. Among them, observing and evaluating patient behavior usually requires considerable time and effective cooperation between the doctor and the patient. The existing assessment scale has been simplified into qualitative and quantitative assessments of motor and vocal tics over a certain period, but it must still be completed manually. We therefore seek an automatic method for detecting tic movements to assist in diagnosis and evaluation. Based on real clinical data, we propose a deep learning architecture that combines unsupervised and supervised learning and learns features from videos for tic motion detection. The model is trained and evaluated using leave-one-subject-out cross-validation on both binary and multiclass classification tasks. On these tasks, the model reaches average recognition precisions of 86.33% and 86.26% and recalls of 77.07% and 78.78%, respectively. Visualization of the features learned in the unsupervised stage indicates that the two types of tics and the nontic class are distinguishable. Further evaluation results suggest potential clinical application in auxiliary diagnosis and in evaluating treatment effects.
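The leave-one-subject-out protocol mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `leave_one_subject_out` and the clip-to-subject list are hypothetical. The key property is that every clip from the held-out patient goes to the test fold, so no patient appears in both the training and the test set.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids[i] is the subject that clip i belongs to.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Every clip from every other subject is used for training.
        train_idx = sorted(
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical assignment of six clips to three patients (illustrative only).
clip_subjects = ["p01", "p01", "p02", "p03", "p03", "p03"]
folds = list(leave_one_subject_out(clip_subjects))

for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

With 13 labeled videos, a split of this kind would produce 13 folds, and the reported scores would be averages over those folds.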
Year: 2021 PMID: 34194682 PMCID: PMC8203362 DOI: 10.1155/2021/5531186
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. The architecture of the proposed method. (1) Stage 1: extracting representative visual features. (2) Stage 2: training an LSTM using visual features.
Original TS video dataset.
| Category | Labeled dataset | Unlabeled dataset |
|---|---|---|
| Videos | 13 | 55 |
| Minutes | 136 | 709 |
Figure 2. Category proportions of normalized video clips of patients in (a) the multiclass classification task and (b) the binary classification task. The ordinate indicates the patient number in the labeled TS dataset.
Figure 3. Region of interest in data preprocessing. (a) is the output of the MTCNN, and (b), (c), and (d) are the random data augmentation methods applied.
Evaluations of the multiclass classification task.
| Method | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| C3D [ | 0.7252 (±0.108) | 0.7483 (±0.047) | 0.7023 (±0.051) | 0.7194 (±0.032) |
| TSN [ | 0.8988 (±0.117) | 0.8354 (±0.089) | 0.7284 (±0.054) | 0.7600 (±0.070) |
| Ours-acc¹ | | | | |
| Ours-f1² | 0.9363 (±0.0390) | 0.7628 (±0.209) | 0.7362 (±0.198) | 0.7391 (±0.198) |
¹ Ours-acc means the proposed architecture with the watch-accuracy strategy. ² Ours-f1 means the proposed architecture with the watch-F1 strategy. p value < 0.01; p value < 0.001.
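For readers who want to see how the four columns relate, the sketch below derives accuracy, precision, recall, and F1-score from a confusion matrix. It assumes macro averaging over classes, which the source does not confirm; the function name `macro_metrics` and the example matrix are hypothetical and are not taken from the paper.

```python
def macro_metrics(cm):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of clips with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 3-class confusion matrix (illustrative values only).
example_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
metrics = macro_metrics(example_cm)
print(metrics)
```

Under leave-one-subject-out evaluation, such metrics would be computed per held-out subject and then summarized as a mean with a spread, which matches the "value (±spread)" layout of the table.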
Figure 4. Evaluation result of one subject. The confusion matrix is shown in the middle; the correct detection cases from the multiclass classification task are shown on the left; and the misclassification cases are shown on the right. For the sake of patient privacy, the images used in the cases were blurred.
Evaluations of the binary classification task.
| Method | Accuracy | AUC_ROC | AUC_PR | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| Ours-acc | 0.8890 (±0.0458) | 0.7532 (±0.080) | 0.7035 (±0.138) | 0.8057 (±0.103) | 0.7532 (±0.103) | 0.7634 (±0.093) |
| Ours-f1 | | | | | | |
Figure 5. Visualization of two tic video clips of representation learning in Stage 1. The first row shows the original video clip frames; the second row shows the corresponding CAM image.
Evaluations of some items of scales.
| Test ID | Tic areas (clinician) | Tic areas (our model) | Tic frequency, tics/min (clinician) | Tic frequency, tics/min (our model) | Evaluation time, min (clinician) | Evaluation time, min (our model) | Evaluation time, min (clinician review) |
|---|---|---|---|---|---|---|---|
| 1 | 2 | 2 | 6 | 5 | >40 | <5 | <5 |
| 2 | 2 | 2 | 6 | 7 | >40 | <5 | <5 |
| 3 | 1 | 2 | 2 | 1 | >30 | <5 | <5 |
| 4 | 1 | 1 | 40 | 37 | >70 | <5 | <10 |
| 5 | 1 | 1 | 3 | 1 | >30 | <5 | <5 |
| 6 | 1 | 1 | 9 | 12 | >50 | <5 | <10 |
| 7 | 1 | 1 | 15 | 14 | >60 | <5 | <5 |
| 8 | 1 | 1 | 0 | 0 | >30 | <5 | <5 |
| 9 | 1 | 1 | 1 | 1 | >30 | <5 | <5 |
| 10 | 1 | 1 | 11 | 8 | >50 | <5 | <10 |
| 11 | 2 | 2 | 3 | 3 | >30 | <5 | <5 |
| 12 | 1 | 0 | 0 | 0 | >30 | <5 | <5 |
| 13 | 1 | 2 | 4 | 2 | >40 | <5 | <5 |
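The agreement between clinician and model in the table above can be summarized with simple arithmetic. The sketch below is not an analysis from the paper: it copies the tic-area and tic-frequency columns for the 13 tests, then counts exact matches in the number of tic areas and computes the mean absolute difference in tic frequency. The variable names are hypothetical.

```python
# (test_id, areas_clinician, areas_model, freq_clinician, freq_model)
# Values copied from the scale-item table; frequencies are in tics/min.
rows = [
    (1, 2, 2, 6, 5),
    (2, 2, 2, 6, 7),
    (3, 1, 2, 2, 1),
    (4, 1, 1, 40, 37),
    (5, 1, 1, 3, 1),
    (6, 1, 1, 9, 12),
    (7, 1, 1, 15, 14),
    (8, 1, 1, 0, 0),
    (9, 1, 1, 1, 1),
    (10, 1, 1, 11, 8),
    (11, 2, 2, 3, 3),
    (12, 1, 0, 0, 0),
    (13, 1, 2, 4, 2),
]

# Tests where the model reports the same number of tic areas as the clinician.
area_matches = sum(1 for _, ca, ma, _, _ in rows if ca == ma)

# Mean absolute difference between clinician and model tic frequency.
freq_mae = sum(abs(cf - mf) for *_, cf, mf in rows) / len(rows)

print(f"tic-area agreement: {area_matches}/{len(rows)}")
print(f"mean absolute frequency difference: {freq_mae:.2f} tics/min")
```

On these rows the model matches the clinician's tic-area count in 10 of 13 tests, and the frequency estimates differ by about 1.31 tics/min on average.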
Non-TS patient evaluation.
| No. | Accuracy | Number of clips |
|---|---|---|
| 1 | 0.9701 | 67 |
| 2 | 0.9531 | 192 |
| 3 | 0.9016 | 193 |
| 4 | 0.9890 | 91 |
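Because the four non-TS recordings contribute different numbers of clips, the per-subject accuracies can be summarized either as a plain mean or as a clip-weighted (pooled) accuracy. The sketch below computes both from the table values. It is an illustrative calculation, not a figure reported in the source, and the variable names are hypothetical.

```python
# (subject_no, accuracy, number_of_clips), copied from the non-TS table.
non_ts = [
    (1, 0.9701, 67),
    (2, 0.9531, 192),
    (3, 0.9016, 193),
    (4, 0.9890, 91),
]

total_clips = sum(n for _, _, n in non_ts)

# Unweighted mean: every subject counts equally.
mean_accuracy = sum(acc for _, acc, _ in non_ts) / len(non_ts)

# Clip-weighted accuracy: every clip counts equally.
pooled_accuracy = sum(acc * n for _, acc, n in non_ts) / total_clips

print(f"clips: {total_clips}")
print(f"mean accuracy:   {mean_accuracy:.4f}")
print(f"pooled accuracy: {pooled_accuracy:.4f}")
```

The pooled value (about 0.943 over 543 clips) is lower than the plain mean (about 0.953) because subject 3 has both the lowest accuracy and the most clips.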