| Literature DB >> 35556206 |
Yipei Wang, Qianye Yang, Lior Drukker, Aris Papageorghiou, Yipeng Hu, J Alison Noble.
Abstract
PURPOSE: For highly operator-dependent ultrasound scanning, skill assessment approaches evaluate operator competence given available data, such as acquired images and tracked probe movement. Operator skill level can be quantified by the completeness, speed, and precision of performing a clinical task, such as biometry. Such clinical tasks are increasingly becoming assisted or even replaced by automated machine learning models. In addition to measurement, operators need to be competent at the upstream task of acquiring images of sufficient quality. To provide computer assistance for this task requires a new definition of skill.Entities:
Keywords: Deep learning; Fetal ultrasound; Skill assessment; Ultrasound
Year: 2022 PMID: 35556206 PMCID: PMC9307537 DOI: 10.1007/s11548-022-02642-y
Source DB: PubMed Journal: Int J Comput Assist Radiol Surg ISSN: 1861-6410 Impact factor: 3.421
Fig. 1Overview of the task model-specific skill assessment framework
Task model performance for different data splits as varies
| Task | Split |  | Accuracy | Sensitivity | Specificity | Task | Split |  | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HC | A | 0.01 | 0.58 | 0.79 | 0.52 | AC | C | 0.01 | 0.80 | 0.38 | 0.84 |
|  |  | 0.02 | 0.74 | 0.80 | 0.64 |  |  | 0.02 | 0.71 | 0.62 | 0.75 |
|  |  | 0.03 | 0.73 | 0.71 | 0.82 |  |  | 0.03 | 0.71 | 0.58 | 0.82 |
|  |  | 0.04 | 0.89 | 0.90 | 0.66 |  |  | 0.04 | 0.64 | 0.59 | 0.76 |
|  | B | 0.01 | 0.68 | 0.34 | 0.82 |  | D | 0.01 | 0.60 | 0.57 | 0.60 |
|  |  | 0.02 | 0.67 | 0.60 | 0.80 |  |  | 0.02 | 0.73 | 0.69 | 0.75 |
|  |  | 0.03 | 0.77 | 0.79 | 0.66 |  |  | 0.03 | 0.77 | 0.79 | 0.74 |
|  |  | 0.04 | 0.85 | 0.87 | 0.56 |  |  | 0.04 | 0.65 | 0.61 | 0.77 |
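The accuracy, sensitivity, and specificity values reported for the task models are standard confusion-matrix metrics for a binary classifier. A minimal sketch of how they are computed (illustrative only; not the paper's code, and the function name is an assumption):

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 label lists."""
    # Tally the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Sensitivity: fraction of true positives recovered (recall).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity
```

Each (Task, Split) row in the table corresponds to evaluating such metrics on a held-out set at a given parameter value.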
Ablation study results for different input data modalities at
| Criterion | Data modality | RMSE | PCC | Criterion | Data modality | RMSE | PCC |
|---|---|---|---|---|---|---|---|
|  | Both | 0.164 ± 0.189 | −0.176 ± 0.201 |  | Both | 0.171 ± 0.116 | 0.212 ± 0.257 |
|  | Motion | 0.204 ± 0.137 | 0.059 ± 0.161 |  | Motion | 0.209 ± 0.095 | 0.509 ± 0.191 |
|  | Video | 0.234 ± 0.075 | −0.485 ± 0.218 |  | Video | 0.156 ± 0.085 | 0.798 ± 0.256 |
Ablation study results of , with different cut-off values, at
| Split | Specificity | Cut-off | RMSE | PCC | Split | Specificity | Cut-off | RMSE | PCC |
|---|---|---|---|---|---|---|---|---|---|
| A | 0.710 | 0.8 | 0.272 ± 0.185 | −0.543 ± 0.323 | B | 0.702 | 0.8 | 0.318 ± 0.257 | −0.017 ± 0.296 |
|  | 0.799 | 0.9 | 0.457 ± 0.197 | −0.076 ± 0.165 |  | 0.713 | 0.9 | 0.474 ± 0.106 | 0.023 ± 0.310 |
Ablation study results of and , with different values, at
| Criterion |  | RMSE | PCC | Criterion |  | RMSE | PCC |
|---|---|---|---|---|---|---|---|
|  | 15 | 0.359 ± 0.146 | 0.373 ± 0.333 |  | 1 | 0.295 ± 0.111 | 0.457 ± 0.204 |
|  | 30 | 0.328 ± 0.159 | 0.459 ± 0.397 |  | 4 | 0.229 ± 0.090 | 0.510 ± 0.185 |
|  | 60 | 0.262 ± 0.133 | 0.278 ± 0.327 |  | 8 | 0.240 ± 0.096 | 0.294 ± 0.223 |
|  | 120 | 0.230 ± 0.156 | 0.097 ± 0.332 |  | 16 | 0.266 ± 0.135 | 0.415 ± 0.193 |
Performance of skill assessment predictor for the AC task
| Criterion | Split |  | RMSE | PCC | Criterion | Split |  | RMSE | PCC |
|---|---|---|---|---|---|---|---|---|---|
|  | C | 0.01 | 0.299 ± 0.075 | −0.602 ± 0.326 |  | C | 0.01 | 0.316 ± 0.046 | −0.214 ± 0.179 |
|  |  | 0.02 | 0.294 ± 0.090 | −0.383 ± 0.331 |  |  | 0.02 | 0.456 ± 0.123 | 0.374 ± 0.170 |
|  |  | 0.03 | 0.380 ± 0.128 | 0.147 ± 0.490 |  |  | 0.03 | 0.486 ± 0.166 | 0.253 ± 0.257 |
|  |  | 0.04 | 0.389 ± 0.180 | 0.124 ± 0.382 |  |  | 0.04 | 0.260 ± 0.099 | 0.252 ± 0.168 |
|  | D | 0.01 | 0.371 ± 0.074 | 0.401 ± 0.166 |  | D | 0.01 | 0.258 ± 0.066 | 0.092 ± 0.151 |
|  |  | 0.02 | 0.374 ± 0.076 | −0.421 ± 0.299 |  |  | 0.02 | 0.458 ± 0.146 | 0.310 ± 0.182 |
|  |  | 0.03 | 0.363 ± 0.127 | 0.277 ± 0.444 |  |  | 0.03 | 0.359 ± 0.133 | 0.163 ± 0.294 |
|  |  | 0.04 | 0.397 ± 0.180 | 0.145 ± 0.448 |  |  | 0.04 | 0.312 ± 0.096 | 0.124 ± 0.267 |
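The RMSE and PCC (Pearson correlation coefficient) columns in the tables above compare predicted skill scores against reference scores. A minimal sketch of both metrics (illustrative; not the paper's implementation, and the function names are assumptions):

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length score sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pcc(x, y):
    """Pearson correlation coefficient: covariance over product of std devs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

The "± " values in the tables would then be the mean and standard deviation of these per-clip (or per-fold) metrics; a PCC near +1 indicates the predicted scores track the reference scores closely, while negative values indicate anti-correlation.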
Fig. 2 Three example scan clips plotted over time, with time-synchronised skill assessment scores (orange and blue), showing both the task model-generated scores (dotted lines) and the predicted scores (solid lines). The red-boxed frames are the manually annotated ground truth for the diagnostic planes.