Richard Bieck, Katharina Heuermann, Markus Pirlich, Juliane Neumann, Thomas Neumuth.
Abstract
PURPOSE: In the context of aviation and automotive navigation technology, assistance functions are associated with predictive planning and wayfinding tasks. In endoscopic minimally invasive surgery, however, assistance so far relies primarily on image-based localization and classification. We show that navigation workflows can be described and used for the prediction of navigation steps.
Keywords: Attention networks; Deep learning; Endoscopic navigation; FESS; Machine translation; Natural language processing; Workflow prediction
Year: 2020 PMID: 33037490 PMCID: PMC7671992 DOI: 10.1007/s11548-020-02264-2
Source DB: PubMed Journal: Int J Comput Assist Radiol Surg ISSN: 1861-6410 Impact factor: 2.924
Fig. 1 Definitions for the establishment of navigation prediction in functional endoscopic sinus surgery (FESS): a Possible transitions between endoscopic states that relate to an observed anatomical landmark, with examples of an occurring landmark combination (blue) and the semantic relations used (red); transitions are bidirectional because the endoscope can move between landmarks at any moment during the procedure. b Example of our natural language processing-based prediction function during a FESS procedure
Fig. 2 Extraction process for annotating endoscopic images with landmark content and parsing the annotations into sentence-level descriptions: a image frames in which an anatomical landmark was visible in a FESS recording, b navigation activity, c navigation vocabulary, d pairwise sentence-level representation of consecutive navigation activities, and e class representations of consecutive navigation activities
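The pairwise representation of consecutive navigation activities (panel d) can be illustrated with a minimal sketch. The sentence wording and the helper name `consecutive_pairs` are hypothetical and chosen only for illustration; the record does not give the authors' actual vocabulary or parsing code.

```python
from typing import List, Tuple


def consecutive_pairs(activities: List[str]) -> List[Tuple[str, str]]:
    """Pair each navigation activity with the activity that follows it.

    The first element of each pair plays the role of a source sentence,
    the second the role of the target sentence to be predicted.
    """
    return list(zip(activities, activities[1:]))


# Hypothetical sentence-level descriptions (illustrative wording only).
workflow = [
    "endoscope observes middle_nasal_concha",
    "endoscope observes middle_nasal_meatus",
    "endoscope observes maxillary_sinus_orifice",
]

pairs = consecutive_pairs(workflow)
for source, target in pairs:
    print(f"SRC: {source} -> TRG: {target}")
```

A workflow with n activities yields n - 1 source/target pairs, so a single-step workflow contributes no training pair.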
Fig. 3 Overview of the prediction models employed: a two-layer long short-term memory (LSTM) network, b first-order hidden Markov model (HMM), c standard encoder-decoder network with gated recurrent units (GRU), and d transformer architecture with encoder and decoder stacks and attention blocks
Overview of the results for the annotated surgical navigation workflows. (LMC—number of unique landmark combinations in a workflow, mLMV—mean landmark visibility duration in a workflow)
| Wf | Steps (n) | Duration (s) | LMC (n) | mLMV (s) | Wf | Steps (n) | Duration (s) | LMC (n) | mLMV (s) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 81 | 864 | 8 | 13.95 | 12 | 238 | 1896 | 9 | 6.97 |
| 1 | 62 | 876 | 9 | 12.35 | 13 | 110 | 1107 | 9 | 7.31 |
| 2 | 45 | 717 | 6 | 11.47 | 14 | 174 | 1364 | 12 | 5.41 |
| 3 | 19 | 308 | 6 | 10.11 | 15 | 453 | 6775 | 12 | 7.33 |
| 4 | 66 | 600 | 9 | 8.44 | 16 | 184 | 2477 | 10 | 8.03 |
| 5 | 30 | 1766 | 6 | 13.77 | 17 | 200 | 2233 | 12 | 7.25 |
| 6 | 304 | 3602 | 9 | 23.24 | 18 | 83 | 836 | 9 | 7.94 |
| 7 | 83 | 1895 | 9 | 37.29 | 19 | 208 | 2439 | 9 | 6.90 |
| 8 | 358 | 5334 | 11 | 6.59 | 20 | 287 | 3177 | 8 | 7.25 |
| 9 | 248 | 2996 | 16 | 8.38 | 21 | 133 | 1219 | 8 | 6.50 |
| 10 | 46 | 1877 | 6 | 4.09 | 22 | 123 | 1282 | 8 | 6.71 |
| 11 | 315 | 3425 | 14 | 8.97 | Mean | 167.39 | 2133.26 | 9.35 | 10.27 |
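The summary values at the end of the table (167.39, 2133.26, 9.35) are consistent with column means over the 23 workflows. The sketch below recomputes them from the rows as listed; the variable names are illustrative and not taken from the source.

```python
from statistics import mean

# Per-workflow values copied from the table, workflows 0 through 22.
steps = [81, 62, 45, 19, 66, 30, 304, 83, 358, 248, 46, 315,
         238, 110, 174, 453, 184, 200, 83, 208, 287, 133, 123]
durations_s = [864, 876, 717, 308, 600, 1766, 3602, 1895, 5334, 2996, 1877, 3425,
               1896, 1107, 1364, 6775, 2477, 2233, 836, 2439, 3177, 1219, 1282]
lmc = [8, 9, 6, 6, 9, 6, 9, 9, 11, 16, 6, 14,
       9, 9, 12, 12, 10, 12, 9, 9, 8, 8, 8]

mean_steps = round(mean(steps), 2)           # 3850 steps over 23 workflows
mean_duration_s = round(mean(durations_s), 2)
mean_lmc = round(mean(lmc), 2)

print(mean_steps, mean_duration_s, mean_lmc)  # 167.39 2133.26 9.35
```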
Overview of the distribution of landmarks observed individually and in combination, in absolute values and as fractions of all observations in the dataset
| Landmark | Individual (n) | Individual (fraction) | In combination (n) | In combination (fraction) | Accumulated (n) | Accumulated (fraction) |
|---|---|---|---|---|---|---|
| Middle_nasal_concha | 885 | 0.23 | 1081 | 0.28 | 1966 | 0.51 |
| Middle_nasal_meatus | 539 | 0.14 | 500 | 0.13 | 1039 | 0.27 |
| Maxillary_sinus_orifice | 425 | 0.11 | 278 | 0.07 | 703 | 0.18 |
| Out_of_patient | 492 | 0.13 | 0 | 0.00 | 492 | 0.13 |
| Uncinate_process_of_ethmoid | 60 | 0.01 | 158 | 0.04 | 218 | 0.06 |
| Ethmoidal_bulla | 67 | 0.01 | 82 | 0.02 | 149 | 0.04 |
| Spheno_ethmoidal_recess | 31 | < 0.01 | 9 | < 0.01 | 40 | 0.01 |
Comparison of sentence translation results for the sequence-to-sequence (S2S) and transformer models (TRF). (BL-1—BLEU-1 metric, JD—Jaccard Distance, R-L—ROUGE-L Recall Metric, F1—Approximated Accuracy, higher means better, except for JD, values are averaged)
| Model | BL-1 | JD | R-L | F1 |
|---|---|---|---|---|
| S2S | 0.73 | 0.29 | 0.77 | |
| Transformer | 0.81 | 0.24 | 0.87 |
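For orientation, the three sentence-level metrics in this table can be sketched in their textbook form. The record does not state the authors' tokenization, smoothing, or library, so the functions below are assumptions for illustration, and the example sentences are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bleu1(candidate: List[str], reference: List[str]) -> float:
    """Unigram BLEU: clipped unigram precision times the brevity penalty."""
    if not candidate:
        return 0.0
    cand_counts, ref_counts = Counter(candidate), Counter(reference)
    clipped = sum(min(c, ref_counts[tok]) for tok, c in cand_counts.items())
    precision = clipped / len(candidate)
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * precision


def jaccard_distance(candidate: List[str], reference: List[str]) -> float:
    """1 minus the overlap of the two token sets (lower means more similar)."""
    a, b = set(candidate), set(reference)
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate: List[str], reference: List[str]) -> float:
    """ROUGE-L recall: LCS length divided by the reference length."""
    if not reference:
        return 0.0
    return lcs_length(candidate, reference) / len(reference)


# Hypothetical predicted and target sentences (illustrative wording only).
predicted = "endoscope moves to middle_nasal_meatus".split()
target = "endoscope moves to maxillary_sinus_orifice".split()

print(bleu1(predicted, target))             # 0.75
print(jaccard_distance(predicted, target))  # approximately 0.4
print(rouge_l_recall(predicted, target))    # 0.75
```

In this example three of four tokens agree, so BLEU-1 and ROUGE-L recall are both 0.75, while the Jaccard distance is 1 - 3/5 because the two sentences share three of five distinct tokens.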
Prediction results for the position-specific accuracy of sentence words for the sequence-to-sequence (S2S) and transformer models (TRF). (Pr—precision, Re—recall, higher means better, highest and lowest scores are highlighted)
| Sentence Term | TRF Pr | TRF Re | S2S Pr | S2S Re |
|---|---|---|---|---|
| Step Count | ||||
| Sinus | 0.74 | 0.73 | 0.57 | 0.51 |
| Landmark Group | 0.53 | 0.73 | 0.40 | 0.38 |
| Landmark Combination | 0.53 | 0.60 | 0.32 | 0.29 |
| Direction | 0.58 | 0.74 | 0.55 | 0.75 |
| Overall | 0.67 | 0.75 | 0.56 | 0.57 |
| F1-Score (Accuracy) | ||||
Bold numbers were chosen to highlight important numbers and relevant scores
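Precision, recall, and F1 for a single sentence term follow the standard definitions, sketched below. The label values are hypothetical, and the record does not say how the authors averaged scores across positions, so this is an illustration of the formulas rather than a reproduction of the reported numbers.

```python
from typing import List, Tuple


def precision_recall(y_true: List[str], y_pred: List[str], label: str) -> Tuple[float, float]:
    """Precision and recall for one label, treating all other labels as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical direction terms: target sentences vs. predicted sentences.
true_terms = ["left", "right", "left", "left"]
pred_terms = ["left", "left", "left", "right"]

pr, re = precision_recall(true_terms, pred_terms, "left")
f1 = f1_score(pr, re)
print(round(pr, 2), round(re, 2), round(f1, 2))  # 0.67 0.67 0.67
```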
Fig. 4 Examples of a good and b bad sentence translation results generated with the TRF model through decaying beam search decoding (SRC: source sentence, PRD: predicted sentences, TRG: target sentence), as well as c examples of the TRF where the training of word relations failed between a current and a future sentence. Red markings show the theoretically ideal weight distribution in the TRF’s decoder when sentence structures are effectively learned
Prediction results for the position-specific accuracy of specific landmarks accumulated for individual and in-combination observations using leave-one-out cross-validation. (Pr—precision, Re—recall, higher means better, highest and lowest scores are highlighted)
| Landmark | Transformer Pr | Transformer Re | LSTM Pr | LSTM Re | HMM Pr | HMM Re | S2S Pr | S2S Re |
|---|---|---|---|---|---|---|---|---|
| Middle_nasal_concha | 0.62 | 0.67 | 0.83 | 0.62 | 0.42 | |||
| Middle_nasal_meatus | 0.71 | 0.65 | 0.36 | 0.41 | ||||
| Maxillary_sinus_orifice | 0.52 | 0.59 | 0.36 | 0.39 | 0.35 | 0.58 | 0.27 | 0.35 |
| Out_of_patient | 0.50 | 0.43 | 0.31 | 0.20 | 0.00 | 0.00 | 0.24 | 0.34 |
| Uncinate_process_of_ethmoid | 0.38 | 0.34 | 0.19 | 0.22 | 0.00 | 0.00 | 0.22 | 0.20 |
| Ethmoidal_bulla | 0.42 | 0.49 | 0.22 | 0.31 | 0.00 | 0.00 | 0.20 | 0.24 |
| Spheno_ethmoidal_recess | 0.50 | 0.54 | 0.00 | 0.00 | 0.00 | 0.00 | 0.23 | |
| Overall | 0.53 | 0.53 | 0.34 | 0.35 | 0.30 | 0.27 | 0.31 | 0.32 |
| F1-Score (Accuracy) | ||||||||
Bold numbers were chosen to highlight important numbers and relevant scores
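Leave-one-out cross-validation, as named in the caption of this table, can be sketched as follows. The sketch assumes that the held-out unit is one annotated workflow (23 in total, numbered 0 to 22); the record does not confirm the fold unit, and the helper name `leave_one_out` is hypothetical.

```python
from typing import Iterator, List, Tuple


def leave_one_out(workflow_ids: List[int]) -> Iterator[Tuple[List[int], int]]:
    """Yield (training workflows, held-out workflow) once per workflow."""
    for held_out in workflow_ids:
        train = [w for w in workflow_ids if w != held_out]
        yield train, held_out


# Assumption: one fold per annotated workflow, numbered 0 to 22.
folds = list(leave_one_out(list(range(23))))

for train_ids, test_id in folds[:2]:
    print(f"held out workflow {test_id}, training on {len(train_ids)} workflows")
```

Under this assumption each model is trained 23 times, and per-landmark scores are aggregated over the held-out workflows.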