Sophia Bano, Francisco Vasconcelos, Emmanuel Vander Poorten, Tom Vercauteren, Sebastien Ourselin, Jan Deprest, Danail Stoyanov.
Abstract
PURPOSE: Fetoscopic laser photocoagulation is a minimally invasive surgery for the treatment of twin-to-twin transfusion syndrome (TTTS). Using a lens/fibre-optic scope inserted into the amniotic cavity, the abnormal placental vascular anastomoses are identified and ablated to regulate blood flow to both fetuses. A limited field of view, occlusions caused by the fetus and low visibility make it difficult to identify all vascular anastomoses. Automatic computer-assisted techniques may provide a better understanding of the anatomical structure during surgery for risk-free laser photocoagulation and may help improve mosaics built from fetoscopic videos.
Keywords: Computer assisted interventions (CAI); Deep learning; Fetoscopy; Surgical vision; Twin-to-twin transfusion syndrome (TTTS); Video segmentation
Year: 2020 PMID: 32350787 PMCID: PMC7261278 DOI: 10.1007/s11548-020-02169-0
Source DB: PubMed Journal: Int J Comput Assist Radiol Surg ISSN: 1861-6410 Impact factor: 2.924
Fig. 1. Representative cropped images from the seven fetoscopic videos used in our experiments, displaying the four multi-label event classes
Fig. 2. An overview of the proposed FetNet for event classification in fetoscopic videos. The spatial representation of each frame is encoded by a CNN (VGG16 architecture), while the temporal representation is encoded by an LSTM followed by fully connected layers. A differential learning rate is applied during network training
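The differential learning rate idea can be sketched in plain Python: the pretrained backbone takes smaller gradient steps than the freshly initialised head. The 10x ratio and the toy weights below are illustrative assumptions, not the paper's reported settings.

```python
# Minimal sketch of a differential learning rate: the pretrained encoder
# (stand-in for the VGG16 backbone) updates more slowly than the freshly
# initialised LSTM/FC head. The 10x ratio here is an illustrative assumption.
def sgd_step(params, grads, lr):
    """One plain SGD update per parameter: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]

encoder = [1.0, 2.0]          # stand-in for pretrained backbone weights
head = [0.5]                  # stand-in for head weights
enc_lr, head_lr = 1e-4, 1e-3  # head learns 10x faster than the backbone

encoder = sgd_step(encoder, [0.1, 0.1], enc_lr)
head = sgd_step(head, [0.1], head_lr)
```

In a deep-learning framework this is typically expressed as separate optimizer parameter groups, one per learning rate.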
Distribution of fetoscopic videos (in frames) for each event label
| Video# | Resolution | #Frames | Clear view | Occlusion | Tool | Ablation |
|---|---|---|---|---|---|---|
| Video 1 | | 25,900 | 5604 | 13,167 | 7903 | 3163 |
| Video 2 | | 17,030 | 5643 | 2439 | 8877 | 1498 |
| Video 3 | | 12,000 | 1896 | 6570 | 6846 | 306 |
| Video 4 | | 17,450 | 1886 | 5227 | 12,113 | 1273 |
| Video 5 | | 22,000 | 9370 | 2900 | 7336 | 2458 |
| Video 6 | | 27,000 | 8020 | 7474 | 10,155 | 5064 |
| Video 7 | | 17,400 | 8328 | 8112 | 2522 | 156 |
| Total | | 138,780 | 40,747 | 45,889 | 55,752 | 13,891 |
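The 7-fold evaluation reported below is consistent with leave-one-video-out splitting over the seven videos in the table above; that splitting scheme is our assumption, and the sketch below shows how such folds could be built.

```python
# Hypothetical leave-one-video-out folds: each of the seven videos is held
# out once for testing while the remaining six form the training set.
videos = [f"Video {i}" for i in range(1, 8)]
folds = [(videos[:i] + videos[i + 1:], videos[i]) for i in range(len(videos))]
# folds[0] trains on Videos 2-7 and tests on Video 1, and so on.
```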
Configuration details of different methods under comparison
| Method | Network details | Learning rate |
|---|---|---|
| Ablation_detect | ResNet50 | Fixed |
| VGGFE_SVM | Features extracted from the VGG16 network, classified with an SVM | – |
| VGG16_fine | VGG16 network with FC layers having 2048, 512 and 3 units | Fixed |
| VGG16_temporal | Temporal averaging with a median filter of size 6 (samples) applied to the VGG16 predictions | – |
| FetNet_noDL | Proposed FetNet (Fig. 2) | Fixed |
| FetNet_DL | Proposed FetNet (Fig. 2) | Differential |
All networks are initialised with the pre-trained ImageNet weights
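The temporal averaging used by VGG16_temporal (a size-6 median filter over per-frame class predictions) can be sketched as follows; the exact window alignment and boundary handling here are our assumptions.

```python
from statistics import median

def temporal_smooth(preds, k=6):
    """Median-filter per-frame class predictions over a window of ~k frames,
    truncating the window at the sequence boundaries."""
    half = k // 2
    out = []
    for i in range(len(preds)):
        window = preds[max(0, i - half):i + half + 1]
        out.append(int(median(window)))
    return out

# A single-frame flicker (the lone 1) is suppressed by the filter:
temporal_smooth([0, 0, 1, 0, 0, 0])  # -> [0, 0, 0, 0, 0, 0]
```

This kind of smoothing removes isolated misclassifications but cannot recover from sustained errors, which motivates learning the temporal context directly, as FetNet does with its LSTM.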
7-fold cross-validation results of the proposed FetNet compared with existing methods
| Method | Metric | Clear | Occlusion | Tool | Ablation | Average |
|---|---|---|---|---|---|---|
| Ablation_detect | Precision | – | – | – | 0.81 | 0.81 |
| | Recall | – | – | – | 0.71 | 0.71 |
| | F1-score | – | – | – | 0.76 | 0.76 |
| VGGFE_SVM | Precision | 0.52 | 0.55 | 0.68 | 0.32 | 0.52 |
| | Recall | 0.42 | 0.70 | 0.50 | 0.19 | 0.45 |
| | F1-score | 0.46 | 0.62 | 0.58 | 0.24 | 0.47 |
| VGG16_fine | Precision | 0.66 | 0.69 | 0.76 | 0.96 | 0.77 |
| | Recall | 0.47 | 0.69 | 0.73 | 0.61 | 0.63 |
| | F1-score | 0.55 | 0.69 | 0.74 | 0.75 | 0.68 |
| VGG16_temporal | Precision | 0.72 | 0.70 | 0.76 | 0.96 | 0.79 |
| | Recall | 0.46 | 0.68 | 0.73 | 0.56 | 0.61 |
| | F1-score | 0.56 | 0.69 | 0.74 | 0.71 | 0.68 |
| FetNet_noDL | Precision | 0.72 | 0.70 | 0.86 | 0.95 | 0.81 |
| | Recall | 0.78 | 0.60 | 0.90 | 0.69 | 0.74 |
| | F1-score | 0.74 | 0.65 | 0.88 | 0.80 | 0.77 |
| FetNet_DL | Precision | 0.86 | 0.69 | 0.92 | 0.96 | 0.86 |
| | Recall | 0.84 | 0.79 | 0.94 | 0.95 | 0.88 |
| | F1-score | 0.85 | 0.74 | 0.93 | 0.95 | 0.87 |
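As a quick sanity check on the table, each per-class F1 row follows from the corresponding precision and recall via the harmonic mean:

```python
def f1(precision, recall):
    """F1-score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# FetNet_DL, Clear class: precision 0.86, recall 0.84
round(f1(0.86, 0.84), 2)  # -> 0.85, matching the reported F1-score
```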
Fig. 3. Precision-recall curves along with AUCs of the methods under comparison for the (a) clear view, (b) occlusion, (c) tool and (d) ablation classes. (e) The micro-average precision-recall over all classes
Fig. 4. Performance comparison of the methods. F1-scores and standard deviations (a) over 7 folds for each event; (b) over 4 events for each fold
Fig. 5. A snapshot of the prediction timeline for video 1. The ground truth (top) and the correct predictions from VGG16_fine (middle) and FetNet_DL (bottom) are shown in blue. Erroneous predictions are shown in red