| Literature DB >> 35614504 |
Adrian Krenzer¹, Kevin Makowski², Amar Hekalo², Daniel Fitting³, Joel Troya³, Wolfram G Zoller⁴, Alexander Hann³, Frank Puppe².
Abstract
BACKGROUND: Machine learning, especially deep learning, is becoming increasingly relevant in research and development in the medical domain. For all supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Gastroenterological data in particular, which often involve endoscopic videos, are cumbersome to annotate, since domain experts are needed to interpret and annotate the videos. To support those domain experts, we developed a framework: instead of annotating every frame in a video sequence, experts only perform key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Non-expert annotators, supported by machine learning, then add the missing annotations for the frames in between.
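The fill-in step for the in-between frames can be illustrated with a simple linear interpolation of bounding boxes between the two expert key annotations. This is a minimal sketch for illustration only, not the authors' implementation (in the paper's framework, an AI prelabels the frames and non-expert annotators confirm or correct the proposals):

```python
# Illustrative sketch (NOT the authors' method): linearly interpolate
# bounding boxes between two expert key annotations.
def interpolate_boxes(start_box, end_box, n_frames):
    """start_box/end_box: (x, y, w, h) at the first/last key frame.
    Returns one proposed box per in-between frame, for a non-expert
    annotator (or a detector) to confirm or correct."""
    boxes = []
    for i in range(1, n_frames + 1):
        t = i / (n_frames + 1)  # interpolation weight in (0, 1)
        boxes.append(tuple(
            round(s + t * (e - s), 1)
            for s, e in zip(start_box, end_box)
        ))
    return boxes

# Example: a polyp box drifting to the right across 3 in-between frames
print(interpolate_boxes((100, 80, 50, 40), (140, 100, 50, 40), 3))
```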
Keywords: Annotation; Automation; Deep learning; Endoscopy; Gastroenterology; Machine learning; Object detection
Year: 2022 PMID: 35614504 PMCID: PMC9134702 DOI: 10.1186/s12938-022-01001-x
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 3.903
Comparison between video and image annotation tools (✓ = supported, – = not supported)
| Tool | CVAT | LabelImg | labelme | VoTT | VIA |
|---|---|---|---|---|---|
| Image | ✓ | ✓ | ✓ | ✓ | ✓ |
| Video | ✓ | – | – | ✓ | ✓ |
| Usability | Easy | Easy | Medium | Medium | Hard |
| Formats: VOC | ✓ | ✓ | ✓ | ✓ | – |
| Formats: COCO | ✓ | – | ✓ | – | ✓ |
| Formats: YOLO | ✓ | ✓ | – | – | – |
| Formats: TFRecord | ✓ | – | – | ✓ | – |
| Formats: Others | ✓ | – | – | ✓ | ✓ |
Comparison of FastCAT and CVAT by video. This table compares the well-known CVAT annotation tool to our new annotation tool FastCAT in terms of annotation speed. Videos 1 and 2 are annotated open-source videos; videos 3–10 are from the University Hospital Würzburg
| | CVAT speed (SPF) | FastCAT speed (SPF) | CVAT total time (min) | FastCAT total time (min) | Frames | Polyps | Frame size |
|---|---|---|---|---|---|---|---|
| Video 1 | 3.79 | 1.75 | 23.43 | 10.82 | 371 | 1 | 384x288 |
| Video 2 | 4.39 | 2.49 | 32.85 | 18.63 | 449 | 1 | 384x288 |
| Video 3 | 2.82 | 1.42 | 60.11 | 30.27 | 1279 | 1 | 898x720 |
| Video 4 | 4.09 | 2.00 | 56.85 | 27.80 | 834 | 1 | 898x720 |
| Video 5 | 4.57 | 2.39 | 53.24 | 27.84 | 699 | 2 | 898x720 |
| Video 6 | 1.66 | 0.61 | 18.01 | 6.62 | 651 | 1 | 898x720 |
| Video 7 | 1.70 | 0.64 | 11.22 | 4.22 | 396 | 1 | 898x720 |
| Video 8 | 1.55 | 0.76 | 34.13 | 16.73 | 1321 | 2 | 898x720 |
| Video 9 | 1.87 | 0.88 | 34.91 | 16.43 | 1120 | 1 | 898x720 |
| Video 10 | 2.74 | 0.92 | 77.68 | 26.08 | 1701 | 4 | 898x720 |
| Mean | 2.92 | 1.39 | 40.24 | 18.54 | 882 | 1.5 | 795x633 |
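The speed and total-time columns in the table above are internally consistent: total time in minutes equals the annotation speed in seconds per frame (SPF) times the number of frames, divided by 60. A quick check, with values taken from the table:

```python
# Check Table 2's internal consistency: total time (min) = SPF * frames / 60.
videos = {
    "Video 1": (3.79, 371, 23.43),    # (CVAT SPF, frames, CVAT total min)
    "Video 10": (2.74, 1701, 77.68),
}
for name, (spf, frames, total_min) in videos.items():
    derived = spf * frames / 60
    print(f"{name}: derived {derived:.2f} min vs reported {total_min} min")
```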
Comparison of FastCAT and CVAT by user. This table compares the well-known CVAT annotation tool to our new annotation tool FastCAT in terms of annotation quality and annotation speed. The quality metric is the F1-score; we count a true positive (TP) if the drawn box overlaps the ground-truth box by more than 70%
| | CVAT quality (%) | FastCAT quality (%) | CVAT speed (SPF) | FastCAT speed (SPF) | CVAT total time (min) | FastCAT total time (min) | Medical experience |
|---|---|---|---|---|---|---|---|
| User 1 | 99.30 | 99.50 | 7.33 | 3.71 | 48.78 | 25.30 | Low |
| User 2 | 98.85 | 98.90 | 3.47 | 1.88 | 23.38 | 13.70 | Low |
| User 3 | 97.97 | 98.51 | 4.59 | 1.53 | 31.28 | 11.17 | Low |
| User 4 | 98.93 | 99.75 | 5.12 | 2.57 | 33.96 | 16.53 | Middle |
| User 5 | 98.53 | 98.83 | 5.41 | 2.49 | 37.00 | 18.10 | Middle |
| User 6 | 98.52 | 99.23 | 4.04 | 3.24 | 27.90 | 24.95 | Low |
| User 7 | 99.45 | 99.30 | 5.20 | 2.70 | 35.01 | 21.28 | Middle |
| User 8 | 99.35 | 99.08 | 5.25 | 2.86 | 33.90 | 19.57 | Low |
| User 9 | 99.12 | 98.54 | 4.12 | 2.25 | 27.12 | 14.99 | Low |
| User 10 | 98.93 | 99.48 | 5.63 | 2.76 | 37.53 | 19.89 | Low |
| Mean | 98.98 | 99.03 | 5.79 | 2.93 | 33.59 | 18.55 | Low |
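The caption above states only that a drawn box must match the ground truth by more than 70%; a common way to implement such an overlap threshold is intersection over union (IoU). The sketch below assumes IoU is the overlap measure, which the record does not confirm:

```python
# Assumed overlap measure (IoU); the paper states only ">70% overlap".
def iou(a, b):
    """a, b: boxes as (x, y, w, h). Returns intersection over union."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))  # intersection width
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))  # intersection height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def is_true_positive(drawn, truth, thresh=0.7):
    return iou(drawn, truth) > thresh

print(is_true_positive((100, 80, 50, 40), (102, 82, 50, 40)))  # True
```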
Fig. 1 Learning process of the non-expert annotators. The figure shows the speed of the annotator in seconds per frame (SPF) over the annotation experience measured by the total number of annotated videos by that point for both our tool and CVAT
Fig. 2 Effect of AI performance on annotation speed. Plotted is the annotators' speed in seconds per frame against the AI performance, given by its F1-score, on a video-by-video basis; the AI used for prediction is the same for each video. Every point is computed as the average over all annotators
Comparison of CVAT and FastCAT. The table shows the reduction in the domain experts' annotation time. Tgca stands for the time gained compared to annotation with CVAT, i.e., the percentage reduction in workload. Videos 1 and 2 are excluded from this analysis because the open-source data provide only polyp sequences rather than full colonoscopies, so an appropriate comparison is not possible
| | FastCAT total time (min) | CVAT total time (min) | Tgca (%) | Length (min) | Freezes | Polyps |
|---|---|---|---|---|---|---|
| Video 3 | 0.50 | 60.11 | 99.15 | 15.76 | 2 | 1 |
| Video 4 | 0.67 | 56.85 | 98.82 | 17.70 | 6 | 1 |
| Video 5 | 1.09 | 53.24 | 97.95 | 23.12 | 4 | 2 |
| Video 6 | 0.77 | 18.01 | 95.72 | 6.30 | 2 | 1 |
| Video 7 | 0.70 | 11.22 | 93.79 | 13.05 | 5 | 1 |
| Video 8 | 1.78 | 34.13 | 94.76 | 27.67 | 13 | 2 |
| Video 9 | 1.50 | 34.91 | 95.70 | 20.53 | 4 | 1 |
| Video 10 | 2.92 | 77.68 | 96.24 | 24.36 | 15 | 4 |
| Mean | 1.24 | 43.26 | 96.52 | 18.56 | 6.38 | 1.62 |
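Tgca can be reproduced from the two total-time columns as Tgca = (1 − T_FastCAT / T_CVAT) × 100; small deviations from the tabulated values stem from rounding of the reported times. A minimal check:

```python
# Reproduce Tgca (% workload reduction vs. CVAT) from Table 4's time columns.
def tgca(fastcat_min, cvat_min):
    return (1 - fastcat_min / cvat_min) * 100

print(f"Video 10: {tgca(2.92, 77.68):.2f} %")  # table reports 96.24 %
```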
Fig. 3 Annotation framework for fast domain expert labeling supported by an automated AI prelabeling
Fig. 4 Video Review UI. The figure shows the list of freeze frames, the corresponding child frames, and annotations within the image on the right side. In the bottom part of the view, the user can insert comments, open reports, delete classes, and see all individual classes. The diseased tissue is delineated via bounding boxes
Fig. 5 Image annotation UI. The figure shows a list of all available frames on the left with labeling functionality for a specific annotation and the whole image. The image to be annotated is displayed on the right. The diseased tissue is delineated via bounding boxes