Taesung Kim1, Jinhee Kim1, Hyuk Soon Choi2, Eun Sun Kim2, Bora Keum2, Yoon Tae Jeen2, Hong Sik Lee2, Hoon Jai Chun2, Sung Yong Han3, Dong Uk Kim3, Soonwook Kwon4, Jaegul Choo5, Jae Min Lee6.
Abstract
The advancement of artificial intelligence (AI) has facilitated its application in medical fields. However, there has been little research on AI-assisted endoscopy, despite the clinical significance of efficient and safe cannulation in endoscopic retrograde cholangiopancreatography (ERCP). In this study, we aim to assist endoscopists performing ERCP through automatic detection of the ampulla and identification of cannulation difficulty. We developed a novel AI-assisted system based on convolutional neural networks that predicts the location of the ampulla and the difficulty of cannulating it. ERCP data from 531 and 451 patients were used to evaluate our model on the two tasks, respectively. Our model detected the ampulla with a mean intersection-over-union (IoU) of 64.1%, precision of 76.2%, recall of 78.4%, and centroid distance of 0.021. In classifying cannulation difficulty, it achieved a recall of 71.9% for easy cases and 61.1% for difficult cases. Our model accurately detected the ampulla of Vater (AOV) across varying morphological shapes, sizes, and textures at a level on par with a human expert, and it showed promising results for recognizing cannulation difficulty. It demonstrated its potential to improve the quality of ERCP by assisting endoscopists.
Year: 2021 PMID: 33863970 PMCID: PMC8052314 DOI: 10.1038/s41598-021-87737-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. An example of an endoscopic image (left), its original bounding-box (bbox) annotation (middle), and the transformed pixel-wise soft mask label (right). The centroid of the bbox is marked as a dot. The labels (middle and right) are overlaid on the image to show their location.
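The bbox-to-soft-mask transformation in Figure 1 could be sketched as follows. The exact transformation is not given in this record, so this is only an illustration under the assumption of a 2D Gaussian centered on the bbox centroid; the function name `bbox_to_soft_mask` and the `sigma_scale` parameter are hypothetical.

```python
import math

def bbox_to_soft_mask(bbox, height, width, sigma_scale=0.5):
    """Build a height x width soft mask (nested lists) from one bbox.

    bbox is (x_min, y_min, x_max, y_max) in pixel coordinates.
    ASSUMPTION: a Gaussian falloff whose standard deviation per axis is
    sigma_scale times half the box size; the source does not specify this.
    """
    x_min, y_min, x_max, y_max = bbox
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    # Guard against zero-sized boxes to avoid division by zero.
    sx = max(sigma_scale * (x_max - x_min) / 2.0, 1e-6)
    sy = max(sigma_scale * (y_max - y_min) / 2.0, 1e-6)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            z = ((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2
            row.append(math.exp(-0.5 * z))  # 1.0 at the centroid, decays outward
        mask.append(row)
    return mask

# Toy 9 x 9 image with a bbox centered at pixel (4, 4).
mask = bbox_to_soft_mask((2, 2, 6, 6), height=9, width=9)
```

A soft label of this kind gives the network a graded target that peaks at the centroid, rather than a hard inside/outside boundary.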
Baseline characteristics of the enrolled patients who underwent ERCP.
| Characteristic | Value |
|---|---|
| Patients, n | 531 |
| Age, mean ± SD (years) | 66.0 ± 15.2 |
| Male, n (%) | 303 (57) |
| Biliary gallstone disease | 289 (54) |
| Malignant bile duct stricture | 104 (20) |
| Cholangitis or papillitis only | 72 (14) |
| Postoperative adverse event | 19 (4) |
| Benign biliary stricture | 15 (3) |
| Others | 32 (6) |
| Success, n (%) | 525 (99) |
| < 5 min | 362 (68) |
| ≥ 5 min | 163 (31) |
| Failure, n (%) | 6 (1) |
| Cannulation time, median ± SE (s) | 130.0 ± 305.5 |
ERCP endoscopic retrograde cholangiopancreatography, SD standard deviation, SE standard error.
Figure 2. Examples of model predictions and ground-truth (GT) labels. Each model prediction is presented in two ways: as a green bbox (upper) and as a heatmap (lower). In both cases, the white bbox indicates the GT label. The IoU and centroid distance are written above each prediction. The heatmaps show that the masks predicted by our model accurately match AOVs in size and shape, even for predictions with an IoU of around 30%.
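The two localization metrics shown above each prediction can be illustrated with a minimal sketch. It assumes boxes are given as `(x_min, y_min, x_max, y_max)` with coordinates normalized to the range 0 to 1 by image width and height; the normalization and the function names are assumptions, not details confirmed by this record.

```python
import math

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def centroid_distance(box_a, box_b):
    """Euclidean distance between box centers (in normalized units here)."""
    acx, acy = (box_a[0] + box_a[2]) / 2.0, (box_a[1] + box_a[3]) / 2.0
    bcx, bcy = (box_b[0] + box_b[2]) / 2.0, (box_b[1] + box_b[3]) / 2.0
    return math.hypot(acx - bcx, acy - bcy)

# Hypothetical prediction and GT label, not taken from the study data.
pred = (0.40, 0.40, 0.60, 0.60)
gt = (0.45, 0.45, 0.65, 0.65)
print(f"IoU = {iou(pred, gt):.3f}, centroid distance = {centroid_distance(pred, gt):.3f}")
```

The example also shows why a moderate IoU can coexist with a small centroid distance: a slight offset between two equally sized boxes lowers the overlap ratio quickly while the centers stay close.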
Figure 3. (a) The success plot with decreasing IoU threshold. The model that predicts soft masks outperforms the model that directly predicts bboxes. The former has a success rate of 91.4% at a threshold of 0.3, showing sufficient performance to assist the ERCP procedure. (b) The success plot with increasing centroid distance threshold. The model with soft mask output always performs better than the model with bbox output. The former also achieved a high success rate at the challenging threshold of 2%.
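A success plot of this kind can be computed as the fraction of test images that pass each threshold. The sketch below assumes a prediction counts as a success when its IoU is at least the threshold, or when its centroid distance is at most the threshold; the inclusive comparisons, function names, and per-image values are illustrative assumptions.

```python
def success_rate_iou(ious, threshold):
    """Fraction of predictions whose IoU is at least the threshold."""
    return sum(v >= threshold for v in ious) / len(ious)

def success_rate_distance(distances, threshold):
    """Fraction of predictions whose centroid distance is at most the threshold."""
    return sum(d <= threshold for d in distances) / len(distances)

def success_curve(values, thresholds, rate_fn):
    """Evaluate a success-rate function over a sequence of thresholds."""
    return [rate_fn(values, t) for t in thresholds]

# Hypothetical per-image results, not taken from the study data.
ious = [0.82, 0.64, 0.31, 0.12, 0.55]
distances = [0.010, 0.018, 0.035, 0.090, 0.015]

iou_curve = success_curve(ious, [0.9, 0.7, 0.5, 0.3, 0.1], success_rate_iou)
dist_curve = success_curve(distances, [0.01, 0.02, 0.05, 0.10], success_rate_distance)
```

Plotting `iou_curve` against its thresholds gives panel (a), and `dist_curve` gives panel (b).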
Performance for binary classification of cannulation difficulty.
| Model | VGG | ResNet | DenseNet | |||
|---|---|---|---|---|---|---|
| Easy | Difficult | Easy | Difficult | Easy | Difficult | |
| Precision | 0.787 ± 0.081 | 0.434 ± 0.083 | 0.772 ± 0.076 | 0.436 ± 0.047 | ||
| Recall | 0.607 ± 0.048 | 0.611 ± 0.098 | 0.631 ± 0.057 | 0.615 ± 0.045 | ||
| F1 score | 0.680 ± 0.016 | 0.511 ± 0.048 | 0.694 ± 0.061 | 0.507 ± 0.026 | ||
| ACC | 0.614 ± 0.024 | 0.627 ± 0.037 | ||||
| AUC | 0.657 ± 0.061 | 0.626 ± 0.034 | ||||
Bold fonts represent the best performance among the methods.
AUC area under the receiver operating characteristic curve, ACC accuracy.
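The per-class precision, recall, and F1 score reported in the table follow the standard definitions, which can be sketched as below. The labels and predictions are hypothetical toy data, and the helper names are illustrative rather than taken from the study's code.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 score, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical labels: 6 easy cases followed by 4 difficult cases.
y_true = ["easy"] * 6 + ["difficult"] * 4
y_pred = ["easy", "easy", "easy", "easy", "difficult", "difficult",
          "difficult", "difficult", "difficult", "easy"]

easy = per_class_metrics(y_true, y_pred, "easy")
difficult = per_class_metrics(y_true, y_pred, "difficult")
acc = accuracy(y_true, y_pred)
```

Reporting both classes separately, as the table does, exposes asymmetries that a single accuracy figure would hide, such as a lower precision for the less frequent class.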
Performance for four-class classification.
| Model | VGG | ResNet-macro F1 | ResNet-ACC | DenseNet |
|---|---|---|---|---|
| Macro-average precision | 0.360 ± 0.143 | 0.340 ± 0.033 | 0.393 ± 0.110 | |
| Macro-average recall | 0.342 ± 0.068 | 0.328 ± 0.041 | 0.379 ± 0.067 | |
| Macro-average F1 score | 0.304 ± 0.080 | 0.364 ± 0.029 | 0.350 ± 0.095 | |
| Accuracy | 0.691 ± 0.070 | 0.667 ± 0.078 | 0.699 ± 0.064 |
Bold fonts represent the best performance among the methods.
ResNet-macro F1 ResNet early stopped with macro-average F1-score, ResNet-ACC ResNet early stopped with accuracy.
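Macro-averaging gives every class equal weight regardless of its frequency, which is one reason accuracy and the macro-average F1 score can diverge as they do in the table above. The sketch below uses hypothetical integer class labels 0 to 3 and toy predictions; the actual four class definitions are not given in this record.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class in a multi-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn  # F1 = 2TP / (2TP + FP + FN)
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)

# Hypothetical imbalanced data: class 0 dominates, and the model always predicts it.
y_true = [0] * 7 + [1, 2, 3]
y_pred = [0] * 10

acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro = macro_f1(y_true, y_pred, labels=[0, 1, 2, 3])
print(f"accuracy = {acc:.3f}, macro-average F1 = {macro:.3f}")
```

In this toy case the accuracy stays high while the macro-average F1 score is low, because three of the four classes are never predicted. This may also explain why early stopping on macro-average F1 and early stopping on accuracy select different ResNet checkpoints.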
Figure 4. Grad-CAM results of the cannulation difficulty prediction model for correctly predicted examples. The heatmaps are the Grad-CAM outputs for the indicated GT label. They show where the model focused its attention to estimate cannulation difficulty. The lighter the color, the stronger the attention. The results show that the model attends to distinct features for each label.
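The core Grad-CAM computation behind such heatmaps can be sketched in a few lines. Each feature-map channel is weighted by the spatial mean of the gradient of the target class score with respect to that channel, the weighted maps are summed, and a ReLU keeps only positive evidence. The sketch assumes the activations and gradients have already been extracted from a convolutional layer; the nested-list inputs and toy values are hypothetical, and framework-specific extraction is omitted.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations, gradients: nested lists of shape [channels][height][width],
    holding a layer's feature maps and the gradients of the target class
    score with respect to them.
    """
    n_channels = len(activations)
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of the gradients over spatial positions.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    # Weighted sum over channels followed by ReLU.
    cam = [[max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_channels)))
            for j in range(width)]
           for i in range(height)]
    # Scale to [0, 1] for display; skip if the map is all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam

# Toy example: channel 0 supports the class, channel 1 opposes it.
activations = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 2.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],
             [[-1.0, -1.0], [-1.0, -1.0]]]
cam = grad_cam(activations, gradients)
```

In practice the resulting low-resolution map is upsampled to the input image size and overlaid as a heatmap, as in Figure 4.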