| Literature DB >> 35845028 |
Linda S Yang1, Evelyn Perry1, Leonard Shan2, Helen Wilding3, William Connell1, Alexander J Thompson1, Andrew C F Taylor1, Paul V Desmond1, Bronte A Holt1.
Abstract
Background and aims: Artificial intelligence (AI) technology is being evaluated for its potential to improve colonoscopic assessment of inflammatory bowel disease (IBD), particularly with computer-aided image classifiers. This review evaluates the clinical application and diagnostic test accuracy (DTA) of AI algorithms in colonoscopy for IBD.
Methods: A systematic review was performed on studies evaluating AI in colonoscopy of adult patients with IBD. MEDLINE, Embase, Emcare, PsycINFO, CINAHL, Cochrane Library, and ClinicalTrials.gov were searched on 28 April 2021 for English-language articles published between January 1, 2000 and April 28, 2021. Risk of bias and applicability were assessed with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Diagnostic accuracy is presented as median (interquartile range [IQR]).
Results: Of 1029 records screened, nine studies with 7813 patients were included for review. AI was used to predict endoscopic and histologic disease activity in ulcerative colitis, and to differentiate Crohn's disease from Behçet's disease and intestinal tuberculosis. DTA of the AI algorithms ranged from 52 % to 91 %. For AI algorithms predicting endoscopic severity of disease, sensitivity was 78 % (range 72–83, IQR 5.5) and specificity was 91 % (range 86–96, IQR 5).
Conclusions: AI has primarily been used to assess disease activity in ulcerative colitis. Diagnostic performance is promising and suggests potential for other clinical applications of AI in IBD colonoscopy, such as dysplasia detection. However, current evidence is limited by retrospective data and models trained on still images only. Future prospective multicenter studies with full-motion videos are needed to replicate the real-world clinical setting.
This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Year: 2022 PMID: 35845028 PMCID: PMC9286774 DOI: 10.1055/a-1846-0642
Source DB: PubMed Journal: Endosc Int Open ISSN: 2196-9736
Fig. 1 PRISMA flow diagram. Template from: Page MJ, McKenzie JE, Bossuyt PM et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021; 372:n71
Study characteristics.
| Study | Design | Disease | AI model | Aim | Training dataset | Reference standard | Internal validation set | Test set | External validation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bhambhvani | Single center, retrospective | UC | CNN (101 layers) | Grading MES | Hyper-Kvasir publicly available retrospective dataset | 2 endoscopists | NA | 116 validation set | No |
| Gottlieb | Multicenter, retrospective | UC | Bidirectional RNN (5-fold cross-validation) | Grading MES and UCEIS, per colon section | 629 videos from Phase 2 trial (5.9 million frames) | 1 endoscopist | 157 | 157 videos (1.5 million frames) | Yes |
| Gutierrez Becker | Multicenter, retrospective | UC | CNN (5-fold cross-validation) | Grading MES per colon section | 351 sigmoidoscopy videos (4371 frames) from Phase 2 multicenter RCT | Central reader(s) for clinical trials | 1105 | 1672 sigmoidoscopy videos (no. of frames not reported) | Yes |
| Ozawa | Single center, retrospective | UC | CNN (22 layers) | Predicting endoscopic disease activity using MES (0–1 vs 2–3) | Retrospective database of day procedure clinic, 26,304 | 2–3 endoscopists | 114 | 3981 | No |
| Stidham | Single center, retrospective | UC | CNN (10-fold cross-validation) | Predicting endoscopic disease activity using MES (0–1 vs 2–3) | 16,514 | 2 endoscopists, 3rd reviewer for discrepancies | 304 (internal still images) | 1652 still images | Yes |
| Takenaka | Single center, prospective | UC | Deep neural network | Predicting UCEIS and histologic remission | 40,758 | 3 endoscopists | 875 internal | 4187 | No |
| Takenaka | Single center, prospective | UC | Deep neural network | Predicting patient prognosis | 40,758 | 3 endoscopists | 875 | 4187 | No |
| Yao | Single center, prospective | UC | CNN (5-fold cross-validation) | Grading MES per frame | 51 videos (60 frames per second) | 2 local endoscopists for training | 157 | 264 | Yes |
| Kim | Single center, retrospective | Crohn's | CNN | Differentiating Behçet's disease (BD), Crohn's disease (CD), and intestinal tuberculosis (ITB) | 5,237 | 2 endoscopists | 697 internal validation set | 697 validation set (286 BD, 244 CD, 167 ITB) | No |
CLE, confocal laser endomicroscopy; EC, endocytoscopy; LSTM, long short-term memory; MES, Mayo endoscopic subscore; NBI, narrow band imaging; RNN, recurrent neural network; UC, ulcerative colitis; UCEIS, UC endoscopic index of severity; WLE, white light endoscopy.
Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) risk of bias assessment.
| Study | Patient selection | Index test | Reference standard | Flow and timing | Tier |
| --- | --- | --- | --- | --- | --- |
| Bhambhvani et al. | Low | Low | Low | Unclear | 2 |
| Gottlieb et al. | Low | Low | High | Unclear | 3 |
| Gutierrez Becker et al. | High | Low | Unclear | Low | 3 |
| Ozawa et al. | Low | Low | Low | Low | 1 |
| Stidham et al. | Low | Low | Low | Low | 1 |
| Takenaka et al. | Low | Low | Low | Low | 1 |
| Takenaka et al. | Low | Low | Low | Low | 1 |
| Yao et al. | Low | Low | Unclear | High | 3 |
| Kim et al. | High | Low | High | High | 4 |
Outcomes of artificial intelligence models for prediction of ulcerative colitis disease activity using the Mayo endoscopic subscore.
| Study | Tier | Analysis | Sensitivity, % (95 % CI) | Specificity, % (95 % CI) | PPV, % (95 % CI) | NPV, % (95 % CI) | Accuracy, % | AUROC (95 % CI) | Agreement/other |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ozawa et al. | 1 | Overall | – | – | – | – | MES 0: 73 | MES 0 vs 1–3: 0.86 (0.84–0.87) | – |
| | | With vs without topical treatment | – | – | – | – | – | MES 0 vs 1–3: | Correlation between Matts grade |
| | | Each location of the colorectum | – | – | – | – | – | MES 0 vs 1–3 | – |
| | | Right colon | | | | | | 0.83 | |
| | | Left colon | | | | | | 0.83 | |
| | | Rectum | | | | | | 0.92 | |
| | | | | | | | | MES 0–1 vs 2–3 | |
| | | Right colon | | | | | | 0.99 | |
| | | Left colon | | | | | | 0.99 | |
| | | Rectum | | | | | | 0.94 | |
| Stidham et al. | 1 | MES 0–1 vs MES 2–3, images | 83 (81–85) | 96 (95–97) | 86 (85–88) | 94 (93–95) | MES 0: 89 | 0.970 (0.967–0.972) | κ 0.84 |
| | | MES 0–1 vs MES 2–3, video-based images | – | – | 68 (67–69) | 98 (97–99) | MES 0: 75 | 0.966 (0.963–0.969) | κ 0.75 |
| Bhambhvani | 2 | Overall/average | 72 | 86 | 78 | 87 | 77 | – | – |
| | | MES 1 | 67 | 91 | 74 | 88 | – | 0.89 | – |
| | | MES 2 | 86 | 68 | 78 | 80 | – | 0.86 | – |
| | | MES 3 | 64 | 97 | 82 | 93 | – | 0.96 | – |
| Gottlieb, 2020 | 3 | MES 0 | 88 (82–93) | 97 (94–99) | 78 (71–84) | 98 (96–100) | – | 0.92 | Endoscopic healing 96 % (95 % CI 92–99 %) |
| | | MES 1 | 65 (57–72) | 92 (88–96) | 73 (66–80) | 88 (83–93) | – | 0.78 | |
| | | MES 2 | 60 (53–68) | 77 (70–84) | 43 (35–51) | 87 (82–92) | – | 0.69 | |
| | | MES 3 | 74 (67–81) | 95 (92–98) | 91 (87–95) | 84 (79–90) | – | 0.85 | |
| Gutierrez Becker, 2021 | 3 | AI model trained on raw videos | – | – | – | – | – | Raw videos | – |
| | | AI model trained on external dataset still images | – | – | – | – | – | Raw videos | – |
| Yao, 2021 | 3 | Local validation set with informative image classifier | – | – | – | – | 78 | – | κ 0.84 (95 % CI 0.75–0.92) |
| | | Local validation set without informative image classifier | – | – | – | – | 65 | – | κ 0.63 (95 % CI 0.52–0.89) |
| | | External validation set with informative image classifier | MES 0: 50 | MES 0: 97 | – | – | 57 | MES 0: 0.95 | κ 0.59 (95 % CI 0.46–0.71) |
| | | Segments with a MES of 0 or 1 | 65 (54–75) | 98 (94–99) | 87 (76–94) | 92 (89–95) | 91 (88–94) | – | – |
| | | Per-patient assessment | 86 (75–94) | 93 (80–99) | 94 (85–99) | 83 (69–92) | 89 (81–94) | – | – |
AI, artificial intelligence; CI, confidence interval; MES, Mayo endoscopic subscore; κ, kappa coefficient; –, not recorded.
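The accuracy metrics tabulated above follow standard diagnostic test accuracy definitions. As an illustration only (the counts below are hypothetical and not taken from any included study), a minimal Python sketch deriving sensitivity, specificity, PPV, NPV, and accuracy from a 2×2 confusion matrix:

```python
def dta_metrics(tp, fp, fn, tn):
    """Diagnostic test accuracy metrics from a 2x2 confusion matrix.

    tp/fp: true/false positives; fn/tn: false/true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),               # true positive rate
        "specificity": tn / (tn + fp),               # true negative rate
        "ppv": tp / (tp + fp),                       # positive predictive value
        "npv": tn / (tn + fn),                       # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical example: 83 of 100 truly active segments flagged,
# 96 of 100 truly quiescent segments correctly cleared.
m = dta_metrics(tp=83, fp=4, fn=17, tn=96)
print(m["sensitivity"], m["specificity"])  # 0.83 0.96
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the validation set, which is one reason per-study values in the table are not directly comparable.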
Outcomes of artificial intelligence models for prediction of ulcerative colitis disease activity using the Ulcerative Colitis Endoscopic Index of Severity.
| Study | Tier | Sensitivity, % (95 % CI) | Specificity, % (95 % CI) | PPV, % (95 % CI) | NPV, % (95 % CI) | Accuracy, % (95 % CI) | AUROC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Takenaka et al. | 1 | 92 | 91 | 86 | 95 | – | – |
| Takenaka et al. | 1 | Endoscopic remission: 93 (92–94) | Endoscopic remission: 88 (87–88) | Endoscopic remission: 84 (83–85) | Endoscopic remission: 95 (94–96) | Endoscopic remission: 90 (89–91) | – |
| | | Histologic remission: 92 (91–93) | Histologic remission: 94 (93–94) | Histologic remission: 94 (93–95) | Histologic remission: 92 (91–93) | Histologic remission: 93 (92–94) | |
| Gottlieb et al. | 3 | UCEIS 0: 67 | UCEIS 0: 98 | UCEIS 0: 98 | UCEIS 0: 67 | – | UCEIS 0: 0.885 |
AUROC, area under receiver operating characteristic curve; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value; QWK, quadratic weighted kappa; UCEIS, UC endoscopic index of severity; –, not recorded.
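Several studies above report inter-rater or model-versus-reader agreement as a quadratic weighted kappa (QWK), which penalizes disagreements between ordinal grades (such as MES 0–3 or UCEIS bins) by the square of their distance. A minimal pure-Python sketch of the standard QWK formula, assuming two equal-length lists of integer grades (the function name and example grade lists are illustrative, not from the review):

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Quadratic weighted kappa between two lists of ordinal grades 0..n_classes-1."""
    n = len(rater_a)
    # Observed joint distribution and marginal histograms
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    hist_a = [0.0] * n_classes
    hist_b = [0.0] * n_classes
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
        hist_a[a] += 1 / n
        hist_b[b] += 1 / n

    # Quadratic disagreement weight for grades i and j
    def w(i, j):
        return (i - j) ** 2 / (n_classes - 1) ** 2

    # Observed vs chance-expected weighted disagreement
    num = sum(w(i, j) * observed[i][j]
              for i in range(n_classes) for j in range(n_classes))
    den = sum(w(i, j) * hist_a[i] * hist_b[j]
              for i in range(n_classes) for j in range(n_classes))
    return 1.0 - num / den

# Perfect agreement on MES grades yields kappa = 1.0
print(quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))  # 1.0
```

Because adjacent-grade disagreements (e.g. MES 1 vs MES 2) carry small weights while distant ones (MES 0 vs MES 3) carry large weights, QWK is better suited to graded severity scores than unweighted kappa.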
Fig. 2 Sensitivity and specificity of AI algorithms on ulcerative colitis. a Overall results. b Average sensitivity and specificity per MES grade.
Summary of limitations of current evidence and potential strategies for improvement.
| Limitation | Strategy for improvement | Potential benefit |
| --- | --- | --- |
| High interobserver variability for differentiating mild and moderate disease | Prospective and multicenter image dataset from various clinical settings (clinical trials, tertiary center, day procedures, primary care) | Wider clinical applicability of AI algorithms |
| Algorithms are trained on images without imperfections such as debris, tissue damage from biopsies, or poor focus | Prospective and multicenter image dataset with higher variation in the types of training images | Application of informative image classifier to raw full-motion videos |
| NBI or dye-based chromoendoscopy images not included | Prospective, multicenter study design | AI models for detection of IBD-related dysplasia |
AI, artificial intelligence; DMARDs, disease modifying anti-rheumatic drugs; IBD, inflammatory bowel disease; NBI, narrow band imaging; 5-ASA, 5-aminosalicylic acid.