Literature DB >> 33967550

Artificial intelligence in gastroenterology and hepatology: Status and challenges.

Jia-Sheng Cao1, Zi-Yi Lu2, Ming-Yu Chen1, Bin Zhang1, Sarun Juengpanich2, Jia-Hao Hu1, Shi-Jie Li1, Win Topatana2, Xue-Yin Zhou3, Xu Feng1, Ji-Liang Shen1, Yu Liu4, Xiu-Jun Cai5.   

Abstract

Originally proposed by John McCarthy in 1955, artificial intelligence (AI) has achieved breakthroughs and revolutionized how clinical medicine processes its growing workloads of medical records and digital images. Doctors are paying attention to AI technologies for various diseases in the fields of gastroenterology and hepatology. This review illustrates the AI workflow for medical image analysis, including data processing, model establishment, and model validation. Furthermore, we summarize AI applications in endoscopy, radiology, and pathology, such as detecting and evaluating lesions, facilitating treatment, and predicting treatment response and prognosis with excellent model performance. The current challenges for AI in clinical application include potential inherent bias in retrospective studies, which requires larger samples for validation; ethical and legal concerns; and the incomprehensibility of the output results. Therefore, doctors and researchers should cooperate to address these challenges and carry out further investigations to develop more accurate AI tools for improved clinical applications. ©The Author(s) 2021. Published by Baishideng Publishing Group Inc. All rights reserved.

Entities:  

Keywords:  Artificial intelligence; Challenges; Gastroenterology; Hepatology; Status

Mesh:

Year:  2021        PMID: 33967550      PMCID: PMC8072192          DOI: 10.3748/wjg.v27.i16.1664

Source DB:  PubMed          Journal:  World J Gastroenterol        ISSN: 1007-9327            Impact factor:   5.742


Core Tip: Artificial intelligence (AI) technologies are widely used for medical image analysis in the gastroenterology and hepatology fields. Several AI models have been developed for accurate diagnosis, treatment, and prognosis based on endoscopic, radiological, and pathological images, achieving performance comparable to that of experts. However, we should be aware of certain constraints that limit the acceptance and utilization of AI tools in clinical practice. To use AI wisely, doctors and researchers should work together to address the current challenges and develop more accurate AI tools to improve patient care.

INTRODUCTION

Originally proposed by John McCarthy in 1955, artificial intelligence (AI), which involves machine learning (ML) and problem solving, has achieved breakthroughs and revolutionized how clinical medicine processes its growing workloads of medical records and digital images. In clinical practice, AI consists of several overlapping technologies such as ML, artificial neural networks (ANNs), deep learning (DL), convolutional neural networks (CNNs), and recurrent neural networks[1,2] (Figure 1). Since the 1980s, ML has been used to construct mathematical models and predict outcomes based on input data, and it is roughly divided into supervised (labeled data), unsupervised (unlabeled data), and semi-supervised (both labeled and unlabeled data) learning techniques[3]. Recently, as a subset of ML, ANNs have received increased interest because they can identify and learn from input data by themselves rather than relying on labels assigned by experts[4]. In the last decade, DL, a newer branch of ML, has shown great promise in clinical medicine. DL is particularly suitable for large, complex, or highly dimensional medical image analysis and predictive modeling tasks using the multiple layers of ANNs, including CNNs and recurrent neural networks[5,6]. Notably, given that convolutional and pooling layers can extract distinct features and fully connected layers can make a final classification, CNNs have demonstrated excellent performance in image recognition tasks such as endoscopy, radiology, and pathology[7,8].
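The role of convolutional and pooling layers described above can be illustrated with a minimal NumPy sketch: a hand-written vertical-edge kernel (an invented example, not a trained filter) is convolved over a toy 6 × 6 "image", and max-pooling then summarizes the resulting feature map for a downstream classifier.

```python
import numpy as np

# Toy 6x6 "image" with a vertical edge: left half 0, right half 1
img = np.zeros((6, 6))
img[:, 3:] = 1.0

# Hand-written 3x3 vertical-edge kernel (illustrative, not learned)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

def conv2d(x, k):
    """Valid convolution: slide the kernel and take elementwise sums."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def maxpool2d(x, size=2):
    """Non-overlapping max-pooling: keep the strongest response per block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

fmap = conv2d(img, kernel)   # 4x4 feature map: strong response along the edge
pooled = maxpool2d(fmap)     # 2x2 summary passed on toward classification
```

The convolution responds strongly (value 3) only where the window straddles the edge, and pooling preserves that response while discarding position detail, which is what makes the stacked conv/pool/fully-connected design effective for image recognition.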
Figure 1

Timeline and related technologies of artificial intelligence. AI: Artificial intelligence; ANN: Artificial neural network; CNN: Convolutional neural network; RNN: Recurrent neural network.

In the fields of gastroenterology and hepatology, doctors are paying attention to AI technologies for the diagnosis, treatment, and prognosis of various diseases because of the heterogeneous expertise levels of doctors (in endoscopy, radiology, and pathology), time-consuming procedures, and increasing workloads. Specifically, doctors usually assess medical images visually to detect and diagnose diseases based on personal expertise and experience. As digitalization matures, quantitative assessment of imaging information is replacing relatively inaccurate qualitative reasoning[9,10]. Traditional image review is time-consuming yet yields limited information. For example, AI-based processing of pathology images can determine histopathological classification and predict gene mutations in liver cancer[11], whereas conventional pathology assessment can identify only the nature of the mass. As a highly populous country, China has produced rapidly growing volumes of medical records, resulting in high workloads[12]. Despite the progress of AI, gastroenterologists and hepatologists should remain aware of its limitations, such as the retrospective design of most included studies and the use of databases that are not particularly suitable. In addition, doctors must prepare for the effects and changes that AI will bring to real-world clinical practice.
In this review, we aim to (1) introduce how AI technologies process input data, learn from these data, and validate the established models; (2) summarize AI applications in endoscopy, radiology, and pathology for accurate diagnosis, treatment, and prognosis; and (3) discuss the current limitations and future considerations of AI applications in the fields of gastroenterology and hepatology (Figure 2).
Figure 2

Artificial intelligence-assisted endoscopy, radiology, and pathology applications for medical image analysis in the fields of gastroenterology and hepatology, including detecting and evaluating lesions, facilitating treatment, and predicting treatment response and prognosis, and other potentials, using several deep learning models. CT: Computed tomography; MRI: Magnetic resonance imaging.


METHODS IN DEEP LEARNING

As one of the most suitable approaches for medical image analysis, DL does not require manually delineated regions of interest on images; feature selection and extraction are completed by the neural network structure itself[13,14]. After data collection and processing, an appropriate neural network is chosen to establish a model, followed by model validation to assess its true generalizability.

Data processing

Raw data are collected and analyzed, and corrupt data are identified and cleaned in the processing phase. Data selection methods are provided in Scikit-Learn[15], a Python machine learning library, and include univariate selection, feature importance, correlation matrices, and recursive feature elimination or addition. Other environments such as R (http://www.r-project.org) or MATLAB (MathWorks, Natick, MA, United States) also support AI development and provide similar approaches for specific tasks. Useful data and relevant variables from multiple data sources, which are applied to predict outcomes, are selected and divided into an initial training set and a testing set that allow training and internal validation of the model. Data in the training set should be distinct and nonredundant from data in the testing set. Notably, for small datasets, a higher proportion of data should be allocated to the testing set, and the performance of the trained model should be measured accurately through cross-validation or a bootstrapping procedure.
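The selection-then-split workflow above can be sketched with Scikit-Learn, the library the text names; the synthetic dataset and the parameter choices (5 selected features, a 70/30 split) are illustrative assumptions, not values from any cited study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split

# Synthetic tabular data standing in for extracted image features
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

# Univariate selection: keep the 5 features most associated with the label
selector = SelectKBest(f_classif, k=5)
X_sel = selector.fit_transform(X, y)

# Non-overlapping training and testing sets (70/30, stratified by class)
X_train, X_test, y_train, y_test = train_test_split(
    X_sel, y, test_size=0.3, random_state=0, stratify=y)
```

Stratifying the split keeps the class proportions identical in both sets, which matters most for the small datasets the paragraph warns about.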

Modeling

After transforming the data into an appropriate format, different tools are available for implementing ML. Although programming tools such as Python, R, and MATLAB differ, they provide similar options and algorithms whose parameters can be adjusted for specific tasks. The major classification algorithms for testing are Naive Bayes, decision trees, support vector machines, k-nearest neighbors, and ensemble classifiers. Oversampling or undersampling of imbalanced training data can improve the representation of classes and prevent model bias during the modeling stage. Currently, because the computational workload of batch learning is heavy, mini-batch learning over repeated epochs has become more popular and usually decreases errors in both the training and testing phases. However, an early stopping technique can be adopted to address overfitting if repeated epochs fail to reduce the error further. Based on evaluations of model performance, developers conduct feature engineering again to manipulate the features and improve the predictive value of the model. After the optimization phase, model selection is based primarily on trial and error and on the best performance for the specific problem. Finally, model optimization is performed by testing different parameter configurations.
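Two of the techniques above, oversampling an imbalanced training set and mini-batch training with early stopping, can be sketched together in Scikit-Learn; the 90/10 class imbalance, network size, and patience settings are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import resample
from sklearn.neural_network import MLPClassifier

# Imbalanced toy data: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# Oversample the minority class (with replacement) until classes balance
X_min, y_min = X[y == 1], y[y == 1]
X_over, y_over = resample(X_min, y_min, replace=True,
                          n_samples=int((y == 0).sum()), random_state=0)
X_bal = np.vstack([X[y == 0], X_over])
y_bal = np.concatenate([y[y == 0], y_over])

# Mini-batch training; early stopping halts when the held-out
# validation score stops improving for 5 consecutive epochs
clf = MLPClassifier(hidden_layer_sizes=(16,), batch_size=32,
                    early_stopping=True, validation_fraction=0.2,
                    n_iter_no_change=5, max_iter=200, random_state=0)
clf.fit(X_bal, y_bal)
```

After fitting, `clf.n_iter_` reports how many epochs actually ran before early stopping intervened, which is usually well below the `max_iter` ceiling.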

Model validation

To evaluate AI approaches, one of the most important requirements is external validation, sometimes called a blind test. A model developed within a single dataset will merely reflect its idiosyncrasies and will perform poorly in new settings. In addition, models can be validated by internal data testing (e.g., k-fold cross-validation). In k-fold cross-validation, the dataset is separated into k subsets, with one subset used for testing and the remaining (k-1) subsets used for training the model. The process is repeated k times so that every subset serves once as the testing set, and model performance is calculated as the average over all k iterations. The choice of k depends on the size of the dataset; for example, leave-one-out validation, in which k equals the dataset size, may be used for a small training set (< 200 data points). An appropriate and robust predictive model should show consistent performance between training and testing sets, indicating the absence of overfitting.
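The k-fold and leave-one-out procedures just described can be run directly in Scikit-Learn; the logistic-regression model and 100-sample synthetic dataset are placeholders for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, KFold, LeaveOneOut
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
model = LogisticRegression(max_iter=1000)

# 5-fold CV: each of the 5 subsets serves once as the test fold
scores = cross_val_score(model, X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
mean_acc = scores.mean()   # performance = average over the k iterations

# Leave-one-out for very small datasets: k equals the dataset size,
# so the model is fit 100 times, each time tested on a single point
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
```

Comparing `mean_acc` with the training-set accuracy is a quick check for the overfitting discrepancy the paragraph warns against.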

ARTIFICIAL INTELLIGENCE IN ENDOSCOPY

With the advent and continuous improvement of fiberoptics, endoscopy has played a significant role in the diagnosis and treatment of gastrointestinal diseases. Nevertheless, gastrointestinal diseases remain an enormous economic burden and cause high mortality worldwide. AI has been applied to endoscopy across gastroenterology[16-18], for example, in the identification of esophageal and gastric neoplasia in esophagogastroduodenoscopy (EGD), the detection of gastrointestinal bleeding in wireless capsule endoscopy (WCE), and polyp detection and characterization in colonoscopy[19-63] (Table 1).
Table 1

Summary of key studies on artificial intelligence-assisted endoscopy in gastroenterology fields

Ref. | Country | Disease studied | Design of study | Application | Number of cases | Type of machine learning algorithm | Accuracy (%) | Sensitivity/Specificity (%)
Esophagogastroduodenoscopy
Takiyama et al[19], 2018 | Japan | Anatomical location of upper gastrointestinal tract | Retrospective | Recognition of the anatomical location of the upper gastrointestinal tract | Training: 27335 images: 663 larynx, 3252 esophagus, 5479 upper stomach, 7184 middle stomach, 7539 lower stomach, and 3218 duodenum; Testing: 17081 images: 363 larynx, 2142 esophagus, 3532 upper stomach, 6379 middle stomach, 3137 lower stomach, and 1528 duodenum | CNNs | Larynx: 100; Esophagus: 100; Stomach: 99; Duodenum: 99 | Larynx: 93.9/100; Esophagus: 95.8/99.7; Stomach: 98.9/93; Duodenum: 87/99.2
Wu et al[20], 2019 | China | Diseases of upper gastrointestinal tract | Prospective | Monitor blind spots of upper gastrointestinal tract | Training: 1.28 million images from 1000 object classes; Testing: 3000 images for DCNN1, and 2160 images for DCNN2 | CNNs | 90.4 | 87.57/95.02
van der Sommen et al[21], 2016 | Netherlands | EN-BE | Retrospective | Detection of EN in BE | 21 patients with EN-BE (60 images), 23 patients without EN-BE (40 images) | SVM | NA | 86/87
Swager et al[22], 2017 | Netherlands | EN-BE | Retrospective | Detection of EN in BE | 60 images: 40 with EN-BE and 30 without EN-BE | SVM | 95 | 90/93
Hashimoto et al[23], 2020 | United States | EN-BE | Retrospective | Detection of EN in BE | Training: 916 images with EN-BE; Testing: 458 images: 225 dysplasia and 233 non-dysplasia | CNNs | 95.4 | 96.4/94.2
Ebigbo et al[24], 2020 | Germany | EAC-BE | Retrospective | Detection of EAC in BE | Training: 129 images; Testing: 62 images: 36 EAC and 26 normal BE | CNNs | 89.9 | 83.7/100
Horie et al[25], 2019 | Japan | EAC and ESCC | Retrospective | Detection of EAC and ESCC | Training: 384 patients with 32 EAC and 397 ESCC (8428 images); Testing: 47 patients with 8 EAC and 41 ESCC (1118 images) | CNNs | 98 | 98/79
Kumagai et al[26], 2019 | Japan | ESCC | Retrospective | Detection of ESCC | Training: 240 patients (4715 images: 1141 ESCC and 3574 benign lesions); Testing: 55 patients (1520 images: 467 ESCC and 1053 benign) | CNNs | 90.9 | 92.6/89.3
Zhao et al[27], 2019 | China | ESCC | Retrospective | Detection of ESCC | 165 patients with ESCC and 54 patients without ESCC (1383 images) | CNNs | 89.2 | 87.0/84.1
Cai et al[28], 2019 | China | ESCC | Retrospective | Detection of ESCC | Training: 746 patients (2438 images: 1332 abnormal and 1096 normal); Testing: 52 patients (187 images) | CNNs | 91.4 | 97.8/85.4
Nakagawa et al[29], 2019 | Japan | ESCC | Retrospective | Determination of invasion depth | Training: 804 patients with ESCC (14338 images: 8660 non-ME and 5678 ME); Testing: 155 patients with ESCC (914 images: 405 non-ME and 509 ME) | CNNs | SM1/SM2, 3: 91.0; Invasion depth: 89.6 | SM1/SM2, 3: 90.1/95.8; Invasion depth: 89.8/88.3
Tokai et al[30], 2020 | Japan | ESCC | Retrospective | Determination of invasion depth | Training: 1751 images with ESCC; Testing: 42 patients with ESCC (293 images) | CNNs | 80.9 | 84.1/80.9
Ali et al[31], 2018 | Pakistan | EGC | Retrospective | Detection of EGC | 56 patients with EGC, 120 patients without EGC | SVM | 87 | 91.0/82.0
Sakai et al[32], 2018 | Japan | EGC | Retrospective | Detection of EGC | Training: 58 patients (348943 images: 172555 EGC and 176388 normal); Testing: 58 patients (9650 images: 4653 EGC and 4997 normal) | CNNs | 87.6 | 80.0/94.8
Kanesaka et al[33], 2018 | Japan | EGC | Retrospective | Detection of EGC | Training: 126 images: 66 EGC and 60 normal; Testing: 81 images: 61 EGC and 20 normal | SVM | 96.3 | 96.7/95.0
Wu et al[34], 2019 | China | EGC | Retrospective | Detection of EGC | Training: 9691 images: 3710 EGC and 5981 normal; Testing: 100 patients: 50 EGC and 50 normal | CNNs | 92.5 | 94.0/91.0
Horiuchi et al[35], 2020 | Japan | EGC | Retrospective | Detection of EGC | Training: 2570 images: 1492 EGC and 1078 gastritis; Testing: 285 images: 151 EGC and 107 gastritis | CNNs | 85.3 | 95.4/71.0
Zhu et al[36], 2019 | China | Invasive GC | Retrospective | Determination of invasion depth | Training: 245 patients with GC and 545 patients without GC (5056 images); Testing: 203 images: 68 GC and 135 normal | CNNs | 89.2 | 76.5/95.6
Luo et al[37], 2019 | China | EAC, ESCC, and GC | Prospective | Detection of upper gastrointestinal cancers | Training: 15040 individuals (125898 images: 31633 cancer and 94265 control); Testing: 1886 individuals (15637 images: 3931 cancer and 11706 control) | CNNs | 91.5-97.7 | 94.2/85.8
Nagao et al[38], 2020 | Japan | GC | Retrospective | Determination of invasion depth | 1084 patients with GC (16557 images); Training: Testing = 4:1 | CNNs | 94.5 | 84.4/99.4
Wireless capsule endoscopy
Ayaru et al[39], 2015 | United Kingdom | Small bowel bleeding | Retrospective | Prediction of outcomes | Training: 170 patients with small bowel bleeding; Testing: 130 patients with small bowel bleeding | ANNs | Recurrent bleeding: 88; Therapeutic intervention: 88; Severe bleeding: 78 | Recurrent bleeding: 67/91; Therapeutic intervention: 80/89; Severe bleeding: 73/80
Xiao et al[40], 2016 | China | Small bowel bleeding | Retrospective | Detection of bleeding in GI tract | Training: 8200 images: 2050 bleeding and 6150 non-bleeding; Testing: 1800 images: 800 bleeding and 1000 non-bleeding | CNNs | 99.6 | 99.2/99.9
Usman et al[41], 2016 | South Korea | Small bowel bleeding | Retrospective | Detection of bleeding in GI tract | Training: 75000 pixels: 25000 bleeding and 50000 non-bleeding; Testing: 8000 pixels: 3000 bleeding and 5000 non-bleeding | SVM | 91.8 | 93.7/90.7
Sengupta et al[42], 2017 | United States | Small bowel bleeding | Retrospective | Prediction of 30-d mortality | Training: 4044 patients with small bowel bleeding; Testing: 2060 patients with small bowel bleeding | ANNs | 81 | 87.8/90.9
Leenhardt et al[43], 2019 | France | Small bowel bleeding | Retrospective | Detection of GIA | Training: 600 images: 300 hemorrhagic GIA and 300 non-hemorrhagic GIA; Testing: 600 images: 300 hemorrhagic GIA and 300 non-hemorrhagic GIA | CNNs | 98 | 100.0/96.0
Aoki et al[44], 2020 | Japan | Small bowel bleeding | Retrospective | Detection of small bowel bleeding | Training: 41 patients (27847 images: 6503 bleeding and 21344 normal); Testing: 25 patients (10208 images: 208 bleeding and 10000 non-bleeding) | CNNs | 99.89 | 96.63/99.96
Yang et al[45], 2020 | China | Small bowel polyps | Retrospective | Detection of small bowel polyps | 1000 images: 500 polyps and 500 non-polyps | SVM | 96.00 | 95.80/96.20
Vieira et al[46], 2020 | Portugal | Small bowel tumors | Retrospective | Detection of small bowel tumors | 39 patients (3936 images: 936 tumors and 3000 normal) | SVM | 97.6 | 96.1/98.3
Colonoscopy
Fernández-Esparrach et al[47], 2016 | Spain | Colorectal polyps | Retrospective | Detection of polyps | 24 videos containing 31 different polyps | Energy maps | 79 | 70.4/72.4
Komeda et al[48], 2017 | Japan | Colorectal polyps | Retrospective | Detection of polyps | Training: 1800 images: 1200 adenoma and 600 non-adenoma; Testing: 10 cases | CNNs | 70.0 | 83.3/50.0
Misawa et al[49], 2017 | Japan | Colorectal polyps | Retrospective | Detection of polyps | Training: 1661 images: 1213 neoplasm and 448 non-neoplasm; Testing: 173 images: 124 neoplasm and 49 non-neoplasm | SVM | 87.8 | 94.3/71.4
Misawa et al[50], 2018 | Japan | Colorectal polyps | Retrospective | Detection of polyps | 196631 frames: 63135 polyps and 133496 non-polyps | CNNs | 76.5 | 90.0/63.3
Chen et al[51], 2018 | China | Colorectal polyps | Retrospective | Detection of diminutive colorectal polyps | Training: 2157 images: 681 hyperplastic and 1476 adenomas; Testing: 284 images: 96 hyperplastic and 188 adenomas | DNNs | 90.1 | 96.3/78.1
Urban et al[52], 2018 | United States | Colorectal polyps | Retrospective | Detection of polyps | Training: 8561 images: 4008 polyps and 4553 non-polyps; Testing: 1330 images: 672 polyps and 658 non-polyps | CNNs | 96.4 | 96.9/95.0
Renner et al[53], 2018 | Germany | Colorectal polyps | Retrospective | Differentiation of neoplastic from non-neoplastic polyps | Training: 788 images: 602 adenomas and 186 non-adenomatous polyps; Testing: 186 images: 52 adenomas and 48 hyperplastic lesions | DNNs | 78.0 | 92.3/62.5
Wang et al[54], 2018 | United States | Colorectal polyps | Retrospective | Detection of polyps | Training: 5545 images: 3634 polyps and 1911 non-polyps; Testing: 27113 images: 5541 polyps and 21572 non-polyps | CNNs | 98 | 94.4/95.9
Mori et al[55], 2018 | Japan | Colorectal polyps | Prospective | A diagnose-and-leave strategy for diminutive, non-neoplastic rectosigmoid polyps | Training: 61925 images; Testing: 466 cases (287 neoplastic polyps, 175 nonneoplastic polyps, and 4 missing specimens) | SVM | 96.5 | 93.8/91.0
Byrne et al[56], 2019 | Canada | Colorectal polyps | Retrospective | Detection and classification of polyps | Training: 60089 frames of 223 videos (29% NICE type 1, 53% NICE type 2, and 18% normal mucosa with no polyp); Testing: 125 videos: 51 hyperplastic polyps and 74 adenomas | CNNs | 94.0 | 98.0/83.0
Blanes-Vidal et al[57], 2019 | Denmark | Colorectal polyps | Retrospective | Detection of polyps | 131 patients with polyps and 124 patients without polyps | CNNs | 96.4 | 97.1/93.3
Lee et al[58], 2020 | South Korea | Colorectal polyps | Retrospective | Detection of polyps | Training: 306 patients (8593 images: 8495 polyp and 98 normal); Testing: 15 patients (15 polyp videos) | CNNs | 93.4 | 89.9/93.7
Gohari et al[59], 2011 | Iran | CRC | Retrospective | Determination of prognostic factors of CRC | 1219 patients with CRC | ANNs | Colon cancer: 89; Rectum cancer: 82.7 | NA/NA
Biglarian et al[60], 2012 | Iran | CRC | Retrospective | Prediction of distant metastasis in CRC | 1219 patients with CRC | ANNs | 82 | NA/NA
Takeda et al[61], 2017 | Japan | CRC | Retrospective | Diagnosis of invasive CRC | Training: 5543 images: 2506 non-neoplasms, 2667 adenomas, and 370 invasive cancers; Testing: 200 images: 100 adenomas and 100 invasive cancers | SVM | 94.1 | 89.4/98.9
Ito et al[62], 2019 | Japan | CRC | Retrospective | Diagnosis of cT1b CRC | Training: 9942 images: 5124 cTis + cT1a, 4818 cT1b, and 2604 cTis + cT1a; Testing: 5022 images: 2604 cTis + cT1a, and 2418 cT1b | CNNs | 81.2 | 67.5/89.0
Zhou et al[63], 2020 | China | CRC | Retrospective | Diagnosis of CRC | Training: 3176 patients with CRC and 9003 patients without CRC (464105 images: 28071 CRC and 436034 non-CRC); Testing: 307 patients with CRC and 1956 patients without CRC (84615 images: 11675 CRC and 72940 non-CRC) | CNNs | 96.3 | 91.4/98.0

AI: Artificial intelligence; CNN: Convolutional neural network; EN: Early-stage neoplasia; BE: Barrett’s esophagus; SVM: Support vector machine; NA: Not available; EAC: Esophageal adenocarcinoma; ESCC: Esophageal squamous cell carcinoma; EGC: Early-stage gastric cancer; GC: Gastric cancer; ANN: Artificial neural network; GI: Gastrointestinal; GIA: Gastrointestinal angioectasia; DNN: Deep neural network; CRC: Colorectal cancer.


EGD

Inadequate examination of the upper gastrointestinal tract is one reason why several diseases are missed at EGD. With AI assistance, the upper gastrointestinal tract can be classified into the pharynx, esophagus, stomach (upper, middle, and lower), and duodenum with high values of the area under the curve (AUC)[19]. Furthermore, several AI technologies have classified gastric images within EGD to monitor blind spots, with accuracy reaching that of experienced endoscopists[20,34]. Barrett's esophagus (BE) is a potential risk factor for esophageal adenocarcinoma (EAC), whose prognosis is related to disease staging, and therefore warrants endoscopic surveillance[64,65]. However, accurate detection of esophageal neoplasia and early EAC remains difficult even for experienced endoscopists[66]. An AI system developed by Ebigbo et al[24] enabled early detection of EAC with high sensitivity and specificity, and they subsequently designed a real-time system for neoplasia classification under magnification. Accurate detection of early EAC in BE images is important, and a novel system with high accuracy for invasion assessment also deserves clinical attention[23]. Esophageal squamous cell carcinomas (ESCCs) are often diagnosed at advanced stages, while the detection of early ESCC depends on endoscopists' experience because these lesions are almost impossible to visualize with white-light endoscopy. Fortunately, AI technologies can recognize small esophageal lesions (< 10 mm); one AI system showed a diagnostic accuracy of 91.4%, higher than that of high-level (> 15 years of experience, 88.8%), mid-level (5-15 years, 81.6%), and junior-level (< 5 years, 77.2%) endoscopists[28]. In addition, the prognosis of ESCC can be estimated by differentiating tumor invasion depth[29,30]. The prognosis of gastric cancer (GC) mainly depends on early detection and the invasion depth of the disease.
It is extremely difficult for endoscopists to recognize early gastric cancer (EGC), which is often accompanied by gastric mucosal inflammation, and the false-negative rate of EGC at EGD has reached nearly 25.0%[67,68]. AI-assisted EGD has the potential to address this tough issue. However, the first reported CNN-based AI system for EGC detection had a low positive predictive value of 30.6%, misdiagnosing gastritis and misinterpreting the gastric angle as GC[69]. In 2019, Wu et al[34] examined AI-based detection of GC and validated it on 200 endoscopic images, with improved accuracy, sensitivity, and specificity (92.5%, 94%, and 91%, respectively). Furthermore, an AI system named GRAIDS achieved diagnostic sensitivity close to that of expert endoscopists (94.2% vs 94.5%) and demonstrated robust performance with high diagnostic accuracy in a multicenter study[37]. Besides detection, invasion depth is one of the most important criteria for curative resection. AI-based prediction of GC invasion depth was first developed by Kubota et al[70], whose model showed accuracies by T stage of 77% (T1), 49% (T2), 51% (T3), and 55% (T4). Considering that endoscopic mucosal resection is appropriate for intramucosal cancers (M) and submucosal cancers with invasion < 500 μm (SM1), a more detailed classification is urgently needed. Therefore, an AI system was developed to differentiate M/SM1 from SM2 (submucosal invasion ≥ 500 μm) in GC, with significantly higher sensitivity, specificity, and accuracy than skilled endoscopists[38].

WCE

AI-assisted WCE enables endoscopists to highlight suspicious regions noninvasively during examination of the digestive tract, including the detection of small bowel bleeding, ulcers, polyps, and celiac disease. Based on specific AI classifiers and validation techniques (mainly k-fold cross-validation), these models use still frames, pixels, or real-time videos to identify patients with small bowel bleeding, with accuracy above 90% in most studies[40,41,43,44]. A CNN-based algorithm, established in a retrospective analysis of 10000 WCE images (8200 and 1800 in the training and testing sets, respectively) and validated by 10-fold cross-validation, was proposed for automatic detection of small bowel bleeding. The model achieved a high F1 score of 99.6% and a precision of 99.9% for both active and inactive bleeding frames[40]. Besides detection, several emerging AI tools stratify patients by the likelihood of recurrent bleeding, the need for treatment, and estimated mortality, preventing repeated endoscopies in a significant proportion of patients with potential recurrent upper or lower gastrointestinal bleeding[39,42].
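The metrics reported throughout these studies (sensitivity, specificity, precision, F1) all reduce to simple confusion-matrix arithmetic. The frame counts below are invented for demonstration, not taken from the cited work:

```python
# Hypothetical bleeding-frame classifier results on a test set
tp, fp, fn, tn = 795, 1, 5, 999

sensitivity = tp / (tp + fn)   # recall: bleeding frames that were found
specificity = tn / (tn + fp)   # normal frames correctly cleared
precision   = tp / (tp + fp)   # flagged frames that are truly bleeding

# F1: harmonic mean of precision and sensitivity
f1 = 2 * precision * sensitivity / (precision + sensitivity)
```

Precision and F1 are the more informative pair when, as in WCE, bleeding frames are a small minority of the video, since specificity alone can look high even for a model that flags many false positives.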

Colonoscopy

Colorectal polyp detection and appropriate polypectomy during colonoscopy are the standard way to prevent colorectal cancer (CRC). Since missed colorectal polyps can progress into CRC, AI-assisted colonoscopy has been developed for polyp detection and characterization and for predicting the prognosis of CRC. For polyp detection, an automated AI system using an energy map was developed in 2016, showing barely satisfactory performance[47]. Urban et al[52] used 8641 labeled images and 20 colonoscopy videos as the training and testing sets to establish a CNN model for identifying colonic polyps, with an accuracy of 96.4%. Notably, models should be externally validated to confirm their accuracy. After the model developed by Wang et al[54] was validated on 27113 newly collected images from 1138 patients, it showed acceptable performance (sensitivity = 94.38%, specificity = 95.2%, and AUC = 0.984). In addition, polyp characterization with magnifying endoscopic images is useful for identifying pit or vascular patterns to improve performance. AI tools using narrow-band imaging[51] or endoscopic videos[56] can differentiate diminutive hyperplastic polyps from adenomas with high accuracy, and diminutive polyps (≤ 5 mm) may also be identified during colonoscopy[55]. AI may further assist doctors in predicting the prognosis of CRC: an ANN model developed from a dataset of 1219 CRC patients predicted patient survival and influential factors more accurately than a Cox regression model[59] and also enables doctors to predict the risk of distant metastases[60].

ARTIFICIAL INTELLIGENCE IN RADIOLOGY

Radiological imaging data are growing at a disproportionately faster rate than the number of trained radiologists, forcing radiologists to compensate by increasing productivity[71]. The emergence of AI technologies has eased this dilemma and dramatically advanced radiological image analysis, including ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI), in the fields of gastroenterology and hepatology. In addition, radiomics, an emerging technique in radiology and oncology, can extract abundant quantifiable, objective data to evaluate surgical resection and predict treatment response[72-112] (Table 2).
Table 2

Summary of key studies on artificial intelligence-assisted radiology in hepatology fields

Ref. Country Disease studied Design of study Application Number of cases Type of machine learning algorithm Outcomes (%)
Accuracy
Sensitivity/Specificity
Ultrasound-based medical image recognition
Gatos et al[72], 2016United StatesHepatic fibrosisRetrospectiveClassification of CLD85 images: 54 healthy and 31 CLDSVM8783.3/89.1
Gatos et al[73], 2017 | United States | Hepatic fibrosis | Retrospective | Classification of CLD | 124 images: 54 healthy and 70 CLD | SVM | 87.3 | 93.5/81.2
Chen et al[74], 2017 | China | Hepatic fibrosis | Retrospective | Classification of the stages of hepatic fibrosis in HBV patients | 513 HBV patients with different hepatic fibrosis stages (119 S0, 164 S1, 88 S2, 72 S3, and 70 S4) | SVM, Naive Bayes, RF, KNN | 82.87 | 92.97/82.50
Li et al[75], 2019 | China | Hepatic fibrosis | Prospective | Classification of the stages of hepatic fibrosis in HBV patients | 144 HBV patients | Adaptive boosting, decision tree, RF, SVM | 85 | 93.8/76.9
Gatos et al[76], 2019 | United States | Hepatic fibrosis | Retrospective | Classification of CLD | 88 healthy individuals (88 F0 fibrosis stage images) and 112 CLD patients (112 images: 46 F1, 16 F2, 22 F3, and 28 F4) | CNNs | 82.5 | NA/NA
Wang et al[77], 2019 | China | Hepatic fibrosis | Prospective | Classification of the stages of hepatic fibrosis in HBV patients | Training: 266 HBV patients (1330 images); Testing: 132 HBV patients (660 images) | CNNs | F4: 100; ≥ F3: 99; ≥ F2: 99 | F4: 100.0/100.0; ≥ F3: 97.4/95.7; ≥ F2: 100.0/97.7
Kuppili et al[78], 2017 | United States | MAFLD | Retrospective | Detection and characterization of FLD | 63 patients: 27 healthy and 36 MAFLD | ELM, SVM | ELM: 96.75; SVM: 89.01 | NA/NA
Byra et al[79], 2018 | Poland | MAFLD | Retrospective | Diagnosis of the amount of fat in the liver | 55 severely obese patients | CNNs, SVM | 96.3 | 100/88.2
Biswas et al[80], 2018 | United States | MAFLD | Retrospective | Detection and risk stratification of FLD | 63 patients: 27 healthy and 36 MAFLD | CNNs, SVM, ELM | CNNs: 100; SVM: 82; ELM: 92 | NA/NA
Cao et al[81], 2020 | China | MAFLD | Retrospective | Detection and classification of MAFLD | 240 patients: 106 healthy, 57 mild MAFLD, 67 moderate MAFLD, and 10 severe MAFLD | CNNs | 95.8 | NA/NA
Guo et al[82], 2018 | China | Liver tumors | Retrospective | Diagnosis of liver tumors | 93 patients with liver tumors: 47 malignant lesions (22 HCC, 5 CC, and 10 RCLM) and 46 benign lesions | DNNs | 90.41 | 93.56/86.89
Schmauch et al[83], 2019 | France | FLL | Retrospective | Detection and characterization of FLL | Training: 367 patients (367 images); Testing: 177 patients | CNNs | Detection: 93.5; Characterization: 91.6 | NA/NA
Yang et al[84], 2020 | China | FLL | Retrospective | Detection of FLL | Training: 1815 patients with FLL (18000 images); Testing: 328 patients with FLL (3718 images) | CNNs | 84.7 | 86.5/85.5
CT/MRI-based medical image recognition
Choi et al[85], 2018 | South Korea | Hepatic fibrosis | Retrospective | Staging liver fibrosis using CT images | Training: 7461 patients (3357 F0, 113 F1, 284 F2, 460 F3, 3247 F4); Testing: 891 patients (118 F0, 109 F1, 161 F2, 173 F3, 330 F4) | CNNs | 92.1-95.0 | 84.6-95.5/89.9-96.6
He et al[86], 2019 | United States | Hepatic fibrosis | Retrospective | Staging liver fibrosis using MRI images | Training: 225 CLD patients; Testing: 84 patients | SVM | 81.8 | 72.2/87.0
Ahmed et al[87], 2020 | Egypt | Hepatic fibrosis | Retrospective | Detection and staging of liver fibrosis using MRI images | 37 patients: 15 healthy and 22 CLD | SVM | 83.7 | 81.8/86.6
Hectors et al[88], 2020 | United States | Liver fibrosis | Retrospective | Staging liver fibrosis using MRI images | Training: 178 patients with liver fibrosis; Testing: 54 patients with liver fibrosis | CNNs | F1-F4: 85; F2-F4: 89; F3-F4: 91; F4: 83 | F1-F4: 84/90; F2-F4: 87/93; F3-F4: 97/83; F4: 68/94
Vivanti et al[89], 2017 | Israel | Liver tumors | Retrospective | Detection and segmentation of new tumors at follow-up using CT images | 246 liver tumors (97 new tumors) | CNNs | 86 | 70/NA
Yasaka et al[90], 2018 | Japan | Liver masses | Retrospective | Detection and differentiation of liver masses using CT images | Training: 460 patients with liver masses (1068 images: 240 Category A, 121 Category B, 320 Category C, 207 Category D, 180 Category E); Testing: 100 images with liver masses (21 Category A, 9 Category B, 35 Category C, 20 Category D, 15 Category E) | CNNs | 84 | Category A: 71/NA; Category B: 33/NA; Category C: 94/NA; Category D: 90/NA; Category E: 100/NA
Ibragimov et al[91], 2018 | United States | Liver diseases requiring SBRT | Retrospective | Prediction of hepatotoxicity after liver SBRT using CT images | 125 patients who underwent liver SBRT: 58 liver metastases, 36 HCC, 27 cholangiocarcinoma, and 4 other histopathologies | CNNs | 85 | NA/NA
Abajian et al[92], 2018 | United States | HCC | Retrospective | Prediction of HCC response to TACE using MRI images | 36 HCC patients treated with TACE | RF | 78 | 62.5/82.1
Zhang et al[93], 2018 | United States | HCC | Retrospective | Classification of HCC using MRI images | 20 patients with HCC | CNNs | 80 | NA/NA
Morshid et al[94], 2019 | United States | HCC | Retrospective | Prediction of HCC response to TACE using CT images | 105 HCC patients who received first-line treatment with TACE | CNNs | 74.2 | NA/NA
Nayak et al[95], 2019 | India | Cirrhosis; HCC | Retrospective | Detection of cirrhosis and HCC using CT images | 40 patients: 14 healthy, 12 cirrhosis, 14 cirrhosis with HCC | SVM | 86.9 | 100/95
Hamm et al[96], 2019 | United States | Common hepatic lesions | Retrospective | Classification of common hepatic lesions using MRI images | Training: 434 patients with common hepatic lesions; Testing: 60 patients with common hepatic lesions | CNNs | 92 | 92/98
Wang et al[97], 2019 | United States | Common hepatic lesions | Retrospective | Demonstration of a proof-of-concept interpretable DL system using MRI images | 60 patients with common hepatic lesions | CNNs | NA | 82.9/NA
Jansen et al[98], 2019 | Netherlands | FLL | Retrospective | Classification of FLL using MRI images | 95 patients with FLL (125 benign lesions: 40 adenomas, 29 cysts, and 56 hemangiomas; 88 malignant lesions: 30 HCC and 58 metastases) | RF | 77 | Adenoma: 80/78; Cyst: 93/93; Hemangioma: 84/82; HCC: 73/56; Metastasis: 62/77
Mokrane et al[99], 2020 | France | HCC | Retrospective | Diagnosis of HCC in patients with cirrhosis using CT images | Training: 106 patients (85 HCC and 21 non-HCC); Testing: 36 patients (23 HCC and 13 non-HCC) | SVM, KNN, RF | 70 | 70/54
Shi et al[100], 2020 | China | HCC | Retrospective | Detection of HCC from FLL using CT images | Training: 359 lesions (155 HCC and 204 non-HCC); Testing: 90 lesions (39 HCC and 51 non-HCC) | CNNs | 85.6 | 74.4/94.1
Alirr et al[101], 2020 | Kuwait | Liver tumors | Retrospective | Segmentation of liver tumors | Training: 100 images with liver tumors; Testing: 31 images with liver tumors | CNNs | 95.2 | NA/NA
Zheng et al[102], 2020 | China | Pancreatic cancer | Retrospective | Pancreas segmentation using MRI images | 20 patients with PDAC | CNNs | 99.86 | NA/NA
Radiomics
Liang et al[103], 2014 | China | HCC | Retrospective | Prediction of recurrence in HCC patients who received RFA | 83 patients with HCC receiving RFA as first treatment (18 recurrence and 65 non-recurrence) | SVM | 82 | 67/86
Zhou et al[104], 2017 | China | HCC | Retrospective | Characterization of HCC | 46 patients with HCC: 21 low-grade (Edmondson grades I and II) and 25 high-grade (Edmondson grades III and IV) | Free-form curve-fitting | 86.95 | 76.00/100.00
Abajian et al[105], 2018 | United States | HCC | Retrospective | Prediction of response to intra-arterial treatment | 36 patients who underwent trans-arterial treatment | RF | 78 | 62.5/82.1
Ibragimov et al[91], 2018 | United States | Liver tumors | Retrospective | Prediction of hepatobiliary toxicity of SBRT | 125 patients who underwent liver SBRT: 58 liver metastases, 36 HCC, 27 cholangiocarcinoma, and 4 other primary liver tumor histopathologies | CNNs | 85 | NA/NA
Morshid et al[94], 2019 | United States | HCC | Retrospective | Prediction of HCC response to TACE | 105 patients with HCC: 11 BCLC stage A, 24 BCLC stage B, 67 BCLC stage C, and 3 BCLC stage D | CNNs | 74.2 | NA/NA
Ma et al[106], 2019 | China | HCC | Retrospective | Prediction of MVI in HCC | Training: 110 patients with HCC (37 with MVI and 73 without MVI); Testing: 47 patients with HCC (18 with MVI and 29 without MVI) | SVM | 76.6 | 65.6/94.4
Dong et al[107], 2020 | China | HCC | Retrospective | Prediction and differentiation of MVI in HCC | Prediction: 322 patients with HCC (144 with MVI and 178 without MVI); Differentiation: 144 patients with HCC and MVI | RF, mRMR | Prediction: 63.4; Differentiation: 73.0 | Prediction: 89.2/48.4; Differentiation: 33.3/80.0
He et al[108], 2020 | China | HCC | Prospective | Prediction of MVI in HCC | Training: 101 patients with HCC; Testing: 18 patients with HCC | LASSO | 84.4 | NA/NA
Schoenberg et al[109], 2020 | Germany | HCC | Prospective | Prediction of disease-free survival after HCC resection | Training: 127 patients with HCC; Testing: 53 patients with HCC | RF | 78.8 | NA/NA
Zhao et al[110], 2020 | China | HCC | Retrospective | Prediction of ER of HCC after partial hepatectomy | Training: 78 patients with HCC (40 with ER and 38 without ER); Testing: 35 patients with HCC (18 with ER and 17 without ER) | LASSO | 80.8 | 80.0/81.6
Liu et al[111], 2020 | China | HCC | Retrospective | Prediction of progression-free survival of HCC patients after RFA and SR | RFA: Training: 149 patients who underwent RFA, Testing: 65 patients who underwent RFA; SR: Training: 144 patients who underwent SR, Testing: 61 patients who underwent SR | Cox-CNNs | RFA: 82.0; SR: 86.3 | NA/NA
Chen et al[112], 2021 | China | HCC | Retrospective | Prediction of HCC response to first TACE using CT images | Training: 355 patients with HCC; Testing: 118 patients with HCC | LASSO | 81 | 85.2/77.2

AI: Artificial intelligence; CLD: Chronic liver disease; SVM: Support vector machine; HBV: Hepatitis-B virus; RF: Random forests; KNN: K-nearest neighbor; CNN: Convolutional neural network; NA: Not available; MAFLD: Metabolic associated fatty liver disease; FLD: Fatty liver disease; ELM: Extreme learning machine; HCC: Hepatocellular carcinoma; CC: Cholangiocarcinoma; RCLM: Colorectal cancer liver metastases; DNN: Deep neural network; FLL: Focal liver lesions; SBRT: Stereotactic body radiation therapy; TACE: Transarterial chemoembolization; PDAC: Pancreatic ductal adenocarcinoma; RFA: Radiofrequency ablation; BCLC: Barcelona clinic liver cancer staging; MVI: Microvascular invasion; mRMR: Minimum redundancy maximum relevance; LASSO: Least absolute shrinkage and selection operator; ER: Early recurrence; SR: Surgical resection.

Summary of key studies on artificial intelligence-assisted radiology in hepatology fields.

Abdominal ultrasound

AI technologies have been applied to abdominal ultrasound images for the assessment of liver diseases such as hepatic fibrosis and mass lesions. Gatos et al[72] developed a support vector machine-derived approach to detect and classify chronic liver disease (CLD) on abdominal ultrasound. After quantifying 85 ultrasound images (54 healthy and 31 with CLD), the proposed model achieved superior results (accuracy = 87.0%, sensitivity = 83.3%, and specificity = 89.1%), substantially improving the diagnostic and classification accuracy for CLD. Furthermore, CNNs have been employed to identify and isolate elastography regions whose stiffness is temporally unstable and to explore their impact on CLD diagnosis; after such unreliable areas were excluded and interobserver variability was reduced, the updated detection algorithm raised the accuracy to 95.5%[76]. Detecting hepatic mass lesions and classifying them as benign or malignant is equally important. Schmauch et al[83] performed supervised training on 367 ultrasound images paired with their radiological reports to build a DL model; the resulting algorithm reached high areas under the receiver operating characteristic curve of 0.93 for lesion detection and 0.916 for characterization. Although the model could increase diagnostic accuracy and flag potentially malignant mass lesions, it requires further validation. In addition, combining AI technologies with contrast-enhanced ultrasound may improve the identification and characterization of liver cancer: when AI-assisted contrast-enhanced ultrasound was applied to detect liver lesions in the arterial, portal, and late phases, the accuracy, sensitivity, and specificity of the examination increased markedly[82]. By contrast, because the gastrointestinal tract is poorly visualized on ultrasound, AI-assisted ultrasound tools remain limited in gastroenterology.
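To make the classification setup these ultrasound studies share more concrete, the sketch below trains a support vector machine to separate "healthy" from "CLD" cases using two image-derived features. The feature names (mean echo intensity, gray-level variance), the data, and the class separation are synthetic assumptions for illustration only, not the features or results of the cited models.

```python
# Illustrative sketch: a two-feature SVM for healthy vs chronic liver disease.
# All data and feature definitions are synthetic assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical features: [mean echo intensity, gray-level variance]
healthy = rng.normal(loc=[0.40, 0.10], scale=0.05, size=(60, 2))
cld = rng.normal(loc=[0.55, 0.22], scale=0.05, size=(60, 2))
X = np.vstack([healthy, cld])
y = np.array([0] * 60 + [1] * 60)  # 0 = healthy, 1 = CLD

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_tr, y_tr)
accuracy = model.score(X_te, y_te)  # held-out accuracy, as reported in Table 2
```

Real systems extract dozens of texture descriptors rather than two, but the train/held-out evaluation pattern is the same one behind the accuracy figures in the table above.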

CT/MRI

Liver lesions often show indeterminate behavior on abdominal CT, and biopsy is then recommended by the European Association for the Study of the Liver guidelines[113]. Based on a large CT dataset (7461 patients spanning fibrosis stages F0-F4), a CNN model was developed that outperformed the radiologists' interpretation in staging liver fibrosis[85]. Furthermore, using a CNN trained on contrast-enhanced CT images from 460 patients, Yasaka et al[90] conducted a retrospective study that classified liver masses into five categories with high accuracy: (1) primary hepatocellular carcinoma (HCC); (2) malignant tumors other than HCC; (3) early HCC, indeterminate masses, or dysplastic nodules; (4) hemangiomas; and (5) cysts. For patients with liver tumors or pancreatic cancer, liver or pancreas segmentation is crucial for assessing lesions and devising an appropriate treatment plan. Instead of conventional manual segmentation, a CNN model was proposed to segment liver tumors on CT images with an accuracy above 80.0%, supporting treatment decision-making[101], and another CNN model was developed for pancreas localization and segmentation on CT images[102]. Monitoring tumor recurrence also plays an important role in follow-up CT. Vivanti et al[89] collected and integrated the initial appearance of tumors, their CT behavior, and quantification of tumor burden throughout the disease course, and designed an automated model that detected tumor recurrence with an accuracy of 86%. Beyond CT, DL approaches to pancreas segmentation can also be built on MRI images. Several AI-assisted studies combining MRI liver lesions with risk factors and patients' clinical data have shown promising results, improving the accuracy and yield of reference models[93,96,98,102].
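The CNN models cited above all rest on the same primitive: convolving an image patch with kernels, applying a nonlinearity, and pooling. The minimal NumPy forward pass below runs that pipeline on a toy "CT patch"; the hand-written edge kernel stands in for learned weights, and the patch itself is a synthetic placeholder, not real imaging data.

```python
# A single conv -> ReLU -> max-pool step, the building block of the CNNs above.
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    h, w = x.shape
    h, w = h - h % size, w - w % size  # trim to a multiple of the pool size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy 8x8 "CT patch" with one bright lesion-like square.
patch = np.zeros((8, 8))
patch[2:6, 2:6] = 1.0
kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # crude vertical-edge detector
feature_map = max_pool(relu(conv2d(patch, kernel)))
```

A trained network stacks many such layers with learned kernels; segmentation architectures add a decoding path that maps feature maps back to per-pixel labels.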

Radiomics

Radiomics has attracted great interest from doctors because this AI-assisted technology can extract quantitative, objective features that are not discernible to the human eye from radiological images and reveal their associations with underlying biological processes[114,115]. Preoperative stratification of patients by recurrence risk and prediction of survival after resection are fundamental to improving prognosis. Microvascular invasion (MVI), an independent risk factor for recurrence, cannot be identified by conventional radiological techniques[116]; several studies have therefore used radiomic algorithms based on ultrasound, CT, or MRI to build radiomic signatures for preoperative prediction of MVI[106-108]. Beyond recurrence, radiomics may also be used to predict survival after surgical resection, although, compared with the excellent AI models based on pathologic images, radiomics-based predictive models have attained only a modest accuracy of about 79%[109]. Radiomics can further be used to predict patients' response to transarterial chemoembolization (TACE) and radiofrequency ablation (RFA), as well as post-radiotherapy hepatotoxicity. A CNN model developed from the CT images of 105 HCC patients predicted response to TACE more accurately than the Barcelona Clinic Liver Cancer stages[94]. In addition, Chen et al[112] designed an excellent clinical-radiomic model to predict the objective response to a first TACE based on the CT images of 595 HCC patients, which could assist in selecting HCC patients for TACE, and another study combined MRI radiomics with clinical data to predict TACE response[105]. For early-stage HCC, RFA is a recommended option; based on radiomics, Liang et al[103] designed a model to predict the RFA response and HCC recurrence after RFA, obtaining high AUC, sensitivity, and specificity.
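A typical radiomics pipeline reduces to: compute many quantitative features per lesion, then let a sparsity-inducing model such as LASSO keep the informative few. The sketch below does this with L1-penalized logistic regression on synthetic features and MVI labels; the feature count, signal structure, and regularization strength are assumptions for illustration, not values from the cited papers.

```python
# Hedged sketch of LASSO-style feature selection for MVI prediction.
# Features and labels are synthetic; real radiomics features come from
# segmented tumour regions (shape, intensity, texture descriptors).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_patients, n_features = 120, 40
X = rng.normal(size=(n_patients, n_features))
# Assume only 3 of the 40 features truly carry MVI signal.
signal = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2]
y = (signal + rng.normal(scale=0.5, size=n_patients) > 0).astype(int)

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.3)
lasso.fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])  # indices of retained features
```

The L1 penalty drives most coefficients exactly to zero, so the surviving indices form the "radiomic signature" that papers such as [106-108] report alongside its predictive performance.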
Additionally, post-radiotherapy hepatotoxicity should be monitored so that the position and dose of radiotherapy can be adjusted. A CNN model not only identified that irradiation of the proximal portal vein was associated with poor prognosis but also predicted post-radiotherapy hepatotoxicity with an AUC of 0.85[91]. Specifically, Ibragimov et al[91] applied a CNN model to uncover consistent patterns in toxicity-related dose plans, and the AUC of the dose-plan analysis increased from 0.79 to 0.85 after combination with pre-treatment clinical features, showing that the combined framework can indicate an accurate radiotherapy position and dose.

ARTIFICIAL INTELLIGENCE IN PATHOLOGY

Pathological analysis is considered the gold standard for diagnosing diseases in the fields of gastroenterology and hepatology. However, there is currently a worldwide shortage of pathologists, which has become an obstacle to maintaining the accuracy of pathological analysis[117]. With the development of whole-slide imaging (WSI) scanners and AI technologies, combining the two can ease the medical burden, improve diagnostic accuracy, and even predict gene mutations and prognosis[118-147] (Table 3).
Table 3

Summary of key studies on artificial intelligence-assisted pathology in the gastroenterology and hepatology fields

Ref. | Country | Disease studied | Design of study | Application | Number of cases | Type of machine learning algorithm | Accuracy (%) | Sensitivity/Specificity (%)
Basic AI-based pathology: diagnosis
Tomita et al[118], 2019 | United States | BE and EAC | Retrospective | Detection and classification of cancerous and precancerous esophagus tissue | Training: 379 images in 4 classes (normal, BE-no-dysplasia, BE-with-dysplasia, and adenocarcinoma); Testing: 123 images in the same 4 classes | CNNs | Mean: 83; BE-no-dysplasia: 85; BE-with-dysplasia: 89; Adenocarcinoma: 88 | Normal: 69/71; BE-no-dysplasia: 77/88; BE-with-dysplasia: 21/97; Adenocarcinoma: 71/91
Sharma et al[119], 2017 | Germany | GC | Retrospective | Classification and necrosis detection of GC | 454 patients (6810 WSIs: 4994 for cancer classification and 1816 for necrosis detection; HER2 immunohistochemical and HE stained) | CNNs | Cancer classification: 69.90; Necrosis detection: 81.44 | NA/NA
Li et al[120], 2018 | China | GC | Retrospective | Detection of GC | 700 images: 560 GC and 140 normal (HE stained) | CNNs | 100 | NA/NA
Leon et al[121], 2019 | Colombia | GC | Retrospective | Detection of GC | 40 images: 20 benign and 20 malignant | CNNs | 89.72 | NA/NA
Sun et al[122], 2019 | China | GC | Retrospective | Diagnosis of GC | 500 WSIs of gastric areas with typical cancerous regions | DNNs | 91.6 | NA/NA
Ma et al[123], 2020 | China | GC | Retrospective | Classification of lesions in the gastric mucosa | Training: 534 WSIs (1616713 images: 544925 normal, 544624 chronic gastritis, and 527164 cancer; HE stained); Testing: 153 WSIs (399240 images: 135446 normal, 125783 chronic gastritis, and 138011 cancer; HE stained) | CNNs, RF | Benign and cancer: 98.4; Normal, chronic gastritis, and GC: 94.5 | Benign and cancer: 98.0/98.9; Normal, chronic gastritis, and GC: NA/NA
Yoshida et al[124], 2018 | Japan | Gastric lesions | Retrospective | Classification of gastric biopsy specimens | 3062 gastric biopsy specimens (HE stained) | CNNs | 55.6 | 89.5/50.7
Qu et al[125], 2018 | Japan | Gastric lesions | Retrospective | Classification of gastric pathology images | Training: 1080 patches (540 benign and 540 malignant); Testing: 5400 patches (2700 benign and 2700 malignant) | CNNs | 96.5 | NA/NA
Iizuka et al[126], 2020 | Japan | Gastric and colonic epithelial tumors | Retrospective | Classification of gastric and colonic epithelial tumors | 4128 cases of human gastric epithelial lesions and 4036 cases of colonic epithelial lesions (HE stained) | CNNs, RNNs | Gastric adenocarcinoma: 97; Gastric adenoma: 99; Colonic adenocarcinoma: 96; Colonic adenoma: 99 | NA/NA
Korbar et al[127], 2017 | United States | Colorectal polyps | Retrospective | Classification of different types of colorectal polyps on WSIs | Training: 458 WSIs; Testing: 239 WSIs | A modified version of a residual network | 93 | 88.3/NA
Wei et al[128], 2020 | United States | Colorectal polyps | Retrospective | Classification of colorectal polyps on WSIs | Training: 326 slides (37 tubular, 30 tubulovillous or villous, 111 hyperplastic, 140 sessile serrated, and 8 normal); Testing: 238 slides (95 tubular, 78 tubulovillous or villous, 41 hyperplastic, and 24 sessile serrated) | CNNs | Tubular: 84.5; Tubulovillous or villous: 89.5; Hyperplastic: 85.3; Sessile serrated: 88.7 | Tubular: 73.7/91.6; Tubulovillous or villous: 97.6/87.8; Hyperplastic: 60.3/97.5; Sessile serrated: 79.2/89.7
Shapcott et al[129], 2018 | United Kingdom | CRC | Retrospective | Diagnosis of CRC | 853 hand-marked images | CNNs | 84 | NA/NA
Geessink et al[130], 2019 | Netherlands | CRC | Retrospective | Quantification of intratumoral stroma in CRC | 129 patients with CRC | CNNs | 94.6 | 91.1/99.4
Song et al[131], 2020 | China | CRC | Retrospective | Diagnosis of CRC | Training: 177 slides (156 adenoma and 21 non-neoplasm); Testing: 362 slides (167 adenoma and 195 non-neoplasm) | CNNs | 90.4 | 89.3/79.0
Wang et al[132], 2015 | China | Hepatic fibrosis | Retrospective | Assessment of HBV-related liver fibrosis and detection of liver cirrhosis | Training: 105 HBV patients; Testing: 70 HBV patients | SVM | 82 | NA/NA
Forlano et al[133], 2020 | United Kingdom | MAFLD | Retrospective | Detection and quantification of histological features of MAFLD | Training: 100 MAFLD patients; Testing: 146 MAFLD patients | K-means | Steatosis: 97; Inflammation: 96; Ballooning: 94; Fibrosis: 92 | NA/NA
Li et al[134], 2017 | China | HCC | Retrospective | Nuclei grading of HCC | 4017 HCC nuclei patches | CNNs | 96.7 | G1: 94.3/97.5; G2: 96.0/97.0; G3: 97.1/96.6; G4: 99.5/95.8
Kiani et al[135], 2020 | United States | Liver cancer (HCC and CC) | Retrospective | Histopathologic classification of liver cancer | Training: 70 WSIs (35 HCC and 35 CC); Testing: 80 WSIs (40 HCC and 40 CC) | SVM | 84.2 | 72/95
Advanced AI-based pathology: prediction of gene mutations and prognosis
Steinbuss et al[136], 2020 | Germany | Gastritis | Retrospective | Identification of gastritis subtypes | Training: 92 patients (825 images: 398 low inflammation, 305 severe inflammation, and 122 A gastritis; HE stained); Testing: 22 patients (209 images: 122 low inflammation, 38 severe inflammation, and 49 A gastritis; HE stained) | CNNs | 84 | A gastritis: 88/89; B gastritis: 100/93; C gastritis: 83/100
Liu et al[137], 2020 | China | Gastrointestinal neuroendocrine tumor | Retrospective | Prediction of Ki-67 positive cells | 12 patients (18762 images: 5900 positive cells, 6086 positive cells, and 6776 background from ROIs; HE and IHC stained) | CNNs | 97.8 | 97.8/NA
Kather et al[138], 2019 | Germany | GC and CRC | Retrospective | Prediction of MSI in GC and CRC | Training: 360 patients (93408 tiles); Testing: 378 patients (896530 tiles) | CNNs | 84 | NA/NA
Bychkov et al[139], 2018 | Finland | CRC | Retrospective | Prediction of CRC outcome | 420 CRC tumor tissue microarray samples | CNNs, RNNs | 69 | NA/NA
Kather et al[140], 2019 | Germany | CRC | Retrospective | Prediction of survival from CRC histology slides | Training: 86 CRC tissue slides (> 100000 HE image patches); Testing: 25 CRC patients (7180 images) | CNNs | 98.7 | NA/NA
Echle et al[141], 2020 | Germany | CRC | Retrospective | Detection of dMMR or MSI in CRC | Training: 5500 patients; Testing: 906 patients | A modified ShuffleNet DL system | 92 | 98/52
Skrede et al[142], 2020 | Norway | CRC | Retrospective | Prediction of CRC outcome after resection | Training: 828 patients (> 12000000 image tiles); Testing: 920 patients | CNNs | 76 | 52/78
Sirinukunwattana et al[143], 2020 | United Kingdom | CRC | Retrospective | Identification of consensus molecular subtypes of CRC | Training: 278 patients with CRC; Testing: 574 patients with CRC (144 biopsies and 430 TCGA) | Neural networks with domain-adversarial learning | Biopsies: 85; TCGA: 84 | NA/NA
Jang et al[144], 2020 | South Korea | CRC | Retrospective | Prediction of gene mutations in CRC | Training: 629 WSIs with CRC (HE stained); Testing: 142 WSIs with CRC (HE stained) | CNNs | 64.8-88.0 | NA/NA
Chaudhary et al[145], 2018 | United States | HCC | Retrospective | Identification of survival subgroups of HCC | Training: 360 HCC patients' RNA-seq, miRNA-seq, and methylation data from TCGA; Testing: 684 HCC patients' data (LIRI-JP cohort: 230; NCI cohort: 221; Chinese cohort: 166; E-TABM-36 cohort: 40; Hawaiian cohort: 27) | DL | LIRI-JP cohort: 75; NCI cohort: 67; Chinese cohort: 69; E-TABM-36 cohort: 77; Hawaiian cohort: 82 | NA/NA
Saillard et al[146], 2020 | France | HCC | Retrospective | Prediction of the survival of HCC patients treated by surgical resection | Training: 206 HCC patients (390 WSIs); Testing: 328 HCC patients (342 WSIs) | CNNs (SCHMOWDER and CHOWDER) | SCHMOWDER: 78; CHOWDER: 75 | NA/NA
Chen et al[11], 2020 | China | HCC | Retrospective | Classification and gene mutation prediction of HCC | Training: 472 WSIs (383 HCC and 89 normal liver tissue); Testing: 101 WSIs (67 HCC and 34 normal liver tissue) | CNNs | Classification: 96.0; Tumor differentiation: 89.6; Gene mutation: 71-89 | NA/NA
Fu et al[147], 2020 | United Kingdom | EAC, GC, CRC, and liver cancers | Retrospective | Prediction of mutations, tumor composition, and prognosis | 17335 HE-stained images of 28 cancer types | CNNs | Variable across tumors/gene alterations | NA/NA

AI: Artificial intelligence; BE: Barrett’s esophagus; EAC: Esophageal adenocarcinoma; CNN: Convolutional neural network; GC: Gastric cancer; WSI: Whole-slide image; NA: Not available; DNN: Deep neural network; RF: Random forests; RNN: Recurrent neural network; CRC: Colorectal cancer; HBV: Hepatitis-B virus; SVM: Support vector machine; MAFLD: Metabolic associated fatty liver disease; HCC: Hepatocellular carcinoma; CC: Cholangiocarcinoma; ROI: Region of interest; IHC: Immunohistochemistry; MSI: Microsatellite instability; dMMR: Mismatch-repair deficiency; TCGA: The Cancer Genome Atlas; DL: Deep learning.


Basic AI-assisted pathology: diagnosis

The basic role of pathology is disease diagnosis. In gastroenterology, there is an increasing need for automatic pathological analysis and diagnosis of GC. Based on digitized pathological slides, several studies have identified and classified GC automatically with high AUCs[120-122,126]; for example, a CNN model developed to distinguish gastric mass lesions, including gastric adenocarcinoma, adenoma, and non-neoplastic lesions, achieved its highest AUC of 0.97 for identifying gastric adenocarcinoma[126]. With regard to colorectal lesions, Wei et al[128] trained an AI-assisted model to classify colorectal polyps on WSIs; notably, its performance matched that of local pathologists both at the originating institution and at external institutions. Similarly, a model based on more than 400 WSIs was developed to differentiate five common subtypes of colorectal polyps with an accuracy of 93%[127], and in CRC, Shapcott et al[129] performed a retrospective study to develop a CNN model for diagnosis based on 853 hand-marked images, reaching an accuracy of 84%. In hepatology, AI-assisted pathology has been applied to patients with hepatitis B virus (HBV) infection, metabolic associated fatty liver disease, HCC, and other conditions. An automated, stain-free AI system can quantify fibrillar collagen to evaluate the degree of HBV-related fibrosis with AUCs > 0.82[132]. For patients with metabolic associated fatty liver disease, AI-assisted pathology tools have been used to identify and quantify pathological changes, including steatosis, macrosteatosis, lobular inflammation, ballooning, and fibrosis[133], and the algorithms' output scores agreed well with the assessments of experienced pathologists. However, few AI-assisted pathology tools have been built for HCC diagnosis.
Notably, the MFC-CNN-ELM program was designed for nuclei grading of biopsy specimens from HCC patients and showed high performance in classifying tumor cells at different differentiation stages[134].
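The WSI systems above typically work patch-wise: tile the gigapixel slide, score every tile, then aggregate tile scores into a slide-level call. The toy sketch below shows that aggregation step, with a simple intensity threshold standing in for the per-tile CNN; tile size, thresholds, and "slides" are arbitrary assumptions.

```python
# Patch-based slide classification, sketched on synthetic arrays.
import numpy as np

def tile(slide, size):
    """Cut a slide array into non-overlapping size x size patches."""
    h, w = slide.shape
    return [slide[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def slide_prediction(slide, size=64, tile_threshold=0.5, slide_threshold=0.05):
    """Call the slide positive when enough tiles score above threshold."""
    scores = [patch.mean() for patch in tile(slide, size)]  # stand-in tile scores
    positive_fraction = np.mean([s > tile_threshold for s in scores])
    return bool(positive_fraction >= slide_threshold)

rng = np.random.default_rng(2)
benign_slide = rng.uniform(0.0, 0.4, size=(256, 256))
tumor_slide = benign_slide.copy()
tumor_slide[0:64, 0:64] = 0.9  # one bright, lesion-like region
```

Production systems replace the mean-intensity score with a trained CNN and the fraction threshold with learned aggregation (e.g. a recurrent network or attention over tiles), but the tile-then-aggregate structure is the same.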

Advanced AI-assisted pathology: prediction of gene mutations and prognosis

Apart from diagnosis, many AI-assisted pathology tools have been developed to predict gene mutations and prognosis in the fields of gastroenterology and hepatology. In CRC, AI tools have shown great effectiveness in predicting prognosis across all tumor stages based on WSIs[139,140], and prospective multicenter studies have further validated this high prognostic performance[142]. Notably, a subset of genetic defects occurring in gastrointestinal cancers is related to morphological features detectable on WSIs. Among these, microsatellite instability and mismatch-repair deficiency are associated with the survival of gastric and colorectal cancer patients receiving immunotherapy; AI tools designed to predict microsatellite instability and mismatch-repair deficiency directly from pathology slides have shown reasonably good performance in identifying candidates for immunotherapy[138,141]. Kather et al[140] further validated such a model for predicting survival from CRC pathology slides, reporting hazard ratios of 2.29 for CRC-specific overall survival (OS) and 1.63 for OS. By contrast, few studies have attempted to predict gene mutations and prognosis in gastric disease, whose histomorphology is more complicated and heterogeneous than that of the colon[136,137]. In hepatology, AI tools are mainly used to predict gene mutations and prognosis in HCC. For example, one model predicted postoperative survival in HCC more accurately than a composite score of clinical and pathological factors, and it may generalize well, as its performance was validated in an external dataset with different staining and scanning methods[146].
Chen et al[11] investigated a CNN (Inception V3) for automatic classification (benign/malignant with 96.0% accuracy, and degree of differentiation with 89.6% accuracy) and gene mutation prediction from WSIs of resected HCC; mutations in CTNNB1, FMN2, TP53, and ZFX4 could be predicted with external AUCs from 0.71 to 0.89. Models that integrate clinical, biological, genetic, and pathological data are also a promising approach. The first multi-omics model combined ribonucleic acid (RNA) sequencing, miRNA sequencing, and methylation data from The Cancer Genome Atlas and then employed AI technologies to predict and stratify the survival of HCC patients[145]. Other attempts have been made to develop models that predict gene mutations directly from WSIs of HCC; using AI-assisted pathology, some approaches can predict gene expression from RNA sequencing, which may have potential for clinical translation[147]. Interestingly, the expression of some genes, such as PD-1 and PD-L1, inflammatory gene signatures, and biomarkers of inflammation did trend with improved survival and treatment response in HCC patients[148].
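The mutation-prediction performance quoted above is reported as an AUC over slides. As a minimal illustration of how such a figure is computed, the sketch below scores synthetic slide-level predictions against synthetic mutation labels; the label distribution and signal strength are arbitrary assumptions, not data from the cited studies.

```python
# Computing a slide-level AUC for a hypothetical mutation classifier.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
mutated = rng.integers(0, 2, size=200)  # e.g. TP53 mutated: yes/no (synthetic)
# A model whose predicted scores correlate with the label, plus noise:
scores = 0.6 * mutated + rng.normal(scale=0.4, size=200)
auc = roc_auc_score(mutated, scores)  # area under the ROC curve
```

Because the AUC depends only on the ranking of scores, it summarizes discrimination across all possible thresholds, which is why the studies above report it rather than a single accuracy at one cutoff.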

LIMITATIONS AND FUTURE CONSIDERATIONS

This review retrospectively summarized key and representative articles, with the possibility of missing some publications in AI-related journals. Although various studies have shown promising results in AI-assisted gastroenterology and hepatology, several limitations remain to be discussed and resolved. One major criticism is the lack of high-quality training, testing, and validation datasets for the development and validation of AI models. Because most studies are retrospective, selection bias must be considered at the training stage; meanwhile, overfitting and spectrum bias may lead to overestimation of model accuracy and generalizability. Following the rigorous "six-steps" translation pipeline[149], doctors and AI researchers should join calls to build interconnected worldwide networks that collect raw acquisition data, rather than processed medical images, and to train AI at scale to obtain robust and generalizable models. Furthermore, the black-box nature of AI technologies has become a barrier to clinical practice, because neither developers nor users know in detail how the computer reaches its conclusions; explainable AI for reliable healthcare is worth investigating to achieve clinical interpretability and transparency. In addition, from the perspective of ethics and legal liability, AI models may cause errors and challenge the patient-doctor relationship even though they improve the clinical workflow with enhanced precision. Especially in gastroenterology and hepatology, discriminating cancer from benign disease can mean a completely different treatment. If misdiagnosis occurs during AI application, who should take responsibility: the doctor, the programmer, the company providing the system, or the patient?
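The overfitting concern raised above can be made concrete: a flexible model fit to pure noise achieves perfect training accuracy, while cross-validated accuracy falls to chance. The sketch below demonstrates this with entirely synthetic data; the model choice and dataset size are illustrative assumptions.

```python
# Training accuracy vs cross-validated accuracy on labels unrelated to the data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 25))      # pure noise "features"
y = rng.integers(0, 2, size=100)    # labels with no relation to X

model = DecisionTreeClassifier(random_state=0)
train_accuracy = model.fit(X, y).score(X, y)             # memorizes the data
cv_accuracy = cross_val_score(model, X, y, cv=5).mean()  # near chance level
```

The gap between the two numbers is exactly why single-cohort retrospective accuracies can overstate real-world performance, and why external, ideally prospective, validation is emphasized throughout this review.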
Issues of ethics and legal liability should be addressed early to maintain the balance between minimal error rates and maximal patient benefit[150,151]. The number of studies applying AI to gastroenterology and hepatology has grown over the past decade, and this trend will continue: larger studies will compare the performance of medical professionals with AI vs professionals without AI to highlight the importance of AI assistance. AI technologies will be used to develop more accurate models for predicting and monitoring disease progression and potential complications, and these models may alleviate the shortage of medical resources in remote, underserved, or developing regions. In addition, AI-assisted personalized imaging protocols and immediate three-dimensional reconstruction may further improve diagnostic efficiency and accuracy, and researchers will be able to elucidate the mechanisms of disease progression and treatment response by combining multi-modality images or multi-omics data. An emerging trend is the application of AI to drug development, such as predicting compound toxicity, physical properties, and biological activities, which may assist chemotherapy for digestive-system malignancies. Furthermore, AI could be used to process data generated from tissue-on-a-chip platforms, which better recapitulate the tumor microenvironment, thus enabling precise, individualized chemotherapy in gastroenterology and hepatology. As synthetic lethality becomes a promising genetically targeted cancer therapy[152,153], AI could also be used to detect the synthetic lethal partners of overexpressed or mutated genes in tumor cells. Finally, AI tools will not replace endoscopists, radiologists, and pathologists in the near or even distant future.
Computers would make predictions and doctors would make the final decision, in other words, they would always work together to benefit patients.
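One simple route toward the interpretability called for above is occlusion sensitivity: mask each region of an input image in turn and record how much the model's output score drops. This sketch is purely illustrative (the `occlusion_map` helper and the toy scoring function are hypothetical, not from any cited system), but the same probing idea applies to black-box endoscopic or radiologic classifiers.

```python
# Occlusion-sensitivity sketch: probe which image region drives a
# black-box score by zeroing out one patch at a time.
import numpy as np

def occlusion_map(image, score_fn, patch=4):
    """Return the score drop caused by zeroing each patch-sized region."""
    base = score_fn(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat

# Toy "model": responds only to the mean intensity of the top-left corner,
# standing in for a classifier that keys on one lesion-bearing region.
score = lambda img: img[:4, :4].mean()

img = np.ones((16, 16))
heat = occlusion_map(img, score)
print(heat[0, 0], heat[1, 1])  # -> 1.0 0.0
```

Only the top-left patch produces a score drop, so the heat map correctly localizes the region the black-box model actually relies on.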

CONCLUSION

AI is developing rapidly and becoming a promising tool for the analysis of endoscopic, radiologic, and pathologic images to improve disease diagnosis and treatment in gastroenterology and hepatology. Nevertheless, we should be aware of the constraints that limit the acceptance and utilization of AI tools in clinical practice. To use AI wisely, doctors and researchers should cooperate to address the current challenges and develop more accurate AI tools to improve patient care.

ACKNOWLEDGEMENTS

We thank Yun Cai for polishing our manuscript. We are grateful to our colleagues for their assistance in checking the data of the studies.
References: 147 in total

1.  Computer-aided detection of early neoplastic lesions in Barrett's esophagus.

Authors:  Fons van der Sommen; Svitlana Zinger; Wouter L Curvers; Raf Bisschops; Oliver Pech; Bas L A M Weusten; Jacques J G H M Bergman; Peter H N de With; Erik J Schoon
Journal:  Endoscopy       Date:  2016-04-21       Impact factor: 10.093

2.  Self-driving cars and AI-assisted endoscopy: Who should take the responsibility when things go wrong?

Authors:  Nicholas Ch Poon; Joseph Jy Sung
Journal:  J Gastroenterol Hepatol       Date:  2019-04       Impact factor: 4.029

3.  Artificial intelligence in upper GI endoscopy - current status, challenges and future promise.

Authors:  Honggang Yu; Rajvinder Singh; Seon Ho Shin; Khek Yu Ho
Journal:  J Gastroenterol Hepatol       Date:  2021-01       Impact factor: 4.029

4.  Detection of lesions in dysplastic Barrett's esophagus by community and expert endoscopists.

Authors:  Dirk W Schölvinck; Kim van der Meulen; Jacques J G H M Bergman; Bas L A M Weusten
Journal:  Endoscopy       Date:  2016-11-17       Impact factor: 10.093

5.  Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: a multicentre, case-control, diagnostic study.

Authors:  Huiyan Luo; Guoliang Xu; Chaofeng Li; Longjun He; Linna Luo; Zixian Wang; Bingzhong Jing; Yishu Deng; Ying Jin; Yin Li; Bin Li; Wencheng Tan; Caisheng He; Sharvesh Raj Seeruttun; Qiubao Wu; Jun Huang; De-Wang Huang; Bin Chen; Shao-Bin Lin; Qin-Ming Chen; Chu-Ming Yuan; Hai-Xin Chen; Heng-Ying Pu; Feng Zhou; Yun He; Rui-Hua Xu
Journal:  Lancet Oncol       Date:  2019-10-04       Impact factor: 41.316

6.  Application of Artificial Intelligence in Gastrointestinal Endoscopy.

Authors:  Jia Wu; Jiamin Chen; Jianting Cai
Journal:  J Clin Gastroenterol       Date:  2021-02-01       Impact factor: 3.062

7.  Deep Learning With Sampling in Colon Cancer Histology.

Authors:  Mary Shapcott; Katherine J Hewitt; Nasir Rajpoot
Journal:  Front Bioeng Biotechnol       Date:  2019-03-27

8.  Randomised controlled trial of WISENSE, a real-time quality improving system for monitoring blind spots during esophagogastroduodenoscopy.

Authors:  Lianlian Wu; Jun Zhang; Wei Zhou; Ping An; Lei Shen; Jun Liu; Xiaoda Jiang; Xu Huang; Ganggang Mu; Xinyue Wan; Xiaoguang Lv; Juan Gao; Ning Cui; Shan Hu; Yiyun Chen; Xiao Hu; Jiangjie Li; Di Chen; Dexin Gong; Xinqi He; Qianshan Ding; Xiaoyun Zhu; Suqin Li; Xiao Wei; Xia Li; Xuemei Wang; Jie Zhou; Mengjiao Zhang; Hong Gang Yu
Journal:  Gut       Date:  2019-03-11       Impact factor: 23.059

9.  Evaluation of a Deep Neural Network for Automated Classification of Colorectal Polyps on Histopathologic Slides.

Authors:  Jason W Wei; Arief A Suriawinata; Louis J Vaickus; Bing Ren; Xiaoying Liu; Mikhail Lisovsky; Naofumi Tomita; Behnaz Abdollahi; Adam S Kim; Dale C Snover; John A Baron; Elizabeth L Barry; Saeed Hassanpour
Journal:  JAMA Netw Open       Date:  2020-04-01

10.  Deep Learning for Classification of Colorectal Polyps on Whole-slide Images.

Authors:  Bruno Korbar; Andrea M Olofson; Allen P Miraflor; Catherine M Nicka; Matthew A Suriawinata; Lorenzo Torresani; Arief A Suriawinata; Saeed Hassanpour
Journal:  J Pathol Inform       Date:  2017-07-25
