Literature DB >> 35221775

Identification of gastric cancer with convolutional neural networks: a systematic review.

Yuxue Zhao1, Bo Hu2, Ying Wang1, Xiaomeng Yin3, Yuanyuan Jiang4, Xiuli Zhu1.   

Abstract

The identification of diseases is increasingly inseparable from artificial intelligence, and convolutional neural networks, an important branch of artificial intelligence, play an important role in the identification of gastric cancer. We conducted a systematic review to summarize the current applications of convolutional neural networks in gastric cancer identification. Original articles indexed in the Embase, Cochrane Library, PubMed and Web of Science databases were systematically retrieved using relevant keywords, and data were extracted from the published papers. A total of 27 articles on the identification of gastric cancer from medical images were retrieved; 19 were applied to endoscopic images and 8 to pathological images. Sixteen studies explored the performance of gastric cancer detection, 7 explored classification, 2 reported segmentation and 2 analyzed the delineation of margins. The convolutional neural network structures involved included AlexNet, ResNet, VGG, Inception, DenseNet and DeepLab, among others. Reported accuracy ranged from 77.3% to 98.7%. Systems based on convolutional neural networks have shown good performance in the identification of gastric cancer. Artificial intelligence is expected to provide more accurate information and more efficient judgments to support clinical diagnosis.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022.


Keywords:  Classification; Convolutional neural network; Detection; Diagnosis; Gastric cancer

Year:  2022        PMID: 35221775      PMCID: PMC8856868          DOI: 10.1007/s11042-022-12258-8

Source DB:  PubMed          Journal:  Multimed Tools Appl        ISSN: 1380-7501            Impact factor:   2.577


Introduction

Gastric cancer is the cancer of the hollow organs with the highest incidence [11] and a serious threat to human health. Its prognosis is closely related to disease stage. Diagnosis and treatment of early gastric cancer aids recovery, and the 5-year survival rate of such patients can exceed 90% [59]. However, most patients are already at an advanced stage when diagnosed [73]. Because treatment options are limited, the survival rate of advanced gastric cancer is low and the prognosis poor [26]. With the advancement of medical technology and greater health awareness, the diagnosis and treatment of gastric cancer has become an urgent need for more patients. Improving the accuracy of gastric cancer identification, especially of early gastric cancer, has therefore become a focus of current research. Gastric cancer is evaluated by endoscopy, pathological images, radiological imaging, etc. First, endoscopy is widely used in the detection of gastric cancer. Image-enhanced endoscopy, such as narrow-band imaging [57] and linked color imaging [52], can accurately visualize the mucosal surface structure, and studies have shown that these endoscopic methods improve the accuracy of gastrointestinal tumor diagnosis [10, 27]. Even so, one study found that 10% of upper gastrointestinal cancers were still missed by endoscopy [41], and lesions are missed even when two experts work in the same endoscopy unit [61], because accurate interpretation of gastroscopy images requires years of accumulated experience. Second, histopathological image interpretation is the gold standard of tumor diagnosis, but the shortage of pathologists has led to huge workloads and diagnostic errors [68]. Finally, radiological imaging plays an important role in evaluating lymph node metastasis of gastric cancer.
Imaging evaluation is based mainly on the morphological characteristics of lesions. For example, abundant perigastric adipose tissue can be difficult to distinguish from lymph nodes; combined with examiner inexperience, this can lead to missed diagnosis and misdiagnosis, and when patient volumes are large the accuracy of diagnosis inevitably decreases [12]. With increasing demands for more accurate detection, classification, segmentation and margin delineation, artificial intelligence (AI) is booming in medical applications. AI aims to make machines think like people, and machine learning is one of its most important parts. Compared with traditional machine learning methods such as support vector machines and Bayesian networks, deep learning offers better accuracy and flexibility and adapts more easily to different fields and applications. The convolutional neural network (CNN) is one of the most prominent deep learning algorithms for image processing (Fig. 1). The basic structure of a CNN includes convolutional layers, pooling layers and fully connected layers [69]. The convolutional layers extract features from large amounts of data. The pooling layers compress the input feature maps to retain the main features, reducing the dimensionality of the extracted feature information, simplifying the network's computational complexity and improving computation speed. The fully connected layers fit all the features together and send the output value to a classifier to obtain the final prediction.
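To make the three building blocks concrete, here is a minimal, illustrative forward pass in pure Python (a toy sketch, not code from any of the reviewed studies): a small convolution, a max-pooling step, and a fully connected output.

```python
# Toy sketch of the three basic CNN layers described above (illustrative only).

def conv2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as in most CNN libraries)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling: compresses the map, keeping the strongest responses."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def fully_connected(features, weights, bias):
    """Fully connected layer: fits the extracted features together into one score."""
    return sum(f * w for f, w in zip(features, weights)) + bias

image = [[1, 2, 0, 1],
         [0, 1, 3, 1],
         [2, 1, 0, 0],
         [1, 0, 1, 2]]
edge_kernel = [[1, 0], [0, -1]]           # toy 2 x 2 feature detector
fmap = conv2d(image, edge_kernel)         # 3 x 3 feature map
pooled = max_pool2d(fmap, size=2)         # compressed to 1 x 1 here
flat = [v for row in pooled for v in row]
score = fully_connected(flat, [0.5] * len(flat), bias=0.1)
```

Real CNNs stack many such layers with learned kernels and nonlinearities; the sketch only shows how data flows from pixels to a prediction score.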
Fig. 1

The relationship between artificial intelligence, machine learning, deep learning and convolutional neural network

CNN: Convolutional neural network

Classic CNN structures include AlexNet [29], ResNet [14], VGG [53], Inception [60] and DenseNet [20], among others. Network depth refers to the number of layers whose parameters are updated through training, such as convolutional and fully connected layers. AlexNet has 8 such layers in total: 5 convolutional layers followed by 3 fully connected layers, and its accuracy was a large improvement over traditional methods. The kernel sizes of the 5 convolutional layers are 11 × 11, 5 × 5, 3 × 3, 3 × 3 and 3 × 3, and their channel counts are 96, 256, 384, 384 and 256, respectively; the original network was split across two graphics processing units. The first two fully connected layers have 4096 neurons each, and the final softmax output has 1000 neurons. The VGG network was developed on the basis of AlexNet. Compared with AlexNet, VGG uses only 3 × 3 convolution kernels and 2 × 2 pooling kernels, and its three fully connected layers (the same as AlexNet's) can be converted in turn into one 7 × 7 and two 1 × 1 convolutional layers. The network is divided into variants according to the number of layers; the most widely used is VGG-16, with 13 convolutional layers and 3 fully connected layers. GoogLeNet is a deep neural network model launched by Google and built from Inception modules, which replace the traditional sequence of convolution and activation. Inception is characterized by the use of 1 × 1 convolutions for dimensionality reduction, while convolution and re-aggregation are performed at multiple scales in parallel.
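The arithmetic behind VGG's preference for small kernels can be checked directly: two stacked 3 × 3 convolutions cover the same 5 × 5 receptive field as a single 5 × 5 convolution but need fewer weights. A short sketch (the channel width of 256 is an arbitrary example, not taken from any study):

```python
def conv_params(k, c_in, c_out, bias=True):
    """Number of parameters in one convolutional layer with k x k kernels."""
    return k * k * c_in * c_out + (c_out if bias else 0)

C = 256  # example channel width (assumption for illustration)
one_5x5 = conv_params(5, C, C, bias=False)      # 25 * C^2 weights
two_3x3 = 2 * conv_params(3, C, C, bias=False)  # 18 * C^2 weights
print(one_5x5, two_3x3)  # 1638400 1179648
```

The stacked 3 × 3 design also inserts an extra nonlinearity between the two layers, which is part of why deeper small-kernel networks outperform shallower large-kernel ones.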
This network structure replaces the fully connected layers with simple global average pooling, which greatly reduces the total number of parameters of the model. The model has since been improved, with Inception-v2, Inception-v3, Inception-v4 and other versions developed. ResNet introduced the residual block, which allows deeper networks to be trained and improves model performance. PyTorch's official code provides 5 depths: 18, 34, 50, 101 and 152 layers. The most common is ResNet-50, with 50 layers in total: 49 convolutional layers and one fully connected layer, using convolution kernels of 1 × 1, 3 × 3 and 7 × 7. DenseNet is a convolutional neural network with dense connections. Its emergence broke away from the fixed thinking of improving network performance by deepening the network (ResNet) or widening it (Inception); its advantage is that the network is narrower and has fewer parameters. CNNs perform excellently in many fields, such as computer vision and natural language processing. Especially in computer vision, the CNN is an important model for image classification, image retrieval, object detection and semantic segmentation. For example, an agile CNN was used to diagnose benign and malignant pulmonary nodules in chest CT images [72], and CNNs have shown good diagnostic performance on liver lesions [70] and breast ultrasound images [45]. CNNs can also be used for pathological image classification: a fully connected CNN with an extreme learning machine model was used to classify hepatocellular carcinoma nuclei [32], and CNNs have been used for histopathological classification of osteosarcoma [43] and of epithelial and stromal regions [19].
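The residual block at the heart of ResNet can be sketched in a few lines of pure Python (a conceptual toy on 1-D vectors, not any study's implementation): the block learns a residual F(x) and adds it back to the input through a skip connection, so a very deep network can always fall back to an identity mapping.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, transform):
    """ResNet's idea: output = activation(x + F(x)), where F is the learned residual."""
    fx = transform(x)
    return relu([a + b for a, b in zip(x, fx)])

x = [1.0, -2.0, 3.0]
# If the learned residual is zero, the block reduces to relu(x) -- trivially optimizable,
# which is why stacking many such blocks does not degrade training.
print(residual_block(x, lambda v: [0.0] * len(v)))  # [1.0, 0.0, 3.0]
```

In a real ResNet the transform F is a small stack of convolutions (e.g. 1 × 1, 3 × 3, 1 × 1 in ResNet-50's bottleneck blocks) and the addition is element-wise over feature maps.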
To help doctors identify gastric cancer more accurately and in less time, a number of computer-aided diagnosis schemes have been developed [2, 25, 63]. Computer-aided diagnosis can help doctors reduce omissions and mischaracterizations of gastric tumor changes, mitigating the limitations imposed by examiners' lack of experience. Deep learning, especially the CNN, has become a smarter and more accurate image processing technology and is expected to become the mainstream technology in AI identification of gastric cancer. The purpose of this systematic review is to summarize all current applications of CNNs in gastric cancer identification and to evaluate their performance. This work contributes to closing the knowledge gap on the development of CNNs for gastric cancer identification.

Methods

Literature search

Literature retrieval was performed in the Embase, Cochrane Library, PubMed and Web of Science databases on 17 September 2020 to find CNN studies of gastric cancer medical images. The PRISMA checklist was used for reporting the systematic review. We searched using medical subject headings and free-text words. The search terms mainly included “machine learning”, “artificial intelligence”, “convolutional neural network”, “deep learning”, “data mining”, “algorithm”, “tumor”, “neoplasm”, “carcinoma”, “cancer”, “lesion”, “endoscope”, “pathology”, “computed tomography”, “ultrasonography”, “x-ray”, “magnetic resonance imaging”, “stomach”, “gastric”, “digestive system”, “diagnosis”, “identification”, “classification”, “detection” and “segmentation”. There was no time limit on the publication date of included articles. To describe the performance of the proposed CNNs in the identification of gastric cancer, we compared precision, sensitivity, specificity, area under the curve (AUC) and accuracy (where available) across the models.
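Searches of this kind typically combine synonym groups with OR and join the groups with AND. A hypothetical sketch of how the keyword groups above could be assembled into one boolean query string (actual database syntax, MeSH tags and field codes differ per database, and the grouping shown here is an assumption, not the authors' exact strategy):

```python
# Hypothetical boolean-query builder for illustration only.
method_terms = ["machine learning", "artificial intelligence",
                "convolutional neural network", "deep learning",
                "data mining", "algorithm"]
disease_terms = ["tumor", "neoplasm", "carcinoma", "cancer", "lesion"]
site_terms = ["stomach", "gastric", "digestive system"]

def or_group(terms):
    """Quote each synonym and join the group with OR."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# Synonym groups are ANDed so every concept must appear.
query = " AND ".join(or_group(g) for g in (method_terms, disease_terms, site_terms))
print(query[:70])
```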

Selection criteria

The following inclusion and exclusion criteria were used to select studies. The inclusion criteria were: (1) studies on the identification of gastric cancer images or lesions; (2) models based on a CNN or on the main components of a CNN; (3) diagnostic performance of the AI algorithm reported as precision, accuracy, sensitivity, specificity or AUC; (4) human subjects; and (5) publication in English. Guidelines, case reports, review articles, conference abstracts, letters to editors and editorials were excluded.

Data extraction and quality assessment

Two authors (XY and YJ) extracted the data independently. The forms filled out by each author were compared, and dissenting opinions were resolved through review and discussion; unresolved differences were settled by a third evaluator as final arbiter. The authors extracted the following from each article: (a) author information; (b) year of publication; (c) study design; (d) data set source; (e) basic patient information; (f) number of images; (g) number of lesions; (h) CNN task (classification, detection, segmentation or delineating margins); (i) whether independent test sets were used; and (j) types of medical images or lesions. We carefully evaluated the quality of the included studies using the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) [65]. The tool covers 4 domains: patient selection, index test, reference standard, and flow and timing [65]. Each domain is rated as high, low or unclear risk of bias, and the first 3 domains are also rated for high, low or unclear concerns about applicability [65]. Review Manager version 5.3 (RevMan for Windows 10, Nordic Cochrane Centre) was used to generate summary diagrams of the methodological quality assessment.
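The QUADAS-2 bookkeeping is essentially a per-domain tally plus a "high risk anywhere means low quality" rule. A small sketch of that logic (the study names and ratings below are invented for illustration, not the review's actual assessments):

```python
from collections import Counter

# Hypothetical ratings for illustration only -- not the actual per-study results.
ratings = {
    "study A": {"patient selection": "low", "index test": "high",
                "reference standard": "low", "flow and timing": "unclear"},
    "study B": {"patient selection": "low", "index test": "low",
                "reference standard": "low", "flow and timing": "low"},
}

def domain_summary(ratings):
    """Count high/low/unclear ratings per QUADAS-2 domain across studies."""
    summary = {}
    for study in ratings.values():
        for domain, level in study.items():
            summary.setdefault(domain, Counter())[level] += 1
    return summary

def low_quality(ratings):
    """A study rated high risk in any domain counts as low methodological quality."""
    return [s for s, doms in ratings.items() if "high" in doms.values()]

print(low_quality(ratings))  # ['study A']
```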

Results

Search results

A total of 27 studies were included. Figure 2 shows the PRISMA flow chart summarizing the retrieval and selection process. A total of 6925 articles were retrieved from the databases and deduplicated; 1071 irrelevant articles were removed in preliminary screening, and the full texts of 98 articles were assessed. Finally, 27 articles that met the inclusion criteria and were suitable for systematic evaluation were included (Fig. 2).
Fig. 2

PRISMA flowchart shows the searched articles using convolutional neural networks in gastric cancer image


Characteristics of data

Of all the studies, 19 articles [3, 6, 7, 15, 17, 22, 31, 34, 36–38, 40, 46, 51, 62, 64, 66, 71, 74] concerned the application of CNNs to gastric cancer diagnosis in endoscopy and 8 articles [8, 18, 21, 33, 35, 48, 56, 58] concerned CNN identification of gastric pathological images. A total of 99,777 patients, 1,422,523 images and 2605 lesions were included in this review. The average age of the patients in the data sets was 49.9 years (range 26 – 92 years); 50.9% of patients were male and 49.1% female. Most studies were based on image-level analysis; only one study [15] was based on lesion-level analysis, and 2 studies [46, 51] analyzed both images and lesions. The detailed features of the studies are shown in Table 1. The best-performing data set in each article was selected, and the performance results are shown in Table 2.
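The pooled figures above come from simple aggregation across the per-study rows of Table 1. A sketch of that arithmetic on two made-up study records (the numbers below are illustrative, not the review's actual rows):

```python
# Made-up per-study rows purely to illustrate the pooling arithmetic.
studies = [
    {"patients": 1095, "images": 2488, "males": 700},
    {"patients": 846, "images": 3105, "males": 400},
]

total_patients = sum(s["patients"] for s in studies)
total_images = sum(s["images"] for s in studies)
male_pct = 100 * sum(s["males"] for s in studies) / total_patients
print(total_patients, total_images, round(male_pct, 1))
```

In practice studies with "NR" (not reported) fields must be skipped per variable, which is why pooled denominators differ between patients, images and lesions.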
Table 1

Studies on convolutional neural network and gastric cancer

| Ref | Year | Study design | Location | Patients | Images or Lesions | Images | Lesions | Types of medical images or lesions | Task | Independent test set |
|---|---|---|---|---|---|---|---|---|---|---|
| An et al. [3] | 2020 | Retrospective | China | 1095 | Images | 2488 | NR | Endoscopic | Delineating margins | No |
| Cho et al. [7] | 2020 | Retrospective | Korea | 846 | Images | 3105 | NR | Endoscopic | Detect invasion depth | Yes |
| Cho et al. [6] | 2019 | Retrospective | Korea | 1269 | Images | 5017 | NR | Endoscopic | Classification | Yes |
| Hirasawa et al. [15] | 2018 | Prospective | Japan | 69 | Lesions | NR | 77 | Endoscopic | Detection | Yes |
| Horiuchi et al. [17] | 2019 | Retrospective | Japan | NR | Images | 2828 | NR | Endoscopic | Detection | Yes |
| Ikenoyama et al. [22] | 2020 | Retrospective | Japan | 2779 | Images | 16,524 | NR | Endoscopic | Detection | Yes |
| Lee et al. [31] | 2019 | Retrospective | Korea | NR | Images | 787 | NR | Endoscopic | Detection | No |
| Li et al. [34] | 2020 | Retrospective | China | NR | Images | 2429 | NR | Endoscopic | Detection | Yes |
| Ling et al. [36] | 2020 | Retrospective | China | 342 | Images | 4969 | NR | Endoscopic | Delineating margins | Yes |
| Liu et al. [37] | 2018 | Retrospective | China | NR | Images | 3871 | NR | Endoscopic | Classification | Yes |
| Lui et al. [38] | 2019 | Retrospective | Hong Kong, China | NR | Images | 3000 | NR | Endoscopic | Detection | Yes |
| Luo et al. [40] | 2019 | Retrospective | China | 84,424 | Images | 1,036,496 | NR | Endoscopic | Detection | Yes |
| Nagao et al. [46] | 2020 | Retrospective | Japan | 1084 | Images/Lesions | 16,557 | 2434 | Endoscopic | Detect invasion depth | Yes |
| Shibata et al. [51] | 2020 | Retrospective | Japan | 135 | Images/Lesions | 174 | 194 | Endoscopic | Detection | No |
| Ueyama et al. [62] | 2020 | Retrospective | Japan | 349 | Images | 7874 | NR | Endoscopic | Detection | Yes |
| Wang et al. [64] | 2019 | Retrospective | China | NR | Images | 104,864 | NR | Endoscopic | Detection | Yes |
| Wu et al. [66] | 2019 | Retrospective | China | NR | Images | 9351 | NR | Endoscopic | Detection | Yes |
| Yoon et al. [71] | 2019 | Retrospective | Korea | 800 | Images | 11,539 | NR | Endoscopic | Detection/Detect invasion depth | Yes |
| Zhu et al. [74] | 2020 | Retrospective | China | 993 | Images | 993 | NR | Endoscopic | Detect invasion depth | Yes |
| Qu et al. [48] | 2018 | Retrospective | Japan | NR | Images | 48,000 | NR | Pathological | Classification | No |
| Cho et al. [8] | 2020 | Retrospective | Korea | 432 | Images | 803 | NR | Pathological | Classification | Yes |
| Iizuka et al. [21] | 2020 | Retrospective | Japan | NR | Images | 5103 | NR | Pathological | Classification | Yes |
| Sun et al. [58] | 2019 | Retrospective | China | NR | Images | 500 | NR | Pathological | Segmentation | No |
| Liang et al. [35] | 2019 | Retrospective | China | NR | Images | 1900 | NR | Pathological | Segmentation | No |
| Hu et al. [18] | 2019 | Retrospective | China | 30 | Images | 65,328 | NR | Pathological | Classification | No |
| Li et al. [33] | 2019 | Prospective | China | 120 | Images | 48,000 | NR | Pathological | Classification | No |
| Song et al. [56] | 2020 | Retrospective | China | 4210 | Images | 6917 | NR | Pathological | Detection | Yes |

Note: NR: Not reported

Table 2

Studies of Images that provided evaluation index results

| Ref | Precision (%) | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|
| An et al. [3] | N/A | 88.9 | N/A | N/A | N/A |
| Cho et al. [7] | N/A | 77.3 | 80.4 | 80.7 | 88.7 |
| Cho et al. [6] | N/A | 81.9 | 75.9 | 85.3 | 87.7 |
| Hirasawa et al. [15] | N/A | 98.6 | 92.2 | N/A | N/A |
| Horiuchi et al. [17] | N/A | 85.3 | 95.4 | 71.0 | N/A |
| Ikenoyama et al. [22] | N/A | N/A | 58.40 | 87.30 | 75.7 |
| Lee et al. [31] | N/A | 96.49 | N/A | N/A | 97 |
| Li et al. [34] | N/A | 90.91 | 91.18 | 90.64 | N/A |
| Ling et al. [36] | N/A | 88.1 | N/A | N/A | N/A |
| Liu et al. [37] | 99 | 96 | 99 | N/A | N/A |
| Lui et al. [38] | N/A | 91.0 | 97.1 | 85.9 | 91 |
| Luo et al. [40] | N/A | 92.8 | 94.2 | 92.3 | N/A |
| Nagao et al. [46] | N/A | N/A | N/A | N/A | 95.90 |
| Shibata et al. [51] | N/A | N/A | 96 | N/A | N/A |
| Ueyama et al. [62] | N/A | 98.70 | 98 | 100 | N/A |
| Wang et al. [64] | N/A | N/A | 79.622 | 78.48 | N/A |
| Wu et al. [66] | N/A | 92.5 | 94.0 | 91.0 | N/A |
| Yoon et al. [71] | N/A | N/A | 91.0 | 97.6 | 98.1 |
| Zhu et al. [74] | N/A | 89.16 | 76.47 | 95.56 | 94 |
| Qu et al. [48] | 86.9 | N/A | N/A | N/A | 96.3 |
| Cho et al. [8] | N/A | 78 | 100 | 56 | 100 |
| Iizuka et al. [21] | N/A | N/A | N/A | N/A | 98 |
| Sun et al. [58] | 91.6 | N/A | N/A | N/A | N/A |
| Liang et al. [35] | N/A | 91.09 | N/A | N/A | N/A |
| Hu et al. [18] | N/A | 94.38 | 94.99 | 93.76 | N/A |
| Li et al. [33] | N/A | 96.5 | 96.6 | 96.7 | N/A |
| Song et al. [56] | N/A | 87.3 | 99.6 | 84.3 | 98.6 |

Note: N/A: Not available

All studies evaluated with the QUADAS tool had a high or unclear risk of bias in at least one domain, except Ikenoyama et al. [22]. Studies rated high risk [3, 6–8, 17, 18, 21, 31, 33–38, 46, 48, 51, 56, 58, 66, 71] in at least one of the seven fields were rated as low methodological quality. Figure 3 shows the literature quality evaluation results. In addition, to better understand the included studies, we used Word Art (https://wordart.com/) to generate a keyword co-occurrence map, shown in Fig. 4. The most frequent keywords were artificial intelligence, convolutional neural network, deep learning, endoscopy, narrow band imaging, cancer, classification and diagnosis, consistent with the focus of gastric cancer identification.
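A word cloud of this kind is driven by a simple keyword-frequency count across papers. A toy illustration (the keyword lists below are invented, not extracted from the 27 included papers):

```python
from collections import Counter

# Invented per-paper keyword lists, purely to show the counting step behind a word cloud.
paper_keywords = [
    ["artificial intelligence", "deep learning", "endoscopy"],
    ["convolutional neural network", "gastric cancer", "endoscopy"],
    ["deep learning", "classification", "endoscopy"],
]
freq = Counter(k for kws in paper_keywords for k in kws)
print(freq.most_common(2))  # [('endoscopy', 3), ('deep learning', 2)]
```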
Fig. 3

The literature quality evaluation results. High: The relevant entry in each area is high risk in risk of bias; high concerns in applicability concerns; Unclear: The relevant entry in each area is unclear risk in risk of bias; unclear concerns in applicability concerns; Low: The relevant entry in each area is low risk in risk of bias; low concerns in applicability concerns. Cho 2020 a: Refers to the study whose first author is Bum-Joo Cho [7]. Cho 2020 b: Refers to the study whose first author is Kyung-Ok Cho [8]

Fig. 4

Keyword analysis of word cloud containing papers

Detection (16 Studies)

Sixteen studies [7, 15, 17, 22, 31, 34, 38, 40, 46, 51, 56, 62, 64, 66, 71, 74] applied CNNs to the detection of gastric cancer. Fifteen [7, 15, 17, 22, 31, 34, 38, 40, 46, 51, 62, 64, 66, 71, 74] were in the field of gastric endoscopy; among them, 4 studies [7, 46, 71, 74] included detection of the invasion depth of gastric cancer. One study [56] used a CNN to judge benign versus malignant pathological images. All articles evaluated the detection performance of the CNN with at least one of five indicators: precision, accuracy, sensitivity, specificity or AUC. Eleven studies [7, 15, 17, 31, 34, 38, 40, 56, 62, 66, 74] reported accuracy, ranging from 77.3% to 98.6%; 14 studies [7, 15, 17, 22, 34, 38, 40, 51, 56, 62, 64, 66, 71, 74] reported sensitivity, ranging from 58.40% to 99.6%; and 12 studies [7, 17, 22, 34, 38, 40, 56, 62, 64, 66, 71, 74] reported specificity, ranging from 71.0% to 100%. In the studies [7, 22, 31, 38, 46, 56, 71, 74] that provided AUC, the AUC was at least 75.7% (Table 2). Hirasawa et al. [15], Luo et al. [40], Li et al. [34], Cho et al. [7], Ikenoyama et al. [22] and Song et al. [56] used multi-center, multi-institutional data sets. Hirasawa et al. [15] retrospectively obtained a large number of images from two hospitals and two clinics in Japan to establish a gastric cancer image diagnosis system with an accuracy of 98.6%. Luo et al. [40] collected data from five hospitals in China to ensure sufficient and diverse data. Narrow-band imaging was used in 4 studies [17, 34, 38, 62] and white-light imaging in 8 studies [7, 31, 40, 51, 64, 66, 71, 74]; 3 studies [15, 22, 46] used standard white-light images together with indigo carmine chromoendoscopy and narrow-band imaging. Song et al. [56] used H&E-stained whole-slide images for CNN training and testing. From a technical perspective, the CNN structures in the different studies have their own characteristics.
Hirasawa et al. [15] and Ikenoyama et al. [22] used the Single Shot MultiBox Detector method without changing the algorithm. The Single Shot MultiBox Detector is a deep CNN of 16 or more layers. Both studies trained and tested the CNN with the Caffe deep learning framework at a global learning rate of 0.0001. In the study of Hirasawa et al. [15], the CNN analyzed more than 2,000 test images in only 47 s, a recognition and judgment speed far exceeding human ability. Horiuchi et al. [17] chose the GoogLeNet structure, a 22-layer CNN, to process the data, likewise trained and verified with the Caffe framework; the model's test speed reached 0.02 s/image. Li et al. [34] used the Inception-v3 model, which consists of 11 Inception modules. An important feature of Inception-v3 is that it factorizes larger two-dimensional convolutions into smaller one-dimensional convolutions, improving computational efficiency; the Keras deep learning framework was used for training. Lui et al. [38] constructed an image classifier based on a pre-trained ResNet. Nagao et al. [46], Shibata et al. [51], Ueyama et al. [62] and Zhu et al. [74] used the ResNet-50 structure to identify gastric cancer images or lesions; Nagao et al. [46], Ueyama et al. [62] and Zhu et al. [74] all adopted transfer learning to build their identification systems. Shibata et al. [51] proposed a Mask R-CNN method with ResNet-50 as the backbone and a mask branch of 7 convolutional layers. Yoon et al. [71] used the VGG-16 model with rectified linear units and adapted its last layer into a two-output fully connected layer to classify the input endoscopic images. Wu et al. [66] aimed to establish an early gastric cancer diagnosis system for esophagogastroduodenoscopy without blind spots, combining VGG-16 and ResNet-50 to identify images; the TensorFlow deep learning framework was used to train, validate and test the system. Cho et al. [7] used the Inception-ResNet-v2 and DenseNet-161 models to distinguish gastric cancer images; on the external verification set, their average AUC values reached 0.769 and 0.887, respectively. Lee et al. [31] used the Inception-v3, ResNet-50 and VGG-16 structures to detect gastric benign ulcers and cancers. Depth is important for many visual recognition tasks, and the ResNet-50 architecture showed the best results in that study. Wang et al. [64] combined the three CNN structures AlexNet, GoogLeNet and VGG into a multi-column convolutional neural network; the ensemble model was superior to the single classifiers in image analysis, and the network was implemented with the Caffe framework. Luo et al. [40] developed a real-time artificial intelligence-assisted image recognition system based on the DeepLab-v3 algorithm, which has been applied in clinical practice. Song et al. [56] used the DeepLab-v3 architecture to help pathologists identify benign and malignant whole-slide images of the stomach. Five studies [22, 34, 38, 40, 66] compared the detection performance of CNNs with that of experts. Luo et al. [40] showed that expert-level accuracy can be achieved when a CNN-based diagnosis system assists non-expert doctors. Li et al. [34] found that the specificity and accuracy of the CNN diagnostic system did not differ significantly from those of experts, but its sensitivity was significantly higher. Lui et al. [38] reported that the accuracy and AUC of the AI system were superior to those of all junior endoscopists. Wu et al. [66] found that the detection performance of the deep CNN was better than that of endoscopists at all levels. Ikenoyama et al. [22] compared the diagnostic ability of the CNN with that of 67 endoscopists: the sensitivity of the CNN was significantly higher (by 26.5%; 95% confidence interval, 14.9 – 32.5%) and the detection time was greatly shortened.
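The evaluation indexes compared throughout this section all derive from a 2 × 2 confusion matrix. A small reference sketch (the counts below are invented for illustration, not from any included study):

```python
# Standard diagnostic metrics from a confusion matrix (illustrative counts).
def diagnostic_metrics(tp, fp, tn, fn):
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on cancerous images
        "specificity": tn / (tn + fp),   # recall on non-cancerous images
        "precision": tp / (tp + fp),
    }

m = diagnostic_metrics(tp=90, fp=15, tn=85, fn=10)
print({k: round(v, 3) for k, v in m.items()})
# {'accuracy': 0.875, 'sensitivity': 0.9, 'specificity': 0.85, 'precision': 0.857}
```

AUC, the remaining index, summarizes sensitivity and specificity across all decision thresholds rather than at a single operating point.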

Classification (7 Studies)

Seven studies [6, 8, 18, 21, 33, 37, 48] used CNNs to classify gastric cancer: 2 [6, 37] reported CNN classification of endoscopic images and 5 [8, 18, 21, 33, 48] reported CNN classification of pathological images. Liu et al. [37] and Qu et al. [48] reported precision of 99% and 86.9%, respectively. In the 5 studies [6, 8, 18, 33, 37] reporting accuracy and sensitivity, accuracy ranged from 78% to 96.5% and sensitivity from 75.9% to 100%. In the 4 studies [6, 8, 18, 33] reporting specificity, the range was 56 – 96.7%. In the 4 studies [6, 8, 21, 48] reporting AUC, the range was 87.7 – 100%. Cho et al. [6], Iizuka et al. [21] and Li et al. [33] used multi-center data sets. Cho et al. [6] collected 5017 white-light images to establish a classification model dividing images into five categories: advanced gastric cancer, early gastric cancer, high-grade dysplasia, low-grade dysplasia and non-neoplastic lesions. Iizuka et al. [21] used data from two hospitals and The Cancer Genome Atlas to analyze whole-slide images of the stomach and colon. In addition, Li et al. [33] used prospective data sets from two Chinese hospitals. Cho et al. [6] used 3 CNN structures, Inception-v4, ResNet-152 and Inception-ResNet-v2; Inception-ResNet-v2 was the best structure for gastric cancer classification, with an accuracy of 81.9%. Liu et al. [37] proposed a new transfer learning deep CNN method in which 4 typical pre-trained CNNs, VGG-16, Inception-v3, Inception-ResNet-v2 and ResNet-50, were selected to construct the classifier. The original architecture of each CNN was kept up to the first fully connected layer, and new layers, a global average pooling layer followed by fully connected layers, were added. The results showed that the proposed fine-tuned pre-trained CNNs outperformed both traditional handcrafted features and CNNs trained from scratch. Transfer learning of CNN structures thus has strong potential for classifying narrow-band imaging images of the stomach. Qu et al. [48] proposed a CNN scheme using two-step fine-tuning, introducing an intermediate data set related to the target to address insufficient data; the scheme used the VGG-16, AlexNet and Inception-v3 methods. Cho et al. [8] established an automatic classifier using whole-slide images from 432 patients from The Cancer Genome Atlas, employing the AlexNet, ResNet-50 and Inception-v3 architectures. Iizuka et al. [21] used the Inception-v3 network as the classification architecture, applying a depth multiplier of 0.35 to reduce the number of parameters and achieve a more streamlined structure. Hu et al. [18] proposed a spectral-spatial CNN method to classify hyperspectral images of gastric cancer; its performance was better than that of an artificial neural network and a support vector machine. Li et al. [33] used ResNet-34 combined with a spectral-spatial classification method to classify gastric cancer tissues, including a convolutional layer and 4 residual modules. Cho et al. [6], Cho et al. [8], Iizuka et al. [21] and Li et al. [33] used TensorFlow to train, test and verify their models, while the deep learning model of Hu et al. [18] was implemented with the Keras framework.
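The transfer-learning recipe recurring in these studies, keep a pre-trained backbone frozen and train a newly attached head, can be sketched schematically. The layer names, the two-class head and the freeze bookkeeping below are illustrative assumptions, not any study's exact setup:

```python
# Schematic freeze/fine-tune bookkeeping for transfer learning (illustrative only).
model = [
    {"name": "conv_block_1", "pretrained": True, "trainable": False},
    {"name": "conv_block_2", "pretrained": True, "trainable": False},
    {"name": "global_avg_pool", "pretrained": False, "trainable": False},  # no weights
    {"name": "fc_head_2_classes", "pretrained": False, "trainable": True},
]

def fine_tune(model, unfreeze_last_n=0):
    """Optionally unfreeze the last n pre-trained blocks, then list what will train."""
    pretrained = [layer for layer in model if layer["pretrained"]]
    for layer in pretrained[len(pretrained) - unfreeze_last_n:]:
        layer["trainable"] = True
    return [layer["name"] for layer in model if layer["trainable"]]

print(fine_tune(model, unfreeze_last_n=1))  # ['conv_block_2', 'fc_head_2_classes']
```

Unfreezing only the deepest blocks, as sketched here, is a common compromise: early layers carry generic edge/texture features, while later layers adapt to the target domain.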

Segmentation (2 Studies)

Sun et al. [58] and Liang et al. [35] reported the application of CNNs to the segmentation of gastric cancer pathological images. Sun et al. [58] proposed a multi-scale embedded network with ResNet-101 as the backbone of the segmentation model; pixel-level accuracy reached 91.60% and mean Intersection over Union was 82.65%. Liang et al. [35] used a pathological image segmentation model combining a fully convolutional network with a patch-based method. The framework consisted of 30 convolutional layers, 7 pooling layers, 7 down-sampling layers and an output layer. The study proposed a reiterative learning method that enabled the fully convolutional network to achieve good segmentation on the gastric cancer pathological data set without manual annotation: in the first iteration, training on sequential patches was about 5% more accurate than the patch model, and training on mixed patches raised recall by 5.29% over the patch model.
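Intersection over Union, the metric Sun et al. report, measures the overlap between predicted and ground-truth masks. A minimal sketch on toy flattened binary masks (not real segmentation output):

```python
# IoU on flattened binary masks (toy data for illustration).
def iou(pred, target):
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, target) if p == 1 or t == 1)
    return inter / union if union else 1.0  # two empty masks agree perfectly

pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
print(round(iou(pred, target), 3))  # 0.5
```

Mean IoU averages this quantity over classes (or images), which penalizes both over- and under-segmentation, unlike pixel accuracy, which a model can inflate by predicting the dominant class.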

Delineating margins (2 Studies)

A real-time monitoring system developed by Ling et al. [36] was used to accurately identify the differentiation status of early gastric cancer and delineate its margins under magnifying narrow-band imaging endoscopy; the study used multi-center data sets from Nanjing Gulou Hospital and Wuhan University People's Hospital in China. An et al. [3] established a system based on a fully convolutional neural network for delineating the margins of early gastric cancer in indigo carmine chromoendoscopy and white-light endoscopy. Both studies introduced UNet++ for more accurate segmentation of the image region. The accuracy of the system was 85.7% for chromoendoscopy images and 88.9% for white-light endoscopy images. The resection area predicted by the system covered all lesion areas in all patients, whereas the resection area based on magnifying narrow-band imaging alone covered all cancer areas in only 80.00% of patients.
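The coverage criterion described above, a predicted resection area "covers" a lesion only if every lesion pixel lies inside the predicted mask, can be sketched as follows. The masks and patient records are toy flattened binary lists invented for illustration, not real delineation output:

```python
# Toy sketch of the per-patient lesion-coverage criterion (illustrative data).
def fully_covers(predicted, lesion):
    """True only if every lesion pixel (1) is also inside the predicted mask."""
    return all(p >= l for p, l in zip(predicted, lesion))

patients = [
    {"predicted": [1, 1, 1, 0], "lesion": [0, 1, 1, 0]},  # covered
    {"predicted": [1, 0, 0, 0], "lesion": [0, 1, 0, 0]},  # one lesion pixel missed
]
coverage = 100 * sum(fully_covers(p["predicted"], p["lesion"]) for p in patients) / len(patients)
print(coverage)  # 50.0
```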

Discussion

The lack of access to primary health care and the correctness of diagnosis represent a major challenge for the global health care system [54, 55]. In recent years, machine learning algorithms have been applied ever more widely in medicine, from simpler traditional machine learning methods to newly developed deep learning methods [23, 44, 47]. Both endoscopy and pathological examination are operator-dependent, and the diagnostic process is largely subjective [16, 49]. Artificial intelligence-assisted inspection may help to provide a second opinion and reduce reliance on individual operators in diagnostic tests [49]. Such algorithms are a strong driving force for improving the reliability of clinical diagnosis and advancing medical care. Our systematic review shows that CNN models for gastric cancer perform well in detection, classification, segmentation and margin delineation. There is as yet no uniform design or evaluation standard for specific CNN models; an ideal research process would compare the precision, accuracy, sensitivity, specificity or AUC of the CNN against other model methods on common data sets. In all studies except Ikenoyama et al. [22] and Cho et al. [8], the precision, accuracy, sensitivity, specificity or AUC exceeded 70%. All articles were published between 2018 and 2020, which shows that CNNs have developed rapidly in cancer identification in recent years and that deep learning is being applied more widely to medical decision-making. The studies were concentrated in China (including Hong Kong), Japan and South Korea; we infer that the greater number of gastric cancer studies in these countries is related to their higher incidence of gastric cancer compared with other countries. Nine studies [6, 15, 21, 22, 33, 34, 36, 40, 56] used multi-center, multi-institutional data sets.
At the same time, most studies excluded or preprocessed images that did not meet quality standards, so the images used during model training and testing were of high quality. Sufficient, high-quality medical images are an important prerequisite for applying AI to medical decision-making. A number of data sets for pathological image studies are derived from The Cancer Genome Atlas [9, 28]; however, there is no uniform, public large-scale image database for endoscopic examination of gastric cancer, and data acquisition remains a major challenge. CNNs have great potential in image recognition and, compared with traditional machine learning, better discrimination performance. Developers can process images using a mature network structure such as VGG. CNNs can also be combined with transfer learning and parameter fine-tuning to better match the network structure to the data, which can alleviate insufficient training data and over-fitting. Since AlexNet was proposed in 2012, CNNs have undergone explosive development. With the emergence of networks such as VGG, Inception and ResNet, network depth has grown steadily to improve accuracy, while smaller convolution kernels are increasingly preferred to keep computation fast. CNNs are gradually developing toward lightweight, fast architectures; the emergence of MobileNet [30], which can meet the needs of mobile terminals, makes CNN development prospects even broader. The application of AI in gastrointestinal diseases is increasingly mature. A review published by Min et al. [42] evaluated the performance of deep learning in disease detection through gastrointestinal endoscopy and compared it with that of healthcare professionals. The results of CNN-based endoscopic diagnosis were encouraging, but diagnostic ability depended heavily on the quality and quantity of training data.
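The preference for smaller convolution kernels mentioned above has a simple arithmetic basis: two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution but with fewer parameters. A sketch with a hypothetical channel width (illustrative only):

```python
def conv_params(k, c_in, c_out, bias=True):
    """Parameter count of a single 2-D convolution with a k x k kernel."""
    return k * k * c_in * c_out + (c_out if bias else 0)

c = 256  # hypothetical channel width, not from any cited study
# One 5x5 conv vs. two stacked 3x3 convs: same 5x5 receptive field
single_5x5 = conv_params(5, c, c)
stacked_3x3 = 2 * conv_params(3, c, c)
```

With these counts the stacked design needs roughly 28% fewer weights, and the extra non-linearity between the two small convolutions is a further benefit noted in the VGG design.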
A meta-analysis showed that machine learning performs well in the diagnosis of Helicobacter pylori infection at both the patient and image level [4]. Another meta-analysis showed that AI was highly accurate in the diagnosis of esophageal and gastric neoplastic lesions, with AUC > 0.88 [39]. Many diseases are increasingly diagnosed with AI assistance; for example, during the global novel coronavirus pneumonia epidemic, some scholars used deep learning algorithms to identify chest CT images [67]. In the test set, the AUC, accuracy, sensitivity and specificity were 0.819, 0.760, 0.811 and 0.615, respectively [67]. AI technology is now widely used, but it still has limitations. One limitation of deep learning is the difficulty of interpreting and understanding how AI models make decisions, known as the “black box” problem [1]. The emergence of explainable AI offers hope for solving this problem [50]. Secondly, there are ethical and safety issues, such as the privacy of patient data and how to determine who is responsible for misdiagnosis or incorrect treatment. Despite these challenges, the prospects of AI are very exciting. Its greatest strengths are efficient calculation and accurate decision-making, and CNN predictions can be automated. The creation of telemedicine service platforms, and even automatic AI diagnosis services, is very likely to be realized. With the continuous accumulation of available data, the performance and reliability of AI will continue to improve. It can be used as a tool to identify endoscopic pictures or pathological slices, to discover diseases and predict the depth of invasion, to classify the type of cancer, and to delineate the margins of lesions in pathological slices. CNN-based identification of gastric cancer is expected to become a mainstream deep learning technology in the coming decades.
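The accuracy, sensitivity and specificity figures quoted above follow directly from a 2×2 confusion matrix. A minimal sketch with hypothetical test-set counts (not taken from any cited study):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall on the diseased class
    specificity = tn / (tn + fp)   # recall on the healthy class
    return accuracy, sensitivity, specificity

# Hypothetical counts: tp = correctly flagged cancers, fn = missed cancers, etc.
acc, sens, spec = binary_metrics(tp=80, fp=15, tn=90, fn=20)
```

Sensitivity and specificity trade off as the decision threshold moves; the AUC summarizes that whole trade-off curve, which is why the studies above report it alongside the threshold-dependent metrics.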
We believe that, in the near future, doctors and pathologists will be able to use the results provided by CNNs as a reference for diagnosing difficult cases. It will no longer be necessary to examine the images of all patients one by one, but only the unclear and difficult ones, which will greatly reduce the workload. AI can also play an important role in areas where the economy is underdeveloped and medical resources are relatively scarce. The diversification of online medical services, such as online consultation, has driven the rapid development of telemedicine. Based on a telemedicine service platform, CNNs can identify patient images in areas lacking specialist doctors, greatly improving the medical level of remote areas. We hope to see clinicians, pathologists and artificial intelligence working together to support healthcare. The strengths of this review include the careful selection of research reports on CNN algorithms evaluated on gastric cancer images; we excluded studies evaluating non-CNN learning algorithms and articles that were not original studies. We compared the data sources, the number of images and the performance of each model, and strictly evaluated the quality of the research. In addition, although previous studies [13, 24, 69] have reviewed the application of AI to gastric diseases, this is the first review to focus entirely on the application of CNNs to gastric cancer. It also elaborates on the details of the various CNNs and thoroughly examines their performance in gastric cancer detection, classification, segmentation and margin delineation, which can provide more comprehensive guidance for future CNN-based research on gastric cancer identification. This study also has limitations.
First, this study summarized only CNN results on gastric cancer imaging published in English, and did not include articles in other languages. Second, it is difficult to make simple performance comparisons because the difficulty of the images or lesions evaluated differed across studies. Third, the images came from different countries and regions and the research subjects were of different races, so the images were heterogeneous; there were also differences in methods and evaluation indicators, and therefore no meta-analysis was carried out. Fourth, most of the current studies were retrospective and mainly based on still images. A recent guideline [5] points out that AI needs to be validated in multicenter randomized controlled trials before routine clinical use, so prospective, multicenter studies are needed in the future.

Conclusions

This review of CNN research on medical images finds that CNNs are an effective tool for identifying gastric cancer. All the studies showed good identification performance, with model accuracies of 77.3 - 98.7%. CNNs are expected to become an important tool to help doctors and pathologists improve the accuracy and efficiency of disease identification.