COVID-CCD-Net: COVID-19 and colon cancer diagnosis system with optimized CNN hyperparameters using gradient-based optimizer.

Abstract

Coronavirus disease-2019 (COVID-19) is caused by a new type of coronavirus and turned into a pandemic within a short time. The reverse transcription-polymerase chain reaction (RT-PCR) test is used for the diagnosis of COVID-19 in national healthcare centers. Because the number of PCR test kits is often limited, it is sometimes difficult to diagnose the disease at an early stage. However, X-ray technology is accessible nearly all over the world, and it can detect signs of COVID-19 successfully. Another disease which affects people's lives to a great extent is colorectal cancer. Tissue microarray (TMA) is a technique which is widely used for its high performance in the analysis of colorectal cancer. Computer-assisted approaches which can classify colorectal cancer in TMA images are also needed. In this respect, the present study proposes a convolutional neural network (CNN) classification approach whose hyperparameters are optimized using the gradient-based optimizer (GBO) algorithm. With the proposed approach, COVID-19, normal, and viral pneumonia cases in chest X-ray images can be classified accurately. Additionally, epithelial and stromal regions in epidermal growth factor receptor (EGFR) colon TMAs can also be classified. The proposed approach is called COVID-CCD-Net. AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet architectures were used in COVID-CCD-Net, and the hyperparameters of these architectures were optimized for the proposed approach. Two different medical image classification datasets, namely COVID-19 and Epistroma, were used in the present study. The experimental findings demonstrated that the proposed approach significantly increased the classification performance of the non-optimized CNN architectures and achieved very high classification performance even with a very small number of epochs.

Keywords: COVID-19; Colon cancer diagnosis; Convolutional neural network (CNN); Gradient-based optimizer (GBO); Hyperparameter optimization

Year: 2022 PMID： 35396625 PMCID： PMC8993211 DOI： 10.1007/s11517-022-02553-9

Source DB: PubMed Journal: Med Biol Eng Comput ISSN： 0140-0118 Impact factor: 3.079

Introduction

COVID-19 broke out in early December 2019 and rapidly turned into a pandemic. According to World Health Organization (WHO) data, 227,940,972 people had been infected and 4,682,899 people had died from the disease worldwide at the time of writing [1]. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus which has caused the COVID-19 pandemic [2]. Common symptoms of COVID-19 include fever, muscle pain, dry cough, headache, sore throat, and chest pain [3, 4]. Due to these symptoms, COVID-19 has been accepted as a respiratory tract disease. These symptoms may take 2 to 14 days to appear in a person who has been infected with the virus [5]. Despite recent attempts at finding a treatment method, such as a drug or vaccine, against the disease, no viable solutions to COVID-19 have been found yet. Various medical imaging techniques such as X-ray and computed tomography (CT) can be considered important tools in the diagnosis of COVID-19 cases [6, 7]. Coronavirus usually causes lung infections. Therefore, chest X-ray and CT images are widely used by physicians and radiologists for an accurate and quick diagnosis in patients infected with the virus. The polymerase chain reaction (PCR) test is widely used for the diagnosis of COVID-19. However, the test is not always accessible at all healthcare points. It must also be noted that, compared to PCR tests, X-ray and CT-based imaging techniques are usually more reliable and accessible. When CT and X-ray methods are compared, X-ray machines are preferred by radiologists and physicians because they are available in nearly every location, including remote rural areas, are cost-effective, and can perform imaging in a fairly short period of time [5]. However, it is time-consuming for physicians and radiologists to evaluate the patients’ X-ray images.
Furthermore, manual evaluation also runs the risk of inaccurate diagnosis because the detection of infected areas in an image requires technical know-how and medical experience. Therefore, an accurate and quick computer-assisted diagnosis system is needed for COVID-19 cases. The literature indicates that deep learning (DL) algorithms have been used successfully to diagnose COVID-19 in X-ray images [5, 8–12].

Introduced by Kononen [13] in 1998, tissue microarray (TMA) is an innovative and high-performance technique used for the analysis of multiple tissue samples. It is a high-end technology with a remarkable performance and has recently been used in the analysis of molecular identifiers. There is sufficient evidence to claim that epidermal growth factor receptor (EGFR) plays an important role in tumor development [14]. In parallel with this, it was also observed that EGFR plays an important role in the initiation and progress of colorectal cancer [15].

The present study proposes a convolutional neural network (CNN) classification approach whose hyperparameters are optimized using the gradient-based optimizer (GBO) algorithm [16]. CNN is the most widely used DL model. The proposed approach was used to classify COVID-19, normal, and viral pneumonia cases. In addition, it can also be used to classify epithelial and stromal regions in EGFR-colon digitized tumor TMAs.

Real-world applications in many different fields such as medicine, agriculture, and engineering can be approached as optimization problems. To this day, numerous optimization approaches have been developed in order to solve real-world problems effectively. However, high-performance optimization approaches are needed because the difficulty of these optimization problems keeps increasing. In this respect, metaheuristic algorithms (MAs), which are known as global optimization techniques, have been widely used to solve challenging optimization problems [17–22].
An artificial neural network (ANN) is an important machine learning approach inspired by the nervous system of the human brain. It involves an input layer, hidden layers, and an output layer, and its training process aims to find optimal values for the weights of each neuron [23]. The performance of an ANN is heavily affected by the amount and variety of training data. If too little data is used in the training process, the performance of the ANN is very likely to decrease. Various changes have been applied to the ANN structure to design feedback and multi-layer models, which paved the way for the solution of non-linear problems. With the advent of multi-layer neural network models, the number of layers in an ANN has also increased and led to the development of CNN, which is a high-performance version of ANN models. Although CNN was introduced during the 1990s, it was not widely adopted in that period because computer hardware was not powerful enough [23]. However, thanks to later developments in computer hardware and graphical processing units (GPUs), CNN performance has increased remarkably in recent years, and CNN has become one of the most widely used machine learning approaches in various fields such as health, transportation, security, stock exchange, and law. Various CNN architectures have been proposed in the existing literature, such as MobileNet-V2, ShuffleNet, GoogleNet, VGG-16, VGG-19, and AlexNet. In these CNN architectures, hyperparameters such as learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold are known to affect CNN performance directly. Therefore, it is not surprising that various studies in the existing literature attempted to offer solutions to the optimization of these hyperparameters.
The present study used AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet architectures for the proposed approach, i.e., a COVID-19 and colon cancer diagnosis system with hyperparameters optimized using GBO. In order to optimize hyperparameters such as learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold in these architectures, the GBO algorithm proposed by Ahmadianfar et al. [16] was used. Inspired by Newton’s method, GBO is one of the most recent metaheuristic optimization approaches. The present study aims to optimize the hyperparameters of AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet and thereby increase their classification performance.

The main contributions of the present study can be summarized as follows. First, the present study proposes a high-performance approach which can classify both COVID-19 and colon cancer in TMAs. No approach which can classify both diseases has so far been proposed in the current literature. Second, the proposed COVID-CCD-Net approach uses the GBO algorithm [16], proposed in 2020, to optimize the hyperparameters of AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet. Third, the present study aims to obtain a high level of accuracy with a small number of epochs in the AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet architectures of the proposed COVID-CCD-Net approach, whereas the non-optimized CNN methods obtained a much lower level of accuracy with the same number of epochs.

The organization of the present study is as follows: “Section 2” describes the related works. “Section 3” presents gradient-based optimizer and convolutional neural networks. “Section 4” describes the proposed COVID-CCD-Net approach. “Section 5” presents experiments and results, and “Section 6” concludes the study.

Related works

Hyperparameter optimization

In order to optimize hyperparameters in CNN, various approaches such as adaptive gradient optimizer [24], Adam optimizer [25], Bayesian optimization [26], equilibrium optimization [27], evolutionary algorithm [28], genetic algorithm [29], grid search [30], particle swarm optimization [31, 32], random search [30, 33], simulated annealing [33], tree-of-Parzen estimators [33], whale optimization algorithm [34], and weighted random search [35] have been proposed so far. Grid search is an exhaustive search algorithm that aims to identify the best hyperparameter values within a manually specified subset of the hyperparameter space [36]. However, since the grid of configurations grows exponentially with the number of hyperparameters, the algorithm is often not useful for the optimization of deep neural networks [36]. During hyperparameter optimization in CNN, it may take a few hours or a whole day to evaluate a single hyperparameter selection, which causes serious computational problems. Similar to the grid search algorithm, the random search algorithm also has difficulty sampling a sufficient number of points to be evaluated [37]. Bayesian optimization has recently been a popular technique for hyperparameter optimization [38]. One of the main advantages of Bayesian optimization–based neural network optimization is that it does not require the neural network to be trained to completion. On the other hand, its complexity and the high-dimensional hyperparameter space make Bayesian optimization an impractical and expensive approach for hyperparameter optimization [36]. One of the biggest disadvantages of the genetic algorithm is that it often becomes stuck in a local optimum, which leads to premature convergence and non-optimal solutions [39].
Therefore, hyperparameter optimization techniques based on genetic algorithms are also likely to be problematic. Lima [33] compared various hyperparameter optimization algorithms such as random search, simulated annealing, and tree-of-Parzen estimators in order to find the most effective CNN architecture for the classification of benign and malignant small pulmonary nodules. Kumar and Hati [24] proposed the adaptive gradient optimizer–based deep convolutional neural network (ADG-dCNN) approach for the detection of bearing and rotor faults in squirrel cage induction motors. Ilievski et al. [40] used a radial basis function (RBF) as a surrogate in hyperparameter optimization in order to reduce the complexity of the original network. Talathi [41] proposed a simple sequential model-based optimization algorithm in order to optimize hyperparameters in deep CNN architectures. Rattanavorragant and Jewajinda [42] proposed an approach using an island-based genetic algorithm in order to optimize hyperparameters in DNN automatically. This approach involves two steps: hyperparameter search and detailed DNN training. Navaneeth and Suchetha [43] proposed the optimized one-dimensional CNN with support vector machine (1-D CNN-SVM) approach in order to diagnose chronic kidney diseases using the PSO algorithm. Compared to the studies reviewed above, the main contribution of the present study is that the proposed COVID-CCD-Net approach can detect two important diseases: COVID-19 and colon cancer in TMAs. In addition, the proposed approach uses GBO, which is a metaheuristic approach, for the optimization of CNN models to overcome various problems mentioned in the existing literature.

Deep learning approaches for COVID-19

Recently, many studies focusing on the diagnosis of COVID-19 using CNN have been published [44–50]. The literature review indicates that some of these studies [45–47] focused on distinguishing COVID-19 from non-COVID cases. On the other hand, there are also studies which classified cases into three groups: COVID, normal, and pneumonia [48–50]. Within the framework of the present study, the proposed COVID-CCD-Net approach classifies chest X-ray images into three groups: COVID, normal, and pneumonia. Shi et al. [51] performed a detailed literature review of the state-of-the-art computer-assisted methods for the diagnosis of COVID-19 in X-ray and CT scans. Castiglioni et al. [52] used two chest X-ray datasets containing 250 COVID-19 and 250 non-COVID cases in order to train, validate, and test ResNet-50. Hemdan et al. [53] proposed a deep learning–based approach called COVIDX-Net in order to diagnose COVID-19 in chest X-ray images automatically. This study involved seven different deep architectures, namely MobileNetV2, VGG19, InceptionV3, DenseNet201, InceptionResNetV2, ResNetV2, and Xception. Khan et al. [54] proposed a CNN-based approach called CoroNet, based on the Xception architecture, in order to diagnose COVID-19 using X-ray and CT scans. The experimental studies demonstrated that CoroNet yielded an overall accuracy rate of 89.6% in four classes (COVID vs. pneumonia bacterial vs. pneumonia viral vs. normal) and an overall accuracy rate of 95% in three classes (normal vs. COVID vs. pneumonia). The proposed COVID-CCD-Net approach differs from other studies on the detection of COVID-19 using CNN models in that it improves classification performance by optimizing the hyperparameters of CNN models with the GBO approach.

Computer-aided colon cancer detection approaches

The existing literature shows that the number of studies dealing with automatic diagnosis of colon cancer in TMAs is limited. Nguyen et al. [55] analyzed different ensemble approaches for colorectal tissue classification using highly efficient TMAs and proposed an ensemble deep learning–based approach with two different neural network architectures, VGG16 and CapsNet. With this approach, they classified colorectal tissues in highly efficient TMAs into three categories, namely tumor, normal, and stroma/others. Xu et al. [56] proposed a deep CNN approach in order to perform the segmentation and classification of epithelial and stromal regions in TMAs. This study used two different datasets containing breast and colorectal cancer images. Finally, Linder et al. [57] proposed an approach for the automatic detection of epithelial and stromal regions in colorectal cancer TMAs using texture features and an SVM classifier. The proposed COVID-CCD-Net approach improves on other studies on the detection of colon cancer in TMAs using CNN models in that it optimizes the hyperparameters of the CNN models, which significantly increases the detection accuracy for colon cancer. In addition, the strong image classification performance of CNNs gives the present study an advantage over studies in the existing literature that use other approaches to classify colon cancer in TMAs.

Theoretical background

Gradient-based optimizer

Inspired by gradient-based Newton’s method, GBO was proposed by Ahmadianfar et al. [16] as one of the most recent metaheuristic algorithms. This algorithm is based on two main operators: gradient search rule (GSR) and local escaping operator (LEO). Main steps of GBO are described below.

Initialization process

In GBO, each member of the population is called a “vector” and, as seen in Eq. 1, the population consists of N number of vectors in a D-dimension search space. As shown in Eq. 2, each vector in the initial population is created by assigning random values within the boundaries of search space. Here, Xmin and Xmax are lower and upper boundaries in the search space, respectively, while rand(0,1) is a random number in a range of [0,1].
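From the description above, Eqs. 1 and 2 can be written in the following form. This is a reconstruction from the surrounding definitions rather than a copy of the typeset equations, and the index names n and d are notation assumed here.

```latex
% Eq. 1 (reconstructed): a population of N vectors in a D-dimensional search space
X_{n} = \left[ X_{n,1},\, X_{n,2},\, \ldots,\, X_{n,D} \right], \qquad n = 1, 2, \ldots, N

% Eq. 2 (reconstructed): random initialization within the search-space boundaries
X_{n} = X_{\min} + \mathrm{rand}(0,1) \times \left( X_{\max} - X_{\min} \right)
```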

Gradient search rule

The GSR operator is used in GBO in order to increase exploration ability, escape local minima, and accelerate the convergence rate. Thus, optimal solutions can be obtained within the search space [16]. The position of a vector in the next iteration is calculated using Eqs. 3 and 4 from X1, X2, and x, where x denotes the current position of the vector, and ra and rb are random numbers in the range [0, 1]. X1 and X2 are defined in terms of x, xbest, GSR, and DM, where x and xbest are the current position and the best vector in the population, respectively. GSR denotes the gradient search rule, while DM represents the direction of movement. GSR introduces random behavior into GBO, which improves its exploration ability and helps it escape local minima. GSR is calculated as described in [16], where rand(1:N) is an N-dimensional random number, r1, r2, r3, and r4 denote random integers selected from the range [1, N], and step represents the step size. DM, shown in Eq. 11, helps the current position of the vector (x) move along the direction of xbest - x and thus provides local searching in order to improve the convergence speed of GBO [16]. Global exploration and local exploitation must be balanced in an algorithm in order to find solutions closer to the global optimum. The ρ1 and ρ2 parameters in Eqs. 4, 7, and 11 are used to balance exploration and exploitation in GBO [16]. These parameters are calculated as described in [16], where βmin and βmax are 0.2 and 1.2, respectively, m denotes the current iteration number, and M represents the maximum number of iterations.
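The balance parameters can be illustrated with a short sketch. The text above only fixes βmin = 0.2, βmax = 1.2, the iteration counter m, and the iteration limit M; the cubic schedule for β and the sine-based expression for α below are the forms usually quoted for GBO and are an assumption here that should be checked against [16]. The function names are hypothetical.

```python
import math
import random

BETA_MIN, BETA_MAX = 0.2, 1.2  # values stated in the text


def beta(m, M):
    """Assumed schedule: decays from BETA_MAX (m = 0) to BETA_MIN (m = M)."""
    return BETA_MIN + (BETA_MAX - BETA_MIN) * (1.0 - (m / M) ** 3) ** 2


def alpha(m, M):
    """Assumed sine-based coefficient derived from beta (to be verified against [16])."""
    b = beta(m, M)
    return abs(b * math.sin(1.5 * math.pi + math.sin(1.5 * math.pi * b)))


def rho(m, M, rng):
    """Random balance parameter drawn uniformly from [-alpha, alpha]."""
    a = alpha(m, M)
    return 2.0 * rng.random() * a - a


if __name__ == "__main__":
    rng = random.Random(0)
    for m in (1, 5, 10):
        print(m, round(beta(m, 10), 4), round(alpha(m, 10), 4), round(rho(m, 10, rng), 4))
```

Under this assumed schedule, β shrinks as the iterations advance, and each draw of ρ is bounded in magnitude by the current α.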

Local escaping operator

LEO is used to improve the efficiency of GBO. It can change the position of the x vector significantly. With LEO, a new vector XLEO is created as shown in Eqs. 15 and 16 and assigned to the x vector, as shown in Eq. 17. Here, f1 and f2 are random numbers generated in the range [−1, 1], u1, u2, and u3 are three different randomly generated numbers, and xk is a newly generated vector. u1, u2, u3, and xk are defined as described in [16], where rand, μ1, and μ2 are random numbers in the range [0, 1], xrand denotes a randomly generated new vector, and xp is a vector randomly selected from the population [16]. The flowchart of GBO is shown in Fig. 1.

Fig. 1

Flowchart of the GBO

Convolutional neural networks

A convolutional neural network (CNN) is a special type of neural network inspired by the biological model of the animal visual cortex [58, 59]. CNNs are particularly used in the field of image and sound processing due to their main advantage: the automatic and adaptive extraction of features during the training process [60]. In CNNs, the variables that define the network structure (kernel size, stride, padding, etc.) and the variables that govern how the network is trained (learning rate, momentum, optimization strategies, batch size, etc.) are known as hyperparameters [29], which must be adjusted accurately for a more effective CNN performance. In the present study, learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold value, which are among the training hyperparameters of AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet, were optimized using the GBO algorithm. Learning rate, which is also known as step size, determines how strongly the weights are updated [61, 62]. Solver, on the other hand, represents the optimization method to be used, such as Adam, Sgdm, or Rmsprop [63]. L2 regularization, which is also called weight decay, is a simple regularization method that scales weights down in proportion to their current size [64, 65]. Gradient threshold method and gradient threshold value are parameters related to gradient clipping. If the gradient increases exponentially in magnitude, the training is unstable and can diverge within a few iterations. Gradient clipping helps avoid this exploding gradient problem. If the gradient exceeds the gradient threshold, it is clipped according to the gradient threshold method [66, 67].

The AlexNet architecture, developed by Krizhevsky et al. [68], has an input image size of 227×227. It consists of 5 convolution and 3 fully connected layers, thus reaching a depth of 8 layers. DarkNet-19 has a depth of 19 layers and its input image size is 256×256 [69]. Introduced by Szegedy et al. [70], the Inception-v3 model has a depth of 48 layers with an input image size of 299×299. ResNet-18, which has a depth of 18 layers and an input image size of 224×224, was developed by He et al. [71]. Zhang et al. [72] proposed the ShuffleNet model with a depth of 50 layers and an input image size of 224×224. Finally, MobileNet, which was proposed by Sandler et al. [73], has a depth of 53 layers and an input image size of 224×224.
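The gradient clipping described above can be sketched in a few lines. The three functions mirror the method names used in this study (l2norm, global-l2norm, absolute-value); their exact behavior is an assumption based on the usual meaning of these options, and the function names are hypothetical rather than any library's API. Gradients are represented as plain lists of floats.

```python
import math


def clip_l2norm(grad, threshold):
    """Rescale one parameter's gradient so that its L2 norm is at most threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    scale = threshold / norm
    return [g * scale for g in grad]


def clip_global_l2norm(grads, threshold):
    """Rescale all gradients jointly, using the L2 norm taken over every parameter."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= threshold:
        return [list(grad) for grad in grads]
    scale = threshold / global_norm
    return [[g * scale for g in grad] for grad in grads]


def clip_absolute_value(grad, threshold):
    """Clip each partial derivative independently to [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in grad]


if __name__ == "__main__":
    print(clip_l2norm([3.0, 4.0], 1.0))
    print(clip_global_l2norm([[3.0], [4.0]], 1.0))
    print(clip_absolute_value([3.0, -4.0, 0.5], 1.0))
```

Norm-based clipping preserves the direction of the gradient, whereas element-wise clipping can change it, which is one reason the choice of method is treated as a hyperparameter.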

Hyperparameter optimization of CNN models using gradient-based optimizer

In the present study, hyperparameters of the AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet CNN models, namely learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold value, were optimized using the GBO algorithm in order to classify COVID-19, normal, and viral pneumonia cases in chest X-ray images. In addition, epithelial and stromal regions in epidermal growth factor receptor (EGFR) colon TMAs can also be classified. The proposed approach is called COVID-CCD-Net, as shown in the flowchart in Fig. 2.

Fig. 2

Flowchart of COVID-CCD-Net

In the proposed COVID-CCD-Net approach, the initial parameters of GBO, such as ε, the population size, and the maximum number of iterations, are set first. Then, an initial population is created from vectors with randomly assigned values. Each vector consists of 5 dimensions which represent the learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold parameters of the CNN models. Lower boundary (LB) and upper boundary (UB) values of these parameters are given in Table 1. Learning rate, L2 regularization, and gradient threshold are real values which are randomly generated between the LB and UB values. A solver value of 1, 2, or 3 selects the “sgdm,” “adam,” or “rmsprop” optimization method, respectively. A gradient threshold method value of 1, 2, or 3 selects the “l2norm,” “global-l2norm,” or “absolute-value” method, respectively. In line with these boundaries, each vector in the initial population is generated using the formula in Eq. 22.

Table 1

Hyperparameters to be optimized and their ranges

Parameter	LB	UB
Learning rate	0.00001	0.01
Solver	1	3
L2 regularization	0.00001	0.01
Gradient threshold method	1	3
Gradient threshold	0.1	10
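The encoding in Table 1 can be sketched as follows. The bounds and the index-to-name mappings come from the text; how a real-valued coordinate is turned into the integer index 1, 2, or 3 is not stated, so rounding to the nearest integer is an assumption, and the names `BOUNDS`, `random_vector`, and `decode` are hypothetical.

```python
import math
import random

# (LB, UB) per dimension, taken from Table 1
BOUNDS = {
    "learning_rate": (0.00001, 0.01),
    "solver": (1, 3),
    "l2_regularization": (0.00001, 0.01),
    "gradient_threshold_method": (1, 3),
    "gradient_threshold": (0.1, 10),
}
SOLVERS = {1: "sgdm", 2: "adam", 3: "rmsprop"}
CLIP_METHODS = {1: "l2norm", 2: "global-l2norm", 3: "absolute-value"}


def random_vector(rng):
    """Draw one 5-dimensional vector uniformly between LB and UB."""
    return [lb + rng.random() * (ub - lb) for lb, ub in BOUNDS.values()]


def to_index(value):
    """Assumption: round half up to the nearest integer and clamp to 1..3."""
    return min(3, max(1, math.floor(value + 0.5)))


def decode(vector):
    """Map a real-valued vector to named training hyperparameters."""
    lr, solver, l2, method, threshold = vector
    return {
        "learning_rate": lr,
        "solver": SOLVERS[to_index(solver)],
        "l2_regularization": l2,
        "gradient_threshold_method": CLIP_METHODS[to_index(method)],
        "gradient_threshold": threshold,
    }


if __name__ == "__main__":
    print(decode(random_vector(random.Random(0))))
```

Keeping the categorical choices as numeric coordinates lets a continuous optimizer such as GBO treat all five hyperparameters uniformly.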

The following steps are taken in order to calculate the fitness value of each vector. First, the X vector whose fitness value will be calculated is sent to the CNN model, and the values of the X vector are assigned to the learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold parameters of the CNN model. Next, the CNN model is trained using the training dataset. After training, the validation accuracy obtained is sent back to GBO and assigned as the fitness value of the X vector. As shown in Fig. 2, the steps of the algorithm are iterated until the maximum number of iterations is reached. At the end, the vector with the best fitness value is accepted as the solution of the problem.
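The evaluation flow described above can be sketched as a generic population loop. This is not GBO itself: the update step below is a deliberately simple placeholder (a random move toward the current best vector), not the GSR and LEO operators, and `toy_fitness` stands in for the real fitness, which would train a CNN and return its validation accuracy. All names are hypothetical.

```python
import random

# Search-space bounds from Table 1, one (LB, UB) pair per dimension
SEARCH_BOUNDS = [(0.00001, 0.01), (1, 3), (0.00001, 0.01), (1, 3), (0.1, 10)]


def toy_fitness(vector):
    """Placeholder for validation accuracy: higher is better, maximum at mid-range."""
    return -sum(((x - (lb + ub) / 2) / (ub - lb)) ** 2
                for x, (lb, ub) in zip(vector, SEARCH_BOUNDS))


def search(fitness, bounds, pop_size=10, max_iter=10, seed=0):
    """Population search that calls fitness exactly pop_size * max_iter times."""
    rng = random.Random(seed)
    pop = [[lb + rng.random() * (ub - lb) for lb, ub in bounds]
           for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]  # first iteration: score the initial vectors
    calls = pop_size
    for _ in range(max_iter - 1):
        best = pop[max(range(pop_size), key=scores.__getitem__)]
        for i, x in enumerate(pop):
            cand = []
            for (lb, ub), xi, bi in zip(bounds, x, best):
                # Placeholder update, NOT the GSR/LEO rules of GBO
                step = rng.random() * (bi - xi) + 0.1 * (ub - lb) * (rng.random() - 0.5)
                cand.append(min(ub, max(lb, xi + step)))
            score = fitness(cand)
            calls += 1
            if score > scores[i]:  # keep the candidate only if it improves
                pop[i], scores[i] = cand, score
    best_i = max(range(pop_size), key=scores.__getitem__)
    return pop[best_i], scores[best_i], calls


if __name__ == "__main__":
    best_vector, best_score, n_calls = search(toy_fitness, SEARCH_BOUNDS)
    print(n_calls, round(best_score, 4), best_vector)
```

Because every fitness call corresponds to one full training run in the real setting, the number of calls is the dominant cost, which is why the population size and iteration count are kept small.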

Experiments and results

The present study proposes the COVID-CCD-Net approach in which learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold parameters of AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet were optimized using GBO. The classification performance of the proposed approach was tested using two different medical image classification datasets. Additionally, the results of this test were compared with those obtained from non-optimized AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet CNN models. In addition, Quasi-Newton (Q-N) algorithm [74], one of the most fundamental optimization methods, was also used to optimize the hyperparameters of CNN models and compared with the proposed COVID-CCD-Net approach. The following sub-sections describe medical image classification datasets, experiment setup, and present comparative experimental findings.

Medical image classification datasets

COVID-19 [75, 76] and Epistroma [77] datasets were selected for the experimental studies. The COVID-19 dataset consists of three classes, namely “Covid-19,” “Normal,” and “Viral Pneumonia,” with a total of 3829 images. The Epistroma dataset, on the other hand, consists of two classes, namely “epithelium” and “stroma,” with a total of 1376 images. In both datasets, 80% of the images were used for training and 20% for testing, and 5-fold cross-validation was performed. Ten percent of the training data in each dataset was also used for validation. Sample images from both datasets are shown in Fig. 3.

Fig. 3

Sample images from datasets

Experimental setup

All experimental studies were carried out on the MATLAB R2020a platform. The number of vectors in the GBO population and the maximum number of iterations were both set to 10 in the proposed COVID-CCD-Net approach. In other words, the fitness function is called 100 times. The Q-N algorithm starts its search at a single point instead of performing a population-based search. For a fairer comparison with the proposed approach, the maximum number of iterations was set to 100 in the Q-N algorithm so that the fitness function is also called 100 times. In addition, the default MATLAB values for the solver, L2 regularization, gradient threshold method, and gradient threshold parameters, namely “sgdm,” “0.0001,” “l2norm,” and “Inf,” respectively, were used for the non-optimized CNN models. The number of epochs for all CNN models was set to 2 for the COVID-19 dataset and 5 for the Epistroma dataset. Mini batch size was set to 25. Twenty independent experimental runs were conducted on these datasets for all CNN models, and the obtained mean accuracy, maximum accuracy, F1-score, and standard deviation values were compared to measure the performance of all models.

Experimental results

Mean accuracy, maximum accuracy, F1-score, and standard deviation values obtained from 20 different independent studies on COVID-19 and Epistroma datasets are given in Tables 2 and 5, respectively. The findings were also shown in bar charts in Figs. 4 and 5 to give a clearer picture of the overall findings.

Table 2

Accuracy, F1-score and Std. dev. results of validation and test for the COVID-19 dataset

	Validation				Test
	Acc. (mean)	Acc.(max)	F1-score	Std. dev.	Acc. (mean)	Acc.(max)	F1-score	Std. dev.
AlexNet	90.065	93.638	90.254	2.627	87.089	90.601	87.305	2.590
DarkNet-19	93.189	95.269	93.292	1.833	91.149	94.125	91.309	1.526
Inception-v3	89.233	93.964	89.291	3.610	86.436	90.601	86.631	3.171
MobileNet	82.007	86.134	81.716	3.052	81.312	85.248	81.333	2.470
ResNet-18	91.313	93.638	91.418	1.460	88.962	91.253	89.115	1.793
ShuffleNet	88.679	94.454	88.665	4.035	86.449	90.339	86.535	3.075
COVID-CCD-Net (AlexNet)	96.354	97.879	96.460	1.166	96.044	98.172	96.138	0.927
COVID-CCD-Net (DarkNet-19)	97.553	98.532	97.654	0.631	97.369	98.303	97.458	0.496
COVID-CCD-Net (Inception-v3)	95.718	97.553	95.830	1.139	94.791	96.736	94.900	1.736
COVID-CCD-Net (MobileNet)	95.057	97.227	95.179	1.502	94.608	98.042	94.718	1.816
COVID-CCD-Net (ResNet-18)	97.977	98.532	98.063	0.301	98.107	98.695	98.158	0.353
COVID-CCD-Net (ShuffleNet)	96.223	96.900	96.312	0.424	96.730	98.172	96.805	0.623
Quasi-Newton-based AlexNet	94.043	96.574	94.205	4.123	91.879	93.994	92.099	3.038
Quasi-Newton-based DarkNet-19	96.807	97.226	96.914	0.373	94.072	95.430	94.225	1.399
Quasi-Newton-based Inception-v3	94.266	95.432	94.331	0.868	91.051	93.733	91.179	2.554
Quasi-Newton-based MobileNet	93.541	95.595	93.653	1.723	89.713	92.298	89.925	1.990
Quasi-Newton-based ResNet-18	95.834	96.248	95.916	0.511	93.019	94.125	93.124	0.869
Quasi-Newton-based ShuffleNet	95.050	95.759	95.142	1.055	90.551	93.081	90.731	2.336

Table 5

Accuracy, F1-score, and Std. dev. result of validation and test for the Epistroma dataset

	Validation				Test
	Acc. (mean)	Acc.(max)	F1-score	Std. dev.	Acc. (mean)	Acc.(max)	F1-score	Std. dev.
AlexNet	91.750	93.182	91.615	0.924	90.800	92.727	90.691	1.144
DarkNet-19	93.773	97.273	93.608	2.017	92.909	95.636	92.714	2.518
Inception-v3	92.477	95	92.264	1.255	93.073	96	92.870	1.481
MobileNet	92.886	94.545	92.727	1.066	92.145	94.909	92.009	1.641
ResNet-18	94.091	95	93.950	0.692	94.218	95.636	94.097	1.048
ShuffleNet	89.454	92.273	89.149	2.133	90.491	93.818	90.270	2.105
COVID-CCD-Net (AlexNet)	98.341	99.545	98.278	0.711	97.618	98.909	97.542	0.683
COVID-CCD-Net (DarkNet-19)	98.727	100	98.676	0.996	98.273	99.636	98.211	1.538
COVID-CCD-Net (Inception-v3)	99.705	100	99.692	0.424	98.964	99.636	98.924	0.476
COVID-CCD-Net (MobileNet)	95.318	98.636	95.161	3.561	94.255	97.455	94.096	3.752
COVID-CCD-Net (ResNet-18)	99.545	100	99.526	0.330	98.836	99.636	98.793	0.481
COVID-CCD-Net (ShuffleNet)	97.818	99.091	97.736	0.880	96.164	97.455	96.046	0.955
Quasi-Newton based AlexNet	95.636	97.727	95.505	2.286	95.491	97.455	95.362	2.337
Quasi-Newton based DarkNet-19	97.545	99.091	97.461	2.073	95.491	97.091	95.333	1.326
Quasi-Newton based Inception-v3	97.818	98.636	97.739	0.985	96.436	97.091	96.310	0.598
Quasi-Newton based MobileNet	95.545	96.364	95.407	1.132	93.964	95.273	93.802	0.913
Quasi-Newton based ResNet-18	98.001	99.091	97.929	1.944	96.655	98.182	96.541	1.530
Quasi-Newton based ShuffleNet	97.091	98.636	96.991	1.997	95.927	97.455	95.805	2.063

Fig. 4

Bar charts for the COVID-19 dataset

Fig. 5

Bar charts for the Epistroma dataset

The findings related to the COVID-19 dataset demonstrated that, in validation, COVID-CCD-Net (ResNet-18) reached the highest mean validation accuracy, maximum validation accuracy, and F1-score values with 97.977, 98.532, and 98.063, respectively. The second highest values were yielded by COVID-CCD-Net (DarkNet-19) with 97.553, 98.532, and 97.654, while non-optimized MobileNet displayed the lowest performance with 82.007, 86.134, and 81.716. In the testing process, COVID-CCD-Net (ResNet-18) classified test images with a mean accuracy rate of 98.107%, followed by COVID-CCD-Net (DarkNet-19) with a mean accuracy rate of 97.369%. Non-optimized MobileNet displayed the lowest performance in both validation and testing. Validation and test accuracy for the COVID-19 dataset before and after optimization with COVID-CCD-Net are given in Table 3, and the results demonstrated that COVID-CCD-Net increased the mean test accuracy of the non-optimized CNN models by 6.22–13.29 percentage points. The performance also improved when the Q-N algorithm was used to optimize the hyperparameters of the non-optimized CNN models. However, in that case the mean test accuracy increased by only 2.92 to 8.40 percentage points, demonstrating that GBO displays a higher performance in hyperparameter optimization on the COVID-19 dataset.

Table 3

Validation and test accuracy before and after optimization with COVID-CCD-Net on the COVID-19 dataset

	Validation			Test
	Before optimization	After optimization	Performance improvement	Before optimization	After optimization	Performance improvement
AlexNet	90.065	96.354	6.289	87.089	96.044	8.955
DarkNet-19	93.189	97.553	4.364	91.149	97.369	6.22
Inception-v3	89.233	95.718	6.485	86.436	94.791	8.355
MobileNet	82.007	95.057	13.05	81.312	94.608	13.296
ResNet-18	91.313	97.977	6.664	88.962	98.107	9.145
ShuffleNet	88.679	96.223	7.544	86.449	96.73	10.281
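
The "Performance improvement" columns of Table 3 are the differences between the accuracies after and before optimization. The following minimal Python sketch recomputes them from the values in the table; the dictionary layout and the helper name `improvements` are illustrative choices and are not part of the original study.

```python
# Mean accuracy (%) as (before, after) GBO optimization, copied from Table 3.
table3 = {
    "AlexNet":      {"val": (90.065, 96.354), "test": (87.089, 96.044)},
    "DarkNet-19":   {"val": (93.189, 97.553), "test": (91.149, 97.369)},
    "Inception-v3": {"val": (89.233, 95.718), "test": (86.436, 94.791)},
    "MobileNet":    {"val": (82.007, 95.057), "test": (81.312, 94.608)},
    "ResNet-18":    {"val": (91.313, 97.977), "test": (88.962, 98.107)},
    "ShuffleNet":   {"val": (88.679, 96.223), "test": (86.449, 96.730)},
}


def improvements(table, split):
    """Return {model: after - before} for one split ("val" or "test")."""
    return {
        model: round(values[split][1] - values[split][0], 3)
        for model, values in table.items()
    }


val_gain = improvements(table3, "val")
test_gain = improvements(table3, "test")

# Smallest and largest gain on the test set.
print(min(test_gain.values()), max(test_gain.values()))
```

On the test set the smallest gain belongs to DarkNet-19 and the largest to MobileNet, which is where the 6.22–13.29 range quoted in the text comes from.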

The findings for the Epistroma dataset show that, in the training process, the highest mean accuracy, maximum accuracy, and F1-score were obtained by COVID-CCD-Net (Inception-v3) with 99.705, 100, and 99.692, respectively. COVID-CCD-Net (Inception-v3) also yielded the highest values in the testing process with 98.964, 99.636, and 98.924. It was followed by COVID-CCD-Net (ResNet-18) with 99.545, 100, and 99.526 for training and 98.836, 99.636, and 98.793 for testing. On the other hand, the lowest performance in training and testing was displayed by non-optimized ShuffleNet with 89.454, 92.273, and 89.149 and 90.491, 93.818, and 90.270, respectively. Validation and test accuracies for the Epistroma dataset before and after optimization with COVID-CCD-Net are given in Table 4; COVID-CCD-Net increased the test accuracy of the non-optimized CNN models by 2.11–6.81%. Performance also improved when the Q-N algorithm was used to optimize the hyperparameters of the non-optimized CNN models. It can be seen in Table 5 that this improvement was between 1.81 and 5.43%, demonstrating that GBO performs better than Q-N in hyperparameter optimization on the Epistroma dataset.

Table 4

Validation and test accuracy before and after optimization with COVID-CCD-Net on the Epistroma dataset

	Validation			Test
	Before optimization	After optimization	Performance improvement	Before optimization	After optimization	Performance improvement
AlexNet	91.75	98.341	6.591	90.8	97.618	6.818
DarkNet-19	93.773	98.727	4.954	92.909	98.273	5.364
Inception-v3	92.477	99.705	7.228	93.073	98.964	5.891
MobileNet	92.886	95.318	2.432	92.145	94.255	2.11
ResNet-18	94.091	99.545	5.454	94.218	98.836	4.618
ShuffleNet	89.454	97.818	8.364	90.491	96.164	5.673
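
The comparison between GBO and Q-N on the Epistroma test set can be cross-checked with a short Python sketch. The "before" and GBO values are taken from Table 4; the Q-N values are assumed to be the mean test accuracies in the "Quasi-Newton based" rows of the Epistroma results table (the fifth column of each row). The variable names are illustrative.

```python
# Mean test accuracy (%) on the Epistroma dataset.
before = {"AlexNet": 90.800, "DarkNet-19": 92.909, "Inception-v3": 93.073,
          "MobileNet": 92.145, "ResNet-18": 94.218, "ShuffleNet": 90.491}
gbo = {"AlexNet": 97.618, "DarkNet-19": 98.273, "Inception-v3": 98.964,
       "MobileNet": 94.255, "ResNet-18": 98.836, "ShuffleNet": 96.164}
# Assumption: Q-N mean test accuracies read from the "Quasi-Newton based" rows.
qn = {"AlexNet": 95.491, "DarkNet-19": 95.491, "Inception-v3": 96.436,
      "MobileNet": 93.964, "ResNet-18": 96.655, "ShuffleNet": 95.927}

# Gain of each optimizer over the non-optimized model.
gbo_gain = {m: gbo[m] - before[m] for m in before}
qn_gain = {m: qn[m] - before[m] for m in before}

for model in before:
    print(f"{model:13s} GBO +{gbo_gain[model]:.3f}   Q-N +{qn_gain[model]:.3f}")
```

Under this reading of the table, the Q-N gains span 1.819 (MobileNet) to 5.436 (ShuffleNet), and the GBO gain is larger than the Q-N gain for every architecture.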

As shown in Tables 2 and 5, the GBO algorithm remarkably improves the performance of the non-optimized CNN models on the COVID-19 and Epistroma datasets. Additionally, the experiments indicated that the GBO algorithm outperformed the Q-N algorithm in hyperparameter optimization on both datasets. Mean training accuracy curves of all models obtained from the COVID-19 dataset are shown in Fig. 6. COVID-CCD-Net (ResNet-18) converged fastest, while non-optimized MobileNet converged slowest. Mean training accuracy curves of all models obtained from the Epistroma dataset are shown in Fig. 7. COVID-CCD-Net (Inception-v3), COVID-CCD-Net (ResNet-18), and COVID-CCD-Net (DarkNet-19) converged quickly in the first 20 iterations and more slowly in the remaining iterations.

Fig. 6

Mean training accuracy curves for the COVID-19 dataset

Fig. 7

Mean training accuracy curves for the Epistroma dataset

Maximum and mean confusion matrix values of all models obtained from the testing processes for the COVID-19 and Epistroma datasets are shown in Fig. 8 and Fig. 9. A confusion matrix is a table that describes the performance of a model by reporting how the samples of each class were classified. Rows and columns in a confusion matrix correspond to the predicted class (output class) and true class (target class), respectively.
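
To make the row and column convention concrete, the following minimal Python sketch builds a confusion matrix with rows as predicted classes and columns as true classes, then derives the overall accuracy and the per-class accuracy from it. The label vectors are hypothetical and serve only as an illustration; they are not results from the study.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix with rows = predicted class and columns = true class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[p][t] += 1
    return cm


# Hypothetical labels: 0 = COVID-19, 1 = normal, 2 = viral pneumonia.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 0, 2]

n = 3
cm = confusion_matrix(y_true, y_pred, n)

# Overall accuracy: correctly classified samples (diagonal) over all samples.
total = sum(sum(row) for row in cm)
overall_acc = sum(cm[k][k] for k in range(n)) / total

# Per-class accuracy: diagonal entry divided by its column sum,
# because each column collects all samples of one true class.
per_class_acc = [cm[k][k] / sum(cm[r][k] for r in range(n)) for k in range(n)]

for row in cm:
    print(row)
print(overall_acc, per_class_acc)
```

With the opposite convention (rows as true classes), the per-class accuracy would be computed from row sums instead of column sums.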

Fig. 8

Confusion matrices of COVID-19 dataset

Fig. 9

Confusion matrices of Epistroma dataset

The receiver operating characteristic (ROC) curves of the COVID-19 and Epistroma datasets are provided in Fig. 10 and Fig. 11, respectively; they show the relationship between the false positive rate (FPR) and the true positive rate (TPR). It can be clearly seen that COVID-CCD-Net (ResNet-18) on the COVID-19 dataset and COVID-CCD-Net (Inception-v3) on the Epistroma dataset have higher true positive rates.
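
As an illustration of how an ROC curve relates FPR and TPR, the sketch below sweeps the decision threshold over a set of classifier scores for a binary problem and records one (FPR, TPR) point per threshold. The scores and labels are hypothetical, the function names are illustrative, and distinct scores are assumed; for a three-class problem such as the COVID-19 dataset, the same procedure would presumably be applied per class in a one-vs-rest fashion.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold.

    scores: predicted score of the positive class (assumed distinct)
    labels: 1 for positive samples, 0 for negative samples
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one sample at a time, highest score first.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six test images.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points)
print(area)
```

A curve that rises toward the top-left corner (high TPR at low FPR) gives a larger area, which is the sense in which one model's ROC curve is better than another's.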

Fig. 10

ROC curves for COVID-19 dataset

Fig. 11

ROC curves for Epistroma dataset

Table 6 and Table 7 compare the performance of COVID-CCD-Net with several state-of-the-art methods on the COVID-19 and Epistroma datasets. COVID-CCD-Net has the highest classification accuracy among the compared methods for both datasets.

Table 6

Comparison of the results with state-of-the-art CNN methods for COVID-19 dataset

Source	Models/methods	Number of classes	Overall acc (%)
Song et al. [78]	DRE-Net	2	86
Song et al. [78]	DRE-Net	3	93
Ozturk et al. [79]	DarkCovidNet	3	87.02
Ozturk et al. [79]	DarkCovidNet	2	98.08
Wang et al. [80]	CNNs, transfer learning	2	89.5
Keidar et al. [81]	Data augmentation, segmentation and CNN	2	90.3
Wang et al. [50]	Customized CNN architecture	3	93.33
Zhang et al. [82]	COVID19XrayNet	2	91.92
Goel et al. [83]	OptCoNet	3	97.78
Proposed	COVID-CCD-Net	3	98.107

Table 7

Comparison of the results with state-of-the-art CNN methods for Epistroma dataset

Source	Models/methods	Number of classes	Overall acc (%)
Alinsaif and Lang [84]	Fine tuning CNN	2	98.84
Cascianelli et al. [85]	Dimensionality reduction strategies for CNN	2	94.7
Huang et al. [86]	CNNs, transfer learning	2	93.5
Bianconi et al. [87]	CNNs, Resnet50	2	97.3
Proposed	COVID-CCD-Net	2	98.96


Conclusion

In order to accurately classify COVID-19, normal, and viral pneumonia in chest X-ray images as well as epithelial and stromal regions in TMA images, the present study proposed the COVID-CCD-Net approach, in which the hyperparameters of the AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet CNN models are optimized using GBO, one of the most recent metaheuristic optimization algorithms. Training parameters of these CNN models such as learning rate, solver, L2 regularization, gradient threshold method, and gradient threshold were optimized and tuned using the GBO algorithm. In GBO, each vector of the population represents a set of CNN hyperparameters, and the algorithm searches for the hyperparameter values that give the model the highest classification performance. Two different medical image classification datasets, i.e., COVID-19 and Epistroma, were used in the experimental study. While GBO hyperparameter optimization improved the performance of the non-optimized CNN models on the COVID-19 dataset by 6.22% to 13.29%, the Q-N algorithm improved it by only 2.92% to 8.40%. Similarly, while GBO hyperparameter optimization improved the performance of the non-optimized CNN models on the Epistroma dataset by 2.11% to 6.81%, the Q-N algorithm improved it by only 1.81% to 5.43%. These results demonstrated that the proposed approach significantly improved the classification performance of the AlexNet, DarkNet-19, Inception-v3, MobileNet, ResNet-18, and ShuffleNet CNN models relative to their non-optimized counterparts.

One of the main problems in CNN-based classification approaches is their need for a large number of high-quality images and for optimal hyperparameter values in order to achieve successful classification performance. In the present study, a sufficient number of images was used to complete the training process for the CNN architectures, and the proposed COVID-CCD-Net approach was used to optimize their hyperparameters to overcome the above-mentioned problems. Future studies will focus on the optimization of different hyperparameters such as filter size, filter number, stride, and padding using various metaheuristic optimization algorithms.
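
The idea that each population vector represents one set of CNN hyperparameters can be sketched in Python as follows. This is only an illustration under stated assumptions: the encoding (five values in [0, 1]), the bounds, the option lists, the use of classification error as fitness, and all function names are hypothetical and are not taken from the study.

```python
# Hypothetical search space; bounds and option names are illustrative
# assumptions, not values reported in the paper.
SOLVERS = ["sgdm", "adam", "rmsprop"]
THRESHOLD_METHODS = ["l2norm", "global-l2norm", "absolute-value"]


def pick(options, x):
    """Map x in [0, 1] to one entry of a list of categorical options."""
    index = min(int(x * len(options)), len(options) - 1)
    return options[index]


def log_scale(x, low, high):
    """Map x in [0, 1] to the interval [low, high] on a logarithmic scale."""
    return low * (high / low) ** x


def decode(vector):
    """Turn one population vector (five values in [0, 1]) into hyperparameters."""
    lr, solver, l2, method, threshold = vector
    return {
        "learning_rate": log_scale(lr, 1e-4, 1e-1),
        "solver": pick(SOLVERS, solver),
        "l2_regularization": log_scale(l2, 1e-5, 1e-2),
        "gradient_threshold_method": pick(THRESHOLD_METHODS, method),
        "gradient_threshold": 1.0 + 5.0 * threshold,
    }


def fitness(vector, evaluate):
    """Assumed fitness: classification error of the CNN encoded by the vector.

    `evaluate` is a user-supplied (hypothetical) function that trains a CNN
    with the decoded hyperparameters and returns its accuracy in [0, 1].
    """
    return 1.0 - evaluate(decode(vector))


params = decode([0.0, 0.5, 1.0, 0.99, 0.2])
print(params)

# Stand-in evaluator that pretends every configuration reaches 90% accuracy.
error = fitness([0.0, 0.5, 1.0, 0.99, 0.2], lambda hp: 0.9)
print(error)
```

In such a setup the optimizer only manipulates the numeric vectors; the decoding step is what allows categorical choices such as the solver to be searched alongside continuous values such as the learning rate.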
