
Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases.

Abdullahi Umar Ibrahim¹, Mehmet Ozsoz¹, Sertan Serte², Fadi Al-Turjman³, Salahudeen Habeeb Kolapo⁴.

Abstract

Reverse-transcription polymerase chain reaction (RT-PCR) is currently the gold standard for detecting viral strains in human samples, but the technique is expensive, time-consuming and often leads to misdiagnosis. The recent outbreak of COVID-19 has led scientists to explore other options, such as artificial intelligence driven tools, as an alternative or confirmatory approach for detecting viral pneumonia. In this paper, we used a Convolutional Neural Network (CNN) approach to detect viral pneumonia in X-ray images with a pretrained AlexNet model, thereby adopting a transfer learning approach. The dataset was obtained from the Optical Coherence Tomography and chest X-ray image collection made available by Kermany et al. (2018, https://doi.org/10.17632/rscbjbr9sj.3), with a total of 5853 pneumonia (positive) and normal (negative) images. To evaluate the average efficiency of the model, the dataset was split into 50:50, 60:40, 70:30, 80:20 and 90:10 for training and testing, respectively. To evaluate the performance of the model, 10-fold cross-validation was carried out. The performance of the model on the overall dataset was compared with the cross-validation means and with the current state of the art. The classification model showed high performance in terms of accuracy, sensitivity and specificity. The 70:30 split performed better than the other splits, with an accuracy of 98.73%, sensitivity of 98.59% and specificity of 99.84%.


Keywords: CNN; COVID‐19; artificial intelligence; pretrained AlexNet; viral pneumonia

Year: 2021 PMID: 34177037 PMCID: PMC8209916 DOI: 10.1111/exsy.12705

Source DB: PubMed Journal: Expert Syst ISSN: 0266-4720 Impact factor: 2.812

Abbreviations: AI, Artificial Intelligence; AUC, Area Under the Curve; BN, Batch Normalization; CAD, Computer Assisted Diagnosis; CNN, Convolutional Neural Network; CONV, Convolution; CT, Computerized Tomography; CXR, Chest X‐ray; DL, Deep Learning; FCL, Fully Connected Layers; FM, Feature Map; FN, False Negative; FP, False Positive; GPU, Graphical Processing Unit; MERS, Middle East Respiratory Syndrome; ML, Machine Learning; MRI, Magnetic Resonance Imaging; RAM, Random Access Memory; ReLu, Rectified Linear Unit; RT‐PCR, Reverse‐Transcription Polymerase Chain Reaction; SARS‐CoV‐1 and 2, Severe Acute Respiratory Syndrome Coronavirus 1 and 2; SVM, Support Vector Machine; TL, Transfer Learning; TP, True Positive; WHO, World Health Organization.

INTRODUCTION

Pneumonia is a disease caused by different types of pathogens, including viruses, bacteria and fungi. The species that cause pneumonia are shown in Table 1. According to the World Health Organization (2018), over 4 million premature deaths occur as a result of diseases related to household air pollution, including pneumonia and tuberculosis. More than 150 million people are estimated to be infected with pneumonia annually, and the disease is more prevalent in children under 5 years of age. Globally, pneumonia is among the top diseases affecting children and accounts for 15% of mortality among infants and children below 5 years, leading to over 1.4 million deaths in 2018 and 2.56 million in 2017. Although the disease is most common among children, it can affect all age groups. Cases of pneumonia are predominant in underdeveloped countries with poor healthcare sectors and a lack of medical personnel and resources for diagnosis and treatment (Gilani et al., 2012; Stephen et al., 2019).

TABLE 1

Classification of pneumonia based on pathogens

Pathogen	Species
Viruses	Influenza virus, Severe Acute Respiratory Syndrome Coronavirus (SARS‐CoV‐1 and 2), Middle East Respiratory Syndrome (MERS) Coronavirus, Adenovirus, Enteroviruses, Hantavirus etc.
Bacteria	Legionella pneumophila, Streptococcus pneumoniae, Mycoplasma pneumoniae, Chlamydophila pneumoniae etc.
Fungi	Aspergillus spp, Histoplasmosis, Pneumocystis jirovecii, Coccidioidomycosis, Mucoromycetes, Cryptococcus etc.

COVID‐19 is among the diseases caused by viruses of the Coronaviridae family. Several strains of this family have caused global concern in the past, such as Middle East respiratory syndrome coronavirus (MERS‐CoV) in 2012 and severe acute respiratory syndrome coronavirus (SARS‐CoV) in 2002 (Dowel et al., 2004; Oboho et al., 2015). COVID‐19 was declared a pandemic by the World Health Organization (WHO) in mid‐March 2020 following the outbreak of a new viral strain, which was first recorded on the eve of January 2020 in Wuhan, China. COVID‐19 has spread to almost every country, infecting more than 30 million people with over 800 thousand deaths globally as of 08 October 2020 (WHO, 2020). The major symptoms of COVID‐19 include fever, cough and difficulty in breathing; in severe cases it can lead to pneumonia, kidney failure and eventually death (Banerjee et al., 2019; Chen et al., 2020). The disease is more severe in patients with other conditions, such as immune system disorders, patients placed on a ventilator, people who smoke and patients suffering from asthma and other chronic diseases (Kolhar et al., 2020; Rahman et al., 2020; Srivastava et al., 2020).

The use of artificial intelligence and machine learning in healthcare systems is growing rapidly owing to their ability to detect diseases, diagnose clinical issues, discover drugs, etc. Specific machine learning models have even outperformed both microbiologists and pathologists in the diagnosis of specific cases owing to their pattern recognition ability (Bakator & Radosav, 2018; Hu et al., 2020; Paules et al., 2020; Wang, Casalino, et al., 2019). Clinicians employ different approaches to diagnose viral pneumonia, such as blood tests, chest X‐ray, sputum tests and pulse oximetry. The gold standard technique is RT‐PCR for detection of the viral strain, together with Computerized Tomography (CT) scan images interpreted by radiologists. Even though many studies have reported the efficiency of artificial intelligence models for detecting viral pneumonia, these approaches are limited to CT scan images acquired from patients who visited clinics or other health care settings. Different viruses are associated with viral pneumonia, such as influenza virus, respiratory syncytial virus, human metapneumovirus, adenoviruses and coronaviruses, with COVID‐19 as the most recent addition to the list (Chowdhury et al., 2020; Ruuskanen et al., 2011). The unavailability of test kits and the lockdown of cities as a result of the COVID‐19 pandemic are other major challenges. To address these challenges, we used a pretrained AlexNet model to classify viral pneumonia and normal (i.e., healthy) chest X‐ray images.

The integration of IoT, artificial intelligence and biosensors has contributed to the advancement of smart systems that can be used to detect, manage and control diseases. Smart sensing tools and monitoring devices built from chips and sensors have improved various aspects of healthcare systems, including detection of disease‐causing pathogens, monitoring of medication, storage and analysis of vital signals, medical records management and rehabilitation. The potential of IoT in disease detection revolves around AI‐driven biosensors, which collect physiological and other data from patients' smartphones and wearable devices and apply AI or ML techniques to detect changes in patients' vital signal patterns (Kanaparthi et al., 2019; Kavakiotis et al., 2017; Paiva et al., 2018).

Machine learning (ML) and deep learning (DL)

Machine learning is a subset of AI that uses statistical methods to give computers the ability to learn patterns from data without being explicitly programmed. ML algorithms are categorized into supervised, unsupervised, semi‐supervised (a hybrid of supervised and unsupervised) and reinforcement learning. Supervised ML is the most common approach employed in healthcare systems; it uses labelled data so that models learn features for prediction and classification (based on patterns). Supervised ML algorithms include neural networks (NNs), support vector machines (SVM), decision trees, random forests etc. (Paiva et al., 2018). Unsupervised ML uses unlabelled data to enable a model to learn and predict output based on patterns learned from the input data. Clustering and rule mining are the most common algorithms used in unsupervised ML. Reinforcement learning, in contrast, relies on experience acquired by performing a given task (Catthoor & Van Hoof, 2018; Wang, Casalino et al., 2019). DL, a sub‐branch of artificial intelligence, comprises deeper neural networks that can identify more complex non‐linear patterns in data acquired from medical devices (such as microscopes and MRI) and IoT ecosystems (such as sensors, device implants and monitors) and provide meaningful output for decision making.

Various neural network architectures have been developed, and some perform better than others in regression, classification and image denoising. Current CNN‐based architectures include AlexNet with eight layers, VGGNet with 16 and 19 layers, the Inception module, also known as GoogleNet, with 22 layers and 9 modules, and Residual Network (ResNet) with 152 layers (Russakovsky et al., 2015; Yu et al., 2020). The principle behind the application of CNN in classification or regression revolves around a series of dot products between weight matrices and the input matrix. These processes are categorized into two stages known as feature learning and classification (Wang, Sun, et al., 2020). Feature learning is based on convolutional blocks with operations such as convolution, which combines the input matrix with a filter matrix to obtain a convolved map, or feature map. The activation operation applies an activation function such as the Rectified Linear Unit (ReLu), which sets negative outputs to zero, sigmoid, which squashes outputs into the range 0 to 1, or tanh, which squashes them into the range −1 to 1. The main function of the pooling operation is to reduce computation by keeping the most important part of the convolved map, through either max pooling or average (mean) pooling (Kang et al., 2020; Wang, Muhammad, et al., 2020). The output is obtained after these operations in all the layers (including fully connected layers or global average pooling layers), with a classifier such as SoftMax assigning the output to a category based on probabilities.
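The feature learning and classification stages described above can be sketched in a few lines of plain Python. This is a toy illustration, not the model used in the study: the 5×5 input, the 2×2 edge filter and the way the pooled map is reduced to two class scores are all made up for the example.

```python
import math

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and take dot products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Set negative activations to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input with a vertical edge, and a hypothetical edge-detecting filter.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool(feature_map)                   # 2x2 map after pooling
# Stand-in for a fully connected layer: one score from the pooled map, one fixed at 0.
probs = softmax([sum(map(sum, pooled)), 0.0])
```

In a real CNN the filter weights and the fully connected weights are learned during training rather than fixed by hand.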

Application of artificial intelligence in detection of pathogenic diseases

Artificial intelligence has been applied in different fields of medicine to detect diseases such as cancer, tuberculosis, diabetic retinopathy and pneumonia, including bacterial pneumonia and viral pneumonia (influenza virus and, recently, SARS‐CoV‐2). The most common types of dataset used by medical experts are microscopic slide images and radiographic images (such as CT and CXR). These diseases are classified using different DL models, such as ImageNet models (AlexNet, VGGNet, GoogleNet and ResNet), models designed from scratch, or hybrid models (Bakator & Radosav, 2018; Chowdhury et al., 2020; Kallianos et al., 2019; Wang, Casalino, et al., 2019).

Challenges

As the number of people suffering from pneumonia (especially cases caused by influenza virus and SARS‐CoV‐2) continues to grow rapidly, there is a pressing need for testing kits that enable mass detection and provide results in a short period of time. Detection of viral pneumonia, both COVID‐19 and non‐COVID‐19, is critical for prevention and control. Health experts require sophisticated technology to detect these pathogens accurately. Moreover, detection of individual pathogens using molecular testing is still not at the standard of point‐of‐care diagnostics; instead, specimens are sent to specialized or equipped laboratories for RT‐PCR, sequencing and diagnosis. Pneumonia, whether as a symptom of COVID‐19 or as bacterial pneumonia, has been a major challenge for medical and healthcare sectors in many underdeveloped countries and remote communities with limited diagnostic tools and treatment options. Another approach used by medical experts is chest X‐ray imaging, which is cheap, reliable and fast. However, interpreting the images can be tedious even for qualified radiologists. Therefore, a fast, cheap, simple and accurate approach for the diagnosis and prediction of these diseases is highly needed.

Contribution

Accordingly, our contributions can be summarized as follows. We used a pretrained AlexNet model (through transfer learning) for detection of pneumonia in chest X‐ray images. We carried out 10‐fold cross validation to estimate how the model will perform on unseen data. We evaluated the performance of the models based on accuracy, sensitivity and specificity for the general dataset, and on the mean of these parameters for 10‐fold cross validation. The remaining parts of this article are organized as follows. Section 2 reviews related work on the use of AI for the detection of pneumonia. In Section 3, we introduce the adopted model together with the dataset description, model training and cross validation. In Section 4, we discuss the results obtained from training and testing the model, the comparison of the general dataset with cross validation and the comparison of our models with the state of the art. Finally, we include concluding remarks in Section 5.

RELATED WORK

Throughout the last decade, scientists have been trying to integrate AI, ML and DL into healthcare systems. Researchers have used CNN to solve challenges in medicine such as disease detection, using classification and segmentation approaches in skin disorders, brain and breast cancer, and diabetic retinopathy. In the field of microbiology, microbiologists, radiologists and computer scientists have been working together to detect microbial diseases such as tuberculosis, malaria and pneumonia using computer aided diagnosis (Kallianos et al., 2019).

X‐ray images are the basic data used for detection of pneumonia with an ML approach. This idea was adopted by Stephen et al. (2019), who used a DL approach to classify X‐ray image samples. The research employed a CNN built from scratch using open‐source Keras with TensorFlow to extract distinctive features from positive and negative images. The dataset contains 5856 X‐ray images, normal and pneumonia, collected from pediatric patients between 1 and 5 years old. The dataset was further augmented to yield a larger training set. The model was tested on different data sizes (100–300) and achieved average accuracies of 94.81% and 93.01% for training and validation, respectively. ChestX‐Ray8, a new dataset from Chest X‐ray Database and Benchmarks, was used by Wang et al. (2019). The dataset contains 108,948 X‐ray images from 32,717 patients for detection of thoracic diseases. The authors trained CNN networks such as AlexNet, VGGNet‐16, GoogleNet and ResNet‐50 on the dataset and achieved an AUC value of 0.6333 for “pneumonia”. A similar study was carried out by Rajpurkar et al. (2017) based on a 121‐layer CNN called CHeXNet. This research used more than 100 thousand frontal‐view X‐ray images covering 14 different diseases. For detection of pneumonia, the model achieved an AUC value of 0.8887, outperforming radiologists.

The use of AI and CT scans for detection of COVID‐19 is reported by Wang, Kang, et al. (2020). A total of 453 CT scan images of confirmed COVID‐19 cases from patients diagnosed with viral pneumonia were used as the dataset. The images were divided into training, testing and validation sets. The model achieved a validation accuracy of 82.9%, sensitivity of 84% and specificity of 80.5%, while the external testing result showed an accuracy of 73.1%, sensitivity of 74% and specificity of 67%. Saraiva et al. (2019) classified X‐ray images of childhood pneumonia using a CNN model. The research dataset was made available online by Kermany, Zhang, et al. (2018), is labelled as Optical Coherence Tomography (OCT) and Chest X‐Ray Images, and contains a total of 5863 images. The model was trained based on cross validation (k = 5) and achieved 95.30% average accuracy. Recently, Chouhan et al. (2020) used transfer learning to classify X‐ray images into positive and negative pneumonia samples. The research employed transfer learning models of Resnet (Inception V3), GoogleNet, DenseNet121 and AlexNet. A total of 5856 normal and pneumonia (bacterial and viral) images were used. The models achieved respective training (at different epochs) and testing accuracies as follows: AlexNet (98.97% and 92.86%), DenseNet121 (99.23% and 92.62%), GoogleNet (99.48% and 93.12%) and ResNet (99.48% and 94.23%).

A broader study is reported by Xu et al. (2020). The authors proposed an artificial intelligence technique to screen for and distinguish two types of viral pneumonia, COVID‐19 and Influenza‐A, from healthy cases using patients' CT images. A total of 618 CT scans (224 CT samples from 224 patients with Influenza‐A virus, 219 from 110 patients with COVID‐19 and 175 CT samples from healthy people) were used as the dataset, which underwent image processing before training with a 3‐dimensional DL model. The model achieved an overall accuracy of 86.7%. Peng et al. (2020) used a small dataset obtained from 32 patients already diagnosed with COVID‐19 using the RT‐PCR method. The study used four AI‐driven tools and showed that AI can improve the confirmed diagnosis rate for clinical cases of COVID‐19. To discriminate between viral and bacterial pneumonia, Rajaraman et al. (2018) employed CNN (VGG‐16, residual and inception CNN) for detection of pneumonia in pediatric chest radiographs by localizing the Region of Interest (ROI). The dataset contains a total of 5856 images (viral pneumonia, bacterial pneumonia and normal CXR images). The models achieved 96.2% accuracy for bacterial pneumonia and 93.6% for viral pneumonia. A more sophisticated study was carried out by Zech et al. (2018), who used a deep NN and a split validation approach to detect pneumonia in X‐ray images. The study employed a total of 158,323 chest radiographs collected from three different institutions. The results showed high accuracy and AUC values. A summary of the literature review is presented in Table 2.

TABLE 2

Detection of different types of pneumonia using AI‐driven tools

Reference	Type of pneumonia	Dataset	Result
Stephen et al., 2019	Viral pneumonia (strain not specified)	5856 X‐ray images	Average Ac of 94.81% for training and 93.01% for validation
X. Wang et al., 2017	Not specified	108,948 X‐ray images	0.6333 AUC
Rajpurkar et al., 2017	Not specified	100,000 X‐ray images	0.8887 AUC
Wang, Kang et al., 2020	Viral pneumonia (COVID‐19)	453 CT scan images	Validation Ac of 82.9%, Sv of 84% and Sf of 80.5%; testing Ac of 73.1%, Sv of 74% and Sf of 67%
Saraiva et al., 2019	Viral pneumonia (strain not specified)	5863 chest X‐ray images	Ac of 95.30%
Chouhan et al., 2020	Viral and bacterial pneumonia (strains not specified)	5863 chest X‐ray images	Different models were used
Xu et al., 2020	Viral pneumonia (COVID‐19, Influenza‐A)	618 CT scan images	Ac of 86.7%
Rajaraman et al., 2018	Viral and bacterial pneumonia (strains not specified)	5856 chest X‐ray images	Ac of 96.2% for bacterial pneumonia and 93.6% for viral pneumonia
Zech et al., 2018	Viral and bacterial pneumonia (strains not specified)	158,323 chest radiographs	Different models were used

Abbreviations: Ac, Accuracy; AUC, Area under the curve; Sf, Specificity; Sv, Sensitivity.


THE PROPOSED APPROACH

In this section, we detail the proposed approach and its main assumptions. The workflow of the study design is shown schematically in Figure 1. In this study, a pretrained AlexNet model is used to classify pneumonia and normal chest X‐ray images. Apart from AlexNet, there are other high‐performing CNN models such as VGGNet, GoogleNet and ResNet; AlexNet was chosen for its simplicity, smaller number of layers, low error and shorter computation time.

FIGURE 1

The workflow is represented schematically. CXR images are used to train the network using the pretrained AlexNet model for classification of pneumonia and normal (healthy) images

Dataset

We obtained X‐ray images made available by Kermany, Zhang et al. (2018). The dataset contains three folders (training, validation and testing) with a total of 5856 positive and negative cases. Each folder contains two subfolders named Pneumonia and Normal. The dataset consists of X‐ray images collected retrospectively from pediatric patients between the ages of 1 and 5, as shown in Figure 2 and described in Table 3.

FIGURE 2

Pediatric CXR scans. Left: Pneumonia. Right: Normal CXR scan

TABLE 3

Dataset description

Label	Number
Positive	4273
Negative	1583
Total	5856


Model training

For training, we used Matlab installed on a personal computer running 64‐bit Windows, with 8 GB of random access memory (RAM), an Intel® Core i7‐3537U processor and a graphical processing unit (GPU). The 30% of the dataset held out as the testing set is used to evaluate model performance. The pretrained AlexNet model is employed because of its high accuracy in feature extraction and image classification. Figure 3 shows the AlexNet architecture used to classify X‐ray images. The AlexNet model contains 5 convolution (CONV) blocks or layers with convolutional filters of size 3×3 without padding and a 2×2 window for the max pooling operation. The last 3 layers are 2 fully connected layers (FCL) and the output layer. Other terms include Batch Normalization (BN) and Feature Map (FM). The SoftMax activation function is used in the output layer for classification. Minibatch gradient descent is used to optimize the model. Training is carried out for 20 epochs with a learning rate of 0.0001.
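The spatial size of each feature map follows from the window size, stride and padding. A minimal sketch of that arithmetic, using the 3×3 unpadded convolution and 2×2 max pooling stated above; the 227‐pixel input width and the strides (1 for convolution, 2 for pooling) are assumptions made for illustration and are not taken from the study.

```python
def output_size(input_size, window, stride=1, padding=0):
    """Output width of a convolution or pooling window along one dimension."""
    return (input_size + 2 * padding - window) // stride + 1

# Assumed values for illustration only: 227-pixel input, stride-1 conv, stride-2 pooling.
after_conv = output_size(227, window=3, stride=1, padding=0)
after_pool = output_size(after_conv, window=2, stride=2)
print(after_conv, after_pool)
```

Applying the same function layer by layer gives the size of the feature map that reaches the fully connected layers.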

FIGURE 3

Training of models using the AlexNet model. The AlexNet model contains 5 convolution (CONV) blocks or layers. The first 2 CONV layers are made up of 3 operations: convolution, max pooling and normalization. The third and fourth layers are made up of convolution only, while the fifth layer is made up of convolution and max pooling. The last 3 layers are 2 fully connected layers (FCL) and the output layer with SoftMax activation function for classification

Data split

According to the literature, a split of 80% for training and 20% for testing is commonly recommended. To check the performance of different split ratios, we trained the model on 50:50, 60:40, 70:30, 80:20 and 90:10 splits for training and testing, respectively. The data split for each ratio is presented in Table 4.

TABLE 4

Data split

S/No	Training split (%)	Training positive	Training negative	Testing split (%)	Testing positive	Testing negative
1	50	2137	792	50	2136	791
2	60	2564	950	40	1709	633
3	70	2991	1108	30	1282	475
4	80	3418	1266	20	855	317
5	90	3846	1425	10	427	158

Note: Total number of dataset = 5856, Positive = 4273, Negative = 1583.
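The per‐class counts in Table 4 can be reproduced with a short sketch. The rounding rule used here (round half up for the training share, remainder assigned to testing) is inferred from the table values and is an assumption; the function name is hypothetical.

```python
def stratified_counts(n_positive, n_negative, train_percent):
    """Return ((train_pos, train_neg), (test_pos, test_neg)) for one split ratio."""
    # Integer round-half-up avoids floating-point surprises at .5 boundaries.
    train_pos = (n_positive * train_percent + 50) // 100
    train_neg = (n_negative * train_percent + 50) // 100
    return (train_pos, train_neg), (n_positive - train_pos, n_negative - train_neg)

for pct in (50, 60, 70, 80, 90):
    train, test = stratified_counts(4273, 1583, pct)
    print(f"{pct}:{100 - pct}  train={train}  test={test}")
```

Splitting each class separately keeps the positive‐to‐negative ratio roughly the same in the training and testing sets.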


Cross validation

Cross validation is a vital method used in machine learning for parameter selection and for evaluating learning performance and prediction. In this study, we used the K‐fold cross validation approach, in which the dataset is split into K sets of equal size (here K = 10). In each round, K−1 sets are used as the training dataset and the remaining set is used as the validation dataset. Training is repeated K times, so that each set serves once as the validation set (Fan & Hauser, 2018). The average performance over the training and testing datasets is computed as the evaluation index for the models. This approach is efficient especially when the number of samples is limited, as it takes advantage of the whole dataset. The cross validation dataset classifications are presented in Supplementary Data S1.
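The K‐fold procedure described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it partitions sample indices in order, whereas in practice the indices would usually be shuffled (and often stratified by class) first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# 5856 images split into K = 10 folds, as in this study.
folds = list(k_fold_indices(5856, k=10))
fold_sizes = [len(val) for _, val in folds]
```

Each image appears in exactly one validation fold and in the training set of the other nine rounds.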

Evaluation and confusion matrix

To evaluate the performance of the trained models, three parameters are employed: accuracy, sensitivity and specificity. Accuracy is the ratio of correctly classified images to the total number of images; it can also be expressed as the prevalence‐weighted average of sensitivity and specificity. The accuracy and loss of a model are evaluated using the formulas in Equations (1) and (2), where N is the overall number of images during training and testing, n is the number of images and PC is the probability of the correctly classified images. The confusion matrix is the common approach for evaluating model performance based on True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN). TPs are the samples correctly identified by the model as positive, that is, cases that actually have pneumonia and are classified as pneumonia. TNs are the samples correctly identified by the model as negative, that is, cases that are actually healthy (normal) and are classified as negative. FPs are the samples incorrectly classified as positive by the model, that is, cases that are actually negative (normal or healthy) but are classified as pneumonia. FNs are the samples incorrectly classified as negative by the model, that is, cases that are actually positive (pneumonia) but are classified as normal or healthy, as shown in Table 5.
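A common form consistent with the symbols N, n and PC defined above is given here as an assumption rather than as the authors' exact Equations (1) and (2): accuracy as the fraction of correctly classified images, and loss as the mean negative log‐probability (cross‐entropy). The index i and the symbol N_correct are introduced only for this illustration.

```latex
\begin{align}
\text{Accuracy} &= \frac{N_{\text{correct}}}{N} \\
\text{Loss} &= -\frac{1}{n}\sum_{i=1}^{n} \log P_{C,i}
\end{align}
```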

TABLE 5

Confusion matrix

	Actual positive (+)	Actual negative (−)
Predicted positive	True + (TP)	False + (FP)
Predicted negative	False − (FN)	True − (TN)

True positive rate (sensitivity) is the proportion of positive image samples that are correctly identified as positive. The formula for sensitivity is shown in Equation (3). True negative rate (specificity) is the proportion of negative samples that are correctly identified as negative; it equals one minus the false positive rate (FPR), which is the proportion of negative samples incorrectly identified as positive. The formula for specificity is shown in Equation (4).
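The three evaluation parameters follow directly from the confusion matrix counts using their standard definitions. A minimal sketch, with a hypothetical function name and made‐up counts that are not results from this study:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,    # correctly classified / all samples
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate = 1 - FPR
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=45, fp=5, fn=10)

# Accuracy is the prevalence-weighted average of sensitivity and specificity.
prevalence = (90 + 10) / 150
weighted = prevalence * metrics["sensitivity"] + (1 - prevalence) * metrics["specificity"]
```

With these counts all three values equal 0.9, and the weighted average matches the accuracy.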

RESULT AND DISCUSSION

General dataset

We trained the models with the entire dataset without cross validation. We used 5856 images in total, partitioned into 50:50, 60:40, 70:30, 80:20 and 90:10 for training and testing. The models were trained in Matlab with 5740 iterations, 20 epochs and a learning rate of 0.0001. For the 50:50 split, the model achieved a training accuracy of 97.98%, testing accuracy of 97.94%, sensitivity of 96.21% and specificity of 99.00%. When the training share was increased to 60% and the testing share reduced to 40%, the model achieved a training accuracy of 98.94%, testing accuracy of 98.95%, sensitivity of 99.09% and specificity of 98.81%. The difference between training and testing accuracy for the models trained on 50:50 and 60:40 is smaller than for the models trained on 70, 80 and 90%. This is a result of using the same or a similar amount of data for training and testing. Training the model on 70% and testing on 30% (i.e., 70:30) resulted in a training accuracy of 99.19%, testing accuracy of 98.73%, sensitivity of 98.59% and specificity of 99.84%. For the 80:20 split, the model achieved a training accuracy of 99.36%, testing accuracy of 100%, sensitivity of 99.11% and specificity of 99.65%. When the training share was increased to 90% and the testing share reduced to 10%, the model achieved a training accuracy of 99.86%, testing accuracy of 100%, sensitivity of 99.70% and specificity of 100%. These higher performances are a result of training the models on a large number of images and testing on fewer images (Table 6).

TABLE 6

General dataset result

Split	Training accuracy (%)	Testing accuracy (%)	Sensitivity (%)	Specificity (%)
50–50	97.96	97.94	96.71	99.00
60–40	98.94	98.95	99.09	98.81
70–30	99.19	98.73	98.59	98.84
80–20	99.36	100.00	99.11	99.66
90–10	99.86	100.00	99.70	100.00


Cross validation

The results show that training accuracy is greater than testing accuracy in all the k‐folds except 4‐fold, where testing accuracy is higher than training accuracy. The average training accuracy (97.70%) is greater than the average testing accuracy (96.04%). Sensitivity and specificity vary across the 10 folds. The average sensitivity (97.34%) and specificity (97.79%) indicate that the model successfully classified both negative and positive images. The result of cross validation is presented in Table 7.

TABLE 7

Cross validation result for pneumonia

K fold	Tr(A)	V	Ts(A)	Sv	Sf
1	98.35	0.9835	96.67	0.9800	0.9846
2	96.78	0.9678	94.71	0.9767	0.9650
3	97.72	0.9772	96.55	0.9867	0.9743
4	97.56	0.9756	94.71	0.9567	0.9815
5	97.72	0.9772	98.16	0.9567	0.9835
6	97.48	0.9748	94.14	0.9867	0.9712
7	96.86	0.9686	93.45	0.9800	0.9650
8	98.35	0.9835	96.21	0.9633	0.9897
9	98.27	0.9827	95.63	0.9867	0.9815
10	97.88	0.9788	97.13	0.9633	0.9835
Average	97.70 (976.97/10)	0.9770 (9.76970/10)	96.04 (960.36/10)	0.9734 (9.37368/10)	0.9779 (9.7798/10)

Abbreviations: Sf, Specificity; Sv, Sensitivity; Tr(A), Training accuracy; Ts(A), Testing accuracy; V, Validation.


General dataset performance against cross validation

As shown in Table 8, for the general dataset we obtained different values of training accuracy, testing accuracy, sensitivity and specificity for each dataset split, while for cross validation we obtained an average performance of 97.70% training accuracy, 96.04% testing accuracy, 97.35% sensitivity and 97.78% specificity. This shows that the average cross validation performance is lower than that of the general dataset in training accuracy, testing accuracy and specificity.

TABLE 8

Comparison between general dataset and cross validation

Split	Training accuracy (%)	Testing accuracy (%)	Sensitivity (%)	Specificity (%)
50–50	97.96	97.94	96.71	99.00
60–40	98.94	98.95	99.09	98.81
70–30	99.19	98.73	98.59	98.84
80–20	99.36	100.00	99.11	99.66
90–10	99.86	100.00	99.70	100.00
CV	97.70	96.04	97.35	97.98

Abbreviation: CV, Cross validation.


Discussion

Radiologists rely on radiological images to diagnose pneumonia based on the presence of infiltrates (white spots in the patient's lungs), which indicate the infection and other complications such as pleural effusions or abscesses. This approach can be very tedious for large numbers of images and can therefore lead to misinterpretation. Computer aided diagnosis (CAD), which was introduced in the 1990s, offers a simple, reliable, precise and fast approach to interpreting medical images. CAD assists pathologists and radiologists in distinguishing diseased from healthy images while preventing misinterpretation (Matsugu et al., 2003). The use of CNN to classify and characterize X‐ray images has shown better accuracy and precision than manual classification by some radiologists. Since the development of deep neural networks, scientists have been using different CNN models such as AlexNet, VGGNet 16 and 19, GoogleNet, ResNet and other networks built from scratch to detect pneumonia in X‐ray images. These computer models are based on mathematical algorithms that solve problems such as prediction and image classification using probability scores. The results presented in Table 6 show that increasing the amount of training data leads to an increase in training accuracy. Our results are in line with the study carried out by Prashanth et al. (2020), which was based on data splits from 50% to 90%. Moreover, the 70:30 split is chosen as the best performing model, which is “fit” compared to 80:20 and 90:10, which are relatively “overfit” because they are tested on a small number of images. The results obtained from training and testing the models are presented in Table 6 and Figure 4.

FIGURE 4

Classification of pneumonia using AlexNet

Comparing our result (based on the 70:30 split) with the state of the art, we obtained a testing accuracy of 98.73% using the general dataset and a testing accuracy of 97.70% using the average accuracies of the cross-validation results. Our model achieved better accuracy than the study conducted by Stephen et al. (2019), which used the same dataset but a different model built from scratch and achieved an average accuracy of 94.81%. Saraiva et al. (2019) used the same dataset as our study; the authors split the dataset into 5 folds and achieved 95.30% average accuracy, while we split our dataset into 10 folds and achieved an average accuracy of 97.70%. Rajaraman et al. (2018) used VGG‐16 to classify both bacterial and viral pneumonia. Their models achieved 96.2% and 93.6%, compared to our model, which achieved 98.73% for the 70:30 dataset split using pretrained AlexNet models. The results presented in Table 9 show that transfer learning yields higher accuracy than building a network from scratch, and higher accuracy than models trained on much larger datasets.
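The 10-fold cross-validation average mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names (`k_fold_indices`, `cross_validate`, `dummy_score`) are hypothetical, the shuffling seed is arbitrary, and `dummy_score` is a placeholder standing in for fine-tuning the pretrained network on the training folds and scoring it on the held-out fold.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, train_and_score):
    """Hold out each fold once; return the per-fold scores and their mean."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the held-out one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return scores, mean(scores)


def dummy_score(train_idx, test_idx):
    # Placeholder only: returns the held-out fraction instead of a real accuracy.
    return len(test_idx) / (len(train_idx) + len(test_idx))


scores, average = cross_validate(5856, 10, dummy_score)
print(len(scores), round(average, 4))
```

Each image is used for testing exactly once, and the reported cross-validation figure is the mean of the per-fold scores.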

TABLE 9

Comparison with similar studies from literature

Reference	No. of images	Model	A/AUC	Sv	Sf
70:30	5856	PA	98.73	98.59	98.84
CV	5856	PA	97.35	97.35	97.78
Stephen et al., 2019	5856	CNN	94.81	‐	‐
Chouhan et al., 2020	5856	PA	92.86	‐	‐
Saraiva et al., 2019	5856	CNN	95.30	‐	‐
Rajaraman et al., 2018	5856	CNN	96.2, 93.6	‐	‐
Kanaparthi et al., 2019	108,948	PA	0.6333	‐	‐
Rajpurkar et al., 2017	100,000	CHeXNet	0.8887	‐	‐

Abbreviations: A, Accuracy; AUC, Area under the curve; CV, Cross validation; PA, Pretrained AlexNet; Sf, Specificity; Sv, Sensitivity.
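For reference, accuracy (A), sensitivity (Sv) and specificity (Sf) can be derived from the counts of true and false positives and negatives. The sketch below assumes the standard confusion-matrix definitions; the function name and the example counts are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all images classified correctly
    sensitivity = tp / (tp + fn)                # share of pneumonia images detected
    specificity = tn / (tn + fp)                # share of normal images correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only.
acc, sens, spec = classification_metrics(tp=95, fp=10, tn=190, fn=5)
print(f"A={acc:.2%}, Sv={sens:.2%}, Sf={spec:.2%}")
```

Because sensitivity depends only on the positive class and specificity only on the negative class, the three values can differ noticeably when the classes are imbalanced.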

CONCLUSION

The recent outbreak of COVID‐19 has caused global concern, leading to over 30 million confirmed cases and more than 800 thousand deaths. Pneumonia is among the symptoms associated with COVID‐19. However, pneumonia is also caused by other pathogens, as in viral pneumonia (caused by influenza virus) and bacterial pneumonia (caused by Streptococcus pneumoniae). These infections are mostly diagnosed using bench diagnostic assays, which require chemical reagents and trained pathologists and radiologists, and which involve long procedures and a heavy workload. To address these challenges, we used a method based on DL and transfer learning. We trained our models on 5865 chest X‐ray images using different splits (50:50, 60:40…90:10) and CV to distinguish between viral pneumonia and healthy patients. For classification of pneumonia using X‐ray images, our model achieved a testing accuracy of 98.73%, sensitivity of 98.59% and specificity of 99.84% on the 70:30 split, and a testing accuracy of 96.04%, sensitivity of 97.35% and specificity of 97.78% using cross-validation means. Our results are in line with the notion that CNN models can classify medical images with high accuracy and precision. These models can serve as a confirmation system for diagnosis of viral pneumonia by minimizing misdiagnosis, and offer a way to relieve the heavy and tedious workload experienced by radiologists and pathologists at Near East University Hospital. The limitations of our study include the use of frontal radiographs without augmentation; normally, frontal images are the type interpreted by radiologists without the need for rotation or colour shift. Another challenge is the lack of a sufficiently large dataset. With a larger dataset, we could utilize different pretrained architectures such as VGGNet, GoogleNet and ResNet.
In the future, this model can be used for classification of COVID‐19, and IoT‐enabled systems integrated with artificial intelligence can be used for prediction of viral pneumonia. Different image processing techniques, such as annotation and image segmentation, can also be applied to the datasets. The performance of the models can also be improved by using hybrid models, such as combining an SVM with pretrained models or with models designed from scratch.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

24 in total

1. Subject independent facial expression recognition with robust face detection using a convolutional neural network.

Authors: Masakazu Matsugu; Katsuhiko Mori; Yusuke Mitari; Yuji Kaneda
Journal: Neural Netw Date: 2003 Jun-Jul

2. Deep Learning in Medicine-Promise, Progress, and Challenges.

Authors: Fei Wang; Lawrence Peter Casalino; Dhruv Khullar
Journal: JAMA Intern Med Date: 2019-03-01 Impact factor: 21.873

Review 3. How far have we come? Artificial intelligence for chest radiograph interpretation.

Authors: K Kallianos; J Mongan; S Antani; T Henry; A Taylor; J Abuya; M Kohli
Journal: Clin Radiol Date: 2019-01-28 Impact factor: 2.350

4. Unsupervised heart-rate estimation in wearables with Liquid states and a probabilistic readout.

Authors: Anup Das; Paruthi Pradhapan; Willemijn Groenendaal; Prathyusha Adiraju; Raj Thilak Rajan; Francky Catthoor; Siebren Schaafsma; Jeffrey L Krichmar; Nikil Dutt; Chris Van Hoof
Journal: Neural Netw Date: 2018-01-12

5. 2014 MERS-CoV outbreak in Jeddah--a link to health care facilities.

Authors: Ikwo K Oboho; Sara M Tomczyk; Ahmad M Al-Asmari; Ayman A Banjar; Hani Al-Mugti; Muhannad S Aloraini; Khulud Z Alkhaldi; Emad L Almohammadi; Basem M Alraddadi; Susan I Gerber; David L Swerdlow; John T Watson; Tariq A Madani
Journal: N Engl J Med Date: 2015-02-26 Impact factor: 91.245

6. Visualization and Interpretation of Convolutional Neural Network Predictions in Detecting Pneumonia in Pediatric Chest Radiographs.

Authors: Sivaramakrishnan Rajaraman; Sema Candemir; Incheol Kim; George Thoma; Sameer Antani
Journal: Appl Sci (Basel) Date: 2018-09-20 Impact factor: 2.679

Review 7. Bats and Coronaviruses.

Authors: Arinjay Banerjee; Kirsten Kulcsar; Vikram Misra; Matthew Frieman; Karen Mossman
Journal: Viruses Date: 2019-01-09 Impact factor: 5.048

8. Convolutional neural network for diagnosis of viral pneumonia and COVID-19 alike diseases.

Authors: Abdullahi Umar Ibrahim; Mehmet Ozsoz; Sertan Serte; Fadi Al-Turjman; Salahudeen Habeeb Kolapo
Journal: Expert Syst Date: 2021-04-26 Impact factor: 2.812

Review 9. Emerging coronaviruses: Genome structure, replication, and pathogenesis.

Authors: Yu Chen; Qianyun Liu; Deyin Guo
Journal: J Med Virol Date: 2020-02-07 Impact factor: 2.327
