
A hybrid random forest deep learning classifier empowered edge cloud architecture for COVID-19 and pneumonia detection.

Murugan Hemalatha

Abstract

COVID-19 is a global pandemic that mostly affects patients' respiratory systems, and at present the only way to protect against the virus is to diagnose the illness, isolate the patient, and provide immunization. Current COVID-19 testing is inefficient and results in more false positives. This difficulty can be addressed by developing a remote medical decision support system that detects illness from CT scans or X-ray images with less manual interaction and fewer errors. State-of-the-art techniques mainly use complex deep learning architectures, which are not effective when deployed on resource-constrained edge devices. To overcome this problem, this paper proposes a hybrid Random Forest Deep Learning (HRFDL) classifier optimized by a multi-objective Modified Heat Transfer Search (MOMHTS) algorithm. The MOMHTS algorithm optimizes the hyperparameters of the deep learning model in the HRFDL architecture so that it can run on resource-constrained edge devices. To evaluate the efficiency of this technique, extensive experiments are conducted on two real-world datasets, namely the COVID19 lung CT scan dataset and the Chest X-ray images (Pneumonia) dataset. The proposed methodology offers faster communication between the IoT devices, and the MOMHTS optimized HRFDL classifier used for COVID-19 detection is adapted to devices that support only minimal computation and storage. The proposed methodology achieves an accuracy of 99% on both the COVID19 lung CT scan dataset and the Chest X-ray images (Pneumonia) dataset with minimal computational time, cost, and storage. Based on the simulation outcomes, we conclude that the proposed methodology is well suited to edge computing for identifying COVID19 and pneumonia with high detection accuracy.
© 2022 Elsevier Ltd. All rights reserved.

Keywords:  Cloud computing; Deep learning; Healthcare industry; Heat Transfer Search algorithm; Random forest; Web services

Year:  2022        PMID: 35880010      PMCID: PMC9300559          DOI: 10.1016/j.eswa.2022.118227

Source DB:  PubMed          Journal:  Expert Syst Appl        ISSN: 0957-4174            Impact factor:   8.665


Introduction

The healthcare system plays a major role in anticipating medical emergencies, and a diagnosis at an early stage can even extend the lifetime of patients. The novel Coronavirus infection that emerged in Wuhan, China, and was recognized by the World Health Organization (WHO) in December 2019 is seen as a significant threat to mankind (Sohrabi et al., 2020). The outbreak turned into a pandemic by March 2020. The symptoms of the disease include fever, dry cough, breathing problems, and loss of taste and smell. The symptoms develop within 1 to 14 days. The WHO mainly recommended the real-time reverse transcription-polymerase chain reaction (rRT-PCR) test to identify COVID-19 (Ford et al., 2021). Virus traces from the sick person's nose and mouth are usually taken with a swab to diagnose this condition. Rapid diagnostic tests were also developed to identify COVID-19 within 30 min. The high cost and low production of these devices caused suffering for a massive number of affected people in rural areas. These techniques also faced severe criticism due to the high number of false-negative predictions (Carpenter, Mudd, West, Wilber, & Wilber, 2020). Hence, there is a need for an efficient medical decision support system for COVID19 diagnosis. Pneumonia is a disease of the lungs in which the air sacs fill with pus, resulting in chills, fever, and breathing problems. It is mainly caused by bacteria such as Streptococcus, Staphylococcus, Pseudomonas, Haemophilus, Chlamydia, and Mycoplasma, as well as several viruses, certain fungi, and protozoans. The disease can be divided into two forms, bronchial pneumonia and lobar pneumonia. The manual analysis of a large number of CT scans and X-rays is error-prone and time-consuming (Das, Kumar, Kaur, Kumar, & Singh, 2020).
Because COVID19 and pneumonia mainly target the respiratory system of the human body, chest X-rays and CT scan images play a vital role in diagnosing these diseases. Artificial intelligence and machine learning techniques tend to be effective at yielding insights from the massive data collected (Hung et al., 2020, Rodríguez-Rodríguez et al., 2021). The increased volume of data can be handled efficiently via edge computing techniques. Edge computing (Xu, Chen, & Ren, 2017) mainly increases the processing capabilities of edge devices via on-site processing before the data is transferred to the cloud. The edge devices form a pre-cloud layer where the computation-intensive IoT devices collect the information and perform substantial processing without depending on the cloud to send the information to the user. Edge computing is suitable for applications that incorporate IoT devices interconnected with edge devices or the cloud. In this way, the number of packets that need to be transferred to the cloud is minimized, which directly affects energy consumption and storage (Mach & Becvar, 2017). Since the integration of IoT and edge devices offers diverse benefits, in this work the former is applied to process the CT scans. These devices also help to perform remote diagnoses in distant locations where patients cannot afford treatment from a radiologist. Since COVID19 and pneumonia cause severe health risks to both elderly and younger patients, identifying them at an early stage benefits the patients. In healthcare applications, the low-powered devices are called edge devices, and a model designed for these devices needs to consume minimal computation and energy without reducing the performance of the application. The main aim of the proposed work is to develop a cloud edge computing-based application that forwards the computationally intensive operations to the cloud and sends the results to the edge devices.
The incorporation of edge computing in our proposed work helps to minimize the latency for delay-sensitive applications, and cloud computing solves the memory-related issues by offering additional storage capacity. The emerging usage of deep learning techniques for natural language processing, image processing, speech recognition, and object identification has inspired us to incorporate a deep learning classifier for the prediction of COVID19 and pneumonia (Jamshidi et al., 2020, Khemasuwan et al., 2020). Since the Deep Learning (DL) classifier does not involve manual training for feature selection and classification, it helps to minimize the time taken to classify the normal and abnormal samples. The deep learning classifier provides reliable classification results because its increased number of hidden layers performs intricate processing (Shone et al., 2018, Lane and Georgiev, 2015). This paper presents a hybrid Random Forest Deep Learning (HRFDL) classifier for COVID-19 and pneumonia diagnosis on IoT-based edge devices. The Random Forest (RF) classifier is mainly selected for its ability to handle high-dimensional data and its rapid convergence. However, the results obtained from the random forest classifier have a high false-positive rate (FPR). To overcome this problem, a multilayer perceptron (MLP) is used in this paper, which offers a high true positive rate and a low FPR. This is the main reason for hybridizing the RF and MLP algorithms using a majority voting rule to yield an improved detection accuracy. The hyperparameters of the deep learning architecture are optimized using the multi-objective Modified Heat Transfer Search (MOMHTS), which is formulated by integrating the three modes of the standard Heat Transfer Search algorithm.
To prevent the local optima trapping present in the heat transfer search (HTS) algorithm and achieve an efficient trade-off between the exploration and exploitation phases of this algorithm, this paper adds a new step called synchronous heat transfer to the conventional HTS algorithm. In this way, a MOMHTS algorithm is formulated which helps to minimize the energy and space complexity associated with the resource-constrained edge devices when deploying the HRFDL classifier. The performance of the model is evaluated on two real-world medical datasets, namely the COVID-19 lung CT scan dataset and the Chest X-ray images (Pneumonia) dataset, via different performance metrics such as sensitivity, specificity, ROC curve, accuracy, confusion matrix, and F1-score. The rest of this paper is arranged as follows. Section 2 summarizes the related literature and Section 3 elaborates on the proposed model in detail. Section 4 presents the experimental analysis carried out using different performance metrics and conventional techniques to evaluate the efficiency of the proposed methodology. Section 5 discusses the details of the proposed methodology for COVID19 outbreak prevention and Section 6 concludes this paper.

Literature review

EdgeCare, an edge computing solution built for mobile healthcare systems that employs collaborative data management, was presented by Li, Huang, Li, Yu, and Shu (2019). They incorporate decentralized and collaborative data management to support the massive volume of global healthcare data, enhance the overall system performance, and minimize the complexity associated with real-time healthcare applications. The healthcare data is processed and traded through the local authorities who manage the edge server. A Stackelberg game-based optimization algorithm provides the optimal incentive mechanism, which assists both the users and data miners in trading. Vasconcelos, Sarmento, Reboucas Filho, and de Albuquerque (2020) presented an improved edge-cloud architecture using artificial intelligence techniques for brain CT image analysis. In this work, the authors identify stroke from Computed Tomography (CT) scan results. Even though Magnetic Resonance Imaging (MRI) is preferred by most authors, most people undergo CT scans because of the lower cost and time. Applications developed for Internet of Things (IoT) devices need to support low computation and low storage costs. Hence the authors proposed an adaptive analysis of the Brain Tissue Densities (Adaptive ABD) model using edge computing devices for the excellent processing capabilities they offer. This methodology offers a low computational cost with an average execution time of 0.087 s per sample. However, the efficiency of this method is evaluated on a small CT scan dataset. Singh and Kolekar (2021) identified COVID19 from CT images using collaborative edge cloud computing. They mainly used a fine-tuned MobileNet V2 model since it is often complex to implement a deep learning model on resource-constrained devices.
To support mobile and edge devices even further, the MobileNet V2 architecture was also optimized in terms of complexity and size. The experiments were conducted on a real-world CT scan image dataset and the classification accuracy of this model was 96.40%. However, the hyperparameters of the MobileNet V2 model are not optimized. Akkaoui, Hei, and Cheng (2020) employed an EdgeMediChain framework (PMLA) which utilizes a collaborative and distributed data management framework supported by blockchain and edge computing. The main problem they try to overcome is the inability of the cloud computing architecture to process the massive data generated by body sensors. The main advantages offered by this methodology are the higher throughput and minimal execution time, which is nearly equal to 84.75% for a total of 2000 simultaneous transactions. However, this methodology does not offer real-time disease diagnosis results that would help in efficient decision making. Pustokhina et al. (2020) utilized both cloud computing and edge computing techniques to monitor the vital signs of patients on a daily basis. To achieve this objective, they presented an approach known as the Effective Training Scheme for the deep neural network (ETS-DNN) for timely data collection and processing from internet of medical things (IoMT) devices to identify the internal patterns that exist in the data. The patient's data is captured via the IoMT sensors and sent to the edge computing platform enabled with the ETS-DNN technique for processing. A Hybrid Modified Water Wave optimization algorithm is used for tuning the DNN parameters. The hybrid algorithm is formed by integrating the Modified Water Wave Optimization algorithm with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. The report generated by the ETS-DNN model is forwarded to the cloud server, which provides access to healthcare professionals.
Muhammed, Mehmood, Albeshri, and Katib (2018) developed a personalized ubiquitous cloud and edge-enabled networked healthcare system known as UbeHealth, which integrates the IoT, edge computing, deep learning, big data, and High-Performance Computing (HPC). They mainly focus on optimizing the Quality of Service (QoS) parameters, energy consumption, and bandwidth. The improved QoS is offered to users through the cloudlet, mobile, network, and cloud layers. Mohammedqasim and Ata (2022) enhanced the accuracy of COVID19 diagnosis via an optimized deep learning architecture. They optimize the deep learning architecture via grid search to overcome the unbalanced data problem, and in this way they minimize the processing time. Bhatia, Manocha, Ahanger, and Alqahtani (2022) handled the COVID19 outbreak via an artificial intelligence technique. They mainly identify the COVID19 outbreak with the help of wearable sensors and the Radio Frequency Identification Device (RFID). The infection degree and disease outbreak are analyzed with the help of the J48 decision tree and Temporal Network Analysis. Ghosh and Ghosh (2022) identified COVID and pneumonia via a deep residual neural network-based chest X-ray image enhancement (ENResNet) approach. They built the ENResNet using eight residual modules, and the model is trained using residual images. The residual images are generated via batch normalization. Ding, Li, Li, Wang, and Zhang (2019) utilized a Tinier YOLO algorithm for diagnosing upper gastrointestinal disease in real time via a cloud-edge collaborative framework. This work improved the sensitivity and specificity of upper gastrointestinal screening. They integrate the cloud and edge platforms to provide real-time lesion identification in the upper gastrointestinal tract.
Ortiz, Zouai, Kazar, Garcia-de-Prado, and Boubeta-Puig (2022) presented a study that analyzes respiratory illness by integrating three tiers (fog, cloud, and IoT) via a two-way communication protocol. The two-way communication improves real-time decision-making performance by acquiring the proper contextual and location information.

Research gap

COVID19 is an infectious disease that has spread rapidly worldwide and causes serious health risks. This creates a need for early health monitoring and prevention. State-of-the-art disease risk prediction systems are capable of identifying diseases only up to a certain extent. The accuracy of their results is mainly affected by the diversity and incompleteness of the data collected from IoT sensors. The restricted resource capabilities of IoT devices also limit the timely processing of large amounts of data. Existing systems face major challenges such as low accuracy and F1-score, deployment on imbalanced datasets, limited dataset availability, the use of pre-trained architectures with a fixed input size that is not applicable in real time, overfitting, a focus on binary classification, and a minimal number of training samples that results in increased false-positive rates. To address these challenges, this paper presents a novel MOMHTS optimized HRFDL classifier for monitoring COVID19 and pneumonia via edge computing. Section 3 presents the disease monitoring in detail along with the improvements made to the existing artificial intelligence models by integrating edge computing technology.

Proposed methodology

State-of-the-art works provide reliable medical decision support systems to aid efficient decision making by healthcare professionals. Several studies have been conducted to make machine learning algorithms support edge devices and minimize their latencies while improving security. To support edge devices of different sizes and complexity, the machine learning algorithms need to be optimized. In the field of computer science, deep learning has shown revolutionary improvement due to its efficiency in accessing different datasets, powerful parallelization, and specialized hardware development. However, the processing power of edge devices is too low to deploy deep learning architectures. The proposed architecture is mainly implemented to support end devices with power constraints and perform computationally efficient operations. Diseases such as COVID-19 and pneumonia are monitored in our proposed framework, shown in Fig. 1, using the MOMHTS optimized HRFDL architecture. The proposed three-tier architecture comprises three layers: data generation, edge, and cloud. The IoT layer is mainly formed of sensors, actuators, and information exchange devices. These are the IoT nodes that acquire the data from the physical world. Based on the collected data, these devices provide the results (COVID 19/Pneumonia positive) to the users after processing the values obtained. The storage and computing devices are in the cloud layer and the intermediary devices are in the edge layer. The sensor node needs a small but powerful processing unit to connect efficiently with the other layers. The data acquisition layer mainly collects the crucial physiological symptoms, personal information, the users' contact numbers, etc. Every user initially obtains a unique identifier (UID) by entering their phone number and other personal details.
The primary symptoms such as body temperature and blood pressure are captured via the Wireless Body Area Networks (WBAN) and sent to the user's mobile device via Bluetooth. The data obtained is sent to the edge server via the WiFi or 4G/5G network. The advanced symptoms are obtained via X-ray and CT scan images. After the symptoms of the users are identified and updated, the data is immediately transferred to the cloud.
Fig. 1. Outline of the proposed framework.

The disease prediction and monitoring framework are deployed in the bottom layer. Initially, the data is collected and then sent to the edge layer which consists of edge devices and the proposed methodology is also installed in this layer. If any individual is diagnosed positive (COVID19 or pneumonia), then the edge devices alert the concerned authorities (Patient, guardian, medical institution, or hospital) with an alert message. The progression of the disease is analyzed by the cloud layer based on the patient's ID and their location derived. This information helps to prevent the progression of the disease in a certain area. The detailed description of each layer is presented below:

Layer-1: Data generation

The data regarding the illness (COVID19 and pneumonia) is acquired in this layer. To diagnose pneumonia and COVID19, we retrieved the CT scan results of the patients. This data also helps the government and NGO agencies to provide medical aid, resources, and services to the patient. The acquired data is transferred to the edge layer for further processing, where different edge devices are present, including the patient’s edge device.

Layer-2: Edge layer

The edge layer is mainly used for processing and classifying the acquired CT scan images. It acts as an intermediary between the cloud and the physical layer, and it minimizes network traffic and latency. The proposed MOMHTS optimized HRFDL model is implemented in the edge layer on resource-constrained edge devices. The resource-constrained devices can also act as IoT devices. To increase the computational efficiency, different hardware accelerators are also attached to the end devices. If any abnormalities exist in the patient's CT scan, then the diagnosed disease is sent as a message to the user. To further notify the healthcare institutions and officials, the patient ID along with the location is sent to the cloud for further processing.

Layer-3: Cloud layer

The cloud layer performs various complex tasks to overcome the limitations associated with edge devices, such as limited data storage and low computational power. The cloud layer performs centralized operations over multiple Virtual Machines (VM). The data stored in the centralized data warehouse can be accessed by government authorities and authorized medical practitioners. The cloud layer normally performs the following operations. First, it supports resource optimization. The increase in the number of COVID19 and pneumonia patients leads to a shortage of medical equipment. To identify and treat these diseases, different types of medical equipment are needed, such as testing devices, respiratory devices, ventilators, and oxygen cylinders. Since ventilators and respiratory devices are crucial in treating the disease, appropriate equipment handling is necessary. Hence a tradeoff between the supply of and demand for this equipment needs to be achieved. The places that face an increase in COVID19 patients need a greater supply of this equipment. Hence, with the help of the cloud, one can track a disease outbreak and the number of people affected, and offer resource optimization. Second, the cloud maintains records of the number of infected patients, deaths, active cases, and recoveries. Since regular updates about the disease outbreak are needed, this information is continuously monitored. Based on the information, necessary precautions are taken to control the virus spread and treat the patients. Third, since the infection rate and disease spread are high, it is necessary to control the disease outbreak and identify the locations of potential danger. Both the local information and the newly identified disease cases are uploaded to the cloud periodically. Since the information is accessible to the public, both the public and the civic bodies are alerted in this way. Finally, the cloud supports model training. After the emergence of the novel Coronavirus, there was only limited data available to train the deep learning model.
Since training on edge devices is very complex, the proposed model is uploaded to the cloud and the resulting updates are sent to the edge devices. Over-the-air updates are essential for every health-related application, particularly since we have included pneumonia. The edge layer identifies the diseases as shown in Algorithm-1. Algorithm-1: Disease diagnosis at the edge layer.
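The edge-layer behaviour described above (classify the scan, message the user on a positive result, and forward the patient ID and location to the cloud) can be sketched roughly as follows. This is a minimal illustration and not the paper's Algorithm-1: the `EdgeNode` class, its fields, the label strings, and `toy_classifier` are all hypothetical names introduced here, and the classifier is a stand-in for the deployed HRFDL model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class EdgeNode:
    # Hypothetical stand-in for the deployed classifier: maps a scan to a label.
    classify: Callable[[object], str]
    # Messages that would be sent to the user or other concerned parties.
    alerts: List[str] = field(default_factory=list)
    # (patient_id, location, label) records waiting to be uploaded to the cloud.
    cloud_queue: List[Tuple[str, str, str]] = field(default_factory=list)

    def diagnose(self, patient_id: str, location: str, scan: object) -> str:
        """Classify one scan; on an abnormal result, alert and queue for the cloud."""
        label = self.classify(scan)
        if label != "normal":
            self.alerts.append(f"Patient {patient_id}: {label} detected")
            self.cloud_queue.append((patient_id, location, label))
        return label


def toy_classifier(scan: object) -> str:
    # Assumption for illustration only: a real deployment would run the trained model.
    return "covid19" if scan == "opaque" else "normal"


node = EdgeNode(classify=toy_classifier)
first = node.diagnose("UID-001", "ward-3", "opaque")
second = node.diagnose("UID-002", "ward-5", "clear")
```

Only abnormal results generate an alert and a cloud record in this sketch, which mirrors the statement that the patient ID and location are sent to the cloud for further processing after a positive diagnosis.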

Formulation of MOMHTS algorithm

Heat transfer search

The HTS algorithm is formulated using the basic principles of heat transfer and thermodynamics (Kumar, Tejani, Pholdee, & Bureerat, 2021). The algorithm assumes that a thermal system can achieve thermal equilibrium through different modes of heat transfer (conduction, convection, and radiation). These heat transfer modes are also the search procedures in the HTS optimizer. Each mode has an equal selection probability, governed by a random number ρ drawn from the range [0, 1]. For the conduction mode, the value of ρ is in the range 0–0.333, and for the radiation mode it is in the range 0.333–0.6666. The range of the convection mode is 0.6666–1. Based on the ρ value, the solutions are updated according to the selected heat transfer mode in each iteration.
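The mode selection described above can be illustrated with a short sketch. It simply maps ρ to a mode using the three ranges given in the text; the function name `select_mode` and the handling of the exact boundary values are assumptions made for illustration.

```python
import random


def select_mode(rho: float) -> str:
    """Map a random number rho in [0, 1] to one of the three HTS search modes."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if rho <= 0.3333:
        return "conduction"   # 0 to 0.333
    if rho <= 0.6666:
        return "radiation"    # 0.333 to 0.6666
    return "convection"       # 0.6666 to 1


# Draw many values of rho to see that the modes are selected about equally often.
rng = random.Random(0)
counts = {"conduction": 0, "radiation": 0, "convection": 0}
for _ in range(3000):
    counts[select_mode(rng.random())] += 1
```

With a uniform ρ, each mode is chosen roughly one third of the time, which is the equal selection probability the text refers to.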

Thermal conduction mode

Conduction is the process in which molecules in contact with each other exchange heat from higher energy levels to lower energy levels. The conduction mode in the HTS algorithm is divided into two segments, Segment 1 and Segment 2, each of which applies under its own condition. In the update, the molecules being updated are indexed by x = 1, 2, 3, …, n; the randomly selected solution is indexed by y, with x ≠ y and y ∊ (1, 2, 3, …, n); the design variable index µ is selected randomly, with µ ∊ (1, 2, …, m); iter is the current iteration, which is bounded by the total number of iterations; and the update also involves the temperature change of the molecules and the conductance parameters. CDF is the conduction factor and is set to 2 to maintain the balance between intensification and diversification. Only one design variable is modified in each iteration.
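As a rough illustration of the conduction mode, the sketch below assumes the commonly published form of the HTS conduction rule for a minimization problem; it is an assumption and not a transcription of this paper's equations. The worse of two solutions copies one design variable from the better one, reduced by a conductance term that switches form at `max_iter / CDF`. The function name, argument names, and the way R and r are drawn are hypothetical.

```python
import random

CDF = 2  # conduction factor, set to 2 in the text


def conduction_update(pop, fitness, x, y, mu, it, max_iter, rng):
    """Return (index, candidate) for the worse of solutions x and y.

    Assumed rule (minimization): the worse solution takes design variable mu
    from the better solution, scaled down by a conductance coefficient.
    """
    R = rng.uniform(0.0, 0.3333)  # assumed: drawn from the conduction range
    r = rng.random()              # assumed: uniform random number in [0, 1)
    # Two regimes split at max_iter / CDF (assumed to match the two segments).
    coeff = R ** 2 if it <= max_iter / CDF else r
    better, worse = (y, x) if fitness[x] > fitness[y] else (x, y)
    candidate = list(pop[worse])
    # Only one design variable is modified, as stated in the text.
    candidate[mu] = pop[better][mu] - coeff * pop[better][mu]
    return worse, candidate


# Toy usage: two solutions of a two-variable problem, lower fitness is better.
pop = [[4.0, 4.0], [1.0, 1.0]]
fitness = [32.0, 2.0]
worse, cand = conduction_update(pop, fitness, 0, 1, 0, 1, 100, random.Random(1))
```

In the early regime the coefficient is at most about 0.11, so the candidate stays close to the better solution's value for the chosen variable.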

Thermal convection mode

The convection mode removes the difference in energy level between the system and its surroundings through convective heat transfer. The system molecules and the surroundings interact with each other to establish thermal equilibrium. The equations for the thermal convection mode are given in eq (2) and eq (3). In these equations, the updated molecules are indexed by x = 1, 2, 3, …, n; µ ∊ (1, 2, …, m) is the design variable index; iter is the current iteration, which is bounded by the maximum number of iterations; Prob is the probability variable, with Prob ∈ [0.6666, 1]; an arbitrary number varying from 0 to 1 is also used; the remaining coefficients are the convection parameters of Newton's law of cooling; the surrounding temperature is considered a constant reference, while the mean temperature of the system changes during the convection process; TCF represents the tradeoff between exploration and exploitation in the convection phase; and CF is set to 10.
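The convection mode can be sketched in the same spirit. The code below assumes the commonly published HTS convection rule, in which the best solution plays the role of the surrounding temperature and the population mean plays the role of the system's mean temperature; this is an assumption and not a reproduction of eq (2) and eq (3). The names `convection_update` and `COF` are hypothetical, with `COF` standing for the factor the text sets to 10.

```python
import random

COF = 10  # convection factor; the text sets this factor to 10


def convection_update(pop, fitness, it, max_iter, rng):
    """Return new candidates for every solution under the assumed convection rule."""
    n_vars = len(pop[0])
    best = min(range(len(pop)), key=lambda i: fitness[i])
    surrounding = pop[best]  # surrounding temperature, held as the reference
    mean = [sum(s[m] for s in pop) / len(pop) for m in range(n_vars)]  # system mean
    R = rng.uniform(0.6666, 1.0)  # assumed: drawn from the convection range
    r = rng.random()
    # TCF trades off exploration and exploitation; it switches form at max_iter / COF.
    tcf = abs(R - r) if it <= max_iter / COF else round(1 + r)
    return [
        [s[m] + R * (surrounding[m] - mean[m] * tcf) for m in range(n_vars)]
        for s in pop
    ]


# Toy usage: the first solution is the best (lowest fitness).
pop = [[0.0, 0.0], [2.0, 2.0]]
fitness = [0.0, 8.0]
new_pop = convection_update(pop, fitness, 50, 100, random.Random(0))
```

Every solution is shifted by the same vector here, so the population moves as a whole relative to the best solution and the mean.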

Thermal radiation mode

At high temperatures, heat transfer occurs in the form of electromagnetic waves released as radiation. The system molecules and the surroundings interact with each other to establish thermal equilibrium. This process is governed by the Stefan–Boltzmann law. The equations for the thermal radiation mode are given in eq (5) and eq (6). In these equations, the updated molecules are indexed by x = 1, 2, 3, …, n; µ ∈ (1, 2, …, m) is the design variable index; the randomly selected molecule is indexed by y, with x ≠ y and y ∈ (1, 2, 3, …, n); iter is the current iteration; and Prob ∈ [0.3333, 0.6666] is the probability variable. Randomly selected numbers in the range 0 to 1 serve as the radiation parameters of the Stefan–Boltzmann equation, and the update also involves the temperature difference between the surroundings and the molecules of the system. To maintain the balance between diversification and intensification, the radiation factor (RDF) is set to 2. All design variables of a solution are changed during the radiation process.
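A comparable sketch for the radiation mode is given below. It assumes the commonly published HTS radiation rule for a minimization problem, in which a solution moves toward a randomly chosen better solution or away from a worse one; this is an assumption rather than a reproduction of eq (5) and eq (6), and the function and argument names are hypothetical.

```python
import random

RDF = 2  # radiation factor, set to 2 in the text


def radiation_update(pop, fitness, x, y, it, max_iter, rng):
    """Return a new candidate for solution x, using randomly selected solution y."""
    R = rng.uniform(0.3333, 0.6666)  # assumed: drawn from the radiation range
    r = rng.random()
    # The step coefficient switches form at max_iter / RDF.
    coeff = R if it <= max_iter / RDF else r
    if fitness[x] > fitness[y]:
        # y is better: move x toward y.
        return [xv + coeff * (yv - xv) for xv, yv in zip(pop[x], pop[y])]
    # y is worse (or equal): move x away from y.
    return [xv + coeff * (xv - yv) for xv, yv in zip(pop[x], pop[y])]


# Toy usage: solution 0 is worse than solution 1, so it moves toward solution 1.
pop = [[4.0], [1.0]]
fitness = [16.0, 1.0]
cand = radiation_update(pop, fitness, 0, 1, 1, 100, random.Random(3))
```

Unlike the conduction sketch, every design variable of the solution is updated, in line with the statement that the variables are changed during the radiation process.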

Multi-objective modified heat transfer search (MOMHTS) optimizer

A metaheuristic optimizer is mainly applied to generate new solutions that are better than the previous ones or to search for a global optimum in a feasible search space. Another requirement of a metaheuristic algorithm is that the optimizer should prevent local optima trapping. If these features are combined, excellent performance can be obtained from the algorithm. To achieve this objective, one needs a tradeoff between two phases, namely the exploration and exploitation phases. Exploration is also known as diversification and exploitation as intensification. Based on the convergence behavior of some popular metaheuristic algorithms, exploitation improves the convergence rate whereas exploration reduces it. Increased exploration helps to identify the global optimum but with less efficiency, and in some cases increased exploitation leads to premature convergence. In these scenarios, an accurate tradeoff between the exploration and exploitation phases needs to be achieved, which is an unresolved issue in optimization. In the Heat Transfer Search (HTS) algorithm, the system molecules interact with their adjacent molecules to minimize the thermal imbalance and transfer heat. The instantaneous energy transfer is achieved using any one of the three HTS modes. For polynomial functions, the radiation process tends to be more effective, whereas the convection and conduction processes are effective when solving non-linear functions. The HTS model switches between modes continuously to speed up the thermal balance. To overcome the limitations of the HTS algorithm, a multi-objective Modified Heat Transfer Search (MOMHTS) is presented, which integrates the three modes of the standard HTS algorithm.
The HTS algorithm is prone to premature convergence because each new solution is created mainly from the randomly selected solution, the best solution, and the average solution, which lie in close proximity to one another. To achieve a tradeoff between the exploration and exploitation phases, a new step known as synchronous heat transfer is included in the standard HTS algorithm.

Synchronous heat transfer

In the HTS algorithm, energy is exchanged in the form of heat to achieve thermal equilibrium between the system and adjacent molecules. Since the three heat transfer modes are integrated into the model, each of them is equally likely to occur in every generation. The synchronous heat transfer speeds up the search process, and the possibility of heat transfer mainly relies on the probability factors (P) of conduction, convection, and radiation. Each of these probability factors ranges from 0 to 1. When the conduction mode is activated during heat transfer, the first one-third of the molecules in the population are updated. The second and third one-thirds of the solutions are updated during the radiation and convection modes, respectively. To obtain a synchronized form of optimization, the conduction, convection, and radiation modes are integrated to support nonlinear, linear, and polynomial functions. Hence each of the three probability factors is assigned a value of 0.333 to achieve an equal probability. The three modes are implemented simultaneously by substituting the probability variables ρ1 (conduction probability value), ρ2 (radiation probability value), and ρ3 (convection probability value). The outline of the MOMHTS algorithm is presented in Fig. 2.
Fig. 2. Outline of the MOMHTS model.
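The synchronous heat transfer step can be sketched as a single pass over the population in which the first third is updated by conduction, the second third by radiation, and the last third by convection. The sketch below is illustrative only: the `updaters` mapping holds placeholder update functions rather than the real mode equations, the greedy acceptance of improved candidates is an assumption, and all names are hypothetical.

```python
from typing import Callable, Dict, List

Solution = List[float]


def assign_modes(pop_size: int) -> List[str]:
    """Assign conduction, radiation, and convection to successive thirds."""
    third = pop_size // 3
    modes = []
    for i in range(pop_size):
        if i < third:
            modes.append("conduction")
        elif i < 2 * third:
            modes.append("radiation")
        else:
            modes.append("convection")  # also takes any leftover solutions
    return modes


def synchronous_step(
    pop: List[Solution],
    objective: Callable[[Solution], float],
    updaters: Dict[str, Callable[[Solution, List[Solution]], Solution]],
) -> List[Solution]:
    """Update every solution in one generation, each with its assigned mode."""
    new_pop = []
    for sol, mode in zip(pop, assign_modes(len(pop))):
        cand = updaters[mode](sol, pop)
        # Assumed greedy acceptance: keep the candidate only if it improves.
        new_pop.append(cand if objective(cand) < objective(sol) else sol)
    return new_pop


def sphere(sol: Solution) -> float:
    return sum(v * v for v in sol)


# Placeholder updaters for illustration; real ones would apply the mode equations.
shrink = lambda sol, pop: [0.5 * v for v in sol]
updaters = {"conduction": shrink, "radiation": shrink, "convection": shrink}

pop = [[float(i), float(i + 1)] for i in range(6)]
new_pop = synchronous_step(pop, sphere, updaters)
```

Because all three modes act in the same generation, no mode is skipped in any iteration, which is the property the synchronous step is meant to provide.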

HRFDL architecture formation

The disease detection rate on the CT image dataset is improved using the hybrid model, which comprises three architectures. Initially, feature extraction is performed, and in the next stage, classification is conducted using the optimized classifiers (Yoo, Kim, Kim, & Kang, 2021). To improve the detection performance a step further, a voting methodology is used in the last step. The classifiers with high true positive rates are selected to enhance the voting effects. The disease is mainly identified by the standard machine learning model with a score in the range 0 to 1. Detection is performed using the random forest and multilayer perceptron (MLP) algorithms. In the feature extraction stage, the features are extracted from the CT scan images, and the classification is done using deep learning and the random forest algorithm. Finally, a rule-based majority voting scheme is deployed to obtain the final decision values. The majority voting is mainly included to identify the disease accurately and to ignore a classifier that provides erroneous output. For instance, when a sample is hard to tell apart and two models classify it as normal while another model correctly classifies it as abnormal, the simple voting rule still identifies the sample as benign. To overcome this issue, a rule-based majority voting scheme is proposed in this paper, which utilizes the benefits of the minority classification to minimize the false alarm rate. The majority voting scheme mainly assigns a higher probability value to the classifier that identifies the normal samples well. To find a reliable classifier, a priority rule is added for identifying the normal samples. A simple majority voting rule is applied to the samples not classified by the priority rule. Based on the probabilistic density, a decision-making algorithm is provided to identify the abnormal samples.
In real time, it is often difficult to identify the abnormal samples accurately, and in these scenarios a single machine learning classifier is not adequate to offer a high true-positive rate and a low false-positive rate at the same time. Hence, to improve the detection rate of the machine learning model, we add a deep learning classifier. To improve the detection rate of the deep learning classifier, we use the MOMHTS algorithm. The initial and optimized deep learning models are presented in Table 1.
Table 1

Optimized DL values using MOMHTS algorithm.

| Parameters | Initial Values | Optimized Values |
| --- | --- | --- |
| Number of nodes | 250 | 205 |
| Epoch | 512 | 364 |
| Batch size | 2048 | Greater than 200 |
| Hidden layer | 16 | 12 |
| Optimizer | Stochastic gradient descent | Root Mean Squared Propagation and Adamax |
| Dropout rate | 0.5 | |
The parameters used for optimization are the number of nodes, epochs, batch size, hidden layers, and dropout. The optimal parameters are found by varying the values from low to high. The optimal values identified were applied sequentially to every hyperparameter. The random weights, cross-entropy loss, and activation function were set by default. Random uniform initialization in the range [-0.05, 0.05] is used for the weights of the first layer. In the second layer, the Xavier uniform initializer is used, which draws random samples in the range [-limit, limit]. The bias value is initialized to zero, and the rectified linear unit (ReLU), a non-linear function, is the activation function used in the hidden layers. The softmax function is the activation used in the output layer.
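The initialization and activation choices described above can be illustrated with a minimal Python sketch. It is not the author's Matlab implementation. The text does not define `limit`, so the sketch assumes the usual Xavier (Glorot) uniform definition, limit = sqrt(6 / (fan_in + fan_out)). The function names and the layer sizes in the usage lines are hypothetical.

```python
import math
import random


def random_uniform_init(fan_in, fan_out, low=-0.05, high=0.05, rng=None):
    """First-layer weights drawn uniformly from [low, high]."""
    rng = rng or random.Random()
    return [[rng.uniform(low, high) for _ in range(fan_out)] for _ in range(fan_in)]


def xavier_uniform_init(fan_in, fan_out, rng=None):
    """Weights drawn uniformly from [-limit, limit].

    Assumption: limit follows the standard Glorot/Xavier formula,
    which the paper does not spell out.
    """
    rng = rng or random.Random()
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)] for _ in range(fan_in)]


def relu(values):
    """Rectified linear unit applied element-wise (hidden layers)."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Softmax over a list of logits (output layer), shifted for stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical layer sizes, chosen only for illustration.
first_layer = random_uniform_init(4, 3, rng=random.Random(0))
second_layer = xavier_uniform_init(3, 2, rng=random.Random(0))
biases = [0.0, 0.0]  # bias values start at zero, as stated in the text
```

The stability shift in `softmax` does not change the result; it only avoids overflow for large logits.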

Decision-making algorithm

To combine the MLP and RF models, we use the majority rule for decision making. In our results, the random forest achieved higher classification results but with a high false-positive rate for the normal samples. The false-positive rate is minimized via the additional voting rules shown in Table 2. The results obtained by combining the two rules are presented in Table 3. The hybrid classifier used in this paper offers a low false-positive rate and a high true-positive rate. On its own, the MOMHTS algorithm-optimized deep learning classifier offers either a high true-positive rate or a low false-positive rate. The rule-based majority vote technique overcomes the high false-positive issue.
Table 2

Decision-making results.

| Rule Number | Output |
| --- | --- |
| 1 | If Results of Random Forest <= 0.5 then output “normal” |
| 2 | If Results of Random Forest < 0.5 and MLP < 0.5 then output “normal” |
Table 3

Integration of different rules.

| Index | Rule applied |
| --- | --- |
| 1 | Majority Voting |
| 2 | Majority Voting + 1 |
| 3 | Majority Voting + 2 |
| 4 | Majority Voting + 1,2 |
The rule-based majority model follows both rule policies and also applies a majority rule. Optimizing every individual classifier does not imply that the ensemble of classifiers is optimized. Our model offers the best performance when both rules shown in Table 2 are applied. When majority voting is applied, the proposed model offers the highest true-positive results for more than 90% of cases. The added rules shown in Table 3 return low false-positive results. In our results, the random forest classifier offered higher classification efficiency for the abnormal samples, and the MLP classifier offered higher classification efficiency for the normal samples. The optimized decision-making algorithm is formulated as shown in Algorithm 2. The proposed framework is presented in Fig. 3.
Fig. 3

Proposed MOMHTS optimized.

Algorithm 2: Decision-making algorithm formulated using rules shown in Table 2.
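The rule combinations in Table 2 and Table 3 can be sketched in Python as follows. This is an illustrative reading, not the author's implementation. The paper does not say how a two-classifier "majority vote" resolves a disagreement, so the sketch assumes each classifier votes "abnormal" when its score exceeds 0.5 and breaks ties with the mean score. The function names are hypothetical.

```python
def classifier_vote(score, threshold=0.5):
    """Single-classifier vote on a score in [0, 1]."""
    return "abnormal" if score > threshold else "normal"


def rule_based_vote(rf_score, mlp_score, rules=(1, 2)):
    """Apply the selected priority rules from Table 2, then fall back to voting.

    rules=()      corresponds to index 1 in Table 3 (majority voting only)
    rules=(1,)    corresponds to index 2
    rules=(2,)    corresponds to index 3
    rules=(1, 2)  corresponds to index 4
    """
    # Rule 1: the random forest alone may declare the sample normal.
    if 1 in rules and rf_score <= 0.5:
        return "normal"
    # Rule 2: both classifiers score below 0.5.
    if 2 in rules and rf_score < 0.5 and mlp_score < 0.5:
        return "normal"
    votes = [classifier_vote(rf_score), classifier_vote(mlp_score)]
    if votes[0] == votes[1]:
        return votes[0]
    # Assumption: ties between the two classifiers are broken by the mean score.
    mean_score = (rf_score + mlp_score) / 2
    return "abnormal" if mean_score > 0.5 else "normal"
```

With `rules=(1,)`, a sample scored 0.3 by the random forest and 0.9 by the MLP is returned as normal, whereas plain voting (`rules=()`) under the tie-break assumption would return abnormal. This shows how a priority rule changes the outcome for samples the classifiers disagree on.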

Web service based implementation

The proposed model is implemented as a web service in the cloud environment. A web service is a task completed by the cloud, and it is subject to different non-functional requirements such as cost, security, etc. In this work we focus on the cost of the web service. Each service provider, such as Amazon EC2 and Google applications, has its own processing requirements, and the cost changes with the consumer's usage per hour, per GB, per MB, etc. The service cost of the website is represented as shown below: In the above equation, mainly represents the service cost applied by the cloud to the consumer to finish their task in a sequential manner. The MOMHTS algorithm is used to optimize the overall service cost by allocating the optimal budget and number of cloud service providers. Since an increase in the number of cloud providers increases the communication cost, the number of cloud service providers selected is also optimized.

Experimental results and analysis

The experiments were conducted on an Inspiron 24–5000 desktop equipped with Windows 10 and an 11th Generation Intel® Core™ i3-1115G4 processor (6 MB cache, up to 4.1 GHz). The proposed hybrid model was implemented in Matlab. A detailed description of the experiments conducted, the datasets used, etc. is provided in this section.

Dataset description

The COVID19 lung CT scan dataset (LuisBlanche, 2020) and Chest X-ray images (Pneumonia) (Mooney, 2018) are the datasets used in our proposed methodology. The CT scan images in the COVID19 lung CT scan dataset are obtained from different COVID19-related articles in sources such as JAMA, Lancet, bioRxiv, medRxiv, etc. The abnormalities are identified based on the figure captions. A total of 349 COVID19 cases and 397 non-COVID19 cases were present in the dataset. The pneumonia dataset consists of 5,232 chest X-ray images, of which 1,349 samples belong to the normal class and the remaining 3,883 samples belong to the abnormal class. In both datasets, 70% of the images were used for training and the remaining 30% were used for testing. The training process is stopped automatically when no improvement in the loss and accuracy is noticed after 100 epochs. Both datasets are publicly available. Fig. 4 and Fig. 5 present samples from the COVID19 lung CT scan dataset and the Chest X-ray images (Pneumonia) dataset, respectively. The images obtained were of different sizes, and they were resized to a resolution of 224 × 224 × 3 before being given as input for training.
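The stopping criterion described above resembles patience-based early stopping. The following Python sketch illustrates that idea under simplifying assumptions: it monitors only a per-epoch loss (the paper monitors both loss and accuracy), and the function name and loss values are hypothetical.

```python
def epoch_of_early_stop(losses, patience=100):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the loss has failed to improve on its best value
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Toy loss curve with a short patience so the behaviour is easy to follow.
toy_losses = [1.0, 0.8, 0.9, 0.85, 0.84]
stop_epoch = epoch_of_early_stop(toy_losses, patience=3)
```

In the toy curve the best loss (0.8) occurs at epoch 2, so with a patience of 3 the stop is triggered at epoch 5. With the 100-epoch patience stated in the text, this short curve would never trigger a stop.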
Fig. 4

Samples images from the COVID19 lung CT scan dataset (a)-(b) Normal CT scan results, and (c)-(d) Abnormal CT scan results.

Fig. 5

Samples images from the Chest X-ray images (Pneumonia) dataset (a)-(b) Normal results, and (c)-(d) Abnormal results.


Performance evaluation metrics

The performance of the classifier is evaluated using performance metrics such as accuracy, sensitivity, specificity, precision, and F-score. The main aim of the classifiers is to minimize the number of false-positive and false-negative results during classification. The diagnosis of a novel disease via CT scans and X-rays is very important to society. A brief description of the performance metrics used is presented in this section.
Sensitivity (S1): The true positive rate of the classifier, which evaluates how well the model predicts the true positive results.
Specificity (S2): The true negative rate, which represents the ability of the classifier to identify the negative samples correctly.
Precision (P): The positive prediction capability of the model, also known as the Positive Predictive Value (PPV).
F1-Score: A tradeoff between precision and recall.
Accuracy: The efficiency of the classifier in predicting the samples correctly.
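For reference, the metrics described above have the following standard definitions. The symbols TP, TN, FP, and FN (true positives, true negatives, false positives, false negatives) are introduced here for illustration and are not defined in the text. Recall is the same quantity as sensitivity.

```latex
\begin{align}
S_1 &= \frac{TP}{TP + FN} \\
S_2 &= \frac{TN}{TN + FP} \\
P &= \frac{TP}{TP + FP} \\
F_1 &= \frac{2 \, P \, S_1}{P + S_1} \\
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}
\end{align}
```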

Results

The latency of the standard HRFDL algorithm and the optimized HRFDL algorithm for the disease diagnosis healthcare application is presented in Fig. 6. The results show that the latency increases with the number of patients. The standard HRFDL already offers low latency, and the optimized HRFDL algorithm improves on it. The proposed optimized HRFDL algorithm has a latency of 0.07 s, which is low compared to the standard HRFDL algorithm, which takes more than 0.08 s. From the results obtained we can conclude that the proposed model shows improved performance in terms of latency.
Fig. 6

Comparison in terms of latency.

The proposed MOMHTS algorithm-optimized HRFDL model is developed to minimize the diagnosis time taken on edge devices. The time is computed for classifying the images in the COVID19 lung CT scan dataset. The results are shown in Fig. 7, and they indicate that the proposed model outperforms the existing Adaptive ABD (Vasconcelos et al., 2020), ETS-DNN (Pustokhina et al., 2020), PMLA (Akkaoui et al., 2020), MobileNetv2 architecture (Singh & Kolekar, 2021), and UbeHealth (Muhammed et al., 2018). Based on the results, the proposed model is four times faster than the existing techniques.
Fig. 7

Comparative analysis using time complexity.

The results of the proposed optimized HRFDL algorithm are computed using the F1 measure and the results obtained are plotted in Fig. 8.
Fig. 8

Comparative analysis using F-measure.

Adaptive ABD (Vasconcelos et al., 2020), ETS-DNN (Pustokhina et al., 2020), PMLA (Akkaoui et al., 2020), MobileNetv2 architecture (Singh & Kolekar, 2021), and UbeHealth (Muhammed et al., 2018) are the state-of-art techniques taken for comparison. The results obtained by the proposed model show its efficiency in classifying the normal and abnormal results accurately. An F-score of 99% is accomplished by our proposed methodology. The UbeHealth and MobileNetV2 architectures obtained the next-highest F-measure values, but these models are not very effective when it comes to edge device deployment. The Adaptive ABD and ETS-DNN obtain the lowest F-measure of all. The specificity values of the proposed MOMHTS algorithm-optimized HRFDL classifier are compared with the state-of-art classifiers and the results obtained are shown in Fig. 9. The lowest specificity score of 94% is obtained by the Adaptive ABD model, since its efficiency decreases for large datasets. For our proposed model, the specificity score improves as the number of patients increases, and it achieved a specificity score of 99% for a total of 1500 patients.
Fig. 9

Comparative analysis using Specificity.

The overall performance of the proposed model on the two datasets in terms of accuracy and sensitivity is presented in Table 4. The proposed model is compared with existing techniques such as Adaptive ABD (Vasconcelos et al., 2020), ETS-DNN (Pustokhina et al., 2020), PMLA (Akkaoui et al., 2020), MobileNetv2 architecture (Singh & Kolekar, 2021), and UbeHealth (Muhammed et al., 2018) using the COVID19 lung CT scan dataset. As shown in Table 4, the proposed model achieves the highest accuracy and sensitivity values when compared to the existing techniques. The results shown in Table 5 indicate that the proposed model is capable of identifying the normal and abnormal classes with high accuracy and sensitivity values. The accuracy of the existing techniques is often low due to the lack of parameter optimization.
Table 4

Performance evaluation using accuracy and sensitivity.

| Technique | Accuracy (%) | Sensitivity (%) |
| --- | --- | --- |
| Adaptive ABD (Vasconcelos et al., 2020) | 92 | 94 |
| ETS-DNN (Pustokhina et al., 2020) | 94 | 95 |
| MobileNetv2 architecture (Singh & Kolekar, 2021) | 96.40 | 96 |
| UbeHealth (Muhammed et al., 2018) | 89 | 86 |
| Proposed MOMHTS algorithm-optimized HRFDL classifier | 99 | 99 |
Table 5

Prediction results obtained for the proposed classifier for different training samples.

| Input images | Actual class | Results obtained by our proposed model | Predicted class |
| --- | --- | --- | --- |
| (image) | COVID-19 (Absence) | 0.004541 | COVID-19 (Absence) |
| (image) | COVID-19 (Presence) | 0.99878 | COVID-19 (Presence) |
| (image) | COVID-19 (Presence) | 0.99454 | COVID-19 (Presence) |
| (image) | COVID-19 (Absence) | 0.000278 | COVID-19 (Absence) |
| (image) | Pneumonia (Absence) | 0.0005447 | Pneumonia (Absence) |
| (image) | Pneumonia (Presence) | 0.99458 | Pneumonia (Presence) |
| (image) | Pneumonia (Presence) | 0.99754 | Pneumonia (Presence) |
The Receiver Operating Characteristic (ROC) curves obtained for the Chest X-ray images and the COVID19 lung CT scan dataset are presented in Fig. 10 (a) and (b). The false positive rate is plotted on the X-axis and the true positive rate is plotted on the Y-axis. A reliable classifier minimizes the false positive rate and maximizes the true positive rate as far as possible. For the two datasets, the ROC curve of the optimized proposed model is near the top left corner, which shows significant performance.
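As an illustration of how an ROC curve is built, the Python sketch below sweeps a decision threshold over classifier scores and records the false positive rate and true positive rate at each step. It is a minimal example, not the procedure used to produce Fig. 10. The four scores are the COVID-19 rows of Table 5, with "Presence" encoded as 1 and "Absence" as 0 (an encoding chosen here). The function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for each threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a sample is predicted positive
    # when its score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# COVID-19 rows of Table 5: actual class (1 = presence) and model output.
labels = [0, 1, 1, 0]
scores = [0.004541, 0.99878, 0.99454, 0.000278]
points = roc_points(labels, scores)
auc = area_under_curve(points)
```

Because these four samples are perfectly separated, the curve rises to a true positive rate of 1 while the false positive rate is still 0. That is the "top left corner" behaviour described in the text. Four samples are far too few to characterise the classifier as a whole.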
Fig. 10

ROC curve results. (a) Results for the Chest X-ray images (Pneumonia) dataset and (b) Results for the COVID19 lung CT scan dataset.

The storage issues related to the edge devices are reduced even further using the flat buffer format, and in this way the proposed model can run even on mobile devices with restricted computational capabilities. Fig. 11 provides the application size of different classifiers on mobile devices. The findings in Fig. 11 illustrate that even with an application size of 8.3 MB, the proposed model provides considerable results. The UbeHealth classifier has a larger application size of 30 MB, which is relatively higher than the existing techniques. The results obtained by the MOMHTS algorithm-optimized HRFDL classifier are presented in Table 5 along with the input image samples.
Fig. 11

Comparison in terms of application sizes.

The confusion matrix evaluates the performance of a classifier on a dataset whose true labels are known beforehand. Table 6 and Table 7 present the confusion matrix results obtained for the COVID19 lung CT scan and Chest X-ray images (Pneumonia) datasets.
Table 6

Confusion matrix for the COVID19 lung CT scan dataset.

Table 7

Confusion matrix for the Chest X-ray images (Pneumonia) dataset.

From a total of 349 COVID19 samples, the proposed model classifies 346 images correctly, and from a total of 397 non-COVID19 samples, it classifies 393 images correctly. The proposed model offers an accuracy of 99% on the COVID19 lung CT scan dataset. In the Chest X-ray images (Pneumonia) dataset, the proposed model identifies 3,844 pneumonia cases correctly out of a total of 3,883 cases. The model also offers an accuracy of 99% for this disease classification.
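The COVID19 counts quoted above are enough to recompute the headline metrics. The Python sketch below does this using the standard metric definitions and treating COVID19 as the positive class (an assumption made here). The function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts stated in the text: 346 of 349 COVID19 samples and
# 393 of 397 non-COVID19 samples are classified correctly.
covid = binary_metrics(tp=346, fn=349 - 346, tn=393, fp=397 - 393)
for name, value in covid.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the sensitivity is about 0.991, the specificity about 0.990, and the accuracy about 0.991. This is consistent with the 99% figures reported for the COVID19 lung CT scan dataset.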

Computational complexity analysis

Energy consumption (EC) mainly measures the amount of energy spent in transferring the data from the source to the destination. Fig. 12 presents the computational complexity analysis of the proposed model in terms of energy consumption and it is mainly measured for the proposed model with and without optimization. Based on the graph, we can conclude that the proposed model's energy consumption increases without optimization but decreases with optimization.
Fig. 12

Computational complexity analysis using energy consumption for the proposed model with and without optimization.

The computational time and memory consumed by the proposed model for storing data of different sizes are presented in Table 8. As per Table 8, the proposed model takes 175 s and 1,304,578 bytes of memory to store 200 MB of data. The memory and computational time details for data sizes of 50–300 MB are presented in Table 8.
Table 8

Computational Time and memory analysis.

| Data Size (MB) | Time (Seconds) | Memory (Bytes) |
| --- | --- | --- |
| 50 | 49 | 1,102,568 |
| 100 | 57 | 1,201,456 |
| 150 | 110 | 1,295,874 |
| 200 | 175 | 1,304,578 |
| 250 | 190 | 1,352,464 |
| 300 | 230 | 1,459,875 |

Discussion

The proposed model comprises different components that help limit the COVID19 outbreak, and it serves as a reliable model for COVID19 prevention. Each user in the network has an automatically generated user ID, and each user ID is associated with a number that defines the severity of the infection. Based on the user ID and the current location derived from the devices, the infected users who spread the disease can be recognized. The relationship between the unaffected and infected individuals is identified using the Gephi 0.9.1 tool. COVID19 is mainly spread via airborne droplets emitted during coughing, and uninfected people can become infected by inhaling these droplets. Based on the classification results obtained from the proposed model, the location of an infected individual is continuously monitored to identify whether they are in proximity to an unaffected individual. Government organizations can slow the disease progression by closely monitoring the infected and uninfected persons classified by the MOMHTS optimized HRFDL classifier. Close proximity between affected and unaffected individuals is monitored using the radio waves generated by Radio Frequency Identification (RFID) tags. These tags are placed on the chest area of the infected individuals in a certain location, and a tag is activated when any person approaches the infected individual. Each user can identify the RFID tag worn by others via a mobile application designed for this purpose. When an uninfected individual comes near an infected individual, the application sends an alert to maintain a 1–2 m distance. The contact information is transferred to the cloud and the user is continuously monitored via a 5G/4G internet connection. In this way, contact between uninfected users and infected persons is prevented as far as possible. The user proximity is checked at regular intervals, and a 25-second window is used for this purpose.
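The proximity check described above can be sketched in Python as follows. The paper relies on RFID tags and does not specify how distances are obtained, so this sketch assumes hypothetical planar coordinates in metres for each user. The 2 m threshold and the 25-second window come from the text; all names and positions are illustrative.

```python
import math

ALERT_DISTANCE_M = 2.0   # upper end of the 1-2 m distance mentioned in the text
CHECK_INTERVAL_S = 25    # proximity is re-evaluated once per 25-second window


def proximity_alerts(infected, uninfected, threshold_m=ALERT_DISTANCE_M):
    """Return (uninfected_id, infected_id) pairs that are closer than the threshold.

    Both arguments map a user ID to an (x, y) position in metres
    (hypothetical coordinates; the real system would use RFID readings).
    """
    alerts = []
    for user_id, user_pos in uninfected.items():
        for infected_id, infected_pos in infected.items():
            if math.dist(user_pos, infected_pos) < threshold_m:
                alerts.append((user_id, infected_id))
    return alerts


# One evaluation window with made-up positions.
infected_users = {"P1": (0.0, 0.0)}
uninfected_users = {"U1": (1.0, 1.0), "U2": (5.0, 5.0)}
alerts = proximity_alerts(infected_users, uninfected_users)
```

In a deployment, a check like this would be repeated every `CHECK_INTERVAL_S` seconds and each alert forwarded to the mobile application and the cloud. That scheduling loop is omitted here.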

Conclusion

This paper proposes a MOMHTS algorithm-optimized HRFDL classifier for an edge computing, IoT-enabled medical environment. Initially, IoT devices such as CT scanners and X-ray machines capture the chest area and identify whether any abnormality is present. The proposed model is optimized using the MOMHTS algorithm, which tunes the hyperparameters associated with the DNN algorithm and minimizes the number of false positives predicted by the model. The proposed model yields fast detection and efficient classification of COVID-19 and pneumonia in patients living in rural areas. The advantages offered by this model are cost-efficiency, higher accuracy, and minimized execution time, which are very important when deploying applications in an IoT-based edge environment. In this way, it allows the deep learning classifier to explore even further. The proposed model's majority voting system considerably enhances the outcome of the random forest and MLP classifiers. After the disease classes (COVID-19/Pneumonia) are identified, the results are transferred to the cloud server, where they can later be used by healthcare professionals and government workers to aid in early diagnosis and to notify people in the surrounding location about the disease. The proposed work is simulated in real time by conducting a series of experiments, and the results are evaluated using performance metrics such as sensitivity, specificity, accuracy, ROC curve, F1-score, confusion matrix, etc. The proposed methodology offers an accuracy of 99% for both the COVID19 lung CT scan dataset and the Chest X-ray images (Pneumonia) dataset. The size of the developed application is 8 MB, which is lower than that of the other models, and a latency of 0.076 s is obtained for a total of 1500 patient records. In the future, we plan to optimize the security of the web services and also identify the different types of pneumonia.

Ethical statements

Funding: Not applicable.

Compliance with ethical standards

Human and animal rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study. Not applicable. Not applicable.

Availability of data and material

Data sharing is not applicable to this article as no new data were created or analyzed in this study. Not applicable.

Authors contributions

MH agreed on the content of the study. MH collected all the data for analysis. MH agreed on the methodology. MH completed the analysis based on agreed steps. Results and conclusions are discussed and written together. The author read and approved the final manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Input: CT images of chest area affected with COVID19 and pneumonia
Output: Disease class whether normal or abnormal (COVID19 or pneumonia)
Step-1: From the data generation layer, retrieve the chest CT images and create a unique patient ID for each record
Step-2: The images obtained from the data acquisition layer are preprocessed and then the size of the image is altered to match the proposed model
Step-3: The image is given as an input to the proposed model and the classification result is taken as output
Step-4: If Class predicted = Abnormal (COVID19/Pneumonia) then
  Output the classifier result to the user along with their patient ID for future admissions
  Send both the location and patient ID to the cloud to alert the government/health organizations to conduct the patient checkup
Else
  Output the classified result to the user with the date for the next checkup
  Erase the existing patient ID and their location from the cloud storage
End If
Step-5: Exit
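The steps above can be sketched in Python as follows. This is a hedged illustration rather than the deployed system. The classifier is passed in as a function, the cloud storage is modelled as a plain dictionary, and `preprocess`, `process_scan`, and the 30-day recheck interval are hypothetical names and values not given in the paper.

```python
import uuid
from datetime import date, timedelta


def preprocess(image):
    """Placeholder for Step-2: preprocessing and resizing to the model input size."""
    return image


def process_scan(image, location, classify, cloud_store, recheck_days=30):
    """Run Steps 1-4 for one chest image.

    classify    : callable returning "normal" or "abnormal" (stand-in for the model)
    cloud_store : dict mapping patient ID -> location (stand-in for cloud storage)
    recheck_days: assumed interval until the next checkup (not stated in the paper)
    """
    patient_id = uuid.uuid4().hex           # Step-1: unique patient ID
    prepared = preprocess(image)            # Step-2: preprocess and resize
    predicted = classify(prepared)          # Step-3: classification
    if predicted == "abnormal":             # Step-4: act on the predicted class
        cloud_store[patient_id] = location  # alert path: keep ID and location
        return {"patient_id": patient_id, "result": predicted}
    cloud_store.pop(patient_id, None)       # erase any stored ID and location
    next_checkup = date.today() + timedelta(days=recheck_days)
    return {"result": predicted, "next_checkup": next_checkup.isoformat()}


# Usage with stand-in classifiers and made-up coordinates.
cloud = {}
abnormal_case = process_scan("scan-a", (12.97, 77.59), lambda img: "abnormal", cloud)
normal_case = process_scan("scan-b", (12.97, 77.59), lambda img: "normal", cloud)
```

Passing the classifier as an argument keeps the workflow independent of how the HRFDL model itself is implemented.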
Rule-based Majority Vote (MLP, Random Forest)
Input: CT images from pneumonia and COVID19 datasets
Output: Normal/Abnormal (Pneumonia or COVID19)
If (RF<=0.5 and MLP<=0.5) then
  Return Normal
Else
  Return Majority Vote (MLP, Random Forest)
End If
  12 in total

1.  Automated Deep Transfer Learning-Based Approach for Detection of COVID-19 Infection in Chest X-rays.

Authors:  N Narayan Das; N Kumar; M Kaur; V Kumar; D Singh
Journal:  Ing Rech Biomed       Date:  2020-07-03

2.  Epidemiologic characteristics associated with SARS-CoV-2 antigen-based test results, rRT-PCR cycle threshold values, subgenomic RNA, and viral culture results from university testing.

Authors:  Laura Ford; Christine Lee; Ian W Pray; Devlin Cole; John Paul Bigouette; Glen R Abedi; Dena Bushman; Miranda J Delahoy; Dustin W Currie; Blake Cherney; Marie Kirby; Geroncio Fajardo; Motria Caudill; Kimberly Langolf; Juliana Kahrs; Tara Zochert; Patrick Kelly; Collin Pitts; Ailam Lim; Nicole Aulik; Azaibi Tamin; Jennifer L Harcourt; Krista Queen; Jing Zhang; Brett Whitaker; Hannah Browne; Magdalena Medrzycki; Patricia Shewmaker; Gaston Bonenfant; Bin Zhou; Jennifer Folster; Bettina Bankamp; Michael D Bowen; Natalie J Thornburg; Kimberly Goffard; Brandi Limbago; Allen Bateman; Jacqueline E Tate; Douglas Gieryn; Hannah L Kirking; Ryan Westergaard; Marie Killerby
Journal:  Clin Infect Dis       Date:  2021-04-13       Impact factor: 9.079

3.  Real-time data of COVID-19 detection with IoT sensor tracking using artificial neural network.

Authors:  Roa'a Mohammedqasem; Hayder Mohammedqasim; Oguz Ata
Journal:  Comput Electr Eng       Date:  2022-04-06       Impact factor: 3.818

4.  Artificial intelligence-inspired comprehensive framework for Covid-19 outbreak control.

Authors:  Munish Bhatia; Ankush Manocha; Tariq Ahamed Ahanger; Abdullah Alqahtani
Journal:  Artif Intell Med       Date:  2022-03-26       Impact factor: 7.011

5.  Deep learning empowered COVID-19 diagnosis using chest CT scan images for collaborative edge-cloud computing platform.

Authors:  Vipul Kumar Singh; Maheshkumar H Kolekar
Journal:  Multimed Tools Appl       Date:  2021-06-28       Impact factor: 2.577

Review 6.  Artificial intelligence in pulmonary medicine: computer vision, predictive model and COVID-19.

Authors:  Danai Khemasuwan; Jeffrey S Sorensen; Henri G Colt
Journal:  Eur Respir Rev       Date:  2020-10-01

7.  Artificial Intelligence and COVID-19: Deep Learning Approaches for Diagnosis and Treatment.

Authors:  Mohammad Behdad Jamshidi; Ali Lalbakhsh; Jakub Talla; Zdenek Peroutka; Farimah Hadjilooei; Pedram Lalbakhsh; Morteza Jamshidi; Luigi La Spada; Mirhamed Mirmozafari; Mojgan Dehghani; Asal Sabet; Saeed Roshani; Sobhan Roshani; Nima Bayat-Makou; Bahare Mohamadzade; Zahra Malek; Alireza Jamshidi; Sarah Kiani; Hamed Hashemi-Dezaki; Wahab Mohyuddin
Journal:  IEEE Access       Date:  2020-06-12       Impact factor: 3.367

Review 8.  Diagnosing COVID-19 in the Emergency Department: A Scoping Review of Clinical Examinations, Laboratory Tests, Imaging Accuracy, and Biases.

Authors:  Christopher R Carpenter; Philip A Mudd; Colin P West; Erin Wilber; Scott T Wilber
Journal:  Acad Emerg Med       Date:  2020-07-26       Impact factor: 5.221
